Initially, Litmus was hosted on a combination of in-house hardware and dedicated servers. The company grew quickly and soon outgrew its hardware. As a team dedicated to simplifying processes and saving money, they began hunting for better alternatives. Co-founder Paul Farnell describes the process: “We looked for solutions that would meet our needs of scalability and cost. We chose Amazon S3 because there was nothing else like it when we first started. For Amazon EC2 we initially trialed a competitor to Amazon, but found it to be tremendously unreliable. Comparatively, Amazon has been a dream. We especially liked the fact that there was no minimum spend and we could get started without going through a lengthy sales process – which wasn’t the case with one of the other providers we looked at.”
“Amazon Web Services have become a key part of our technology strategy. With their easy setup, economical costs, and helpful support staff, we’ve been able to migrate almost all of our infrastructure over to AWS. In the process we’ve managed to cut down the time we spent on server administration and scalability, in general.”
Today, Litmus uses Amazon S3 to store over 6 TB of customers’ images and Amazon EC2 to run customers’ tests. Farnell shares the details: “When we first started out we stored the images on our own hardware, but as we grew we realized this was quickly going to become a headache. By using S3 we were able to focus on improving our product, not worrying about scaling up our storage. We also use Amazon EC2 to run the automated email tests for our customers; we currently have 400 EC2 servers. By using EC2 we’re able to add more servers to our grid during the busy periods of the day, and remove them during quieter periods. Before using EC2, Litmus used to run more slowly during our busiest periods, but with EC2’s flexibility we’re able to offer a consistently fast experience for our customers – and it costs us less.”
Litmus CTO Matt Brindley adds, “Spot Instances in particular have helped us gain significant EC2 cost savings. We are mainly running an asynchronous backend batch process on EC2, which is perfectly suited for Spot. Specifically, we have a queue-based architecture where a worker node will pull a job from the queue and then process it. As worker nodes appear after a Spot bid is accepted, they can just take jobs off of the queue. If any instance is interrupted, the work will re-appear on the queue after a short period of time. To incorporate Spot into our software, we just had to make some minor changes to our script to try to launch Spot Instances first, and then to request On-Demand instances if we don’t get any Spot Instances after 20 minutes. To minimize the chance of interruption, we typically bid higher than the On-Demand price, because we are only charged the market Spot price, which is typically lower than the On-Demand price. Simply put, we are willing to take the chance of paying a little more on occasion in order to not have our jobs interrupted. Based on this set of changes, we have been able to save around 57% per month off our EC2 bill. Our biggest surprise when using Spot Instances is how rarely our instances have been interrupted.”
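The pattern Brindley describes can be illustrated with a small sketch. This is not Litmus’s code: it is a toy, in-memory model written under stated assumptions. The class `VisibilityQueue`, the function `launch_workers`, and the callables `place_spot_request`, `spot_instances`, and `request_on_demand` are hypothetical names introduced only for illustration, and the 30-second polling interval is an arbitrary choice. Only the 20-minute wait and the “Spot first, then On-Demand” order come from the quote above.

```python
import time
from collections import deque
from typing import Callable, Optional


class VisibilityQueue:
    """Toy queue: a job taken by a worker is hidden, not deleted.

    If the worker never acknowledges the job (for example because its
    Spot Instance was interrupted), the job becomes visible again after
    `visibility_timeout` seconds and another worker can pick it up.
    """

    def __init__(self, visibility_timeout: float,
                 clock: Callable[[], float] = time.monotonic):
        self._timeout = visibility_timeout
        self._clock = clock
        self._ready = deque()
        self._in_flight = {}  # job -> time at which it becomes visible again

    def put(self, job: str) -> None:
        self._ready.append(job)

    def take(self) -> Optional[str]:
        now = self._clock()
        # Move jobs whose worker went silent back onto the ready queue.
        for job, deadline in list(self._in_flight.items()):
            if deadline <= now:
                del self._in_flight[job]
                self._ready.append(job)
        if not self._ready:
            return None
        job = self._ready.popleft()
        self._in_flight[job] = now + self._timeout
        return job

    def ack(self, job: str) -> None:
        """Called by a worker once the job has been fully processed."""
        self._in_flight.pop(job, None)


def launch_workers(count, place_spot_request, spot_instances, request_on_demand,
                   spot_wait_seconds=20 * 60, poll_seconds=30,
                   clock=time.monotonic, sleep=time.sleep):
    """Try Spot first; fall back to On-Demand if nothing arrives in time.

    All three callables are hypothetical stand-ins for a cloud API:
      place_spot_request(count) -> request id
      spot_instances(request_id) -> list of instance ids (empty if none yet)
      request_on_demand(count)   -> list of instance ids
    Cancelling the unfulfilled Spot request is omitted for brevity.
    """
    request_id = place_spot_request(count)
    deadline = clock() + spot_wait_seconds
    while True:
        instances = spot_instances(request_id)
        if instances:
            return "spot", instances
        if clock() >= deadline:
            return "on-demand", request_on_demand(count)
        sleep(poll_seconds)
```

Because a job is only removed when a worker acknowledges it, an interrupted worker needs no special cleanup: its job simply times out and returns to the queue. The clock and sleep functions are passed in as parameters so the 20-minute fallback can be exercised in a test without actually waiting.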
Farnell concludes: “AWS has helped us to expand our service to a much larger number of users more quickly than we otherwise could have done.”
Published February 2010. Updated April 2011.