Prior to signing up with Amazon Web Services (AWS), GoSquared’s operations were hosted on reserved, dedicated machines at another service provider. With this arrangement, there were constraints on capacity planning, compute resource acquisition, and the number of tools and services available.
Geoff Wagstaff, Cofounder and Engineer at GoSquared, notes, “AWS is the exact opposite, providing the primary motive for our migration. Another motive was the rapid release cycle of AWS, and the many areas in which they are pioneering cloud computing. Since we signed on, new tools and services have been released that make our job easier.”
AWS forms the backbone of all of the compute and storage resources necessary for GoSquared’s high-performance, real-time applications. Wagstaff says, “Amazon Elastic Compute Cloud [Amazon EC2] is the workhorse driving our compute operations. This enables us to independently scale and provision compute resources at each tier in the stack, from the top-level Web servers, through the application servers, right down to the data pool lying underneath.”
Traffic is balanced evenly across instances and Availability Zones by Amazon Elastic Load Balancing (Amazon ELB), which monitors instance health and routes traffic away from unhealthy instances without the need for user intervention. Working in tandem with Amazon ELB is Amazon EC2 Auto Scaling, which provides enough compute capacity to absorb surges of incoming data. Wagstaff notes, “Sourcing this data is our tracking code, which is globally deployed via Amazon CloudFront’s edge locations to ensure code distribution is as fast and low-latency as possible.” GoSquared also uses Amazon Simple Storage Service (Amazon S3) coupled with Amazon Elastic Block Store (Amazon EBS) so that its Amazon EC2 stack can handle high-performance data storage, processing, and redundancy cost-effectively, with data integrity assured by snapshots and backups to Amazon S3. Wagstaff says, “We run our own in-house site thumbnail image generation service, which is also a fully scalable and distributed system built on top of Amazon Simple Queue Service [Amazon SQS].”
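The queue-based design Wagstaff mentions for the thumbnail service can be illustrated with a minimal sketch. This is not GoSquared's code: Python's in-process `queue.Queue` stands in for Amazon SQS, threads stand in for worker instances, and `render_thumbnail`, `worker`, and `run_workers` are hypothetical names chosen for illustration. The point is only that workers pull jobs independently, so adding workers adds throughput.

```python
import queue
import threading


def render_thumbnail(url: str) -> str:
    # Hypothetical placeholder for the real rendering step.
    return f"thumb:{url}"


def worker(jobs: queue.Queue, results: dict, lock: threading.Lock) -> None:
    """Pull URLs from the shared queue until a None sentinel arrives."""
    while True:
        url = jobs.get()
        try:
            if url is None:  # sentinel: no more work for this worker
                return
            thumb = render_thumbnail(url)
            with lock:
                results[url] = thumb
        finally:
            # Mark the item as handled, including the sentinel.
            jobs.task_done()


def run_workers(urls, worker_count: int = 3) -> dict:
    """Process all URLs with `worker_count` independent workers."""
    jobs: queue.Queue = queue.Queue()  # stand-in for an SQS queue (assumption)
    results: dict = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(jobs, results, lock))
        for _ in range(worker_count)
    ]
    for t in threads:
        t.start()
    for url in urls:
        jobs.put(url)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    jobs.join()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_workers(["example.com", "example.org"]))
```

Because the producers and workers share only the queue, the number of workers can change without touching the code that enqueues jobs.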
A notable architectural feature of the GoSquared platform is its integration of Spot Instances. For its data analysis and tracking platform, GoSquared balances incoming data across a collection of Amazon EC2 instances. Some of those instances are under Auto Scaling control, which means they automatically scale up and down based on demand, and the remainder are provisioned as Spot Instances. A low bid price keeps costs down, and a set of Amazon CloudWatch metrics and alarms triggers the replacement of terminated Spot Instances to maintain availability should the Spot price exceed the bid price.
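The fallback behavior described above can be sketched as a small capacity-planning function. This is an illustration under stated assumptions, not GoSquared's implementation: it makes no AWS API calls, and `FleetPlan`, `plan_fleet`, and `on_demand_baseline` are hypothetical names. It models only the rule from the text: Spot capacity is usable while the Spot price is at or below the bid, and otherwise the full requirement must be covered by regular instances.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FleetPlan:
    on_demand: int  # instances run under Auto Scaling control
    spot: int       # instances requested at the bid price


def plan_fleet(required: int, on_demand_baseline: int,
               spot_price: float, bid_price: float) -> FleetPlan:
    """Split `required` instances between on-demand and Spot capacity.

    All parameters are illustrative assumptions, not values from the source.
    """
    if required < 0 or on_demand_baseline < 0:
        raise ValueError("instance counts must be non-negative")
    # Spot Instances keep running only while the market price
    # stays at or below the bid.
    spot_available = spot_price <= bid_price
    if spot_available:
        on_demand = min(required, on_demand_baseline)
    else:
        # Spot capacity was terminated: cover everything on demand.
        on_demand = required
    return FleetPlan(on_demand=on_demand, spot=required - on_demand)


if __name__ == "__main__":
    print(plan_fleet(10, 4, spot_price=0.03, bid_price=0.05))
    print(plan_fleet(10, 4, spot_price=0.08, bid_price=0.05))
```

In the first call the Spot price is under the bid, so most of the fleet runs on Spot; in the second the price has risen above the bid, so the plan shifts all ten instances to on-demand capacity.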
The team used a variety of tools to build its solution, including Nginx, PHP (with extensive use of the official AWS SDK for PHP), Node.js, Memcached, Redis, MongoDB, and MySQL. The team also automates numerous server processes with Bash scripts that use the command-line tools for AWS services.
The team is in the process of integrating Amazon Simple Email Service (Amazon SES) for its LiveStats Traffic Alerts service.
The company reports that its applications handle the aggregated traffic of more than 8,000 websites. Wagstaff comments, “AWS has undoubtedly reduced our time to market and development time.”
Wagstaff observes that developing for AWS challenges developers and system administrators. He says, “They must craft a solution that seamlessly integrates application logic, system processes, and virtualized hardware to produce a fault-tolerant, highly available, scalable, and distributed system, where work is delegated and coordinated in a controlled manner.” He adds, “This is a positive side effect, as hosting in the cloud encourages good software and system design paradigms.”
Overall, the team particularly appreciates the flexibility of AWS. Wagstaff says, “There’s a great sense of freedom when using the service. You can provision and use what you need, with no lock-in contracts. The release cycle for new tools and products is impressive, which diversifies the range of tools available to engineers allowing them to remain agile and flexible in the systems that they can deploy.”
To learn more, visit http://www.gosquared.com/.
Updated September 21, 2011