How AWS Customers Are Running Containerized Environments on Amazon EC2 Spot Instances
By Amiram Shachar, CEO at Spotinst
Amazon EC2 Spot Instances, at up to 90 percent off the On-Demand price, are one of the best ways to dramatically cut your Amazon Elastic Compute Cloud (Amazon EC2) costs on Amazon Web Services (AWS). With the new pricing model, there has never been a better time to start leveraging Spot Instances.
Spotinst is an AWS Partner Network (APN) Advanced Technology Partner with AWS Competencies in both Containers and Cloud Management Tools. Our DevOps Automation Platform helps businesses reduce operational overhead with automation, better utilize their cloud infrastructure, and cut costs by reliably leveraging Spot Instances.
In this post, we share a few stories from Spotinst customers outlining how they maximized infrastructure efficiency at minimum cost by using our flagship products, Elastigroup and Ocean.
Elastigroup is a fully automated application scaling service that runs any AWS workload on the best possible mix of Reserved, On-Demand, and Spot Instances. Ocean is a serverless container engine allowing customers to reap the benefits of Kubernetes and containers without having to worry about managing and scaling infrastructure.
With Spot Instances, you pay the Spot price that’s in effect for the time period your instances are running. Spot Instance prices are set by Amazon EC2 and adjust gradually based on long-term trends in supply and demand for Spot Instance capacity.
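As a rough illustration of that billing rule, the sketch below totals the cost of one instance across two periods with different Spot prices and compares it with the On-Demand cost for the same hours. All prices are hypothetical values chosen for the arithmetic, not real AWS rates, and the names `PricePeriod`, `spot_cost`, and `savings_percent` are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class PricePeriod:
    """A stretch of time during which one Spot price was in effect."""
    hours: float
    spot_price: float  # USD per instance-hour (hypothetical value)


def spot_cost(periods):
    """Total cost: each period is billed at the Spot price in effect then."""
    return sum(p.hours * p.spot_price for p in periods)


def savings_percent(periods, on_demand_price):
    """Savings relative to running the same hours at the On-Demand price."""
    total_hours = sum(p.hours for p in periods)
    on_demand_cost = total_hours * on_demand_price
    return 100.0 * (1.0 - spot_cost(periods) / on_demand_cost)


# Hypothetical numbers for illustration only, not real AWS prices.
periods = [
    PricePeriod(hours=10, spot_price=0.03),
    PricePeriod(hours=14, spot_price=0.04),
]
print(round(spot_cost(periods), 2))               # total Spot cost in USD
print(round(savings_percent(periods, 0.10), 1))   # percent saved vs On-Demand
```

With these assumed inputs, the instance runs 24 hours for $0.86 on Spot versus $2.40 On-Demand, a saving of roughly 64 percent.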
Container workloads are a great candidate for running on Spot Instances, thanks to their stateless nature and ability to run on a diverse set of resources. Container orchestration tools include Kubernetes, HashiCorp Nomad, Apache Mesos, Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Container Service for Kubernetes (Amazon EKS), and Docker Swarm.
To see this natural fit in action, let’s walk through a few examples of how AWS customers are running containers at scale on Spot Instances across their development and production environments.
ClearCare Saves with Amazon ECS & EC2 Spot Instances
Scaling infrastructure without scaling costs can be challenging. As a healthcare software-as-a-service (SaaS) product, ClearCare’s major focus is on security and availability, which doesn’t leave much time for cost optimization projects.
When scaling up or down, ensuring full CPU utilization was time-consuming for ClearCare. Automating the Tetris game of matching the right workloads to the right instance types would have been a major boost to their productivity and cost structure.
Within a few weeks, ClearCare’s DevOps team implemented Amazon ECS and Spotinst Autoscaler behind the scenes. This ensured high availability, lower costs, and maximal utilization when scaling. “It was really an out-of-the-box solution,” says Glenn Poston, manager of systems reliability and DevOps at ClearCare. “All we had to do was turn it on and we were done.”
Spotinst’s Amazon ECS integration allowed ClearCare to fully utilize the underlying infrastructure of their clusters, and leverage Spot Instances while doing so. The Spotinst ECS Autoscaler automatically recognizes task and service needs, and adjusts the underlying EC2 instances to make sure all workloads have the capacity to run.
Furthermore, Spotinst makes sure that every instance is utilized to the fullest extent. With the combination of better EC2 utilization and the intelligent use of Spot Instances, ClearCare saves over $40,000 monthly on their compute bill. Figure 1 shows ClearCare’s cost trend before and after migrating to EC2 Spot Instances in their Amazon ECS environments.
Figure 1 – Cost over time for ClearCare.
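The “Tetris game” of fitting tasks onto instances can be pictured as a bin-packing problem. The sketch below uses a first-fit decreasing heuristic; it is a simplified illustration under assumed inputs, not the algorithm the Spotinst ECS Autoscaler actually uses. The task names, CPU units, memory sizes, and the single uniform instance size are all hypothetical.

```python
def pack_tasks(tasks, instance_cpu, instance_mem):
    """First-fit decreasing: place each task on the first instance with room.

    tasks: list of (name, cpu_units, mem_mib) tuples.
    Returns a list of instances, each a dict with free capacity and task names.
    """
    instances = []
    # Place the largest tasks first so they are not stranded at the end.
    ordered = sorted(tasks, key=lambda t: (t[1], t[2]), reverse=True)
    for name, cpu, mem in ordered:
        if cpu > instance_cpu or mem > instance_mem:
            raise ValueError(f"task {name} does not fit on any instance")
        for inst in instances:
            if inst["cpu_free"] >= cpu and inst["mem_free"] >= mem:
                break
        else:
            # No existing instance has room: "scale up" by adding one.
            inst = {"cpu_free": instance_cpu, "mem_free": instance_mem, "tasks": []}
            instances.append(inst)
        inst["cpu_free"] -= cpu
        inst["mem_free"] -= mem
        inst["tasks"].append(name)
    return instances


# Hypothetical tasks: (name, CPU units, memory in MiB).
tasks = [
    ("web", 1024, 2048),
    ("api", 512, 1024),
    ("worker", 1536, 3072),
    ("cron", 256, 512),
    ("cache", 768, 1024),
]
instances = pack_tasks(tasks, instance_cpu=2048, instance_mem=4096)
for i, inst in enumerate(instances):
    print(i, inst["tasks"], inst["cpu_free"], inst["mem_free"])
```

In this example the five tasks fit on two instances with no CPU left idle, which is the kind of outcome an autoscaler aims for when it sizes a cluster to its tasks rather than the other way around.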
Ticketmaster Runs Kubernetes on EC2 Spot Instances
Ticketmaster relies on Kubernetes, often referred to as k8s, to orchestrate their mission-critical applications, such as online ticketing platforms.
Running on Kubernetes allowed Ticketmaster to move towards a microservices model, which in turn helped them gain a faster time to market. Running Kubernetes on AWS also meant that Ticketmaster’s many applications became more resilient, thanks to automatic scheduling capabilities. However, even with the many benefits of Kubernetes, costs were still rising, and the underlying infrastructure could be better utilized.
Because downtime and service interruptions are not an option for Ticketmaster, the team was hesitant to adopt Spot Instances as a way to reduce costs. They turned to Spotinst to capture the savings potential of Spot Instances reliably. With Spotinst, Ticketmaster reduced their Kubernetes compute costs by approximately 70 percent while running production workloads at scale on Spot Instances.
Here’s how it works for Ticketmaster behind the scenes:
- Spotinst Elastigroup predicts Spot terminations in the k8s cluster before they happen.
- Elastigroup communicates with the k8s API server to mark the soon-to-be-terminated host as “unschedulable,” and in parallel spins up a new Spot or On-Demand instance. Then, k8s starts rescheduling containers from the draining host onto the remaining hosts.
“We are not thinking about cost anymore; Elastigroup does that for us,” says Ticketmaster’s Shane Savoie.
Figure 2 shows the process of EC2 Spot interruption and a replacement activity that drains an existing k8s node, evicts its existing containers, and schedules the running containers on a new host.
Figure 2 – Kubernetes node draining process.
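The draining flow described above can be sketched as a small in-memory simulation. This is a toy model for illustration only: it does not call the Kubernetes API or Elastigroup, the `Node` class and `replace_node` function are hypothetical names, and the “least-loaded node” placement rule is an assumption standing in for the real Kubernetes scheduler.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    schedulable: bool = True
    pods: list = field(default_factory=list)


def replace_node(cluster, doomed_name, new_name):
    """Simulate the cordon, replace, and drain steps on an in-memory cluster."""
    doomed = next(n for n in cluster if n.name == doomed_name)
    # Step 1: mark the soon-to-be-terminated node unschedulable (cordon).
    doomed.schedulable = False
    # Step 2: launch a replacement (happens in parallel in the flow described).
    cluster.append(Node(new_name))
    # Step 3: evict each pod and place it on the least-loaded schedulable node.
    while doomed.pods:
        pod = doomed.pods.pop()
        target = min(
            (n for n in cluster if n.schedulable),
            key=lambda n: len(n.pods),
        )
        target.pods.append(pod)
    # Step 4: the drained node holds no pods and can leave the cluster.
    cluster.remove(doomed)
    return cluster


cluster = [
    Node("spot-a", pods=["web-1", "web-2", "api-1"]),
    Node("spot-b", pods=["api-2"]),
]
replace_node(cluster, "spot-a", "spot-c")
for node in cluster:
    print(node.name, sorted(node.pods))
```

The key ordering is that the node is cordoned before any pod is evicted, so nothing new lands on a host that is about to disappear.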
Demandbase Easily Scales Spot Hours with Rancher & Kubernetes
Demandbase, a leader in account-based marketing, uses Rancher to manage their containers and Kubernetes to orchestrate them. Running effectively on Amazon EC2 Spot was a long-time goal for Demandbase, and Spot’s new hibernation feature makes it a no-brainer solution for development environments.
“When rolling it through once on our dev environment, we saw the results and were very excited,” says Josh Schlanger, vice president of DevOps and Architecture at Demandbase.
After integrating Spotinst Elastigroup with Rancher and Kubernetes, and giving it Identity and Access Management (IAM) access to manage these workloads within their AWS account, Demandbase was able to scale their workloads 10x on Amazon EC2 Spot.
Figure 3 shows the usage trend of Demandbase in EC2 Spot Instances throughout the year. Demandbase managed to grow exponentially in compute consumption while keeping a minimal linear growth in costs.
Figure 3 – Potential On-Demand costs vs actual Spot costs over time for Demandbase.
SimilarWeb Cuts Costs By Leveraging Spotinst’s Nomad Integration
In early 2017, the web analytics services business SimilarWeb faced the task of migrating a massive on-premises system to the cloud. They quickly decided to host their workloads on AWS, and to leverage Reserved Instances (RIs) and Amazon EC2 Spot Instances to optimize costs.
Oz Katz, head of production engineering at SimilarWeb, opted for Nomad from HashiCorp because SimilarWeb was already utilizing HashiCorp’s full range of products, including Consul and Vault, for orchestrating large-scale workloads.
While Nomad is a great orchestration tool for containers, Oz found his team still incurred significant overhead from managing and scaling the underlying instances. By leveraging Spotinst Nomad Autoscaler, SimilarWeb freed itself from managing the cluster and was able to spend more time building its applications.
“We might have been able to tool our own solution for running some workloads on Spot, but Spotinst’s focus and expertise made it a turn-key solution,” says Oz.
Figure 4 describes the flow of scaling additional jobs into the cluster: jobs that are waiting to be scheduled trigger a scale-up activity and are then placed on a new host.
Figure 4 – Scale up and down activity flow in HashiCorp Nomad.
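A minimal sketch of that scale-up trigger follows, assuming a single CPU dimension and a uniform size for new nodes. It does not use the Nomad API and is not the Spotinst Nomad Autoscaler’s logic; the `schedule` function, job names, and MHz figures are hypothetical.

```python
def schedule(pending, nodes, node_cpu):
    """Place pending jobs, adding a node whenever a job cannot be placed.

    pending: list of (job_name, cpu_mhz) waiting to be scheduled.
    nodes: dict mapping node name to free CPU in MHz (updated in place).
    node_cpu: CPU in MHz of each newly launched node (assumed uniform).
    Returns (placements, nodes_added).
    """
    placements = {}
    added = 0
    for job, cpu in pending:
        if cpu > node_cpu:
            raise ValueError(f"job {job} is larger than a whole node")
        # Look for an existing node with enough free CPU.
        target = next((n for n, free in nodes.items() if free >= cpu), None)
        if target is None:
            # The job is stuck in the queue: this is the scale-up trigger.
            added += 1
            target = f"new-node-{added}"
            nodes[target] = node_cpu
        nodes[target] -= cpu
        placements[job] = target
    return placements, added


# Hypothetical cluster with one nearly full node and three queued jobs.
nodes = {"node-1": 500}
pending = [("etl", 400), ("report", 800), ("index", 300)]
placements, added = schedule(pending, nodes, node_cpu=1000)
print(placements)
print(added)
```

Here the first job fits on the existing node, while the other two each wait in the queue until a new host is launched for them.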
Feature.fm Runs Mesosphere DC/OS on Spot Instances to Reduce Costs
Feature.fm provides musicians with a one-stop shop for all their marketing and advertising needs. As the company grew, Feature.fm looked for potential ways to reduce costs, ideally with a solution that did not require committing to capacity in advance. “Other happy customers of Spotinst got us interested, and suggested we checked the platform out,” says Zohar Aharoni, Feature.fm’s co-founder and CTO.
Feature.fm was particularly interested in the Spotinst Mesosphere integration, which automates node management by communicating with the DC/OS Master to ensure only active nodes are receiving new tasks, and that interrupted Spot Instances are automatically “cycled out.”
Feature.fm started leveraging the Spotinst Mesosphere integration and has never looked back. “We are running 90 percent of our workloads with Spotinst, and couldn’t be happier,” says Zohar. “Spotinst takes over when Amazon CloudFormation finishes provisioning, provides us with added flexibility, and seamlessly runs our workloads on Spot Instances to achieve 80 percent cost savings.”
Voyager Labs Runs Docker Swarm Workloads on Spot Instances
Docker Swarm is Docker’s container orchestration solution, which schedules services and tasks on a cluster. Docker Swarm works great for task placement, but still requires that customers manage the underlying cluster.
Cluster management can mean constant monitoring and active interaction with the Docker Swarm Master, especially when the cluster consists of Amazon EC2 Spot Instances, which can get interrupted and need to be replaced unexpectedly.
AWS customers such as Voyager Labs, a leading artificial intelligence and cognitive deep learning company, have found a solution with Spotinst to achieve the cost benefits of running on EC2 Spot while offloading the overhead and cluster management responsibilities.
Figure 5 describes what happens when an existing worker in a Docker Swarm cluster is interrupted: the worker is drained while a new worker is launched and picks up its containers.
Figure 5 – Docker Swarm flow for EC2 Spot Instance interruption.
Utilizing Amazon EC2 Spot Instances is a great way to dramatically reduce your compute costs on AWS. Containerized workloads are ideal for Spot, as they should be ephemeral, stateless, and fault tolerant by design.
By leveraging both EC2 Spot and Spotinst for your containerized environments, you can run production-grade containerized workloads that are optimally and dynamically sized with little to no overhead.
The content and opinions in this blog are those of the third party author and AWS is not responsible for the content or accuracy of this post.
Spotinst – APN Partner Spotlight
Spotinst is an APN Advanced Technology Partner. They help companies save on their AWS computing costs by leveraging EC2 Spot Instances.