Welcome to the AWS Containers Blog

Welcome to the AWS Containers Blog! We’re excited to start this channel to give builders a closer look under the hood of all things container-related at AWS. In the past, we’ve published on other popular AWS blog channels such as the compute blog, the architecture blog, and the open source blog. Now, with the containers blog, you can find all container-related posts in one place.

Our blogs will be authored by software engineers, product managers, solutions architects, and developer advocates. They will share engineering deep dives, technical guidance, best practices, and how-to guides to help you build with containers on AWS. We’ll also share the work we’re doing in the open source community, customer use cases, and guest posts from our container heroes about their experiences using containers to build applications.

Customers drive the majority of our roadmap and investments. The breadth of container services available today is a result of these conversations over the past few years. Our services help hundreds of thousands of customers build applications across use cases such as machine learning, batch processing, web applications, and line-of-business applications. Today, I want to talk about how the AWS container services portfolio evolved to its current state.

Customers love running their applications on Amazon EC2 because it eliminates the time and complexity of provisioning and managing servers. Using APIs, they can provision virtual machines and configure compute capacity with minimal effort. As more customers adopted EC2, they started looking for new ways to innovate within their organizations. Companies like Netflix started decoupling teams and applications, moving quickly by building automation into steps across the software lifecycle.
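
To make that concrete, here’s a minimal sketch of what provisioning capacity through an API can look like, using the AWS SDK for Python (boto3). The AMI ID, instance type, and region below are illustrative placeholders, not recommendations:

```python
import boto3

# Provision a virtual machine with a single API call.
# The AMI ID and instance type here are hypothetical placeholders.
ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # hypothetical AMI ID
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
)

# The instance ID of the newly launched virtual machine.
print(response["Instances"][0]["InstanceId"])
```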

Our customers realized that by building on AWS, they could take advantage of a global infrastructure that allowed them to build applications with a higher standard of security, reliability, and availability than before. However, they were still missing the tooling and processes that Amazon had built over the years to use this infrastructure most effectively. Specifically, these companies wanted our help to improve engineering velocity and deployment safety. They asked us to help them (1) make their product teams more nimble, (2) decouple development across teams so they could operate more independently, and (3) deploy code more securely, efficiently, and quickly. Our customers also told us that they wanted to focus more on development that drove business value while eliminating as much as possible of the undifferentiated heavy lifting required to manage cloud infrastructure.

We launched Amazon ECS in preview at re:Invent 2014 (GA in April 2015) to address the needs of customers who were rapidly adopting containers as a mechanism to package, ship, and run applications. With ECS, customers could run their containerized applications at scale. ECS automated the management and scaling of containers while removing a large part of the operational heavy lifting. With simple API calls, you could launch and stop containers, query the complete state of your clusters, and access many familiar features such as IAM roles, security groups, load balancers, Amazon CloudWatch Events, and AWS CloudTrail logs. Running on ECS was a familiar experience for customers already running on EC2. As AWS launched new services, a number of them, such as Amazon SageMaker, Amazon Lex, Amazon Polly, AWS Batch, and Amazon.com’s recommendation engine, were built on ECS. Customers are now launching millions of containers every hour on ECS, and today, ECS is a battle-tested service that supports mission-critical workloads both internally at AWS and for our customers (including Amazon.com). For example, Duolingo achieved four nines of reliability and reduced its compute cost by 60% by migrating to ECS, and McDonald’s was able to scale to about 500,000 orders per hour with response times under 100 ms on ECS.
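
To give a feel for those API calls, here’s a minimal sketch using the AWS SDK for Python (boto3). The cluster and task definition names are hypothetical:

```python
import boto3

# A minimal sketch of the ECS APIs described above: launch a task,
# query cluster state, and stop the task. Names are hypothetical.
ecs = boto3.client("ecs", region_name="us-east-1")

# Launch a container task on an existing cluster.
task = ecs.run_task(
    cluster="my-cluster",           # hypothetical cluster name
    taskDefinition="my-web-app:1",  # hypothetical task definition
)
task_arn = task["tasks"][0]["taskArn"]

# Query the complete state of the cluster.
clusters = ecs.describe_clusters(clusters=["my-cluster"])
print(clusters["clusters"][0]["runningTasksCount"])

# Stop the task when it's no longer needed.
ecs.stop_task(cluster="my-cluster", task=task_arn, reason="example complete")
```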

ECS, like many other orchestration and scheduling systems, is optimized for placing containers on a cluster of machines owned by customers. It was clear to us that while this addressed immediate customer requirements, it added an unnecessary layer of abstraction. To get the full benefits of containerization, our customers wanted to interact with and pay for just their containerized artifacts, e.g., an ECS task. They also wanted to avoid the operational and cognitive overhead of managing a fleet of instances to run their applications. In 2017, we launched AWS Fargate, a compute engine optimized for running containerized applications. With Fargate, there are no EC2 instances or servers to manage: you simply define your task definitions and use the ECS task APIs or schedulers to launch those tasks. Customers no longer have to worry about right-sizing or autoscaling an underlying fleet, patching servers, or figuring out a cluster operations model for their application teams. Tens of thousands of customers are now launching millions of containers on Fargate every week. Samsung migrated its developer portal to Fargate and saw a reduction of more than 40% in compute cost. Corteva, the agricultural division of DowDuPont, used Fargate to deploy a machine learning algorithm for scoring genetic markers in labs spread across six continents, analyzing over 1.4 billion data points annually. Because of the automation Fargate provides, we’re seeing more and more customers go “all in” on Fargate by running the majority of their containerized applications there.
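
Launching on Fargate uses the same run-task API; here’s a minimal sketch of that call. The subnet, security group, and task definition names below are hypothetical placeholders:

```python
import boto3

# The same ECS run_task API, with launchType="FARGATE" so there are no
# instances to manage. Subnet, security group, and task definition names
# are hypothetical placeholders.
ecs = boto3.client("ecs", region_name="us-east-1")

ecs.run_task(
    cluster="my-cluster",
    taskDefinition="my-web-app:1",  # must use the awsvpc network mode
    launchType="FARGATE",
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],     # hypothetical
            "securityGroups": ["sg-0123456789abcdef0"],  # hypothetical
            "assignPublicIp": "ENABLED",
        }
    },
)
```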

To make it easy for customers to rapidly and safely integrate and deploy their containerized applications, we’ve built integrations with tools such as Spinnaker and AWS CodePipeline. You can use AWS CodePipeline to orchestrate your software release workflow across these services and third-party tools, or integrate each service independently with your existing tools. Additionally, we’re continuing to enhance the capabilities of our open source ecs-cli tool. For example, in the last few weeks, we’ve improved the CLI’s capabilities for testing ECS task definitions locally, and we’re working on some other exciting new developments.

We also noticed that a number of our customers were continuing to run applications on-premises. They were looking for ways to refactor these applications effectively and move them to the cloud, while also becoming more nimble with the applications that remained on-premises. Over the last few years, Kubernetes gained popularity with these customers, and they were looking for ways to run their Kubernetes workloads on AWS so that they could use the same tooling and processes both on AWS and on-premises. In 2017, we announced Amazon EKS to address the challenge of running Kubernetes reliably. EKS, which uses upstream Kubernetes, is the most reliable way to run Kubernetes. It is built for redundancy and resiliency: we run the EKS control plane using a cellular architecture, and the Kubernetes control plane and etcd run across multiple AWS Availability Zones. EKS automatically manages the availability and scalability of the Kubernetes control plane nodes and detects and replaces unhealthy control plane nodes for each cluster. EKS has grown rapidly since launch, with customers such as Snap, Intuit, GoDaddy, and Autodesk using it to manage their containers.
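
For a sense of what this looks like in practice, here’s a minimal sketch of creating an EKS cluster with the AWS SDK for Python (boto3), after which AWS runs the Kubernetes control plane across multiple Availability Zones on your behalf. The role ARN, subnets, and version shown are hypothetical placeholders:

```python
import boto3

# Create an EKS cluster; AWS then manages the Kubernetes control plane
# and etcd across multiple Availability Zones. All IDs are hypothetical.
eks = boto3.client("eks", region_name="us-east-1")

eks.create_cluster(
    name="my-cluster",
    version="1.14",  # a Kubernetes version placeholder
    roleArn="arn:aws:iam::123456789012:role/eks-service-role",  # hypothetical
    resourcesVpcConfig={
        "subnetIds": [
            "subnet-0123456789abcdef0",  # hypothetical subnets in
            "subnet-0fedcba9876543210",  # different Availability Zones
        ],
    },
)

# Wait until the control plane is ACTIVE before connecting with kubectl.
waiter = eks.get_waiter("cluster_active")
waiter.wait(name="my-cluster")
```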

Customers are choosing to build on containers on AWS for their most sensitive and mission-critical applications across all industry verticals, including finance, healthcare, government, and retail. These applications include machine learning training and inference, batch jobs, web applications, and line-of-business applications. Customers choose AWS because of the security, scalability, and availability we provide. They like that AWS provides the broadest choice of container services, ensuring that they can use the constructs, APIs, tools, and processes they’re most comfortable with. As I talk to more customers, I am reasonably confident that anyone building a new application today is using either containers or AWS Lambda. Even existing applications deployed on-premises are rapidly being containerized so they can move to the cloud.

As customers build distributed systems, it becomes increasingly complex to manage and observe the communication between services. To address this, customers are adopting service meshes. A service mesh is a dedicated infrastructure layer that decouples service-to-service communication from application logic and manages that communication over the network. In April 2019, we launched AWS App Mesh. Our goal with App Mesh is to provide application-level networking: customers should be able to develop their applications and define how those applications communicate directly in App Mesh. We believe App Mesh will become the principal way in which applications are managed over time.
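
As a minimal sketch of what defining application-level networking through the App Mesh APIs can look like (again using the AWS SDK for Python, boto3), here’s a mesh with a single virtual node. All names, ports, and hostnames are hypothetical:

```python
import boto3

# Define application-level networking: create a mesh, then describe a
# service as a virtual node within it. All names here are hypothetical.
appmesh = boto3.client("appmesh", region_name="us-east-1")

appmesh.create_mesh(meshName="my-mesh")

appmesh.create_virtual_node(
    meshName="my-mesh",
    virtualNodeName="web",
    spec={
        # The port and protocol this service listens on.
        "listeners": [{"portMapping": {"port": 8080, "protocol": "http"}}],
        # How other services discover this node.
        "serviceDiscovery": {
            "dns": {"hostname": "web.my-app.local"}  # hypothetical DNS name
        },
    },
)
```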

We are working on a number of exciting new projects, and the best way to stay up to date is to visit our public roadmap for AWS container services on GitHub. There, you can open issues, stay current with recent launches, and provide your comments and feedback. The engineers, product managers, and developer advocates on the container and App Mesh teams love talking to customers and appreciate your feature requests and feedback.

I’m looking forward to the containers blog. In the coming months, you’ll hear from senior engineers who have built and shaped AWS container services. They are excited to share their insights with you. To be notified when there’s a new post, subscribe to our blog using the RSS feed button at the top of the page. If you have any questions, comments, or topic requests, let us know in the comments.