AWS Architecture Blog

Use Amazon EKS and Argo Rollouts for Progressive Delivery

A common hurdle in DevOps strategies is the manual testing, sign-off, and deployment work required to deliver new or enhanced feature sets. If an application is updated frequently, these actions can be time-consuming and error-prone. You can address these challenges by incorporating progressive delivery concepts along with the Amazon Elastic Kubernetes Service (Amazon EKS) container platform and Argo Rollouts.

Progressive delivery deployments

Progressive delivery is a deployment paradigm in which new features are gradually rolled out to an expanding set of users. Real-time measurements of key performance indicators (KPIs) enable deployment teams to measure the customer experience. These measurements can detect any negative impact during deployment and trigger an automated rollback before it affects a larger group of users. Because predefined KPIs are being measured, the rollout can continue autonomously, alleviating the bottleneck of manual approvals and rollbacks.

These progressive delivery concepts can be applied to common deployment strategies such as blue/green and canary deployments. In a blue/green deployment, two identical environments are created. One environment (blue) runs the current application version, while the other environment (green) runs the new version. This enables teams to test the new environment and shift application traffic to the green environment once it is validated. Canary deployments slowly release your new application version into the production environment so that you can build confidence while it is being deployed. Gradually, the new version replaces the current version in its entirety.

Using Kubernetes, you already can perform a rolling update, which can incrementally replace your resource’s Pods with new ones. However, you have limited control of the speed of the rollout, and can’t automatically revert a deployment. KPIs are also difficult to measure in this scenario, resulting in more manual work validating the integrity of the deployment.
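To make the limitation concrete, the only native knobs a Kubernetes Deployment exposes for a rolling update are surge and unavailability settings, sketched below (the application name and image are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app          # illustrative name
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most 1 extra Pod during the update
      maxUnavailable: 0    # never drop below the desired replica count
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
      - name: demo-app
        image: example/demo-app:v2   # illustrative image
        ports:
        - containerPort: 8080
```

These settings control only how many Pods are replaced at a time; there is no built-in way to pause on a metric, shift a percentage of traffic, or roll back automatically.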

To exert more granular control over the deployments, a progressive delivery controller such as Argo Rollouts can be implemented. By using a progressive delivery controller in conjunction with AWS services, you can tune the speed of your deployments and measure your success with KPIs. During the deployment, Argo Rollouts will query metric providers such as Prometheus to perform analysis. (You can find the complete list of the supported metric providers at Argo Rollouts.) If there is an issue with the deployment, automatic rollback actions can be taken to minimize any type of disruption.
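As a sketch of how this metric analysis is wired up, the following Argo Rollouts AnalysisTemplate queries Prometheus for an HTTP success rate; the Prometheus address, metric names, and thresholds are illustrative assumptions:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate       # referenced by a Rollout's analysis steps
spec:
  args:
  - name: service-name     # supplied by the Rollout at run time
  metrics:
  - name: success-rate
    interval: 1m           # evaluate the query every minute
    count: 5               # run 5 measurements in total
    successCondition: result[0] >= 0.95   # KPI: >= 95% 2xx responses
    failureLimit: 1        # one failed measurement aborts the rollout
    provider:
      prometheus:
        address: http://prometheus.monitoring:9090   # illustrative endpoint
        query: |
          sum(rate(http_requests_total{service="{{args.service-name}}",status=~"2.."}[5m]))
          /
          sum(rate(http_requests_total{service="{{args.service-name}}"}[5m]))
```

If any measurement violates the `successCondition` more often than `failureLimit` allows, Argo Rollouts marks the analysis as failed and rolls the deployment back.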

Using blue/green deployments for progressive delivery

Blue/green deployments provide zero downtime and the ability to test your application in production without impacting the stable version. In a typical blue/green deployment on EKS using Kubernetes-native resources, a new deployment containing the new feature version is spun up in parallel with the stable deployment (see Figure 1). The new deployment is then tested by a QA team.

Figure 1. Blue/green deployment in progress

Once all the tests have been successfully conducted, the traffic must be directed from the live version to the new version (Figure 2). At this point, all live traffic is funneled to the new version. If there are any issues, a rollback can be conducted by swapping the pointer back to the previous stable version.

Figure 2. Blue/green deployment post-promotion

As this process shows, a blue/green deployment involves several manual interactions and decisions. Using Argo Rollouts, you can replace these manual steps with automation: it automatically creates a preview service for testing the green version. With Argo Rollouts, test metrics can be captured by using a monitoring service, such as Amazon Managed Service for Prometheus (Figure 3).

Prometheus is monitoring software that can be used to collect metrics from your application. With PromQL (Prometheus Query Language), you can write queries to obtain KPIs. These KPIs can then be used to define the success or failure of the deployment. An Argo Rollouts deployment includes stages to analyze the KPIs before and after promoting the new version. During the prePromotionAnalysis stage, you can validate the new version using the preview endpoint; this stage can run smoke tests or integration tests. Once the desired KPIs are met, live traffic is routed to the new version. In the postPromotionAnalysis stage, you can verify KPIs from the production environment. Failure of the KPIs during either analysis stage will automatically abort the deployment and revert traffic to the previous stable version.
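A minimal sketch of a blue/green Rollout wired to these two analysis stages might look like the following; the service names and the referenced analysis templates are illustrative assumptions:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: demo-app           # illustrative name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
      - name: demo-app
        image: example/demo-app:v2   # the new (green) version
  strategy:
    blueGreen:
      activeService: demo-app-active     # receives live traffic
      previewService: demo-app-preview   # exposes green for testing
      autoPromotionEnabled: false        # promote only after analysis
      prePromotionAnalysis:              # runs against the preview endpoint
        templates:
        - templateName: smoke-tests      # illustrative AnalysisTemplate
        args:
        - name: service-name
          value: demo-app-preview
      postPromotionAnalysis:             # runs after live traffic shifts
        templates:
        - templateName: success-rate     # illustrative AnalysisTemplate
        args:
        - name: service-name
          value: demo-app-active
```

If either analysis fails, Argo Rollouts points the active service back at the previous ReplicaSet, which is the automated equivalent of swapping the pointer back in Figure 2.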

Figure 3. Blue/green deployment using KPIs

Using canary deployment for progressive delivery

Unlike in a blue/green deployment strategy, in a canary deployment a subset of the traffic is gradually shifted to the new version in your production environment. Since the new version is being deployed in a live environment, feedback can be obtained in real-time, and adjustments can be made accordingly (Figure 4).

Figure 4. An example of a canary deployment

Argo Rollouts supports integration with an Application Load Balancer to manipulate the flow of traffic to different versions of an application. Argo Rollouts can gradually and automatically increase the amount of traffic to the canary service at specific intervals. You can also fully automate the promotion process by using KPIs and metrics from Prometheus, as discussed in the blue/green strategy. The analysis runs while the canary deployment is progressing: as the KPIs continue to succeed, traffic to the canary service is gradually increased, while any failure aborts the deployment and stops routing live traffic to the canary.
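The interval-based traffic shifting described above can be sketched as a canary Rollout strategy; the weights, pause durations, service names, and Ingress name below are illustrative assumptions:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: demo-app           # illustrative name
spec:
  strategy:
    canary:
      canaryService: demo-app-canary   # receives the canary share
      stableService: demo-app-stable   # receives the remaining traffic
      trafficRouting:
        alb:                           # Application Load Balancer integration
          ingress: demo-app-ingress    # illustrative ALB Ingress resource
          servicePort: 80
      steps:
      - setWeight: 10                  # send 10% of traffic to the canary
      - pause: {duration: 5m}          # hold while metrics accumulate
      - analysis:                      # evaluate KPIs before continuing
          templates:
          - templateName: success-rate # illustrative AnalysisTemplate
      - setWeight: 50                  # widen the canary to 50%
      - pause: {duration: 5m}
      - setWeight: 100                 # full promotion
```

Each `setWeight` step adjusts the ALB target-group weights, so traffic shifts happen at the load balancer rather than by scaling Pods.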

Conclusion

Implementing progressive delivery in your application can help you deploy new versions of applications with confidence. This approach mitigates the risk of rolling out new application versions by providing visibility into live error rate and performance. You can measure KPIs and automate rollout and rollback procedures. By leveraging Argo Rollouts, you can have more granular control over how your application is deployed in an EKS cluster.

For additional information on progressive delivery or Argo Rollouts, see the Argo Rollouts documentation.

Srinivasa Shaik

Srinivasa Shaik is a Solutions Architect based in Boston. He works with enterprise customers to architect and design solutions for their business needs. His core areas of focus are containers, serverless, and machine learning. In his spare time, he enjoys spending time with his family, cooking, and traveling.

Matt Aylward

Matt Aylward is a Solutions Architect at Amazon Web Services (AWS) who is dedicated to creating simple solutions to solve complex business challenges. Prior to joining AWS, Matt worked on automating infrastructure deployments, orchestrating disaster recovery testing, and managing virtual application delivery. In his free time, he enjoys spending time with his family, watching movies, and going hiking with his adventurous dog.

Sasi Jayalekshmi

Sasi is a Senior Solutions Architect with AWS based out of Boston, MA. She works with enterprise customers to achieve their business goals through cloud adoption and hybrid architecture solutions. Her current interests are container technologies and data & analytics. She holds a Master's in Computer Science and a Bachelor's in Electronics.