How Condé Nast modernized its container platform on Amazon Elastic Kubernetes Service
This post was co-written with Emily Atkinson, Senior Engineering Manager at Condé Nast.
About Condé Nast
Condé Nast is a global media company, home to iconic brands including Vogue, GQ, AD, Condé Nast Traveler, Vanity Fair, Wired, The New Yorker, Glamour, Allure, Bon Appétit, Self, and many more. In 2014, Condé Nast started its journey into AWS and has since adopted a wide range of AWS services to support its fast-growing digital business. Condé Nast products and services running on AWS help it reach over 1 billion consumers across print, digital, video, and social media.
In this post, we discuss how Condé Nast modernized its container platform with Amazon Elastic Kubernetes Service (Amazon EKS) to support its growth and to improve operational efficiency and developer experience. We’ll discuss the previous architecture running on self-managed Kubernetes, the key drivers for modernization, the new architecture, the migration process, and the benefits achieved.
Container platform: Previous architecture
Condé Nast operated its container platform on self-managed Kubernetes on Amazon Elastic Compute Cloud (Amazon EC2). The Kubernetes control plane (i.e., the API server, controller manager, and etcd) ran on Amazon EC2 instances. The team also maintained the AMIs for both the control plane and the data plane. From an operational standpoint, the team maintained the open-source Kubernetes platform along with the necessary add-ons, identified and patched security vulnerabilities, and was responsible for integration with the AWS ecosystem. Condé Nast used open-source tools such as kube2iam to manage AWS Identity and Access Management (AWS IAM) credentials so that applications could authenticate with AWS services. Condé Nast also used a content delivery network to connect users across the globe with the different brand applications.
Key drivers to modernize the container platform
In 2020, Condé Nast undertook the biggest technological transformation in its history, aiming to expand both its operations and its reach by diversifying the digital business models across all brands. A top business priority was to migrate and consolidate to a more centralized set of technologies, with capabilities easily used by any brand or product experience. Prior to the modernization effort, engineering teams deployed applications on self-managed Kubernetes clusters, each with its own development ecosystem and with differing operational and deployment models, Kubernetes versions, and access patterns. During the consolidation planning, Condé Nast analyzed the repetitive engineering effort required to build, secure, monitor, and upgrade Kubernetes clusters across various teams and realized the need for standardization to derive both technical and business value. Condé Nast then assessed the value of rebuilding and continuing to run a new set of self-managed Kubernetes clusters versus adopting Amazon EKS.
How Amazon EKS solves the problem
Amazon Elastic Kubernetes Service (Amazon EKS) is a managed Kubernetes service to run Kubernetes in the AWS cloud and on-premises data centers. Amazon EKS provides a scalable and highly available control plane running across multiple Availability Zones (AZs), which removed a significant operational burden from the Condé Nast teams that had maintained multiple control planes across different clusters. Once Condé Nast had made the decision to migrate to Amazon EKS, they also made the following design decisions:
- VPC CNI plugin to assign IP addresses from the VPC to pods
- AWS Load Balancer Controller, which creates load balancer target groups based on Kubernetes Ingress rules
- Securely connecting Kubernetes pods to AWS services using AWS IAM Roles for Service Accounts (IRSA)
- Using a Fluent Bit sidecar to send logs to a centralized logging system such as Datadog
- Condé Nast uses Bottlerocket, a Linux-based open-source operating system purpose-built by Amazon Web Services for running containers. Bottlerocket includes only the essential software required to run containers, which reduces the attack surface of the underlying nodes.
- Kubernetes pods are deployed with read-only access and no root filesystem access
- Constructs such as Cluster Autoscaler and AWS Node Termination Handler ensure that Condé Nast can seamlessly scale in and scale out depending on user traffic
- Read-only access for users via the Kubernetes Dashboard
- Using AWS Secrets Manager integration to store secrets securely
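To make the IRSA decision in the list above more concrete, here is a minimal Python sketch of the two pieces that IRSA ties together: an IAM role trust policy scoped to one Kubernetes service account, and a ServiceAccount manifest annotated with that role. The account ID, OIDC provider ID, Region, namespace, service account, and role names are hypothetical placeholders, not values from Condé Nast's platform, and the helper functions are illustrative rather than part of any AWS SDK.

```python
import json


def irsa_trust_policy(account_id, oidc_provider, namespace, service_account):
    """Build an IAM trust policy that only the given service account can use.

    oidc_provider is the cluster's OIDC issuer without the https:// prefix.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Restrict the role to a single namespace/service account pair.
                        f"{oidc_provider}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


def annotated_service_account(name, namespace, role_arn):
    """Build a ServiceAccount manifest that points pods at the IAM role."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {"eks.amazonaws.com/role-arn": role_arn},
        },
    }


if __name__ == "__main__":
    # All identifiers below are hypothetical placeholders.
    provider = "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE"
    policy = irsa_trust_policy("111122223333", provider, "example-app", "example-sa")
    account = annotated_service_account(
        "example-sa", "example-app", "arn:aws:iam::111122223333:role/example-app-role"
    )
    print(json.dumps(policy, indent=2))
    print(json.dumps(account, indent=2))
```

Scoping the `sub` condition to one namespace and service account means each application receives only its own role, in contrast to node-level credential brokering with a tool such as kube2iam.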
Condé Nast implemented multi-tenancy across the applications on its platform using Kubernetes namespaces. Resource quotas and limits keep applications from creating noisy-neighbor problems. Compared with the previous architecture, using a namespace per application reduced cost and operational overhead.
Migrating from self-managed Kubernetes clusters to Amazon EKS also hardened the security posture for Condé Nast. With Amazon EKS, security is a shared responsibility. AWS is responsible for managing the Amazon EKS Kubernetes control plane. This includes the Kubernetes control plane nodes, the etcd database, and other infrastructure necessary for AWS to deliver a secure and reliable service. Using Amazon EKS also helped Condé Nast save time spent on generating operational runbooks and running tabletop exercises for control plane components.
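As a rough illustration of namespace-based tenancy with quotas and limits, the following Python sketch generates a Namespace, a ResourceQuota, and a LimitRange for one application. The application name and every numeric value are assumptions chosen for the example; the post does not state the quotas Condé Nast actually uses.

```python
import json


def tenant_manifests(app, cpu, memory, max_pods):
    """Return Namespace, ResourceQuota, and LimitRange manifests for one app."""
    namespace = {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": app},
    }
    quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{app}-quota", "namespace": app},
        "spec": {
            # Caps on the total resources all pods in the namespace may claim.
            "hard": {
                "requests.cpu": cpu,
                "requests.memory": memory,
                "limits.cpu": cpu,
                "limits.memory": memory,
                "pods": str(max_pods),
            }
        },
    }
    limit_range = {
        "apiVersion": "v1",
        "kind": "LimitRange",
        "metadata": {"name": f"{app}-limits", "namespace": app},
        "spec": {
            "limits": [
                {
                    "type": "Container",
                    # Defaults applied to containers that declare nothing (assumed values).
                    "defaultRequest": {"cpu": "250m", "memory": "256Mi"},
                    "default": {"cpu": "500m", "memory": "512Mi"},
                }
            ]
        },
    }
    return [namespace, quota, limit_range]


if __name__ == "__main__":
    # "example-app" and the sizes are hypothetical.
    for manifest in tenant_manifests("example-app", "8", "16Gi", 40):
        print(json.dumps(manifest, indent=2))
```

The quota bounds what a whole namespace can consume, while the limit range supplies per-container defaults so that pods without explicit requests still count against the quota.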
Container platform: New architecture
Condé Nast deployed applications on managed node groups across AZs and used Amazon EKS add-ons such as the VPC CNI plugin and CoreDNS, along with IRSA, to integrate natively with the AWS ecosystem. Condé Nast also used edge services such as Amazon CloudFront and AWS Global Accelerator to serve cacheable content locally and to use the AWS global backbone network for low-latency access for Condé Nast users. AWS Global Accelerator is configured with health checks to redirect users to the closest healthy endpoint in case of regional or application failures. Within each region, Condé Nast deploys a production and a non-production Amazon EKS cluster.
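The "closest healthy endpoint" behavior can be pictured with a small toy model in Python. This is only a conceptual sketch: it is not how AWS Global Accelerator is implemented or configured, and the regions and latency figures are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    region: str
    latency_ms: float  # assumed latency from a given user to this endpoint
    healthy: bool      # result of the most recent health check


def pick_endpoint(endpoints):
    """Return the lowest-latency endpoint among those passing health checks."""
    healthy = [e for e in endpoints if e.healthy]
    if not healthy:
        raise RuntimeError("no healthy endpoints available")
    return min(healthy, key=lambda e: e.latency_ms)


if __name__ == "__main__":
    # Hypothetical view from one user: the nearest region is failing its checks.
    endpoints = [
        Endpoint("eu-west-1", 20.0, healthy=False),
        Endpoint("us-east-1", 85.0, healthy=True),
        Endpoint("ap-southeast-1", 190.0, healthy=True),
    ]
    print(pick_endpoint(endpoints).region)
```

In this model, a regional or application failure simply marks an endpoint unhealthy, and the user is served from the next-closest healthy region.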
Migration process
Condé Nast implemented a collective responsibility model between the global platform team and application teams to migrate applications from existing self-managed Kubernetes clusters to Amazon EKS.
Condé Nast’s global platform team is responsible for maintaining the global Amazon EKS platform, securing the clusters, managing authentication, providing the development and deployment experience through Continuous Integration/Continuous Deployment (CI/CD) tooling, and enabling Site Reliability Engineering (SRE) practices.
The development teams consist of product engineering and data engineering teams. They’re responsible for developing applications to meet business requirements and for validating the functionality and user experience of each application.
The cloud platform team enables the development teams to deploy their applications on the global cloud platform. The cloud platform team pairs with the application teams to provide guardrails, best practices, and recommendations for using the platform.
The flow chart below describes the migration process. Condé Nast followed a blue-green deployment strategy to migrate applications from the existing platform to the global cloud platform.
Once the application on the global cloud platform starts receiving traffic and the application teams have validated its functionality and user experience, user traffic is directed entirely to the global cloud platform.
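The cutover described above can be sketched as a staged traffic shift between the existing platform (blue) and the global cloud platform (green), with a validation gate at each stage. This is a minimal illustration in Python; the stage percentages and the `validate` callback are assumptions, since the post does not say how traffic weights were chosen or which tooling performed the shift.

```python
def shift_traffic(green_steps, validate):
    """Walk through green traffic percentages, rolling back to blue on failure.

    green_steps: increasing percentages of traffic sent to the new platform.
    validate:    callable taking the current green percentage and returning
                 True if functional and user-experience checks pass.
    Returns the list of (blue_percent, green_percent) weights that were applied.
    """
    applied = []
    for green in green_steps:
        applied.append((100 - green, green))
        if not validate(green):
            # Validation failed: send all traffic back to the existing platform.
            applied.append((100, 0))
            return applied
    return applied


if __name__ == "__main__":
    # Hypothetical stages; a real validation would run application checks.
    for blue, green in shift_traffic([10, 50, 100], lambda green: True):
        print(f"blue={blue}% green={green}%")
```

Keeping the blue environment intact until the final stage is what makes the rollback branch cheap: returning to the previous platform is only a change in weights.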
Achieved outcomes
- Higher operational efficiency: Prior to the consolidation, Condé Nast had multiple small teams maintaining Kubernetes clusters across various markets, each interacting differently with the developer teams responsible for deploying and maintaining the applications on these clusters. Condé Nast merged the teams maintaining the clusters into a single global cloud platform team. Amazon EKS helped the global cloud platform team avoid the undifferentiated heavy lifting of managing multiple Kubernetes clusters across different markets.
- Improved developer agility: Development teams can spin up new clusters with guardrails based on specific requirements, leading to increased developer velocity.
- Improved latency for end users: Condé Nast observed that end-user latency was reduced by up to 50% by using AWS Global Accelerator and Amazon CloudFront distributions. End-user requests were also directed to the closest geographic locations hosting the content.
Conclusion
In this post, we showed you how Condé Nast improved operational efficiency while meeting business outcomes. Migration from self-managed Kubernetes to Amazon EKS was a transformational step in that direction. It involved cross-functional collaboration and planning between application teams, the global platform team, and business leaders. Consolidating to a more centralized set of technologies and capabilities that could be easily used by any brand or product experience team within Condé Nast helped achieve top business priorities and harness the agility of the cloud.