AWS Partner Network (APN) Blog
Self-Service Platform for Standardized Amazon EKS Deployments Across the Organization
By Ashwinikanth, Solution Architect – Tech Mahindra
By K Jeyashri, AWS SME – Tech Mahindra
By Thooyavan, Sr. Solution Architect – Tech Mahindra
By Amit Kumar and Shonil Kulkarni, Partner Solutions Architect – AWS
Containers provide an easy and portable way to run and deploy application workloads. Many organizations have already adopted container-based architecture to run their application workload in an on-premises environment and benefit from increased agility, workload portability, and predictable deployments.
Amazon Web Services (AWS) provides a secure, reliable, and scalable environment for customers to run their container workloads. Customers running containers on premises are looking to move to AWS to gain agility benefits and reduce technical debt of managing their own infrastructure. AWS services help customers to reduce cost and operational overhead with increased scalability, security, and reliability.
This post describes how Tech Mahindra transitioned a customer from an on-premises self-managed Kubernetes environment to a managed Amazon Elastic Kubernetes Service (Amazon EKS) platform with centralized self-service deployment options using AWS Service Catalog.
Tech Mahindra is an AWS Premier Tier Services Partner with the Migration Competency that specializes in digital transformation, consulting, and re-engineering solutions. Tech Mahindra is also a member of the AWS Managed Cloud Service Provider (MSP) and AWS Well-Architected Partner Programs.
Project Scope and Background
The scope of this project included a migration and transformation journey for a customer who had their application workload running on self-managed Kubernetes in their on-premises data centers.
Multiple business units within the customer’s organization had already deployed their containerized workloads on Kubernetes, and each unit had their own strategy to select the deployment method for their container workload.
The customer designated Tech Mahindra as a lead cloud migration partner to achieve the following objectives:
- Standardize EKS deployment across the organization following AWS best practices and the customer’s compliance and security standards.
- Eliminate the operational complexity of managing the control plane and data plane of an on-premises container platform.
- Enable automated deployment of EKS using infrastructure as code (IaC).
- Provide a self-service model that shields developers from infrastructure complexity, so that application teams and developers can deploy their EKS infrastructure without intervention from the cloud support and infrastructure teams.
High-Level Solution Overview
Amazon EKS gives customers the flexibility to run Kubernetes-based applications on AWS. It helps customers achieve highly available and secure clusters to run workloads, and automates key admin tasks such as patching, node provisioning, and updates.
Amazon EKS also provides a highly available and resilient control plane managed by AWS, which reduces the operational overhead and cost customers would otherwise incur running their own control plane.
Tech Mahindra’s solution included setting up a self-service shared platform for EKS using AWS Service Catalog and AWS CloudFormation, integrated with AWS Control Tower to provide a cohesive platform that’s offered as a service to internal business units.
A standardized EKS solution architecture blueprint was defined based on customer requirements and AWS Well-Architected design principles to be consumed as a service by different business units.
The following architecture diagram depicts how an AWS Service Catalog-based “hub and spoke” model was implemented using AWS Control Tower. The portfolio is created in the management account and shared to the member/linked accounts as Service Catalog products.
Figure 1 – Multi-account “hub and spoke” architecture using AWS Control Tower.
The following architecture diagram depicts Tech Mahindra’s Amazon EKS solution architecture.
Figure 2 – Amazon EKS architecture for application workload.
Detailed Solution Walkthrough
The solution consists of two major components:
- Self-service shared Amazon EKS platform (shown in Figure 1).
- Standardized and modular Amazon EKS blueprint (shown in Figure 2).
The following sections describe both solution components in detail.
Self-Service Shared Amazon EKS Platform
This is a self-service shared platform which enables the organization to centrally provision and manage product catalogs (one or more IT services). It adheres to the organization’s governance and compliance requirements in a self-service consumption model by different application teams within an organization.
AWS Service Catalog is used to provide a self-service shared platform to list products (services or applications) which are available to end users and application teams to consume Amazon EKS. It achieves a high degree of automation with adoption of standardized patterns.
AWS Service Catalog is also used to create a catalog of products that can be used to deploy the EKS patterns and blueprints. A Service Catalog-based “hub and spoke” model was implemented using AWS Control Tower, where a portfolio was created with a collection of products.
Service Catalog products were maintained in the centralized account (hub) under a single portfolio and shared to all of the member accounts (spoke) with fine-grained access control.
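The hub-and-spoke sharing described above can be sketched as a CloudFormation template generated in Python. This is only an illustrative sketch, not the template used in the project: the portfolio name, provider name, and account IDs are hypothetical placeholders, and access control on the spoke side is left out. It relies on the `AWS::ServiceCatalog::Portfolio` and `AWS::ServiceCatalog::PortfolioShare` resource types, with one share per spoke account.

```python
import json


def build_portfolio_template(display_name, provider_name, spoke_account_ids):
    """Return a CloudFormation template (as a dict) that defines a hub
    portfolio and shares it with each spoke account.

    All names and account IDs passed in are hypothetical placeholders.
    """
    resources = {
        "EksPortfolio": {
            "Type": "AWS::ServiceCatalog::Portfolio",
            "Properties": {
                "DisplayName": display_name,
                "ProviderName": provider_name,
            },
        }
    }
    for account_id in spoke_account_ids:
        # One PortfolioShare resource per member (spoke) account.
        resources[f"Share{account_id}"] = {
            "Type": "AWS::ServiceCatalog::PortfolioShare",
            "Properties": {
                "PortfolioId": {"Ref": "EksPortfolio"},
                "AccountId": account_id,
            },
        }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


template = build_portfolio_template(
    "EKS Blueprint", "Platform Team", ["111111111111", "222222222222"]
)
print(json.dumps(template, indent=2))
```

Generating the template from a list of accounts keeps the hub definition in one place, so adding a spoke is a one-line change.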
Standard deployment patterns were made available within a Service Catalog portfolio as 15 different products. Ten of these are mandatory and must be launched in sequence to complete the EKS pattern deployment, after which the customer has a highly available EKS environment that’s ready to run containerized application workloads.
The remaining five products are optional, giving each business unit modular “pick and choose” options to consume according to its requirements, use cases, and the nature of its applications.
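The mandatory-in-sequence and optional split can be sketched as a small launch-plan check. This is a hypothetical helper, not part of the solution described in this post. It assumes that products 1 to 10 are the mandatory ones, that their sequence is their numeric order, and that a product may be launched more than once.

```python
# Assumption: products 1-10 are mandatory and launched in numeric order;
# products 11-15 are optional. The post does not spell out the exact order.
MANDATORY = tuple(range(1, 11))
OPTIONAL = frozenset(range(11, 16))


def validate_launch_plan(plan):
    """Return a list of problems with a proposed launch order (empty if OK)."""
    problems = []

    unknown = sorted({p for p in plan if p not in MANDATORY and p not in OPTIONAL})
    if unknown:
        problems.append(f"unknown products: {unknown}")

    missing = [p for p in MANDATORY if p not in plan]
    if missing:
        problems.append(f"missing mandatory products: {missing}")

    # Repeated launches are allowed, so only the first launch of each
    # mandatory product is checked against the required sequence.
    first_seen = []
    for product in plan:
        if product in MANDATORY and product not in first_seen:
            first_seen.append(product)
    if first_seen != sorted(first_seen):
        problems.append("mandatory products launched out of sequence")

    return problems


print(validate_launch_plan([1, 2, 3, 4, 5, 6, 7, 7, 8, 9, 10, 12]))
```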
Below is the list of AWS Service Catalog products created under one portfolio and the function of each product.
Figure 3 – AWS Service Catalog portfolio and product details.
- Product 1 (Amazon EKS cluster role): Product uses AWS CloudFormation StackSets to create a role for Amazon EKS cluster provisioning. An AWS Identity and Access Management (IAM) role for cluster provisioning is added to the Kubernetes role-based access control (RBAC) authorization table as the administrator.
- Product 2 (Amazon EKS cluster creation): Product uses StackSets to launch an EKS cluster in the account.
- Product 3 (Amazon EFS creation): Product uses StackSets to create an Amazon Elastic File System (Amazon EFS) for persistent volume storage for the EKS cluster.
- Product 4 (EKS cluster management server creation): Product uses StackSets to create an Amazon Elastic Compute Cloud (Amazon EC2) instance which will act as a management node to execute automation scripts on the EKS cluster. An IAM role with EKS admin privileges is also attached as an instance profile to this EC2 management node to provide the required privileges to execute automation scripts and connect to the cluster using the kubectl utility.
- Product 5 (Automation using AWS Systems Manager): Product uses StackSets to create AWS Systems Manager documents to automate the Kubernetes resource deployment and trigger an AWS Lambda function to execute the Systems Manager documents.
- Product 6 (CNI deployment): Product uses StackSets to deploy the custom container network interface (CNI) plugin in the EKS cluster for container communication.
- Product 7 (Amazon EKS managed group creation): Product uses StackSets to create an EKS-managed group of EC2 nodes. This product was configured to be launched multiple times to create an EC2 managed node group when required.
- Product 8 (Resource deployment using AWS Systems Manager): Product uses StackSets to execute Systems Manager documents to deploy Kubernetes resources such as ingress controller, external DNS, Amazon EFS CSI controller, cluster autoscaler, metrics server, Amazon CloudWatch Container Insights, Prometheus server, and Grafana.
- Product 9 (Namespace creation): Product uses StackSets to create a namespace along with its resource quota (CPU limit, CPU request, memory limit, memory request), network policies, and roles and role bindings for admin and read-only users within the namespace. This product was configured to be launched multiple times as per requirements.
- Product 10 (Amazon ECR repository creation): Product uses StackSets to create an Amazon Elastic Container Registry (Amazon ECR) repository for container images. This product was configured to be launched multiple times as per requirements.
- Product 11 (AWS Fargate NGINX ingress controller): Product uses StackSets to deploy the NGINX ingress controller for applications that need to be deployed on AWS Fargate. This is an optional product and must be launched only if there’s a requirement to run applications in a Fargate profile.
- Product 12 (AWS Fargate profile creation): Product uses StackSets to create an AWS Fargate profile to run the application workload. This product was configured to be launched multiple times as per requirements.
- Product 13 (Amazon API Gateway): Product uses StackSets to create an Amazon API Gateway for applications that require secure API access.
- Product 14 (AWS Certificate Manager Private CA): Product uses StackSets to deploy the AWS Certificate Manager Private Certificate Authority (CA) issuer plugin in the EKS cluster for applications that require certificates for secure communications.
- Product 15 (Amazon EKS management server recovery): Product uses StackSets to recover the EC2 management server in case of server failure or downtime. It uses a backup Amazon Machine Image (AMI) of the management server for recovery.
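As an illustration of what the namespace product (Product 9) provisions, the sketch below builds a Kubernetes `Namespace` and a matching `ResourceQuota` as plain Python dictionaries. It is a minimal sketch under stated assumptions: the namespace name and quota values are hypothetical, and the network policies, roles, and role bindings that the product also creates are omitted.

```python
import json


def namespace_with_quota(name, cpu_request, cpu_limit, mem_request, mem_limit):
    """Return a Kubernetes List holding a Namespace and its ResourceQuota.

    The quota caps the four values named for Product 9: CPU request,
    CPU limit, memory request, and memory limit.
    """
    namespace = {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": name},
    }
    quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{name}-quota", "namespace": name},
        "spec": {
            "hard": {
                "requests.cpu": cpu_request,
                "limits.cpu": cpu_limit,
                "requests.memory": mem_request,
                "limits.memory": mem_limit,
            }
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [namespace, quota]}


# Hypothetical values for a single team namespace.
manifests = namespace_with_quota("team-a", "4", "8", "8Gi", "16Gi")
print(json.dumps(manifests, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, since kubectl accepts JSON manifests as well as YAML.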
Standardized and Modular EKS Blueprint
This is the second key component of the proposed solution which is based on Amazon EKS. The architecture includes the following key components:
- Amazon EKS control plane: A Kubernetes control plane managed by AWS runs inside an Amazon Virtual Private Cloud (Amazon VPC) and is designed to eliminate any single point of failure that may compromise the availability and durability of the control plane. AWS handles the operational complexity of deploying, operating, and upgrading the control plane, reducing the risk and effort of running a self-managed cluster.
- Amazon EKS data plane: Compute capacity to run the application workload was provisioned using an EKS-managed node group, which helps to automate the provisioning and lifecycle management of the data plane. A managed node group was provisioned across AWS Availability Zones (AZs) to provide high availability.
- Cluster autoscaler: The cluster autoscaler was implemented to ensure the EKS cluster has the required number of nodes to run application workloads. It adds nodes when pods cannot be scheduled because of insufficient capacity and removes nodes that remain underutilized.
- Persistent volume: The application workload running on EKS required data persistence. Amazon EFS was defined as the storage class for persistent volume claims. The EFS container storage interface (CSI) driver was configured to allow Kubernetes clusters running on AWS to manage the lifecycle of EFS file systems.
- Networking:
  - VPC and subnet: The EKS cluster was deployed in one AWS Region in a VPC with subnets across AZs. Communication between the nodes and the control plane was established through network interfaces in the subnets. A cluster security group was designed to allow all traffic from the control plane and managed node groups. Every node group security group must have an inbound rule that allows all traffic from the EKS cluster security group.
  - Pod networking: Calico, WeaveNet, Cilium, and Amazon VPC CNI options were evaluated for the network CNI plugin. Calico was selected for pod networking, mainly because the customer had a limited CIDR range available and needed to use IP addresses efficiently. Calico can define a separate private pool of IPs for high-performance, scalable pod networking, and it also supports security policy enforcement through network policies.
  - Network Load Balancer with NGINX ingress controller: Kubernetes ingress is an API object that provides a collection of routing rules that govern how users access Kubernetes services running in a cluster. The NGINX ingress controller was deployed to listen to all of the ingress events from the namespaces and add corresponding directives and rules into the NGINX configuration file. This made it possible to use a centralized routing file which includes all of the ingress rules, hosts, and paths. With the NGINX ingress controller, multiple ingress objects for different environments or namespaces could share the same Network Load Balancer and frontend ingress controller.
- Cluster authentication: Amazon EKS uses IAM to authenticate users to the Kubernetes cluster and relies on native Kubernetes RBAC for authorization. All permissions for interacting with the EKS cluster’s Kubernetes API are managed through the native Kubernetes RBAC system.
- Logging and monitoring: Amazon EKS control plane logging provides audit and diagnostic logs directly from the EKS control plane to Amazon CloudWatch Logs in the account. The required log types were enabled and sent as log streams for each EKS cluster in CloudWatch. Deployment of CloudWatch Container Insights for monitoring the EKS cluster was also automated as part of AWS Service Catalog.
  - Prometheus is an open-source application used for metrics-based monitoring and alerting. It calls out to your application, pulls real-time metrics, and compresses and stores them in a time-series database. Prometheus offers a powerful data model and query language and can provide detailed and actionable metrics.
  - Grafana is an open-source multiplatform analytics and visualization web application that acts as a single pane of glass for displaying all of the metric data from Prometheus. AWS-managed Prometheus and Grafana were included in Service Catalog as an optional product to provide additional monitoring, alerting, and visualization capabilities.
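To make the pod networking reasoning above concrete, the sketch below builds a Calico `IPPool` manifest and works out how many pod addresses a separate private pool provides. It is a hedged illustration only: the pool name and the `100.64.0.0/16` range are hypothetical (the post does not state the range used), IPv4 is assumed, and VXLAN encapsulation is an assumption rather than a documented choice.

```python
import ipaddress
import json


def calico_ip_pool(name, cidr, block_size=26):
    """Return a Calico IPPool manifest and the capacity of the pool.

    block_size is the prefix length of the address blocks that Calico
    allocates to nodes (26 is Calico's IPv4 default). IPv4 is assumed.
    """
    network = ipaddress.ip_network(cidr)
    if block_size < network.prefixlen:
        raise ValueError("block size cannot be larger than the pool itself")

    manifest = {
        "apiVersion": "projectcalico.org/v3",
        "kind": "IPPool",
        "metadata": {"name": name},
        "spec": {
            "cidr": str(network),
            "blockSize": block_size,
            "vxlanMode": "Always",  # assumption: overlay networking
            "natOutgoing": True,
        },
    }
    capacity = {
        "pod_addresses": network.num_addresses,
        "blocks": 2 ** (block_size - network.prefixlen),
        "addresses_per_block": 2 ** (32 - block_size),
    }
    return manifest, capacity


manifest, capacity = calico_ip_pool("pod-pool", "100.64.0.0/16")
print(json.dumps(manifest, indent=2))
print(capacity)
```

Because pods draw addresses from this pool rather than from the VPC subnets, the VPC CIDR only has to cover nodes and other VPC resources.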
Customer Benefits
- Reduction of deployment cycle and service creation time from days to minutes.
- A common self-service Amazon EKS deployment pattern that is standardized across the organization, incorporates best practices, and is in line with compliance and security requirements.
- Reduction in AWS infrastructure cost.
- Increased reliability and scalability as compared to on-premises data center.
- Shared service platform that automatically builds security and operational best practices into every application across business units.
- Reduced operational overhead, cost, and risk associated with managing the Kubernetes control plane.
- A standardized approach to monitoring that eases troubleshooting of issues.
Conclusion
Customers running workloads on Kubernetes in their on-premises environment have been challenged with control plane complexity, lack of scalability, lack of resiliency, operational overhead, and high cost.
With Tech Mahindra’s expert guidance, customers can leverage the benefits of the AWS Cloud and transition their on-premises Kubernetes workload to Amazon EKS with the implementation of a shared self-service platform.
Tech Mahindra – AWS Partner Spotlight
Tech Mahindra is an AWS Premier Tier Services Partner and MSP that specializes in digital transformation, consulting, and business re-engineering solutions.