Blue/Green Kubernetes upgrades for Amazon EKS Anywhere using Flux

Introduction

Amazon EKS Anywhere (Amazon EKS-A) allows customers to run containerized workloads on customer-managed hardware. Amazon EKS-A cluster upgrades are performed in place using a rolling process (similar to Kubernetes Deployments). Upgrades can only happen one minor version at a time (e.g., version 1.20 to 1.21), and control plane components are upgraded before worker nodes. As Kubernetes continues to evolve, it’s important to keep your clusters up to date with the latest security enhancements, features, and bug fixes. For more details on the upgrade process, see the Amazon EKS-A documentation.

By default Amazon EKS-A cluster upgrades use an in-place rolling process, but in certain scenarios an alternative strategy may be preferred. The blue/green deployment method allows the customer to shift traffic between two nearly identical environments that are running different versions of Kubernetes. Customers may prefer a blue/green Amazon EKS-A upgrade strategy for a few reasons:

To reduce risk during in-place Kubernetes or platform component upgrades
To rollback easily to an older version by shifting traffic back
As a disaster recovery pattern (e.g., active-active, active-passive)
As a migration strategy to Amazon EKS in-region
To move a platform from older to newer hardware or software versions
To test applications with a new version of Kubernetes in a safe, non-production environment

In a blue/green approach there is one environment (blue) running the current application and platform version and one environment (green) running the new application or platform version. Once the green environment has been validated, traffic can be switched from the old environment to the new. Using Flux, which is bundled with Amazon EKS Anywhere, you can implement the blue/green pattern for Amazon EKS-A cluster upgrades. Flux is a GitOps operator for Kubernetes that syncs your cluster state with a Git repository.

This post details a solution for achieving blue/green Kubernetes platform deployments on Amazon EKS-A deployed on vSphere using the bundled Flux controller. It also discusses further design considerations for the pattern.

Solution overview

In this solution, two Amazon EKS-A clusters (workload cluster A and workload cluster B) are provisioned with Flux, MetalLB, and Emissary Ingress controller installed. Workload cluster A represents the blue cluster and workload cluster B represents the green environment.

The following diagram shows the solution:

[Diagram: workload cluster A (blue) and workload cluster B (green), each provisioned with Flux, MetalLB, and Emissary Ingress controller.]

Emissary Ingress controller is used to control layer 7 routing to Kubernetes services. MetalLB is a load balancer controller, which is a replacement for cloud-based load balancers. When you create Kubernetes services of type LoadBalancer, the MetalLB controller provides a static IP address that can be used to route traffic to the Emissary Ingress controller. Emissary and MetalLB can be installed via Amazon EKS-A curated packages. Amazon EKS-A curated packages are trusted, up-to-date, and compatible software supported by Amazon that extend your EKS-A cluster’s functionality while reducing the need for multiple vendor support agreements.

When Flux is enabled, the Amazon EKS-A cluster configuration is stored in Git and version controlled. The Flux controller works in the following way:

A platform operator or automation tool commits a new Kubernetes configuration change to an environment repository, which in this blog is a Git repository on GitHub.
Flux detects changes to the environment repository and syncs the Kubernetes manifests to the Kubernetes cluster.

Flux makes it easy to version control and apply cluster configurations. For a production process, it’s recommended to use a Continuous Integration (CI) pipeline that validates any Kubernetes manifest changes before pushing to the environment Git repository. This minimizes accidental configuration changes that could impact the availability of your workloads running on the Amazon EKS-A cluster.
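As an illustration of the validation step, the following is a minimal sketch of a pre-push check, assuming the CI runner has kubectl installed and can reach a cluster for schema discovery. The function name validate_manifests and the VALIDATE_CMD override are hypothetical names introduced for this example; they are not part of Flux or Amazon EKS-A. The default validator is a kubectl client-side dry run, which parses each manifest without persisting anything.

```shell
# Hypothetical pre-push check: run a validator over every YAML manifest in a directory.
# VALIDATE_CMD is an assumption of this sketch; it defaults to a kubectl client-side dry run.
validate_manifests() {
  dir="$1"
  cmd="${VALIDATE_CMD:-kubectl apply --dry-run=client -f}"
  failed=0
  # Manifest paths are assumed to contain no whitespace.
  for file in $(find "$dir" -type f \( -name '*.yaml' -o -name '*.yml' \)); do
    if ! $cmd "$file" >/dev/null 2>&1; then
      echo "invalid manifest: $file" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Usage in a pipeline step (run from the root of the environment repository):
# validate_manifests ./clusters || exit 1
```

A pipeline could run this before allowing a merge, so that a malformed manifest never reaches the branch that Flux watches.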

Walkthrough

Prerequisites

Administrative machine with the following tools installed:
- Helm
- eksctl
- eks-anywhere (v0.12.0 or later for curated packages support)
- kubectl
- curl
- Git
vSphere client credentials with administrator permissions
Follow setup authentication to use Amazon EKS Anywhere Curated Packages to create an IAM user with required ECR access. Amazon EKS Anywhere Curated Packages are only available to customers with the Amazon EKS Anywhere Enterprise Subscription. To request a free trial, talk to your Amazon representative or connect with one here.
A GitHub Personal Access Token (PAT) to access your provided GitHub repository. It must be scoped for all repo permissions with an SSH key set-up for cloning a repository.

Set-up environment

First, set up the environment by providing the vSphere username and password, GitHub token, and AWS keys for fetching Amazon EKS-A curated packages.

# EKSA Environment variables
export EKSA_VSPHERE_USERNAME=<Your-vSphere-Username>
export EKSA_VSPHERE_PASSWORD=<Your-vSphere-Password>

# GitHub personal access token
export EKSA_GITHUB_TOKEN=<Your-github-personal-access-token>

# Curated Packages
AWS_ACCESS_KEY_ID=<Your-AWS-Access-Key>
AWS_SECRET_ACCESS_KEY=<Your-AWS-Secret-Access>
export EKSA_AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID
export EKSA_AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY
export AWS_REGION=us-west-2 # Region of Availability for Curated Packages.

# Clone the containers-blog-maelstrom GitHub repo, which contains the sample manifests used in this post
git clone https://github.com/aws-samples/containers-blog-maelstrom.git
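Because the later commands fail in less obvious ways when one of these variables is empty, it can help to check them up front. The following is a small sketch; require_env is a hypothetical helper name, not an eksctl feature.

```shell
# Hypothetical helper: fail early if any named variable is unset or empty.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirectly read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
# require_env EKSA_VSPHERE_USERNAME EKSA_VSPHERE_PASSWORD EKSA_GITHUB_TOKEN \
#   EKSA_AWS_ACCESS_KEY_ID EKSA_AWS_SECRET_ACCESS_KEY AWS_REGION || exit 1
```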

Next, create a GitOps-enabled Amazon EKS-A cluster. Use the following eksctl command to generate the vSphere EKS-A cluster configuration:

export CLUSTER_NAME_A=cluster-a
eksctl anywhere generate clusterconfig ${CLUSTER_NAME_A} \
   --provider vsphere > ${CLUSTER_NAME_A}.yaml

Complete the vSphere cluster configuration by referring to vSphere configuration, and specify the Kubernetes version (to simulate different Kubernetes versions you may want to specify an older version for Cluster A). Now, add the additional GitOps config to the cluster-a.yaml file and replace <github-username> with your GitHub username. The gitOpsRef must be added to the existing kind: Cluster definition that was generated by the previous command, not to a new Cluster definition.

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: cluster-a # cluster name, change to cluster-b when creating the green cluster.
spec:
  ... # dots represent the existing generated cluster definition
  #GitOps Support
  gitOpsRef:
    name: gitops
    kind: FluxConfig
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: FluxConfig
metadata:
  name: gitops
spec:
  github:
    personal: true
    repository: environment-repository
    owner: <github-username>

Create Cluster A by executing the following:

eksctl anywhere create cluster -f ${CLUSTER_NAME_A}.yaml

Set the KUBECONFIG context for cluster A in an environment variable:

export K1=$PWD/$CLUSTER_NAME_A/$CLUSTER_NAME_A-eks-a-cluster.kubeconfig
export KUBECONFIG=$K1

Install MetalLB and Emissary Ingress controller Amazon EKS-A curated packages

The following instructions describe how to install MetalLB and Emissary Ingress controller via Amazon EKS-A curated packages.

eksctl anywhere generate package metallb --cluster $CLUSTER_NAME_A > metallb.yaml

After creating the package specification YAML file, we need to configure the package. The following package Kubernetes specification creates an IPAddressPool and an L2Advertisement resource, which are required to tell MetalLB to advertise the allocated LoadBalancer IPs via the Address Resolution Protocol (ARP). The IPAddressPool is a user-specified host range on your VMware virtual network that should be excluded from the DHCP allocation pool. Without the following configuration, the external IP of any Kubernetes service of type LoadBalancer stays in a pending state.

Edit the metallb.yaml and configure the host address range. The Kubernetes specification should be similar to the following:

...
...
spec:
  packageName: metallb
  config: |
    IPAddressPools:
      - name: default
        addresses:
          - 10.220.0.97-10.220.0.120
    L2Advertisements:
      - ipAddressPools:
        - default

Next, create a Kubernetes namespace metallb-system and install the MetalLB package:

kubectl create namespace metallb-system
eksctl anywhere create packages -f metallb.yaml

Now install the Emissary Ingress controller curated package:

eksctl anywhere generate package emissary --cluster $CLUSTER_NAME_A > emissary.yaml
eksctl anywhere create packages -f emissary.yaml

Validate the Emissary and MetalLB curated packages are installed:

> eksctl anywhere get packages --cluster $CLUSTER_NAME_A
NAME                  PACKAGE    AGE    STATE       CURRENTVERSION                                    TARGETVERSION                                              DETAIL
generated-emissary    emissary   163m   installed   3.0.0-0d4e0476a740b48a232041597ded2031595d9409    (latest)
cluster-load-balancer metallb    163m   installed   0.12.1-0d4e0476a740b48a232041597ded2031595d9409   0.12.1-0d4e0476a740b48a232041597ded2031595d9409 (latest)
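If you script the installation, the same check can be automated. The sketch below reads the command output on standard input and succeeds only when every package row reports the installed state. It assumes the column layout shown in the output above (STATE as the fourth column), which may differ in other eks-anywhere releases; all_packages_installed is a hypothetical helper name.

```shell
# Read `eksctl anywhere get packages` output on stdin; succeed only if there is
# at least one package row and every row has "installed" in column 4 (STATE).
all_packages_installed() {
  awk 'NR > 1 && NF > 0 { rows++; if ($4 != "installed") bad++ }
       END { if (rows > 0 && bad == 0) exit 0; exit 1 }'
}

# Usage:
# eksctl anywhere get packages --cluster $CLUSTER_NAME_A | all_packages_installed \
#   && echo "all curated packages installed"
```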

Next, configure the Emissary listeners. The Listener Custom Resource Definition (CRD) defines where and how Emissary-ingress should listen for requests from the network, and which Host definitions should be used to process those requests. You can learn more on this topic from the Emissary-ingress Listener resource documentation.

> kubectl apply -f ./containers-blog-maelstrom/eksa-blue-green-upgrades/emissary/emissary-listeners.yaml
listener.getambassador.io/http-listener created
listener.getambassador.io/https-listener created

Validate that the Emissary http-listener and https-listener are configured correctly. The securityModel defines how the Listener decides whether a request is secure or insecure. Notice that the security model is XFP for both listeners, which means a request is treated as secure if, and only if, the X-Forwarded-Proto header indicates HTTPS.

❯ kubectl get listener -A
NAMESPACE         NAME             PORT   PROTOCOL   STACK   STATSPREFIX   SECURITY   L7DEPTH
emissary-system   http-listener    8080   HTTPS                            XFP
emissary-system   https-listener   8443   HTTPS                            XFP

Create GitOps enabled Amazon EKS Anywhere Cluster B

Following the same procedure as in the previous two sections, create a GitOps-enabled Amazon EKS Anywhere Cluster B on the upgraded version of Kubernetes, using the same GitHub repository, and install the MetalLB and Emissary Ingress controller curated packages. Cluster B represents the green cluster. The eksctl anywhere installation process automatically creates a new folder for Cluster B under the clusters root folder in the Git environment repository on GitHub. Using this approach, you can modify the cluster configuration for all your Amazon EKS-A clusters in a single place.

Once Cluster B is created, set the KUBECONFIG context for Cluster B in an environment variable:

export CLUSTER_NAME_B=cluster-b
export K2=$PWD/$CLUSTER_NAME_B/$CLUSTER_NAME_B-eks-a-cluster.kubeconfig
export KUBECONFIG=$K2

Deploying workloads and ingress mappings in Cluster A and B via Flux

Now, we’ll use Flux to deploy a sample workload to Cluster A (Blue) and Cluster B (Green) simultaneously. The following set of commands pushes a Kubernetes manifest for a sample workload, along with Emissary Ingress controller mappings, to the Git environment repository. The Flux operator detects these changes and applies the Kubernetes manifests to Cluster A and Cluster B.

The <environment-repository> is a private GitHub repository created automatically when you create a GitOps-enabled Amazon EKS-A cluster. See the details here on how to clone a GitHub repository using SSH. As per the prerequisites, this step assumes you have Git and SSH keys set up for cloning and pushing to the repository from your local administrator machine.

# Create directories and copy manifest files for deploying sample app in your GitOps environment repo
git clone <environment-repository>
cd environment-repository
mkdir ./clusters/cluster-a/manifest-deployments
mkdir ./clusters/cluster-b/manifest-deployments
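# cp accepts only one destination, so copy the sample app manifest to each cluster directory separately
cp ../containers-blog-maelstrom/eksa-blue-green-upgrades/eks-anywhere-sample-app/hello-eks-a.yaml ./clusters/cluster-a/manifest-deployments
cp ../containers-blog-maelstrom/eksa-blue-green-upgrades/eks-anywhere-sample-app/hello-eks-a.yaml ./clusters/cluster-b/manifest-deployments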
cp ../containers-blog-maelstrom/eksa-blue-green-upgrades/eks-anywhere-sample-app/workload-mapping-a.yaml ./clusters/cluster-a/manifest-deployments 
cp ../containers-blog-maelstrom/eksa-blue-green-upgrades/eks-anywhere-sample-app/workload-mapping-b.yaml ./clusters/cluster-b/manifest-deployments
git add .
git commit -a -m "Adding a Sample workload to Cluster A and B"
git push
cd ..

Validate Cluster A and B have the same workload synced

Obtain the EXTERNAL-IP of the emissary load balancer from Cluster A using the below command:

❯ export KUBECONFIG=$K1
❯ export CLUSTER_A_EXTERNAL_IP=$(kubectl get svc generated-emissary -n emissary-system -o "go-template={{range .status.loadBalancer.ingress}}{{or .ip .hostname}}{{end}}")

Next, curl the application to validate it is accessible:

> curl -H "Host: hello.eksa-demo.cluster-a" -Lk http://$CLUSTER_A_EXTERNAL_IP


...

You have successfully deployed the hello-eks-a pod hello-eks-a-866ff6bbc7-xxxxx

For more information check out
https://anywhere.eks.amazonaws.com

…

Next, let’s test Cluster B. Obtain the EXTERNAL-IP of the emissary load balancer from Cluster B using the below command:

❯ export KUBECONFIG=$K2
❯ export CLUSTER_B_EXTERNAL_IP=$(kubectl get svc generated-emissary -n emissary-system -o "go-template={{range .status.loadBalancer.ingress}}{{or .ip .hostname}}{{end}}")

Next, curl the application to validate it is accessible:

> curl -H "Host: hello.eksa-demo.cluster-a" -Lk http://$CLUSTER_B_EXTERNAL_IP


...

You have successfully deployed the hello-eks-a pod hello-eks-a-866ff6bbc7-xxxxx

For more information check out
https://anywhere.eks.amazonaws.com

…
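Flux applies changes on its own reconciliation schedule, so a curl issued right after the git push may fail until the sync has completed. A small retry wrapper avoids re-running the check by hand. This is a sketch only: retry is a hypothetical helper, and the attempt count and delay in the usage line are arbitrary example values.

```shell
# Hypothetical helper: run a command until it succeeds or the attempts run out.
# Arguments: <max attempts> <delay in seconds> <command...>
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage: up to 10 attempts, 15 seconds apart; -f makes curl fail on HTTP errors.
# retry 10 15 curl -fsS -H "Host: hello.eksa-demo.cluster-a" -Lk "http://$CLUSTER_A_EXTERNAL_IP"
```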

With these steps, we’ve shown how Flux can be used to sync configuration between clusters. In a more complex scenario, where Cluster B is running on a newer version of Kubernetes, there may be more work required to get your workload running (e.g., if there are Kubernetes API deprecations).

Routing traffic across workloads in two clusters

To switch traffic between Cluster A (Blue) and Cluster B (Green), there are two options to consider. One approach is to use an Amazon Route 53 weighted routing policy, though you need to account for Domain Name System (DNS) propagation delay when switching. Another approach is to use a top-level load balancer such as F5, Nginx, or HAProxy with dynamically updated routing. With the second approach there is no DNS propagation delay, but the top-level load balancer requires network connectivity to the Emissary Ingress Kubernetes service on both Cluster A and Cluster B. It’s recommended that client applications that invoke this endpoint implement retries in case of a temporary disconnection during the traffic switchover.
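For the Route 53 option, the switch amounts to changing the weights on two records that share a name. The following sketch generates the change batch for one record. It assumes the AWS CLI is installed (it is not in the prerequisites list) and that a hosted zone already exists; weighted_record, the record name hello.example.com, the set identifiers, and the 60-second TTL are all hypothetical values chosen for illustration.

```shell
# Hypothetical helper: print a Route 53 change batch that upserts one weighted A record.
# Arguments: <record name> <set identifier> <IP address> <weight>
weighted_record() {
  name=$1
  set_id=$2
  ip=$3
  weight=$4
  cat <<EOF
{"Changes":[{"Action":"UPSERT","ResourceRecordSet":{
  "Name":"$name","Type":"A","SetIdentifier":"$set_id",
  "Weight":$weight,"TTL":60,"ResourceRecords":[{"Value":"$ip"}]}}]}
EOF
}

# Usage: send all traffic to green (the hosted zone ID is a placeholder):
# weighted_record hello.example.com blue  "$CLUSTER_A_EXTERNAL_IP" 0   > blue.json
# weighted_record hello.example.com green "$CLUSTER_B_EXTERNAL_IP" 100 > green.json
# aws route53 change-resource-record-sets --hosted-zone-id <zone-id> --change-batch file://blue.json
# aws route53 change-resource-record-sets --hosted-zone-id <zone-id> --change-batch file://green.json
```

Rolling back is the same operation with the weights swapped. A short TTL keeps the propagation delay small, at the cost of more DNS queries.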

Once you have validated that the workload is serving traffic on Cluster B, delete Cluster A. You’ve just completed a blue/green upgrade.

Additional design considerations:

Network Load Balancer – A network load balancer was required to expose the Kubernetes ingress service to clients outside the Kubernetes cluster. MetalLB was chosen because it provides Kubernetes services of type LoadBalancer with a virtual IP address from an IP address pool that you define. In this solution, MetalLB was configured to advertise the allocated IP address via Layer 2 using the Address Resolution Protocol (ARP) for IPv4, as the load balancer’s IP address wasn’t advertised outside of the network. MetalLB can also be configured to advertise via Border Gateway Protocol (BGP). A limitation of the MetalLB ARP deployment configuration is single-node bottlenecking: only a single leader node is elected for MetalLB, which limits Kubernetes service ingress bandwidth to that single node. An alternative load balancer you could consider is kube-vip.
Kubernetes Ingress Controller – Emissary was the chosen Kubernetes ingress controller. Alternative ingress controller solutions could have been used.
Fleet Capacity Management – To support a blue/green approach to platform upgrades, it is important to ensure that your data center or colocation facility has adequate hardware and VM capacity, and enough available IP addresses, to support a second Amazon EKS-A cluster of near-production scale. For customers with a large number of clusters, this upgrade path may not be feasible due to the resources required.
GitOps Environment Management – It’s advised to have a shared environment Git repository per environment type (i.e., production and non-production) and in this repository have a directory per application environment. In this solution, the environment repository had a Cluster A and Cluster B subdirectory, where the configuration was stored for each respective Amazon EKS-A cluster. This enables environment specific configuration to be applied isolated from other environments. We used public GitHub for the Git repository hosting service, although you can use AWS CodeCommit or any other compatible Git repository that can be cloned via SSH.
Amazon EKS Anywhere Enterprise Subscriptions and Curated Packages – An Amazon EKS Anywhere Enterprise Subscription is required for each Amazon EKS-A cluster for which you would like support. An Amazon EKS Anywhere Enterprise Subscription is also required to make use of Amazon EKS-A curated packages.
Application Storage – Data management is more complex in a blue/green strategy. This post didn’t take application storage into consideration. If you use an external database, you would look at sharing it between the blue and green clusters.

Cleaning up

Use the following commands to clean up the resources created as part of this blog:

# Change the Cluster Context to Cluster-A
export KUBECONFIG=$K1
eksctl anywhere delete package cluster-load-balancer --cluster ${CLUSTER_NAME_A}
eksctl anywhere delete package generated-emissary --cluster ${CLUSTER_NAME_A}
eksctl anywhere delete cluster -f ${CLUSTER_NAME_A}.yaml

# Change the Cluster Context to Cluster-B
export KUBECONFIG=$K2
eksctl anywhere delete package cluster-load-balancer --cluster ${CLUSTER_NAME_B}
eksctl anywhere delete package generated-emissary --cluster ${CLUSTER_NAME_B}
eksctl anywhere delete cluster -f ${CLUSTER_NAME_B}.yaml

Finally, delete the GitOps repository created for this demo manually.

Conclusion

In this post, we showed you a blue/green Amazon EKS-A upgrade strategy using Flux as an alternative to an in-place upgrade. Customers might look at a blue/green upgrade strategy for various use cases, including reducing the risk of upgrades, as part of a migration pattern to Amazon EKS in-region, or as part of a disaster recovery pattern.

For more information on getting started with Amazon EKS Anywhere, check out the Amazon EKS-A workshop and Amazon EKS-A documentation.
