Field Notes: Managing an Amazon EKS Cluster Using AWS CDK and SHI’s Cloud Resource Property Manager
This post is contributed by Bill Kerr and Raj Seshadri
For most customers, infrastructure is hardly done with CI/CD in mind. However, Infrastructure as Code (IaC) should be a best practice for DevOps professionals when they provision cloud-native assets. Microservice apps that run inside an Amazon EKS cluster often use CI/CD, so why not the cluster and related cloud infrastructure as well?
This blog demonstrates how to spin up cluster infrastructure managed by CI/CD using CDK code and SHI’s Cloud Resource Property Manager (CRPM) property files. Managing cloud resources is ultimately about managing properties, such as instance type, cluster version, etc. CRPM helps you organize all those properties by importing bite-sized YAML files, which are stitched together with CDK. It keeps all of what’s good about YAML in YAML, and places all of the logic in beautiful CDK code. Ultimately this improves productivity and reliability as it eliminates manual configuration steps.
In this architecture, we create a six node Amazon EKS cluster. The Amazon EKS cluster has a node group spanning private subnets across two Availability Zones. There are two public subnets in different Availability Zones available for use with an Elastic Load Balancer.
Changes to the primary (master) branch triggers a pipeline, which creates CloudFormation change sets for an Amazon EKS stack and a CI/CD stack. After human approval, the change sets are initiated (executed).
Get ready to deploy the CloudFormation stacks with CDK
First, to get started with CDK you spin up an AWS Cloud9 environment, which gives you a code editor and terminal that runs in a web browser. Using AWS Cloud9 is optional but highly recommended since it speeds up the process.
Create a new AWS Cloud9 environment
- Navigate to Cloud9 in the AWS Management Console.
- Select Create environment.
- Enter a name and select Next step.
- Leave the default settings and select Next step again.
- Select Create environment.
Download and install the dependencies and demo CDK application
In a terminal, let’s review the code used in this article and install it.
# Install TypeScript globally for CDK npm i -g typescript # If you are running these commands in Cloud9 or already have CDK installed, then # skip this command npm i -g aws-cdk # Clone the demo CDK application code git clone https://github.com/shi/crpm-eks # Change directory cd crpm-eks # Install the CDK application npm i
Create the IAM service role
When creating an EKS cluster, the IAM role that was used to create the cluster is also the role that will be able to access it afterwards.
Deploy the CloudFormation stack containing the role
Let’s deploy a CloudFormation stack containing a role that will later be used to create the cluster and also to access it. While we’re at it, let’s also add our current user ARN to the role, so that we can assume the role.
# Deploy the EKS management role CloudFormation stack cdk deploy role --parameters AwsArn=$(aws sts get-caller-identity --query Arn --output text) # It will ask, "Do you wish to deploy these changes (y/n)?" # Enter y and then press enter to continue deploying
Notice the Outputs section that shows up in the CDK deploy results, which contains the role name and the role ARN. You will need to copy and paste the role ARN (ex. arn:aws:iam::123456789012:role/eks-role-us-east-1) from your Outputs when deploying the next stack.
Example Outputs: role.ExportsOutputRefRoleFF41A16F = eks-role-us-east-1 role.ExportsOutputFnGetAttRoleArnED52E3F8 = arn:aws:iam::123456789012:role/eksrole-us-east-1
Create the EKS cluster
Now that we have a role created, it’s time to create the cluster using that role.
Deploy the stack containing the EKS cluster in a new VPC
Expect it to take over 20 minutes for this stack to deploy.
# Deploy the EKS cluster CloudFormation stack # REPLACE ROLE_ARN WITH ROLE ARN FROM OUTPUTS IN ROLE STACK CREATED ABOVE cdk deploy eks -r ROLE_ARN # It will ask, "Do you wish to deploy these changes (y/n)?" # Enter y and then press enter to continue deploying
Notice the Outputs section, which contains the cluster name (ex. eks-demo) and the UpdateKubeConfigCommand. The UpdateKubeConfigCommand is useful if you already have
kubectl installed somewhere and would rather use your own to interact with the cluster instead of using Cloud9’s.
Example Outputs: eks.ExportsOutputRefControlPlane70FAD3FA = eks-demo eks.UpdateKubeConfigCommand = aws eks update-kubeconfig --name eks-demo --region us-east-1 --role-arn arn:aws:iam::123456789012:role/eks-role-us-east-1 eks.FargatePodExecutionRoleArn = arn:aws:iam::123456789012:role/eks-cluster- FargatePodExecutionRole-U495K4DHW93M
Navigate to this page in the AWS console if you would like to see your cluster, which is now ready to use.
kubectl with access to cluster
If you are following along in Cloud9, you can skip configuring
If you prefer to use
kubectl installed somewhere else, now would be a good time to configure access to the newly created cluster by running the UpdateKubeConfigCommand mentioned in the Outputs section above. It requires that you have the AWS CLI installed and configured.
aws eks update-kubeconfig --name eks-demo --region us-east-1 --role-arn arn:aws:iam::123456789012:role/eks-role-us-east-1 # Test access to cluster kubectl get nodes
Leveraging Infrastructure CI/CD
Now that the VPC and cluster have been created, it’s time to turn on CI/CD. This will create a cloned copy of github.com/shi/crpm-eks in CodeCommit. Then, an AWS CloudWatch Events rule will start watching the CodeCommit repo for changes and triggering a CI/CD pipeline that builds and validates CloudFormation templates, and executes CloudFormation change sets.
Deploy the stack containing the code repo and pipeline
# Deploy the CI/CD CloudFormation stack cdk deploy cicd # It will ask, "Do you wish to deploy these changes (y/n)?" # Enter y and then press enter to continue deploying
Notice the Outputs section, which contains the CodeCommit repo name (ex. eks-ci-cd). This is where the code now lives that is being watched for changes.
Example Outputs: cicd.ExportsOutputFnGetAttLambdaRoleArn275A39EB = arn:aws:iam::774461968944:role/eks-ci-cd-LambdaRole-6PFYXVSLTQ0D cicd.ExportsOutputFnGetAttRepositoryNameC88C868A = eks-ci-cd
Review the pipeline for the first time
Navigate to this page in the AWS console and you should see a new pipeline in progress. The pipeline is automatically run for the first time when it is created, even though no changes have been made yet. Open the pipeline and scroll down to the Review stage. You’ll see that two change sets were created in parallel (one for the EKS stack and the other for the CI/CD stack).
- Select Review to open an approval popup where you can enter a comment.
- Select Reject or Approve. Following the Review button, the blue link to the left of Fetch: Initial commit by AWS CodeCommit can be selected to see the infrastructure code changes that triggered the pipeline.
Clone the new AWS CodeCommit repo
Now that the golden source that is being watched for changes lives in a AWS CodeCommit repo, we need to clone that repo and get rid of the repo we’ve been using up to this point.
If you are following along in AWS Cloud9, you can skip cloning the new repo because you are just going to discard the old AWS Cloud9 environment and start using a new one.
Now would be a good time to clone the newly created repo mentioned in the preceding Outputs section Next, delete the old repo that was cloned from GitHub at the beginning of this blog. You can visit this repository to get the clone URL for the repo.
Review this documentation for help with accessing your private AWS CodeCommit repo using HTTPS.
Review this documentation for help with accessing your repo using SSH.
# Clone the CDK application code (this URL assumes the us-east-1 region) git clone https://git-codecommit.us-east-1.amazonaws.com/v1/repos/eks-ci-cd # Change directory cd eks-ci-cd # Install the CDK application npm i # Remove the old repo rm -rf ../crpm-eks
Deploy the stack containing the Cloud9 IDE with
kubectl and CodeCommit repo
If you are NOT using Cloud9, you can skip this section.
To make life easy, let’s create another Cloud9 environment that has
kubectl preconfigured and ready to use, and also has the new CodeCommit repo checked out and ready to edit.
# Deploy the IDE CloudFormation stack cdk deploy ide
Configuring the new Cloud9 environment
kubectl and the code are now ready to use, we still have to manually configure Cloud9 to stop using AWS managed temporary credentials in order for
kubectl to be able to access the cluster with the management role. Here’s how to do that and test kubectl:
1. Navigate to this page in the AWS console.
2. In Your environments, select Open IDE for the newly created environment (possibly named eks-ide).
3. Once opened, navigate at the top to AWS Cloud9 -> Preferences.
4. Expand AWS SETTINGS, and under Credentials, disable AWS managed temporary credentials by selecting the toggle button. Then, close the Preferences tab.
5. In a terminal in Cloud9, enter
aws configure. Then, answer the questions by leaving them set to None and pressing enter, except for Default region name. Set the Default region name to the current region that you created everything in. The output should look similar to:
AWS Access Key ID [None]: AWS Secret Access Key [None]: Default region name [None]: us-east-1 Default output format [None]:
6. Test the environment
kubectl get nodes
If everything is working properly, you should see two nodes appear in the output similar to:
NAME STATUS ROLES AGE VERSION ip-192-168-102-69.ec2.internal Ready <none> 4h50m v1.17.11-ekscfdc40 ip-192-168-69-2.ec2.internal Ready <none> 4h50m v1.17.11-ekscfdc40
You can use
kubectl from this IDE to control the cluster. When you close the IDE browser window, the Cloud9 environment will automatically shutdown after 30 minutes and remain offline until the next time you reopen it from the AWS console. So, it’s a cheap way to have a
kubectl terminal ready when needed.
Delete the old Cloud9 environment
If you have been following along using Cloud9 the whole time, then you should have two Cloud9 environments running at this point (one that was used to initially create everything from code in GitHub, and one that is now ready to edit the CodeCommit repo and control the cluster with
kubectl). It’s now a good time to delete the old Cloud9 environment.
- Navigate to this page in the AWS console.
- In Your environments, select the radio button for the old environment (you named it when creating it) and select Delete.
- In the popup, enter the word Delete and select Delete.
Now you should be down to having just one AWS Cloud9 environment that was created when you deployed the ide stack.
Trigger the pipeline to change the infrastructure
Now that we have a cluster up and running that’s defined in code stored in a AWS CodeCommit repo, it’s time to make some changes:
- We’ll commit and push the changes, which will trigger the pipeline to update the infrastructure.
- We’ll go ahead and make one change to the cluster nodegroup and another change to the actual CI/CD build process, so that both the eks-cluster stack as well as the eks-ci-cd stack get changed.
1. In the code that was checked out from AWS CodeCommit, open up res/compute/eks/nodegroup/props.yaml. At the bottom of the file, try changing minSize from 1 to 4, desiredSize from 2 to 6, and maxSize from 3 to 6 as seen in the following screenshot. Then, save the file and close it. The res (resource) directory is your well organized collection of resource properties files.
2. Next, open up res/developer-tools/codebuild/project/props.yaml and find where it contains computeType: ‘BUILD_GENERAL1_SMALL’. Try changing BUILD_GENERAL1_SMALL to BUILD_GENERAL1_MEDIUM. Then, save the file and close it.
3. Commit and push the changes in a terminal.
cd eks-ci-cd git add . git commit -m "Increase nodegroup scaling config sizes and use larger build environment" git push
4. Navigate to this page in the AWS console and you should see your pipeline in progress.
5. Wait for the Review stage to become Pending.
6. Following the Approve action box, click the blue link to the left of “Fetch: …” to see the infrastructure code changes that triggered the pipeline. You should see the two code changes you committed above.
7. After reviewing the changes, go back and select Review to open an approval popup.
8. In the approval popup, enter a comment and select Approve.
9. Wait for the pipeline to finish the Deploy stage as it executes the two change sets. You can refresh the page until you see it has finished. It should take a few minutes.
10. To see that the CodeBuild change has been made, scroll up to the Build stage of the pipeline and click on the AWS CodeBuild link as shown in the following screenshot.
11. Next, select the Build details tab, and you should determine that your Compute size has been upgraded to 7 GB memory, 4 vCPUs as shown in the following screenshot.
12. By this time, the cluster nodegroup sizes are probably updated. You can confirm with
kubectl in a terminal.
# Get nodes kubectl get nodes
If everything is ready, you should see six (desired size) nodes appear in the output similar to:
NAME STATUS ROLES AGE VERSION ip-192-168-102-69.ec2.internal Ready <none> 5h42m v1.17.11-ekscfdc40 ip-192-168-69-2.ec2.internal Ready <none> 5h42m v1.17.11-ekscfdc40 ip-192-168-43-7.ec2.internal Ready <none> 10m v1.17.11-ekscfdc40 ip-192-168-27-14.ec2.internal Ready <none> 10m v1.17.11-ekscfdc40 ip-192-168-36-56.ec2.internal Ready <none> 10m v1.17.11-ekscfdc40 ip-192-168-37-27.ec2.internal Ready <none> 10m v1.17.11-ekscfdc40
Cluster is now manageable by code
You now have a cluster than can be maintained by simply making changes to code! The only resources not being managed by CI/CD in this demo, are the management role, and the optional AWS Cloud9 IDE. You can log into the AWS console and edit the role, adding other Trust relationships in the future, so others can assume the role and access the cluster.
Do not try to delete all of the stacks at once! Wait for the stack(s) in a step to finish deleting before moving onto the next step.
1. Navigate to this page in the AWS console.
2. Delete the two IDE stacks first (the ide stack spawned another stack).
3. Delete the ci-cd stack.
4. Delete the cluster stack (this one takes a long time).
5. Delete the role stack.
Cloud Resource Property Manager (CRPM) is an open source project maintained by SHI, hosted on GitHub, and available through npm.
In this blog, we demonstrated how you can spin up an Amazon EKS cluster managed by CI/CD using CDK code and Cloud Resource Property Manager (CRPM) property files. Making updates to this cluster is as easy as modifying the property files and updating the AWS CodePipline. Using CRPM can improve productivity and reliability because it eliminates manual configurations steps.