Setting up a Bottlerocket managed node group on Amazon EKS with Terraform

Introduction

Kubernetes, an open-source container management system, has surged in popularity and adoption over the past several years. Organizations from startups to large established enterprises across industry verticals are rapidly adopting it for their mission-critical workloads. It is declarative and highly pluggable.

In this blog, we will discuss what Amazon Elastic Kubernetes Service (Amazon EKS), Bottlerocket, and Terraform are, along with how to create Amazon EKS managed node groups using Bottlerocket and Terraform. By the end of the blog, you will have a clear understanding of EKS, managed node groups, and Terraform. You will also have fully functional Terraform scripts with which you can create an EKS stack with a Bottlerocket managed node group through a single terraform command.

What is Amazon EKS?

Amazon Elastic Kubernetes Service (Amazon EKS) is a managed service that makes it easy for enterprises to run Kubernetes on AWS without needing to install and operate their own Kubernetes cluster. It runs upstream Kubernetes and is certified Kubernetes conformant. The EKS control plane is highly scalable and available, deployed across multiple Availability Zones, and fully managed by AWS. EKS automatically detects and replaces unhealthy control plane instances, scales them based on load, and provides automated version updates and patching for the cluster control plane. Customers such as Intel, Snap, Intuit, GoDaddy, and Autodesk trust EKS to run their most sensitive and mission-critical applications.

It’s straightforward to get going with Amazon EKS:

1. Provision an Amazon EKS cluster:

- Provision a cluster through the AWS Management Console, AWS CLI, AWS SDKs, or AWS CDK.
- Use the eksctl CLI.
- Use Infrastructure as Code software such as Terraform.

2. Deploy compute:

Kubernetes clusters run workloads on nodes. With Amazon EKS, you have several options to choose from:
- Self-managed nodes: If you want complete control of your cluster compute resources, you can provision Amazon EC2 instances and manage their lifecycle yourself with self-managed nodes.
- Amazon EKS managed node groups: Amazon EKS managed node groups automate the provisioning and lifecycle management of Kubernetes nodes for you. This removes the undifferentiated heavy lifting of launching, configuring, patching, and upgrading instances, while still providing you some control over, and access to, your Kubernetes nodes.
- AWS Fargate: This service provides on-demand, right-sized compute capacity for containers. With Fargate, you don’t need to provision or manage the lifecycle of nodes at all, as compute resources are allocated on demand for you by AWS within the Fargate service.

3. When your cluster is ready, you can configure and use your favorite Kubernetes tools, such as kubectl, to communicate with the cluster.

4. Deploy and manage your workloads on your new Amazon EKS cluster the same way that you would with any other Kubernetes environment.

What is Bottlerocket?

Bottlerocket is a Linux-based open-source operating system that is purpose-built by Amazon Web Services to run containers. You can deploy Bottlerocket on virtual machines or bare metal hosts. It includes only the essential software to run containers, which improves resource usage, reduces the attack surface, and improves the availability of deployments. It is now generally available at no cost as an Amazon Machine Image (AMI) for Amazon Elastic Compute Cloud (EC2).

So why Bottlerocket rather than an existing general-purpose operating system?

- Higher uptime with lower cost and management complexity: Bottlerocket has a smaller resource footprint, shorter boot times, and a smaller security attack surface than general-purpose operating systems. A smaller footprint helps reduce costs because of decreased usage of storage, compute, and networking resources.
- Container-optimized: Bottlerocket is optimized to run and manage large containerized deployments. It contains only the packages and binaries that containers need to work optimally.
- Improved security from automatic OS updates: In Bottlerocket, updates are applied as a single unit, in a minimally disruptive manner, as soon as they are available, and they can be rolled back if failures occur. This removes the risk of “botched” updates that can leave the system in an unusable state. Security updates can be applied automatically.
- Open source: An open development model enables customers, partners, and all interested parties to make code and design changes to Bottlerocket.
- Premium support: The use of AWS-provided builds of Bottlerocket on Amazon EC2 is covered under the same AWS support plans that cover other AWS services.

What is Terraform?

In the past, managing IT infrastructure was a hard job. System administrators had to manually manage and configure all of the hardware and software that was needed for the applications to run. However, in recent years, things have changed significantly. Infrastructure as Code tools like AWS CloudFormation and Terraform have transformed the way organizations maintain IT infrastructure.

Similar to AWS CloudFormation, Terraform can be used for building, changing, and versioning infrastructure safely and efficiently. Customers can use either one depending upon their use case and requirements.

The infrastructure Terraform can manage includes low-level components such as compute instances, storage, and networking, as well as high-level components such as DNS entries and SaaS features. You can also use it for existing on-premises infrastructure in private clouds such as VMware vSphere and OpenStack, or for infrastructure hosted on public clouds like Amazon Web Services.

With Terraform, you describe the components needed to run a single application or your entire datacenter in configuration files. Terraform then generates an execution plan describing what it will do to reach the desired state and executes that plan to build the described infrastructure. As the configuration changes, Terraform determines what changed and creates incremental execution plans that can be applied.

Problem statement:

By default, instances in a managed node group use the latest version of the Amazon EKS optimized Amazon Linux 2 AMI for their data plane. You can choose between standard and GPU variants of the Amazon EKS optimized Amazon Linux 2 AMI. At this point, Bottlerocket is not natively supported as a built-in OS choice for managed node groups, although it will be in the future. Until then, this post and its Terraform scripts should provide a reliable set of steps to build a managed node group with Bottlerocket nodes using launch templates.

Alright, now we have a good understanding of everything we’ll be using in this blog post. So let’s get our hands dirty and dive in to create an EKS cluster with a Bottlerocket managed node group through Terraform.

Solution architecture:

Prerequisites

- An AWS account with admin privileges: For this blog, we will assume you already have an AWS account with admin privileges.
- Command line tools: Mac/Linux users need to install the latest version of the AWS CLI, aws-iam-authenticator, kubectl, and terraform (>=v0.13.0) on their workstation. Windows users need to create a Cloud9 environment in AWS and then install these CLIs inside that environment.

Step 1: Create a Terraform workspace directory and clone the repo

To set up your workspace and get started with this post, open your favorite terminal on your Mac/Linux workstation.

Then clone the Terraform code into your current working directory:

git clone https://github.com/aws-samples/amazon-eks-bottlerocket-mngnodegrp-terraform.git

Create a directory named “bottlerocket” and change your current working directory to it. You can give the directory any name you like.

mkdir bottlerocket && cd "$_"

Copy “provider.tf” from the cloned directory to your current workspace (bottlerocket) directory.

cp ../amazon-eks-bottlerocket-mngnodegrp-terraform/provider.tf .

To see the content of “provider.tf”, run the cat command (optional):

cat provider.tf

provider "aws" {
  region = var.region
}

This file contains information about the provider you’ll be using with Terraform. As shown above, with provider “aws” we are instructing Terraform to use AWS as the provider for all the resources. We pass the region name through a variable rather than hardcoding it. Since every script in this blog post is parametrized, you won’t need to change them. All the variables and their default values are stored in separate files.
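As a rough sketch of how this parametrization fits together, the snippet below pairs the provider block with a region variable. The default region and the required_version constraint are assumptions for illustration (the constraint mirrors the prerequisite above); in the repository, the variable declarations live in the separate variables files, so treat this as a standalone example rather than a file to copy into the workspace.

```hcl
# Illustrative sketch only, not the repository's provider.tf.
terraform {
  required_version = ">= 0.13.0"

  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

# Assumed default; set this to the region you want to deploy into.
variable "region" {
  description = "AWS region in which all resources are created"
  type        = string
  default     = "us-east-1"
}

provider "aws" {
  region = var.region
}
```

A default like this can be overridden at run time without editing any file, for example with terraform apply -var="region=us-west-2".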

Step 2: Create AWS Network stack for managed node groups

After the initial workspace setup, we are ready to create the VPC and subnets for our worker nodes. We need these resources when creating the EKS cluster. Amazon EKS requires subnets in at least two AZs. You can use an existing VPC only if it meets the Amazon EKS specific requirements. In our current blog setup, we will create a separate VPC and use three AZs with both public and private subnets.

A subnet is public or private depending on whether or not traffic within the subnet is routed through the internet gateway. If the subnet’s traffic does not have a default route through an internet gateway, that subnet is considered to be private. Public subnets will be used for public-facing resources like load balancers, which will direct external traffic to pods running on the worker nodes in private subnets. We will also enable DNS hostname and DNS resolution support; otherwise, our worker nodes will fail to register with the cluster.

The VPC configuration for our post includes the following:

- A VPC with a /16 IPv4 CIDR block (example: 10.0.0.0/16). This provides 65,536 private IPv4 addresses.
- Three public subnets, each with a /24 IPv4 CIDR block. Each provides 256 private IPv4 addresses. Kubernetes looks for tags to discover cluster resources. The public subnets must have the following tags, which help Kubernetes decide where to deploy external load balancers:

kubernetes.io/cluster/<cluster-name>    :    shared
kubernetes.io/role/elb                  :    1

- Three private subnets, each with a /24 IPv4 CIDR block. Each provides 256 private IPv4 addresses. The private subnets must have the following tags, which help Kubernetes decide where to deploy internal load balancers:

kubernetes.io/cluster/<cluster-name>    :    shared
kubernetes.io/role/internal-elb         :    1

- An internet gateway. This connects the VPC to the internet and other AWS services.
- A NAT gateway with an Elastic IPv4 address. Instances in the private subnets can send requests to the internet through the NAT gateway over IPv4 (for example, for software updates).
- A custom route table associated with the public subnets. This route table contains an entry that enables instances in the subnet to communicate with other instances in the VPC over IPv4, and an entry that enables instances in the public subnet to communicate directly with the internet over IPv4.
- The main route table associated with the private subnets. This route table contains an entry that enables instances in the subnet to communicate with other instances in the VPC over IPv4, and an entry that enables instances in the private subnet to communicate with the internet through the NAT gateway over IPv4.
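
To make the layout above more concrete, here is a minimal sketch of the VPC and the tagged subnets. It is not the repository's vpc.tf: the resource names, the cidrsubnet offsets, and the cluster_name variable are illustrative assumptions, and the internet gateway, NAT gateway, and route tables are left out for brevity. The load balancer tags use the kubernetes.io/role/elb and kubernetes.io/role/internal-elb keys, and the sketch assumes a region with at least three Availability Zones.

```hcl
# Illustrative sketch only; names and CIDR offsets are assumptions.
variable "cluster_name" {
  type    = string
  default = "eks-bottlerocket-imnr"
}

data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_vpc" "example" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_support   = true # DNS resolution
  enable_dns_hostnames = true # DNS hostnames, needed for node registration

  tags = {
    Name = var.cluster_name
  }
}

# Three public /24 subnets, one per Availability Zone.
resource "aws_subnet" "public" {
  count                   = 3
  vpc_id                  = aws_vpc.example.id
  cidr_block              = cidrsubnet(aws_vpc.example.cidr_block, 8, count.index)
  availability_zone       = data.aws_availability_zones.available.names[count.index]
  map_public_ip_on_launch = true

  tags = {
    "kubernetes.io/cluster/${var.cluster_name}" = "shared"
    "kubernetes.io/role/elb"                    = "1"
  }
}

# Three private /24 subnets, offset so they do not overlap the public ones.
resource "aws_subnet" "private" {
  count             = 3
  vpc_id            = aws_vpc.example.id
  cidr_block        = cidrsubnet(aws_vpc.example.cidr_block, 8, count.index + 10)
  availability_zone = data.aws_availability_zones.available.names[count.index]

  tags = {
    "kubernetes.io/cluster/${var.cluster_name}" = "shared"
    "kubernetes.io/role/internal-elb"           = "1"
  }
}
```

With a 10.0.0.0/16 VPC, cidrsubnet adds 8 bits to the prefix, so the public subnets land on 10.0.0.0/24 through 10.0.2.0/24 and the private subnets on 10.0.10.0/24 through 10.0.12.0/24.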

To set up the VPC, run the following commands:

Ensure you are inside “bottlerocket” by running the pwd command

pwd

Copy vpc.tf, vpc_variables.tf, subnets.tf, and vpc_output.tf into the “bottlerocket” directory using the cp command.

cp ../amazon-eks-bottlerocket-mngnodegrp-terraform/vpc.tf ../amazon-eks-bottlerocket-mngnodegrp-terraform/vpc_variables.tf ../amazon-eks-bottlerocket-mngnodegrp-terraform/vpc_output.tf ../amazon-eks-bottlerocket-mngnodegrp-terraform/subnets.tf .

vpc.tf and subnets.tf are the scripts that create the network resources in AWS. For their configuration, they refer to the variables and values in vpc_variables.tf. You can override the default values with your own in this file.

For example, you can change the value of the “name” variable from “eks-bottlerocket-imnr” to something different and set cidr_block to a custom private CIDR (for example, 172.16.0.0/16).

To execute the VPC and subnet scripts, run the following commands:
- terraform init: This initializes the working directory containing the Terraform configuration files. It is the first command that should be run after writing a new Terraform configuration or cloning an existing one from version control. It is safe to run this command multiple times.
- terraform plan: This creates an execution plan for our scripts, showing what Terraform will create, change, or destroy.
- terraform apply: This applies the changes required to reach the desired state and creates all the resources defined in the scripts.

terraform init
terraform plan
terraform apply --auto-approve

Once you successfully run these three commands you should see output as seen in the following figure. In your case, the subnet IDs and VPC IDs could be different.

Step 3: Set up IAM roles for the EKS cluster and managed worker nodes

After our networking stack is created, we can move on to creating the IAM roles for EKS. A Kubernetes cluster managed by Amazon EKS makes calls to other AWS services on our behalf for resource management. Hence, it is very important to ensure that an IAM role with the proper permissions is created before cluster setup; otherwise, the cluster won’t work properly.

Similar to the cluster, the kubelet daemon on the EKS worker nodes makes calls to AWS APIs on our behalf. Nodes receive permissions for these API calls through an IAM instance profile and associated policies. So we need to create an IAM role for the worker nodes before registering them with the cluster.

We will create two roles:

- Cluster role: This will be used by the control plane to make calls to other AWS services on our behalf.
- Worker role: This will be used by the kubelet daemon on the worker nodes to make calls to AWS APIs on our behalf.

To set up IAM, run the following commands:

Ensure you are inside “bottlerocket” by running the pwd command.

pwd

Copy iam_role.tf and iam_variables.tf into “bottlerocket” directory using the cp command.

cp ../amazon-eks-bottlerocket-mngnodegrp-terraform/iam_role.tf ../amazon-eks-bottlerocket-mngnodegrp-terraform/iam_variables.tf .

We’ll use the iam_role.tf script to create both IAM roles in AWS. For their configuration, this script refers to the variables and values in iam_variables.tf. The name of the first role is EKS-${var.name}-cluster-role, where ${var.name} is Terraform’s syntax for variable interpolation. The value comes from “iam_variables.tf”, and you can override the default values with your own in this file.

To check what is in the role file, open iam_role.tf in your favorite editor:

- In the cluster role, we allow the EKS control plane to assume the role and then attach two managed policies, AmazonEKSClusterPolicy and AmazonEKSServicePolicy, to it.
- Similarly, in the worker role, we permit EC2 instances to assume the role and then attach the managed policies AmazonEKSWorkerNodePolicy, AmazonEC2ContainerRegistryReadOnly, and AmazonSSMManagedInstanceCore to it.
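
The shape of these two roles can be sketched roughly as follows. This is an illustration under assumptions, not the contents of iam_role.tf: the resource names, the worker role name, and the use of for_each to attach the policies are choices made for this example. The trust policies name eks.amazonaws.com and ec2.amazonaws.com as the services allowed to assume each role.

```hcl
# Illustrative sketch only; resource names and the worker role name are assumptions.
variable "name" {
  type    = string
  default = "eks-bottlerocket-imnr"
}

# Cluster role: assumed by the EKS control plane.
resource "aws_iam_role" "cluster" {
  name = "EKS-${var.name}-cluster-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "eks.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "cluster" {
  for_each = toset(["AmazonEKSClusterPolicy", "AmazonEKSServicePolicy"])

  role       = aws_iam_role.cluster.name
  policy_arn = "arn:aws:iam::aws:policy/${each.value}"
}

# Worker role: assumed by the EC2 instances that run the kubelet.
resource "aws_iam_role" "worker" {
  name = "EKS-${var.name}-worker-role" # hypothetical name

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "worker" {
  for_each = toset([
    "AmazonEKSWorkerNodePolicy",
    "AmazonEC2ContainerRegistryReadOnly",
    "AmazonSSMManagedInstanceCore",
  ])

  role       = aws_iam_role.worker.name
  policy_arn = "arn:aws:iam::aws:policy/${each.value}"
}
```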

To execute the IAM script, run the following commands:

terraform plan
terraform apply --auto-approve

Once you successfully run the above two commands, you should see output like this:

Step 4: Create an EKS cluster

The next step is to create an EKS cluster. An Amazon EKS cluster consists of two primary components:

- The Amazon EKS control plane
- Amazon EKS nodes that are registered with the control plane

In this step, we are only going to create an Amazon EKS control plane that will run the Kubernetes components, such as etcd and the Kubernetes API server.

For the EKS cluster, we’ll use one script named eks.tf. In this file, we first create a security group for the control plane and pass its ID to the cluster for use during cluster creation. Then we create the EKS cluster. Lastly, we create a CloudWatch log group for cluster log aggregation.
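
A minimal sketch of those three pieces might look like the following. It is written as a standalone example, so the VPC ID, subnet IDs, and cluster role ARN arrive through hypothetical input variables instead of references to the repository's other files, and the log types and retention period are assumed values.

```hcl
# Illustrative sketch only; variable names, log types, and retention are assumptions.
variable "cluster_name" {
  type    = string
  default = "eks-bottlerocket-imnr"
}

variable "vpc_id" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

variable "cluster_role_arn" {
  type = string
}

# Security group attached to the control plane network interfaces.
resource "aws_security_group" "control_plane" {
  name_prefix = "${var.cluster_name}-control-plane-"
  description = "EKS control plane security group"
  vpc_id      = var.vpc_id
}

# EKS writes control plane logs to a log group with this naming pattern.
resource "aws_cloudwatch_log_group" "cluster" {
  name              = "/aws/eks/${var.cluster_name}/cluster"
  retention_in_days = 7
}

resource "aws_eks_cluster" "example" {
  name                      = var.cluster_name
  role_arn                  = var.cluster_role_arn
  enabled_cluster_log_types = ["api", "audit"]

  vpc_config {
    subnet_ids         = var.subnet_ids
    security_group_ids = [aws_security_group.control_plane.id]
  }

  # Create the log group first so that Terraform manages its retention.
  depends_on = [aws_cloudwatch_log_group.cluster]
}
```

In this sketch the log group is declared as a dependency of the cluster. That is a design choice for the example: if EKS creates the log group itself, Terraform has no control over its retention setting.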

To set up an Amazon EKS cluster, run the following commands:

Ensure you are inside “bottlerocket” by running the pwd command.

pwd

Copy eks.tf, eks_variables.tf, and eks_output.tf into the “bottlerocket” directory using the cp command.

cp ../amazon-eks-bottlerocket-mngnodegrp-terraform/eks.tf ../amazon-eks-bottlerocket-mngnodegrp-terraform/eks_variables.tf ../amazon-eks-bottlerocket-mngnodegrp-terraform/eks_output.tf .

Then run terraform plan and apply.

terraform plan
terraform apply --auto-approve

Once you successfully run the above two commands, you should see the cluster name and endpoint in your output as shown below.

We can now verify the new cluster in our AWS account:

AWS CLI: You will see your new cluster name in the output of the “aws eks list-clusters --output table” command in your terminal.

AWS Management Console: Navigate to AWS Management Console → Elastic Kubernetes Service → Amazon EKS → Clusters. You should see a new EKS cluster.

Step 5: Create a launch template for the managed node group

The next step is to create a launch template for our worker node group, which EKS will use to create worker nodes and run workloads. A launch template contains the configuration information needed to launch an instance. Launch templates let us store launch parameters so that we do not have to specify them every time we launch an instance. For example, a launch template can contain the AMI ID, instance type, and network settings that we typically use to launch instances.

In this step, we will first set a variable to pick the Bottlerocket AMI by querying the SSM Parameter Store. Next, using a Terraform data source, we will fetch the AMI block device mapping. Then, again using a data source, we will fetch EKS cluster configuration such as the name and endpoint, and use it to render the “bottlerocket_config.toml.tpl” bootstrap user data template for the Bottlerocket worker node group.
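
The flow described above can be sketched as follows. This is an approximation under assumptions and not the repository's launch_template.tf: the resource names and the instance type are made up for the example, the Kubernetes version is read from the cluster instead of a variable, empty strings stand in for the label, taint, and admin container values, and the block device mapping lookup is omitted. The template variable names match the placeholders in bottlerocket_config.toml.tpl.

```hcl
# Illustrative sketch only; names and the instance type are assumptions.
variable "cluster_name" {
  type    = string
  default = "eks-bottlerocket-imnr"
}

# Look up the existing cluster to obtain its version, endpoint, and CA data.
data "aws_eks_cluster" "this" {
  name = var.cluster_name
}

# Bottlerocket publishes its latest AMI ID per Kubernetes version in SSM.
data "aws_ssm_parameter" "bottlerocket_ami" {
  name = "/aws/service/bottlerocket/aws-k8s-${data.aws_eks_cluster.this.version}/x86_64/latest/image_id"
}

resource "aws_launch_template" "bottlerocket" {
  name_prefix   = "${var.cluster_name}-bottlerocket-"
  image_id      = data.aws_ssm_parameter.bottlerocket_ami.value
  instance_type = "m5.large" # assumed instance type

  # Bottlerocket reads TOML settings from user data, which EC2 expects in base64.
  user_data = base64encode(templatefile("${path.module}/templates/bottlerocket_config.toml.tpl", {
    cluster_name           = data.aws_eks_cluster.this.name
    cluster_endpoint       = data.aws_eks_cluster.this.endpoint
    cluster_ca_data        = data.aws_eks_cluster.this.certificate_authority[0].data
    node_labels            = "" # no extra labels in this sketch
    node_taints            = "" # no taints in this sketch
    admin_container_source = "" # empty string skips the source line in the template
  }))
}
```

Reading the AMI ID from the SSM parameter means a later terraform plan can pick up a newer Bottlerocket image without any change to the configuration.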

To set up the launch template, run the following commands:

Ensure you are inside “bottlerocket” by running the pwd command.

pwd

Copy launch_template.tf, launch_template_variables.tf, and launch_template_output.tf into the “bottlerocket” directory using the cp command. For launch template customization, you can use the “launch_template_variables.tf” file to set variables based on your requirements.
- For example, to use your own SSH key pair (by default, the launch template has no SSH key):
  - Generate an SSH key pair and upload it to your AWS account.
  - Set the “key_name” variable in “launch_template_variables.tf”.

cp ../amazon-eks-bottlerocket-mngnodegrp-terraform/launch_template.tf ../amazon-eks-bottlerocket-mngnodegrp-terraform/launch_template_variables.tf ../amazon-eks-bottlerocket-mngnodegrp-terraform/launch_template_output.tf .

Copy the bootstrap user data template directory recursively using the cp command.

cp -R ../amazon-eks-bottlerocket-mngnodegrp-terraform/templates .

You can view the Bottlerocket bootstrap template file by running “cat ./templates/bottlerocket_config.toml.tpl” from inside the “bottlerocket” directory. Its contents should match the following:

[settings.kubernetes]
cluster-name = "${cluster_name}"
api-server = "${cluster_endpoint}"
cluster-certificate = "${cluster_ca_data}"
[settings.kubernetes.node-labels]
${node_labels}
[settings.kubernetes.node-taints]
${node_taints}
[settings.host-containers.admin]
enabled = true
superpowered = true
%{ if admin_container_source != "" }
source = "${admin_container_source}"
%{ endif }

Then run terraform init, plan, and apply.

terraform init
terraform plan
terraform apply --auto-approve

Once you successfully run the above three commands, you should see the launch template name in your output as shown below.

Step 6: Create an EKS ConfigMap for managed nodes to join the cluster

The next step is to create a Kubernetes ConfigMap resource using Terraform. Amazon EKS does not provide a cluster-level API parameter or resource that automatically configures the underlying Kubernetes cluster to allow worker nodes to join. We will create a Kubernetes ConfigMap that allows worker nodes to join the cluster via AWS IAM role authentication.

To generate the ConfigMap manifest and apply it to the cluster, run the following commands:
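
For orientation, a ConfigMap of this kind is conventionally named aws-auth and lives in the kube-system namespace. The sketch below shows one way to express it with the Terraform Kubernetes provider. It is not the repository's aws_auth.tf: the variable names are hypothetical, and the provider is configured from the cluster data sources purely for the sake of a self-contained example.

```hcl
# Illustrative sketch only; variable names are assumptions.
variable "cluster_name" {
  type    = string
  default = "eks-bottlerocket-imnr"
}

variable "worker_role_arn" {
  description = "ARN of the IAM role used by the worker nodes"
  type        = string
}

data "aws_eks_cluster" "this" {
  name = var.cluster_name
}

# Short-lived token for authenticating to the cluster API.
data "aws_eks_cluster_auth" "this" {
  name = var.cluster_name
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.this.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.this.token
}

# Maps the worker IAM role to the Kubernetes groups that nodes need.
resource "kubernetes_config_map" "aws_auth" {
  metadata {
    name      = "aws-auth"
    namespace = "kube-system"
  }

  data = {
    mapRoles = yamlencode([
      {
        rolearn  = var.worker_role_arn
        username = "system:node:{{EC2PrivateDNSName}}"
        groups   = ["system:bootstrappers", "system:nodes"]
      }
    ])
  }
}
```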

Ensure you are inside “bottlerocket” by running the pwd command.

pwd

Copy aws_auth.tf into the “bottlerocket” directory using the cp command.

cp ../amazon-eks-bottlerocket-mngnodegrp-terraform/aws_auth.tf .

Then run terraform init, plan and apply.

terraform init
terraform plan
terraform apply --auto-approve

Once you successfully run the above three commands you should see output as shown below.

Step 7: Create the managed node group and add it to the EKS cluster

The next step is to create a managed worker node group that Amazon EKS will use to deploy workloads. Managed node groups automate the provisioning and lifecycle management of nodes (Amazon EC2 instances) for Amazon EKS clusters. With Amazon EKS managed node groups, you don’t need to separately provision or register the Amazon EC2 instances that provide compute capacity to run Kubernetes applications.
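
As a rough illustration of what such a node group looks like in Terraform, the sketch below wires a launch template into an aws_eks_node_group resource. It is a standalone example and not the repository's eks_workload_node_group.tf: every input arrives through a hypothetical variable, the node group name is made up, and the scaling values simply mirror the 3/3/5 counts used in this post. The instance type is assumed to come from the launch template.

```hcl
# Illustrative sketch only; variable names and the node group name are assumptions.
variable "cluster_name" {
  type    = string
  default = "eks-bottlerocket-imnr"
}

variable "worker_role_arn" {
  type = string
}

variable "private_subnet_ids" {
  type = list(string)
}

variable "launch_template_id" {
  type = string
}

variable "launch_template_version" {
  description = "Launch template version number to use"
  type        = string
}

resource "aws_eks_node_group" "bottlerocket" {
  cluster_name    = var.cluster_name
  node_group_name = "${var.cluster_name}-mng-worker"
  node_role_arn   = var.worker_role_arn
  subnet_ids      = var.private_subnet_ids

  # The launch template supplies the Bottlerocket AMI and user data.
  launch_template {
    id      = var.launch_template_id
    version = var.launch_template_version
  }

  scaling_config {
    min_size     = 3
    desired_size = 3
    max_size     = 5
  }
}
```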

Now, let’s create a managed node group using the launch template we created in Step 5:

Ensure you are inside “bottlerocket” by running the pwd command.

pwd

Copy eks_workload_node_group.tf, eks_workload_node_group_variables.tf, and eks_workload_node_group_output.tf into the “bottlerocket” workspace directory using the cp command.

cp ../amazon-eks-bottlerocket-mngnodegrp-terraform/eks_workload_node_group.tf ../amazon-eks-bottlerocket-mngnodegrp-terraform/eks_workload_node_group_variables.tf ../amazon-eks-bottlerocket-mngnodegrp-terraform/eks_workload_node_group_output.tf .

If you want to change the minimum, maximum, or desired node count, you can do that in the “eks_workload_node_group_variables.tf” file.
Then run terraform init, plan, and apply.

terraform init
terraform plan
terraform apply --auto-approve

Once you successfully run the above three commands, you should see a new node group named “eks-bottlerocket-imnr-mng-worker-####” in your cluster, with a minimum and desired node count of 3 and a maximum node count of 5, as shown below.

We can now verify the new cluster in our AWS account. Navigate to the AWS Management Console → Elastic Kubernetes Service → Amazon EKS → Clusters. You should see the cluster you created in Step 4.

To verify the node group, click the cluster name → Configuration → Compute. You should see a new managed node group attached to your cluster.

To view the managed node group configuration, click the node group name. You will be taken to the node group configuration page, where you can see the configuration details.

To verify the operating system of the managed node group instances, navigate to the AWS Management Console → Elastic Kubernetes Service → Amazon EKS → Clusters → click the cluster name → Overview → click any instance as shown below.

On the Instance Info page, check “OS Image”.

Step 8: Deploy a sample application and validate the setup

The next step is to deploy an application and see if our EKS cluster using Bottlerocket managed node groups is working as expected.

Locate the directory named “sample-application” inside the source code directory, then copy it to the “bottlerocket” directory.

cp -R ../amazon-eks-bottlerocket-mngnodegrp-terraform/sample-application .

Find your region and cluster name in the “vpc_variables.tf” file (Step 2), then run the command below after substituting your own region and cluster name. This will configure kubectl to connect to your EKS cluster.

The general form is: aws eks --region <your region name> update-kubeconfig --name <your cluster name>. For example:

aws eks --region us-east-1 update-kubeconfig --name eks-bottlerocket-imnr

Inside “sample-application”, you will see six Kubernetes manifest files for our microservice application.
Deploy the sample application to the cluster:

kubectl apply -f ./sample-application/

After running the above command, you will see output like this:

Once the application is deployed, we need to find the LoadBalancer FQDN to access it. To get the LoadBalancer name, run the following command:

kubectl get svc

Copy the LoadBalancer name from the terminal and open it in any web browser. You should see a page as shown below:

The above page validates that our application has been successfully deployed on Amazon EKS on a managed node group with the Bottlerocket operating system.

Step 9: Cleanup

To clean everything up, follow these steps:

First, ensure you are in the “bottlerocket” directory by running “pwd”, then delete the sample application by running the following command:

kubectl delete -f ./sample-application/

Verify that the LoadBalancer has been deleted by running the following command:

kubectl get svc

Then run terraform destroy.

Take extra care when running this command, because it destroys everything that Terraform created from this directory. Ensure you are inside the directory you created in Step 1 before you run it.

terraform destroy --auto-approve

Conclusion

In this post, we outlined how to create an EKS cluster and a Bottlerocket managed node group using Terraform, and then deployed a sample application to validate the setup.

Finally, rather than deploying each component sequentially, we can deploy the entire EKS stack (except the application) using a single terraform command. To do so, follow the steps below:

Deploy everything using a single command

Clone the Terraform code into your home directory:

git clone https://github.com/aws-samples/amazon-eks-bottlerocket-mngnodegrp-terraform.git
cd amazon-eks-bottlerocket-mngnodegrp-terraform

Then run the init, plan, and apply commands.

terraform init
terraform plan
terraform apply --auto-approve
