AWS Storage Blog
Running a Kubernetes cluster with Amazon EKS Distro across AWS Snowball Edge
AWS Snowball Edge customers are running applications for edge local data processing, analysis, and machine learning using Amazon EC2 compute instances on Snowball Edge devices in remote or disconnected locations. Customers use Snowball Edge devices in locations including, but not limited to, cruise ships, oil rigs, and factory floors with no or limited network connectivity. The ability to run containerized applications on Snow devices makes it even easier for customers to standardize operations across all their environments, providing better consistency, flexibility, portability, and application density.
With the announcement of Amazon EKS Distro (EKS-D) at re:Invent 2020, we have taken a big step towards supporting customers’ container needs at the edge. EKS-D provides the same Kubernetes distribution based on, and used by, Amazon Elastic Kubernetes Service (Amazon EKS) to create reliable and secure Kubernetes clusters. With EKS-D, you can rely on the same versions of Kubernetes and its dependencies deployed by EKS.
In this blog, we talk about how you can leverage Amazon EKS-D to build a Kubernetes cluster across three AWS Snowball Edge Compute Optimized devices. You can also follow these steps to set up a Kubernetes cluster on AWS Snowball Edge Storage Optimized and AWS Snowcone devices as well.
Overview
We set up an EKS-D Kubernetes cluster using seven EC2 instances running on three AWS Snowball Edge Compute Optimized devices. We run one control plane node and one worker node on each Snowball Edge device. We also set up a simple load balancer for the control plane API servers using HAProxy on a standalone EC2 instance running on one of the Snowball Edge devices. With the seven EC2 instances, we build a sample layout as shown in the following diagram, which demonstrates how to run multiple Kubernetes control plane nodes and multiple Kubernetes worker nodes across Snowball Edge devices.
We use the following specific software versions in this blog:
- CentOS 7
- EKS-D Kubernetes v1.18.9
- EKS-D etcd v3.4.14
- Flannel v0.13.0 for container networking
- HAProxy v1.5.18 for control plane API server load balancing
Getting started
To get started, you must order three Snowball Edge devices by following the documentation on creating an AWS Snowball Edge job. As part of the ordering process, you must prepare and select a CentOS 7 marketplace AMI by following the documentation on adding an AMI from AWS Marketplace.
Once your three Snowball Edge devices arrive, you can connect them to your local on-premises network using the documentation on connecting to your local network. Ensure that each Snowball Edge device gets a unique IP address and ensure you are on the same subnet. Once completed, use AWS OpsHub or the AWS Snowball Edge client to unlock all three devices.
Creating Amazon EC2 instances on Snowball Edge
Follow the documentation on using Amazon EC2 compute instances to launch three EC2 instances with instance type sbe-c.large (2 vCPUs and 8 GiB memory) on one Snowball Edge device and two EC2 instances on each of the other two devices. Configure and attach virtual network interfaces (NIC) to each EC2 instance so that they can be routable in your local network. Each EC2 instance must be reachable by SSH and have access to the public internet.
Note: Kubernetes requires at least 2 vCPUs for a control plane node to set up properly.
In this blog post, we use the following configuration to give you an idea of a sample setup, and we refer to this configuration through the rest of the blog. In your configuration, use the public IP addresses that you defined during virtual NIC setup of the instances.
| Role | Instance hostname | Snowball Edge | Public IP address | Instance type |
| --- | --- | --- | --- | --- |
| Control plane node 0 | master0.snowball | Snowball Edge 0 | 10.42.0.10 | sbe-c.large |
| Control plane node 1 | master1.snowball | Snowball Edge 1 | 10.42.0.20 | sbe-c.large |
| Control plane node 2 | master2.snowball | Snowball Edge 2 | 10.42.0.30 | sbe-c.large |
| Worker node 0 | worker0.snowball | Snowball Edge 0 | 10.42.0.11 | sbe-c.large |
| Worker node 1 | worker1.snowball | Snowball Edge 1 | 10.42.0.21 | sbe-c.large |
| Worker node 2 | worker2.snowball | Snowball Edge 2 | 10.42.0.31 | sbe-c.large |
| Load balancer | haproxy.snowball | Snowball Edge 1 | 10.42.0.130 | sbe-c.large |
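Before moving on, you can optionally confirm that each instance is reachable over SSH and sized as expected. The following is a minimal check, assuming mykey.pem is the key pair you used when launching the instances and the public IP addresses from the table above:
for ip in 10.42.0.10 10.42.0.20 10.42.0.30 10.42.0.11 10.42.0.21 10.42.0.31 10.42.0.130; do
  # Print hostname, vCPU count, and memory for each instance; flag anything unreachable
  ssh -i mykey.pem -o ConnectTimeout=5 centos@${ip} 'hostname; nproc; free -h' || echo "unreachable: ${ip}"
done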
Configuring your EC2 instance used as a load balancer
In this example, we run HAProxy as a simple load balancer for the Kubernetes API servers on the three control plane nodes. The load balancer ensures that if one control plane node's API server goes down, the Kubernetes cluster can continue to operate. For simplicity, we set it up on an EC2 instance on one of the devices. Consider setting up external load balancers with redundancy for better availability.
- Log into your load balancer instance, escalate the user to root, and install HAProxy.
ssh -i mykey.pem centos@<instance public IP address>
sudo su
yum -y install haproxy
- Update /etc/haproxy/haproxy.cfg to load balance incoming traffic to the three API servers on the control plane nodes.
cat <<EOF >/etc/haproxy/haproxy.cfg
global
    daemon
    maxconn 256

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

listen stats *:9999
    stats enable
    stats hide-version
    stats uri /
    stats auth aws:snowball

frontend kubernetes-frontend
    bind *:6443
    mode tcp
    option tcplog
    default_backend kubernetes-backend

backend kubernetes-backend
    mode tcp
    option tcp-check
    balance roundrobin
    server master0 10.42.0.10:6443 check
    server master1 10.42.0.20:6443 check
    server master2 10.42.0.30:6443 check
EOF
- Change the SELinux policy to allow HAProxy to connect to any port.
setsebool -P haproxy_connect_any on
- Start and enable HAProxy.
systemctl start haproxy
systemctl enable haproxy
With the preceding configuration, you can take advantage of HAProxy monitoring, a web interface for monitoring the load balancer and the backend servers. You can reach the HAProxy monitoring endpoint by opening 10.42.0.130:9999 in your web browser. The sample username:password is aws:snowball.
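If you prefer the command line, you can also run a quick check of the statistics page with curl from any machine that can reach the load balancer, using the same sample credentials:
# Expect an HTML page titled "Statistics Report for HAProxy"
curl -s -u aws:snowball http://10.42.0.130:9999/ | head -n 20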
Configuring your Amazon EC2 instances used as Kubernetes nodes
In this section, we update the system configurations inside the Amazon EC2 instances to get ready for Kubernetes cluster deployment.
Updating the networking configurations
We start by updating the networking configurations to match the proposed configuration in the table above.
- Log into your three control plane instances and three worker instances one by one, or in parallel with multiple SSH sessions. Use your individual SSH sessions to go through the rest of this section.
ssh -i mykey.pem centos@<instance public IP address>
- Once on the instance, escalate the user to root.
sudo su
- Use the following command to set the proper hostname for each instance.
hostnamectl set-hostname <node’s hostname>
For example, on 10.42.0.10, you want to run the following command to set the hostname.
hostnamectl set-hostname master0.snowball
- Add a cloud-init configuration so that the new hostname doesn’t get reverted when the instance is rebooted.
echo "preserve_hostname: true" > /etc/cloud/cloud.cfg.d/99_hostname.cfg
- Create a second loopback interface so that the instance is aware of its own public IP. This is necessary so that applications running inside the instance can listen on its public IP. Restart the network service for the change to take effect.
PublicIP="<instance's public IP>"
cat <<EOF > /etc/sysconfig/network-scripts/ifcfg-lo:1
DEVICE=lo:1
IPADDR=${PublicIP}
NETMASK=255.255.255.255
NETWORK=127.0.0.0
BROADCAST=127.255.255.255
ONBOOT=yes
NAME=loopback
EOF
systemctl restart network
After the network service restarts, you can run ifconfig to verify that the loopback interface lo:1 is created and the proper IP address is attached.
- Update /etc/hosts to be able to translate FQDNs to IPs for all nodes. If you have a DNS server available in your local network, you can register individual records with your DNS server and update /etc/resolv.conf to use your DNS server address instead.
rm -f /etc/hosts
cat <<EOF > /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.42.0.130 haproxy.snowball
10.42.0.10 master0.snowball
10.42.0.11 worker0.snowball
10.42.0.20 master1.snowball
10.42.0.21 worker1.snowball
10.42.0.30 master2.snowball
10.42.0.31 worker2.snowball
EOF
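After updating /etc/hosts, a quick loop confirms that each hostname resolves and responds from the current instance (assuming ICMP is allowed on your local network):
for host in haproxy master0 master1 master2 worker0 worker1 worker2; do
  # getent uses the same resolution path (/etc/hosts, then DNS) as most applications
  getent hosts ${host}.snowball
  ping -c 1 -W 2 ${host}.snowball > /dev/null && echo "${host}.snowball reachable" || echo "${host}.snowball NOT reachable"
done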
Updating the operating system configurations
Stay logged in as root user and complete the following steps to make changes to additional instance configurations.
- Disable SELinux. This is necessary since kubelet lacks SELinux support.
setenforce 0
sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux
- Enable the kernel bridge networking settings required for cluster communication. If the settings fail to apply, see the note after the commands.
cat <<EOF >/etc/sysctl.d/kubernetes.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system
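Note: if sysctl --system reports that the net.bridge.* keys are unknown, the br_netfilter kernel module is likely not loaded yet (Docker often loads it later). A minimal way to load it now and on every boot, then reapply the settings:
modprobe br_netfilter
echo br_netfilter > /etc/modules-load.d/br_netfilter.conf
sysctl --system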
- Disable swap to prevent memory allocation issues.
swapoff -a
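Note that swapoff -a only disables swap until the next reboot. If your AMI defines a swap entry in /etc/fstab, you can also comment it out so the setting persists; this is a hedged sketch, as the CentOS 7 Marketplace AMI may not have a swap entry at all.
# Comment out any swap entries; a backup is kept as /etc/fstab.bak
sed -i.bak '/\sswap\s/ s/^/#/' /etc/fstab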
Installing EKS-D on all control plane and worker instances
In this section, we start installing the software and tooling for running Docker and EKS-D on the instances.
Installing Docker
We start by installing Docker on each instance.
- Stay logged in as root user and install necessary packages.
yum -y update
yum install -y yum-utils device-mapper-persistent-data lvm2 wget
- Add the Docker repo and install Docker.
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install -y docker-ce
- Configure the Docker cgroup driver to systemd; enable and start Docker.
sed -i '/^ExecStart/ s/$/ --exec-opt native.cgroupdriver=systemd/' /usr/lib/systemd/system/docker.service
systemctl daemon-reload
systemctl enable docker --now
You can verify that the Docker process is up and running by executing the following command. The service should be in the active (running) state.
systemctl status docker
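You can also confirm that the cgroup driver change took effect; the output should report systemd:
docker info 2>/dev/null | grep -i 'cgroup driver'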
Installing EKS-D
Now it’s time to install EKS-D on each instance.
- Install CNI from the AWS EKS-D repo.
mkdir -p /opt/cni/bin
wget -q https://distro.eks.amazonaws.com/kubernetes-1-18/releases/1/artifacts/plugins/v0.8.7/cni-plugins-linux-amd64-v0.8.7.tar.gz
tar zxf cni-plugins-linux-amd64-v0.8.7.tar.gz -C /opt/cni/bin/
- Download kubeadm, kubelet, and kubectl from the AWS EKS-D repo.
wget -q https://distro.eks.amazonaws.com/kubernetes-1-18/releases/1/artifacts/kubernetes/v1.18.9/bin/linux/amd64/kubeadm
wget -q https://distro.eks.amazonaws.com/kubernetes-1-18/releases/1/artifacts/kubernetes/v1.18.9/bin/linux/amd64/kubelet
wget -q https://distro.eks.amazonaws.com/kubernetes-1-18/releases/1/artifacts/kubernetes/v1.18.9/bin/linux/amd64/kubectl
mv kubeadm kubelet kubectl /usr/bin/
chmod +x /usr/bin/kubeadm /usr/bin/kubelet /usr/bin/kubectl
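A quick check that the EKS-D binaries are on the PATH and report the expected version (v1.18.9-eks-1-18-1):
kubeadm version -o short
kubectl version --client --short
kubelet --version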
- Install dependencies for kubelet. Pass the kubelet arguments here to change the cgroup driver to systemd, which must match the cgroup driver used by Docker.
yum -y install conntrack ebtables socat
cat <<EOF > /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS='--cgroup-driver=systemd'
EOF
- Create directories and files that are usually needed by kubeadm and kubelet. Enable kubelet to start on boot.
mkdir -p /etc/kubernetes/manifests
mkdir -p /usr/lib/systemd/system/kubelet.service.d
cat <<EOF > /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet \$KUBELET_KUBECONFIG_ARGS \$KUBELET_CONFIG_ARGS \$KUBELET_KUBEADM_ARGS \$KUBELET_EXTRA_ARGS
EOF
cat <<EOF > /usr/lib/systemd/system/kubelet.service
[Unit]
Description=kubelet: The Kubernetes Node Agent
Documentation=https://kubernetes.io/docs/
Wants=network-online.target
After=network-online.target
[Service]
ExecStart=/usr/bin/kubelet
Restart=always
StartLimitInterval=0
RestartSec=10
[Install]
WantedBy=multi-user.target
EOF
systemctl enable kubelet
- Download the Docker images needed for the control plane from the Amazon ECR public repo.
docker pull public.ecr.aws/eks-distro/etcd-io/etcd:v3.4.14-eks-1-18-1
docker pull public.ecr.aws/eks-distro/kubernetes/pause:v1.18.9-eks-1-18-1
docker pull public.ecr.aws/eks-distro/kubernetes/kube-scheduler:v1.18.9-eks-1-18-1
docker pull public.ecr.aws/eks-distro/kubernetes/kube-proxy:v1.18.9-eks-1-18-1
docker pull public.ecr.aws/eks-distro/kubernetes/kube-apiserver:v1.18.9-eks-1-18-1
docker pull public.ecr.aws/eks-distro/kubernetes/kube-controller-manager:v1.18.9-eks-1-18-1
docker pull public.ecr.aws/eks-distro/coredns/coredns:v1.7.0-eks-1-18-1
- Because kubeadm hardcodes the names and tags of these images, retag the following images.
docker tag public.ecr.aws/eks-distro/kubernetes/pause:v1.18.9-eks-1-18-1 public.ecr.aws/eks-distro/kubernetes/pause:3.2
docker tag public.ecr.aws/eks-distro/coredns/coredns:v1.7.0-eks-1-18-1 public.ecr.aws/eks-distro/kubernetes/coredns:1.6.7
- You can run docker images to confirm all the images are downloaded and tagged successfully. The output should look like the following:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
public.ecr.aws/eks-distro/kubernetes/pause 3.2 ff45cda5b28a 7 days ago 702kB
public.ecr.aws/eks-distro/kubernetes/pause v1.18.9-eks-1-18-1 ff45cda5b28a 7 days ago 702kB
public.ecr.aws/eks-distro/kubernetes/kube-proxy v1.18.9-eks-1-18-1 7b3d7533dd46 7 days ago 580MB
public.ecr.aws/eks-distro/kubernetes/kube-scheduler v1.18.9-eks-1-18-1 3f6c60b31475 7 days ago 504MB
public.ecr.aws/eks-distro/kubernetes/kube-controller-manager v1.18.9-eks-1-18-1 b50f3c224c59 7 days ago 573MB
public.ecr.aws/eks-distro/kubernetes/kube-apiserver v1.18.9-eks-1-18-1 a2ea61c746e1 7 days ago 583MB
public.ecr.aws/eks-distro/etcd-io/etcd v3.4.14-eks-1-18-1 e77eead05c5e 7 days ago 498MB
public.ecr.aws/eks-distro/coredns/coredns v1.7.0-eks-1-18-1 6dbf7f0180db 7 days ago 46.7MB
public.ecr.aws/eks-distro/kubernetes/coredns 1.6.7 6dbf7f0180db 7 days ago 46.7MB
As a reminder, you must perform all the steps in this section on all control plane nodes and worker nodes.
Standing up a Kubernetes cluster on Snowball Edge
In this section, we go through how to bring up the Kubernetes cluster based on all the setup we have done.
- SSH into the control plane node 0 (10.42.0.10). Create cluster configuration.
sudo su
cat <<EOF > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
controlPlaneEndpoint: "haproxy.snowball:6443"
networking:
  podSubnet: "10.244.0.0/16"
etcd:
  local:
    imageRepository: public.ecr.aws/eks-distro/etcd-io
    imageTag: v3.4.14-eks-1-18-1
    extraArgs:
      listen-peer-urls: "https://0.0.0.0:2380"
      listen-client-urls: "https://0.0.0.0:2379"
imageRepository: public.ecr.aws/eks-distro/kubernetes
kubernetesVersion: v1.18.9-eks-1-18-1
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: "10.42.0.10"
EOF
With this configuration, we initialize the cluster pod subnet using the Flannel default IP range of 10.244.0.0/16. advertiseAddress sets the advertise address for this particular control plane node's API server, and controlPlaneEndpoint sets the load balancer's endpoint that nodes use to reach the control plane API servers.
- Initialize the Kubernetes cluster with the configuration file. Add --upload-certs to upload the control plane certificates to the kubeadm-certs Secret so that the other control plane nodes can pull them later.
kubeadm init --config kubeadm-config.yaml --upload-certs
- From the output of the preceding step, you get two kubeadm join commands. One command is needed for worker nodes to join the cluster. The other command is needed for additional control plane nodes to join the cluster.
The following is a sample kubeadm join command to run on other control plane nodes for them to join the cluster.
kubeadm join haproxy.snowball:6443 --token <token> \
--discovery-token-ca-cert-hash <discovery-token-ca-cert-hash> \
--control-plane --certificate-key <certificate-key>
- To get the other control plane nodes to join the cluster properly, we must set each node's advertise address to its public IP when executing the kubeadm join command. For control plane node 1 (10.42.0.20) to join the cluster, the actual command looks like the following one; note the added --apiserver-advertise-address argument. Become the root user by running sudo su before executing the following command.
kubeadm join haproxy.snowball:6443 --token <token> \
--discovery-token-ca-cert-hash <discovery-token-ca-cert-hash> \
--control-plane --certificate-key <certificate-key> \
--apiserver-advertise-address=<current-instance-public-IP>
- To get worker nodes to join the cluster, run the kubeadm join command on the worker nodes. The command should look like the following. Again, you must become the root user by running sudo su before executing the following command.
kubeadm join haproxy.snowball:6443 --token <token> \
--discovery-token-ca-cert-hash <discovery-token-ca-cert-hash>
- Go back to control plane node 0 (10.42.0.10). Run the following as a non-root user to set up kubectl.
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
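At this point, kubectl can reach the API server through the load balancer. If you list the nodes now, expect them to report a NotReady status until the Flannel CNI is deployed in the next step:
kubectl get nodes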
- Deploy Flannel.
By default, Flannel auto-detects the default interface and uses the IP address assigned to it. As a result, Flannel picks up the private IPs of the individual instances, which blocks communication across devices.
To enable Flannel to use the public IP instead, we use annotations, which add arbitrary key-value pairs to existing Kubernetes objects.
On the node where you have kubectl set up, run the following command for all the control plane nodes and worker nodes before deploying Flannel. The node name is the hostname for individual EC2 instances. The public IP address is the public IP associated with individual EC2 instances. Repeat the following command six times with node-specific configs.
kubectl annotate node <node name> flannel.alpha.coreos.com/public-ip-overwrite=<Public IP address> --overwrite
With the setup in this blog, we must run the following six commands.
kubectl annotate node master0.snowball flannel.alpha.coreos.com/public-ip-overwrite=10.42.0.10 --overwrite
kubectl annotate node master1.snowball flannel.alpha.coreos.com/public-ip-overwrite=10.42.0.20 --overwrite
kubectl annotate node master2.snowball flannel.alpha.coreos.com/public-ip-overwrite=10.42.0.30 --overwrite
kubectl annotate node worker0.snowball flannel.alpha.coreos.com/public-ip-overwrite=10.42.0.11 --overwrite
kubectl annotate node worker1.snowball flannel.alpha.coreos.com/public-ip-overwrite=10.42.0.21 --overwrite
kubectl annotate node worker2.snowball flannel.alpha.coreos.com/public-ip-overwrite=10.42.0.31 --overwrite
Once complete, you are ready to deploy Flannel.
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
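You can watch the Flannel DaemonSet pods come up on every node; the label below is the one used by the upstream kube-flannel manifest at the time of writing:
kubectl get pods -n kube-system -l app=flannel -o wide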
You now have your first EKS-D Kubernetes cluster up and running on Snowball Edge devices! You can run the following commands to take a peek at the cluster setup.
kubectl get nodes -o wide
kubectl get pod -o wide -n kube-system
With our setup, the VERSION column of the kubectl get nodes output reports the EKS-D build of Kubernetes (v1.18.9-eks-1-18-1), confirming that we are running EKS-D on Snowball Edge successfully.
Deploying a sample application
Now, you can go through this section to deploy a sample application on your EKS-D Kubernetes cluster. We are going to deploy an NGINX web server with default configuration. Log into the node where you had kubectl configured.
Deploy the application using the kubectl create deployment command.
kubectl create deployment nginx-deployment --image=nginx
After a few seconds, you can run the following command and expect to see the NGINX pod running on one of the worker nodes.
kubectl get pod -o wide
With our cluster setup, the NGINX pod runs on worker node 2 with pod IP 10.244.4.3. You can validate that the application is running by reaching the endpoint with curl. You should expect to see output containing "Welcome to nginx!"
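For example, from any of the cluster nodes (pod IPs are routable across nodes through Flannel), using the pod IP shown in your own kubectl get pod output:
# Fetch the NGINX welcome page and show its title
curl -s http://10.244.4.3 | grep title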
Cleaning up
If you would like to clean up everything or start from scratch, you can terminate all seven EC2 instances by invoking the TerminateInstances API against the EC2-compatible endpoints running on your Snowball Edge devices.
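For example, with the AWS CLI configured against a device's EC2-compatible endpoint; the endpoint URL, profile name, and s.i-* instance ID below are placeholders you must replace with your own values:
aws ec2 terminate-instances \
    --instance-ids s.i-0123456789abcdef0 \
    --endpoint-url https://<ec2-compatible-endpoint> \
    --profile snowballEdge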
If you would like to return your Snowball Edge, please follow the documentation on powering off the Snowball Edge and returning the device.
Conclusion
In this blog, we walked you through how you can run an Amazon EKS-D Kubernetes cluster across multiple AWS Snowball Edge Compute Optimized devices. You can also follow these steps to set up Kubernetes clusters using a mix of Snowball Edge Compute Optimized, Snowball Edge Storage Optimized, and AWS Snowcone devices. You now have the option to adopt modern Kubernetes and container deployments at the edge or move your on-premises infrastructure onto AWS Snow Family services. By running Amazon EKS-D on AWS Snow Family services, you can take advantage of both offerings and enable your on-premises computing deployment to improve flexibility, consistency, agility, and security. Thanks for reading! Please do not hesitate to leave questions or comments in the comments section.
To learn more about AWS Snow Family and Amazon EKS-D, check out the following links: