Building an Amazon ECS Anywhere home lab with Amazon VPC network connectivity

Since 2014, Amazon Elastic Container Service (Amazon ECS) has helped AWS customers orchestrate containerized application deployments across a wide range of compute environments. Initially, Amazon ECS could only be used with AWS-managed compute hardware, such as Amazon Elastic Compute Cloud (Amazon EC2) instances, AWS Fargate, AWS Wavelength, and AWS Outposts. With the general availability of Amazon ECS Anywhere (ECS Anywhere), it is now possible to use your own compute hardware as capacity for an Amazon ECS cluster.

This post will cover the process of building a home lab for running Amazon ECS tasks. The home lab will allow you to use the Amazon ECS API to launch tasks on your own compute hardware. Additionally, by using AWS Site-to-Site VPN, you can access a remote Amazon Virtual Private Cloud (Amazon VPC) from your local network or access your local network from the remote Amazon VPC. The Site-to-Site VPN allows local tasks running in a local Amazon ECS Anywhere cluster to talk to Amazon Relational Database Service (Amazon RDS), Amazon ElastiCache, or other fully managed AWS services inside the Amazon VPC. Furthermore, the local cluster can receive inbound connections from Amazon VPC hosted services like Application Load Balancer (ALB) or Network Load Balancer (NLB).

The architecture

To understand how ECS Anywhere works, we need to look at its components. Each piece of hardware or virtual machine that you want to use for ECS Anywhere needs a few of them installed to function as part of the Amazon ECS cluster.

The first component is an agent that is connected to AWS Systems Manager. When you install the AWS Systems Manager Agent (SSM Agent), you supply a secret activation code that allows the agent to register itself with AWS Systems Manager. The agent uses the activation code to register the hardware device as a managed instance and download a secret key for that managed instance. From that point on, the managed instance can be assigned an AWS Identity and Access Management (IAM) role and will automatically receive IAM credentials for that role. This role is essential because it allows the instance to make the required calls to other AWS services, such as Amazon ECS.

The next essential component is Docker, which will launch containers on the managed host. One of the containers that Docker launches will be the Amazon ECS agent. This agent uses the managed instance’s IAM role to connect to the Amazon ECS control plane in an AWS Region. Once connected, it can receive instructions from the Amazon ECS control plane on what tasks and containers to launch. The agent can also submit task telemetry to the control plane about the lifecycle of those containers and their health.

The next piece to understand is how the networking operates between an Amazon VPC in an AWS Region and the local network that is running the ECS Anywhere cluster.

On the Amazon VPC side, we can use AWS Site-to-Site VPN to provide a fully managed VPN gateway. The gateway is configured to add a route in the route table for the Amazon VPC. The route directs all traffic that is addressed to the on-premises network CIDR range out via the VPN gateway. There is a corresponding self-managed VPN gateway on-premises, as well as self-managed routes so that any traffic addressed to the Amazon VPC CIDR range is directed to the on-premises end of the VPN gateway.

With this configuration, any resources in the on-premises network can talk to resources in the Amazon VPC using the private IP addresses of the Amazon VPC hosted resources. For instance, an on-premises Raspberry Pi can send traffic to an Amazon RDS instance running in the Amazon VPC. Additionally, Amazon VPC-hosted resources can talk to resources on-premises using their private IP addresses. In the previous diagram, an Amazon VPC-hosted NLB communicates with a Raspberry Pi using the private IP address of the Raspberry Pi.

It is important to remember that as long as on-premises devices have internet connectivity, they can communicate with many AWS services over the public internet alone. This includes Amazon ECS, Amazon DynamoDB, Amazon Simple Storage Service (Amazon S3), and many other AWS services that are accessible via public service endpoints. Extra networking configuration is only required for AWS resources that are tied to a specific Amazon VPC.

Building your home lab hardware

ECS Anywhere is designed to function on a wide range of different devices and operating systems, so there are many different hardware options you can choose from to build your home lab. You may already have some hardware that you wish to use for your Amazon ECS cluster, but perhaps you were looking for a reason to buy some new stuff! This section contains a parts list to help you build an ECS Anywhere home lab using Raspberry Pi devices. These parts can be substituted with other alternatives for your home lab, but you may find this list to be a good starting point for your build.

Raspberry Pi has a few key benefits. The devices are fairly cheap, so you can build a larger cluster on a lower budget. Additionally, they are ARM-based devices, which can make them perfect for testing ARM builds locally at home if your other development devices are all Intel-based. Finally, Raspberry Pi is a low-power device that can be run with passive cooling, so it can be ideal if you don’t want a lot of noisy fans in your office.

For compute, you might use the following components:

  • 4 x Raspberry Pi 4 Model B (8 GB RAM, Broadcom BCM2711, Quad core Cortex-A72 (ARM v8) 64-bit SoC @ 1.5 GHz). This provides a total of 16 cores and 32 GB of memory for running Amazon ECS tasks.
  • 4 x Raspberry Pi Power over Ethernet HAT. This is an add-on circuit board that sits on top of the Raspberry Pi and gives it the ability to be powered over the Ethernet cable. This part is optional, but ideal if you don’t want to deal with power cables in addition to network cables.
  • 4 x 128 GB SD card. This serves as persistence for the Raspberry Pi to store the operating system and everything you install and run on the Raspberry Pi, including Docker images.
  • 4 x Raspberry Pi low profile heatsinks, for enhanced passive cooling. The heatsink must be small enough to fit between the Raspberry Pi and the PoE HAT.

To get these individual devices running as a neat, self-contained cluster, consider the following components:

  • 4 x Cat8 Ethernet patch cable, 1-foot length, to connect the Raspberry Pis to a switch.
  • 1 x TP-Link 8 Port Power over Ethernet Switch. This supplies all of the Raspberry Pis with power and a wired internet connection so that you don’t overload your Wi-Fi network. This switch also fits almost perfectly in the case.
  • 1 x cluster case for Raspberry Pi from C4 Labs. The clear case lets you easily see the devices inside of it. It has eight detachable bays and room for a switch on the bottom. The case comes with room for cooling fans, which you can install if you plan to fill all eight slots with devices and therefore need some mechanical help to pull more airflow through the case.

Setting up the software

After assembling all the hardware, you will need to do a bit of software setup. Specifically for Raspberry Pi, you can use the Raspberry Pi Imager to install an operating system on the SD cards. For this example cluster, we will use Ubuntu 20. Add your public key as an authorized key for SSH access. Then, use SSH to run commands on the Raspberry Pi itself. You will need to make a few adjustments to the firmware configuration.

First, for your devices to run Docker and Amazon ECS tasks, you must enable memory cgroups in the boot config. This may not be enabled by default, but it is necessary for Docker to function properly when you set hard or soft memory limits in your Amazon ECS task definition. You can do this by adding cgroup_enable=memory to the single line of kernel parameters in the file /boot/firmware/cmdline.txt and then rebooting the device.
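The edit to cmdline.txt can be scripted so that it is safe to repeat across all four devices. The following is a minimal sketch rather than a definitive procedure: the function name enable_memory_cgroup is hypothetical, and it assumes the file holds a single line of space-separated kernel parameters.

```shell
# Append cgroup_enable=memory to a kernel command-line file if it is missing.
# Assumes the file is one line of space-separated parameters.
enable_memory_cgroup() {
  local cmdline_file="$1"
  local tmp
  if grep -qw 'cgroup_enable=memory' "$cmdline_file"; then
    echo "already enabled"
    return 0
  fi
  tmp="$(mktemp)"
  # Append to the end of the first line so everything stays on one line.
  sed '1 s/$/ cgroup_enable=memory/' "$cmdline_file" > "$tmp"
  # Write back in place so ownership and permissions are preserved.
  cat "$tmp" > "$cmdline_file"
  rm -f "$tmp"
  echo "enabled; reboot required"
}

# Example (run as root on the Raspberry Pi):
#   enable_memory_cgroup /boot/firmware/cmdline.txt
```

Running the function a second time leaves the file unchanged, which makes it convenient to include in a setup script that you run on every device.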

Additionally, for Raspberry Pi, you may want to reduce the noise from the cluster. The stock PoE HATs have cooling fans that try to keep the temperature much lower than strictly necessary. If you have installed heatsinks, they will likely keep the device passively cooled well below its maximum operating temperature under typical conditions. The following configuration in /boot/firmware/usercfg.txt can keep the fans from turning on until the device reaches 68°C.

dtoverlay=rpi-poe
dtparam=poe_fan_temp0=68000
dtparam=poe_fan_temp1=72000
dtparam=poe_fan_temp2=76000
dtparam=poe_fan_temp3=80000

By running cat /sys/class/thermal/thermal_zone0/temp, you can monitor the Raspberry Pi temperature and ensure that it remains reasonable under load. The value is reported in millidegrees Celsius, so 55000 means 55°C. As long as the ambient temperature is not too high, the Raspberry Pi can be passively cooled by the heatsink until it’s been under a heavy load for an extended period of time. Once the temperature exceeds 68°C, the fan comes on for active cooling.
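To make the raw reading easier to interpret, you could wrap it in a couple of small helpers. This is an illustrative sketch: the function names are hypothetical, and the comparison only mirrors the first threshold from the configuration above (68000). It does not model any hysteresis the fan firmware may apply.

```shell
# Convert a sysfs thermal reading (millidegrees Celsius) to degrees, one decimal.
millideg_to_c() {
  local raw="$1"
  printf '%d.%d\n' "$((raw / 1000))" "$(((raw % 1000) / 100))"
}

# Compare a reading with the first PoE fan threshold (defaults to 68000).
temp_status() {
  local raw="$1"
  local threshold="${2:-68000}"
  if [ "$raw" -ge "$threshold" ]; then
    echo "at or above first fan threshold"
  else
    echo "below first fan threshold"
  fi
}

# Example on the Raspberry Pi:
#   raw="$(cat /sys/class/thermal/thermal_zone0/temp)"
#   echo "$(millideg_to_c "$raw") C, $(temp_status "$raw")"
```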

With these initial tweaks out of the way, you can use the AWS Management Console to get an activation command to run on each of your devices. This command will turn the device into capacity for your Amazon ECS cluster.
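If you prefer the command line to the console, the same registration can be sketched with the AWS CLI. Treat this as an approximation and the command from the console as authoritative. The role name ecsAnywhereRole and the function names are hypothetical, and the installer URL and flags reflect the ECS Anywhere documentation as of this writing and may change.

```shell
# Create a Systems Manager hybrid activation and print its ID and code.
# The role name you pass in (for example ecsAnywhereRole) is a placeholder.
create_activation() {
  local role="$1" limit="$2" region="$3"
  aws ssm create-activation \
    --iam-role "$role" \
    --registration-limit "$limit" \
    --region "$region" \
    --query '[ActivationId,ActivationCode]' \
    --output text
}

# Download and run the ECS Anywhere installer on a device.
# URL and flags are assumptions based on the public documentation.
register_device() {
  local region="$1" cluster="$2" activation_id="$3" activation_code="$4"
  curl --proto "https" -o /tmp/ecs-anywhere-install.sh \
    "https://amazon-ecs-agent.s3.amazonaws.com/ecs-anywhere-install.sh"
  sudo bash /tmp/ecs-anywhere-install.sh \
    --region "$region" \
    --cluster "$cluster" \
    --activation-id "$activation_id" \
    --activation-code "$activation_code"
}

# Example:
#   create_activation ecsAnywhereRole 4 us-east-1    # run on your workstation
#   register_device us-east-1 homelab <id> <code>    # run on each Raspberry Pi
```

The activation code is a secret, so avoid committing it to version control or leaving it in shell history on shared machines.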

The script automatically installs and configures the SSM Agent, Docker, and the Amazon ECS agent without any further input necessary. Once the script has finished running, you can see the devices show up under AWS Systems Manager Fleet Manager. You’ll also see a few details, such as their local IP address within your home network.
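You can also confirm the registration from a terminal. The following sketch assumes the AWS CLI is configured with permission to read Systems Manager and Amazon ECS. The function name and the cluster name you pass in are placeholders.

```shell
# List managed instances known to Systems Manager, then the container
# instances registered to the given ECS cluster.
list_registered_capacity() {
  local cluster="$1" region="$2"
  # Instance ID, IP address, and ping status for each managed instance.
  aws ssm describe-instance-information \
    --region "$region" \
    --query 'InstanceInformationList[].[InstanceId,IPAddress,PingStatus]' \
    --output text
  # ARNs of the container instances that joined the cluster.
  aws ecs list-container-instances \
    --cluster "$cluster" \
    --region "$region" \
    --query 'containerInstanceArns' \
    --output text
}

# Example:
#   list_registered_capacity homelab us-east-1
```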

One of the useful features of Fleet Manager is the ability to connect to a managed instance. This even works for devices that are behind Network Address Translation (NAT), with only a private IP address. This is because the SSM Agent on the host opens a control channel back to AWS Systems Manager. That channel can be used both to monitor the managed instance and to open an AWS Systems Manager Session Manager session to it. When you select “Start session,” it opens a shell right there in the browser. By launching htop, you can see the process tree, with the SSM Agent spawning a session worker that runs the shell.

Setting up the AWS Site-to-Site VPN

There are a few different networking approaches that you can use for your Amazon ECS cluster. The simplest approach for inbound traffic would be to configure port forwarding on your home router. This lets you send traffic to your home IP address and have it forwarded to one of the devices on your network.

But what if you want to connect back to resources inside an Amazon VPC? For large on-premises environments, you could use AWS Direct Connect to get a direct connection to AWS. However, for a home lab, this is not ideal. As a reduced-cost alternative, you may use an AWS Site-to-Site VPN. One of the Raspberry Pis can serve as a dedicated VPN gateway that runs strongSwan as an IPsec VPN. By going to the Amazon VPC console, you can create the AWS-side VPN gateway and download the instructions for configuring the on-premises VPN gateway.
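On the AWS side, the routing for the home network can also be set from the AWS CLI once the VPN resources exist. This is a sketch under assumptions: it presumes a VPN connection that uses static routing, and every ID passed in is a placeholder for a resource you have already created. The function name is hypothetical.

```shell
# Point the AWS side of the VPN at the home network CIDR range.
route_home_network() {
  local vpn_connection_id="$1" vgw_id="$2" route_table_id="$3" home_cidr="$4"
  # Static route on the VPN connection: which CIDR lives behind the tunnel.
  aws ec2 create-vpn-connection-route \
    --vpn-connection-id "$vpn_connection_id" \
    --destination-cidr-block "$home_cidr"
  # Let the virtual private gateway propagate routes into the VPC route table.
  aws ec2 enable-vgw-route-propagation \
    --route-table-id "$route_table_id" \
    --gateway-id "$vgw_id"
}

# Example (IDs are placeholders):
#   route_home_network vpn-0123 vgw-0123 rtb-0123 192.168.1.0/24
```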

By following the downloaded instructions, you can set up an IPsec VPN tunnel between your home network and your Amazon VPC. Run ipsec status to verify that the tunnels are up. In this case, you can see the output for a VPN connection that has been configured between a home network at 192.168.1.0/24 and an Amazon VPC at 10.0.0.0/16.

Next, you need to configure a route to the VPN gateway on the other Raspberry Pi devices. This route tells them that any traffic addressed to the IP range of the Amazon VPC should use the local IP address of the VPN Raspberry Pi as a gateway to reach the Amazon VPC. In the following example, the VPN Raspberry Pi has a local IP address of 192.168.1.196.

sudo route add -net 10.0.0.0/16 gw 192.168.1.196
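The route command comes from the net-tools package, which may not be installed on newer Ubuntu images. An equivalent using the ip command from iproute2 might look like the following sketch. The function name is hypothetical, and the addresses in the example are the ones used above. Note that routes added either way do not persist across reboots unless you also add them to your network configuration.

```shell
# Add (or update) a route to the VPC CIDR via the VPN Raspberry Pi,
# then print the resulting routing table entry. Run as root.
add_vpc_route() {
  local vpc_cidr="$1" gateway_ip="$2"
  # "replace" creates the route or overwrites an existing one, so the
  # function can be rerun without errors.
  ip route replace "$vpc_cidr" via "$gateway_ip"
  ip route show "$vpc_cidr"
}

# Example:
#   add_vpc_route 10.0.0.0/16 192.168.1.196
```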

We can verify network connectivity by using Fleet Manager again. Open a session to one of the Raspberry Pis, then ping an Amazon EC2 instance running inside the Amazon VPC.

In the previous screenshot, you can see the results of running a ping command on a Raspberry Pi that is running on my desk in my home network in New York City. The address that is being pinged is an Amazon EC2 instance running inside a VPC in US East (N. Virginia). The ping makes the round trip from New York City to US East, and back, in less than 11 ms. Your results may vary based on your internet connection and your distance from the AWS Region where you provisioned your Amazon VPC.

Launching a load balanced workload in the home cluster

With all the hardware and software setup prerequisites out of the way, you can launch a test workload in the cluster and verify that this all works. You can launch an Amazon ECS service into your home lab cluster using the new EXTERNAL launch type. Both your task definition and your Amazon ECS service must be created with the EXTERNAL launch type.

The following example shows a service called redis. Redis is a stateful service that relies on persisting information to disk. Stateful services are tricky because they have to run in the same place that their data has been stored. With ECS Anywhere, you can solve this using task placement constraints. Task placement constraints can be used to pin workloads to a specific device. In this case, the redis task has been pinned to a specific Raspberry Pi using a custom Amazon ECS instance attribute, redis=true, and a task placement constraint of type memberOf with the expression attribute:redis == true.
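As a rough illustration of how that pinning could be set up with the AWS CLI, the sketch below tags one container instance with the custom attribute and then creates the service with the matching constraint. The function name, cluster name, and task definition name are placeholders, and the task definition is assumed to have been registered already with EXTERNAL in its required compatibilities.

```shell
# Tag one container instance with redis=true, then create a service that
# may only be placed on instances carrying that attribute.
pin_redis_service() {
  local cluster="$1" container_instance_arn="$2" task_def="$3"
  # Custom attribute on the chosen Raspberry Pi's container instance.
  aws ecs put-attributes \
    --cluster "$cluster" \
    --attributes "name=redis,value=true,targetType=container-instance,targetId=$container_instance_arn"
  # EXTERNAL launch type service, constrained to the tagged instance.
  aws ecs create-service \
    --cluster "$cluster" \
    --service-name redis \
    --task-definition "$task_def" \
    --desired-count 1 \
    --launch-type EXTERNAL \
    --placement-constraints '[{"type":"memberOf","expression":"attribute:redis == true"}]'
}

# Example (ARN and names are placeholders):
#   pin_redis_service homelab arn:aws:ecs:us-east-1:111122223333:container-instance/homelab/abc redis:1
```

Because only one device carries the attribute, a desired count above 1 would simply leave the extra tasks unplaced, assuming the tasks cannot share the device.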

Once your tasks launch, you can get metrics and logs for them as if they were running on an Amazon EC2 instance inside your VPC.

If you are hosting a service that needs to receive incoming traffic from the internet, then you will likely want a load balancer. One benefit of hosting that load balancer in an AWS Region is that you don’t have to configure DNS to point at the address of your home network. Instead, the load balancer can serve your traffic using its own IP address, and your home network will be protected behind the VPN connection.

If you choose this configuration, then you must launch the NLB or ALB into the same VPC that you configured in your AWS Site-to-Site VPN. If the load balancer is inside that VPC, it can send traffic to the private IP addresses of your devices inside your home network via the VPN gateway. Currently, you need to add the private IP address and port combinations of your home lab devices to the load balancer manually. Fortunately, your devices on a home network will likely have static IP addresses, so this configuration should be stable.
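The manual registration step could be scripted along the following lines. This sketch assumes a target group of target type ip that already exists in the VPC. The function name, ARN, port, and addresses are placeholders. Because the home network addresses fall outside the VPC CIDR range, each target is registered with AvailabilityZone=all.

```shell
# Register home lab devices (private IP addresses reachable over the VPN)
# as targets of an existing IP-type target group.
register_home_targets() {
  local target_group_arn="$1" port="$2" ip
  shift 2
  for ip in "$@"; do
    # AvailabilityZone=all is required for IP targets outside the VPC.
    aws elbv2 register-targets \
      --target-group-arn "$target_group_arn" \
      --targets "Id=$ip,Port=$port,AvailabilityZone=all"
  done
}

# Example (values are placeholders):
#   register_home_targets <target-group-arn> 8080 192.168.1.101 192.168.1.102
```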

Once the load balancer has registered each target as healthy, you can send traffic to the load balancer’s DNS name. In this case, the response is a simple HTML page served by a small Node.js app, with a hit counter that is persisted in Redis.

Conclusion

With ECS Anywhere, you can orchestrate container experiments in your own home lab from the cloud. You don’t have to run the control plane on your own devices. Instead, you can use your devices purely as capacity for your applications. With ECS Anywhere, you can define the desired state of the software on your devices and leave the placement of tasks on hosts to the Amazon ECS control plane. Amazon ECS monitors your tasks and restarts them if necessary. Additionally, with Fleet Manager, you get the added ability to connect to and control your managed devices from anywhere that you have an internet connection, even if your devices are behind NAT, inside your private network.

There’s a lot more to ECS Anywhere, and we encourage you to check out the documentation and the launch blog. You can also join our live stream on Containers from the Couch in June, where we will walk through ECS Anywhere and take your questions.