Networking & Content Delivery
Configuring an Application Load Balancer on AWS Outposts
AWS Outposts brings AWS infrastructure and services to virtually any datacenter, co-location space, or on-premises facility, in the form of a physical rack connected to the AWS global network. AWS services run locally on the Outpost, and you can access the full range of AWS services available in your Region, including Application Load Balancer (ALB). However, configuring an ALB for Outposts is slightly different from creating an Application Load Balancer in an AWS Region. This post provides an overview of how to set up ALB for Outposts to scale and load balance resources. In addition, I will look at how to view events, such as scaling the ALB itself or the resources within its target group.
This blog assumes you are familiar with Outposts, including local gateway (LGW) functionality and customer-owned IP (Co-IP) address ranges. If you want to get more familiar with Outposts in general, then the user guide, What is AWS Outposts, is a great place to start.
Outposts are of particular interest to customers with very low latency use cases who, as a result, need to bring load balancing functionality on premises. One common use case is the need for low latency communication to web application servers. Outposts can provide these services on premises. The ALB adds the ability to load balance HTTP and HTTPS streams at low latency from an on-premises, scalable, and resilient environment. This is key for media or gaming use cases that are generating live video streams, or for a manufacturing company using web-based API operations to communicate with production line equipment, amongst others.
Providing application server resilience without ALB requires load balancers on premises, pointed at the customer-owned Elastic IP addresses of the application server instances. This means sizing those load balancers for peak utilization from the beginning, and creating complex scripts so that the on-premises load balancers can scale AWS Outposts resources.
With the release of the Application Load Balancer (ALB) on AWS Outposts, this function can be moved into the AWS environment. The ALB scales itself (based on available Outpost capacity) and is integrated with Auto Scaling groups to scale target instances. It also integrates with Route 53 to handle DNS resolution of the Co-IP addresses of the ALB.
While the Application Load Balancer can also be used to load balance Amazon ECS and EKS workloads, in this blog post we focus on EC2 instances as targets.
The aim of this post is to take you through the deployment of an Application Load Balancer within an AWS Outpost, and point that ALB towards a target group of web servers created by an Auto Scaling group. Traffic is generated from an on-premises environment, targeting the DNS name of the ALB, which load balances the traffic between instances in the target group. When the incoming traffic exceeds the capacity of the ALB as initially deployed, the ALB will scale itself. We are not showing the Auto Scaling group scale, since that is a standard function. More information on this can be found in our documentation, Elastic Load Balancing and Amazon EC2 Auto Scaling.
We also discuss considerations for sizing AWS Outposts, and requirements for the ALB. These are things we don’t normally think about when running in an AWS Region. However, within an Outpost, the capacity is bound by the resources within the rack (or racks). Finally, we consider the cost of the solution.
Differences from a Regional ALB
There are some key differences within AWS Outposts that must be considered when deploying an ALB. These fall into six areas:
- ALB resources – The ALB must consume EC2 instance resources within the AWS Outposts. It can be deployed on c5, c5d, r5, r5d, m5, and m5d instances, and requires two of them (for resilience). It initially consumes two IP addresses from within the Outpost’s subnet within the VPC. Since it is load balancing traffic from on premises, it also requires two Elastic IP addresses from the Co-IP pool.
- ALB scaling – The ALB requires two additional instances of the next size up from the current ALB resources in order to scale. These must be available concurrently with the existing instances. The ALB also needs another two Co-IP addresses from the Elastic IP address pool. All of these resources are in use at the same time for a period, until scaling completes and the original resources are released.
For example, if the ALB deploys on c5.large instances initially, then there must be c5.xlarge instances available in order for it to scale itself up. This scaling continues up to c5.4xlarge; beyond this point it cannot scale up further, so it scales out. It is important to note that whatever instance type is first used, that is the family it continues to use as it scales. ALB always chooses resources in a specific order: c5 instances are used first, then, if no c5 instances are available, other supported instance families are used. You cannot steer the ALB to use m5 if you have c5 instances available. This is important to remember when sizing the Outpost.
- ALB failures – If the ALB tries to carry out a scaling event and sufficient resources are not available within the Outpost, then it fails. The ALB reports its inability to scale using the Personal Health Dashboard, and continues load balancing with its current resources.
- Web server resources and scaling – As scaling metrics are reached when using an Auto Scaling group, it scales out the number of instances and adds them to the target group of the ALB. However, if there is not sufficient capacity left within the Outpost, the Auto Scaling group fails to scale the backend. If a web server fails, the failed instance drops out of the Auto Scaling group and is replaced with a new one. This continues until the target group size matches the desired capacity of the Auto Scaling group.
- ALB targets – An ALB forwards traffic to targets on the Outpost where it is deployed (using instance names), and to IP addresses of on-premises targets. These are contained in an “instance target group.” However, it does not forward traffic to targets in the Region, even when using IP addresses. The ALB is designed to load balance in AWS Outposts and on premises. If Regional load balancing is required, an ALB should be created in the Region rather than on the Outpost.
- DNS resolution – When the ALB faces on-premises resources, DNS resolution provides the Co-IP addresses of the ALB instead of a public address. The ALB name is still globally resolvable (as with a Regional external ALB), either using the public Route 53 endpoints or via Route 53 Resolver endpoints within a VPC. On-premises services that must resolve this address can do so either way; the same response is given in each case: the Co-IP addresses of the ALB. It is important to use the ALB name, rather than specific IP addresses, as the addresses may change because of scaling or failure.
General considerations when planning AWS Outposts capacity
One key difference with AWS Outposts is that they have a finite amount of defined resources. Once those resources are consumed, any attempts to launch additional resources are met with an “insufficient capacity error.” Good planning for AWS Outposts means not using 100% of the capacity available so that there is spare capacity if there is a hardware failure. This is no different from standard on-premises planning for peak, rather than average, utilization and is usually referred to as spare, or “buffer capacity.”
With AWS Outposts, there is good reason to size a web farm for peak capacity, since the resources are already available. In that case, the ALB is not providing any scaling capability of the backend farm. However, the use of load balancing and Auto Scaling groups means that the ALB automatically restores peak capacity if an instance or hardware failure occurs. Even in this scenario, the ALB still scales itself if the resources are available.
When planning for the size of AWS Outposts needed, ALB resources must be added to the overall mix of resources, so enough capacity is available to cover target group instances and the ALB.
In addition, ALB must be considered when defining a Co-IP pool size. These pools can be anything between a /26 and a /16 CIDR range (approximately 60 to 65,000 usable addresses). If extensive use of ALB is going to be required, then at least four Co-IP addresses must be available to each ALB deployed. This is true for both steady-state and scaling activities. (The actual number could be higher if the ALB goes through two stages of scaling before releasing the smallest instances back to the pool.) Thus, it is important to have spare capacity in the Co-IP pool, as load balancer scaling fails if it is unable to assign a Co-IP address. Address space also must be considered for the choice of VPC subnet, although this is usually more flexible to assign.
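The pool-sizing arithmetic above can be sketched with Python’s standard `ipaddress` module. This is a rough planning aid, not an AWS API call; the CIDR ranges and the reserved-address count are hypothetical examples.

```python
import ipaddress

# A Co-IP pool can be anywhere from a /26 to a /16 CIDR range.
# Hypothetical example ranges:
smallest_pool = ipaddress.ip_network("10.1.0.0/26")
largest_pool = ipaddress.ip_network("10.0.0.0/16")

print(smallest_pool.num_addresses)  # 64 addresses in a /26
print(largest_pool.num_addresses)   # 65536 addresses in a /16

# Each ALB needs at least four Co-IPs: two at steady state, plus two
# more held concurrently during a scaling event.
ADDRESSES_PER_ALB = 4

def max_albs(pool: ipaddress.IPv4Network, reserved: int = 0) -> int:
    """Rough upper bound on concurrent ALBs a Co-IP pool can support,
    after subtracting addresses reserved for other workloads."""
    return (pool.num_addresses - reserved) // ADDRESSES_PER_ALB

print(max_albs(smallest_pool))  # 16
```

In practice the bound should be lower still, since the post notes the ALB may briefly hold more than four addresses across two scaling stages, and instances with their own Co-IPs draw from the same pool.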
Within this environment, there is an ALB deployed on a pair of r5.large instances, within the AWS Outposts subnet. These instances were deployed as the ALB was configured; since there were no m5.large or c5.large instances available, the r5 family was used.
Each ALB instance has a Co-IP mapped to it, and Route 53 resolves these for the on-premises environment. The Co-IPs were assigned at creation time by choosing an ALB with external IP addresses, then choosing the Co-IP pool as the resource that supplies the addresses. These ALBs forward traffic to a farm of two web servers (in this case, Amazon Linux 2 instances running NGINX as a web server target) within a target group, configured by an Auto Scaling group. This is set to scale between two and eight instances with a desired value of two, and with its scaling metric set to RequestCountPerTarget. The ALBs scale as the traffic increases, based on a dynamic algorithm that takes the number and size of requests into account.
Traffic is generated from an on-premises environment, connecting to the AWS Outposts over the LGW. The traffic generators in our case are using wrk2, an open source HTTP traffic generator available on GitHub.
In the configuration process that follows, I have highlighted the steps that specifically relate to the ALB on Outposts. I do not go into the detail on how to configure the target groups, the Auto Scaling group, or launch templates. These are covered in the general configuration of an ALB, and they are no different when working with AWS Outposts.
The following diagram shows the architecture:
If setting up an Application Load Balancer with Auto Scaling groups is new to you, then you might want to try this in Region first to get used to the process. Once you have successfully managed that, then you can proceed with the configuration of an ALB on AWS Outposts. There is a good tutorial on automatic scaling in the ALB, Set up a scaled and load-balanced application, available in our documentation.
The order you start configuration is very important. Components must be set up in the following order:
- Create the target group. This is used by both the ALB and the Auto Scaling group.
- Create the ALB and point it towards the target group.
- Create the launch template. This tells the Auto Scaling group what to do when it launches an instance.
- Create the Auto Scaling group, and associate it with the ALB's target group and the launch template it uses.
1 – creating the target group
This is a standard target group, but make sure the VPC you select has a subnet in your Outpost.
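If you prefer the AWS CLI to the console, the target group can be created with `aws elbv2 create-target-group`. This is a minimal sketch; the name and VPC ID are hypothetical placeholders, and the VPC must contain a subnet on your Outpost.

```shell
# Hypothetical name and VPC ID; substitute your own values.
# The VPC must have a subnet that sits on the Outpost.
aws elbv2 create-target-group \
  --name outpost-web-tg \
  --protocol HTTP \
  --port 80 \
  --target-type instance \
  --vpc-id vpc-0123456789abcdef0
```

Note the ARN returned in the output; it is needed when creating the listener and the Auto Scaling group.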
2 – creating the ALB
Once the target group exists, then configure an Application Load Balancer. This is done in exactly the same way as the configuration in Region. For the ALB to be accessible from on-premises, the type must be “internet-facing.” At that point, you can select an IP pool owned by the customer. Once you have assigned a Co-IP pool, then you are only able to deploy the ALB to subnets within the AWS Outposts that are associated with the local gateway (LGW).
It should be noted that while the type of ALB selected is ‘internet-facing’, it doesn’t actually have any external public connection. This is just a way of being able to select the pool of Elastic IP addresses to use. In the case of AWS Outposts, this is the Co-IP pool, which is most likely a private range.
Having previously created the target group, you can point the ALB to it, creating the list of instances that are load balanced. However, at this point, there are no instances in the target group. They appear once the Auto Scaling group is created.
Once the ALB has been created, you can find its DNS name in its description. This name is globally resolvable, and is the target name that on-premises clients are pointed to.
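The same steps can be sketched with the CLI. The `--customer-owned-ipv4-pool` parameter is what ties the internet-facing ALB to your Co-IP pool; all IDs below are hypothetical placeholders.

```shell
# The subnet must be on the Outpost and associated with the LGW;
# the pool ID is your customer-owned IPv4 (Co-IP) pool.
aws elbv2 create-load-balancer \
  --name outpost-alb \
  --type application \
  --scheme internet-facing \
  --subnets subnet-0123456789abcdef0 \
  --customer-owned-ipv4-pool ipv4pool-coip-0123456789abcdef0

# Attach a listener that forwards to the target group from step 1:
aws elbv2 create-listener \
  --load-balancer-arn <alb-arn-from-previous-output> \
  --protocol HTTP --port 80 \
  --default-actions Type=forward,TargetGroupArn=<target-group-arn>

# Retrieve the globally resolvable DNS name of the ALB:
aws elbv2 describe-load-balancers \
  --names outpost-alb \
  --query 'LoadBalancers[0].DNSName' --output text
```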
3 – creating the launch template
Before you create the Auto Scaling group, you must create a launch template to describe the instance types and configuration the Auto Scaling group uses as it launches instances. This is done in the same way as within the Region.
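As a sketch, a launch template for the NGINX web servers might look like the following. The AMI ID is a hypothetical placeholder, the instance type must exist on your Outpost, and the user data would be your own base64-encoded bootstrap script.

```shell
# Hypothetical AMI ID; the instance type must be available on the Outpost.
aws ec2 create-launch-template \
  --launch-template-name outpost-web-lt \
  --launch-template-data '{
    "ImageId": "ami-0123456789abcdef0",
    "InstanceType": "c5.large",
    "UserData": "<base64-encoded script that installs and starts NGINX>"
  }'
```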
4 – creating the Auto Scaling group
Once the other three items are created, then it is possible to configure the Auto Scaling group.
The Auto Scaling group should target all its instances as On-Demand Instances. Remember, when choosing your primary instance type, it must be a type that exists on your AWS Outposts. You can also use this list to control what instance types the Auto Scaling group can create, limiting the possibility of it conflicting with other resource requirements on the Outposts.
Then select the VPC and AWS Outposts subnet only as a target. Load balancing should be enabled, and pointed to the target group you created in step 1.
Now set the required group size, and create a scaling policy of type ‘target tracking’ that allows the Auto Scaling group to calculate scaling as a function of ALB request count. In this case, we set the count to 650,000 requests. In addition, make sure that the instances have time to warm up before being added to the Auto Scaling group.
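The Auto Scaling group and its target-tracking policy can be sketched with the CLI as follows. Names, ARNs, and the resource label are hypothetical placeholders; the target value mirrors the request count used in this walkthrough.

```shell
# Hypothetical names/ARNs; the target group ARN comes from step 1.
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name outpost-web-asg \
  --launch-template LaunchTemplateName=outpost-web-lt \
  --min-size 2 --max-size 8 --desired-capacity 2 \
  --vpc-zone-identifier subnet-0123456789abcdef0 \
  --target-group-arns <target-group-arn> \
  --health-check-type ELB \
  --health-check-grace-period 300

# Target-tracking policy on ALB request count per target. The resource
# label format is app/<alb-name>/<alb-id>/targetgroup/<tg-name>/<tg-id>.
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name outpost-web-asg \
  --policy-name request-count-tracking \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration '{
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ALBRequestCountPerTarget",
      "ResourceLabel": "<resource-label>"
    },
    "TargetValue": 650000
  }'
```

The `--health-check-grace-period` value gives instances time to warm up before health checks count against them, matching the advice above.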
Checking the configuration
Once all this is complete, the ALB should launch and then use the Auto Scaling group to launch backend instances from the launch template description. In this case, because we chose a desired capacity of two, there should be two backend web servers launched into the AWS Outposts. The screenshots that follow show the Auto Scaling group configuration, the instances launched by the Auto Scaling group, and the ALB target group. If you check, the instances launched by the ALB should have the same ID as those within the target group.
Auto Scaling group configuration
Instances in Auto Scaling group
Instances in the Target Group
Checking the name resolution
From an on-premises Linux server, I can now check which addresses are resolved for the ALB. I send the request using the DNS name from the ALB configuration, and I get two results: the two Co-IPs that have been mapped to the ALB instances.
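A minimal way to run this check, using a hypothetical ALB DNS name (substitute the name from your own ALB description):

```shell
# Resolve the ALB name; expect two Co-IP addresses from your pool.
dig +short outpost-alb-0123456789.eu-west-1.elb.amazonaws.com

# Fetch a page through the ALB; one of the backend NGINX hosts responds.
curl http://outpost-alb-0123456789.eu-west-1.elb.amazonaws.com/
```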
If I try to access the web server from that address, I get a response from one of the backend NGINX hosts that are in the Auto Scaling group.
Testing the scalability of the ALB
As mentioned earlier, the ALB can automatically scale itself. We ran tests in order to see that happen. We used wrk2 on some on-premises traffic generators pointed towards the DNS name of the ALB. We ran multiple parallel processes on the traffic generator, so we could see if the traffic was being load balanced equally between the backend NGINX web servers.
As we increased the traffic load, the ALB scaled, and we noted that the addresses the ALB DNS name resolved to had changed. This was because the ALB scaled up from r5.large to r5.xlarge instances. As you can see, the resolved addresses in response to a dig request have changed.
However, the response to the web request is the same, because it is the backend servers that are responding, not the ALB.
Since the ALB is owned by a service account, you can’t actually see the instances within the console, but you are able to see the ENIs, just as in Region. There are four ENIs here because this was after a scaling event: two are associated with the r5.large instances and two with the r5.xlarge instances.
However, since this is an Outpost, you can get a view of the instances by looking at the utilization of the total number of instances within the Outpost. In this case, we can see that before the start of our test, no r5.large instances were being used (blue line). There was 25% of available r5.xlarge resource already in use, but that was from a different user. This level of visibility may not be pertinent in a large Outposts deployment, where it may be sufficient to track the occurrence of the event in CloudWatch. It is worth pointing out, though, so that when you are initially testing the ALB you can see the impact of it scaling.
At the start of the test, at approximately 10:50, an ALB was created, taking 25% of the available resource. Then, at approximately 11:50, a scaling event took place, where a further 25% of the available r5.xlarge resource was used by the ALB scaling up. After approximately one hour, the ALB decided to keep its scale at r5.xlarge, and released the r5.large resource back into the user pool.
To see the traffic that caused the scaling event, we can use CloudWatch to review the request counts in the target group. At approx. 11:50, the total request count topped 1 million requests, and that is likely to have caused the scaling event. This level of requests occurs intermittently for the next hour, so the ALB decides to keep itself on r5.xlarge instances, and release the smaller instance size.
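The same request counts can be pulled from the CLI with `aws cloudwatch get-metric-statistics`. The dimension values and the time window below are hypothetical; use `describe-target-groups` and `describe-load-balancers` to find your own resource suffixes.

```shell
# Sum of requests per 5-minute period for the target group.
aws cloudwatch get-metric-statistics \
  --namespace AWS/ApplicationELB \
  --metric-name RequestCount \
  --dimensions Name=TargetGroup,Value=targetgroup/outpost-web-tg/0123456789abcdef \
               Name=LoadBalancer,Value=app/outpost-alb/0123456789abcdef \
  --statistics Sum \
  --period 300 \
  --start-time 2020-01-01T10:00:00Z \
  --end-time 2020-01-01T13:00:00Z
```

Swapping the metric name for `RequestCountPerTarget` (with the Sum statistic scoped to the target group) shows the per-target figure discussed next.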
It’s also possible to see that the requests per target are half of the total requests, matching our expectations, since there are two instances in the target group.
The ALB scales from a large instance type all the way up to a 4xlarge instance within a family, as long as that resource is available. However, given that this is an Outpost, it has defined capacity. It may be that there are no instances of the next size up available to scale. If that is the case, then an event is logged in the Personal Health Dashboard, so that you can see the point at which the scaling stopped. It is important to remember that the instance family first chosen (m5, c5, or r5) is the family in which the load balancer scales. That means that if it deploys in an m5.large instance, then it scales up the m5 family, through m5.xlarge, m5.2xlarge, and m5.4xlarge. If any of those instance types are not available, then it stops scaling up; it does not jump to a different instance family.
An example of such an event can be seen in the following screenshot:
And the resources tab shows the affected ALB:
Costs
Costs related to implementing ALB are usually split into two areas:
- the ALB service itself, and
- the instances on which it runs.
In a Region, these are priced as a per-hour charge for the ALB service, plus a load balancer capacity unit (LCU) charge that effectively covers the cost of the resource on which that ALB service is running.
In AWS Outposts, since all instances are purchased as part of the AWS Outposts service, there is only an ALB per-hour charge for the service.
In addition, the backend web servers (in this case, NGINX) are sitting on resource in the AWS Outposts that is already purchased as part of the AWS Outposts service. In our case, because we used open source software to act as a web server, that means there is no additional cost for the instances (since they are covered by the AWS Outposts charges). However, if you use an AWS Marketplace or third-party web server with an associated licensing cost, then you would still need to pay for this; only the instance resource is already covered.
Conclusion
As you can see, ALB on AWS Outposts follows the same pattern and function as ALB in the Region, and as new features are added to ALB, they automatically become available on AWS Outposts. You can check which features are not available in the AWS Outposts ALB at this link. The main focus of the ALB is to provide resilient, scalable, and low latency connections between on-premises devices and the AWS Outposts, and to remove the need to provide load balancing outside of the AWS environment. This in turn means it is possible to more tightly integrate the target groups and respond to throughput and performance requirements.
The ability of the ALB to load balance to targets on premises means it can be used in two ways. It can provide scalability and resilience to AWS workloads, and also allow resilience of on-premises workloads. This can all be done without needing to build physical load balancers in the customer environment.