Hostname-as-Target for Network Load Balancers
Introduction:
Network Load Balancer (NLB) is the flagship Layer 4 load balancer for AWS, offering elastic capacity, high performance, and integration with AWS services such as AWS Auto Scaling. NLB is designed to handle millions of requests per second while maintaining ultra-low latency, improving both availability and scalability. Network Load Balancers are used by all types of applications, from low-latency media streaming to high-scale enterprise data services, in software architectures ranging from traditional virtual machines to containerized environments.
Challenges:
Network Load Balancer distributes incoming traffic across backend resources called “targets”. Targets are grouped in Target Groups, a logical construct that represents your backend capacity for your application.
NLB supports Target Groups made up of instances or IP addresses. In some scenarios, however, the backend is best identified by a fully qualified domain name (FQDN) that represents a group of instances rather than an individual target. Consider connecting to an Amazon Relational Database Service (Amazon RDS) database cluster through a load balancer. You add the cluster as a target by registering the IP address of the primary instance of the cluster. This works, but after a failover event you must manually update the target group with the IP address of the newly active primary instance.
In this blog, I demonstrate how to use AWS Lambda (Lambda) to create a domain name system (DNS) hostname-controlled target for a Network Load Balancer.
Solution overview:
This solution is based on an AWS Lambda function that periodically resolves the target FQDN, registers newly resolved IP addresses as targets in a target group, and deregisters IP addresses that are no longer returned. The Lambda function uses an Amazon S3 bucket as an IP address repository. An Amazon EventBridge (formerly Amazon CloudWatch Events) rule periodically triggers the Lambda function.
Figure 1: Flow
High-level workflow:
- An Amazon EventBridge rule invokes the Lambda function every five minutes. The invocation interval is user-configurable.
- The Lambda function resolves the target FQDN using the selected DNS server.
- The Lambda function retrieves the dataset of tracked IP addresses from the designated S3 bucket.
- The Lambda function registers the missing IP addresses to the target group and deregisters expired IP addresses from the target group.
- The Lambda function stores the updated tracked IP address database in the S3 bucket.
- The Lambda function logs its operation to the CloudWatch log stream, and updates a dedicated CloudWatch metric that tracks the number of registered IP addresses for the hostname.
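The register/deregister decision in the workflow above can be sketched as a pure function. This is a minimal illustration, not the code shipped in the zip file: the function name `reconcile`, the shape of the tracked state (IP address mapped to a count of missed invocations), and the `max_misses` parameter are assumptions made for this example.

```python
from typing import Dict, Iterable, List, Tuple


def reconcile(
    resolved_ips: Iterable[str],
    tracked: Dict[str, int],
    registered_ips: Iterable[str],
    max_misses: int = 3,
) -> Tuple[List[str], List[str], Dict[str, int]]:
    """Return (to_register, to_deregister, new_tracked).

    `tracked` maps each known IP address to the number of invocations
    in a row in which it was missing from the DNS answer (hypothetical
    state layout, stored in S3 between invocations).
    """
    resolved = set(resolved_ips)
    registered = set(registered_ips)

    # Every address seen in this invocation has its miss counter reset.
    new_tracked: Dict[str, int] = {ip: 0 for ip in resolved}
    to_deregister: List[str] = []

    for ip, misses in tracked.items():
        if ip in resolved:
            continue
        misses += 1
        if misses >= max_misses:
            # Expired: drop it from tracking and deregister if present.
            if ip in registered:
                to_deregister.append(ip)
        else:
            # Still within the grace period: keep tracking it.
            new_tracked[ip] = misses

    # Anything resolved but not yet in the target group is missing.
    to_register = sorted(resolved - registered)
    return to_register, sorted(to_deregister), new_tracked
```

Keeping this step free of AWS calls makes it easy to unit test; the surrounding code would then pass the two lists to the register and deregister APIs and write `new_tracked` back to the bucket.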
Note:
- This solution uses a Lambda function, Amazon S3, and Amazon CloudWatch. Note that using these services may result in additional costs.
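- This solution resolves A records only. You can modify the Lambda function to accept other record types as required.
- IP addresses returned by the query must fall within an address block that the target group allows for IP targets: the subnets of the target group VPC, or the private ranges 10.0.0.0/8, 100.64.0.0/10, 172.16.0.0/12, and 192.168.0.0/16. Publicly routable IP addresses cannot be registered.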
- While we have tested this solution and believe it works well, as always, be sure to test it in your environment before using it in production!
- While this blog focuses on Network Load Balancer, this solution can be easily extended to other types of Elastic Load Balancers.
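The address-block note above can be checked before registration. The following sketch uses only the Python standard library; the helper name `filter_allowed` and the default list of blocks are assumptions for illustration, so adjust the blocks to match your VPC and the current Elastic Load Balancing documentation.

```python
import ipaddress
from typing import Iterable, List, Sequence

# Assumption: private ranges commonly accepted for IP-type targets.
# Replace or extend with your VPC subnets as needed.
DEFAULT_ALLOWED_BLOCKS = (
    "10.0.0.0/8",
    "100.64.0.0/10",
    "172.16.0.0/12",
    "192.168.0.0/16",
)


def filter_allowed(
    ips: Iterable[str],
    allowed_blocks: Sequence[str] = DEFAULT_ALLOWED_BLOCKS,
) -> List[str]:
    """Keep only the addresses that fall inside one of the allowed blocks."""
    networks = [ipaddress.ip_network(block) for block in allowed_blocks]
    allowed = []
    for ip in ips:
        address = ipaddress.ip_address(ip)
        if any(address in network for network in networks):
            allowed.append(ip)
    return allowed
```

Filtering up front lets the function log rejected addresses explicitly instead of failing on the registration call.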
Deploying the solution:
Prerequisites:
Before you start, make sure that you have an IP type target group associated with a Network Load Balancer.
Walk through:
Step 1: Create an IAM Role
Using the AWS Identity and Access Management (IAM) console, create an IAM role with permissions to run the Lambda function:
Figure 2: IAM Role
Assign the role permissions to manage the relevant load balancer, S3 bucket, and CloudWatch objects used by the Lambda function. You can use the following sample IAM policy. Replace <YOUR-BUCKET-NAME> with the name of your S3 bucket:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*",
      "Effect": "Allow"
    },
    {
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:CreateBucket",
        "s3:DeleteBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<YOUR-BUCKET-NAME>/*",
        "arn:aws:s3:::<YOUR-BUCKET-NAME>"
      ],
      "Effect": "Allow"
    },
    {
      "Action": [
        "s3:ListAllMyBuckets",
        "cloudwatch:PutMetricData",
        "elasticloadbalancing:RegisterTargets",
        "elasticloadbalancing:DeregisterTargets",
        "elasticloadbalancing:DescribeTargetHealth",
        "ec2:CreateNetworkInterface",
        "ec2:DescribeNetworkInterfaces",
        "ec2:DeleteNetworkInterface"
      ],
      "Resource": "*",
      "Effect": "Allow"
    }
  ]
}
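If you prefer to generate the policy document rather than edit it by hand, a small helper can fill in the bucket name. This is only a sketch that mirrors the sample policy above; the function name `build_policy` and the example bucket name are hypothetical.

```python
import json


def build_policy(bucket_name: str) -> dict:
    """Return the sample IAM policy with the bucket placeholder filled in."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                ],
                "Resource": "arn:aws:logs:*:*:*",
            },
            {
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:GetObject",
                    "s3:DeleteObject",
                    "s3:CreateBucket",
                    "s3:DeleteBucket",
                ],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}/*",
                    f"arn:aws:s3:::{bucket_name}",
                ],
            },
            {
                "Effect": "Allow",
                "Action": [
                    "s3:ListAllMyBuckets",
                    "cloudwatch:PutMetricData",
                    "elasticloadbalancing:RegisterTargets",
                    "elasticloadbalancing:DeregisterTargets",
                    "elasticloadbalancing:DescribeTargetHealth",
                    "ec2:CreateNetworkInterface",
                    "ec2:DescribeNetworkInterfaces",
                    "ec2:DeleteNetworkInterface",
                ],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    # "my-example-bucket" is a placeholder, not a real bucket.
    print(json.dumps(build_policy("my-example-bucket"), indent=2))
```

The printed JSON can be pasted into the IAM console policy editor.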
Step 2: Create a Lambda function
Using the AWS Lambda console, create the Lambda function. Configure the function to use the IAM role created in the previous step and set the runtime environment to Python 3.7.
Figure 3: Create Lambda function.
Step 3: Trigger Lambda invocation
In the Lambda designer panel, click “Add Trigger” and create a new “EventBridge (CloudWatch Events)” trigger. The rule runs the Lambda function periodically. We recommend that you run the Lambda function at least every 5 minutes to quickly identify changes to the DNS record. Do not enable the trigger yet.
Figure 4: Add Trigger
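For readers who script this step instead of using the console, the schedule can be expressed as an EventBridge rate expression and the rule created in a disabled state. This sketch assumes boto3 is available and that credentials are configured; the helper names and the rule name passed in are hypothetical, and attaching the Lambda function as a target (plus the invoke permission) is left out.

```python
def rate_expression(minutes: int) -> str:
    """Build an EventBridge rate expression such as 'rate(5 minutes)'."""
    if minutes < 1:
        raise ValueError("minutes must be a positive integer")
    # EventBridge requires the singular unit when the value is 1.
    unit = "minute" if minutes == 1 else "minutes"
    return f"rate({minutes} {unit})"


def create_disabled_rule(rule_name: str, minutes: int = 5) -> str:
    """Create (or update) a scheduled rule that stays disabled for now."""
    # Imported lazily so rate_expression works without boto3 installed.
    import boto3

    events = boto3.client("events")
    response = events.put_rule(
        Name=rule_name,
        ScheduleExpression=rate_expression(minutes),
        State="DISABLED",  # enabled later, once the function is configured
    )
    return response["RuleArn"]
```

Creating the rule disabled matches the walkthrough: the function should not run until its code and variables are in place.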
Step 4: Configure the Lambda function code
In the Lambda designer panel, select the Lambda function. Use the Action drop-down menu in the function code panel to upload the Lambda function zip file. You can review the Lambda function here and you can download ElbHostnameAsTarget.zip from here.
Step 5: Configure the Lambda function variables
After creating the Lambda function, configure its parameters and environment variables as follows:
Mandatory variables:
- TARGET_FQDN: The full DNS name (FQDN) of the target you are registering.
- ELB_TG_ARN: The Amazon Resource Name (ARN) of the target group.
- S3_BUCKET: The name of the S3 bucket that is used for tracking IP address changes between Lambda invocations. If the bucket doesn’t exist, the function creates it.
- DNS_SERVER: The IP address of the DNS server that receives the query. You can provide the IP addresses of custom DNS servers. If you intend to use the Amazon-provided DNS as the domain name server, the Lambda function must be connected to a VPC.
Optional variables:
- MAX_LOOKUP_PER_INVOCATION: A single DNS lookup query usually returns up to eight IP addresses. If an FQDN has more than eight IP addresses, the Lambda function uses multiple DNS queries to retrieve the addresses. The higher the value, the more likely you will have all the addresses. We recommend tuning this value if IP addresses are missing from your target group. The default value is 10.
- INVOCATIONS_BEFORE_DEREGISTRATION: This attribute controls the deregistration threshold. If an IP address is missing from N invocations (as defined by this attribute), it is deregistered. Note that the NLB health checks detect failed IP addresses, so delayed deregistration doesn’t pose a risk to traffic in most cases. The default value is 3.
- REPORT_IP_COUNT_CW_METRIC: Enable/disable the CloudWatch metric for the IP address count. The default value is “true.”
- REMOVE_UNTRACKED_TG_IP: Instructs the Lambda function to track/clean up all target group IP addresses. Enable this flag only if the target group consists of a single hostname. The default value is “false.”
Figure 5: Lambda function environment variables
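The variables listed above, with their documented defaults, could be read inside the function roughly as follows. This is a sketch rather than the code in the zip file: the `Config` dataclass and `load_config` helper are names invented for this example, and the parsing of the boolean flags as the literal string "true" is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping

MANDATORY = ("TARGET_FQDN", "ELB_TG_ARN", "S3_BUCKET", "DNS_SERVER")


@dataclass(frozen=True)
class Config:
    target_fqdn: str
    elb_tg_arn: str
    s3_bucket: str
    dns_server: str
    max_lookup_per_invocation: int = 10
    invocations_before_deregistration: int = 3
    report_ip_count_cw_metric: bool = True
    remove_untracked_tg_ip: bool = False


def _as_bool(value: str) -> bool:
    # Assumption: flags are written as "true"/"false", case-insensitive.
    return value.strip().lower() == "true"


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Read the mandatory and optional variables, applying the defaults."""
    missing = [name for name in MANDATORY if not env.get(name)]
    if missing:
        raise ValueError("missing mandatory variables: " + ", ".join(missing))
    return Config(
        target_fqdn=env["TARGET_FQDN"],
        elb_tg_arn=env["ELB_TG_ARN"],
        s3_bucket=env["S3_BUCKET"],
        dns_server=env["DNS_SERVER"],
        max_lookup_per_invocation=int(env.get("MAX_LOOKUP_PER_INVOCATION", "10")),
        invocations_before_deregistration=int(
            env.get("INVOCATIONS_BEFORE_DEREGISTRATION", "3")
        ),
        report_ip_count_cw_metric=_as_bool(env.get("REPORT_IP_COUNT_CW_METRIC", "true")),
        remove_untracked_tg_ip=_as_bool(env.get("REMOVE_UNTRACKED_TG_IP", "false")),
    )
```

Failing fast on a missing mandatory variable surfaces configuration mistakes in the first invocation log rather than as a confusing DNS or API error later.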
Step 6: Configure the Lambda settings
In the “Basic settings” panel, set the Lambda timeout to 45 seconds to allow it enough time to query DNS. Change the handler name to “elb_hostname_as_target.lambda_handler” to map the Lambda function to the Python file that contains the function code. For more information about how to configure Lambda functions, see the Configuring functions in the AWS Lambda console documentation.
Figure 6: Lambda Function Basic Settings
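The timeout has to cover every DNS query made in one invocation, which can be up to MAX_LOOKUP_PER_INVOCATION queries. The loop below is a sketch of how repeated lookups might be merged into one set of addresses. The names `collect_ips` and `dnspython_lookup` are hypothetical, and the use of the third-party dnspython package (version 2.x) is an assumption; the resolver used by the actual function may differ.

```python
from typing import Callable, List, Set


def collect_ips(
    fqdn: str,
    lookup: Callable[[str], List[str]],
    max_lookups: int = 10,
) -> Set[str]:
    """Query the name `max_lookups` times and return the union of answers."""
    seen: Set[str] = set()
    for _ in range(max_lookups):
        # Each answer may contain only a subset of the addresses.
        seen.update(lookup(fqdn))
    return seen


def dnspython_lookup(dns_server: str, timeout_seconds: float = 3.0):
    """Return a lookup callable bound to one DNS server (dnspython 2.x)."""
    # Imported lazily so collect_ips can be used and tested without dnspython.
    import dns.resolver

    resolver = dns.resolver.Resolver(configure=False)
    resolver.nameservers = [dns_server]
    resolver.lifetime = timeout_seconds  # total time allowed per query

    def lookup(fqdn: str) -> List[str]:
        return [record.address for record in resolver.resolve(fqdn, "A")]

    return lookup
```

With a per-query lifetime of 3 seconds and 10 lookups, the worst case is about 30 seconds of DNS time, which fits inside the 45-second timeout; raise the timeout if you raise either value.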
Step 7: Enable the trigger
The Lambda function is now ready to go. Fire up the solution by enabling the EventBridge rule.
Figure 7: Enable Trigger for Lambda function
Step 8: Verification
Use the following steps to verify that the hostname IP addresses are added to the target group:
- Using the Elastic Load Balancing DescribeTargetHealth API (elbv2), verify that the hostname IP addresses have been added to the target group and are healthy.
- Test access to your target using the FQDN or the IP address of the load balancer. For example, if the load balancer listens on port 80, run the following command: telnet myNLB-DNS.elb.us-east-1.amazonaws.com 80
- Verify that the Lambda function CloudWatch metric reports the same number of IP addresses as the hostname DNS record.
- Add/remove an IP address from the target hostname and verify that the change is propagated to the target group.
- Confirm that the hostname IP addresses are listed in the CloudWatch log.
Figure 8: Target Group
Figure 9: CloudWatch Network ELB HostnameAsTargetIPCount
Figure 10: CloudWatch Log Group for Lambda Function
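The first verification step can be scripted. The sketch below groups the targets returned by DescribeTargetHealth by health state; it assumes boto3 and configured credentials for the live call, and the helper names `summarize_health` and `fetch_health` are made up for this example.

```python
from typing import Dict, List


def summarize_health(descriptions: List[dict]) -> Dict[str, List[str]]:
    """Group target IDs (IP addresses) by their reported health state."""
    by_state: Dict[str, List[str]] = {}
    for description in descriptions:
        state = description["TargetHealth"]["State"]
        by_state.setdefault(state, []).append(description["Target"]["Id"])
    return by_state


def fetch_health(target_group_arn: str) -> Dict[str, List[str]]:
    """Call DescribeTargetHealth and summarize the response."""
    # Imported lazily so summarize_health works without boto3 installed.
    import boto3

    elbv2 = boto3.client("elbv2")
    response = elbv2.describe_target_health(TargetGroupArn=target_group_arn)
    return summarize_health(response["TargetHealthDescriptions"])
```

Comparing the keys of the result against the addresses returned by a DNS query for the hostname gives a quick check that registration has caught up.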
Monitoring Lambda function [Optional]:
Lambda automatically monitors Lambda functions on your behalf and reports metrics through Amazon CloudWatch. When the Lambda function finishes processing an event, Lambda sends metrics about the invocation to Amazon CloudWatch. You can build graphs and dashboards with these metrics in the CloudWatch console, and you can use them to add watchdogs that track the Lambda invocations.
You can set CloudWatch alarms to watch CloudWatch metrics and to receive notifications when the metrics fall outside of the levels (high or low thresholds) that you configure. For more information, see Using Amazon CloudWatch alarms in the Amazon CloudWatch User Guide.
To detect anomalies in the Lambda function and send notifications about them, I created the following sample alarms on invocation metrics. Note that using Amazon CloudWatch alarms and Amazon Simple Notification Service (Amazon SNS) may result in additional cost.
- elb-hostname-as-target-errors for “Errors”: Monitors function errors, which include exceptions thrown by the code and exceptions thrown by the Lambda runtime. The runtime returns errors for issues such as timeouts and configuration errors. This alarm is part of the composite alarm configured below. An alarm is raised if data is missing or if “Errors” is greater than or equal to the threshold value.
Figure 11: Metric Alarm – elb-hostname-as-target-errors
- elb-hostname-as-target-invocations for “Invocations”: Monitors the number of times your function code is executed, including successful executions and executions that result in a function error. This alarm is part of the composite alarm configured below. An alarm is raised if data is missing or if “Invocations” is less than the threshold value.
Figure 12: Metric Alarm – elb-hostname-as-target-invocations
- elb-hostname-as-target-monitor: A composite alarm that combines the elb-hostname-as-target-errors and elb-hostname-as-target-invocations alarms. Its action is configured to send a notification to an Amazon Simple Notification Service (Amazon SNS) topic when either of the two metric alarms is triggered.
Figure 13: Create Composite Alarm
Figure 14: Configure Action for Composite Alarm
Figure 15: Composite Alarm
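The three alarms described above can also be created with boto3. The following is a sketch under several assumptions: the threshold, period, and evaluation settings are illustrative values that the post does not specify, the helper names are hypothetical, and the function name and SNS topic ARN are supplied by the caller.

```python
from typing import Iterable


def metric_alarm_params(
    alarm_name: str,
    metric_name: str,
    comparison: str,
    function_name: str,
    threshold: float = 1.0,     # assumption: illustrative threshold
    period_seconds: int = 300,  # assumption: matches a 5-minute schedule
) -> dict:
    """Build keyword arguments for CloudWatch put_metric_alarm."""
    return {
        "AlarmName": alarm_name,
        "Namespace": "AWS/Lambda",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": comparison,
        "TreatMissingData": "breaching",  # missing data raises the alarm
    }


def composite_alarm_rule(alarm_names: Iterable[str]) -> str:
    """Build a rule that fires when any of the named alarms is in ALARM."""
    return " OR ".join(f'ALARM("{name}")' for name in alarm_names)


def create_alarms(function_name: str, sns_topic_arn: str) -> None:
    # Imported lazily so the helpers above work without boto3 installed.
    import boto3

    cloudwatch = boto3.client("cloudwatch")
    errors = "elb-hostname-as-target-errors"
    invocations = "elb-hostname-as-target-invocations"
    cloudwatch.put_metric_alarm(**metric_alarm_params(
        errors, "Errors", "GreaterThanOrEqualToThreshold", function_name))
    cloudwatch.put_metric_alarm(**metric_alarm_params(
        invocations, "Invocations", "LessThanThreshold", function_name))
    cloudwatch.put_composite_alarm(
        AlarmName="elb-hostname-as-target-monitor",
        AlarmRule=composite_alarm_rule([errors, invocations]),
        AlarmActions=[sns_topic_arn],
    )
```

Attaching the notification action only to the composite alarm avoids duplicate messages when both metric alarms fire for the same incident.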
Configure the solution using AWS CloudFormation:
AWS CloudFormation lets you model your entire infrastructure and application resources with either a text file or a programming language, which removes the need for manual actions or custom scripts. With AWS CloudFormation, you work with stacks made up of templates, which can be JSON- or YAML-formatted text files. When you create a stack, AWS CloudFormation makes underlying service calls based on the templates that you provide and provisions the resources.
To launch this solution using AWS CloudFormation in Oregon (us-west-2) region:
- To deploy using existing infrastructure, click on Launch Template. You can review the template here.
- To deploy using new infrastructure, click on Launch Template. You can review the template here.
Cleanup:
- If you used AWS CloudFormation to create the solution, delete the associated AWS CloudFormation stack.
- If you created the solution using the steps described above, delete the Lambda function and the associated Amazon EventBridge rule and Amazon CloudWatch alarms.
Summary:
In this blog, I introduced a solution that uses AWS Lambda to periodically resolve a target’s fully qualified domain name (FQDN) and register or deregister the resulting IP addresses as targets in a target group. This allows a Network Load Balancer to point to an FQDN that represents a cluster of servers rather than to individual targets. I also went over the Lambda monitoring functionality that you can use to detect anomalies in the Lambda function and send notifications about them.