How can I troubleshoot issues with my weighted routing policy in Route 53?

Last updated: 2021-05-22

I configured a weighted routing policy in Amazon Route 53. However, when I test the DNS resolution, I get unexpected results. How can I troubleshoot this?

Short description

Suppose that you created four weighted text (TXT) records with the name "weighted.awsexampledomain.com". Each record has a Time to Live (TTL) of 300 seconds, and the weights are configured as follows:

Name                            Type  TTL  Values                   Weight  Health check status
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 0"   0       Health check associated
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 20"  20      Health check associated
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 50"  50      Health check associated
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 70"  70      Health check associated

This configuration is referenced in the following examples.

Resolution

Note: If you receive errors when running AWS Command Line Interface (AWS CLI) commands, make sure that you're using the most recent AWS CLI version.

Test your weighted routing policy to identify the issue

Send a large number of queries (more than 10,000) to test your weighted routing policy, so that the distribution of answers is large enough to compare against the configured weights. Test the DNS resolution from multiple locations, or query the authoritative name servers directly to see the answers without resolver caching. Use the following scripts to send multiple DNS queries for your domain name.

Send DNS queries using the recursive resolver:

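#!/bin/bash
# Replace <domain-name>, <type>, and RecursiveResolver_IP with your own values.
for i in {1..10000}
do
  # Quote the arguments so that the shell passes them to dig unchanged.
  answer=$(dig "<domain-name>" "<type>" @RecursiveResolver_IP +short)
  echo "$answer" >> RecursiveResolver_results.txt
done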

Send DNS queries directly to the authoritative name servers:

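#!/bin/bash
# Replace <domain-name>, <type>, and AuthoritativeNameserver_IP with your own values.
for i in {1..10000}
do
  # Quote the arguments so that the shell passes them to dig unchanged.
  answer=$(dig "<domain-name>" "<type>" @AuthoritativeNameserver_IP +short)
  echo "$answer" >> AuthoritativeNameServer_results.txt
done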
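
The two scripts differ only in the server that they query, so a single parameterized helper might be more convenient when you test several resolvers. The following is a minimal sketch; the function name collect_answers and its argument order are illustrative assumptions, not part of any AWS tool.

```shell
#!/bin/bash
# collect_answers NAME TYPE SERVER COUNT OUTFILE
# Sends COUNT queries for NAME/TYPE to SERVER with dig and appends each
# short answer to OUTFILE. Hypothetical helper, shown for illustration only.
collect_answers() {
  local name=$1 rtype=$2 server=$3 count=$4 outfile=$5
  local i answer
  for ((i = 0; i < count; i++)); do
    answer=$(dig "$name" "$rtype" "@$server" +short)
    echo "$answer" >> "$outfile"
  done
}

# Example calls, using addresses that appear in this article's examples:
# collect_answers weighted.awsexampledomain.com. TXT 172.16.173.64 10000 RecursiveResolver_results.txt
# collect_answers weighted.awsexampledomain.com. TXT 205.251.194.16 10000 AuthoritativeNameServer_results.txt
```

Because the output file is appended to, use a new file name (or delete the old file) for each test run so that earlier results don't mix with new ones.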

Example output, with the answers tallied using sort and uniq:

$ for i in {1..10000}; do answer=$(dig weighted.awsexampledomain.com. TXT @172.16.173.64 +short); echo "$answer" >> RecursiveResolver_results.txt; done
$ sort RecursiveResolver_results.txt | uniq -c
1344 "Record with Weight 20"
3780 "Record with Weight 50"
4876 "Record with Weight 70"

Use your test results to troubleshoot your specific issue

Issue: Endpoint resources of the weighted records aren't receiving the expected traffic ratio.

Route 53 sends traffic to a resource based on the weight that you assign to the record as a proportion of the total weight for all records. However, intermediate DNS resolvers cache each DNS response for the duration of the record's TTL. Until the cached response expires, clients that use that resolver are directed to the same endpoint, so the observed traffic ratio can differ from the configured weights.
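
To see what ratio to expect before comparing test results, you can compute each record's share as its weight divided by the total weight. The following is a minimal sketch that assumes all records are healthy; the function name expected_shares is illustrative, and the weights are the example values from this article.

```shell
#!/bin/bash
# expected_shares WEIGHT...
# Prints the share of responses that each weight should receive:
# weight / sum of all weights. If every weight is zero, the records
# share traffic equally. Hypothetical helper, for illustration only.
expected_shares() {
  local total=0 w
  for w in "$@"; do total=$((total + w)); done
  for w in "$@"; do
    awk -v w="$w" -v t="$total" -v n="$#" 'BEGIN {
      share = (t > 0) ? 100 * w / t : 100 / n
      printf "Weight %d: %.1f%%\n", w, share
    }'
  done
}

expected_shares 0 20 50 70
```

For the example configuration, this prints 0.0%, 14.3%, 35.7%, and 50.0% for weights 0, 20, 50, and 70. Pass only the weights of the records that you expect Route 53 to consider, for example "expected_shares 20 70" when only those two records are eligible.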

For example, if you query against the caching DNS resolver 192.168.1.2:

$ for i in {1..10000}; do answer=$(dig weighted.awsexampledomain.com. TXT @192.168.1.2 +short); echo "$answer" >> CachingResolver_results.txt; done

$ sort CachingResolver_results.txt | uniq -c
3561 "Record with Weight 20"
1256 "Record with Weight 50"
5183 "Record with Weight 70"

Notice that these results don't match the configured weights (approximately 14%, 36%, and 50% for weights 20, 50, and 70) because of the cache at the recursive DNS resolver.
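
To make the comparison with the configured weights easier, you can convert the raw counts in a results file into percentages. The following is a minimal sketch; the function name observed_shares is an illustrative assumption, and empty lines (for example, from queries that timed out) are skipped.

```shell
#!/bin/bash
# observed_shares FILE
# Prints each distinct answer in FILE with its count and its share of all
# non-empty answers, as tab-separated columns sorted by answer.
# Hypothetical helper, for illustration only.
observed_shares() {
  local file=$1
  awk 'NF { count[$0]++; total++ }
       END {
         for (v in count)
           printf "%s\t%d\t%.1f%%\n", v, count[v], 100 * count[v] / total
       }' "$file" | sort
}

# Usage: observed_shares CachingResolver_results.txt
```

Shares that stay far from weight / total weight across repeated runs against a recursive resolver, but look correct when you query an authoritative name server, suggest that caching rather than the Route 53 configuration explains the difference.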

Issue: Some weighted records aren't returned.

For example, when some health checks are failing:

Name                            Type  TTL  Values                   Weight  Health check status
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 0"   0       Health Check Success
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 20"  20      Health Check Success
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 50"  50      Health Check Fail
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 70"  70      Health Check Success

$ for i in {1..10000}; do answer=$(dig weighted.awsexampledomain.com. TXT @192.168.1.2 +short); echo "$answer" >> HealthCheck_results.txt; done

$ sort HealthCheck_results.txt | uniq -c
3602 "Record with Weight 20"
6398 "Record with Weight 70"

Notice that the "Record with Weight 50" isn't returned by Route 53 because its health check is failing.

Issue: All weighted records are unhealthy.

Even if none of the records in a group of records are healthy, Route 53 must still provide a response to the DNS queries. However, there's no basis for choosing one record over another. In this case, Route 53 considers all the records in the group to be healthy. One record is selected based on the routing policy and the values that you specify for each record.

For example:

Name                            Type  TTL  Values                   Weight  Health check status
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 0"   0       Health Check Fail
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 20"  20      Health Check Fail
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 50"  50      Health Check Fail
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 70"  70      Health Check Fail

$ for i in {1..10000}; do answer=$(dig weighted.awsexampledomain.com. TXT @205.251.194.16 +short); echo "$answer" >> All_UnHealthy_results.txt; done

$ sort All_UnHealthy_results.txt | uniq -c
1446 "Record with Weight 20"
3554 "Record with Weight 50"
5000 "Record with Weight 70"

Notice that Route 53 considered all records healthy (Fail Open). Route 53 responded to the DNS requests based on the configured proportions. "Record with Weight 0" isn't returned because its weight is zero.

Note: If you assign nonzero weights to some records and zero weights to others, then health checks work the same as when all records have nonzero weights, with the following exceptions:

  • Route 53 initially considers only the healthy nonzero-weighted records, if any.
  • If all nonzero-weighted records are unhealthy, then Route 53 considers the healthy zero-weighted records.

For example:

Name                            Type  TTL  Values                   Weight  Health check status
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 0"   0       Health Check Success
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 20"  20      Health Check Success
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 50"  50      Health Check Fail
weighted.awsexampledomain.com.  TXT   300  "Record with Weight 70"  70      Health Check Fail

$ for i in {1..10000}; do answer=$(dig weighted.awsexampledomain.com. TXT @192.168.1.2 +short); echo "$answer" >> ZeroWeight_results.txt; done

$ sort ZeroWeight_results.txt | uniq -c
10000 "Record with Weight 20"

Notice that Route 53 doesn't return the record with weight 0. Route 53 returns healthy zero-weighted records only when all nonzero-weighted records are unhealthy.
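
The selection rules described in this section can be summarized in a short script. The following is a sketch of the rules as this article states them, not Route 53's actual implementation; the function name eligible_records and the "weight:status" argument format (with status "pass" or "fail") are illustrative assumptions.

```shell
#!/bin/bash
# eligible_records WEIGHT:STATUS...
# Prints the records that would be considered for a response, following
# the rules in this article. Hypothetical helper, for illustration only.
eligible_records() {
  local rec weight status
  local healthy_nonzero=() healthy_zero=() all_nonzero=()
  for rec in "$@"; do
    weight=${rec%%:*}
    status=${rec##*:}
    if [ "$weight" -gt 0 ]; then
      all_nonzero+=("$rec")
      if [ "$status" = "pass" ]; then healthy_nonzero+=("$rec"); fi
    elif [ "$status" = "pass" ]; then
      healthy_zero+=("$rec")
    fi
  done
  if [ "${#healthy_nonzero[@]}" -gt 0 ]; then
    # Healthy nonzero-weighted records are considered first.
    printf '%s\n' "${healthy_nonzero[@]}"
  elif [ "${#healthy_zero[@]}" -gt 0 ]; then
    # Otherwise, healthy zero-weighted records are considered.
    printf '%s\n' "${healthy_zero[@]}"
  elif [ "${#all_nonzero[@]}" -gt 0 ]; then
    # No record is healthy: all are treated as healthy (fail open),
    # so the nonzero-weighted records are chosen by weight.
    printf '%s\n' "${all_nonzero[@]}"
  else
    # No record is healthy and every weight is zero.
    printf '%s\n' "$@"
  fi
}

eligible_records 0:pass 20:pass 50:fail 70:fail
```

The call shown prints only 20:pass, which matches the preceding example where every answer was "Record with Weight 20". With all four records failing, the function prints the three nonzero-weighted records, which matches the fail-open example.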

If you set an equal weight for all records in a group, including a weight of zero for every record, then traffic is routed to all healthy resources with equal probability.