How do I troubleshoot issues with latency-based resource records and Route 53?

Last updated: 2021-05-20

Amazon Route 53 latency-based routing is returning a server in an AWS Region geographically far from the client. For example, when a user in the US tries to access my website, Route 53 returns the IP address of a server in Europe. How do I prevent clients from being routed to Regions far from their location?

Short description

Route 53 makes latency-based routing decisions based on one of the following:

  • The source IP address of the recursive DNS resolver sending the query to the Route 53 authoritative name server.
  • The truncated version of the client's IP address (if the DNS resolver supports the extension EDNS0-Client-Subnet).

Route 53 name servers support EDNS0-Client-Subnet by default. If a recursive DNS resolver supports EDNS0-Client-Subnet, the DNS resolver sends Route 53 a truncated version of the client's IP address. Route 53 then uses that truncated IP address to determine the lowest latency Region.
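To make the truncation concrete, here is a minimal Python sketch of how a client address might be reduced to a subnet before a resolver forwards it. The function name truncate_client_ip and the default /24 prefix are assumptions for illustration; the prefix length that is actually sent is chosen by the resolver. The address used is from a documentation range.

```python
import ipaddress


def truncate_client_ip(client_ip: str, prefix_len: int = 24) -> str:
    """Return the network that contains client_ip, with the host bits dropped."""
    # strict=False lets ip_network accept an address that has host bits set.
    network = ipaddress.ip_network(f"{client_ip}/{prefix_len}", strict=False)
    return str(network)


print(truncate_client_ip("203.0.113.57"))  # 203.0.113.0/24
```

Only the network portion survives, so Route 53 sees the client's approximate network location rather than the exact host address.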

The Region with the lowest latency might not be physically closest to the DNS resolver. You might experience unwanted routing behavior if the client isn't in the same location as the DNS resolver. You might also experience unwanted routing behavior if the location information recorded for the resolver's IP address differs from the resolver's actual location.

Example scenario

A company has latency-based routing records for two Elastic Load Balancers, one located in Virginia (us-east-1) and the other in Ireland (eu-west-1). Users in the US use a corporate DNS resolver located in Europe or connect to the corporate office in Europe over a VPN.

If the corporate DNS resolver can't send EDNS0-Client-Subnet data to the authoritative name servers, Route 53 considers the DNS resolver IP address in Europe as the query's source. Route 53 then performs a lookup in its latency database and incorrectly determines that the load balancer in Ireland has the lowest latency.

However, if the corporate DNS resolver can send EDNS0-Client-Subnet data, Route 53 considers the truncated client IP address in the US as the query's source. Route 53 then performs a lookup in its latency database and correctly determines that the load balancer in Virginia has the lowest latency.
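The two outcomes in this scenario can be sketched in a few lines of Python. This is only an illustration of the decision described above: the lookup table is a hypothetical stand-in for the Route 53 latency database, and the function names are assumptions, not part of any AWS API.

```python
from typing import Optional

# Hypothetical stand-in for the Route 53 latency database.
# The entries are illustrative only, not real measurements.
LOWEST_LATENCY_REGION = {
    "US": "us-east-1",
    "Europe": "eu-west-1",
}


def query_source(resolver_location: str, ecs_client_location: Optional[str]) -> str:
    """Return the location that is treated as the source of the query."""
    # With EDNS0-Client-Subnet, the truncated client address is used.
    # Without it, only the resolver address is available.
    if ecs_client_location is not None:
        return ecs_client_location
    return resolver_location


def routed_region(resolver_location: str, ecs_client_location: Optional[str] = None) -> str:
    """Look up the Region for whichever location counts as the query source."""
    return LOWEST_LATENCY_REGION[query_source(resolver_location, ecs_client_location)]


print(routed_region("Europe"))        # eu-west-1 (resolver in Europe, no client subnet)
print(routed_region("Europe", "US"))  # us-east-1 (client subnet in the US is sent)
```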

Resolution

Use the following steps to troubleshoot unwanted latency-based routing behavior:

1.    Check the IP address range used by the DNS resolver. On Linux/macOS, run the dig command in a loop. On Windows, run the nslookup command multiple times and be sure to note the output each time.

On Linux or macOS, use dig:

for i in {1..10}; do dig TXT o-o.myaddr.l.google.com +short; sleep 61; done

On Windows, use nslookup:

nslookup -type=txt o-o.myaddr.l.google.com

2.    Use the output to determine whether the DNS resolver uses Anycast. If the output always contains the same single IP address, then the DNS resolver doesn't use Anycast. If the IP address changes when you run the command multiple times, then the DNS resolver uses Anycast.

When a DNS resolver supports Anycast, there are multiple edge locations for DNS resolvers. A user's edge location is selected based on optimal latency, which might result in an unexpected location for the resolver IP address.
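If you collect the resolver addresses from repeated runs, the comparison in step 2 can be sketched in Python as follows. The function name resolver_ip_changes is an assumption for illustration, and the sample addresses are from documentation ranges, not real resolver output.

```python
def resolver_ip_changes(observed_ips):
    """Return True if more than one distinct resolver IP address was observed."""
    # dig +short prints TXT values in double quotes, so strip them first.
    distinct = {ip.strip().strip('"') for ip in observed_ips if ip.strip()}
    return len(distinct) > 1


same_each_time = ['"192.0.2.10"', '"192.0.2.10"', '"192.0.2.10"']
varies = ['"192.0.2.10"', '"198.51.100.7"', '"192.0.2.10"']

print(resolver_ip_changes(same_each_time))  # False
print(resolver_ip_changes(varies))          # True
```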

3.    Find the client IP address. From the client machine, open an internet browser and navigate to https://checkip.amazonaws.com/.

Or, use curl:

curl https://checkip.amazonaws.com/

4.    Check if the DNS resolver supports EDNS0-Client-Subnet using one of the following commands. Be sure to note the output.

On Linux or macOS, use dig:

dig +nocl TXT o-o.myaddr.l.google.com @<DNS Resolver>

On Windows, use nslookup:

nslookup -type=txt o-o.myaddr.l.google.com <DNS Resolver>

5.    Check the first TXT record returned in the Answer section of the output. This value is the IP address of the DNS resolver that sent the query; for a resolver that uses Anycast, it's the nearest resolver node. If there isn't a second TXT record, then the DNS resolver doesn't support EDNS0-Client-Subnet. If there's a second TXT record, then the DNS resolver supports EDNS0-Client-Subnet and sends a truncated client subnet (/24 or /32) to the Route 53 authoritative name server.
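The check in step 5 can be sketched as a small Python parser. It assumes that the TXT values have already been extracted one per line (for example, by adding +short to the dig command) and that the second record, when present, has the form edns0-client-subnet followed by the subnet. The function name and the sample values are hypothetical.

```python
def parse_myaddr_txt(output: str):
    """Return (resolver_ip, client_subnet) from one-per-line TXT values.

    client_subnet is None when no EDNS0-Client-Subnet record is present.
    """
    resolver_ip = None
    client_subnet = None
    for line in output.splitlines():
        value = line.strip().strip('"')  # TXT values are printed in quotes
        if not value:
            continue
        if value.startswith("edns0-client-subnet "):
            client_subnet = value.split(" ", 1)[1]
        elif resolver_ip is None:
            resolver_ip = value
    return resolver_ip, client_subnet


sample = '"192.0.2.53"\n"edns0-client-subnet 203.0.113.0/24"\n'
print(parse_myaddr_txt(sample))  # ('192.0.2.53', '203.0.113.0/24')
```

A result whose second element is None corresponds to a resolver that doesn't send EDNS0-Client-Subnet data.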

(Optional) If you enabled Route 53 DNS query logging on your hosted zone, create a test record. Then, perform a DNS lookup for the new record. Check the query logs to confirm the resolver IP address and EDNS0-Client-Subnet (if any) presented to Route 53 name servers.

6.    Check that the TTL value of the response is 60 seconds. If the TTL isn't 60 seconds, then the response is a cached response. Repeat your dig or nslookup command until the response TTL value is 60 seconds.
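As a rough aid for step 6, the TTL can be read from an answer-section line of the dig output, where it's the second whitespace-separated field. The following Python sketch assumes that layout; the function names and the sample lines are illustrative only.

```python
def answer_ttl(answer_line: str) -> int:
    """Extract the TTL (second field) from a dig answer-section line."""
    fields = answer_line.split()
    return int(fields[1])


def is_uncached(answer_line: str, expected_ttl: int = 60) -> bool:
    """Return True if the TTL equals the full value, as step 6 requires."""
    return answer_ttl(answer_line) == expected_ttl


fresh = 'o-o.myaddr.l.google.com. 60 IN TXT "192.0.2.53"'
cached = 'o-o.myaddr.l.google.com. 37 IN TXT "192.0.2.53"'

print(is_uncached(fresh))   # True
print(is_uncached(cached))  # False
```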

7.    If you can access the Route 53 DNS checking tool, then simulate queries from a specific DNS resolver or client IP address. Use these queries to find which latency resource record set Route 53 returns.

If the DNS resolver doesn't support EDNS0-Client-Subnet, then specify the Resolver IP address as your value in the tool.

If the DNS resolver supports EDNS0-Client-Subnet, then specify the EDNS0 client subnet IP as your value in the tool. Choose Additional configuration, and then specify the Subnet mask.

Note: This tool directly queries the Route 53 latency measurement database to determine the precalculated latency between AWS Regions and an internet-based network. The tool doesn't send DNS queries over the internet or to DNS resolvers. The tool doesn't check if the DNS resolver supports EDNS0-Client-Subnet. Results from the tool and an actual DNS query might differ.

8.    (Optional) If you can't access the Route 53 DNS checking tool, then use dig. Using dig, query the Route 53 authoritative name servers for your hosted zone with EDNS0-Client-Subnet. Use the output to determine the lowest latency Region from your source IP address.

dig lbr.example.com +subnet=<Client IP>/24 @ns-xx.awsdns-xxx.com +short

Note: Latency between hosts on the internet can change over time due to changes in network connectivity and routing. For example, a request that's routed to the Oregon Region this week might be routed to the Ohio Region next week.
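If you need to repeat the dig query from step 8 for several client addresses, the argument list can be assembled in Python. This is a minimal sketch: build_subnet_query is a hypothetical helper, the client address is from a documentation range, and the name server value is left as the placeholder used in step 8, so replace it with a name server from your hosted zone.

```python
import ipaddress


def build_subnet_query(record_name, client_ip, name_server, prefix_len=24):
    """Build the dig argument list for a query that carries a client subnet."""
    # Truncate the client address to its network, for example 203.0.113.0/24.
    subnet = ipaddress.ip_network(f"{client_ip}/{prefix_len}", strict=False)
    return ["dig", record_name, f"+subnet={subnet}", f"@{name_server}", "+short"]


cmd = build_subnet_query("lbr.example.com", "203.0.113.57", "ns-xx.awsdns-xxx.com")
print(" ".join(cmd))
# dig lbr.example.com +subnet=203.0.113.0/24 @ns-xx.awsdns-xxx.com +short
```

The resulting list can be passed to subprocess.run on a machine where dig is installed.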

9.    For resolvers that don't support EDNS0-Client-Subnet, Route 53 sees only the resolver's IP address. If the client's DNS queries use a resolver in a different geographic location from the client, then the result is unexpected routing behavior. In this case, change the client's DNS resolver to a different recursive DNS resolver located geographically closer to the client.

If you manage the DNS resolver, check the forwarding configuration. Confirm that you aren't forwarding DNS queries to another resolver that's further from the client's geographic location.

For resolvers that support EDNS0-Client-Subnet, check the geographic location of the client subnet IP address. To check the location, use the GeoIP database on the MaxMind website, or your preferred GeoIP database. If the client subnet IP address maps to a different geographic location from the client's actual location, then clients might experience unexpected routing behavior.

10.    (Optional) If the DNS resolver doesn't support EDNS0-Client-Subnet, switch to a public recursive DNS resolver that supports EDNS0-Client-Subnet. Then, compare your old latency routing response results from Route 53 with your new results. Two public DNS resolvers that currently support EDNS0-Client-Subnet are Google Public DNS (8.8.8.8 and 8.8.4.4) and OpenDNS (208.67.222.222 and 208.67.220.220).

11.    (Optional) Determine if the latency-based routing records are associated with a Route 53 health check, and if Evaluate Target Health is enabled (for alias records). If one or both are true, then Route 53 returns the healthy endpoint that has the lowest latency. If all health checks are failing, then Route 53 treats all records as healthy and considers only the routing policy.

Check the status of your Route 53 health check in the Route 53 console. If Evaluate Target Health (ETH) is enabled, then check the health status of the record endpoint. Route 53 considers an endpoint for a Classic Load Balancer with ETH enabled as healthy if at least one backend instance is healthy. For Application and Network Load Balancers, every target group with targets must contain at least one healthy target to be considered healthy. A target group with no registered targets is considered unhealthy. If any target group contains only unhealthy targets, then the load balancer is considered unhealthy. Exception: If an Application Load Balancer has at least one healthy target group and all remaining target groups are empty, then Route 53 considers it healthy.

For example, let's say that you have two latency-based routing records with associated Route 53 health checks, one in Oregon and the other in Northern Virginia. When the Oregon endpoint's health check fails, all requests are routed to the Northern Virginia endpoint regardless of the location of the client.
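The interaction between health checks and latency can be sketched in Python. This is a simplified model of the behavior described in step 11, not Route 53's implementation: the function name is an assumption, and the latency values are hypothetical numbers chosen only to make the example readable.

```python
def select_endpoint(latency_ms, healthy):
    """Pick the lowest-latency healthy Region.

    If no Region is healthy, fall back to latency alone across all Regions.
    """
    candidates = [region for region in latency_ms if healthy.get(region, False)]
    if not candidates:
        # All health checks are failing, so only the routing policy applies.
        candidates = list(latency_ms)
    return min(candidates, key=lambda region: latency_ms[region])


# Hypothetical latencies for a client on the US west coast.
latency_ms = {"us-west-2": 20, "us-east-1": 80}

print(select_endpoint(latency_ms, {"us-west-2": True, "us-east-1": True}))    # us-west-2
print(select_endpoint(latency_ms, {"us-west-2": False, "us-east-1": True}))   # us-east-1
print(select_endpoint(latency_ms, {"us-west-2": False, "us-east-1": False}))  # us-west-2
```

The second call mirrors the example above: with the Oregon endpoint unhealthy, the Northern Virginia endpoint is returned even though its latency is higher.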