Use Bring your own IP addresses (BYOIP) and RFC 8805 for localization of Internet content
AWS provides hundreds of services that help you deploy resources and applications globally in minutes, so you can rapidly expand your customer base across the world. At the time of writing this post, the AWS Cloud spans 102 Availability Zones (AZs) within 32 geographic AWS Regions around the world. Because AWS is continuously growing, refer to the AWS Global Infrastructure page for up-to-date details on AWS AZs and Regions. The geographic distribution of these AWS Regions helps you deploy your resources closer to your users across the globe, potentially lowering latency and improving your users’ experience.
AWS customers include individual users, organizations, and the AWS Partner Network (APN). Among others, the APN comprises various security and networking partners offering network and data security to their on-premises clients for traffic to and from AWS. They typically make this possible with private connectivity between on-premises locations through tunneling (IPsec, WireGuard, etc.) to AWS. Then they route traffic through a set of partner-managed virtual firewall appliances hosted on Amazon Elastic Compute Cloud (Amazon EC2) in AWS.
A common use case is routing internet traffic from on premises through AWS. This can provide security controls in the cloud, such as Deep Packet Inspection, Application Protocol Detection, Packet and Domain Name Filtering, and Intrusion Prevention. AWS customers achieve this by architecting hybrid network connectivity with AWS native services, such as AWS Direct Connect, AWS Transit Gateway, AWS Network Firewall, NAT Gateway, and Internet Gateway (IGW). Many of our customers also prefer APN partners’ offerings through AWS Marketplace for securing internet-bound traffic.
Localization of internet content plays an important part in enhancing the experience of internet users. For example, a user accessing a shopping website from Bogotá, Colombia, will expect the website to display content in the Spanish language and products relevant to Colombia. Any deviation from this could negatively affect the user experience. This also applies to the localization of internet data regardless of whether clients are accessing the internet directly from their on-premises networks, transiting through AWS native services, or through APN partner products hosted in AWS. Although the globally deployable AWS native and APN partner products help end-customers lower latency and achieve faster performance, they can run into problems with the localization of content for end users.
In this post, we explain how you can use the AWS Bring your own IP addresses (BYOIP) feature with RFC 8805 to solve challenges with the localization of internet content for on-premises clients when they use AWS to transit to the internet.
Content Localization Scenarios and Challenges
In this section, we explore two use cases where the end user experience is affected by content localization challenges.
Use Case Scenario 1:
In the following diagram (Figure 1), we have a customer organization with two on-premises locations: Madrid, Spain, and Frankfurt, Germany. These locations are connected to AWS through Direct Connect connections in their respective Direct Connect locations. The on-premises networks are connected to an “egress VPC” in the AWS Paris Region through a Transit Virtual Interface (TVIF), a Direct Connect Gateway (DXGW), and a Transit Gateway (TGW). The egress VPC hosts an AWS Network Firewall (ANF), a NAT Gateway (NATGW), and an IGW. The DXGW advertises a default route (0.0.0.0/0) toward the on-premises locations.
The following steps describe a packet walkthrough (Figure 1):
Figure 1: Use Case Scenario 1 with AWS Native Services
(1) Internet-bound traffic from the on-premises networks 192.168.0.0/24 (Spain) and 10.0.0.0/24 (Germany) is sent to the DXGW through the TVIF
(2) DXGW routes the traffic to the TGW
(3) The traffic enters the egress VPC through the TGW Elastic Network Interface (ENI) in the TGW Subnet
(4) The TGW ENI forwards the traffic to the ANF for inspection/filtering
(5) After inspection, traffic is sent to the NATGW where it is NAT’d to its Elastic IP address (EIP)
(6) This NAT’d traffic is sent toward the internet destination through IGW
The return traffic from the internet follows the same path in the reverse direction (Red arrows with labels (7) through (12) in Figure 1).
Use Case Scenario 2:
Here, we have a customer organization with two on-premises locations: London, UK, and San Francisco, USA. The on-premises networks are connected over VPN tunnels to a “transit VPC” in the AWS Paris Region. The on-premises locations and the AWS transit VPC are dual-stack enabled with both IPv4 and IPv6 addressing, as shown in the following diagram (Figure 2). In this architecture, the transit VPC has a Virtual Firewall Instance that terminates the VPN tunnels between the on-premises locations and AWS. It also provides security controls for traffic traversing the VPN tunnels. The Firewall Instance advertises default routes (0.0.0.0/0 & ::/0) toward the on-premises locations.
The following steps describe a packet walkthrough (Figure 2):
Figure 2: Use Case Scenario 2 with Third-party Firewall Instance
(1) Internet-bound traffic from the on-premises networks 192.168.0.0/24 & 2001:db8:5679::/48 (London) and 10.0.0.0/24 & 2001:db8:5678::/48 (San Francisco) is routed through the VPN tunnels to the firewall instance for inspection/filtering.
(2) After inspection, traffic is NAT’d to the public IPv4 or IPv6 address of the firewall instance ENI and sent toward the internet destination through IGW.
The return traffic from the internet follows the same path in the reverse direction (Red arrows with labels (3) and (4) in Figure 2).
Note that in Figure 2, we have represented a single Firewall Instance for simplicity, but it is recommended to have multiple Firewall Instances in multiple AZs to ensure high availability.
Imagine a user in the Frankfurt on-premises location searching for the Amazon retail website on a search engine. In scenario 1, traffic from the user’s local IP address (in the IPv4 range 10.0.0.0/24) traverses the Direct Connect TVIF, DXGW, TGW, and ANF, and enters the NATGW in the egress VPC in the Paris Region. At the NATGW, the private source address is translated to the EIP 203.0.113.52, and the traffic is sent out to the internet through the IGW.
In scenario 2, assuming the user uses IPv6 addressing for end-to-end communication, traffic from the user’s local IP address (in the IPv6 range 2001:db8:5678::/48) traverses the VPN tunnel and enters the Virtual Firewall Instance in the transit VPC in the AWS Paris Region. At the firewall instance, the source address is translated to the instance’s public IPv6 address 2001:db8:1234:1a00::b, and the traffic is sent to the internet through the IGW. In both scenarios, the search engine sees traffic that appears to originate from Paris, France, instead of the actual client locations (Frankfurt in scenario 1 and San Francisco in scenario 2). Therefore, the search engine could present the user with the amazon.fr website instead of the amazon.de and amazon.com websites, respectively. This negatively affects the user experience.
Before diving into the solution for the preceding challenge, let’s understand the concepts of AWS BYOIP and RFC 8805.
Bring your own IP addresses (BYOIP) in Amazon EC2
The BYOIP feature for Amazon EC2 allows organizations to bring their own IP addresses and associate them with their EC2 instances within the AWS environment. Traditionally, when using Amazon EC2, AWS automatically assigned public IP addresses to instances from Amazon-managed IP address pools. However, this limited an organization’s control over IP address management and mobility. BYOIP enables organizations to use their own IP addresses, giving them more flexibility and control over their IP resources.
The process of using BYOIP in AWS involves the following steps:
- Acquiring IP addresses: Organizations must obtain a block of IP addresses from a Regional Internet Registry (RIR) or an Internet Service Provider (ISP).
- Preparing the IP addresses: The acquired IP addresses must be registered with AWS and validated to confirm ownership.
- Associating IP addresses with EC2 instances: Once the IP addresses are registered, organizations can associate them with their EC2 instances or other public resources in their VPC (e.g., NLBs, NATGW). For EC2 instances, this association can be done during instance launch or by modifying existing instances or ENIs.
A Format for Self-Published IP Geolocation Feeds (RFC 8805)
RFC 8805 provides a standardized format for self-published IP geolocation feeds. Geolocation is the process of determining the physical location of an IP address on the internet. The RFC acknowledges that IP geolocation is an important tool for various applications, such as targeted advertising, fraud detection, and content localization. However, the accuracy and reliability of geolocation data can vary significantly depending on the sources and methods used. To address this issue, RFC 8805 defines a format for self-published IP geolocation feeds, allowing organizations and individuals to share their geolocation data with others. The format includes specific fields for representing IP address ranges, corresponding geographic locations, and associated metadata. This RFC provides guidelines on how to populate these fields and suggests best practices for maintaining and updating the geolocation feeds. It emphasizes the importance of data accuracy, timeliness, and privacy considerations. By establishing a common format, RFC 8805 aims to improve the interoperability and consistency of self-published IP geolocation feeds. This enables easier integration and use of geolocation data across different systems and applications.
Each entry of a geolocation feed allows the following values:
- IP prefix: This is either a single IPv4 or IPv6 address or a prefix in Classless Inter-Domain Routing (CIDR) format. For example, “192.0.2.1” and “192.0.2.0/24” for IPv4 and “2001:db8::1” and “2001:db8::/32” for IPv6.
- Alpha2code (Previously: ‘country’) (optional): This is a 2-letter ISO country code conforming to ISO 3166-1. For example, “US” for the United States and “PL” for Poland.
- Region (optional): This is a region code conforming to ISO 3166-2. For example, “ID-RI” for the Riau province of Indonesia and “NG-RI” for the Rivers province in Nigeria.
- City (optional): This is a free UTF-8 text, excluding the comma (‘,’) character. For example, “Dublin” and “New York”.
- Postal Code (optional – DEPRECATED): free UTF-8 text, excluding the comma (‘,’) character. For example, “106-6126” (in Minato ward, Tokyo, Japan).
RFC 8805 includes example entries that use different IP address formats and describe locations at the alpha2code (“country code”), region, and city granularity levels.
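To make the entry format concrete, here is a minimal Python sketch that parses geofeed text into structured entries. The sample entries are illustrative values written in the RFC 8805 format (they are not quoted from the RFC), and the names `GeofeedEntry` and `parse_geofeed` are hypothetical helpers introduced only for this example.

```python
import csv
import ipaddress
from typing import List, NamedTuple, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


class GeofeedEntry(NamedTuple):
    prefix: Network
    country: str  # alpha2code, empty string if omitted
    region: str   # ISO 3166-2 code, empty string if omitted
    city: str     # free text, empty string if omitted


def parse_geofeed(text: str) -> List[GeofeedEntry]:
    """Parse geofeed text, skipping blank lines and '#' comment lines."""
    lines = [
        line for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    entries = []
    for row in csv.reader(lines):
        # Pad the row so that omitted optional fields become empty strings.
        fields = [field.strip() for field in row] + [""] * 4
        # A bare address such as 192.0.2.5 becomes a /32 (or /128) network.
        prefix = ipaddress.ip_network(fields[0])
        entries.append(
            GeofeedEntry(prefix, fields[1].upper(), fields[2].upper(), fields[3])
        )
    return entries


# Illustrative entries at country, region, and city granularity (assumed values).
SAMPLE_FEED = """\
# prefix,alpha2code,region,city,postal_code
2001:db8::/32,PL,,,
192.0.2.0/25,US,US-AL,,
192.0.2.5,US,US-AL,Alabaster,
"""

for entry in parse_geofeed(SAMPLE_FEED):
    print(entry)
```

The parser relies on `ipaddress.ip_network`, which raises `ValueError` for malformed prefixes, so a bad entry fails loudly instead of being published silently.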
Solution
With BYOIP in Amazon EC2 and RFC 8805 understood, let’s see how you use them together to solve the challenge described earlier.
Step 1: Bring part or all of your public IP address range from your on-premises deployment to your AWS account. Refer to our Bring your own IP addresses (BYOIP) in Amazon EC2 public documentation and the Introducing Bring Your Own IP (BYOIP) for Amazon VPC post for details on requirements, prerequisites, preparing, provisioning, and advertising your public IP ranges in AWS.
Step 2: During the provisioning stage of BYOIP to AWS, an AWS public IP address pool is created. The next step is to use the allocate-address command to allocate EIPs for your NATGWs and virtual firewall appliances hosted on Amazon EC2. Refer to the Work with Elastic IP addresses documentation for detailed steps.
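As a rough illustration of Step 2, the following Python sketch assembles an `aws ec2 allocate-address` command that requests a specific address from a BYOIP pool. The pool ID is a made-up placeholder, the address and Region are taken from the scenario 1 solution in this post, and `allocate_address_command` is a hypothetical helper. The sketch only builds and prints the command; actually running it assumes the AWS CLI is installed and configured with credentials.

```python
import shlex
from typing import List, Optional

# Hypothetical pool ID; use the ID of the pool created when you provisioned your range.
EXAMPLE_POOL_ID = "ipv4pool-ec2-0123456789abcdef0"


def allocate_address_command(
    pool_id: str,
    address: Optional[str] = None,
    region: Optional[str] = None,
) -> List[str]:
    """Build the AWS CLI arguments to allocate an EIP from a BYOIP pool."""
    cmd = [
        "aws", "ec2", "allocate-address",
        "--domain", "vpc",
        "--public-ipv4-pool", pool_id,
    ]
    if address is not None:
        # Request one specific address from the pool instead of any free one.
        cmd += ["--address", address]
    if region is not None:
        cmd += ["--region", region]
    return cmd


command = allocate_address_command(
    EXAMPLE_POOL_ID, address="198.51.100.10", region="eu-west-3"
)
print(shlex.join(command))
# To execute it: subprocess.run(command, check=True)
```

Passing the arguments as a list (rather than a single shell string) avoids quoting problems if you later hand the command to `subprocess.run`.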
Step 3: Use the RFC 8805 format described in the previous section to publish specific locations for the EIPs/IP ranges. The following screenshot (Figure 3) shows a self-published geolocation feed file with multiple IPv4 and IPv6 address ranges:
Figure 3: An example of self-published geolocation feed file
Let’s look at an IPv4 entry from the preceding screenshot:
198.51.100.0/24,DE,DE-HE,Frankfurt,
In the preceding entry,
- “198.51.100.0/24” is the public IPv4 prefix you have brought to AWS
- “DE” is the 2-letter ISO country code for Germany (Deutschland)
- “DE-HE” is the ISO region code for Hessen state
- “Frankfurt” is the city in Germany where the IPv4 prefix users are located
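The following Python sketch shows one way the geofeed file for Step 3 could be generated from a list of prefixes and locations. The two prefixes and cities come from the solution scenarios in this post; the “US-CA” region code for San Francisco and the helper name `format_entry` are assumptions made for this example.

```python
import ipaddress


def format_entry(prefix: str, country: str = "", region: str = "", city: str = "") -> str:
    """Return one geofeed line: prefix,alpha2code,region,city,postal_code."""
    network = ipaddress.ip_network(prefix)  # raises ValueError on a bad prefix
    for value in (country, region, city):
        if "," in value:
            raise ValueError("geofeed fields must not contain a comma")
    # The trailing empty field is the deprecated postal code, left blank.
    return ",".join([str(network), country.upper(), region.upper(), city, ""])


# Prefixes and cities from this post; the US-CA region code is an assumption.
LOCATIONS = [
    ("198.51.100.0/24", "DE", "DE-HE", "Frankfurt"),
    ("2001:db8:4000::/48", "US", "US-CA", "San Francisco"),
]

geofeed_text = "\n".join(format_entry(*location) for location in LOCATIONS) + "\n"
print(geofeed_text, end="")
```

Generating the file from a single source of truth keeps the published locations in step with the addresses you actually assign to NATGWs and firewall ENIs.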
Step 4: Update your regional internet registry (such as ARIN, APNIC, or RIPE) with a comment/remark that states where the self-published geofeed file can be found. Geolocation database providers, such as MaxMind and IPdata.co, can then poll and parse the feed and merge it with their other geolocation data sources and procedures. The following screenshots (Figures 4 and 5) show examples of RIPE and ARIN internet registries updated with remarks/comments specifying the location of the self-published geofeed file.
Figure 4: An example of RIPE database entry with remarks
Figure 5: An example of ARIN database entry with comments
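To illustrate how a geolocation provider might discover the feed from such a registry record, here is a small Python sketch. It assumes the remark follows the “Geofeed” plus HTTPS URL convention described in RFC 9092 and that the field is labeled `remarks:` (RIPE style) or `Comment:` (ARIN style); the record text, the URL, and the function name `find_geofeed_url` are invented for this example and are not taken from Figures 4 and 5.

```python
import re
from typing import Optional

# Matches e.g. "remarks: Geofeed https://..." or "Comment: Geofeed https://..."
GEOFEED_REMARK = re.compile(
    r"^\s*(?:remarks|comment):\s*geofeed\s+(https://\S+)", re.IGNORECASE
)


def find_geofeed_url(whois_text: str) -> Optional[str]:
    """Return the first geofeed URL found in a registry record, if any."""
    for line in whois_text.splitlines():
        match = GEOFEED_REMARK.match(line)
        if match:
            return match.group(1)
    return None


# Invented registry record for illustration only.
SAMPLE_RECORD = """\
inetnum:        198.51.100.0 - 198.51.100.255
netname:        EXAMPLE-NET
remarks:        Geofeed https://example.com/geofeed.csv
"""

print(find_geofeed_url(SAMPLE_RECORD))
```

Running the same check against your own registry record after Step 4 is a quick way to confirm that the remark is machine-readable.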
The solution architectures for scenarios one and two are shown in the following two diagrams (Figures 6 and 7).
In the solution for scenario 1, you have brought the IPv4 range 198.51.100.0/24 into the AWS eu-west-3 (Paris) Region, created an EIP 198.51.100.10, and assigned it to the NATGW. You have also published the geolocation feed for IP prefix 198.51.100.0/24 to be located in Frankfurt, Germany (shown in Figure 3). Traffic from a client located in the Frankfurt, Germany on-premises location is translated to the NATGW’s EIP (198.51.100.10) and sent out through the IGW. The client is presented with the correct content relevant to the German audience.
Figure 6: Solution for Use Case Scenario 1
Similarly, in the solution for scenario 2, you have brought the IPv6 range 2001:db8:4000::/48 into the AWS eu-west-3 (Paris) Region, made it publicly routable, and assigned the IPv6 address 2001:db8:4000:1b00::c to the Virtual Firewall Instance ENI. You have also published the geolocation feed for IP prefix 2001:db8:4000::/48 to be in San Francisco, USA (shown in Figure 3). Traffic from a client in the San Francisco, USA on-premises location is translated to the IPv6 address of Firewall Instance ENI (2001:db8:4000:1b00::c) and sent out through the IGW. The client is presented with the correct content relevant to the US audience.
Figure 7: Solution for Use Case Scenario 2
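The effect of the solution can be sketched as a longest-prefix lookup in a toy geolocation database. The table below is an assumption for illustration: it maps the Amazon-assigned example addresses from the problem scenarios to Paris and the BYOIP prefixes to the locations published in the geofeed. Real providers combine many data sources, so treat `GEO_DB` and `locate` as a hypothetical model rather than how any particular provider works.

```python
import ipaddress
from typing import Optional, Tuple

# Toy database (assumed): prefix -> (alpha2code, city).
GEO_DB = {
    "203.0.113.0/24": ("FR", "Paris"),              # Amazon-assigned EIP range
    "2001:db8:1234::/48": ("FR", "Paris"),          # Amazon-assigned IPv6 range
    "198.51.100.0/24": ("DE", "Frankfurt"),         # BYOIP prefix with geofeed
    "2001:db8:4000::/48": ("US", "San Francisco"),  # BYOIP prefix with geofeed
}


def locate(ip: str) -> Optional[Tuple[str, str]]:
    """Return the location of the most specific prefix containing the address."""
    address = ipaddress.ip_address(ip)
    best = None
    for prefix, location in GEO_DB.items():
        network = ipaddress.ip_network(prefix)
        if address.version != network.version or address not in network:
            continue
        if best is None or network.prefixlen > best[0].prefixlen:
            best = (network, location)
    return best[1] if best else None


print(locate("203.0.113.52"))           # source address before the solution
print(locate("198.51.100.10"))          # NATGW EIP from the BYOIP range
print(locate("2001:db8:4000:1b00::c"))  # firewall ENI address from the BYOIP range
```

Under these assumptions, the original EIP resolves to Paris, while the BYOIP addresses resolve to Frankfurt and San Francisco, which is the behavior the solution aims for.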
Considerations
- IP prefix length: The most specific public IPv4 address range that you can bring to AWS is /24. The most specific IPv6 address range that you can bring is /48 for CIDRs that are publicly advertised, and /56 for CIDRs that are not publicly advertised. However, there are no restrictions on the prefix length of the IP prefixes you publish in the self-published geofeed file. You can publish prefixes as specific as a /32 for IPv4 and a /128 for IPv6 in the geofeed file.
- Amazon VPC IP Address Manager (IPAM): IPAM is a VPC feature that makes it easier for you to plan, track, and monitor IP addresses for your AWS workloads. With IPAM, you can enable cross-Region and cross-account sharing of your BYOIP addresses. In contrast, if you bring your own IP addresses in the traditional way, you must bring each address range to one AWS Region and account at a time. IPAM has additional costs involved. Refer to the Amazon VPC pricing page for IPAM related pricing details.
- Propagation delay: After you self-publish geolocation information in the geofeed file and update your regional internet registry with a comment/remark, it may take several weeks for the changes to propagate and for the geolocation database providers to start honoring the information.
- It is up to the discretion of the geolocation database providers to honor the geolocation information you publish in the geofeed file. We strongly recommend testing the solution in a lab environment before deploying it into production.
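The prefix-length consideration above can be checked before you provision a range or publish a feed. The following Python sketch encodes the limits stated in this post (/24 for IPv4, /48 for publicly advertised IPv6, /56 for IPv6 that is not publicly advertised) and verifies that a geofeed entry falls inside a BYOIP range. The function names are hypothetical, and the limits should be confirmed against the current BYOIP documentation.

```python
import ipaddress


def byoip_prefix_ok(cidr: str, advertised: bool = True) -> bool:
    """Check a range against the most specific prefix lengths stated in this post."""
    network = ipaddress.ip_network(cidr)
    if network.version == 4:
        limit = 24
    else:
        limit = 48 if advertised else 56
    return network.prefixlen <= limit


def geofeed_entry_covered(entry: str, byoip_cidr: str) -> bool:
    """Check that a geofeed entry (address or prefix) lies inside a BYOIP range."""
    entry_net = ipaddress.ip_network(entry)
    byoip_net = ipaddress.ip_network(byoip_cidr)
    # subnet_of() requires both networks to be the same IP version.
    return entry_net.version == byoip_net.version and entry_net.subnet_of(byoip_net)


print(byoip_prefix_ok("198.51.100.0/24"))
print(byoip_prefix_ok("198.51.100.0/25"))
print(geofeed_entry_covered("198.51.100.10/32", "198.51.100.0/24"))
```

A geofeed entry can be as specific as a single address, so the second check is about containment in the range you brought to AWS, not about prefix length.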
Conclusion
In this post, we discussed the challenges that AWS customers and APN partners might face with the localization of internet content for their end users, and how they can use the AWS BYOIP feature with RFC 8805 to address them.
For more information, refer to the following resources:
- Bring your own IP addresses (BYOIP) in Amazon EC2
- Tutorial: Bring your IP addresses to IPAM
- RFC 8805: A Format for Self-Published IP Geolocation Feeds