Leverage Amazon CloudFront geolocation headers for state level geo-targeting

Introduction

When you provide content online, personalization helps you improve your customers’ experience, market effectively, and meet regulatory requirements. One common way to personalize web content is by the geographical location of your customers. Since 2014, Amazon CloudFront has supported country-level, location-based personalization with a feature called Geolocation Headers. Using the CloudFront-Viewer-Country header, you can identify the country a request originated from and customize the content it receives.

There are some cases where you need more granular targeting. In the United States, for example, you may want to present different variants of a website to viewers from different states. Historically, you have been able to do this with third-party products like ipgeolocation.io or IP2LocationLite, which provide location data for IP addresses.

In July 2020, Amazon CloudFront announced support for additional geolocation headers, including state, city, and postal code, to support granular location-based web personalization. This blog post walks you through the implementation of these new headers, including caching and geo-targeting at a US state level with Amazon CloudFront, Lambda@Edge, and a static website hosted on Amazon S3.

Overview of solution

To demonstrate this functionality, imagine you have a static website hosted in an S3 bucket with an index.html file, different variants of the banner image for US states (for example, banner_ca.png for California), and a default banner.png image as a fallback and for non-US viewers. When your user makes a request for the website, the following steps are performed:

The user makes a request to the website domain, in our case the CloudFront distribution domain name.
The HTTP request reaches the CloudFront CDN, which returns the cached version of the resource if it exists.
In case of a cache miss, CloudFront sets the cache key and geolocation headers.
Depending on the cache behavior defined in the CloudFront distribution (in our example, all files matching the *.png pattern), a Lambda function checks the country of the viewer. If the viewer is in the US, it rewrites the URL by appending the state suffix and forwards the request to the S3 bucket.
The requested resource is returned from the S3 bucket.

Prerequisites

An AWS account.
Follow this video to set up Amazon CloudFront to serve a static website hosted on S3, using an S3 API as the origin within CloudFront. Don’t upload any files during the S3 bucket part of the setup. After this setup, you should have a CloudFront distribution with an S3 bucket as its origin.

Note: The example in the video assumes you use a custom domain name and SSL certificate. This is not mandatory for the purpose of this blog post. If you don’t have a custom domain you can skip the CNAME and custom SSL certificate selection.

Walkthrough

S3 bucket setup

Create the index.html file.

<!DOCTYPE html>
<html>
    <body>
        <h1>Join the 2020 AWS Summit</h1>
        <img src="banner.png">
    </body>
</html>

For the default version of the banner, download this banner image from the Global AWS Summit and rename it to banner.png.
This walkthrough demonstrates the functionality for a viewer in California, so as a variant of the default banner, download this banner image from the San Francisco AWS Summit and rename it to banner_ca.png.
Upload these three files to the S3 bucket.
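
If you prefer to script the upload instead of using the console, the following is a minimal sketch using boto3. The bucket name `my-geo-demo-bucket` and the helper names `build_upload_plan` and `upload_site` are assumptions for illustration, and the script expects the three files in the current directory. It sets the Content-Type explicitly so that browsers render the page and images rather than downloading them.

```python
import mimetypes

# The three files created in the steps above.
SITE_FILES = ["index.html", "banner.png", "banner_ca.png"]


def build_upload_plan(file_names):
    """Pair each local file with the Content-Type S3 should store for it."""
    plan = []
    for name in file_names:
        content_type, _ = mimetypes.guess_type(name)
        plan.append((name, content_type or "application/octet-stream"))
    return plan


def upload_site(bucket_name, file_names):
    """Upload the files to the bucket root, using the local names as object keys."""
    import boto3  # imported here so the plan can be inspected without the AWS SDK

    s3 = boto3.client("s3")
    for name, content_type in build_upload_plan(file_names):
        s3.upload_file(name, bucket_name, name, ExtraArgs={"ContentType": content_type})


if __name__ == "__main__":
    # Hypothetical bucket name: replace it with the bucket from the prerequisites.
    upload_site("my-geo-demo-bucket", SITE_FILES)
```

Running the script requires AWS credentials that are allowed to write objects to the bucket.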

CloudFront setup

The first step is to define the path pattern of the files for which you support multiple variants and which you want cached based on a cache key of your choice. In this case, the variants are for the image file banner.png.
Go to the CloudFront console page and select your distribution. Select the ‘Behaviors’ tab and create a new behavior.
- In our case, we choose the *.png path pattern.
- For ‘Viewer Protocol Policy’, select ‘Redirect HTTP to HTTPS’.
- For ‘Cache and origin request settings’, select ‘Use legacy cache settings’. We will change this later.
- Continue with the rest of the settings as default to create the cache behavior.

Lambda setup

Create the Lambda function by going into the Lambda console page and selecting ‘Create function’ and ‘Author from scratch’. Name your function and select the Python runtime. Finally, create a new execution role and select the ‘Basic Lambda@Edge permissions’ policy which will allow CloudFront to trigger this function.

Paste this code snippet into the editor.

import os
from urllib.parse import urlparse


def lambda_handler(event, context):
    # CloudFront passes the origin request in the first record of the event.
    request = event['Records'][0]['cf']['request']

    parsed_uri = urlparse(request['uri']).path

    root = build_root_path(parsed_uri)
    # splitext splits on the last dot only, so dots inside the file name are kept.
    file_name, extension = os.path.splitext(os.path.basename(parsed_uri))
    suffix = build_suffix(request['headers'])

    # For example, /banner.png becomes /banner_ca.png for a viewer in California.
    request['uri'] = root + file_name + suffix + extension

    return request


def build_root_path(parsed_uri):
    # Return the directory part of the path with exactly one trailing slash.
    root = os.path.split(parsed_uri)[0]
    return root if root == "/" else root + "/"


def build_suffix(headers):
    # US viewers get the lowercase state code as a suffix; other viewers get none.
    country = headers['cloudfront-viewer-country'][0]['value']
    if country == 'US':
        return '_' + headers['cloudfront-viewer-country-region'][0]['value'].lower()
    return ''

Deploy the function to run at the edge when CloudFront invokes it. From the ‘Actions’ dropdown, select ‘Deploy to Lambda@Edge’.
In the next step, select the distribution and the cache behavior you created. For the CloudFront event, select ‘Origin request’. As described in the documentation page, the function will execute only when CloudFront forwards a request to your origin.
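
The function above appends a state suffix for every US viewer, so a request from a state without an uploaded variant is rewritten to a file name that is not in the bucket. One way to fall back to the default banner is to check the state against a list of available variants. The following is a minimal sketch of that idea, not part of the original function: the `AVAILABLE_STATES` set and the `make_event` helper are assumptions for illustration. The helper builds only the fields of the origin request event that the handler reads, so you can try the rewrite locally before deploying.

```python
import posixpath

# Hypothetical allowlist: lowercase state codes that have a banner variant in the bucket.
AVAILABLE_STATES = {"ca"}


def build_suffix(headers, available_states=AVAILABLE_STATES):
    """Return '_<state>' for US viewers with an uploaded variant, otherwise ''."""
    country = headers.get("cloudfront-viewer-country", [{}])[0].get("value")
    if country != "US":
        return ""
    region = headers.get("cloudfront-viewer-country-region", [{}])[0].get("value", "")
    state = region.lower()
    return "_" + state if state in available_states else ""


def lambda_handler(event, context):
    request = event["Records"][0]["cf"]["request"]
    # Request URIs always use forward slashes, so posixpath is used explicitly.
    directory, base_name = posixpath.split(request["uri"])
    file_name, extension = posixpath.splitext(base_name)
    root = directory if directory.endswith("/") else directory + "/"
    request["uri"] = root + file_name + build_suffix(request["headers"]) + extension
    return request


def make_event(uri, country, region=None):
    """Build a minimal origin request event for local testing (assumed helper)."""
    headers = {
        "cloudfront-viewer-country": [
            {"key": "CloudFront-Viewer-Country", "value": country}
        ]
    }
    if region is not None:
        headers["cloudfront-viewer-country-region"] = [
            {"key": "CloudFront-Viewer-Country-Region", "value": region}
        ]
    return {"Records": [{"cf": {"request": {"uri": uri, "headers": headers}}}]}


if __name__ == "__main__":
    for country, region in [("US", "CA"), ("US", "NY"), ("DE", None)]:
        rewritten = lambda_handler(make_event("/banner.png", country, region), None)
        print(country, region, rewritten["uri"])
```

With this variant, you need to update the allowlist each time you upload a banner for a new state.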

Cache policy and cache behavior setup

Go to the CloudFront console page and select your distribution. Select the ‘Behaviors’ tab and edit the created behavior.
For the ‘Cache and origin request settings’, select ‘Use a cache policy and origin request policy’ and then ‘Create a new policy’.

This will open a new page for creating the cache policy.
- Set the TTL settings.
- Select the contents of the cache key. In our case, we are interested in caching based on the country and the region of the viewer, so we will whitelist two headers:
  - CloudFront-Viewer-Country
  - CloudFront-Viewer-Country-Region: for the US, this header contains a code (up to three characters) that represents the viewer’s region. The region is the most specific subdivision of the ISO 3166-2 code.

On the cache behavior page, select the newly created cache policy and save the behavior.
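
The same cache policy can also be created programmatically. The following is a minimal sketch using the boto3 `create_cache_policy` call. The policy name, the TTL values, and the helper names `build_cache_policy_config` and `create_cache_policy` are assumptions chosen for illustration, so adjust them to your own requirements.

```python
# Headers that become part of the cache key and are forwarded to the origin.
GEO_HEADERS = ["CloudFront-Viewer-Country", "CloudFront-Viewer-Country-Region"]


def build_cache_policy_config(name, min_ttl=1, default_ttl=86400, max_ttl=31536000):
    """Build a CachePolicyConfig that caches on the two geolocation headers.

    The TTL defaults (in seconds) are assumed values, not recommendations.
    """
    return {
        "Name": name,
        "Comment": "Cache by viewer country and region",
        "MinTTL": min_ttl,
        "DefaultTTL": default_ttl,
        "MaxTTL": max_ttl,
        "ParametersInCacheKeyAndForwardedToOrigin": {
            "EnableAcceptEncodingGzip": True,
            "EnableAcceptEncodingBrotli": True,
            "HeadersConfig": {
                "HeaderBehavior": "whitelist",
                "Headers": {"Quantity": len(GEO_HEADERS), "Items": list(GEO_HEADERS)},
            },
            # Cookies and query strings are left out of the cache key.
            "CookiesConfig": {"CookieBehavior": "none"},
            "QueryStringsConfig": {"QueryStringBehavior": "none"},
        },
    }


def create_cache_policy(name):
    """Create the policy in CloudFront and return its ID."""
    import boto3  # imported here so the config can be inspected without the AWS SDK

    cloudfront = boto3.client("cloudfront")
    response = cloudfront.create_cache_policy(
        CachePolicyConfig=build_cache_policy_config(name)
    )
    return response["CachePolicy"]["Id"]


if __name__ == "__main__":
    # Hypothetical policy name.
    print(create_cache_policy("geo-banner-cache-policy"))
```

The returned ID identifies the policy that you then attach to the *.png cache behavior.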

Testing the implementation

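Now let’s test the two versions of the content by making one request from California and one from a location that should receive the default banner, such as another country. Keep in mind that the function as written appends a state suffix for every US viewer, so a request from another US state is rewritten to a state-specific file name (for example, banner_ny.png) that must exist in the bucket. In the browser, paste either your custom domain name, if you created one during the initial setup, or the CloudFront distribution domain name.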
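
To check from a script whether the banner was served from the cache, you can inspect the X-Cache response header that CloudFront adds. The following is a minimal sketch using only the Python standard library. The domain name is a placeholder and the helper names `cache_status` and `fetch_banner` are assumptions for illustration. The script only sees the variant for the location it runs from.

```python
import urllib.request


def cache_status(headers):
    """Summarize the X-Cache response header as 'hit', 'miss', or 'other'."""
    value = (headers.get("X-Cache") or "").lower()
    if value.startswith("hit"):
        return "hit"
    if value.startswith("miss"):
        return "miss"
    return "other"


def fetch_banner(domain):
    """Request banner.png and return (HTTP status, body size, cache status)."""
    url = f"https://{domain}/banner.png"
    with urllib.request.urlopen(url) as response:
        body = response.read()
        return response.status, len(body), cache_status(response.headers)


if __name__ == "__main__":
    # Placeholder domain: use your distribution domain name or custom domain.
    domain = "d111111abcdef8.cloudfront.net"
    print(fetch_banner(domain))
    print(fetch_banner(domain))
```

Because the cache key includes the country and region headers, each location has its own cached copy, so the first request from a new location is typically reported as a miss.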

Cleanup

Remove the CloudFront distribution, S3 bucket, and the Lambda function to avoid further costs.

Conclusion

In this post, you have seen how to use the geolocation headers available in Amazon CloudFront to cache and customize content based on the location of the viewer. You used the two headers CloudFront-Viewer-Country and CloudFront-Viewer-Country-Region to create a cache key in CloudFront and to build logic, executed at the edge of the network, that determines the correct path of the file to return to the requester. Beyond these two, other headers provide information about the viewer’s location that you can use to build personalized experiences: display content by city (with CloudFront-Viewer-City), show accessible attractions close by (with CloudFront-Viewer-Postal-Code), adjust times to the viewer’s time zone (with CloudFront-Viewer-Time-Zone), and identify the viewer’s location more precisely (with CloudFront-Viewer-Latitude and CloudFront-Viewer-Longitude).

Mihai Anghel

Mihai is a Solutions Architect at AWS and works with customers in the Information Services and Telco industries. He is interested in distributed architectures and the containers ecosystem.
