Networking & Content Delivery

Dynamically Route Viewer Requests to Any Origin Using Lambda@Edge

By Jake Wells, AWS Solutions Architect

Earlier this year we announced the general availability of Lambda@Edge, which allows you to intelligently process HTTP requests at locations that are close (for latency purposes) to your users.

Since the launch, we have seen customers using Lambda@Edge in a variety of ways:

  • Serving different versions of pages for A/B testing
  • Providing rich and personal login experiences
  • “Prettifying” URLs to hide file names and extensions
  • Pulling data from Amazon DynamoDB
  • Generating dynamic HTML pages without returning to origin
  • Determining WebP image support to return the most efficient image for the browser

All of these uses provide a rich and personal experience by executing logic closer to your users. Example functions for the use cases listed above can be found in the Lambda@Edge Developer Guide.

Today I’m pleased to announce an addition to Lambda@Edge that allows you to do content-based routing. Now you can programmatically define the origin based on logic in your Lambda function. This enables you to route requests to different origins based on request attributes such as headers, query strings, and cookies.

To get started, attach a Lambda function to the Origin Request trigger of your Amazon CloudFront distribution behavior. When a viewer requests an object that isn’t cached in CloudFront, the trigger for the associated behavior fires, which gives you the opportunity to modify the origin object in the request before CloudFront routes it to the origin.

The following screenshot shows an example Amazon S3 origin object with the HTTP request.
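As a rough text rendering of that structure, an S3 origin object inside an Origin Request event looks something like the sketch below. The bucket name, Region, and URI are placeholder values chosen for illustration, and only the fields relevant to routing are shown.

```javascript
'use strict';

// Sketch of the request found at event.Records[0].cf.request in an
// Origin Request event. All values are placeholders, not real resources.
const exampleRequest = {
    method: 'GET',
    uri: '/index.html',
    querystring: '',
    headers: {
        host: [{ key: 'Host', value: 'lambdaedgedemo-firstbucket.s3.amazonaws.com' }]
    },
    origin: {
        s3: {
            authMethod: 'none',   // 'origin-access-identity' when an OAI is configured
            customHeaders: {},    // extra headers CloudFront adds to origin requests
            domainName: 'lambdaedgedemo-firstbucket.s3.amazonaws.com',
            path: '',             // optional origin path prefix
            region: 'eu-west-2'   // placeholder Region for the bucket
        }
    }
};

console.log(JSON.stringify(exampleRequest.origin, null, 2));
```

The field that matters most for content-based routing is `origin.s3.domainName`: changing it (together with the `host` header) is what sends the request to a different bucket.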

 

This can be useful for doing the following:

  • Detecting crawlers and routing them to an origin serving a crawler-friendly web page. For example, you could have one origin for real-world users and a second origin for crawlers. When a viewer request arrives you could inspect the User-Agent to see if it’s a crawler and route to the appropriate origin.
  • A/B testing across multiple site configurations. Without this feature, A/B testing with cookies could only route between objects on a single origin. With it, you can route between multiple origins.
  • Using cookies for stickiness to enhance the consistency of the user experience for websites that have multiple origins.
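The crawler case from the list above could be sketched roughly as follows. This is an illustration under assumptions, not a complete detector: the crawler bucket name and the short user-agent pattern are hypothetical, and the sketch assumes the User-Agent header is whitelisted for forwarding on the behavior, since otherwise CloudFront replaces it with "Amazon CloudFront" before the Origin Request trigger runs.

```javascript
'use strict';

// Hypothetical origin that serves crawler-friendly pages (placeholder name).
const CRAWLER_ORIGIN = 'example-crawler-bucket.s3.amazonaws.com';
// Deliberately short, illustrative pattern; real detection needs a fuller list.
const CRAWLER_PATTERN = /googlebot|bingbot|baiduspider|duckduckbot/i;

const routeCrawlers = (event, context, callback) => {
    const request = event.Records[0].cf.request;
    const uaHeader = request.headers['user-agent'];
    const userAgent = uaHeader && uaHeader.length > 0 ? uaHeader[0].value : '';

    // Only rewrite S3 origins, and only when the user agent looks like a crawler.
    if (request.origin && request.origin.s3 && CRAWLER_PATTERN.test(userAgent)) {
        request.origin.s3.domainName = CRAWLER_ORIGIN;
        request.headers.host = [{ key: 'host', value: CRAWLER_ORIGIN }];
    }

    // All other requests continue to the origin configured on the distribution.
    callback(null, request);
};

exports.handler = routeCrawlers;
```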

This new Lambda@Edge capability allows you to use any attribute of the HTTP request, such as the URI path, headers, cookies, or query string, and set the origin accordingly. For example, you could inspect the CloudFront-Viewer-Country header to determine the location of the viewer and route their request to an origin that is closer to them.
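A minimal sketch of the country-based idea might look like the following. The country-to-bucket mapping and the bucket names are hypothetical, and the sketch assumes the CloudFront-Viewer-Country header is whitelisted on the behavior so that it appears (with a lowercase key) in the event headers.

```javascript
'use strict';

// Hypothetical mapping from viewer country code to a nearby bucket.
const ORIGIN_BY_COUNTRY = {
    GB: 'example-london-bucket.s3.amazonaws.com',
    SG: 'example-singapore-bucket.s3.amazonaws.com'
};

const routeByCountry = (event, context, callback) => {
    const request = event.Records[0].cf.request;
    const countryHeader = request.headers['cloudfront-viewer-country'];

    if (countryHeader && countryHeader.length > 0 && request.origin && request.origin.s3) {
        const country = countryHeader[0].value;
        // Only reroute for countries we have an explicit mapping for.
        if (Object.prototype.hasOwnProperty.call(ORIGIN_BY_COUNTRY, country)) {
            const target = ORIGIN_BY_COUNTRY[country];
            request.origin.s3.domainName = target;
            request.headers.host = [{ key: 'host', value: target }];
        }
    }

    // Unmapped countries fall through to the distribution's configured origin.
    callback(null, request);
};

exports.handler = routeByCountry;
```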

In this blog post I am going to focus on the last of these use cases, cookie-based stickiness, to demonstrate content-based routing. For latency and consistency reasons, you may want a user to be ‘sticky’ to the same origin throughout their journey. Weighted routing in DNS wouldn’t provide stickiness: the user would likely switch between the two origins, potentially causing an inconsistent experience.

 

Solution overview

To represent this use case I will create two Amazon S3 buckets as origins, a CloudFront distribution and an AWS Lambda function that routes a user’s request to one of the two Amazon S3 origins based on a cookie. Next, I’ll walk through the steps.

The following diagram illustrates the sequence of events for our example.

Here is how the process works:

  1. Viewer navigates to the website.
  2. CloudFront serves content from cache if available, otherwise it goes to step 3.
  3. Only after a CloudFront cache miss, the Origin Request Trigger is fired for that behavior. This triggers the Modify Origin Lambda Function to determine which origin to route the request to.
  4. CloudFront sends the request to the chosen Origin.
  5. The object is returned to CloudFront from Amazon S3, served to the viewer, and cached, if applicable.

To start, I navigate to the Amazon S3 console and create two buckets in any Region (I choose Singapore and London). I give them each a name (I use lambdaedgedemo-firstbucket and lambdaedgedemo-secondbucket for this demo). After both buckets are created, I need to add a bucket policy to allow public read access because these buckets will be serving a public website. To do this, I navigate into each bucket in turn, choose the Permissions tab, then select Bucket Policy and paste the following code (changing the bucket name):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::lambdaedgedemo-firstbucket/*"
        }
    ]
}

Note: Before you choose Save, be aware that every object in this bucket will become publicly readable.
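Because the same policy is pasted into each bucket with only the bucket name changed, a small helper can generate both documents and avoid copy-and-paste mistakes. This is only a convenience sketch: `publicReadPolicy` is a hypothetical helper name, and the output still has to be pasted into (or otherwise applied to) each bucket.

```javascript
'use strict';

// Builds the public-read bucket policy shown above for a given bucket name.
const publicReadPolicy = (bucketName) => ({
    Version: '2012-10-17',
    Statement: [
        {
            Sid: 'PublicReadGetObject',
            Effect: 'Allow',
            Principal: '*',
            Action: 's3:GetObject',
            Resource: `arn:aws:s3:::${bucketName}/*`
        }
    ]
});

// Print one policy document per demo bucket.
['lambdaedgedemo-firstbucket', 'lambdaedgedemo-secondbucket'].forEach((name) => {
    console.log(JSON.stringify(publicReadPolicy(name), null, 4));
});
```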

Now that both buckets are created, I upload a basic index.html file to each one, containing plain text that identifies the bucket when I navigate to it. I open the CloudFront console, choose Create Distribution, and select a Web distribution. For my first Origin Domain Name I select my first bucket, leave the Origin Path blank, and set my Origin ID to something descriptive.

I then temporarily set the Default TTL to 0 to make it easier to test. I set Default Root Object to index.html since I’m serving a simple HTML page with some text on it. Now I set Forward Cookies to Whitelist and add origin, which allows the origin cookie I set client-side in my browser to be forwarded to my Lambda function. Finally, I choose Create Distribution.

I have only configured one of the S3 buckets as an origin for my distribution because I’m not required to have an origin configured in order to route to it. In my Lambda function I can dynamically route my requests to any origin as long as it’s publicly accessible.

After my CloudFront distribution has changed state to Deployed, I check to make sure that the CloudFront Domain Name works and that it always serves index.html from my first bucket.

Note: If you don’t see your HTML page here or you’re being redirected, make sure that the CloudFront Distribution State (visible from the CloudFront dashboard) is Deployed.

If you’re still being redirected after the distribution state is Deployed, then you’ll need to wait until the S3 bucket’s DNS entry has propagated. To avoid this, you can create both of your buckets in the US East (N. Virginia) Region (us-east-1).

 

Creating the Lambda function

First, I go to the Lambda console and ensure I’m in the US East (N. Virginia) Region (us-east-1) by selecting it from the drop-down list in the top right. Then I choose Create Function to create a new Lambda function.

 

Note: I need to select the US-East-1 Region as the location where I create the Lambda function. Otherwise I am unable to connect to a CloudFront trigger. However, after I’ve finished with the setup the function will be replicated to all other Regions.

Next, I am presented with the option to Select a blueprint or Author from scratch. If I type in CloudFront I am presented with a range of different pre-built functions. For this solution, I choose Author from scratch because I’ll be using the code provided here for this function.

 

On the next screen I type a descriptive name, choose to create a new role from a template, use the Basic Edge Lambda permissions role as the Policy Template, and choose Create function when done.

 

Now that I have created my Lambda function, I copy the following code into the Function code box and ensure Node.js 6.10 Runtime and index.handler are selected. The bucket names in the following code need to be replaced with your origin names.

'use strict';

exports.handler = (event, context, callback) => {
    const request = event.Records[0].cf.request;
    const headers = request.headers;
    const origin = request.origin;

    // Set up the two different origins (replace these with your own bucket names)
    const originA = 'lambdaedgedemo-firstbucket.s3.amazonaws.com';
    const originB = 'lambdaedgedemo-secondbucket.s3.amazonaws.com';

    // Determine whether the user has visited before based on a cookie value.
    // Grab the 'origin' cookie if it's been set before.
    if (headers.cookie) {
        for (let i = 0; i < headers.cookie.length; i++) {
            if (headers.cookie[i].value.indexOf('origin=A') >= 0) {
                console.log('Origin A cookie found');
                headers['host'] = [{ key: 'host', value: originA }];
                origin.s3.domainName = originA;
                break;
            } else if (headers.cookie[i].value.indexOf('origin=B') >= 0) {
                console.log('Origin B cookie found');
                headers['host'] = [{ key: 'host', value: originB }];
                origin.s3.domainName = originB;
                break;
            }
        }
        // If a cookie header is present but holds neither value, the request
        // is left unmodified and goes to the distribution's configured origin.
    } else {
        // New visitor, so no cookie is set: roll the dice, weighted 75% toward origin A.
        // Could also just choose to return here rather than modifying the request.
        if (Math.random() < 0.75) {
            headers['host'] = [{ key: 'host', value: originA }];
            origin.s3.domainName = originA;
            console.log('Rolled the dice and origin A it is!');
        } else {
            headers['host'] = [{ key: 'host', value: originB }];
            origin.s3.domainName = originB;
            console.log('Rolled the dice and origin B it is!');
        }
    }

    // Return the (possibly modified) request so CloudFront forwards it to the chosen origin
    callback(null, request);
};
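One caveat with the substring checks in the function: `indexOf('origin=A')` also matches cookies such as `myorigin=A` or `origin=Apple`. With only the origin cookie whitelisted this rarely matters, but if more cookies are forwarded, a stricter lookup may be worth having. The helper below is a hypothetical sketch of one way to do that; `getCookie` is not part of any Lambda@Edge API.

```javascript
'use strict';

// Returns the value of the named cookie from a CloudFront-style cookie header
// array ([{ key, value }, ...]), or null when the cookie is absent.
const getCookie = (cookieHeaders, name) => {
    if (!cookieHeaders) {
        return null;
    }
    for (let i = 0; i < cookieHeaders.length; i++) {
        // A single Cookie header can carry several "name=value" pairs.
        const pairs = cookieHeaders[i].value.split(';');
        for (let j = 0; j < pairs.length; j++) {
            const pair = pairs[j].trim();
            const eq = pair.indexOf('=');
            // Compare the whole cookie name, not a substring of it.
            if (eq > 0 && pair.substring(0, eq) === name) {
                return pair.substring(eq + 1);
            }
        }
    }
    return null;
};

// Example: pick the routing choice from a header with several cookies.
const sample = [{ key: 'Cookie', value: 'theme=dark; origin=B' }];
console.log(getCookie(sample, 'origin'));
```

Inside the handler, the result could then be compared with exact values, for example `getCookie(headers.cookie, 'origin') === 'A'`.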

To allow me to configure a trigger for this function, I need to create a version of it. To do this, I make sure my function is saved, then I choose Actions and Publish new version, and Publish (without a name). After the version is created I can navigate to the Triggers tab and choose Add trigger.

I click on the dotted grey box and choose CloudFront. (If you can’t see CloudFront as a trigger option, make sure you’re in the US East (N. Virginia) Region, as described at the start of this section.) The following options are presented here:

Distribution ID: I’m selecting the distribution I created earlier, that serves content from my S3 bucket.

Cache Behavior: I choose *, which is the Default behavior. Because I am not creating additional behaviors in this case, this will apply to all requests. If I had created multiple behaviors, this would only be triggered if none of the other behaviors matched.

CloudFront Event: As discussed earlier, I want this to trigger after the CloudFront cache but before returning to origin, so I choose Origin Request.

Enable trigger and replicate: I check this box to enable CloudFront as a trigger for the Lambda function. When the trigger is created, this option automatically replicates my function across multiple Regions.

 

After I choose Submit, my CloudFront distribution returns to the In Progress state while the Lambda function is replicated to each Edge Location.

After my CloudFront distribution returns to Deployed status, I return to my CloudFront Domain Name and refresh the page. My Lambda function is looking for a cookie called “origin”, which isn’t set, so the function picks one of the two origins at random (weighted 75% toward origin A). If I keep refreshing the page, I should be served from one of the two origins at random.

To test that my function is reading the “origin” cookie correctly, I use the Mozilla Firefox Developer Toolbar to add a cookie called “origin” with value “B” in my browser, then refresh the page. This time I’m served only from the second bucket, no matter how many times I refresh.

I can modify the origin cookie to point only to origin A or remove it completely to allow the Lambda function to randomly pick the origin. In this example, I used a cookie called “origin” and set its value to either “A” or “B” to determine which origin to route the user to. You could also use headers or query strings for the same functionality.

 

Monitoring and debugging

Although I didn’t experience any errors in my execution here, I want to let you know where to go if you experience any problems. In the same way that I monitor any Lambda function, I can use Amazon CloudWatch Logs to monitor the execution of Lambda@Edge functions. The slight difference here is that the logs are stored in the Region closest to the location the function executes in. So, for my test, I need to look at CloudWatch Logs in the London Region. I’ll need to change Region to view the CloudWatch Logs for my Lambda function, according to where my viewers are located.
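When searching for the logs, it helps to know that Lambda@Edge log groups are named with a `us-east-1.` prefix in front of the function name, even though the log group itself lives in the Region nearest to where the function ran. The tiny helper below just builds that name; `modify-origin` is a hypothetical function name used for illustration.

```javascript
'use strict';

// Lambda@Edge log groups carry the 'us-east-1.' prefix regardless of the
// Region in which the function actually executed.
const edgeLogGroupName = (functionName) => `/aws/lambda/us-east-1.${functionName}`;

console.log(edgeLogGroupName('modify-origin'));
// prints "/aws/lambda/us-east-1.modify-origin"
```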

In this screenshot, I’ve forced an error to show you the log output:

 

I find it helpful to test my Lambda function directly in the Lambda console before I enable it to be triggered and replicate. That way I save the time it takes to create a new version, assign a trigger, visit the website, and then view the logs. To do this, I need to configure a test event in the Lambda function in the way I normally would for Lambda, and pass it a sample request or response specific to CloudFront. After I choose Save and Test, I’m presented with the output and any errors. That way I can quickly fix them.
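A test event for this function could be built along the following lines. This is a trimmed-down sketch of an Origin Request event with placeholder values (distribution ID, client IP, Region), not a complete copy of what CloudFront sends. The `runLocally` helper is hypothetical and simply invokes a handler outside the console to show which origin it picked.

```javascript
'use strict';

// Builds a minimal Origin Request test event. Pass a cookie string such as
// 'origin=B' to simulate a returning visitor, or nothing for a new visitor.
const buildTestEvent = (cookieValue) => {
    const domain = 'lambdaedgedemo-firstbucket.s3.amazonaws.com';
    const headers = { host: [{ key: 'Host', value: domain }] };
    if (cookieValue) {
        headers.cookie = [{ key: 'Cookie', value: cookieValue }];
    }
    return {
        Records: [{
            cf: {
                config: { distributionId: 'EXAMPLE', eventType: 'origin-request' },
                request: {
                    clientIp: '203.0.113.10',
                    method: 'GET',
                    uri: '/index.html',
                    querystring: '',
                    headers: headers,
                    origin: {
                        s3: {
                            authMethod: 'none',
                            customHeaders: {},
                            domainName: domain,
                            path: '',
                            region: 'eu-west-2' // placeholder Region
                        }
                    }
                }
            }
        }]
    };
};

// Invokes a Lambda-style handler locally and logs the origin it selected.
const runLocally = (handler, event) => {
    handler(event, {}, (err, request) => {
        if (err) {
            throw err;
        }
        console.log('Routed to:', request.origin.s3.domainName);
    });
};

// Stand-in pass-through handler; swap in require('./index').handler to
// exercise the real function saved as index.js.
runLocally((event, context, callback) => callback(null, event.Records[0].cf.request),
    buildTestEvent('origin=B'));
```

The JSON form of `buildTestEvent('origin=B')` can also be pasted into the Lambda console as the test event body.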

Conclusion

In this post, I showed you how to dynamically route viewer requests to any origin using Lambda@Edge. I demonstrated how to create a Lambda@Edge function, associate it with a trigger on a CloudFront distribution, and use cookies to route to one of two configured origins.

If you have any questions about this blog please post them in the Comments section below. If you have any awesome ideas for creative ways you can use Lambda@Edge, please share them in the AWS Lambda Forums.

If you’re new to Lambda@Edge, I encourage you to refer to the Getting Started with Amazon CloudFront and Getting Started with AWS Lambda@Edge documentation for more information.