Networking & Content Delivery

Leveraging external data in Lambda@Edge

Introduction

Lambda@Edge is a feature of Amazon CloudFront that allows developers to implement custom logic for manipulating HTTP request/response exchanges or generating responses on the fly with low latency. Lambda@Edge empowers our customers with a full programming language (Node.js) to implement advanced logic.

While customers often write stateless logic that is fully contained in Lambda@Edge code, there are some scenarios where it’s useful to change the behavior of a function based on external information. In the Lambda@Edge documentation, you can find several examples of stateless logic like header manipulation, content localization, and cache key normalization. In contrast, the following are some scenarios that leverage external data:

  • Datadome is a cybersecurity solution for web and mobile applications that analyzes and manages non-human traffic in real time. Datadome explains in this talk how they use Lambda@Edge to decide whether to allow or block an incoming request. The Lambda@Edge function in the implementation makes external API calls to help analyze incoming requests.
  • TrueCar is a digital automotive marketplace that provides comprehensive transparency into the prices that people have paid for their cars. TrueCar details in this article how they leverage Lambda@Edge to dynamically route requests, based on rules fetched from Amazon DynamoDB.
  • DAZN is a subscription video streaming service dedicated to sports. In this article, DAZN explains how they used Lambda@Edge for A/B testing and geo-routing. Their Lambda@Edge function uses a parameter file that it downloads from Amazon S3.
  • Disney Streaming Services is responsible for developing and operating The Walt Disney Company’s direct-to-consumer video businesses globally. In this talk, Disney shares how they use Lambda@Edge to select an origin from a server pool in a sticky way. Server selection configuration is part of their Lambda@Edge code, which is updated when a configuration changes.

In this article, I will guide you through common patterns and options for reading external data in Lambda@Edge functions. For example, you can include external data directly in your deployment package, and then update it later as your data changes. Or, you can fetch data dynamically from an external endpoint by making a network call. When you choose to make external network calls, you can improve latency by caching data in your function’s memory, leveraging the CloudFront cache, or bringing data geographically closer to Lambda@Edge.

The best pattern to use depends on your scenario. For example, how cacheable is your data? How frequently does it change? How large is the data set that you need to use? How tolerant is your application to eventual consistency? It’s important to use the right pattern because what you choose impacts your function’s execution duration, which in turn impacts the cost of your function and how it scales. For more information about optimizing how you use Lambda@Edge, see my earlier blog post, Lambda@Edge Design Best Practices.

Include your data in your function deployment package

If the data required for your scenario rarely changes, consider including the data in your function deployment package (learn more about Lambda function deployment packages). Often you can add third-party Node.js libraries to the Lambda@Edge deployment package, as well as your own files. For example, you can add binary files, like small SQLite databases, or text files, like a JSON file.

As an example, the following Lambda@Edge function, triggered on an origin request event, implements a low-latency redirection mechanism. When the function is triggered, it checks whether the request path matches a source value in a list of regular expressions stored in the file redirects.json. If there’s a match, Lambda@Edge redirects the user to the associated destination listed in the file. If there’s no match, the function forwards the request to the origin unchanged.

const redirects = require('./redirects.json').map(
  ({ source, destination }) => ({
    source: new RegExp(source),
    destination
  })
);
 
exports.handler = async event => {
  const request = event.Records[0].cf.request;
 
  for (const { source, destination } of redirects) {
    if (source.test(request.uri)) {
      return {
        status: '302',
        statusDescription: 'Found',
        headers: {
          location: [{ value: destination }]
        }
      };
    }
  }
 
  return request;
};
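
For reference, a hypothetical redirects.json for this function might look like the following (the paths are illustrative); each source is a regular expression that is matched against the request URI:

```json
[
  { "source": "^/old-blog(/.*)?$", "destination": "/blog" },
  { "source": "^/promo$", "destination": "/sale/current" }
]
```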

Although it’s convenient to include the redirection data in the package, the downside of this approach is that you must update and redeploy your function package every time the data changes. A function update takes a few seconds (though it can sometimes take up to several minutes) to propagate to all CloudFront edge locations. During this short delay, some edge locations continue to execute your function with the earlier data until propagation completes across CloudFront. This means that if you use this method, your application must be tolerant of eventually consistent data.

TIP: You can manage function package updates more easily by building a CI/CD pipeline to automate the code changes, triggered by data changes. The steps for setting up a pipeline are described in detail in this blog post. If your company has different teams that manage code and data, setting up this sort of automated pipeline requires proactive ongoing coordination between the teams.

A final consideration for including data in a package is the size of the data. The compressed size of a Lambda@Edge deployment package can’t exceed the current limits of 50 MB for functions triggered by origin events and 1 MB for functions triggered by viewer events.

Fetch the data using network calls in your Lambda@Edge function

If it’s not workable for your use case to include the data in your deployment package, you can instead make a network call from your Lambda@Edge function to fetch the data. Network calls go to an external endpoint and fetch the data required by your function.

I’ll illustrate this by modifying the redirection example in the previous section to decouple the Lambda@Edge code from the redirection data. In the following updated code, I make an HTTP call in the fetchRedirections function, which downloads the redirection data from an Amazon S3 bucket. Note that the IAM role that your function assumes must allow read access to the S3 bucket.

const aws = require('aws-sdk');
const s3 = new aws.S3({ region: 'us-east-1' });
const s3Params = {
  Bucket: 'redirections-configuration',
  Key: 'redirects.json',
};
 
async function fetchRedirections() {
  const response = await s3.getObject(s3Params).promise();
  return JSON.parse(response.Body.toString('utf-8')).map(
    ({ source, destination }) => ({
      source: new RegExp(source),
      destination
    })
  );
}
 
exports.handler = async event => {
  const request = event.Records[0].cf.request;
 
  try {
    const redirects = await fetchRedirections();
 
    for (const { source, destination } of redirects) {
      if (source.test(request.uri)) {
        return {
          status: '302',
          statusDescription: 'Found',
          headers: {
            location: [{ value: destination }],
          },
        };
      }
    }
    
    return request;
    
  } catch (_error) {
    // If the S3 call fails, fail open: forward the request to the origin.
    return request;
  }
};
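
For the S3 call above to succeed, the function’s execution role needs read access to the object. A minimal policy statement could look like the following (the bucket and key names match the example; scope the resource to your own bucket):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::redirections-configuration/redirects.json"
    }
  ]
}
```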

Be aware that when you make an external network call like this, it adds latency and the function takes longer to run. You can mitigate this latency by using several different methods, such as caching and persistent connections. In the following sections, I share some of these best practices for optimizing network calls and reducing latency, and illustrate each method by updating the redirection example.

I – Cache your data in Lambda@Edge Memory

When your data is unlikely to change across many Lambda@Edge invocations, consider caching the network call result in your Lambda@Edge instance by using global variables. Global variables that you declare outside of the function handler persist across multiple Lambda@Edge invocations.

To illustrate how caching works in the redirection example, I download the redirection data from Amazon S3, and then store it for 5 seconds in a global variable, redirections. On every invocation, the fetchRedirections function checks whether redirections is initialized and still fresh. If the data is missing or stale, the function makes the network call to S3 to fetch a new set of data.

const aws = require('aws-sdk');
const s3 = new aws.S3({ region: 'us-east-1' });
const s3Params = {
  Bucket: 'redirections-configuration',
  Key: 'redirects.json',
};
const TTL = 5000; // TTL of 5 seconds
 
async function fetchRedirectionsFromS3() {
  const response = await s3.getObject(s3Params).promise();
  return JSON.parse(response.Body.toString('utf-8')).map(
    ({ source, destination }) => ({
      source: new RegExp(source),
      destination,
    })
  );
}
 
let redirections;
function fetchRedirections() {
  if (!redirections) {
    redirections = fetchRedirectionsFromS3();
 
    setTimeout(() => {
      redirections = undefined;
    }, TTL);
  }
 
  return redirections;
}
 
exports.handler = async event => {
  const request = event.Records[0].cf.request;
 
  try {
    const redirects = await fetchRedirections();
 
    for (const { source, destination } of redirects) {
      if (source.test(request.uri)) {
        return {
          status: '302',
          statusDescription: 'Found',
          headers: {
            location: [{ value: destination }],
          },
        };
      }
    }
    
    return request;
    
  } catch (_error) {
    return request;
  }
};

When you use a global variable to cache data, you can adjust the caching duration in the Lambda@Edge instance to align with your application’s tolerance for eventual consistency. Depending on the amount of data that you need to cache, you can also increase the memory allocated to your function if it’s triggered by CloudFront origin events (functions triggered by viewer events are limited to 128 MB of memory).
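
If you prefer not to rely on timers, a timestamp-based check is an equivalent way to implement the same in-memory TTL. The following sketch stubs out the S3 call with an injected function so it’s self-contained; in the example above, the injected function would be fetchRedirectionsFromS3:

```javascript
'use strict';

const TTL = 5000; // TTL of 5 seconds, in milliseconds

// Global cache: survives across invocations of the same instance.
let cache = { data: undefined, fetchedAt: 0 };

// Returns cached data if it's still fresh; otherwise refetches it.
// Note: unlike the promise-based version above, concurrent cold calls
// may each trigger a fetch before the first one resolves.
async function fetchRedirections(fetchFromOrigin) {
  const now = Date.now();
  if (!cache.data || now - cache.fetchedAt > TTL) {
    cache = { data: await fetchFromOrigin(), fetchedAt: now };
  }
  return cache.data;
}
```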

II – Use Persistent connections

Another method for reducing latency is to establish a persistent connection to your external endpoint that is maintained across Lambda@Edge invocations. This works well when the time that elapses between your network calls is shorter than the keep-alive timeout of your endpoint. A persistent connection lets you avoid re-establishing the TCP/TLS connection on every invocation, which removes the latency cost of the repeated handshakes.

You can implement a persistent connection by doing the following:

  1. Declare an HTTP/S keep-alive agent in the global scope, for example:
    const https = require('https');
    const keepAliveAgent = new https.Agent({keepAlive: true});
  2. Use the keep-alive agent in each HTTP request as follows:
    • If you download data from an AWS service, like S3 or DynamoDB, supply the keep-alive agent to the constructor of the AWS service client or to the AWS SDK global configuration. Connections are not persistent by default in the current release (v2) of the AWS JavaScript SDK. To supply a keep-alive agent to S3, for example, you could construct the client as follows:
      const s3 = new aws.S3({region: 'us-east-1', httpOptions: {agent: keepAliveAgent}});
    • If you download data from a source outside of AWS, pass the keep-alive agent in the options of the get method when you make the network call. For example, you could use the following call:
      http.get({ hostname: 'example.com', path: '/', agent: keepAliveAgent}, (response) => {...

III – Cache your data In CloudFront

Another option to reduce latency is to use the CloudFront cache. When your data is cacheable, consider fetching it through CloudFront to leverage CloudFront’s distributed cache. In general, you can use this approach to complement in-memory caching. However, in some cases you can only use CloudFront caching, for example, when the required data is too large to fit in Lambda@Edge memory.

As an example of caching with CloudFront, let’s say that you want to dynamically generate web pages by populating static HTML templates hosted on Amazon S3 with user-specific information. Instead of downloading the HTML templates directly from S3, you can download them through a CloudFront distribution. You can use the same CloudFront distribution for this purpose that you have associated with your Lambda@Edge function, but make sure that you use a different CloudFront cache behavior than the one that triggers your function, to avoid infinite loops.

In the following CloudFront configuration, I created four cache behaviors pointing to an S3 bucket for static content (that is, html, css, jpg, and js content), and I enabled Lambda@Edge only on the default cache behavior. In my Lambda@Edge function, I can safely download HTML files from the same CloudFront domain name that is exposed to users.

Note that when you use this approach for caching, requests made from your Lambda@Edge function are counted in your CloudFront costs and reports.

IV – Bring your data closer to your viewers

In some scenarios, you can’t cache your data because it’s unique for every user or it’s not popular enough to stay cached. If that’s the case, you can reduce latency by bringing your data closer to your viewers. The overall approach is to replicate your endpoint to multiple geographies, and then make a network call from Lambda@Edge to the nearest endpoint. You have several options for how to get the data from the closer location, which I explain in this section.

First, replicate your endpoint to a subset of locations where CloudFront has regional edge caches. (To see a list of the CloudFront regional edge cache locations, see Amazon CloudFront Infrastructure.) The subset that you choose depends on where your customers are located, and on the trade-off between your infrastructure cost and the latency reduction that you want to achieve.

For example, TrueCar leverages DynamoDB global tables to manage replication automatically. TrueCar replicates data to us-east-1 and us-west-2 to serve their customers in different areas of the United States, as shown in the following diagram.

Next, you need to access the data from the nearest location. The first option is to use Route 53. When your replicated endpoints can share the same domain name, for example, when you use Amazon API Gateway, consider leveraging Route 53 latency-based routing to resolve the domain name to the nearest endpoint. With this method, your Lambda@Edge code can simply make an HTTP request to your domain without implementing any routing logic.

If your endpoints can’t use the same domain name—for example, when you have DynamoDB or S3 endpoints—the second option is to use the AWS_REGION environment variable in your Lambda@Edge code to route requests. To illustrate this, I’ll once again modify the redirections example. This time, I’ll store the redirection data in a DynamoDB global table that I’ve replicated in us-east-1, us-east-2, us-west-2, eu-west-2, and eu-central-1. Replicating to these AWS Regions optimizes the latency for my viewers in the United States and Europe. When a request comes from another location, I’ll use us-east-1 by default.

The following code shows how I implemented this example using replicated endpoints:

const aws = require('aws-sdk');
const https = require('https');
const { AWS_REGION } = process.env;
const replicatedRegions = {
  'us-east-1': true,
  'us-east-2': true,
  'us-west-2': true,
  'eu-west-2': true,
  'eu-central-1': true,
};
 
const documentClient = new aws.DynamoDB.DocumentClient({
  apiVersion: '2012-08-10',
  region: replicatedRegions[AWS_REGION] ? AWS_REGION : 'us-east-1',
  httpOptions: {
    agent: new https.Agent({
      keepAlive: true,
    }),
  },
});
 
exports.handler = async event => {
  const request = event.Records[0].cf.request;

  try {
    const data = await documentClient
      .get({
        TableName: 'RedirectionsTable',
        Key: {
          path: request.uri,
        },
      })
      .promise();

    if (!(data && data.Item && data.Item.redirection)) return request;

    return {
      status: '302',
      statusDescription: 'Found',
      headers: {
        location: [{ value: data.Item.redirection }],
      },
    };
  } catch (_error) {
    // If the DynamoDB call fails, fail open: forward the request to the origin.
    return request;
  }
};

V – Additional Optimizations

In addition to the latency-reducing strategies that I shared in previous sections, you can also do the following to help optimize your Lambda@Edge implementation:

  • Leverage the AWS network by using AWS-based endpoints like Amazon S3, DynamoDB, API Gateway, ELB, and so on. When you connect to AWS resources, your application traffic travels over the high-quality network that Amazon monitors for both availability and low latency. This monitoring has the additional benefit of keeping error rates low and TCP window sizes high.
  • Handle HTTP request failures in your function. If you don’t anticipate and handle HTTP errors, your function can wait for a response that will never arrive because something went wrong on the wire. To manage this, set a timeout on your network call, and then fall back gracefully in your code.
  • Reduce the size of retrieved data. For example, you can use S3 Select or filters in DynamoDB queries to get only the data that is relevant for your logic. Another option is to compress the data by using gzip on your endpoint, and then unzip it in your function with the zlib library.

Conclusion

No single pattern for accessing external data is optimal for all use cases, but odds are that one of the options described in this post will work for your scenario. In each case, take into consideration the amount of data needed, how tolerant your application is to eventual consistency, and the overall costs. To summarize, depending on your data’s cacheability, you can fetch data from a geographically replicated endpoint, retrieve it through the CloudFront cache, or get it from Lambda@Edge memory. When your data doesn’t change often, a great option is to simply include it in your function deployment package. Finally, if your function makes frequent network calls, make sure that you optimize those calls.