
Handling Redirects@Edge Part 2

Continuing our series on Handling Redirects@Edge, this blog post explores how you can use Amazon CloudFront, Lambda@Edge and Amazon Simple Storage Service (S3) to offload URL redirection from the origin, with more advanced capabilities.

As part of this solution, we offer a custom-built user interface to define and manage URL redirections. It is simple to set up and use, even for non-technical users. For example, your marketing teams can now self-serve their requests to set up vanity URLs for SEO (search engine optimization) without depending on their development teams.

Proposed Architecture

The diagram below depicts the architecture of the redirection workflow. When a viewer accesses a URL through Amazon CloudFront for which a redirection is defined, the Lambda@Edge Origin-Request function generates a redirect response. The generated response is then cached for subsequent requests. The redirection rules are stored in a JSON file in Amazon Simple Storage Service (S3). If no redirection is defined for a given URI, CloudFront fetches the object from the origin.

Here we have:

  • A CloudFront distribution to serve traffic.
  • ‘Origin-Request’ Lambda@Edge function.
  • Amazon S3 bucket which holds the redirection rules.
  • Amazon S3 bucket to serve as the origin for CloudFront (when no redirect applies)

Let’s walk through the steps a request goes through.

  • Step 1: Viewer makes a request for a URL.
  • Step 2: Origin-Request Lambda@Edge function retrieves the redirect rules JSON file from the S3 bucket on a defined schedule set in the function.
  • Step 3: Once the JSON rules file has been retrieved from S3, the Lambda@Edge function will check if the incoming URL matches an active redirection rule. If there is a match, the function generates a redirect response which is sent back to CloudFront for delivery. The request is not sent to the origin.
  • Step 4 & 5: If there is no matching rule, CloudFront fetches the object from the origin.
  • Step 6: The object (redirection or content from origin) is served to viewer and cached for subsequent requests.

Note: Steps 2, 3, 4 and 5 are executed only when the object is stale (expired) or does not exist in the cache.

All the redirection rules are defined in redirector.json file and stored in a pre-configured Amazon S3 bucket. A sample format is explained below:

Rules File:

{
   "rules":[
      {
         "original":"/index1.html",
         "redirect":"/index2.html?test=1",
         "statusCode":"302",
         "startTime":"2017-10-23T13:15",
         "endTime":""
      }],
   "wildcards":[
      {
         "original":"/index6/*",
         "redirect":"/index7/*",
         "statusCode":"301",
         "startTime":"2017-10-23T01:01",
         "endTime":"2017-10-25T01:01"
      }],
   "querystrings":[
      {
         "original":"campaign=1",
         "redirect":"/index4.html?test",
         "statusCode":"301",
         "startTime":"",
         "endTime":""
      }]
}

The current implementation supports exact, wildcard and query string matching. These rules are grouped under the “rules”, “wildcards” and “querystrings” nodes respectively inside the redirector.json file. The attributes of a single rule are defined as follows:

"original": required, holds the incoming URI, the pattern (for a wildcard rule) or the key=value pair (for a query string rule),
"redirect": required, holds the destination URI to redirect to,
"statusCode": required, specifies whether it is a permanent (301) or temporary (302) redirect,
"startTime": optional, start date and time of the rule,
"endTime": optional, end date and time of the rule

If you are comfortable editing and maintaining the rules file in your favorite editor, you can upload your version to the root of the S3 bucket created by the stack (name pattern {StackName}-rulebucket-{RandomID}). In that case, you can skip deploying the User Interface when launching the CloudFormation template.
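
Before uploading a hand-edited file, a small check along the following lines could catch malformed rules. This is only a sketch: validateRule is a hypothetical helper that is not part of the repository, and it assumes timestamps in the format used by the sample file.

```javascript
'use strict';

// Hypothetical helper (not part of the repository): checks one rule object
// against the attribute definitions above and returns a list of problems.
function validateRule(rule) {
  const errors = [];

  // "original" and "redirect" are required, non-empty strings.
  for (const key of ['original', 'redirect']) {
    if (typeof rule[key] !== 'string' || rule[key].length === 0) {
      errors.push(`"${key}" is required`);
    }
  }

  // "statusCode" is required and must be 301 (permanent) or 302 (temporary).
  if (!['301', '302'].includes(String(rule.statusCode))) {
    errors.push('"statusCode" must be 301 or 302');
  }

  // "startTime" and "endTime" are optional; an empty string means "not set".
  for (const key of ['startTime', 'endTime']) {
    const value = rule[key];
    if (value && Number.isNaN(Date.parse(value))) {
      errors.push(`"${key}" is not a parsable date`);
    }
  }

  return errors;
}
```

An empty array means the rule passed every check; otherwise each entry names the attribute that needs attention.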

If you need the user interface to assist in rule definitions, then the following section provides a walkthrough of the functionality of the interface. Otherwise, you can deep dive into the code snippets section.

The user interface setup

To ease the management of rules, a simple user interface is provided. Access to this console is restricted: only authenticated Amazon Cognito users can update the rule definitions, as shown in the diagram below. The user credentials, with an appropriate role, are also created if you opt in to deploy the user interface in the CloudFormation template.

The user interface is hosted from an S3 bucket. The ‘Deploy’ Lambda function unzips the bundle, sets appropriate object ACLs and modifies the configuration to match your specific deployment. This utility function is invoked as a custom resource from the CloudFormation template. It is also used to create the Cognito user credentials and deploy the user interface. To review the code and learn more, please follow the link below:

https://github.com/aws-samples/aws-lambda-redirection-at-edge

Walkthrough of the User Interface

Login screen:

Log in at the User Interface URL using the generated username and password. During the first login, you will be prompted to change the password. Please note that the new password must conform to your password policy. Once logged in, you will be taken to the main Rules Manager console.

Rules console:
Here you can add rules under three different sections, namely ‘Rules’, ‘Wildcards’ and ‘QueryString’.

Examples:

  • If you want to set up a simple redirect from /index.html to /newindex.html, define it under the ‘Rules’ section by clicking the ‘+’ sign and entering the values as shown below:

  • If you want to define a wildcard rule, say to redirect all URLs under /oldpath/* to /newpath/*, define it under the ‘Wildcards’ section as shown below:

Each rule needs the Original and Redirect URIs, the HTTP response code (either 301 for a permanent or 302 for a temporary redirect) and, optionally, the validity period of the rule. Once the rules are defined, click ‘Save’ to create a new version of the redirector.json file.

We use the S3 object versioning feature to maintain previous versions of the rules. This lets us quickly revert to an earlier version if need be, by selecting it from the ‘Select Version’ drop-down and clicking ‘Save’ again as shown below:

Looking deeper at Lambda@Edge function

Origin-Request function (code snippet 1)

The Lambda function below is attached to the ‘Origin-Request’ trigger, which fires only when CloudFront needs to fetch the object from the origin. This happens when the object is not in the cache or the cached copy has expired. To learn more about the available Lambda@Edge triggers, please refer to https://docs.aws.amazon.com/lambda/latest/dg/lambda-edge.html

'use strict';

const utils = require('./utils');
const http = require('https');
const rules = require('./rules');
const AWS = require('aws-sdk');
const S3 = new AWS.S3({
  signatureVersion: 'v4',
});

//initialize the rules engine
const R = rules.init();

let redirectorJson = null;
let lastUpdatedTime = 0;

//number of seconds before updating the redirection rules file.
let intervalBetweenUpdates = 60;

exports.handler = (event, context, callback) => {
  const request = event.Records[0].cf.request;
  const uri = request.uri;

  const headers = request.headers;
  const querystring = request.querystring;
  const originHost = headers['host'][0].value;
  const customHeaders = request.origin.s3.customHeaders;
  
  //read the bucket which holds the redirection rule from the custom 
  //origin header passed from CloudFront 
  const rulesBucket = customHeaders.rules_bucket[0].value;
  //read the redirection file name from the custom origin header
  //passed from CloudFront
  const rulesFile = customHeaders.rules_file[0].value;

  //clear the custom headers before sending to origin
  delete request.origin.s3.customHeaders.rules_bucket;
  delete request.origin.s3.customHeaders.rules_file;

  syncRedirectionRule(rulesBucket, rulesFile)
    .then(() => {

      // if no redirection rules are defined, then simply pass on the request.
      if (!redirectorJson) {
        console.log("No redirection rule exists..: %j", request);
        callback(null, request);
        return;
      }

      //define common variables
      let redirectUrl;
      let statusCode;
      let response;

     //fact variable contains attributes needed to evaluate if an active rule exist
      let fact = {
        "host": originHost,
        "uri": uri,
        "querystring": querystring,
        "ruleset": redirectorJson
      };

      //Now pass the fact on to the rule engine for results
      R.execute(fact, function(result) {
        //if a match is found
        if (result.result) {
          //calculate the Cache-Control max-age header value
          let maxAge = utils.calculateMaxAge(result.redirectRule.endTime);
          //generate response
          callback(null, utils.generateResponse(result.redirectRule.redirect,
            result.redirectRule.statusCode, maxAge));
        } else {
          console.log("no redirect present..: %j", request);
          callback(null, request);
        }
      });
    })
    .catch(err => {
      console.log("Error in OriginRequestFunction: %j", err);
      //if there is an error we simply allow the request to pass to origin
      callback(null, request);
    });
}


// the function loads and refreshes the redirector.json file periodically
// to fetch the rule definitions.
function syncRedirectionRule(ruleHost, ruleFile) {

...
 Refer to Origin Request Lambda@Edge function in repository for complete code

Dependencies: We use the open source node-rules module, a lightweight forward-chaining rule engine.

Explanation:
In the above code, the rules engine is initialized outside of the function handler to keep a global reference across invocations and reduce execution time on each request. The Lambda function periodically synchronizes the redirect rules JSON file by invoking the syncRedirectionRule function. The bucket containing the redirection rules and the redirection file name (redirector.json) are passed as custom origin headers from CloudFront as shown below.

The refresh frequency (default 60 sec) is controlled by the variable intervalBetweenUpdates (in sec) maintained in the same rules file. This refresh frequency can also be updated from the User Interface.
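
The periodic refresh can be pictured with a small time-based cache. The sketch below is not the repository's syncRedirectionRule implementation. createRuleCache, loadRules and now are hypothetical names, and the loader is injected so that the example runs without an S3 call.

```javascript
'use strict';

// Hypothetical sketch: keeps a rules object in a closure and reloads it only
// after `intervalSeconds` have elapsed. `loadRules` stands in for the S3 read
// and `now` for the clock; both are assumptions made for illustration.
function createRuleCache(loadRules, intervalSeconds, now = () => Date.now()) {
  let cachedRules = null;
  let lastUpdatedTime = 0;

  return async function getRules() {
    const ageSeconds = (now() - lastUpdatedTime) / 1000;
    if (cachedRules !== null && ageSeconds < intervalSeconds) {
      return cachedRules; // still fresh, skip the reload
    }
    try {
      cachedRules = await loadRules();
      lastUpdatedTime = now();
    } catch (err) {
      // Keep serving the previous copy (possibly null) if the reload fails.
      console.log('Rule refresh failed: %s', err.message);
    }
    return cachedRules;
  };
}
```

Because the cache lives outside the handler, warm invocations within the interval reuse the rules already in memory and skip the reload.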

The incoming URI and query parameters are read from the request object within the Lambda@Edge function. These parameters are sent to the rules engine to identify if a matching active rule definition exists. If a matching rule is found, we proceed with response generation with following redirect information:

  1. Cache-control header based on the identified rule’s time to live (TTL) or set to a default of 1 day (86400 sec) if the rule does not specify an expiry.
  2. Appropriate status codes (either 301 or 302)
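
For illustration, a redirect response carrying these two pieces of information could be built roughly as follows. buildRedirectResponse is a hypothetical name, and the repository's utils.generateResponse() may differ. The object shape follows the response format Lambda@Edge expects from a function that generates its own response.

```javascript
'use strict';

// Hypothetical sketch of a generated redirect response. The default max-age
// of 86400 seconds mirrors the one-day default described above.
function buildRedirectResponse(location, statusCode, maxAgeSeconds = 86400) {
  const status = String(statusCode);
  return {
    status,
    statusDescription: status === '301' ? 'Moved Permanently' : 'Found',
    headers: {
      // Header names are lowercase keys; each value is a list of key/value pairs.
      location: [{ key: 'Location', value: location }],
      'cache-control': [{ key: 'Cache-Control', value: `max-age=${maxAgeSeconds}` }],
    },
  };
}
```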

The dependent rules engine and utility functions are explained below.

Rules Engine (code snippet 2)

The rule engine definition file is maintained in rules.js.

'use strict';
const RuleEngine = require('node-rules');
const utils = require('./utils');
var _ = require('underscore');

module.exports = rules;
rules.init = init;

function rules() {
  var self = this;
  return self;
}

//define the rules
var config = [{
    //rule to check if a path match exists?
    "priority": 9,
    "Id": "match-path",
    "condition": function(R) {
      // console.log("In condition path :%s", this.uri);

      let ruleFound = _.find(this.ruleset.rules, function(rule) {
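        //compare the complete URI; String.match() would treat the URI as a
        //regular expression and also accept partial matches
        let tempRule = rule.original === this.uri;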
        //if there is a match check if time is valid
        if (tempRule && utils.isRuleValid(rule.startTime, rule.endTime)){
          return rule;
        }
        return false;
      }, this);

      if (ruleFound) {
        // console.log("path rule found :%j", ruleFound);
        this.redirectRule = ruleFound;
      }
      R.when(ruleFound);
    },
    "consequence": function(R) {
      //console.log("calling next of host :%j",this.redirectRule);
      this.result = true;
      R.stop();
    },
  },
  {
    //rule to check if a wildcard path match exists?
    "priority": 8,
    "Id": "match-wildcard",
    "condition": function(R) {
      // console.log("In condition wildcard :%s", this.uri);
      let ruleFound = _.find(this.ruleset.wildcards, function(rule) {
        //regex to match path starts with..case-insensitive
        let regex = new RegExp('^' + rule.original.replace("*", ".*"), 'i');
        let tempRule = regex.test(this.uri);
        //if there is a match check if time is valid
        if (tempRule && utils.isRuleValid(rule.startTime, rule.endTime)){
          return rule;
        }
        return false;

      }, this);

      if (ruleFound) {
        let matchedRule = {};
        matchedRule.original = ruleFound.original;
        matchedRule.redirect = ruleFound.redirect;
        matchedRule.statusCode = ruleFound.statusCode;
        // console.log("wildcard rule found :%j", matchedRule);
        if (ruleFound.redirect.endsWith("*")) {
          let regex = new RegExp('^' + matchedRule.original, 'i');
          let match = this.uri.match(regex);
          // console.log("match :", match);
          matchedRule.redirect = matchedRule.redirect.replace("*", this.uri.replace(match, ""));
        }
        this.redirectRule = matchedRule;
      }
      R.when(ruleFound);
    },
    "consequence": function(R) {
      //console.log("calling next of host :%j",this.redirectRule);
      this.result = true;
      R.stop();
    },
  },
  {
    //rule to check if querystring match exists?
    "priority": 7,
    "Id": "match-querystring",
    "condition": function(R) {
      // console.log("In condition querystring :%s", this.querystring);

      let ruleFound = _.find(this.ruleset.querystrings, function(rule) {
        let tempRule = this.querystring.includes(rule.original);
        if (tempRule && utils.isRuleValid(rule.startTime, rule.endTime))
          return tempRule;
      }, this);

      if (ruleFound) {
        // console.log("querystring rule found :%j", ruleFound);
        this.redirectRule = ruleFound;
      }

      R.when(ruleFound);
    },
    "consequence": function(R) {
      //console.log("calling next of host :%j",this.redirectRule);
      this.result = true;
      R.stop();
    },
  },
  {
    //default rule which implies there is no matching redirection rule defined
    "priority": 6,
    "Id": "match-default",
    "condition": function(R) {
      // console.log("In condition norule");
      R.when(true);
    },
    "consequence": function(R) {
      this.result = false;
      this.redirectRule = null;
      R.stop();
    },
  }
];

//initialize the rule engine
function init() {
  //'ignoreFactChanges' is set to 'true' to ignore any changes made to the 'fact' construct
  //passed while evaluating it against a rule. 'false' implies that if there is a change to
  //'fact' then the rule is re-evaluated.
  return new RuleEngine(config, { ignoreFactChanges: true });
}

A rule definition consists of a ‘condition’ and its corresponding ‘consequence’. If a rule’s ‘condition’ matches the ‘fact’ you pass into the engine, the associated ‘consequence’ is run. The order in which rules are evaluated is controlled by the ‘priority’ attribute, in decreasing order. For example, a rule with priority ‘9’ is invoked before a rule with priority ‘8’, and so on.

In the above code, the rules engine tries to find a matching rule in the following precedence of rule type:

  1. Exact URI match, where the complete URI is checked for a matching rule.
  2. Wildcard URI match, where part of URI is checked for a matching rule.
  3. Query string match, where only query parameter and value are matched.

This precedence can be altered by modifying the priority attribute in rules.js. Once a matching rule is found, the rule’s validity is checked based on start time and end time attributes. With this you can schedule your redirection rules well in advance and not have to worry about setting it up just in time. Once an active rule is found further checks for other subsequent rule matches are aborted.

Note: The precedence of rules within a rule type is defined by the order in which you define them.
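
The same precedence can be sketched without the node-rules dependency. The function below is a simplified illustration, not the code from rules.js: findRedirect is a hypothetical name, the time-window check is left out for brevity, and the ruleset keys follow the ones read by rules.js.

```javascript
'use strict';

// Hypothetical sketch: evaluates rule types in decreasing priority and
// returns the first match, or null if no rule applies. Within a rule type,
// rules are tried in the order in which they are defined.
function findRedirect(ruleset, uri, querystring) {
  // 1. Exact match on the complete URI.
  const exact = (ruleset.rules || []).find((rule) => rule.original === uri);
  if (exact) return { redirect: exact.redirect, statusCode: exact.statusCode };

  // 2. Wildcard match: the part of "original" before "*" is treated as a prefix.
  for (const rule of ruleset.wildcards || []) {
    const prefix = rule.original.split('*')[0];
    if (uri.toLowerCase().startsWith(prefix.toLowerCase())) {
      const rest = uri.slice(prefix.length);
      // Carry the remainder of the path over when the target also ends in "*".
      const redirect = rule.redirect.endsWith('*')
        ? rule.redirect.slice(0, -1) + rest
        : rule.redirect;
      return { redirect, statusCode: rule.statusCode };
    }
  }

  // 3. Query string match on a key=value fragment.
  const byQuery = (ruleset.querystrings || []).find((rule) => querystring.includes(rule.original));
  if (byQuery) return { redirect: byQuery.redirect, statusCode: byQuery.statusCode };

  return null;
}
```

With the sample rules shown earlier, a request for /index6/a/b.html would resolve to /index7/a/b.html, while a request for /index1.html carrying campaign=1 would still be handled by the exact rule because it is evaluated first.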

Utilities function

The utils.js file contains following helper functions:

  1. generateResponse() – helps in generating a response.
  2. isRuleValid() – checks whether the matched rule is valid under various scenarios explained below:
    • No Start and End Time specified – Rule is always valid.
    • Only Start Time is specified – Rule is valid if the current server time is later than the specified start time.
    • Only End Time is specified – Rule is valid if the current server time is earlier than the specified end time.
    • Both Start and End Time are specified – Rule is valid if the current server time is within the specified range.
  3. calculateMaxAge() – calculates the max-age cache control header value.

Please refer to the utils.js file in the repository for the complete code.
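
As a rough illustration of the validity scenarios and the max-age calculation, the two helpers could look like the sketch below. The names isRuleActive and secondsUntilExpiry are hypothetical, and deriving max-age from the time remaining until the rule's end time is an assumption; the repository's utils.js may compute it differently.

```javascript
'use strict';

const DEFAULT_MAX_AGE = 86400; // one day, the default described earlier

// Hypothetical sketch: an empty or missing bound is treated as "not set".
function isRuleActive(startTime, endTime, now = new Date()) {
  const afterStart = !startTime || now > new Date(startTime);
  const beforeEnd = !endTime || now < new Date(endTime);
  return afterStart && beforeEnd;
}

// Hypothetical sketch: seconds until the rule expires, or the default when
// no end time is set. Never returns a negative value.
function secondsUntilExpiry(endTime, now = new Date()) {
  if (!endTime) return DEFAULT_MAX_AGE;
  const remaining = Math.floor((new Date(endTime) - now) / 1000);
  return Math.max(remaining, 0);
}
```

Passing the current time as a parameter keeps both helpers easy to test with fixed dates.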

Build and Deploy the solution

The complete codebase can be accessed from GitHub repo

https://github.com/aws-samples/aws-lambda-redirection-at-edge

We need to build the required dependencies and package them for deployment using CloudFormation. The build scripts and steps are described in the README.

As part of the deployment some key AWS resources created are:

  1. Origin Request Lambda function which handles the redirection rules logic.
  2. S3 bucket that hosts the User Interface, into which its dependencies are deployed.
  3. Amazon Cognito user credentials for authenticated access to User Interface.
  4. CloudFront distribution with a single behavior configured to invoke the Lambda function using the ‘Origin Request’ trigger.

Once the deployment status turns to ‘CREATE COMPLETE’, the output tab of the deployed template will contain the UserInterface URL, Username and temporary Password along with the CloudFront distribution on which this redirect module is configured as shown below:

In our case, we are using an S3 bucket as the origin for CloudFront to simplify the deployment. In real-world scenarios, you would typically serve your dynamic content through backend applications hosted behind a load balancer.

Testing the solution

  1. Define your redirection rules using the User Interface. You should see them updated into the redirector.json file located in the root of the S3 bucket which holds the deployed UI code.
  2. Open your browser and navigate to a URL to check whether redirection is working as intended.

Note: It may take up to the configured ‘Refresh Time’ for the rules to reflect on your distribution.

Troubleshooting

  1. UI is not responsive – this could be due to a prolonged period of inactivity that invalidated the session. Refresh your browser window and log in again.
  2. Redirect rules are not working – depending on your ‘Refresh Time’ setting, it will take a while for the rules to reflect.
  3. Modification to rule not working – If your rule definition affects a URL recently served by CloudFront, then additionally you may have to invalidate the cache for that URL for changes to take effect immediately.

To learn more on cache invalidation in CloudFront please follow

https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Invalidation.html

When you create a trigger, Lambda@Edge automatically starts sending logs to Amazon CloudWatch Logs in the AWS Region closest to the location where the function is executed. The log group name has the format ‘/aws/lambda/us-east-1.function-name’, where function-name is the logical name of the function created when you deployed the CloudFormation template.

To see the Regions where your Lambda@Edge function is receiving traffic, view graphs of metrics for the CloudFront distribution on the AWS console. Metrics are displayed for each AWS Region. On the same page, you can choose a Region and then view log files for that Region so that you can investigate issues.

Summary

In this blog post, we explored a way to handle redirections at the edge using Lambda@Edge functions. We saw how to define and manage a simple rule set using the user interface deployed as part of the solution. Because the redirections are handled closer to the viewer, we reduce response latency and relieve the origin(s) of managing and synchronizing redirection rules across web servers. This also reduces origin bandwidth and server compute cycles.

If you’re new to Amazon CloudFront and Lambda@Edge, I encourage you to refer to Getting Started with Amazon CloudFront and Getting Started with Lambda@Edge documentation for more information on how to get started with our services today. To help debug and troubleshoot Lambda@Edge function please refer here.