Dynamic Request Routing in Multi-tenant Systems with Amazon CloudFront

In this blog post, we share how OutSystems designed a globally distributed serverless request routing service for their multi-tenant architecture. Their approach shows how you can benefit from a managed solution that’s scalable and requires little operational effort. Specifically, we explain how to select the origin serving an HTTP/S request using Lambda@Edge, including the capability to call an external API to make a dynamic selection.

Lambda@Edge is an extension of AWS Lambda that lets you run functions to customize the content that Amazon CloudFront delivers. Lambda@Edge scales automatically, from a few requests per day to many thousands of requests per second. Processing requests at AWS locations closer to the viewer, instead of on origin servers, reduces latency and improves the user experience. You can run Lambda functions when the following CloudFront events occur:

When CloudFront receives a request from a viewer (viewer request)
Before CloudFront forwards a request to the origin (origin request)
When CloudFront receives a response from the origin (origin response)
Before CloudFront returns the response to the viewer (viewer response)

Figure 1. CloudFront events that can trigger Lambda@Edge functions
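
As a rough illustration of the four triggers listed above, the sketch below reads the event type and request URI from the event object that CloudFront passes to a Lambda@Edge function. The `describe_trigger` helper and the sample event are hypothetical and heavily trimmed; a real event carries many more fields.

```python
def describe_trigger(event):
    """Return the CloudFront event type and request URI of a Lambda@Edge event."""
    cf = event['Records'][0]['cf']
    # eventType is one of: viewer-request, origin-request,
    # origin-response, viewer-response
    event_type = cf['config']['eventType']
    uri = cf['request']['uri']
    return event_type, uri


# Trimmed, hypothetical sample of an origin request event
sample_event = {
    'Records': [{
        'cf': {
            'config': {'eventType': 'origin-request'},
            'request': {'uri': '/index.html', 'method': 'GET', 'headers': {}},
        }
    }]
}

print(describe_trigger(sample_event))
```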

There are many uses for Lambda@Edge processing. You can find more details in the service documentation.

Challenges in OutSystems’ multi-tenant architecture

To support changes in how OutSystems handles multi-tenancy, they needed to dynamically select the origin for each request based on its host header and path.

OutSystems has a global presence and handles an increasing volume of requests every day. Deploying a custom reverse proxy solution globally would bring an additional development and operational overhead.

OutSystems’ use case required a system for advanced routing. The system had to be secure, high-performing, and easy to operate. It also needed to integrate quickly with existing deployment and orchestration tooling at OutSystems.

Architecture for the serverless dynamic origin selection solution

In CloudFront and Lambda@Edge, OutSystems found a fully managed serverless solution that will scale with their business and allow them to focus on their customers’ needs.

Natively, CloudFront can route requests to different origins based on path patterns, which can be configured in cache behaviors. For more advanced routing, customers can implement their own logic using Lambda@Edge.

Let’s take a look at the architecture OutSystems designed.

Figure 2. Example architecture for a Dynamic Request Routing solution

In this configuration, end users, regardless of their location, send requests to a common CloudFront distribution. Once a request arrives at CloudFront, it is evaluated against the two configured cache behaviors:

The first behavior serves static objects from Amazon Simple Storage Service (Amazon S3) that are common to all tenants. This cache behavior is optimized for performance and caches static resources.
The second behavior forwards requests to the backend service. On this behavior, a Lambda@Edge function is configured on the origin request event to implement the origin selection logic.
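
To make the two-behavior setup more concrete, here is a minimal sketch of how a request URI could be matched against an ordered list of path patterns, with `*` acting as the default behavior. The `/static/*` pattern and the target names are assumptions for illustration, not OutSystems' actual configuration. Python's `fnmatchcase` is only a stand-in for CloudFront's matching: it also interprets bracket expressions, so treat it as an approximation.

```python
from fnmatch import fnmatchcase

# Hypothetical behaviors in precedence order; '*' stands for the default behavior
BEHAVIORS = [
    ('/static/*', 's3-static-assets'),
    ('*', 'backend-with-lambda-edge'),
]


def select_behavior(uri, behaviors=BEHAVIORS):
    """Return the target of the first behavior whose pattern matches the URI."""
    for pattern, target in behaviors:
        if fnmatchcase(uri, pattern):
            return target
    return None


print(select_behavior('/static/app.css'))
print(select_behavior('/api/orders'))
```

Only requests that fall through to the second behavior would reach the Lambda@Edge function, which keeps static content on the cached path.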

Several multi-tenant clusters run in different AWS Regions to serve requests. To route each user’s request to the right cluster, we use a Lambda@Edge function. This function evaluates the request’s host header and/or path and, based on that, chooses the corresponding origin cluster. CloudFront then forwards the request upstream.

The origin is chosen through an API call to an Amazon DynamoDB table, where OutSystems stores the mappings between their customers and the different backend clusters. To improve performance, OutSystems implemented some of the best practices mentioned in the Leveraging external data in Lambda@Edge blog post:

Lambda@Edge temporarily caches the API call results, avoiding the need to make an API call for every request.
Additionally, the DynamoDB global tables feature is used, and Lambda@Edge makes the API call to the nearest Region to reduce latency.

The following code snippet can be used as guidance.

import boto3
import os
import time


DEFAULT_REGION = 'eu-west-1'
DDB_REGION_MAP = {
    'af': 'eu-west-1',
    'ap': 'ap-southeast-1',
    'ca': 'us-east-1',
    'eu': 'eu-west-1',
    'me': 'ap-southeast-1',
    'sa': 'sa-east-1',
    'us': 'us-east-1'
}
NOT_FOUND = '/404.html'
TABLE_NAME = 'ClusterLookup'


class LookupCache:
    """ Simple origin lookup cache
    
    Look up the origin on DynamoDB and cache results to reduce Lambda execution
    time. To benefit from the caching capabilities and connection pool of
    botocore, an instance of this class must be created outside of the Lambda
    handler.
    """
    _cache = {}

    def __init__(self, table_name):
        self._setup_table(table_name)

    def lookup(self, host):
        """Lookup origin based on hostname
        
        Check whether the origin is in the cache and the entry is still valid,
        otherwise look up the origin on DynamoDB
        """
        # Check if object is in cache and valid, otherwise do a lookup on DDB
        if (host in self._cache and
                self._cache[host]['expires_at'] > int(time.time())):
            host = self._cache[host]['value']
        else:
            host = self._ddb_lookup(host)

        # Return a custom origin if a value was found, 404 otherwise
        if host:
            return "custom:{}".format(host)
        else:
            return "s3:{}".format(NOT_FOUND)

    def _ddb_lookup(self, host):
        """Lookup hostname on DynamoDB table"""
        resp = self._table.get_item(
            Key={
                'hostname': host,
            })
        if 'Item' in resp:
            self._cache[host] = {
                'value': resp['Item']['clusterhostname'],
                'expires_at': int(time.time()) + resp['Item']['ttl'],
            }
            return resp['Item']['clusterhostname']

        return None

    def _setup_table(self, table_name):
        """Create table resource, select table on the nearest region"""
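        # AWS_REGION is set by the Lambda runtime; fall back to the default
        # region so the module can also be imported outside of Lambda
        region_prefix = os.environ.get('AWS_REGION', DEFAULT_REGION).split('-')[0]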
        if region_prefix in DDB_REGION_MAP:
            region = DDB_REGION_MAP[region_prefix]
        else:
            region = DEFAULT_REGION
        self._table = boto3.resource('dynamodb', region).Table(table_name)

# Global variable to hold lookup cache (persist across Lambda executions)
cache = LookupCache(TABLE_NAME)


def lambda_handler(event, context):
    request = event['Records'][0]['cf']['request']

    # Look up the origin based on the Host header
    origin = cache.lookup(request['headers']['host'][0]['value'].lower())

    # Transform request to point to correct origin, or use S3 for unknown hosts
    if origin.startswith('custom:'):
        request['origin'] = {
            'custom': {
                'domainName': origin.split(':')[-1],
                'port': 443,
                'protocol': 'https',
                'path': '',
                'sslProtocols': ['TLSv1.2'],
                'readTimeout': 5,
                'keepaliveTimeout': 5,
                'customHeaders': {}
            }
        }
    else:
        request['uri'] = origin.split(':')[-1]
        request['headers']['host'] = [
            {
                'key': 'host',
                'value': request['origin']['s3']['domainName']
            }]
    return request

After this step, the origin domain name is selected and the request is forwarded upstream.
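
To try the rewrite logic without AWS access, the following sketch applies the same origin selection to a trimmed origin request, using an in-memory dictionary in place of the DynamoDB table. The host names, the `route` and `make_request` helpers, and the S3 domain are all hypothetical; this is a local approximation, not the deployed function.

```python
import copy

# Hypothetical host-to-cluster mapping, standing in for the DynamoDB table
FAKE_MAPPINGS = {'tenant-a.example.com': 'cluster-1.eu-west-1.example.net'}


def route(request, mappings=FAKE_MAPPINGS, not_found='/404.html'):
    """Point the request at a custom origin, or at the 404 object on S3."""
    request = copy.deepcopy(request)  # leave the caller's request untouched
    host = request['headers']['host'][0]['value'].lower()
    cluster = mappings.get(host)
    if cluster:
        request['origin'] = {'custom': {
            'domainName': cluster,
            'port': 443,
            'protocol': 'https',
            'path': '',
            'sslProtocols': ['TLSv1.2'],
            'readTimeout': 5,
            'keepaliveTimeout': 5,
            'customHeaders': {},
        }}
    else:
        # Unknown host: keep the S3 origin and serve the not-found page
        request['uri'] = not_found
        s3_domain = request['origin']['s3']['domainName']
        request['headers']['host'] = [{'key': 'host', 'value': s3_domain}]
    return request


def make_request(host, uri='/'):
    """Build a trimmed origin request whose default origin is S3."""
    return {
        'uri': uri,
        'method': 'GET',
        'headers': {'host': [{'key': 'Host', 'value': host}]},
        'origin': {'s3': {'domainName': 'static-assets.s3.amazonaws.com',
                          'path': '', 'customHeaders': {}}},
    }


known = route(make_request('Tenant-A.example.com', '/app'))
unknown = route(make_request('nobody.example.com', '/app'))
print(known['origin']['custom']['domainName'])
print(unknown['uri'])
```

Like the handler above, the fallback branch assumes the cache behavior's default origin is the S3 bucket, so that the unknown-host case can reuse it.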

Overall, with this architecture OutSystems met their initial requirements:

Security
- Stronger distributed denial of service (DDoS) resource protection by using the global presence of CloudFront
Performance
- Transport layer security (TLS) termination on CloudFront
- Cache optimization for multiple tenants
Operation
- Low maintenance
- Avoid logic replication
Reliability
- Global presence (200+ Points of Presence across the globe)

Conclusion

By using CloudFront and Lambda@Edge together with AWS services like DynamoDB, you can build high-performing distributed web applications for your use cases. In this blog post, we shared how OutSystems was able to dynamically route requests to their multi-tenant application, while achieving global distribution, service availability, and the agility to operate at scale.

About OutSystems

OutSystems is an AWS Advanced Tier Technology Partner that helps developers build applications quickly and efficiently. They provide a visual, model-driven development environment with AI-based assistance to ensure applications are built efficiently. Their platform services, also with AI, provide automation, which enhances the application lifecycle so that applications can be deployed and managed easily.

AWS Architecture Blog
