Aggregating Lambda@Edge Logs

Just as with AWS Lambda, Lambda@Edge supports logging to CloudWatch, which can help you to troubleshoot your Lambda function code or to log custom data that is not available in CloudFront access logs.  Lambda@Edge functions are replicated around the world so CloudFront can invoke them closer to your end viewers, and CloudWatch log files for Lambda@Edge are pushed to the Region nearest to where the function was executed.

To make it easier to use the logs for troubleshooting and analysis, we’ll provide steps in this blog post that show you how to aggregate Lambda@Edge logs from different Regions into a single Region, by using CloudWatch Logs subscription filters, Kinesis Data Firehose, and Amazon S3.

(Lambda@Edge is a feature of Amazon CloudFront that lets you run code closer to users of your application, which improves performance and reduces latency. Learn more in the Amazon CloudFront Developer Guide.)

Overview of the log file aggregation steps

Before we can aggregate the Lambda@Edge logs, we need a Kinesis Data Firehose delivery stream to send them to, so the first step is to run a CloudFormation template that creates one.

Next, to capture just the logs for our Lambda@Edge function in the delivery stream, we use CloudWatch Logs subscription filters. Setting up subscription filters in a Region requires CloudWatch log groups to already exist in that Region. Lambda@Edge automatically creates the log group when a function is invoked in a Region. To make sure that each Region has a log group, we create a CloudFormation stack set that deploys a CloudFormation stack across all of the Regions to create the log groups if they don’t exist yet.

Finally, we use CloudWatch Logs subscription filters to stream logs for the Lambda@Edge function from each AWS Region where CloudFront has a Regional Edge Cache, and consolidate them in one Region. Lambda@Edge generates CloudWatch logs in the 11 AWS Regions where CloudFront has a Regional Edge Cache.  To set up filters to capture logs in all Regions, we create a second CloudFormation stack set that deploys a CloudFormation stack across all of the Regions.

(To see the current CloudFront architecture, including the location of all Regional Edge Caches, visit the CloudFront infrastructure page.)
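As a rough illustration of the log groups described above, the sketch below builds the log group name for a Lambda@Edge function. It assumes the naming convention from the AWS documentation (not stated in this post), in which the name carries a us-east-1 prefix in every Region where the function runs. The Region list is a hypothetical sample, not the full set of Regional Edge Cache Regions.

```python
def edge_log_group_name(function_name: str) -> str:
    """Return the log group name assumed for a Lambda@Edge function.

    Assumption: the name keeps the "us-east-1." prefix in every Region,
    because the function is authored in US East (N. Virginia).
    """
    return f"/aws/lambda/us-east-1.{function_name}"


# Hypothetical sample; substitute the Regions that have a Regional Edge Cache.
regions = ["us-east-1", "eu-west-1", "ap-northeast-1"]
function_name = "GenerateViewerResponse-aggregate-le-logs"

for region in regions:
    # The same log group name is expected in each Region.
    print(region, edge_log_group_name(function_name))
```

Because the name is the same everywhere, only the Region changes when you look for the logs of a given function.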

Prerequisites

To deploy a CloudFormation stack set, your account must have the correct roles and policies in place.  To simplify setting up the permissions, use the provided CloudFormation templates which generate the required IAM roles and policies. To learn more about required permissions for stack sets, see Granting Permissions for Stack Set Operations in the CloudFormation Developer Guide.

Cost considerations

When you use a Firehose delivery stream, data is transferred from the CloudWatch logs in multiple Regions worldwide to the Region where you deploy the delivery stream. To help reduce the cost of cross-Region data transfer, consider deploying the Firehose delivery stream in the Region where you expect most of your traffic to be.

“Data Transfer OUT from CloudWatch Logs” is priced the same as “Data Transfer OUT from Amazon EC2 To” and “Data Transfer OUT from Amazon EC2 to Internet.” Learn more on the EC2 Pricing Page.

Provided CloudFormation templates

To make it easier to walk through how to aggregate log files, we include the following CloudFormation templates that you can deploy, as described in the rest of this blog post:

Create a Kinesis Firehose data stream

In this section, we use a CloudFormation template to generate the required resources for a Kinesis Data Firehose and set up a delivery stream.

The template includes the option to generate a CloudFront distribution with a Lambda@Edge function association that you can use for testing the log aggregation.  The test distribution simply returns a Lambda@Edge-generated response that includes an HTML table containing HTTP headers received in the request.

The following resources are created by the template:

  • S3 bucket – An S3 bucket where Kinesis Firehose writes the log files.
  • Kinesis Firehose delivery stream – The delivery stream used to aggregate logs from the original CloudWatch log files.
  • Kinesis IAM role – An IAM role with permissions for Firehose to deliver logs to S3.
  • CloudWatch IAM role – An IAM role with permissions to allow CloudWatch to send logs to the Kinesis Firehose delivery stream.

In addition, the following optional resources can be created:

  • CloudFront distribution – A distribution with a default cache behavior to invoke a Lambda function with a viewer request trigger.
  • Lambda@Edge function – A function that generates a response that returns the request headers in an HTML table.
  • Lambda version – A published version of the Lambda function associated to the CloudFront distribution.
  • CloudWatch IAM role – An IAM role with permissions for the Lambda function to write to CloudWatch logs.

Note: To generate the test distribution with an associated Lambda@Edge function, you must be in the US East (N. Virginia) Region. If you don’t want to create the distribution and associate the Lambda function, you can run the CloudFormation template in any Region where Kinesis Firehose is available.

To create the Firehose delivery stream, follow these steps.

  1. In the AWS console, open the CloudFormation console in the Region where you want to aggregate the log files. In this example, we use the US East (N. Virginia) Region.
  2. Choose Create Stack.
  3. Choose Specify an Amazon S3 template URL, and then copy and paste the following URL:
  4. Choose Next.
  5. Specify the following:
    • Stack name – Enter a name for the stack, for example, aggregate-le-logs.
    • CreateTestDistribution – For this example, choose Yes. If you use the template yourself, set the value to Yes if you want CloudFormation to create a distribution for testing.
      • Note: This resource only deploys if you are in the US East (N. Virginia) Region.
    • KinesisBufferIntervalSeconds – Use the default, 60 seconds. This value specifies the interval in seconds during which Kinesis will buffer incoming records before processing.
    • KinesisBufferSizeMB – Use the default, 3 MB. This value specifies the total size in MB of incoming records which Kinesis will buffer before processing records.
      • Note: Amazon Kinesis Data Firehose buffers incoming streaming data to a certain size or for a certain period of time before delivering it to destinations. Kinesis triggers data delivery based on the buffer condition that is satisfied first.

  6. Choose Next.
  7. On the Options page, choose Next to use the default options.
  8. On the Review page, select the I acknowledge that AWS CloudFormation might create IAM resources check box, and then choose Create.
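The two buffer parameters in step 5 interact: delivery happens as soon as either the size limit or the interval limit is reached. The following is a toy model of that rule, not Firehose's actual implementation. The class name ToyBuffer and its methods are made up for illustration, and the model only checks the interval when a record arrives, whereas the real service also delivers on a timer.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ToyBuffer:
    """Toy model of size-or-interval buffering (hypothetical, for illustration)."""
    max_bytes: int = 3 * 1024 * 1024  # mirrors KinesisBufferSizeMB = 3
    max_seconds: float = 60.0         # mirrors KinesisBufferIntervalSeconds = 60
    records: List[bytes] = field(default_factory=list)
    size: int = 0
    started_at: float = 0.0

    def add(self, record: bytes, now: float) -> Optional[Tuple[str, int]]:
        """Buffer one record; return (reason, record_count) if delivery is triggered."""
        if not self.records:
            self.started_at = now  # the interval clock starts with the first record
        self.records.append(record)
        self.size += len(record)
        if self.size >= self.max_bytes:
            return self._flush("size")
        if now - self.started_at >= self.max_seconds:
            return self._flush("interval")
        return None

    def _flush(self, reason: str) -> Tuple[str, int]:
        """Empty the buffer and report why it was delivered."""
        count = len(self.records)
        self.records, self.size = [], 0
        return reason, count


buf = ToyBuffer(max_bytes=10, max_seconds=60)
print(buf.add(b"12345", now=0))    # below both limits, nothing delivered
print(buf.add(b"123456", now=1))   # 11 bytes buffered, so the size limit is hit first
print(buf.add(b"a", now=100))      # a new buffer starts at t=100
print(buf.add(b"b", now=161))      # 61 seconds elapsed, so the interval limit is hit first
```

With low traffic, the interval limit usually triggers delivery, so a 60-second default means test requests can take about a minute to appear in S3.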

Create log groups

Now that you’ve created the Kinesis Data Firehose delivery stream, follow the steps here to create (where needed) the log groups that CloudWatch Logs subscription filters require, using the CloudFormation template that we provide.

  1. Sign in to the AWS Management console, and then open the CloudFormation console.
  2. In the CloudFormation dropdown menu, select StackSets.
  3. Choose Create StackSet.
  4. Choose Specify an Amazon S3 template URL, and then copy and paste the following link:
  5. Choose Next.
  6.  Enter the following values:
    • StackSet name – Enter a name for this stack set, for example, lambda-edge-log-groups.
    • LambdaEdgeFunctionName – Enter the name for the Lambda@Edge function, for example, GenerateViewerResponse-aggregate-le-logs.
  7. Choose Next.
  8. On the Set deployment options page, specify the following:
    • Choose Deploy stacks in accounts, and then specify the account number under which the stacks will be created.  In this case, we will use the same account as the one we used to deploy the Lambda@Edge functions.

    • For Specify regions, add to the Available regions column each Region where CloudFront has a Regional Edge Cache. To identify the Regions that have Regional Edge Caches, see the Amazon CloudFront Infrastructure page.

    • For Deployment options, specify the following:
      • For Maximum concurrent accounts, use the default option, By number, and for the Number of accounts per region, set the value to 1.
      • For Failure tolerance, use the default option, By number, and for the Number of accounts per region, set the value to 1. For this deployment, we allow a stack failure in case the log group already exists in a Region.

  9. Choose Next.
  10. On the Options page, for IAM Admin Role ARN, enter the IAM role that you created as a prerequisite for this procedure.
  11. Choose Next. (The default values are correct for this page.)
  12. Choose Create.

Wait for the CloudWatch Logs stack set to finish creating the log groups, and then follow the steps in the next section to create subscription filters.
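The stack set above tolerates a failure in Regions where the log group already exists. The same "create it unless it is already there" idea can be sketched directly in code. This is an illustration only, not what the provided template does. It assumes a client shaped like the boto3 CloudWatch Logs client (create_log_group and exceptions.ResourceAlreadyExistsException), and it uses a made-up in-memory stand-in (FakeLogsClient) so that it runs without AWS credentials. The log group name format is also an assumption.

```python
from types import SimpleNamespace


def ensure_log_group(logs_client, name: str) -> bool:
    """Create the log group; return False if it already existed."""
    try:
        logs_client.create_log_group(logGroupName=name)
        return True
    except logs_client.exceptions.ResourceAlreadyExistsException:
        # Already present (for example, the function has run in this Region).
        return False


class AlreadyExists(Exception):
    """Stand-in for the service's ResourceAlreadyExistsException."""


class FakeLogsClient:
    """Hypothetical in-memory stand-in for a CloudWatch Logs client."""
    exceptions = SimpleNamespace(ResourceAlreadyExistsException=AlreadyExists)

    def __init__(self):
        self.groups = set()

    def create_log_group(self, logGroupName):
        if logGroupName in self.groups:
            raise AlreadyExists(logGroupName)
        self.groups.add(logGroupName)


client = FakeLogsClient()
# Assumed name format; the us-east-1 prefix is not stated in this post.
name = "/aws/lambda/us-east-1.GenerateViewerResponse-aggregate-le-logs"
print(ensure_log_group(client, name))  # first call creates the group
print(ensure_log_group(client, name))  # second call finds it already there
```

Treating "already exists" as success is what makes the operation safe to repeat in every Region.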

Create subscription filters to aggregate logs

Follow the steps here to configure CloudWatch Logs subscription filters to collect the Lambda@Edge logs, and then send them to the Kinesis Data Firehose delivery stream. You do this by creating another stack set using a CloudFormation template that we provide.

  1. Sign in to the AWS Management console, and then open the CloudFormation console.
  2. In the CloudFormation dropdown menu, select StackSets.
  3. Choose Create stack set.
  4. Choose Specify an Amazon S3 template URL, and then copy and paste the following link:
  5. Choose Next.
  6. Enter the following values:
    • StackSet name – Enter a name for this stack set, for example, lambda-edge-subscription-filters.
    • CloudWatchRoleArn – Enter the ARN for the CloudWatch role that the Kinesis Firehose template generated.
    • FilterPattern – Enter a filter pattern, enclosed in double quotes (“”), to match the log events for your function.  In this example, we enter the text “Request Processed In” to capture only the events logged from our Lambda@Edge function.  To learn more about CloudWatch filter patterns, see Filter and Pattern Syntax in the Amazon CloudWatch Logs User Guide.
    • FirehoseDestinationArn – Enter the ARN for the Firehose delivery stream that the Kinesis Firehose template generated.
    • LambdaEdgeFunctionName – Enter the name of the Lambda@Edge function, for example, GenerateViewerResponse-aggregate-le-logs.

  7. Choose Next.
  8. On the Set deployment options page, specify the following:
    • Choose Deploy stacks in accounts, and then specify the account number under which the stacks will be created.  In this case, we will use the same account as the one we used to deploy the Lambda@Edge functions.
    • For Specify regions, add to the Available regions column each Region where CloudFront has a Regional Edge Cache. To identify the Regions that have Regional Edge Caches, see the Amazon CloudFront Infrastructure page.

    • Under Deployment options, use the default values.

  9. Choose Next.
  10. On the Options page, for IAM Admin Role ARN, enter the IAM role that you created as a prerequisite.
  11. Choose Next. (The default values are correct for this page.)
  12. Choose Create.
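To make the stack set parameters more concrete, the sketch below shows how they might map onto a single CloudWatch Logs subscription filter request, and roughly how a quoted filter pattern selects log events. The key names follow the CloudWatch Logs PutSubscriptionFilter API. The filter name, the ARNs, and the log group name format are placeholders or assumptions, and phrase_matches is only an approximation of quoted-phrase matching, not the full filter pattern syntax.

```python
def subscription_filter_request(function_name, filter_pattern, firehose_arn, role_arn):
    """Build the parameters for one subscription filter (one per Region)."""
    return {
        # Assumed log group name format for Lambda@Edge functions.
        "logGroupName": f"/aws/lambda/us-east-1.{function_name}",
        "filterName": f"{function_name}-to-firehose",  # hypothetical name
        "filterPattern": filter_pattern,
        "destinationArn": firehose_arn,
        "roleArn": role_arn,
    }


def phrase_matches(filter_pattern: str, message: str) -> bool:
    """Approximate a single quoted phrase: case-sensitive substring match."""
    phrase = filter_pattern.strip('"')
    return phrase in message


request = subscription_filter_request(
    "GenerateViewerResponse-aggregate-le-logs",
    '"Request Processed In"',
    "arn:aws:firehose:us-east-1:123456789012:deliverystream/example",  # placeholder
    "arn:aws:iam::123456789012:role/example-cloudwatch-role",          # placeholder
)
print(request["logGroupName"])
print(phrase_matches(request["filterPattern"], "Request Processed In 12 ms"))
print(phrase_matches(request["filterPattern"], "START RequestId: abc"))
```

The stack set creates one such filter in each Region, all pointing at the same delivery stream ARN, which is how the logs end up in one place.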

Test the subscription filters

You can use the test distribution created by the CloudFormation template to test the log aggregator, to make sure that log files are being collected from each AWS Region.

For example, you can use a tool such as GeoScreenshot to query the CloudFront distribution from different geographic locations. Or you can use a tool such as curl to query the distribution from EC2 instances in different Regions. The following screenshot shows an example of what you see with the GeoScreenshot tool:

To test the log aggregation process that you set up, do the following.

  1.  Sign in to the AWS Management console, and then open the Kinesis console.
  2. Select the Firehose delivery stream, for example, cf-lambda-edge-logs-aggregate-le-logs.
  3. On the Monitoring tab, check to see that records for the expected time frame were sent to the delivery stream. 
  4. Open the Amazon S3 console, and then select the S3 bucket that the CloudFormation template generated, for example, aggregate-le-logs-s3bucket-1234example.
  5. Download a log file from the S3 bucket, and then look for the logs generated from the Lambda@Edge function invocations.  Notice that the log events only contain the logs that match our filter pattern, “Request Processed In”.

Note: The Kinesis logs are stored in a compressed ZIP format, so you must unzip the logs to view them.
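Once a log file is decompressed, you may want to pull out just the log messages. The sketch below assumes the decompressed content is a series of JSON objects written back to back in the CloudWatch Logs subscription format (with messageType, logGroup, and logEvents fields); check a real file before relying on that layout. The function name iter_log_events and the sample text are made up for illustration.

```python
import json


def iter_log_events(text: str):
    """Yield (log_group, message) pairs from back-to-back JSON objects."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip any whitespace between objects.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode parses one object and returns the index where it ended.
        record, pos = decoder.raw_decode(text, pos)
        if record.get("messageType") != "DATA_MESSAGE":
            continue  # skip non-data records such as control messages
        for event in record.get("logEvents", []):
            yield record["logGroup"], event["message"]


# Made-up sample in the assumed format.
sample = (
    '{"messageType":"DATA_MESSAGE","logGroup":"/aws/lambda/us-east-1.example",'
    '"logEvents":[{"id":"1","timestamp":0,"message":"Request Processed In 3 ms"}]}'
    '{"messageType":"CONTROL_MESSAGE","logEvents":[]}'
)

for log_group, message in iter_log_events(sample):
    print(log_group, message)
```

Parsing object by object avoids loading the file with a single json.loads call, which would fail on concatenated objects.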

The following screenshot shows an example of the filtered logs:

Congratulations! You have successfully aggregated logs generated from Lambda@Edge functions that were executed in locations around the world. Visit our documentation to learn more about CloudFront, Lambda@Edge, Kinesis Data Firehose, and CloudWatch Logs.

The author

Tino Tran

Tino Tran is a Sr. Edge Specialized Solutions Architect based out of Florida. His main focus is to help companies deliver online content in a secure, reliable, and fast way using AWS Edge Services. He is an experienced technologist with a background in software engineering, content delivery networks, and security.