AWS Cloud Operations Blog

Sending CloudFront standard logs to CloudWatch Logs for analysis

Amazon CloudFront is a fast content delivery network (CDN) service that securely delivers data, videos, applications, and APIs to customers globally with low latency and high transfer speeds, all within a developer-friendly environment.

CloudFront standard logs (also known as access logs) give you visibility into requests that are made to a CloudFront distribution. The logs can be analyzed for a variety of use cases, such as determining which objects are the most requested or which edge locations receive the most traffic. You can also use logging to troubleshoot errors or gain performance insights.

You can gain these insights using these Amazon CloudWatch features: metric filters, Contributor Insights rules, and CloudWatch Logs Insights queries.

In this blog post, I’ll show how you can send CloudFront access logs to Amazon CloudWatch Logs. I’ll also discuss tools that you can use with CloudWatch Logs to generate meaningful insights and create dashboards from your CloudFront logs.

Overview

Many AWS services write logs to CloudWatch Logs natively, while others write to Amazon Simple Storage Service (Amazon S3) or to both CloudWatch Logs and Amazon S3. For a list of services, see AWS Services That Publish Logs to CloudWatch Logs in the Amazon CloudWatch Logs User Guide.

In the case of a service that only publishes its logs to S3, a common practice is to use an AWS Lambda function triggered by an Amazon S3 event notification to write the logs into your CloudWatch log groups. In this post, I’ll walk you through this process and some sample metric filters, Contributor Insights rules, and CloudWatch Logs Insights queries. I’ll combine this data with CloudWatch metrics emitted from CloudFront to put together an operational CloudWatch dashboard for a given CloudFront distribution.
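As a rough illustration of the S3-to-CloudWatch step, here is a minimal Python sketch of what such a Lambda function might look like. It is not the function shipped with the template. The log group name, the one-stream-per-file naming, and the choice to re-join the tab-separated fields with spaces are assumptions made for this example. The S3 and CloudWatch Logs clients are passed in so the parsing logic can be exercised without the AWS SDK.

```python
import gzip
from datetime import datetime, timezone
from urllib.parse import unquote_plus


def to_log_events(gzipped_body: bytes) -> list[dict]:
    """Convert one gzipped CloudFront standard log file into CloudWatch log events."""
    events = []
    for line in gzip.decompress(gzipped_body).decode("utf-8").splitlines():
        if not line or line.startswith("#"):  # skip the #Version and #Fields headers
            continue
        fields = line.split("\t")
        # The first two fields are the UTC date and time of the request.
        when = datetime.strptime(f"{fields[0]} {fields[1]}", "%Y-%m-%d %H:%M:%S")
        millis = int(when.replace(tzinfo=timezone.utc).timestamp() * 1000)
        # Assumption: re-join with spaces so space-delimited patterns can match.
        events.append({"timestamp": millis, "message": " ".join(fields)})
    # PutLogEvents expects the events of a batch in chronological order.
    return sorted(events, key=lambda e: e["timestamp"])


def handler(event, context, s3=None, logs=None, log_group="/hypothetical/cloudfront"):
    """Hypothetical Lambda entry point; the clients are injectable for testing."""
    if s3 is None or logs is None:
        import boto3  # imported lazily so the parsing code runs without the SDK
        s3 = s3 or boto3.client("s3")
        logs = logs or boto3.client("logs")
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])  # keys arrive URL-encoded
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        log_events = to_log_events(body)
        if not log_events:
            continue
        stream = key.replace("/", "-")  # one stream per log file (arbitrary choice)
        logs.create_log_stream(logGroupName=log_group, logStreamName=stream)
        logs.put_log_events(
            logGroupName=log_group, logStreamName=stream, logEvents=log_events
        )
```

A production function would also need to split large files into several PutLogEvents batches and handle retries, which this sketch leaves out.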

Note: In cases where there is a high frequency of write events to S3, you might want to send the S3 event notifications to an Amazon Simple Queue Service queue and then have the Lambda function poll the queue. The Amazon SQS approach allows a single function invocation to retrieve many objects from S3.
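If you take the Amazon SQS route, each message body in the Lambda event is itself an S3 event notification serialized as JSON. The following sketch shows one way a function might unpack a batch into bucket and key pairs. The function name is hypothetical, and the sketch assumes the S3 notifications are sent directly to the queue without an intermediate Amazon SNS topic.

```python
import json
from urllib.parse import unquote_plus


def s3_objects_from_sqs_event(event: dict) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs from a Lambda event delivered by an SQS trigger."""
    objects = []
    for message in event.get("Records", []):
        notification = json.loads(message["body"])  # each body is an S3 notification
        # S3 also sends an s3:TestEvent message that has no "Records" key.
        for record in notification.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            key = unquote_plus(record["s3"]["object"]["key"])  # keys are URL-encoded
            objects.append((bucket, key))
    return objects
```

One invocation can then download and forward every object in the batch, instead of one invocation per object.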

Lambda function pulls CloudFront logs from an S3 bucket and writes them to CloudWatch Logs. CloudFront metrics are used in a CloudWatch dashboard

Figure 1: Architecture for sending CloudFront logs to CloudWatch Logs

To use the CloudFormation template in this post, you need the following:

  • A CloudFront distribution with standard logging enabled. For more information, see Configuring and using standard logs (access logs) in the Amazon CloudFront Developer Guide.
  • Access to the S3 bucket where the CloudFront logs are being delivered.

Approach

You will use a CloudFormation template to deploy the following resources:

  • An S3 event notification that triggers on new-object-created events for your CloudFront logs.
  • A CloudWatch log group to store your CloudFront logs.
  • A Lambda-backed custom resource to configure the S3 event notification.
  • A Lambda function to write CloudFront logs in S3 to a CloudWatch log group.
  • Several metric filters for your log group.
  • Several Contributor Insights rules for your log group.
  • A CloudWatch Logs Insights query for the log group.
  • A CloudWatch dashboard that includes service metrics, metric filters, Contributor Insights rule reports, and CloudWatch Logs Insights query results.

The CloudFormation template takes the following parameters:

  • S3 logging bucket name

This is the name of the S3 bucket where your CloudFront logs are being delivered. The S3 bucket should be in the same AWS Region where the template is being deployed. The Lambda function deployment package is hosted on an S3 bucket in us-east-1. If your logging bucket is in a different Region, you will need to host the deployment package for the function in a bucket in that Region and edit the CloudFormation template to reference that location.

  • S3 prefix for CloudFront logs

This is the prefix in the S3 bucket where CloudFront logs are being written. This parameter is used for the S3 event notification filter.

  • CloudFront distribution ID

This is the CloudFront distribution ID. This parameter is used to add CloudFront metrics to your dashboard and determine the name of the log group and metric namespace for your metric filters.

  • Contributor Insights rule state

This parameter enables or disables the Contributor Insights rules on creation.

Specify stack details displays a field for the stack name and a Parameters section with fields for S3 logging location, CloudFront distribution, and rule configuration.

Figure 2: Configured CloudFormation stack parameters

If you would like to use a CloudFormation template to create these resources, download the following template.

To create the CloudFormation stack

  1. Open the CloudFormation console at https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/.
  2. In the navigation pane, choose Stacks.
  3. Under Create stack, choose With new resources (standard).
  4. For Specify template, choose Upload a template file and select the file you downloaded named “CFtoCWLogsV2.yaml”.
  5. Enter the parameters for your CloudFront distribution ID, the S3 bucket where your CloudFront logs are being written, and the S3 prefix for your logs, and then choose Next.
  6. Leave all other fields at their defaults, and then choose Create stack.
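If you prefer to script the deployment instead of using the console, the same parameters can be passed through the AWS SDK. The sketch below only builds the parameter list. The parameter keys shown (LoggingBucketName, LogPrefix, DistributionId, RuleState) and the ENABLED and DISABLED values are hypothetical placeholders, so check the Parameters section of the template you downloaded for the real names.

```python
def build_stack_parameters(bucket, prefix, distribution_id, rule_state="ENABLED"):
    """Build the Parameters list for a CloudFormation CreateStack call.

    The parameter keys below are placeholders, not the template's actual names.
    """
    if rule_state not in ("ENABLED", "DISABLED"):
        raise ValueError(f"unexpected rule state: {rule_state!r}")
    values = {
        "LoggingBucketName": bucket,        # hypothetical key
        "LogPrefix": prefix,                # hypothetical key
        "DistributionId": distribution_id,  # hypothetical key
        "RuleState": rule_state,            # hypothetical key
    }
    return [{"ParameterKey": k, "ParameterValue": v} for k, v in values.items()]


# Possible usage with boto3 (not executed here; the stack name is made up):
#   import boto3
#   with open("CFtoCWLogsV2.yaml") as template:
#       boto3.client("cloudformation", region_name="us-east-1").create_stack(
#           StackName="cloudfront-logs-to-cloudwatch",
#           TemplateBody=template.read(),
#           Parameters=build_stack_parameters("my-bucket", "cf-logs/", "E123EXAMPLE"),
#           Capabilities=["CAPABILITY_IAM"],  # assumed, since the stack creates roles
#       )
```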

Analyze CloudFront logs in CloudWatch

After your CloudFront logs are in CloudWatch, you can use Contributor Insights, metric filters, and CloudWatch Logs Insights queries to analyze them. When combined with the CloudWatch metrics emitted from CloudFront, you can create valuable dashboards for your CloudFront distribution.

Contributor Insights rules for your log group

Using Contributor Insights with CloudFront logs allows you to view requests made to your distribution from several dimensions. Here are a few examples of these rules, along with a link to the JSON-formatted rule definitions in GitHub.

Rule names and descriptions:

  • CF-Bytes-Out-By-POP: This rule tracks the number of bytes sent out by each CloudFront edge location. This can help you view which edge location is responsible for serving the majority of traffic over time.
  • CF-Cache-Miss-by-URI: This rule shows you the top requested URIs that resulted in a cache miss. The rule is filtered for only GET and HEAD requests, because PUT, POST, and DELETE requests cannot result in a cache hit. This rule is helpful for identifying objects that might be good candidates for caching.
  • CF-Edge-Status-by-POP: This rule shows you the edge status for each request by edge location. This can be helpful to identify which edge locations are producing the most errors, misses, or hits.
  • CF-Errors-By-POP-and-Path: This rule shows you the top contributors for errors by URI path and edge location. This can be helpful for identifying which URI or edge location is responsible for a higher than expected error rate.
  • CF-Requests-by-HTTP-Method: This rule displays the total number of requests by the HTTP method used (GET, POST, PUT, OPTIONS, HEAD, DELETE). This can help you get a better understanding of the type of requests that are being made to your distribution. You can gather this same data point by using a CloudWatch Logs metric filter.
  • CF-Requests-by-URI: This rule shows you the top requested URIs across your CloudFront distribution over time.
  • CF-Requests-by-URI-and-UserAgent: A modification of the previous rule, this rule tracks the top requests by URI and the User Agent field. This rule can help you understand if some objects are more requested by different user agents.
  • CF-Status-by-POP: This rule tracks the HTTP response codes by CloudFront edge location. This can help you see which edge locations are producing the most requests that result in an error or a 200 OK.
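To give a sense of what a rule definition contains, here is a sketch of a rule in the spirit of CF-Cache-Miss-by-URI, written as a Python dictionary that serializes to the JSON rule body. It is an approximation for illustration, not the definition published in GitHub. It assumes the log events are space-delimited (so the CLF log format applies) and that field positions are counted from 1 in the order used by the CloudFront standard log format.

```python
import json


def cache_miss_by_uri_rule(log_group: str) -> dict:
    """Sketch of a Contributor Insights rule that ranks URIs by cache misses."""
    return {
        "Schema": {"Name": "CloudWatchLogRule", "Version": 1},
        "LogGroupNames": [log_group],
        "LogFormat": "CLF",
        # Aliases for the positions of the fields this rule needs (assumed 1-based).
        "Fields": {"6": "cs_method", "8": "cs_uri_stem", "14": "x_edge_result_type"},
        "Contribution": {
            "Keys": ["cs_uri_stem"],  # rank contributors by requested URI
            "Filters": [
                {"Match": "cs_method", "In": ["GET", "HEAD"]},
                {"Match": "x_edge_result_type", "In": ["Miss"]},
            ],
        },
        "AggregateOn": "Count",
    }


# Possible usage with boto3 (not executed here; the log group name is made up):
#   import boto3
#   boto3.client("cloudwatch").put_insight_rule(
#       RuleName="CF-Cache-Miss-by-URI",
#       RuleState="ENABLED",
#       RuleDefinition=json.dumps(cache_miss_by_uri_rule("/hypothetical/cloudfront")),
#   )
```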

Metric filters for the log group

You can use CloudWatch metric filters to extract meaningful metrics from your CloudFront logs. In some cases, the number of possible results for a log field is known, such as HTTP method or HTTP response code. In these cases, you can create metric filter expressions to collect these data points. For more information, see Filter and Pattern Syntax in the Amazon CloudWatch Logs User Guide.

Metric filter format for CloudFront logs:

You can use the following format for your CloudFront logs. This filter pattern names all of the available log fields. You can then add a numeric or equality operator to match on a given log field. In this example, the pattern matches only log events where x_edge_result_type equals CapacityExceeded.

[date, time, x_edge_location, sc_bytes, c_ip, cs_method, Host, cs_uri_stem, sc_status, cs_referer, cs_User_Agent, cs_uri_query, Cookie, x_edge_result_type=CapacityExceeded, x_edge_request_id, x_host_header, cs_protocol, cs_bytes, time_taken, x_forwarded_for, ssl_protocol, ssl_cipher, x_edge_response_result_type, cs_protocol_version, fle_status, fle_encrypted_fields, c_port, time_to_first_byte, x_edge_detailed_result_type, sc_content_type, sc_content_len, sc_range_start, sc_range_end]
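Because the pattern is long and only one or two fields usually carry a condition, it can be convenient to generate it. The following sketch builds the pattern from the field list above and attaches operators to selected fields. The helper name build_filter_pattern is hypothetical.

```python
CLOUDFRONT_FIELDS = (
    "date time x_edge_location sc_bytes c_ip cs_method Host cs_uri_stem sc_status "
    "cs_referer cs_User_Agent cs_uri_query Cookie x_edge_result_type "
    "x_edge_request_id x_host_header cs_protocol cs_bytes time_taken "
    "x_forwarded_for ssl_protocol ssl_cipher x_edge_response_result_type "
    "cs_protocol_version fle_status fle_encrypted_fields c_port time_to_first_byte "
    "x_edge_detailed_result_type sc_content_type sc_content_len sc_range_start "
    "sc_range_end"
).split()


def build_filter_pattern(conditions=None):
    """Return a space-delimited filter pattern with optional per-field conditions.

    conditions maps a field name to an operator plus value, such as
    {"x_edge_result_type": "=CapacityExceeded"} or {"sc_status": ">=500"}.
    """
    conditions = conditions or {}
    unknown = set(conditions) - set(CLOUDFRONT_FIELDS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    parts = [name + conditions.get(name, "") for name in CLOUDFRONT_FIELDS]
    return "[" + ", ".join(parts) + "]"


if __name__ == "__main__":
    print(build_filter_pattern({"x_edge_result_type": "=CapacityExceeded"}))
```

Calling build_filter_pattern() with no arguments yields a pattern that matches every log event, which is the form the latency filters later in this post rely on.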

Metric filters to collect HTTP protocol:

When a log event matches a filter expression, you can take a value from a log field and publish it as the metric value. In these examples, I publish either the bytes out field or the time taken field. You can use the sample count statistic for the metric to view how many times an event matched the pattern, and the sum and average statistics to view the actual bytes downloaded or time taken. For example, the two metric filters for HTTP and HTTPS requests publish the bytes out field. When you use the sum statistic to view the HTTP and HTTPS metrics, you see the total number of bytes sent out. However, if you just want to compare the total number of HTTP and HTTPS requests, you can use the sample count statistic.

  • HTTP Requests
  • HTTPS-Requests
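As a sketch of how such a filter could be created with the AWS SDK, the function below assembles the arguments for the CloudWatch Logs PutMetricFilter API. It matches on the cs_protocol field and publishes the sc_bytes field as the metric value. The log group and namespace are caller-supplied placeholders, and the filter names simply mirror the list above; the template's actual definitions may differ.

```python
FIELDS = (
    "date time x_edge_location sc_bytes c_ip cs_method Host cs_uri_stem sc_status "
    "cs_referer cs_User_Agent cs_uri_query Cookie x_edge_result_type "
    "x_edge_request_id x_host_header cs_protocol cs_bytes time_taken "
    "x_forwarded_for ssl_protocol ssl_cipher x_edge_response_result_type "
    "cs_protocol_version fle_status fle_encrypted_fields c_port time_to_first_byte "
    "x_edge_detailed_result_type sc_content_type sc_content_len sc_range_start "
    "sc_range_end"
).split()


def protocol_filter_kwargs(log_group: str, namespace: str, protocol: str) -> dict:
    """Build PutMetricFilter arguments for requests made over one protocol."""
    if protocol not in ("http", "https"):
        raise ValueError("protocol must be 'http' or 'https'")
    parts = [f"{f}={protocol}" if f == "cs_protocol" else f for f in FIELDS]
    name = f"{protocol.upper()}-Requests"
    return {
        "logGroupName": log_group,
        "filterName": name,
        "filterPattern": "[" + ", ".join(parts) + "]",
        "metricTransformations": [{
            "metricName": name,
            "metricNamespace": namespace,
            "metricValue": "$sc_bytes",  # publish bytes out for each matching event
            "unit": "Bytes",
        }],
    }


# Possible usage with boto3 (not executed here; names are placeholders):
#   import boto3
#   logs = boto3.client("logs")
#   for proto in ("http", "https"):
#       logs.put_metric_filter(
#           **protocol_filter_kwargs("/hypothetical/cloudfront", "CloudFront/E123", proto)
#       )
```

With the metric value set to the bytes field, the Sum statistic gives bytes served per protocol and SampleCount gives the number of requests.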

Latency metric filters:

In these metric filters, you are matching every log event in the log group. Each filter produces a different metric from a different field in the log event. In this case, I am tracking the values for the time to first byte and the time taken fields, respectively.

  • Origin Latency (time to first byte)
  • Latency-Filter

Metric filters to collect edge result type:

The following metric filters provide you with a metric for each possible value of the x-edge-result-type log field. You can gather this same data point from a Contributor Insights rule.

  • Hit-Requests
  • RefreshHit-Requests
  • OriginShieldHit-Requests
  • Redirect-Requests
  • Miss-Requests
  • Error-Requests
  • LimitExceeded-Requests
  • CapacityExceeded-Requests

CloudWatch Logs Insights query

You can use the following CloudWatch Logs Insights query template to parse out all the available log fields in your CloudFront logs. Then, you can easily use filter and group expressions by referring to the field name directly.

The following query is included as an example in the CloudWatch dashboard that is created with the CloudFormation template attached to this post. You can save this query in CloudWatch Logs Insights for use as a template to build more advanced queries. For more information, see CloudWatch Logs Insights Query Syntax in the Amazon CloudWatch Logs User Guide.

parse @message "* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *" as date, time, x_edge_location, sc_bytes, c_ip, cs_method, host, cs_uri_stem, sc_status, referer, useragent, cs_uri_query, cookie, x_edge_result_type, x_edge_request_id, x_host_header, cs_protocol, cs_bytes, time_taken, x_forwarded_for, ssl_protocol, ssl_cipher, x_edge_response_result_type, cs_protocol_version, fle_status, fle_encrypted_fields, c_port, time_to_first_byte, x_edge_detailed_result_type, sc_content_type, sc_content_len, sc_range_start, sc_range_end
| sort @timestamp desc
| limit 20

Find log events that contain an error:

Here is an example query that shows only requests where the x-edge-result-type field indicates an error.

parse @message "* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *" as date, time, x_edge_location, sc_bytes, c_ip, cs_method, host, cs_uri_stem, sc_status, referer, useragent, cs_uri_query, cookie, x_edge_result_type, x_edge_request_id, x_host_header, cs_protocol, cs_bytes, time_taken, x_forwarded_for, ssl_protocol, ssl_cipher, x_edge_response_result_type, cs_protocol_version, fle_status, fle_encrypted_fields, c_port, time_to_first_byte, x_edge_detailed_result_type, sc_content_type, sc_content_len, sc_range_start, sc_range_end
| filter x_edge_result_type like "Error"
| sort @timestamp desc
| limit 20

Here is an example query that will produce a time series showing the minimum, average, and maximum values of the time_to_first_byte field over a one-minute period.

parse @message "* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *" as date, time, x_edge_location, sc_bytes, c_ip, cs_method, host, cs_uri_stem, sc_status, referer, useragent, cs_uri_query, cookie, x_edge_result_type, x_edge_request_id, x_host_header, cs_protocol, cs_bytes, time_taken, x_forwarded_for, ssl_protocol, ssl_cipher, x_edge_response_result_type, cs_protocol_version, fle_status, fle_encrypted_fields, c_port, time_to_first_byte, x_edge_detailed_result_type, sc_content_type, sc_content_len, sc_range_start, sc_range_end
| stats avg(time_to_first_byte), min(time_to_first_byte), max(time_to_first_byte) by bin(60s)
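The queries above can also be run outside the console. The sketch below generates the parse clause from the field list (one asterisk per field) and runs a query through the CloudWatch Logs StartQuery and GetQueryResults APIs. The helper names are hypothetical, and the client is passed in so the logic can be tried without AWS credentials.

```python
import time

FIELDS = (
    "date time x_edge_location sc_bytes c_ip cs_method host cs_uri_stem sc_status "
    "referer useragent cs_uri_query cookie x_edge_result_type x_edge_request_id "
    "x_host_header cs_protocol cs_bytes time_taken x_forwarded_for ssl_protocol "
    "ssl_cipher x_edge_response_result_type cs_protocol_version fle_status "
    "fle_encrypted_fields c_port time_to_first_byte x_edge_detailed_result_type "
    "sc_content_type sc_content_len sc_range_start sc_range_end"
).split()


def build_query(tail: str, fields=FIELDS) -> str:
    """Prefix a query with a parse clause that names every CloudFront log field."""
    glob = " ".join("*" for _ in fields)
    return f'parse @message "{glob}" as {", ".join(fields)}\n{tail}'


def run_query(logs, log_group, query, start, end, poll_seconds=1.0):
    """Start a Logs Insights query and poll until it reaches a final status."""
    query_id = logs.start_query(
        logGroupName=log_group, startTime=start, endTime=end, queryString=query
    )["queryId"]
    while True:
        response = logs.get_query_results(queryId=query_id)
        if response["status"] not in ("Scheduled", "Running"):
            break
        time.sleep(poll_seconds)
    if response["status"] != "Complete":
        raise RuntimeError(f"query ended with status {response['status']}")
    # Each result row is a list of {"field": ..., "value": ...} pairs.
    return [{c["field"]: c["value"] for c in row} for row in response["results"]]


# Possible usage with boto3 (not executed here; the log group name is made up):
#   import boto3
#   now = int(time.time())
#   rows = run_query(
#       boto3.client("logs"), "/hypothetical/cloudfront",
#       build_query("| stats avg(time_to_first_byte) by bin(60s)"), now - 3600, now,
#   )
```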

Using the default metrics in conjunction with the metrics emitted from your metric filters and the report data from your Contributor Insights rules, you can create an operational dashboard for your CloudFront distribution.

The dashboard displays graphs for edge status, requests by HTTP method, requests by URI, cache miss by URI, and more.

Figure 3: CloudFront Insights dashboard
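For readers who want to assemble a similar dashboard by hand, the following sketch builds a small dashboard body with two widgets based on the default CloudFront metrics. It is far simpler than the dashboard the template creates, and the widget layout and titles are arbitrary choices. CloudFront publishes its metrics in the us-east-1 Region with the Region dimension set to Global.

```python
import json


def dashboard_body(distribution_id: str) -> str:
    """Return a CloudWatch dashboard body with two CloudFront service-metric widgets."""
    def metric(name):
        # Namespace, metric name, then dimension name/value pairs.
        return ["AWS/CloudFront", name, "Region", "Global",
                "DistributionId", distribution_id]

    widgets = [
        {
            "type": "metric", "x": 0, "y": 0, "width": 12, "height": 6,
            "properties": {
                "title": "Requests",
                "metrics": [metric("Requests")],
                "stat": "Sum", "period": 300,
                "region": "us-east-1", "view": "timeSeries",
            },
        },
        {
            "type": "metric", "x": 12, "y": 0, "width": 12, "height": 6,
            "properties": {
                "title": "Error rates",
                "metrics": [metric("4xxErrorRate"), metric("5xxErrorRate")],
                "stat": "Average", "period": 300,
                "region": "us-east-1", "view": "timeSeries",
            },
        },
    ]
    return json.dumps({"widgets": widgets})


# Possible usage with boto3 (not executed here; the dashboard name is made up):
#   import boto3
#   boto3.client("cloudwatch").put_dashboard(
#       DashboardName="CloudFront-E123EXAMPLE",
#       DashboardBody=dashboard_body("E123EXAMPLE"),
#   )
```

Metrics from your metric filters can be added the same way by pointing a widget at the namespace and metric name you chose for them.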

Cost considerations

When evaluating costs, consider the volume of logs ingested into CloudWatch. A higher volume of logs impacts the cost for log ingestion, the number of matched log events in Contributor Insights, and the total amount of data queried in CloudWatch Logs Insights. For more information, see the CloudWatch pricing page.

Cleanup

To clean up this dashboard and associated rules, just delete the CloudFormation stack. For instructions, see Deleting a stack on the AWS CloudFormation console in the AWS CloudFormation User Guide.

Conclusion

In this post, I walked you through sending CloudFront standard logs to CloudWatch Logs through an S3 event notification and a Lambda function. I demonstrated some of the ways in which you can use Contributor Insights, CloudWatch Logs Insights, and metric filters to create powerful dashboards for your CloudFront distributions without needing a third-party partner product. You can extract many other data points from CloudFront logs using CloudWatch. The examples in this post are designed to help you get started. Try building out your own CloudWatch Logs Insights queries, Contributor Insights rules, metric filters, and dashboards to add value to your observability strategy for CloudFront.

About the authors

Bobby Hallahan

Bobby Hallahan is a Senior Specialist Solutions Architect on the AWS Observability team. He is passionate about helping customers find innovative solutions to difficult problems. He works with AWS customers to help them meet their observability goals. During his tenure at AWS, Bobby has supported enterprise customers with their mission-critical workloads.

Lucas Vieira Souza da Silva

Lucas Vieira Souza da Silva is a Solutions Architect at Amazon Web Services (AWS). He devotes time to diving deep into observability technologies, use cases, and open standards; automating telemetry of workloads running in the cloud through Infrastructure as Code; and connecting observability with business applications’ life cycles. Lucas is also passionate about designing functional, meaningful, scalable dashboards.