I want to build custom reports using the access logs from my Amazon CloudFront distribution. How can I process CloudFront logs with Amazon Elasticsearch Service (Amazon ES) so that I can create custom reports?

Follow these steps to process CloudFront logs using Amazon ES:

  1. Create an Amazon ES domain and an Amazon Simple Storage Service (Amazon S3) bucket in the same AWS Region.
  2. Configure your CloudFront distribution to store access logs in the Amazon S3 bucket.
  3. Configure an Amazon Elastic Compute Cloud (Amazon EC2) instance to use Logstash to process the CloudFront logs and push them to the Amazon ES domain.

Create an Amazon ES domain and an Amazon S3 bucket in the same AWS Region

To process CloudFront logs using Amazon ES, you must first set up both of these resources in the same AWS Region:

  1. An Amazon ES domain, which indexes the processed logs.
  2. An Amazon S3 bucket, which stores the CloudFront access logs.
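
If you prefer the AWS CLI over the console, a sketch similar to the following creates both resources. The bucket name, domain name, Region, and domain sizing here are illustrative placeholders; adjust them to your requirements:

# Create the S3 bucket that will receive CloudFront access logs
aws s3 mb s3://my-cloudfront-logs-bucket --region us-east-1

# Create a small Amazon ES domain in the same Region (sizing is illustrative)
aws es create-elasticsearch-domain \
    --domain-name my-es-domain \
    --elasticsearch-version 5.5 \
    --elasticsearch-cluster-config InstanceType=t2.small.elasticsearch,InstanceCount=1 \
    --ebs-options EBSEnabled=true,VolumeType=gp2,VolumeSize=10 \
    --region us-east-1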

Configure your CloudFront distribution to store access logs in the Amazon S3 bucket

  1. Open the Amazon CloudFront console.
  2. Select your CloudFront distribution, and then choose Distribution Settings.
  3. In the General view, choose Edit, and then do the following:
    For Logging, select On.
    For Bucket for Logs, select the S3 bucket that's in the same AWS Region as your Amazon ES domain.
    For Log Prefix, enter a prefix for the names of the logs.
  4. Choose Yes, Edit.

Note: It might take up to 24 hours for CloudFront to deliver access logs to the S3 bucket.
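
To confirm that logs are arriving, you can list the bucket contents with a command similar to the following (the bucket name and prefix are placeholders):

aws s3 ls s3://my-cloudfront-logs-bucket/myprefix/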

Configure an Amazon EC2 instance to use Logstash to process the CloudFront logs and then push them to the Amazon ES domain

1.    Launch an Amazon EC2 instance.
Note: This instance must use an AWS Identity and Access Management (IAM) role that has access to Amazon S3 (GET object) and Amazon ES (PUT document). For more information, see Creating IAM Roles.
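
As a rough sketch, an identity-based policy attached to that role could look like the following. The bucket name, Region, account ID, and domain name are placeholders, and the exact action list is an assumption to verify against your setup: s3:ListBucket is included because the Logstash S3 input lists the bucket to discover new log files, and es:ESHttpPost because Logstash writes documents through bulk requests.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::my-cloudfront-logs-bucket",
        "arn:aws:s3:::my-cloudfront-logs-bucket/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": ["es:ESHttpPut", "es:ESHttpPost"],
      "Resource": "arn:aws:es:us-east-1:111122223333:domain/my-es-domain/*"
    }
  ]
}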

2.    Connect to the instance using SSH.

3.    Install Java 8 on the instance, which is required to run Logstash. For more information, see Installing Logstash on the Elastic website.
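
On Amazon Linux, for example, you can install a Java 8 runtime with a command similar to:

sudo yum install -y java-1.8.0-openjdk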

4.    Run this command to download the Logstash client on the instance:

wget https://artifacts.elastic.co/downloads/logstash/logstash-5.5.0.tar.gz

5.    Run this command to extract the Logstash client:

tar xvf logstash-5.5.0.tar.gz

6.    Run this command to install the Logstash plugin for Amazon ES:

cd logstash-5.5.0
bin/logstash-plugin install logstash-output-amazon_es
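
To confirm that the plugin installed successfully, you can list the installed plugins and check for logstash-output-amazon_es:

bin/logstash-plugin list | grep amazon_es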

7.    Create a JSON-formatted file to serve as the template for the Logstash output. The template file can be similar to the following:
Note: Be sure to modify the template according to your reporting requirements.

cloudfront.template.json

{
  "template": "cloudfront-logs-*",
  "mappings": {
    "logs": {
      "_source": {
        "enabled": false
      },
      "_all": {
        "enabled": false
      },
      "dynamic_templates": [
        {
          "string_fields": {
            "mapping": {
              "index": "not_analyzed",
              "type": "string"
            },
            "match_mapping_type": "string",
            "match": "*"
          }
        }
      ],
      "properties": {
        "geoip": {
          "dynamic": true,
          "properties": {
            "ip": {
              "type": "ip"
            },
            "location": {
              "type": "geo_point"
            },
            "latitude": {
              "type": "float"
            },
            "longitude": {
              "type": "float"
            }
          }
        }
      }
    }
  }
}

8.    Create a Logstash configuration file to define the S3 bucket with CloudFront logs as the input, and the Amazon ES domain as the output. The configuration file can be similar to the following:

cloudfront.conf

input {
  s3 {
    bucket => "{CLOUDFRONT_LOG_BUCKET}"
    prefix => "{CLOUDFRONT_LOG_KEY_PREFIX}"
    region => "{BUCKET_REGION_NAME}"
  }
}


filter {
  grok {
    match => { "message" => "%{DATE_EU:date}\t%{TIME:time}\t%{WORD:x_edge_location}\t(?:%{NUMBER:sc_bytes:int}|-)\t%{IPORHOST:c_ip}\t%{WORD:cs_method}\t%{HOSTNAME:cs_host}\t%{NOTSPACE:cs_uri_stem}\t%{NUMBER:sc_status:int}\t%{GREEDYDATA:referrer}\t%{GREEDYDATA:User_Agent}\t%{GREEDYDATA:cs_uri_query}\t%{GREEDYDATA:cookies}\t%{WORD:x_edge_result_type}\t%{NOTSPACE:x_edge_request_id}\t%{HOSTNAME:x_host_header}\t%{URIPROTO:cs_protocol}\t%{INT:cs_bytes:int}\t%{GREEDYDATA:time_taken}\t%{GREEDYDATA:x_forwarded_for}\t%{GREEDYDATA:ssl_protocol}\t%{GREEDYDATA:ssl_cipher}\t%{GREEDYDATA:x_edge_response_result_type}" }
  }

  mutate {
    add_field => [ "listener_timestamp", "%{date} %{time}" ]
  }

  date {
    match => [ "listener_timestamp", "yy-MM-dd HH:mm:ss" ]
    target => "@timestamp"
  }

  geoip {
    source => "c_ip"
  }

  useragent {
    source => "User_Agent"
    target => "useragent"
  }

  mutate {
    remove_field => ["date", "time", "listener_timestamp", "cloudfront_version", "message", "cloudfront_fields", "User_Agent"]
  }
}

output {
  amazon_es {
    hosts => ["{AMAZON_ES_DOMAIN_ENDPOINT}"]
    region => "{AMAZON_ES_DOMAIN_REGION_NAME}"
    index => "cloudfront-logs-%{+YYYY.MM.dd}"
    template => "/path-to-file/cloudfront.template.json"
  }
}

9.    Use a text editor such as vi to edit the following values in your configuration file:
For bucket, enter the name of the S3 bucket that stores the CloudFront logs.
For prefix, enter the prefix that you specified as the Log Prefix when you enabled logging on your CloudFront distribution.
For region, enter the AWS Region of the S3 bucket and Amazon ES domain.
For hosts, enter the endpoint of your Amazon ES domain.
For template, enter the path to the template file that you created.

10.    Save the changes that you made to the configuration file.
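
Optionally, you can check the configuration syntax without starting the pipeline (the path is a placeholder, matching the one used in the template setting above):

bin/logstash -f /path-to-file/cloudfront.conf --config.test_and_exit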

11.    Run Logstash with the -f option, and specify the configuration file that you created. For more information, see Command-Line Flags on the Elastic website.
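
For example, from the logstash-5.5.0 directory:

bin/logstash -f /path-to-file/cloudfront.conf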

After you complete these steps, Logstash publishes documents to the Amazon ES domain that you specified. To check that documents are published successfully, open your Amazon ES domain from the Amazon ES console, and then check the Indices view.
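
You can also query the domain directly for the daily indices, assuming your access policy allows requests from your client (the endpoint is a placeholder):

curl -s "https://AMAZON_ES_DOMAIN_ENDPOINT/_cat/indices/cloudfront-logs-*?v"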

You can now use Kibana to create custom reports and visualizations for your logs. For more information, see Kibana and Logstash.

Note: You might need to configure an access policy to be sure that Kibana can access the logs stored in your Amazon ES domain.


Published: 2018-01-09

Updated: 2019-02-12