A comprehensive log management and analysis strategy is mission critical, enabling organizations to understand the relationship between operational, security, and change management events and maintain a comprehensive understanding of their infrastructure. AWS customers have access to service-specific metrics and log files to gain insight into how each AWS service is operating, and many services capture additional data, such as API calls, configuration changes, and billing events. Log files from web servers, applications, and operating systems also provide valuable data, though in different formats, and in a random and distributed fashion. To effectively consolidate, manage, and analyze these different logs, many AWS customers choose to implement centralized logging solutions using either self-managed tools or AWS Partner Network (APN) offerings. These solutions provide a streamlined view of application, system, and AWS log information in the pursuit of operational excellence.

This webpage provides high-level best practices for log management as well as information and considerations for selecting a centralized logging solution using AWS services or third-party offerings. It also introduces an AWS solution for centralized logging and data visualization using AWS managed services.

The following sections assume basic knowledge of Amazon Elastic Compute Cloud (Amazon EC2), Amazon Simple Storage Service (Amazon S3), Amazon CloudWatch (CloudWatch), Amazon Elasticsearch Service (Amazon ES), as well as a general understanding of application and system logging.

  • Solution Brief

    When planning a centralized log management strategy, first identify business and compliance requirements, such as log monitoring and review processes, access control granularity, and log aggregation, alerting, reporting, and retention requirements. Look for a solution that is scalable and that can support new log types as you expand your use of AWS services and cloud technologies. Consider these additional best practices for implementing a log management solution:

    • Define log retention requirements and lifecycle policies early on, and plan to move log files to cost-efficient storage locations as soon as practical.
    • Incorporate tools and features that automate the enforcement of lifecycle policies. For example, Amazon Simple Storage Service (Amazon S3) is a cost-efficient log-storage location that includes built-in lifecycle capabilities, enabling customers to automatically retire logs to less expensive storage tiers (Amazon S3 Standard - Infrequent Access, Amazon Glacier) as necessary.
    • Before implementing a custom-built solution, such as an open-source ELK stack running Elasticsearch, Logstash, and Kibana on local servers, consider the additional tasks, costs, and dependencies associated with managing and maintaining its components. Custom architectures might offer design flexibility, but managed services and tools can significantly reduce operational complexity.
    • Automate the installation and configuration of log shipping agents to consistently capture system and application logs and support dynamic scaling of Amazon EC2 instances. Use Amazon EC2 user data scripts or configuration management software to perform these tasks. Alternatively, include the agent as part of the Amazon Machine Image (AMI).
    • Organizations operating hybrid architectures should choose a solution that integrates with both on-premises and AWS workloads. Whether you choose to consolidate AWS and on-premises logs or to manage them separately, implement a log management solution that provides the visibility you require across all operating environments.

    The AWS Cloud provides flexible infrastructure and tools to support both sophisticated partner offerings and self-managed centralized-logging solutions. In general, the scale of the services to be monitored as well as an organization’s experience, budget, business requirements, and preferences for completeness and polish will determine which approach is most appropriate. The following sections describe some third-party products for log management as well as the key AWS services and open-source technologies to include in a self-managed centralized logging solution.

    The AWS Partner Network offers a variety of comprehensive log-management solutions that help organizations of any size or stage of development manage, analyze, retain, and archive logs. When selecting a third-party product, look for a solution that is easy to configure, includes a flexible method for data ingestion, and incorporates functionality for event searching, monitoring, alerting, and data visualization (for example, real-time data dashboards). This approach may be appropriate for customers in the following situations:

    • They have an existing partner tool in place for managing on-premises logs and want to extend their solution to incorporate logs from cloud resources. Organizations who already leverage popular partner technology can easily consolidate on-premises and cloud log data to visualize hybrid application performance in a single user interface.
    • They have advanced alerting or reporting requirements but do not have the dedicated development and system administration resources to create or manage this capability.
    • They have encryption, user management, security, or scalability requirements that Amazon ES and Kibana cannot presently meet.
      For example, some partner solutions are tailored to satisfy security requirements such as log-data encryption at rest or network isolation to a single VPC. These solutions can be the most efficient option for customers who manage protected personal or health information and who must comply with PCI and HIPAA standards.

    See the Partner Offerings tab for a list of popular partner products.

    Many customers choose to build their own centralized logging solution using AWS managed services. This can be a cost-effective and scalable way to help organizations meet their log-management needs. A serverless design can further reduce the overhead associated with managing individual solution components. This section introduces AWS services commonly used in log-management architectures.

    See the AWS Solution tab for a prescriptive centralized logging solution that customers can deploy in minutes using AWS CloudFormation. This automated solution uses native AWS services and open-source tools to capture, consolidate, and visualize log data.

    Elasticsearch is a popular open-source search and analytics engine from Elastic that provides a quick time to value and is well supported by a vibrant open-source community. AWS offers Amazon Elasticsearch Service (Amazon ES) as a managed service that makes it easy to deploy and operate Elasticsearch in the AWS Cloud. Amazon ES manages the capacity, scaling, patching, and administration of Elasticsearch clusters while providing direct access to the Elasticsearch API. The service is integrated with CloudWatch Logs, so there are no additional requirements to write code for movement or transformation of log data.

    Amazon ES provides integrated and managed access to Kibana, a data visualization plugin for Elasticsearch. Customers can create a variety of Kibana charts and dashboards for large volumes of data, and can load and use dashboards developed by the Kibana and AWS user communities. Note that Kibana does not provide native access control and must be secured with an additional mechanism, such as an Nginx web proxy (see the AWS Security Blog for detailed information). A third-party log management and visualization tool might be more appropriate for customers who cannot work within these limitations.

    Amazon CloudWatch Logs enables customers to monitor, store, and access log files from Amazon EC2 instances, AWS CloudTrail, and other sources. Customers can retrieve log data from CloudWatch Logs using the Amazon CloudWatch console, the CloudWatch Logs commands in the AWS CLI, the CloudWatch Logs API, or the CloudWatch Logs SDK. The CloudWatch Logs agent can be easily installed and configured on Linux and Windows instances to send application and system log files to CloudWatch. It is best practice to use EC2 roles to grant the CloudWatch Logs agent the necessary permissions.
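    As an illustration of the role-based permissions mentioned above, the following sketch builds an IAM policy document that could be attached to an EC2 instance role used by a log-shipping agent. The function name and the broad default resource pattern are assumptions made for this example; in practice, scope the resource to the specific log groups the agent writes to.

```python
import json


def build_agent_policy(log_resource_arn: str = "arn:aws:logs:*:*:*") -> dict:
    """Return an IAM policy document for a log-shipping agent role.

    The default resource pattern is deliberately broad for illustration;
    narrow it to specific log groups where possible.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                    "logs:DescribeLogStreams",
                ],
                "Resource": [log_resource_arn],
            }
        ],
    }


# Serialize the policy so it can be pasted into the IAM console or a template.
policy_json = json.dumps(build_agent_policy(), indent=2)
print(policy_json)
```

    Attaching the policy to a role, rather than storing access keys on the instance, means the agent picks up temporary credentials automatically.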

    Amazon CloudWatch can also collect detailed system performance metrics from EC2 instances and provide those metrics to dashboards and API consumers such as Amazon Simple Notification Service and Auto Scaling triggers.

    Customers can subscribe to real-time CloudWatch Logs event feeds which they can either process themselves with Amazon Kinesis and AWS Lambda, or deliver directly to Amazon ES using an AWS-provided Lambda function that connects CloudWatch Logs to Amazon ES (see Real-time Processing of Log Data with Subscriptions in the Amazon CloudWatch Logs User Guide).
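    For customers who process subscription feeds themselves, the sketch below shows how a Lambda function might unpack the payload it receives. CloudWatch Logs delivers subscription data to Lambda as a base64-encoded, gzip-compressed JSON document under the "awslogs" key. The helper names and the sample values (account ID, log group, messages) are hypothetical and exist only to exercise the decoder locally.

```python
import base64
import gzip
import json


def decode_subscription_event(event: dict) -> dict:
    """Decode the base64-encoded, gzip-compressed subscription payload."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def extract_messages(payload: dict) -> list:
    """Return raw log messages, ignoring control (non-data) messages."""
    if payload.get("messageType") != "DATA_MESSAGE":
        return []
    return [log_event["message"] for log_event in payload.get("logEvents", [])]


# Hypothetical sample payload, encoded the same way the service encodes it.
sample_payload = {
    "messageType": "DATA_MESSAGE",
    "owner": "123456789012",
    "logGroup": "example-log-group",
    "logStream": "example-log-stream",
    "subscriptionFilters": ["example-filter"],
    "logEvents": [
        {"id": "1", "timestamp": 1500000000000, "message": "GET /index.html 200"},
        {"id": "2", "timestamp": 1500000001000, "message": "GET /missing 404"},
    ],
}
encoded = base64.b64encode(gzip.compress(json.dumps(sample_payload).encode("utf-8")))
sample_event = {"awslogs": {"data": encoded.decode("ascii")}}

decoded = decode_subscription_event(sample_event)
messages = extract_messages(decoded)
print(messages)
```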

    Customers who have large amounts of log data to process can use Amazon Kinesis Firehose as a serverless log ingestion and delivery mechanism. Amazon Kinesis Firehose is a managed service that enables customers to deliver real-time streaming data to destinations such as Amazon ES, Amazon S3, and Amazon Redshift. Firehose is designed to handle large amounts of incoming data and can generate bulk indexing requests to an Amazon ES domain.

    Unlike self-managed log processing components, such as a Logstash cluster, Firehose does not require any servers, applications, or resource management. Customers configure individual data producers to send log data to a Firehose delivery stream continuously, and Firehose manages the rest.
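    A data producer typically groups log lines into batches before sending them to a delivery stream. The sketch below shows one way to do that grouping. The limits used here (500 records and 4 MiB per batch) reflect the documented batch limits as we understand them and should be verified against the current service quotas; the function name is an assumption for this example.

```python
from typing import Iterable, Iterator, List

# Assumed per-call limits; verify against current service quotas.
MAX_RECORDS_PER_BATCH = 500
MAX_BYTES_PER_BATCH = 4 * 1024 * 1024


def batch_records(
    lines: Iterable[str],
    max_records: int = MAX_RECORDS_PER_BATCH,
    max_bytes: int = MAX_BYTES_PER_BATCH,
) -> Iterator[List[dict]]:
    """Group log lines into lists of {"Data": bytes} records.

    A new batch starts whenever adding the next record would exceed
    either the record-count limit or the byte-size limit.
    """
    batch: List[dict] = []
    batch_bytes = 0
    for line in lines:
        # Newline-terminate each record so lines stay separable downstream.
        data = (line.rstrip("\n") + "\n").encode("utf-8")
        if batch and (len(batch) >= max_records or batch_bytes + len(data) > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append({"Data": data})
        batch_bytes += len(data)
    if batch:
        yield batch


batches = list(batch_records(f"log line {i}" for i in range(1200)))
print([len(b) for b in batches])
```

    Each batch has the shape expected by the Records parameter of the put_record_batch call in the AWS SDK for Python. A production producer would also inspect the response for partially failed records and retry them.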

    Many organizations choose to export log data from CloudWatch Logs to Amazon S3. Amazon S3 offers customers a durable, highly scalable location to store log data and to consolidate log files for custom processing and analysis. Amazon S3 is the best choice for long-term retention and archiving of log data, especially for organizations with compliance programs that require log data to be auditable in its native format.

    Once log data is in an Amazon S3 bucket, define lifecycle rules to automatically enforce retention policies and move these objects to other, cost-effective storage classes, such as Amazon S3 Standard - Infrequent Access (Standard - IA) or Amazon Glacier.
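    The lifecycle rules described above can be expressed as a configuration document. The following sketch builds one that moves objects under a prefix to Standard - IA, then to Amazon Glacier, and finally expires them. The rule ID, prefix, and day counts (30, 90, 365) are hypothetical values chosen for illustration, not recommendations; set them from your own retention requirements.

```python
def build_log_lifecycle(
    prefix: str,
    ia_after_days: int = 30,
    glacier_after_days: int = 90,
    expire_after_days: int = 365,
) -> dict:
    """Build an S3 lifecycle configuration for log objects under a prefix."""
    if not ia_after_days < glacier_after_days < expire_after_days:
        raise ValueError("transition and expiration days must be strictly increasing")
    return {
        "Rules": [
            {
                "ID": "log-retention",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


lifecycle = build_log_lifecycle("logs/")
print(lifecycle["Rules"][0]["Transitions"])
```

    The resulting dictionary has the shape accepted by the LifecycleConfiguration parameter of the S3 put_bucket_lifecycle_configuration call in the AWS SDK for Python.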

  • AWS Solution

    AWS offers a centralized logging solution for collecting, analyzing, and displaying logs on AWS. The solution uses Amazon Elasticsearch Service (Amazon ES), a managed service that simplifies the deployment, operation, and scaling of Elasticsearch clusters in the AWS Cloud, as well as Kibana, an analytics and visualization platform that is integrated with Amazon ES. In combination with other AWS managed services, this solution offers customers a highly available, turnkey environment to begin logging and analyzing their AWS environment and applications.

    The diagram below presents the centralized logging architecture you can automatically deploy using the solution's implementation guide and accompanying AWS CloudFormation template.

    1. The solution deploys an Amazon ES domain, and also launches three Amazon EC2 instances in two separate Availability Zones of an Amazon Virtual Private Cloud (Amazon VPC) network.
    2. Elastic Load Balancing and automatic recovery support two instances with an Nginx proxy, which is used as an additional layer of authentication to restrict access to the Amazon ES domain dashboard.
    3. A custom AWS Lambda function is deployed to load log data from Amazon CloudWatch to an Amazon ES domain, configured with a set of default Kibana dashboards as a starting point for data visualization.
    4. Users connecting from approved IP addresses can access the Kibana UI using customer-defined credentials, and start working with its search, visualization, and reporting capabilities to manipulate data from your domain.
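    To make step 3 more concrete, the sketch below shows how a function might turn decoded CloudWatch Logs events into the newline-delimited body of an Elasticsearch bulk indexing request. This is an illustration under stated assumptions, not the code the solution deploys: the function name, the "cwl" index prefix, the daily index naming, and the "@"-prefixed field names are all hypothetical. Older Elasticsearch versions also expect a document type, which can be supplied in the request path.

```python
import json
from datetime import datetime, timezone


def to_bulk_body(payload: dict, index_prefix: str = "cwl") -> str:
    """Convert decoded log events into a bulk-request body.

    Each event becomes two lines: an action line naming the target index
    and document ID, followed by the document itself.
    """
    lines = []
    for event in payload["logEvents"]:
        # Event timestamps are milliseconds since the Unix epoch.
        ts = datetime.fromtimestamp(event["timestamp"] / 1000, tz=timezone.utc)
        action = {"index": {"_index": f"{index_prefix}-{ts:%Y.%m.%d}", "_id": event["id"]}}
        document = {
            "@timestamp": ts.isoformat(),
            "@log_group": payload["logGroup"],
            "@log_stream": payload["logStream"],
            "@message": event["message"],
        }
        lines.append(json.dumps(action))
        lines.append(json.dumps(document))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n" if lines else ""


# Hypothetical decoded payload with a single event.
example_payload = {
    "logGroup": "example-log-group",
    "logStream": "example-log-stream",
    "logEvents": [
        {"id": "1", "timestamp": 1500000000000, "message": "GET /index.html 200"},
    ],
}
bulk_body = to_bulk_body(example_payload)
print(bulk_body)
```

    Using the event ID as the document ID makes redelivered events overwrite earlier copies instead of creating duplicates.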
    Deploy Solution
    Implementation Guide

    What you'll accomplish:

    Deploy a centralized logging solution using AWS CloudFormation. The CloudFormation template will automatically launch and configure the components necessary to upload log files to Amazon ES for analysis and visualization in a customizable, user-friendly dashboard.

    Extend your logging capabilities beyond default AWS service logs. This flexible solution includes examples for capturing host-level log files and VPC flow logs, and is designed to scale with your growing business.

    Control access to your dashboards using an Nginx proxy to simplify authentication to Amazon ES, as well as user credentials for an extra layer of protection.

    Simplify data visualization using built-in Amazon ES support for Kibana, including a default set of preconfigured dashboards that give you a first glimpse into the customization capabilities of Kibana 4.

    What you'll need before starting:

    An AWS account: You will need an AWS account to begin provisioning resources. Sign up for AWS.

    Skill level: This solution is intended for IT infrastructure and networking professionals who have practical experience architecting on the AWS Cloud.

    Q: What log sources does this solution work with?

    This solution provides example source logs from Apache web servers, VPC Flow Logs, and AWS CloudTrail. It includes an example Apache web server with the Amazon CloudWatch Logs agent installed to demonstrate how to deliver system or application log data from Amazon EC2 instances running Linux or Windows Server to this solution.

    Q: Which log formats does this solution support?

    Amazon VPC Flow Logs, AWS CloudTrail, AWS Lambda, Common Log Format, Space Delimited, JSON, Apache web server logs, and other (user defined).
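    As an example of what working with one of these formats involves, the sketch below parses a line in Common Log Format into named fields. The regular expression and function name are assumptions written for this illustration and are not taken from the solution itself; the sample line is the well-known example from the Apache HTTP Server documentation.

```python
import re
from typing import Optional

# host ident authuser [timestamp] "request" status size
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf(line: str) -> Optional[dict]:
    """Parse one Common Log Format line, or return None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # A size of "-" means no response body was sent.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


sample = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'
parsed = parse_clf(sample)
print(parsed)
```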

    Q: Can I deploy the centralized logging solution in any AWS Region?

    This solution uses AWS Lambda, Amazon CloudWatch Events, and Amazon ES, which are currently available in specific AWS Regions only. Therefore, you must deploy this solution in an AWS Region where all three services are available (see AWS service offerings by region).

    The Amazon ES domain that this solution creates can accept log data from other AWS Regions, so customers can incorporate this solution into a larger multi-region logging strategy. However, an alternative delivery mechanism is required in AWS Regions where Amazon CloudWatch Events is not available, because the solution uses this service to trigger the custom AWS Lambda function.

  • Partner Offerings

    The AWS Partner Network (APN) offers a variety of comprehensive log-management solutions for organizations of any size or stage of development. Explore the AWS Marketplace for a comprehensive list of partner offerings, including popular options from the following partners.

    Splunk

    Splunk software enables large and small organizations to search, monitor, analyze and visualize data coming from websites, applications, servers, networks, sensors and mobile devices.

    Sumo Logic

    Sumo Logic Log Management and Analytics enables enterprises to collect, manage, and analyze log data in order to improve their application and infrastructure management and monitoring.

    Datadog

    Datadog is a monitoring service that turns metric and event data produced by applications, tools, and services into actionable insight.

    Elastic

    From the company behind Elasticsearch, Logstash, and Kibana, Elastic Cloud is a fully managed Elasticsearch solution that runs in the AWS Cloud, and includes advanced security features, monitoring, and alerting.

    Loggly

    Loggly is a cloud-based log management solution designed for a number of operational use cases, including production issue troubleshooting, monitoring and alerting, and more. It offers comprehensive search, filtering, graphing, and analysis capabilities.
