AWS Big Data Blog

Getting started with Trace Analytics in Amazon Elasticsearch Service

Updated March 25, 2021. See the release notes below for more details.

Trace Analytics is now available for Amazon Elasticsearch Service (Amazon ES) domains running versions 7.9 or later. Developers and IT Ops teams can use this feature to troubleshoot performance and availability issues in their distributed applications. It provides end-to-end insights that are not possible with traditional methods of collecting logs and metrics from each component and service individually.

This feature provides a mechanism to ingest OpenTelemetry-standard trace data to be visualized and explored in Kibana. Trace Analytics introduces two new components that fit into the OpenTelemetry and Amazon ES ecosystems:

  • Data Prepper – A server-side application that collects telemetry data and transforms it for Amazon ES.
  • Trace Analytics Kibana plugin – A plugin that provides at-a-glance visibility into your application performance and the ability to drill down on individual traces. The plugin relies on trace data collected and transformed by Data Prepper.

The following diagram illustrates a component overview.

Here is a component overview:

Applications are instrumented with OpenTelemetry instrumentation, which emit trace data to OpenTelemetry Collectors. Collectors can be run as agents on Amazon Elastic Compute Cloud (Amazon EC2) as sidecars for Amazon ECS, or as sidecars or DaemonSets for Amazon Elastic Kubernetes Service (Amazon EKS). They are configured to export traces to Data Prepper, which transforms the data and writes it to Amazon ES. The Trace Analytics Kibana plugin can then be used to visualize and detect problems in your distributed applications.

OpenTelemetry is a Cloud Native Computing Foundation (CNCF) project that aims to define an open standard for the collection of telemetry data. Using an OpenTelemetry Collector in your service environment allows you to ingest trace data from other projects like Jaeger, Zipkin, and more. As of version 0.7.1, Data Prepper is currently an alpha release. It is a monolithic, vertically scaling component. Work on the next version is underway. It will support more features, including horizontal scaling.

In this post, we cover the following topics:

  • Launching Data Prepper to send trace data to your Amazon ES domain.
  • Configuring an OpenTelemetry Collector to send trace data to Data Prepper.
  • Exploring the Kibana Trace Analytics plugin using a sample application.

Prerequisites

To get started, you need:

Deploy to Amazon EC2 with AWS CloudFormation

Use the CloudFormation template to deploy Data Prepper to Amazon EC2.

  1. On the AWS CloudFormation console, choose Create stack.
  2. In Specify template, choose Upload a template file, and then upload the CloudFormation template.
  3. All fields on the Specify stack detailspage are required. Although you can use the defaults for most fields, enter your values for the following:
    • AmazonEsEndpoint
    • AmazonEsRegion
    • AmazonEsSubnetId (if your Amazon ES domain is in a VPC)
    • IAMRole
    • KeyName

The InstanceType parameter allows you to specify the size of the EC2 instance that will be created. For recommendations on instance sizing by workload, see Right Sizing: Provisioning Instances to Match Workloads, and the Scaling and Tuning guide of the Data Prepper repository.

It should take about 3 minutes to provision the stack. Data Prepper starts during the CloudFormation stack deployment. To view output logs, use SSH to connect to the EC2 host and then inspect the /var/log/data-prepper.out file.

Configure OpenTelemetry Collector

Now that Data Prepper is running on an EC2 instance, you can send trace data to it by running an OpenTelemetry Collector in your service environment. For information about installation, see Getting Started in the OpenTelemetry documentation. Make sure that the Collector is configured with an exporter that points to the address of the Data Prepper host. The following otel-collector-config.yaml example receives data from various sources and exports it to Data Prepper:

receivers:
  jaeger:
    protocols:
      grpc:
  otlp:
    protocols:
      grpc:
  zipkin:

exporters:
  otlp/data-prepper:
    endpoint: <data-prepper-address>:21890
    insecure: true

service:
  pipelines:
    traces:
      receivers: [jaeger, otlp, zipkin]
      exporters: [otlp/data-prepper]

Be sure to allow traffic to port 21890 on the EC2 instance. You can do this by adding an inbound rule to the instance’s security group.

Explore the Trace Analytics Kibana plugin by using a sample application

If you don’t have an OpenTelemetry Collector running and would like to send sample data to your Data Prepper instance to try out the trace analytics dashboard, you can quickly set up an instance of the Jaeger Hot R.O.D. application on the EC2 instance with Docker Compose. Our setup script creates three containers on the EC2 instance:

  • Jaeger Hot R.O.D. – The example application to generate trace data
  • Jaeger Agent – A network daemon that batches trace spans and sends them to the Collector
  • OpenTelemetry Collector – A vendor-agnostic executable capable of receiving, processing, and exporting telemetry data

Although your application, the OpenTelemetry Collectors, and Data Prepper instances typically wouldn’t reside on the same host in a real production environment, for simplicity and cost, we use one EC2 instance.

To start the sample application, complete the following steps:

  1. Use SSH to connect to the EC2 instance using the private key specified in the CloudFormation stack.
    1. When connecting, add a tunnel to port 8080 (the Hot R.O.D. container accepts connections from localhost only). You can do this by adding -L 8080:localhost:8080 to your SSH command.
  2. Download the setup script by running the following code:
    wget https://raw.githubusercontent.com/opendistro-for-elasticsearch/data-prepper/master/examples/aws/jaeger-hotrod-on-ec2/setup-jaeger-hotrod.sh
  3. Run the script with sh setup-jaeger-hotrod.sh.
  4. Visit http://localhost:8080/ to access the Hot R.O.D. dashboard and start sending trace data.
    Figure 2: Hot R.O.D. Rides on Demand
  5. After you generate sample data with the Hot R.O.D. application, navigate to your Kibana dashboard and from the left navigation pane and choose Trace Analytics.

The Dashboard view groups traces together by HTTP method and path so that you can see the average latency, error rate, and trends associated with an operation.

Figure 3: Dashboard page

  1. For a more focused view, choose Traces to drill down into a specific trace.
    Figure 4: Traces page
  2. Choose Services to view all services in the application and an interactive map that shows how the various services connect to each other.
    Figure 5: Services page

March 25, 2020 update

Data Prepper version 0.8.0-beta has been released and provides new features not covered in this post. Some highlights include:

  • Horizontally-scaling clusters can now be deployed by using the new Peer Forwarder plugin. Refer to the GitHub repository for new container-based deployment strategies.
  • Prometheus-friendly metrics can now be scraped via a new /metrics endpoint.

See the project’s Releases page for more information about the 0.8.0-beta release and future releases.

Conclusion

Trace Analytics adds to the existing log analytics capabilities of Amazon ES, enabling developers to isolate sources of performance problems and diagnose root causes in their distributed applications. We encourage you to start sending your trace data to Amazon ES so you can benefit from Trace Analytics today.


About the authors

Jeff Wright is a Software Development Engineer at Amazon Web Services where he works on the Search Services team. His interests are designing and building robust, scalable distributed applications. Jeff is a contributor to Open Distro for Elasticsearch.

 

 

 

Kowshik Nagarajaan is a Software Development Engineer at Amazon Web Services where he works on the Search Services team. His interests are building and automating distributed analytics applications. Kowshik is a contributor to Open Distro for Elasticsearch.

 

 

 

Anush Krishnamurthy is an Engineering Manager working on the Search Services team at Amazon Web Services.