Building a reliable metrics pipeline with the OpenTelemetry Collector for AWS Managed Service for Prometheus
In this blog post, AWS intern engineers Aman Brar and Jason Liu describe their experience working with the OpenTelemetry Collector and the Prometheus Remote Write Exporter. They explain how they tackled the challenges they faced and how they applied the lessons learned to ensure the reliability of the AWS Distro for OpenTelemetry Collector as the de facto agent for sending metrics to AWS Managed Service for Prometheus.
As software grows more complex, understanding the state of our applications and infrastructure becomes increasingly important. Observability allows us to do just that. While monitoring enables us to understand the overall health of our systems, observability empowers us with detailed insights into the behavior of our systems through distributed traces, metrics, and logs. The recent surge of interest in monitoring and observability has led to the development of numerous technologies and standards within the industry.
One such technology is Prometheus, an open source monitoring and alerting toolkit focused on metrics data. The project is part of the Cloud Native Computing Foundation (CNCF). More about Prometheus can be found on the Prometheus website.
Since its launch, the Prometheus exposition format has been widely adopted, becoming the de facto standard for monitoring Kubernetes. AWS recently announced its Managed Service for Prometheus (AMP) to provide customers with a ready-made, scalable, and secure solution to help them leverage the power of Prometheus.
Prometheus does so much on its own, but it often binds developers into its framework. With the rise of many different APM vendors, developers want flexibility in choosing vendor backends to visualize their metrics, such as Grafana, Amazon CloudWatch, and others. This flexibility also drives a need for unity within the realm of observability. The OpenTelemetry project aims to provide that support by defining a new open standard, the OpenTelemetry Protocol (OTLP), for collecting traces, metrics, and logs.
The OpenTelemetry Collector is a vendor-agnostic implementation to receive, process, and export telemetry data. As Prometheus continues to be a prominent player in observability and OpenTelemetry grows as the open standard telemetry protocol, compatibility between the OpenTelemetry Collector and Prometheus is vital. To ensure protocol compatibility, AWS has taken a lead role in developing and maintaining the OpenTelemetry Collector’s Prometheus Receiver and Prometheus Remote Write Exporter.
We are excited to release key features in the AWS Distro for OpenTelemetry (ADOT), the secure, production-ready, AWS-supported distribution of OpenTelemetry, and an end-to-end solution for Prometheus pipelines now exists in ADOT. ADOT follows the key tenets of reliability, security, and scalability. This means AWS teams ensure that the AWS Distro for OpenTelemetry components, including the Collector, various language SDKs, and Prometheus exporters, are production-ready for sending metrics to AMP.
To make the ADOT Collector production-ready as the de facto agent for sending metrics to AMP, we needed to ensure that it had full compatibility with Prometheus while meeting the AWS design tenets of operational excellence, security, scalability, and reliability.
Full compatibility with Prometheus
As with any project, we needed to understand the requirements of the ADOT Collector-AMP pipeline and what full compatibility would mean for us. Because the goal of the ADOT Collector is to act as a drop-in replacement for the Prometheus server, we wanted to ensure parity between a regular Prometheus workflow and the ADOT Collector-AMP pipeline. The end metrics produced by a Prometheus server should be the same as the metrics the ADOT Collector-AMP pipeline produces. Moreover, in keeping with the goal of the OpenTelemetry Collector, we needed to ensure that the Prometheus metrics were being successfully converted to the OTLP data format. Thus, we needed to preserve Prometheus’ three main features of service discovery, metric scraping, and relabeling, and also perform a successful conversion of Prometheus’ four main metric types: counters, gauges, histograms, and summaries.
In our testing, we discovered that the ADOT Collector-AMP pipeline was able to successfully perform service discovery within our Kubernetes and Amazon Elastic Kubernetes Service (Amazon EKS) clusters. We tested each of the five Kubernetes objects that Prometheus can be configured to scrape: Services, Pods, Nodes, Endpoints, and Ingresses. Given the proper RBAC permissions, the ADOT Collector performed service discovery successfully.
Next, we needed to test metric scraping and relabeling. Using a sample Prometheus metric generation app, we verified that ADOT handled counters, gauges, and histograms as expected; however, summary metrics were being completely dropped along the pipeline. Diving deeper, we found that OpenTelemetry had removed support for the summary metric from its specification because certain elements of the OTLP protocol were still under discussion. Because those issues had since been resolved, we filed new issues and attended Collector and Metric SIG meetings to discuss our use case and the importance of the summary metric. After several rounds of discussion, we added summary metric support to the Prometheus Receiver, Prometheus Remote Write Exporter, and Logging Exporter. With those changes merged upstream, we were able to fully verify that all four metric types were being scraped and converted correctly.
ADOT as an AWS Distribution
AWS customers should not have to worry about the security of their data when working with AWS services. To ensure security, AMP requires incoming requests to be both encrypted with HTTPS and signed with AWS Signature Version 4. Allowing only HTTPS requests ensures that data is encrypted in transit, and the Signature Version 4 signature ensures that each request is signed with AWS credentials that have permission to communicate with AMP. The AWS Prometheus Remote Write Exporter ensures that all outgoing requests meet both conditions, providing a secure export to AMP.
Moreover, it is important that AWS customers never miss any metrics, so we have included functionality that allows for Cortex HA deduplication. This means customers can run multiple redundant replicas of their ADOT Collector, so that if any one of them fails, another is there to pick up where it left off and ensure that no metrics are missed. This is configured through High Availability (HA) labels. Customers set environment variables in their ADOT Collector deployment pod that represent a cluster and a replica. For each cluster, only one replica’s data is kept, so we recommend using the pod name as the replica. Customers then specify the cluster and __replica__ labels as external labels in the AWS Prometheus Remote Write Exporter. With this configuration, AMP applies the Cortex HA deduplication logic and ensures that no duplicate metrics are kept while multiple redundant replicas are running.
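As a rough illustration of the deduplication idea, and not Cortex's actual implementation, the sketch below accepts samples only from the first replica seen for each cluster and drops the __replica__ label before keeping them. The function name and the sample layout are hypothetical, chosen only for this example. Real Cortex also fails over to another replica when the elected one stops sending, which is omitted here.

```python
def dedupe_ha_samples(samples):
    """Keep one replica's samples per cluster (simplified HA dedup sketch).

    Each sample is a dict with a "labels" dict and a "value". This layout
    is hypothetical and used only for illustration.
    """
    elected = {}   # cluster name -> replica whose data is accepted
    accepted = []
    for sample in samples:
        labels = sample["labels"]
        cluster = labels.get("cluster")
        replica = labels.get("__replica__")
        if cluster is None or replica is None:
            # No HA labels present: pass the sample through untouched.
            accepted.append(sample)
            continue
        # The first replica seen for a cluster becomes the elected one.
        leader = elected.setdefault(cluster, replica)
        if replica != leader:
            continue  # duplicate from a redundant replica
        # Drop the replica label so the series looks the same from any replica.
        kept = {k: v for k, v in labels.items() if k != "__replica__"}
        accepted.append({"labels": kept, "value": sample["value"]})
    return accepted
```

With two replicas named after their pods and reporting the same series for one cluster, only the samples from the first replica seen are kept.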
The two main components of the OpenTelemetry Collector in the ADOT Collector-AMP pipeline are the Prometheus Receiver and the Prometheus Remote Write Exporter.
Prometheus Receiver
The Prometheus Receiver is one type of receiver implemented by the OpenTelemetry Collector. Using mechanisms similar to those of the Prometheus server, the Prometheus Receiver performs service discovery, metric scraping, and relabeling based on a wide set of Prometheus configurations. Essentially, the Prometheus Receiver allows us to insert our Prometheus configurations directly into the OpenTelemetry Collector’s configuration to collect the exact same metrics. One caveat is that the OTLP metric data model differs from the Prometheus data model, so the Prometheus Receiver also needs to handle the metric conversion. These were the major requirements used to vet the Prometheus Receiver.
While the OpenTelemetry Collector continued to evolve, we found issues that needed to be resolved before the Collector could be officially released. One issue that arose while testing the Prometheus Receiver was a race condition that caused a deadlock. With thorough investigation and help from our AWS mentors, we discovered the source of the problem and ensured the reliability of the solution.
With all of our changes merged upstream, the Prometheus Receiver has been fully vetted, ensuring that the first component in the ADOT Collector-AMP pipeline is working as expected.
Prometheus Remote Write Exporter
The Prometheus Remote Write Exporter is a component within the Collector that converts OTLP format metrics into a time series format that Prometheus can understand, and then sends an HTTP POST request with the converted metrics to a Prometheus remote write endpoint.
The only significant bug we discovered in this component was straightforward to resolve: histograms were not being converted as expected. Prometheus expects histograms to be cumulative, which means that a bucket should contain the count of all observations with values less than or equal to that bucket’s upper bound. The OTLP histograms being consumed by this exporter were not cumulative, so we simply needed to make that conversion by adding to each bucket the counts of all preceding buckets.
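The conversion described above amounts to a running sum over the bucket counts. The following is a minimal sketch in Python rather than the exporter's actual Go code; the function name and the example bucket bounds are assumptions made for illustration.

```python
from itertools import accumulate


def to_cumulative(bucket_counts):
    """Convert per-bucket counts into Prometheus-style cumulative counts.

    bucket_counts must be ordered by ascending upper bound, with the
    +Inf bucket last.
    """
    return list(accumulate(bucket_counts))


# Per-bucket counts for the upper bounds 0.1, 0.5, 1.0, and +Inf.
per_bucket = [2, 3, 0, 5]
print(to_cumulative(per_bucket))  # [2, 5, 5, 10]
```

After the conversion, the last (+Inf) bucket equals the total number of observations, which is what Prometheus expects.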
Another issue was this component’s interoperability with AMP. To send an HTTP request to an AWS service, the request must carry the correct authentication, which is provided by AWS Signature Version 4 (SigV4). Because the OpenTelemetry Collector is written in Go, which compiles to static binaries, there is no way to load this signing logic dynamically at run time. This means the logic must be compiled in alongside the exporters that use it, and because this is an open source project, that raised an important question: Where exactly should this code be located? Because the functionality is vendor-specific, it didn’t make sense to make it part of the core open source OpenTelemetry Collector. However, because AMP is going to be an important service for the OpenTelemetry Collector to work with going forward, supporting it should still be the responsibility of the open source distribution.
Ultimately, the decision was for this code to live in the OpenTelemetry Collector Contrib repository as its own exporter, called the AWS Prometheus Remote Write Exporter. The OpenTelemetry Collector Contrib repository is a superset of the core OpenTelemetry Collector and includes vendor-specific components, as well as components that may not be used widely enough to warrant being in the core distribution. With the code hosted in the Contrib repository, the ADOT Collector could simply import it and allow customers to use it, thus solving the problem of the Prometheus Remote Write Exporter’s interoperability with AMP.
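To give a sense of what the signing logic involves, the sketch below shows the key-derivation step of AWS Signature Version 4 in Python, using only the standard library. It is not the exporter's code (which is written in Go), and it omits building the canonical request and the string to sign. The function names are hypothetical, and the "aps" service name used for AMP in the usage note is an assumption worth checking against the AWS documentation.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """Return the raw HMAC-SHA256 digest of message under key."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date: str, region: str, service: str) -> bytes:
    """Derive the SigV4 signing key; date is in YYYYMMDD form."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature that goes into the Authorization header."""
    return hmac.new(
        signing_key, string_to_sign.encode("utf-8"), hashlib.sha256
    ).hexdigest()
```

A caller would derive the key with something like derive_signing_key(secret, "20210101", "us-west-2", "aps") and pass it to sign together with the string to sign. Because the key is scoped to a date, region, and service, a signature produced for one service cannot be replayed against another.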
To ensure operational excellence, we required comprehensive end-to-end testing of each component added to the AWS Distro for OpenTelemetry Collector. All the tests are available on GitHub.
For our scenario, we needed to test two new components: the Prometheus Receiver and the Prometheus Remote Write Exporter. Referring back to the goal of the ADOT Prometheus pipeline, we decided to test the two components together and ensure that the scraped metrics produced by the ADOT Prometheus pipeline exactly matched the generated metrics (including the metric name, labels, and values).
In the test workflow, we use ADOT to scrape a sample Prometheus data generation app and export those metrics to AWS Managed Prometheus. The sample app mimics a customer application instrumented with the Prometheus client library. This sample app exposes metrics at the /metrics endpoint in the Prometheus exposition format and exposes the dynamically generated metrics at the /expected_metrics endpoint. A separate validator then makes HTTP requests to both the expected metrics endpoint and AWS Managed Prometheus and verifies that the returned metrics match.
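The comparison step of such a validator could look roughly like the sketch below. It is not the actual validator from the test framework: the function names and the metric layout (a dict with "name", "labels", and "value") are hypothetical, and fetching the two metric lists over HTTP is left out.

```python
def metric_key(metric):
    """Identify a sample by its name and its sorted label pairs."""
    return (metric["name"], tuple(sorted(metric["labels"].items())))


def find_mismatches(expected, actual):
    """Return human-readable differences between two lists of metrics."""
    actual_by_key = {metric_key(m): m["value"] for m in actual}
    expected_keys = set()
    problems = []
    for metric in expected:
        key = metric_key(metric)
        expected_keys.add(key)
        if key not in actual_by_key:
            problems.append(f"missing: {key}")
        elif actual_by_key[key] != metric["value"]:
            problems.append(f"value mismatch: {key}")
    # Anything returned that was never expected is also a failure.
    for key in actual_by_key:
        if key not in expected_keys:
            problems.append(f"unexpected: {key}")
    return problems
```

An empty result means that the metric names, labels, and values all matched. Sorting the label pairs makes the comparison independent of label order.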
Soaking and performance tests
In addition to the validation tests, we perform soaking tests and performance tests in Amazon Elastic Compute Cloud (Amazon EC2) to ensure that ADOT performs efficiently and can handle immense scraping loads.
In soaking tests, we generate tens of thousands of metrics per scrape, and ADOT scrapes at an interval of 15 seconds. The sample app and ADOT are deployed on separate Amazon EC2 instances within the same VPC. Amazon CloudWatch alarms are created on ADOT’s EC2 instance to measure CPU and memory usage, as well as the number of incoming bytes.
Performance tests run under a similar architecture. In this case, we generate different loads (100, 1k, and 5k metrics per second) for ADOT to scrape, process, and export. For performance tests, we monitor how the average and maximum CPU and memory usage change depending on the metric load.
| Test | Result | Duration | CPU Avg % | RAM Avg MB | Items Sent |
These tests will continue to run on a routine basis to ensure the reliability of the AWS Distro for OpenTelemetry Collector.
During the course of this project, we have learned so much about the open source community and observability. As our project relied heavily on our knowledge of Prometheus and the OpenTelemetry Collector, we took a deep dive into both projects, learning the customer use cases, design choices, and nuances involved. By analyzing and helping to develop open source-approved code, we learned a lot of Go and open source best practices. Aside from programming, we also learned a lot about writing technical documentation and evaluations to ensure that our ideas were thoroughly planned and concrete.
We loved the feedback that we’ve gotten both from the OpenTelemetry community and internally from our mentors and managers at Amazon. It has been a tremendous time for both of us, and we hope to continue to contribute to OpenTelemetry and many other open source projects in the future.
The content and opinions in this post are those of the third-party author and AWS is not responsible for the content or accuracy of this post.