AWS Open Source Blog
Using AWS Distro for OpenTelemetry Collector for cross-account metrics collection on Amazon ECS
In November 2020, we announced OpenTelemetry support on AWS with AWS Distro for OpenTelemetry (ADOT), a secure, production-ready, AWS-supported distribution of the Cloud Native Computing Foundation (CNCF) OpenTelemetry project. With ADOT, you can instrument applications to send correlated metrics and traces to multiple AWS solutions, such as our Amazon Managed Service for Prometheus (AMP) and Partner monitoring solutions.
Many customers run their applications in separate AWS accounts, and even separate AWS Regions, and would like a central place for observability. In a previous article, we explained how to collect metrics across multiple accounts with Amazon Elastic Kubernetes Service (Amazon EKS). The scenario here is similar, except that we use the ADOT agent to collect application and platform metrics for workloads running on Amazon Elastic Container Service (Amazon ECS), our native container orchestration platform, and send them to an AMP workspace.
Setup overview
To resolve this challenge, we will use the following structure.
On the workload accounts:
- Create an IAM role to be used by Amazon ECS tasks.
On the central monitoring account:
- Create an AMP workspace.
- Create an IAM role that allows cross-account access to AMP.
On the workload accounts:
- Grant Amazon ECS tasks permission to assume the cross-account IAM role.
- Set up the application and the AWS Distro for OpenTelemetry agent.
- Create an Amazon ECS cluster and run the application.
On the central monitoring account:
- Visualize metrics with Amazon Managed Grafana.
The entire architecture looks like the following:
Workload account: ECS role setup
Logged into the workload account, we create an IAM role that will be used later by Amazon ECS tasks. This role will then be trusted by the central monitoring account and granted assume-role permissions.
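The role described above needs a trust policy that lets Amazon ECS tasks assume it. The following is a minimal Python sketch of that policy document, not the original command from this post. The role name ECS-Workload-Task-Role is a hypothetical placeholder.

```python
import json

# Hypothetical role name, used for illustration only.
TASK_ROLE_NAME = "ECS-Workload-Task-Role"


def ecs_task_trust_policy() -> dict:
    """Build a trust policy that allows Amazon ECS tasks to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ecs-tasks.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Print the document so it can be saved to a file such as trust.json.
print(json.dumps(ecs_task_trust_policy(), indent=2))
```

Saved to a file, the document could then be passed to aws iam create-role with the --role-name and --assume-role-policy-document file://trust.json options.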
Monitoring account setup
Logged into the central monitoring account, we create an AMP workspace with the following awscli command:
Alternatively, we can use the AWS console and navigate to the AMP service.
We can now create an IAM role with write permissions to the AMP workspace. To grant access to multiple accounts, populate the "AWS" array with the appropriate IAM role ARNs:
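As a rough illustration of the two policy documents involved, the Python sketch below builds a trust policy whose "AWS" array lists the workload role ARNs, plus a write policy scoped to one workspace. The account IDs, workload role name, Region, and workspace ID are placeholders, and using the aps:RemoteWrite action directly (rather than an AWS managed policy) is an assumption of this sketch.

```python
import json
from typing import List


def central_trust_policy(workload_role_arns: List[str]) -> dict:
    """Trust policy: each listed workload role may assume the central role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": list(workload_role_arns)},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def amp_write_policy(workspace_arn: str) -> dict:
    """Permission policy: allow remote write to a single AMP workspace."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["aps:RemoteWrite"],
                "Resource": workspace_arn,
            }
        ],
    }


# Placeholder ARNs: replace the account IDs, role name, and workspace ID.
trust = central_trust_policy(
    [
        "arn:aws:iam::111111111111:role/ECS-Workload-Task-Role",
        "arn:aws:iam::222222222222:role/ECS-Workload-Task-Role",
    ]
)
write = amp_write_policy(
    "arn:aws:aps:eu-west-1:999999999999:workspace/ws-EXAMPLE"
)
print(json.dumps(trust, indent=2))
print(json.dumps(write, indent=2))
```

Adding another workload account then only means appending one more role ARN to the list passed to central_trust_policy.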
Workload account
Note: You can repeat instructions in this section for as many workload accounts as needed.
Logged into the workload account, we grant assumeRole permissions to the role created previously:
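A minimal sketch of the permission policy this step attaches to the workload task role, written in Python for illustration. ECS-AMP-Central-Role is the central role name used in this post; the account ID is a placeholder.

```python
import json


def assume_central_role_policy(
    central_account_id: str, role_name: str = "ECS-AMP-Central-Role"
) -> dict:
    """Allow the workload task role to assume the central monitoring role."""
    role_arn = f"arn:aws:iam::{central_account_id}:role/{role_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": role_arn,
            }
        ],
    }


# 999999999999 is a placeholder for the central monitoring account ID.
print(json.dumps(assume_central_role_policy("999999999999"), indent=2))
```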
Workload configuration
Next, we set up a sample application that exposes Prometheus metrics:
- Configure the aws-otel-collector to scrape the application and ECS metrics.
- Build Docker images and host them on Amazon Elastic Container Registry (Amazon ECR).
- Configure, create an Amazon ECS cluster, and run everything using ecs-cli.
The layout should be organized as follows:
To set up Amazon ECS, we need Docker and ecs-cli. On Linux, ecs-cli can be installed like this:
Now, let’s create the sample application that exposes a /metrics Prometheus endpoint:
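For illustration, here is a minimal stand-in for such an application, using only the Python standard library. It is a sketch, not the sample application from this post: the metric name demo_http_requests_total and port 8080 are assumptions.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

_lock = threading.Lock()
_request_count = 0  # counts every GET request served, including /metrics


def render_metrics(count: int) -> str:
    """Render one counter in the Prometheus text exposition format."""
    return (
        "# HELP demo_http_requests_total Total HTTP requests handled.\n"
        "# TYPE demo_http_requests_total counter\n"
        f"demo_http_requests_total {count}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        global _request_count
        with _lock:
            _request_count += 1
            count = _request_count
        if self.path == "/metrics":
            body = render_metrics(count).encode("utf-8")
            content_type = "text/plain; version=0.0.4; charset=utf-8"
        else:
            body = b"hello\n"
            content_type = "text/plain; charset=utf-8"
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging.
        pass


def make_server(port: int = 8080) -> ThreadingHTTPServer:
    """Create the HTTP server; call serve_forever() on the result to run it."""
    return ThreadingHTTPServer(("0.0.0.0", port), MetricsHandler)
```

Calling make_server(8080).serve_forever() starts the application, and a Prometheus-compatible scraper can then read the counter from /metrics on that port.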
This will create a Dockerfile for the application:
And finally, the following script will create an ECR repository, build the application image, and push the image to Amazon ECR:
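The steps such a script performs can be sketched as a list of shell commands. The Python helper below only assembles the command strings and does not run them. The repository name, account ID, and Region are placeholders.

```python
from typing import List


def ecr_image_uri(account_id: str, region: str, repository: str, tag: str = "latest") -> str:
    """Return the full image URI for a private Amazon ECR repository."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def ecr_push_commands(
    account_id: str,
    region: str,
    repository: str,
    tag: str = "latest",
    context_dir: str = ".",
) -> List[str]:
    """Return shell commands that create the repository, then build and push."""
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    image = ecr_image_uri(account_id, region, repository, tag)
    return [
        # Create the repository (fails if it already exists).
        f"aws ecr create-repository --repository-name {repository} --region {region}",
        # Authenticate Docker against the private registry.
        f"aws ecr get-login-password --region {region} | "
        f"docker login --username AWS --password-stdin {registry}",
        # Build the image from the Dockerfile in context_dir, then push it.
        f"docker build -t {image} {context_dir}",
        f"docker push {image}",
    ]


# Placeholder values: a hypothetical repository in a workload account.
for command in ecr_push_commands("111111111111", "eu-west-1", "sample-app"):
    print(command)
```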
Now, let’s configure the AWS Distro for OpenTelemetry Collector. We will create a custom data-collection configuration, called a Pipeline. A Pipeline defines the path that data follows through the collector: reception by receivers, optional processing or modification, and finally export from the collector via exporters.
We will collect metrics from the application's /metrics endpoint and use the ecs-metrics-receiver to scrape various ECS task metadata from the ECS task metadata endpoint. Visit the documentation to learn more about ecs-metrics-receiver and other configuration options.
We will export the collected metrics to the AMP workspace created in the monitoring account using the awsprometheusremotewrite exporter configuration. We will provide both the AMP remote_write endpoint and the IAM role to assume (in our case, ECS-AMP-Central-Role).
Edit the WORKSPACE_ID and CENTRAL_ACCOUNT_ID variables and run the following script to create the pipeline:
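As a hedged sketch of what such a pipeline configuration might contain, the Python helper below renders a collector configuration as text. Several details are assumptions rather than the original script: the receiver key awsecscontainermetrics, the aws_auth block with its role_arn field, the scrape target localhost:8080, and the Region. Check the ADOT documentation for the keys supported by your collector version.

```python
from textwrap import dedent


def remote_write_endpoint(workspace_id: str, region: str) -> str:
    """Return the AMP remote_write URL for a workspace."""
    return (
        f"https://aps-workspaces.{region}.amazonaws.com"
        f"/workspaces/{workspace_id}/api/v1/remote_write"
    )


def render_collector_config(
    workspace_id: str, central_account_id: str, region: str, app_port: int = 8080
) -> str:
    """Render a collector configuration (assumed keys) as YAML text."""
    endpoint = remote_write_endpoint(workspace_id, region)
    role_arn = f"arn:aws:iam::{central_account_id}:role/ECS-AMP-Central-Role"
    return dedent(f"""\
        receivers:
          prometheus:
            config:
              scrape_configs:
                - job_name: sample-app
                  scrape_interval: 15s
                  static_configs:
                    - targets: ["localhost:{app_port}"]
          awsecscontainermetrics:
            collection_interval: 20s
        exporters:
          awsprometheusremotewrite:
            endpoint: {endpoint}
            aws_auth:
              region: {region}
              service: aps
              role_arn: {role_arn}
        service:
          pipelines:
            metrics:
              receivers: [prometheus, awsecscontainermetrics]
              exporters: [awsprometheusremotewrite]
        """)


# Placeholder workspace ID, account ID, and Region.
print(render_collector_config("ws-EXAMPLE", "999999999999", "eu-west-1"))
```

The single metrics pipeline here sends both the application metrics and the ECS task metrics to the same exporter, which in turn assumes the central role before writing to AMP.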
From the latest version of the aws-otel-collector, create a custom image on Amazon ECR with our custom configuration:
Finally, build and push the image:
Run application: Set up Amazon ECS
Amazon ECS needs an execution role, a set of permissions to run our tasks. Run the following script to create it:
Set up the WORKLOAD_ACCOUNT_ID variable and run the following script to create a docker-compose file:
Using ecs-cli, we will create an Amazon ECS cluster:
After a few minutes, the cluster should be created with all necessary associated resources. Take the VPC_ID from the output of the preceding command and get the default security group associated with the VPC:
Edit the ecs-params.yml file needed by ecs-cli, and replace the subnet IDs and security group with the values from the previous outputs:
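To show where those values go, the Python sketch below renders an ecs-params.yml as text. It assumes a Fargate-style task using the awsvpc network mode. The role names, task size, subnet IDs, and security group ID are placeholders rather than values from this post.

```python
from typing import List


def render_ecs_params(
    subnet_ids: List[str],
    security_group_id: str,
    task_role: str,
    execution_role: str,
) -> str:
    """Render an ecs-params.yml with the given network settings."""
    subnet_lines = "\n".join(f"        - {subnet}" for subnet in subnet_ids)
    return (
        "version: 1\n"
        "task_definition:\n"
        f"  task_execution_role: {execution_role}\n"
        f"  task_role_arn: {task_role}\n"
        "  ecs_network_mode: awsvpc\n"
        "  task_size:\n"
        "    mem_limit: 0.5GB\n"
        "    cpu_limit: 256\n"
        "run_params:\n"
        "  network_configuration:\n"
        "    awsvpc_configuration:\n"
        "      subnets:\n"
        f"{subnet_lines}\n"
        "      security_groups:\n"
        f"        - {security_group_id}\n"
        "      assign_public_ip: ENABLED\n"
    )


# Placeholder IDs and hypothetical role names.
print(
    render_ecs_params(
        ["subnet-aaaa1111", "subnet-bbbb2222"],
        "sg-cccc3333",
        task_role="ECS-Workload-Task-Role",
        execution_role="ECS-Workload-Execution-Role",
    )
)
```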
Finally, run the following script to deploy the application:
After a few minutes, the Amazon ECS service should be up and running. You can verify the logs of the aws-otel-collector on the Amazon CloudWatch Logs console, in the log group ecs-xaccount-metrics-demo.
Monitoring account: Visualize metrics
Back in the monitoring account, let’s visualize our metrics using an Amazon Managed Grafana workspace. Refer to the documentation to set up Amazon Managed Grafana.
We can view metrics coming from the application endpoint:
And the Amazon ECS cluster metrics:
Clean up
Workload account
Central account
Conclusion
In this post, we explained how to use the AWS Distro for OpenTelemetry (ADOT) agent to collect application and platform metrics for workloads running on Amazon ECS.
You can use ADOT on other platforms, such as Amazon EKS, Amazon Elastic Compute Cloud (Amazon EC2), or on-premises. Additionally, you can use ADOT to collect distributed trace data and have multiple heterogeneous workload accounts send metrics centrally to AMP and other platforms. You can also set up private connectivity with VPC endpoints and VPC peering, according to your needs.
Visit the ADOT, AMP, and Amazon Managed Grafana sites to learn more.