Improving HA and long-term storage for Prometheus using Thanos on EKS with S3

Prometheus is an open source systems monitoring and alerting toolkit that is widely adopted as a standard monitoring tool on both self-managed and provider-managed Kubernetes. Prometheus provides many useful features, such as dynamic service discovery, powerful queries, and seamless alert notification integration. Beyond a certain scale, however, problems arise when basic Prometheus capabilities do not meet requirements such as:

Storing petabyte-scale historical data in a reliable and cost-efficient way
Accessing all metrics using a single-query API
Merging replicated data collected via Prometheus high-availability (HA) setups

Thanos was built in response to these challenges. Thanos, which is released under the Apache 2.0 license, offers a set of components that can be composed into a highly available Prometheus setup with long-term storage capabilities. Thanos uses the Prometheus 2.0 storage format to cost-efficiently store historical metric data in object storage, such as Amazon Simple Storage Service (Amazon S3), while retaining fast query latencies. In summary, Thanos is intended to provide:

Global query view of metrics
Virtually unlimited retention of metrics, including downsampling
High availability of components, including support for Prometheus HA

In this post, we’ll learn how to implement Thanos for HA and long-term storage for Prometheus metrics using Amazon S3 on an Amazon Elastic Kubernetes Service (Amazon EKS) platform.

Overview of solution

Thanos is an open source project that integrates with an existing Prometheus deployment to provide a highly available metric system with long-term, scalable storage. For a simple setup, we can get started with three Thanos components:

Thanos Sidecar: Sidecar runs alongside every Prometheus instance. It uploads Prometheus data to object storage (an S3 bucket in our case) every two hours, and it serves real-time metrics that have not yet been uploaded to the bucket.
Thanos Store: Store serves metrics from Amazon S3 storage.
Thanos Querier: Querier has a user interface similar to that of Prometheus and implements the Prometheus query API. Querier queries Store and Sidecar to return the relevant metrics. If multiple Prometheus instances are set up for HA, it can also de-duplicate the metrics.

Thanos basic components

We can also install Thanos Compactor, which applies the Prometheus compaction procedure to block data stored in the S3 bucket. It is also responsible for downsampling data.

Prerequisites

This guide has the following requirements:

An AWS account with adequate permissions to operate IAM roles, IAM policy, Amazon EKS, and Amazon S3.
A running Amazon EKS cluster (Kubernetes 1.13 or above).
Prometheus or Prometheus Operator Helm Chart installed (v2.2.1+).
Helm 3.x.
Working knowledge of Kubernetes and using kubectl.
AWS Command Line Interface (AWS CLI) with at least version 1.18.86 or 2.0.25.
eksctl version 0.22.0 or above.
Confirm that all Thanos components are installed in the same Kubernetes namespace as Prometheus.
Clone the Kubernetes manifests used in the Thanos Querier and Store deployment steps:
git clone -b release-0.12 https://github.com/thanos-io/kube-thanos.git
Thanos Compact manifests.

All instructions in this document use Prometheus Operator chart version 8.15.6.

Deployment overview

Before beginning the Thanos deployment, we configure an S3 bucket to use as object storage and create the IAM policy required to access this bucket.

To deploy the Thanos components, we complete the following:

Diagram illustrating the deployment steps for Thanos components.

Enable Thanos Sidecar for Prometheus.
Deploy Thanos Querier with the ability to talk to Sidecar.
Confirm that Thanos Sidecar is able to upload Prometheus metrics to our S3 bucket.
Deploy Thanos Store to retrieve metrics data stored in long-term storage (in this case, our S3 bucket).
Set up Thanos Compactor for data compaction and downsampling.

Configure S3 bucket and IAM policy

To store metric data, create an S3 bucket in an AWS Region local to the Prometheus environment. Use the appropriate console or API-based mechanisms.
Create an IAM policy that grants access to the bucket. This policy is later attached to the IAM role associated with the ServiceAccount used by the Prometheus pod.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetObject",
                "s3:DeleteObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::thanos-metrics-s3storage/*",
                "arn:aws:s3:::thanos-metrics-s3storage"
            ]
        }
    ]
}
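
The bucket and policy can also be created from the AWS CLI instead of the console. The following is a minimal sketch, not a definitive procedure: it assumes the policy JSON above has been saved locally as thanos-s3-policy.json (a hypothetical file name) and reuses the bucket, Region, and policy names that appear elsewhere in this post. The commands are wrapped in a function so that nothing runs until it is called.

```shell
#!/usr/bin/env sh
# Sketch only: names are taken from this post; the policy file name is an assumption.
BUCKET="thanos-metrics-s3storage"
REGION="us-west-2"
POLICY_NAME="thanos-metrics-s3storage-policy"
POLICY_FILE="thanos-s3-policy.json"   # assumed local copy of the JSON policy shown above

create_thanos_bucket_and_policy() {
  # Outside us-east-1, the bucket Region must be passed as a location constraint.
  aws s3api create-bucket \
    --bucket "$BUCKET" \
    --region "$REGION" \
    --create-bucket-configuration "LocationConstraint=$REGION" || return 1

  # Register the policy and print its ARN, which the eksctl configuration refers to.
  aws iam create-policy \
    --policy-name "$POLICY_NAME" \
    --policy-document "file://$POLICY_FILE" \
    --query 'Policy.Arn' --output text
}
```

Calling create_thanos_bucket_and_policy prints the policy ARN on success; that ARN is the value to place under attachPolicyARNs in the cluster configuration.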

Create Amazon EKS cluster

Next, create an Amazon EKS cluster using the configuration below. This configuration does the following:

Provisions a cluster running Kubernetes version 1.16 with one managed node group.
Enables the IAM OIDC provider, which allows fine-grained permission management for applications running on Amazon EKS that use other AWS services.
Creates the monitoring namespace on the provisioned cluster, along with the prometheus-prometheus-oper-prometheus ServiceAccount used to run the Prometheus pod.
Maps the IAM policy to the ServiceAccount role to provide the required permissions on the S3 bucket storing Thanos metric data.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: thanosdemo
  region: us-west-2
  version: '1.16'
iam:
  withOIDC: true
  serviceAccounts:
  - metadata:
      name: prometheus-prometheus-oper-prometheus
      namespace: monitoring
      labels: {aws-usage: "application"}
    attachPolicyARNs:
    - "arn:aws:iam::454014481298:policy/thanos-metrics-s3storage-policy"
managedNodeGroups:
  - name: ng0
    minSize: 1
    maxSize: 3
    desiredCapacity: 2
    ssh:
      allow: true
      publicKeyName: thanosdemo
    labels: {role: mngworker}
    iam:
      withAddonPolicies:
        imageBuilder: true
        autoScaler: true
        externalDNS: true
        certManager: true
        ebs: true
        albIngress: true
        xRay: true
        cloudWatch: true
        appMesh: true
cloudWatch:
  clusterLogging:
    enableTypes: ["*"]

Next, we complete the following steps:

1. Run the command eksctl create cluster -f eks-cluster-config.yaml to create the Amazon EKS cluster from the configuration stored in the file eks-cluster-config.yaml.

2. After eksctl completes provisioning the cluster, verify the cluster health using the command kubectl get nodes:

NAME STATUS ROLES AGE VERSION
ip-192-168-XX-15.us-west-2.compute.internal Ready <none> 6h v1.16.13-eks-2ba888
ip-192-168-YY-20.us-west-2.compute.internal Ready <none> 6h v1.16.13-eks-2ba888

3. Verify the IAM OIDC provider by running the command aws eks describe-cluster --name thanosdemo --query "cluster.identity.oidc.issuer":

"https://oidc.eks.us-west-2.amazonaws.com/id/3423376DF7D6CC41B662FC8309BXXXX"

4. Verify the ServiceAccount created for Prometheus POD in monitoring namespace with the command kubectl describe serviceaccount prometheus-prometheus-oper-prometheus -n monitoring:

Name:                prometheus-prometheus-oper-prometheus
Namespace:           monitoring
Labels:              aws-usage=application
Annotations:         eks.amazonaws.com/role-arn: arn:aws:iam::454014481298:role/eksctl-thanosdemo-addon-iamserviceaccount-mo-Role1-1JIHLG6FSKXRK
Image pull secrets:  <none>
Mountable secrets:   prometheus-prometheus-oper-prometheus-token-shzqd
Tokens:              prometheus-prometheus-oper-prometheus-token-shzqd
Events:              <none>
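
The same check can be scripted so that the role annotation and its attached policies are read directly rather than picked out of the describe output. This is a sketch under the assumptions of this post (the monitoring namespace and the ServiceAccount name created by eksctl); the function names are hypothetical helpers, not part of any tool.

```shell
#!/usr/bin/env sh
SA="prometheus-prometheus-oper-prometheus"   # ServiceAccount name used in this post

get_sa_role_arn() {
  # Dots inside the annotation key must be escaped in the JSONPath expression.
  kubectl -n monitoring get serviceaccount "$SA" \
    -o 'jsonpath={.metadata.annotations.eks\.amazonaws\.com/role-arn}'
}

list_sa_role_policies() {
  role_arn="$(get_sa_role_arn)" || return 1
  # The role name is the part of the ARN after the last slash.
  aws iam list-attached-role-policies \
    --role-name "${role_arn##*/}" \
    --query 'AttachedPolicies[].PolicyName' --output text
}
```

If list_sa_role_policies does not print the Thanos S3 policy name, the Sidecar will most likely be unable to write to the bucket, so this is worth checking before installing Prometheus.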

Installing Helm CLI

Before we can get started, let’s install Helm CLI and configure the Helm repository. Complete the following steps:

1. Install the Helm CLI:

curl -sSL https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash

2. Verify Helm version:

helm version --short

3. Configure the chart repository:

helm repo add stable https://kubernetes-charts.storage.googleapis.com/

Installing and configuring Prometheus and Thanos

1. Get the prometheus-operator chart default configuration values by running the command helm show values stable/prometheus-operator > values_default.yaml.

2. The prometheus-operator chart creates the Kubernetes resources required to run Prometheus as part of the installation. We must disable ServiceAccount creation for the Prometheus pod because the ServiceAccount prometheus-prometheus-oper-prometheus was already created during the cluster install. In the values_default.yaml file, under the Deploy a Prometheus instance section, set create: false and add the ServiceAccount name:

## Deploy a Prometheus instance
##
prometheus:
  enabled: true
  ## Annotations for Prometheus
  ##
  annotations: {}
  ## Service account for Prometheuses to use.
  ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/
  ##
  serviceAccount:
    create: false
    name: "prometheus-prometheus-oper-prometheus"

3. Add the Thanos Sidecar configuration in place of the default thanos: {} entry in values_default.yaml:

thanos:
      baseImage: quay.io/thanos/thanos
      version: v0.12.2
      objectStorageConfig:
        key: thanos-storage-config.yaml
        name: thanos-storage-config

4. Create the object storage configuration file thanos-storage-config.yaml that objectStorageConfig refers to:

type: s3
config:
  bucket: thanos-metrics-s3storage #S3 bucket name
  endpoint: s3.us-west-2.amazonaws.com #S3 Regional endpoint
  encryptsse: true

Note: Learn more about additional object storage configuration options in the Thanos documentation.

5. Create Kubernetes secret:

kubectl -n monitoring create secret generic thanos-storage-config --from-file=thanos-storage-config.yaml=thanos-storage-config.yaml
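
Before installing the chart, it can help to confirm that the secret holds the expected configuration under the expected key, because the Sidecar reads the key named in objectStorageConfig. The following sketch decodes the stored value; verify_thanos_secret is a hypothetical helper name, and the secret and key names are the ones used in this post.

```shell
#!/usr/bin/env sh
verify_thanos_secret() {
  # The dot in the key name must be escaped inside the JSONPath expression.
  # Secret data is base64-encoded, so decode it to get back the original YAML.
  kubectl -n monitoring get secret thanos-storage-config \
    -o 'jsonpath={.data.thanos-storage-config\.yaml}' | base64 --decode
}
```

The output of verify_thanos_secret should match the contents of the local thanos-storage-config.yaml file.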

6. Install Prometheus with the Thanos Sidecar enabled, using the edited values file:

helm install prometheus stable/prometheus-operator -f values_default.yaml -n monitoring

7. Check the status of Prometheus POD and Thanos Sidecar with the command kubectl get po -n monitoring -l app=prometheus:

NAME READY STATUS RESTARTS AGE
prometheus-prometheus-prometheus-oper-prometheus-0 4/4 Running 1 4h49m
prometheus-prometheus-prometheus-oper-prometheus-1 4/4 Running 1 4h49m

8. Check the status of Thanos Sidecar container in Prometheus POD: kubectl describe pod prometheus-prometheus-prometheus-oper-prometheus-0 -n monitoring

In Prometheus POD, the status of Thanos Sidecar is:

thanos-sidecar:
    Container ID:  docker://65d6ba0d1de338d671cf75a7888e982b896198eb49c6b9214d2f3004a21f2f27
    Image:         quay.io/thanos/thanos:v0.12.2
    Image ID:      docker-pullable://quay.io/thanos/thanos@sha256:bc134406dcfb3cb235a75891f1ff992893ae0005bc6649b7df9d0259f0776f6f
    Ports:         10902/TCP, 10901/TCP
    Host Ports:    0/TCP, 0/TCP
    Args:
      sidecar
      --prometheus.url=http://127.0.0.1:9090/
      --tsdb.path=/prometheus
      --grpc-address=[$(POD_IP)]:10901
      --http-address=[$(POD_IP)]:10902
      --objstore.config=$(OBJSTORE_CONFIG)
      --log.level=info
      --log.format=logfmt
    State:          Running
      Started:      Sun, 06 Sep 2020 00:47:12 +0000
    Ready:          True
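
To confirm that the Sidecar is able to upload metrics to the S3 bucket, we can look at the Sidecar logs and at the bucket itself. The following is a sketch that assumes the pod and bucket names shown in this post; check_sidecar_upload is a hypothetical helper. Because the Sidecar uploads roughly every two hours, the bucket may legitimately be empty shortly after installation.

```shell
#!/usr/bin/env sh
POD="prometheus-prometheus-prometheus-oper-prometheus-0"   # pod name from the output above
BUCKET="thanos-metrics-s3storage"

check_sidecar_upload() {
  # Show Sidecar log lines that mention uploads (no match simply prints nothing).
  kubectl -n monitoring logs "$POD" -c thanos-sidecar | grep -i "upload"

  # Each uploaded block appears in the bucket as a prefix named after the block ID.
  aws s3 ls "s3://$BUCKET/"
}
```

An access-denied error from the Sidecar in these logs usually points back to the IAM policy or the ServiceAccount role annotation.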

Deploy Thanos Querier

Thanos Querier assists in retrieving metrics from all Prometheus instances. It can be used with Grafana because of its compatibility with original PromQL and HTTP APIs.

1. Add the metric store configuration to the args of the query container in thanos-query-deployment.yaml (under spec.template.spec.containers):

--store=thanos-store.monitoring.svc.cluster.local:10901
--store=prometheus-operated.monitoring.svc.cluster.local:10901

The preceding store configuration adds Thanos Store service to retrieve historical metric data from object storage (S3 bucket) and Prometheus service for the latest metric data. We will be deploying Thanos Store service in the next step.

2. Apply the Query deployment, service, and serviceMonitor manifests to create Kubernetes objects:

kubectl apply -f thanos-query-deployment.yaml -f thanos-query-service.yaml -f thanos-query-serviceMonitor.yaml

Deploy Thanos Store

Thanos Store works with Querier to retrieve historical data from the given bucket.

Make the following changes in the Thanos Store configuration files:

1. Add serviceAccountName to spec.template.spec in thanos-store-statefulSet.yaml to enable S3 bucket access:

serviceAccountName: prometheus-prometheus-oper-prometheus

2. Change the spec.template.spec.containers.env in thanos-store-statefulSet.yaml to:

env:
  - name: OBJSTORE_CONFIG
    valueFrom:
      secretKeyRef:
        key: thanos-storage-config.yaml
        name: thanos-storage-config

3. Apply the Store statefulSet, service, and serviceMonitor manifests:

kubectl apply -f thanos-store-statefulSet.yaml -f thanos-store-service.yaml -f thanos-store-serviceMonitor.yaml

Deploy Thanos Compactor

Thanos Compactor compacts and downsamples historical data. It needs local disk space to store intermediate data during processing.

Make the following changes in Thanos Compactor configuration files:

1. Add serviceAccountName to spec.template.spec in thanos-compact-statefulSet.yaml to enable S3 bucket access:

serviceAccountName: prometheus-prometheus-oper-prometheus

2. Change the spec.template.spec.containers.env in thanos-compact-statefulSet.yaml to:

env:
        - name: OBJSTORE_CONFIG
          valueFrom:
            secretKeyRef:
              key: thanos-storage-config.yaml
              name: thanos-storage-config

3. Apply the Compact statefulSet, service, and serviceMonitor manifests:

kubectl apply -f thanos-compact-statefulSet.yaml -f thanos-compact-service.yaml -f thanos-compact-serviceMonitor.yaml

4. Check the status of all Thanos components:

kubectl get all -n monitoring
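
With all components running, we can send a test query through Thanos Querier, which answers the same HTTP query API as Prometheus. The sketch below assumes that the Querier service is named thanos-query and serves HTTP on port 9090, as in the kube-thanos manifests, and that it is reached through a local port-forward; adjust both if your manifests differ. query_thanos is a hypothetical helper.

```shell
#!/usr/bin/env sh
THANOS_QUERY_URL="http://localhost:9090"   # assumed: reached through the port-forward below

# In a separate terminal (service name and port are assumptions based on kube-thanos):
#   kubectl -n monitoring port-forward svc/thanos-query 9090:9090

query_thanos() {
  # $1: a PromQL expression. dedup=true asks Querier to merge series from HA replicas.
  curl -sG "$THANOS_QUERY_URL/api/v1/query" \
    --data-urlencode "query=$1" \
    --data-urlencode "dedup=true"
}
```

For example, query_thanos 'up' returns a JSON result in the Prometheus API format. Note that de-duplication only has an effect when Querier is started with a --query.replica-label that matches the replica label on the Prometheus instances.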

Configure Thanos as Grafana data source

To start viewing metric data in the Grafana UI, add the Thanos Querier service as a data source. In Grafana, go to Configuration > Data Sources > Add data source.

Screenshot of Thanos Querier as data source within the Grafana UI.
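
The data source can also be registered through the Grafana HTTP API instead of the UI. This is only a sketch: it assumes Grafana is reachable on localhost:3000 (for example through a port-forward), that the admin user is named admin, and that Querier is exposed in the cluster as thanos-query on port 9090. add_thanos_datasource is a hypothetical helper.

```shell
#!/usr/bin/env sh
GRAFANA_URL="http://localhost:3000"   # assumed: Grafana reached through a local port-forward
THANOS_QUERY_URL="http://thanos-query.monitoring.svc.cluster.local:9090"   # assumed service name and port

add_thanos_datasource() {
  # $1: Grafana admin password.
  # Querier speaks the Prometheus API, so the data source type is "prometheus".
  curl -s -X POST "$GRAFANA_URL/api/datasources" \
    -u "admin:$1" \
    -H "Content-Type: application/json" \
    -d "{\"name\":\"Thanos\",\"type\":\"prometheus\",\"access\":\"proxy\",\"url\":\"$THANOS_QUERY_URL\"}"
}
```

Because the data source type is Prometheus, existing Prometheus dashboards can be pointed at the new data source without changing their queries.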

Cleaning up

To avoid incurring future charges, delete the resources. Use the following commands to clean up the Thanos environment:

1. Remove Thanos Querier, Store, and Compactor:

kubectl get all -n monitoring --no-headers=true | awk '/thanos/{print $1}' | xargs kubectl delete -n monitoring

2. Remove Thanos Sidecar by deleting the sidecar configuration that was added to values_default.yaml during the Thanos configuration process. Finish by applying the change with:

helm -n monitoring -f values_default.yaml upgrade prometheus stable/prometheus-operator

3. Delete the S3 bucket being used for storing metric data:

aws s3 rb s3://bucket-name --force

4. To delete the EKS cluster:

eksctl delete cluster --name=thanosdemo

Costs

Thanos enables users to archive metric data from Prometheus in an object store such as Amazon S3. This provides virtually unlimited storage for our monitoring system. In terms of cost, Thanos adds the price of storing and querying data in object storage, plus the cost of running the store node, to an existing Prometheus setup. Queriers, compactors, and rule nodes require roughly as much compute as they save by not doing the same work directly on the Prometheus servers.

Data that a typical Prometheus setup reads from local disk travels over the network in Thanos. Data transferred between the Amazon S3 object store and Thanos within the same AWS Region is free.

For metric data, Prometheus uses an average of one to two bytes per sample for storage. If we store around 100,000 samples per day at two bytes per sample using Thanos, storage consumption is about 200,000 bytes (roughly 195 KiB) per day on Amazon S3. This costs less than 0.05 USD per day. The cost of retrievals by the store node depends on the individual querying pattern; as a rough estimate, add around 20% to the total storage cost to account for retrievals.
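
The estimate above can be reproduced with a few lines of shell arithmetic, which also makes it easy to plug in your own sample volume. The sample count, bytes per sample, and 20% retrieval allowance come from the text; the price per GB-month is an assumption for illustration and should be replaced with the current Amazon S3 price for your Region and storage class.

```shell
#!/usr/bin/env sh
SAMPLES_PER_DAY=100000
BYTES_PER_SAMPLE=2
PRICE_PER_GB_MONTH=0.023    # assumption: illustrative S3 price in USD, check current pricing
RETRIEVAL_OVERHEAD=0.20     # the 20% retrieval allowance suggested above

BYTES_PER_DAY=$((SAMPLES_PER_DAY * BYTES_PER_SAMPLE))
KIB_PER_DAY=$(awk -v b="$BYTES_PER_DAY" 'BEGIN { printf "%.1f", b / 1024 }')

# Approximate monthly cost of keeping 30 days of samples, including the retrieval allowance.
USD_PER_MONTH=$(awk -v b="$BYTES_PER_DAY" -v p="$PRICE_PER_GB_MONTH" -v r="$RETRIEVAL_OVERHEAD" \
  'BEGIN { printf "%.6f", b * 30 / 1e9 * p * (1 + r) }')

echo "bytes per day: $BYTES_PER_DAY"
echo "KiB per day:   $KIB_PER_DAY"
echo "USD per month: $USD_PER_MONTH"
```

Real deployments ingest far more than 100,000 samples per day, so the useful part of this sketch is the formula rather than the resulting figure.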

Applying appropriate downsampling, resolution, and retention policies on Thanos object storage allows further optimization.

Conclusion

In this blog post, we explored how to transform Prometheus into a robust monitoring system. Using Thanos with Prometheus enables us to scale Prometheus horizontally. By using open source Thanos components and Amazon S3, we get a global view, virtually unlimited retention, and potential metric high availability.