Kubernetes Logging powered by AWS for Fluent Bit

September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service.

Centralized logging is an instrumental component of running and managing Kubernetes clusters at scale. Developers need access to logs for debugging and monitoring applications, operations teams need them for monitoring applications in production, and security teams need them for security monitoring. Each of these teams has different requirements for how logs are processed and stored. In this blog post, we will look at a solution that centralizes your logs using AWS for Fluent Bit combined with Amazon CloudWatch.

AWS for Fluent Bit is a container built on Fluent Bit and is designed to be a log filter, parser, and router to various output destinations. AWS for Fluent Bit adds support for AWS services such as Amazon CloudWatch, Amazon Kinesis Data Firehose, and Amazon Kinesis Data Streams.

Before I dive into the solution, let’s look at how logs are processed by Fluent Bit and sent to an output destination. Logs are first ingested via an Input. For Kubernetes, our Input is the set of container log files generated by Docker from the stdout and stderr of the containers on that host. This Input processes the Docker log format and ensures that the time is properly set on each log entry.

[INPUT]
    Name              tail
    Tag               kube.*
    Path              /var/log/containers/*.log
    DB                /var/log/flb_kube.db
    Parser            docker
    Docker_Mode       On
    Mem_Buf_Limit     5MB
    Skip_Long_Lines   On
    Refresh_Interval  10

Next, logs are filtered by a set of Fluent Bit filters. This solution leverages the Kubernetes filter to enrich the log entries with Pod Labels and Annotations for easy querying in the log storage solution.

[FILTER]
    Name                kubernetes
    Match               kube.*
    Kube_URL            https://kubernetes.default.svc.cluster.local:443
    Merge_Log           On
    Merge_Log_Key       data
    K8S-Logging.Parser  On
    K8S-Logging.Exclude On

By default, the Kubernetes filter assumes the log data is in JSON format and attempts to parse it. This matters when an application serializes a structured JSON object to a string: we want that structured data available in our log entry rather than a single escaped string. For example, given the log entry below, the structured data becomes available under the data key in our backend system, and the plugin also preserves the original entry from the application under the log key.

Input:

"{ \"message\": \"A new user signed up!\", \"service\": \"user-service\", \"metadata\": { \"source\": \"mobile\" } }"

Output:

{
    "data": {
        "message": "A new user signed up!",
        "service": "user-service",
        "metadata": {
            "source": "mobile"
        }
    },
    "log": "{ \"message\": \"A new user signed up!\", \"service\": \"user-service\", \"metadata\": { \"source\": \"mobile\" } }"
}

The filter can also use custom parsers such as NGINX or Apache. By adding an annotation to the Kubernetes Pod, we can override the default JSON parser, so other log formats can still be deserialized from their string form into structured data.

annotations:
    fluentbit.io/parser: nginx
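
For reference, the nginx parser selected by this annotation is defined in the parsers.conf shipped with the Fluent Bit image. A definition along these lines (the exact regex may vary by Fluent Bit version) turns the NGINX access log format into the structured fields shown later in this post; custom parsers for other formats can be added to the same Parsers_File and selected per Pod with the annotation:

[PARSER]
    Name        nginx
    Format      regex
    Regex       ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
    Time_Key    time
    Time_Format %d/%b/%Y:%H:%M:%S %z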

The Kubernetes filter also enriches the data with Kubernetes metadata. It calls the Kubernetes API server to query for information about that pod, and adds the result to the log entry under an additional key called kubernetes.

kubernetes: {
    annotations: {
        "kubernetes.io/psp": "eks.privileged"
    },
    container_hash: "<some hash>",
    container_name: "myapp",
    docker_id: "<some id>",
    host: "ip-10-1-128-166.us-east-2.compute.internal",
    labels: {
        app: "myapp",
        "pod-template-hash": "<some hash>"
    },
    namespace_name: "default",
    pod_id: "198f7dd2-2270-11ea-be47-0a5d932f5920",
    pod_name: "myapp-5468c5d4d7-n2swr"
}

Now that we have our logs parsed and enriched with metadata, we send them to our output destination. It is common to have multiple outputs for different use cases. For example, you may want all logs sent to Amazon CloudWatch so that developers can access them, and also exported to S3 for long-term storage. Some development teams may use an ELK (Elasticsearch, Logstash, Kibana) stack; for them, we can leverage the Kinesis Data Firehose plugin to stream the logs to Amazon Elasticsearch Service and S3. Developers commonly use this toolset to live-stream logs for debugging while meeting long-term audit requirements via S3. For more information on splitting logs to multiple outputs, see this blog post about Fluent Bit streams. In this post, we keep the example simple and use only Amazon CloudWatch Logs.
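
As a sketch of that multi-destination idea (not part of this walkthrough), an additional [OUTPUT] block like the one below, placed alongside the CloudWatch output configured in the next step, would also stream the same logs to a hypothetical Kinesis Data Firehose delivery stream named my-elk-stream via the firehose plugin that ships with AWS for Fluent Bit:

[OUTPUT]
    Name            firehose
    Match           kube.*
    region          us-east-2
    delivery_stream my-elk-stream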

First, we need to configure the output for CloudWatch Logs. We configure Fluent Bit to send the logs to a specific log group and to create that group if it doesn’t exist.

[OUTPUT]
        Name cloudwatch
        Match   **
        region us-east-2
        log_group_name fluentbit-cloudwatch
        log_stream_prefix fluentbit-
        auto_create_group true

Now that we have our Fluent Bit configuration, we need to deploy our cluster and DaemonSet. First, we use eksctl to create a new cluster with IAM Roles for Service Accounts (IRSA) enabled. For more information on how to do this, visit the eksctl documentation. When setting up your cluster, make sure that there is a service account called “alb-ingress-controller” in the “kube-system” namespace with the proper ALB Ingress Controller permissions, and a “fluentbit” service account in the “fluentbit-system” namespace with permission to write to Amazon CloudWatch Logs.
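
One way to create the Fluent Bit service account is eksctl’s create iamserviceaccount command. The sketch below attaches the AWS-managed CloudWatchAgentServerPolicy, which includes the CloudWatch Logs write permissions Fluent Bit needs; a scoped-down custom policy is a better fit for production:

# create the namespace for the Fluent Bit service account (if it does not already exist)
kubectl create namespace fluentbit-system

# create the IRSA-backed service account used by the DaemonSet below
eksctl create iamserviceaccount \
  --cluster fluentbit-demo-cluster \
  --namespace fluentbit-system \
  --name fluentbit \
  --attach-policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy \
  --approve \
  --override-existing-serviceaccounts

With the service accounts in place, we install the ALB Ingress Controller with Helm: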

helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator
helm upgrade -i aws-alb incubator/aws-alb-ingress-controller \
  --namespace kube-system \
  --set clusterName=fluentbit-demo-cluster \
  --set awsRegion=us-east-2 \
  --set awsVpcID=<vpc id of cluster> \
  --set image.tag=v1.1.5 \
  --set rbac.create=true \
  --set rbac.serviceAccountName=alb-ingress-controller

Once the cluster is provisioned, we deploy our DaemonSet. To start, we create a fluentbit.yml file, which defines the ClusterRole, ClusterRoleBinding, ConfigMap, and DaemonSet used to deploy our Fluent Bit agents.

---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: fluentbit
rules:
  - apiGroups: [""]
    resources:
      - namespaces
      - pods
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: fluentbit
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluentbit
subjects:
  - kind: ServiceAccount
    name: fluentbit
    namespace: fluentbit-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentbit-config
  namespace: fluentbit-system
  labels:
    app.kubernetes.io/name: fluentbit
data:
  fluent-bit.conf: |
    [SERVICE]
        Parsers_File /fluent-bit/parsers/parsers.conf

    [INPUT]
        Name              tail
        Tag               kube.*
        Path              /var/log/containers/*.log
        DB                /var/log/flb_kube.db
        Parser            docker
        Docker_Mode       On
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Refresh_Interval  10

    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc.cluster.local:443
        Merge_Log           On
        Merge_Log_Key       data
        K8S-Logging.Parser  On
        K8S-Logging.Exclude On

    [OUTPUT]
        Name cloudwatch
        Match   **
        region us-east-2
        log_group_name fluentbit-cloudwatch
        log_stream_prefix fluentbit-
        auto_create_group true
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentbit
  namespace: fluentbit-system
  labels:
    app.kubernetes.io/name: fluentbit
spec:
  selector:
    matchLabels:
      name: fluentbit
  template:
    metadata:
      labels:
        name: fluentbit
    spec:
      serviceAccountName: fluentbit
      containers:
        - name: aws-for-fluent-bit
          imagePullPolicy: Always
          image: amazon/aws-for-fluent-bit:latest
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
            - name: fluentbit-config
              mountPath: /fluent-bit/etc/
          resources:
            limits:
              memory: 500Mi
            requests:
              cpu: 500m
              memory: 100Mi
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
        - name: fluentbit-config
          configMap:
            name: fluentbit-config
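
Assuming the manifest above is saved as fluentbit.yml, it can be applied with kubectl, and the DaemonSet pods can then be checked (the verification command is illustrative):

kubectl apply -f fluentbit.yml
kubectl get pods -n fluentbit-system -o wide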

Once this is deployed, you should see logs streaming into your Amazon CloudWatch Log Group for the system containers such as CoreDNS and the AWS Node container. To show how this can be used for applications, we deploy an NGINX container and specify a custom Fluent Bit parser. Below is the demo.yml file used for our demo application.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
      annotations:
        fluentbit.io/parser: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.17-alpine
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: nginx-ingress
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
spec:
  rules:
    - http:
        paths:
          - path: /*
            backend:
              serviceName: nginx-service
              servicePort: 80
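
Assuming the manifest above is saved as demo.yml, deploy it with:

kubectl apply -f demo.yml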

Once this is deployed to the cluster, we wait for the ALB to be configured. You can get the ALB endpoint by running the following command:

❯ kubectl get ingress/nginx-ingress                                                                                    
NAME            HOSTS   ADDRESS                                                                  PORTS   AGE
nginx-ingress   *       3d6aa6b1-default-nginxingr-29e9-1180704856.us-east-2.elb.amazonaws.com   80      3h3m

Then use a tool to simulate HTTP requests, like hey, to generate load on the application and produce logs:

hey -n 30 -c 1 http://3d6aa6b1-default-nginxingr-29e9-1180704856.us-east-2.elb.amazonaws.com/

You should now see entries in your logs that look similar to this:

{
   "data":{
      "agent":"hey/0.0.1",
      "code":"200",
      "host":"-",
      "method":"GET",
      "path":"/",
      "referer":"-",
      "remote":"10.1.128.166",
      "size":"612",
      "user":"-"
   },
   "kubernetes":{
      "annotations":{
         "fluentbit.io/parser":"nginx",
         "kubernetes.io/psp":"eks.privileged"
      },
      "container_hash":"0e61b143db3110f3b8ae29a67f107d5536b71a7c1f10afb14d4228711fc65a13",
      "container_name":"nginx",
      "docker_id":"b90a89309ac90fff2972822c66f11736933000c5aa6376dff0c11a441fa427ee",
      "host":"ip-10-1-128-166.us-east-2.compute.internal",
      "labels":{
         "app":"nginx",
         "pod-template-hash":"5468c5d4d7"
      },
      "namespace_name":"default",
      "pod_id":"198f7dd2-2270-11ea-be47-0a5d932f5920",
      "pod_name":"nginx-5468c5d4d7-n2swr"
   },
   "log":"10.1.128.166 - - [19/Dec/2019:17:41:12 +0000] \"GET / HTTP/1.1\" 200 612 \"-\" \"hey/0.0.1\" \"52.95.4.2\"\n",
   "stream":"stdout",
   "time":"2019-12-19T17:41:12.70630778Z"
}

Awesome! The AWS for Fluent Bit DaemonSet is now streaming logs from our application, adding Kubernetes metadata, parsing the logs, and sending them to Amazon CloudWatch for monitoring and alerting.
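
Because the entries are structured JSON, they are also easy to query. As an illustrative example (field names taken from the sample entry above), the following AWS CLI call pulls recent entries for Pods labeled app=nginx using a JSON filter pattern:

aws logs filter-log-events \
  --log-group-name fluentbit-cloudwatch \
  --filter-pattern '{ $.kubernetes.labels.app = "nginx" }' \
  --limit 5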

Note: If you are running your containers on AWS Fargate, you need to run a separate sidecar container per Pod as Fargate doesn’t support DaemonSets.
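
A minimal sketch of that sidecar pattern is shown below, assuming the application writes its log file to a shared emptyDir volume (the names and paths here are hypothetical) and that a ConfigMap similar to the one above points the tail Input at that path instead of /var/log/containers:

spec:
  containers:
    - name: myapp
      image: myapp:latest              # hypothetical application image that writes logs to a file
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/myapp    # application writes its log file here
    - name: aws-for-fluent-bit
      image: amazon/aws-for-fluent-bit:latest
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/myapp
          readOnly: true
        - name: fluentbit-config
          mountPath: /fluent-bit/etc/
  volumes:
    - name: app-logs
      emptyDir: {}
    - name: fluentbit-config
      configMap:
        name: fluentbit-sidecar-config # hypothetical ConfigMap whose tail Input reads /var/log/myapp/*.log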

In conclusion, this architecture can be configured to stream logs to Amazon CloudWatch, Amazon Kinesis Data Firehose, Amazon Kinesis Data Streams, and many other backends in a structured manner that can be monitored and analyzed. This enables administrators to provide access to the necessary groups within the organization.