AWS Storage Blog
Persistent storage for container logging using Fluent Bit and Amazon EFS
UPDATE 9/8/2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details.
Logging is a powerful debugging mechanism for developers and operations teams when they must troubleshoot issues. By default, containerized applications write logs to standard output, which is redirected to local ephemeral storage. These logs are lost when the container is terminated and are not available for troubleshooting unless they are stored on persistent storage or successfully routed to another centralized logging destination.
In this blog, I cover how to persist logs from your Amazon Elastic Kubernetes Service (Amazon EKS) containers on highly durable, highly available, elastic, and POSIX-compliant Amazon Elastic File System (Amazon EFS) file systems. We explore two different use cases:
- Use case 1: Persist your application logs directly on an Amazon EFS file system when default standard output (stdout) cannot be used. This use case applies to:
- Applications running on AWS Fargate for Amazon EKS. Fargate requires applications to write logs to a file system instead of stdout.
- Traditional applications that are containerized and need the ability to write application logs to a file.
- Use case 2: Persist your container logs centrally on an Amazon EFS file system using the Fluent Bit file plugin. When you route your container logs using Fluent Bit to external sources like Elasticsearch for centralized logging, there is a risk of losing logs when those external sources are under heavy load or must be restarted. Storing these logs on EFS gives developers and operations teams peace of mind, as they know a copy of their logs is available on EFS.
Using the Amazon EFS Container Storage Interface (CSI) driver, now generally available, EFS enables customers to persist data and state from their containers running in Amazon EKS. EFS provides fully managed, elastic, highly available, scalable, and high-performance, cloud-native shared file systems. Amazon EFS provides shared persistent storage that can scale automatically and enables deployment of highly available applications that have access to the same shared data across all Availability Zones in the Region. If a Kubernetes pod is shut down and relaunched, the CSI driver reconnects the EFS file system, even if the pod is relaunched in a different Availability Zone.
Using our fully managed Amazon EKS, AWS makes it easy to run Kubernetes without needing to install and operate your own Kubernetes control plane or worker nodes. EKS runs Kubernetes control plane instances across multiple Availability Zones, to ensure high availability. It also automatically detects and replaces unhealthy control plane instances, and provides automated version upgrades and patching for them. In addition, with our recently launched support for Amazon EFS file systems on AWS Fargate, EKS pods running on AWS Fargate can now mount EFS file systems using the EFS CSI driver.
Fluent Bit is an open source log shipper and processor that collects data from multiple sources and forwards it to different destinations for further analysis. Fluent Bit is a lightweight, performant log shipper and forwarder that is a successor to Fluentd. It is part of the Fluentd ecosystem but uses far fewer resources, creating a tiny footprint on your system's memory. You can route logs to Amazon CloudWatch, Amazon Elasticsearch Service, Amazon Redshift, and a wide range of other destinations supported by Fluent Bit.
Prerequisites:
Before we dive into our use cases, let’s review the prerequisites. These steps should be completed:
- Installed the aws-iam-authenticator
- Installed the Kubernetes command line utility kubectl version 1.14 or later
- Installed eksctl (a simple command line utility for creating and managing Kubernetes clusters on Amazon EKS)
- Basic understanding of Kubernetes and Amazon EFS
- Basic understanding of log shipping and forwarding using Fluent Bit
- Version 1.18.17 or later of the AWS CLI installed (to install or upgrade the AWS CLI, see this documentation on installing the AWS CLI)
If you are new to Fluent Bit, I recommend reading the following blogs from my colleagues:
Use case 1: Persist your application logs directly on an Amazon EFS file system when default stdout cannot be used
As mentioned earlier, containers are ephemeral and logs written to local storage are lost when the container shuts down. By using Amazon EFS, you can persist your application logs from your AWS Fargate or Amazon EKS containers. You can then use Fluent Bit to collect those logs and forward them to your own log management server or external sources like Amazon CloudWatch, Elasticsearch, etc.
Configure the required infrastructure
Once you have the preceding prerequisites ready, you can start deploying an Amazon EKS cluster and creating a new Amazon EFS file system.
1. Deploy an Amazon EKS cluster
Deploy an Amazon EKS cluster using the following command. This command creates an Amazon EKS cluster in the us-east-1 Region with one node group containing c5.large node.
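A sketch of the eksctl command; the cluster and node-group names are placeholders:

```shell
# Creates a cluster in us-east-1 with one node group containing a single c5.large node.
eksctl create cluster \
  --name efs-logging-demo \
  --region us-east-1 \
  --nodegroup-name ng-1 \
  --node-type c5.large \
  --nodes 1
```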
It takes approximately 15–20 minutes to provision the new cluster. When the cluster is ready, you can check the status by running:
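For example:

```shell
# The worker node should report a Ready status once the cluster is up.
kubectl get nodes
```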
The following is the output from the preceding command:
2. Create an Amazon EFS file system
2.1. First, get the VPC ID for the Amazon EKS cluster we created in step 1.
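Assuming the cluster name used in step 1, a query like this returns the VPC ID:

```shell
VPC_ID=$(aws eks describe-cluster \
  --name efs-logging-demo \
  --region us-east-1 \
  --query "cluster.resourcesVpcConfig.vpcId" \
  --output text)
echo $VPC_ID
```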
2.2. Identify the subnet IDs for your Amazon EKS node group:
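For example, filtering on the VPC ID captured in step 2.1:

```shell
aws ec2 describe-subnets \
  --region us-east-1 \
  --filters "Name=vpc-id,Values=$VPC_ID" \
  --query "Subnets[].{SubnetId:SubnetId,AZ:AvailabilityZone}" \
  --output table
```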
2.3. Create a security group for your Amazon EFS mount target:
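A sketch, using the VPC ID from step 2.1 (the group name is a placeholder):

```shell
SG_ID=$(aws ec2 create-security-group \
  --region us-east-1 \
  --group-name efs-mount-sg \
  --description "Security group for Amazon EFS mount targets" \
  --vpc-id $VPC_ID \
  --query "GroupId" \
  --output text)
```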
2.4. Authorize inbound access on the security group for the Amazon EFS mount target, allowing traffic to the NFS port (2049) from the VPC CIDR block, using the following command:
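For example:

```shell
# Look up the VPC CIDR block, then allow inbound NFS (TCP 2049) from it.
CIDR=$(aws ec2 describe-vpcs \
  --vpc-ids $VPC_ID \
  --region us-east-1 \
  --query "Vpcs[0].CidrBlock" \
  --output text)
aws ec2 authorize-security-group-ingress \
  --region us-east-1 \
  --group-id $SG_ID \
  --protocol tcp \
  --port 2049 \
  --cidr $CIDR
```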
2.5. Create an Amazon EFS file system by running the following command:
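For example, capturing the new file system ID for later steps:

```shell
FS_ID=$(aws efs create-file-system \
  --region us-east-1 \
  --creation-token efs-logging-demo \
  --query "FileSystemId" \
  --output text)
```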
2.6. Create Amazon EFS mount targets using the subnet IDs identified in step 2.2.
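For example, with the security group from step 2.3 (the subnet ID is a placeholder):

```shell
aws efs create-mount-target \
  --region us-east-1 \
  --file-system-id $FS_ID \
  --subnet-id subnet-0000aaaa \
  --security-groups $SG_ID
```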
Repeat step 2.6 for the second subnet ID.
2.7. Create an Amazon EFS access point. Amazon EFS access points are application-specific entry points into an EFS file system that make it easier to manage application access to shared datasets. Access points enable you to enforce a user identity based on the POSIX UID/GID specified. Create an EFS access point and enforce a UID/GID of 1001:1001 using the following command:
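A sketch of the command; the root directory path /logs is an assumption:

```shell
aws efs create-access-point \
  --region us-east-1 \
  --file-system-id $FS_ID \
  --posix-user Uid=1001,Gid=1001 \
  --root-directory "Path=/logs,CreationInfo={OwnerUid=1001,OwnerGid=1001,Permissions=755}"
```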
3. Deploy the Amazon EFS CSI driver to your Amazon EKS cluster
3.1. The CSI driver currently supports static provisioning for Amazon EFS. This means that an EFS file system must be created manually first, as outlined in step 2 earlier. This step is not required when using Fargate, as the CSI driver is already installed in the Fargate stack and support for EFS is provided out of the box. Run the following command to deploy the stable version of the CSI driver. Encryption in transit is enabled by default when using CSI driver version 1.0:
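For example (the release ref is an assumption):

```shell
kubectl apply -k "github.com/kubernetes-sigs/aws-efs-csi-driver/deploy/kubernetes/overlays/stable/?ref=release-1.0"
```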
3.2. Verify that the CSI driver is successfully deployed:
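For example:

```shell
# The driver pods run in the kube-system namespace.
kubectl get pods -n kube-system | grep efs-csi
```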
4. Create a storage class, persistent volume (PV), and persistent volume claim (PVC):
4.1. First create a storage class by running the following command:
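The storage class spec for the EFS CSI driver is minimal; the class name efs-sc is an assumption used throughout this walkthrough:

```yaml
# storageclass.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
```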
4.2. Next, create a persistent volume (PV). Here, specify the Amazon EFS file system and access point created for use case 1:
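A sketch of the PV spec; the file system and access point IDs shown are placeholders:

```yaml
# pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
spec:
  capacity:
    storage: 5Gi            # required by Kubernetes, not enforced by EFS
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-12345678::fsap-0123456789abcdef0
```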
Replace the volumeHandle value with your Amazon EFS file system ID and EFS access point ID.
4.3. Next, create a persistent volume claim:
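For example (the claim name efs-claim is an assumption, referenced again in step 5.1):

```yaml
# claim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
```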
Note: Because Amazon EFS is an elastic file system, it does not enforce any file system capacity limits. The actual storage capacity value in persistent volumes and persistent volume claims is not used when creating the file system. However, since storage capacity is a required field in Kubernetes, you must specify a valid value, such as 5Gi in this example. This value does not limit the size of your Amazon EFS file system.
4.4. Deploy the storage class, persistent volume, and persistent volume claim as shown here:
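Assuming the file names used above:

```shell
kubectl apply -f storageclass.yaml -f pv.yaml -f claim.yaml
```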
4.5. Check that the storage class, persistent volume, and persistent volume claims were created using the following command:
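For example:

```shell
kubectl get sc,pv,pvc
```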
The following is the output from the preceding command:
5. Deploy the application
In this step, I deploy a single application that continuously writes the current date to /var/log/app.log. I mount my Amazon EFS file system on /var/log to persist the application's logs on durable EFS storage.
5.1. Create a YAML file and copy the following code into it. Replace the claimName value with the name of the PVC you created in steps 4.3 and 4.4.
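A minimal sketch of such a pod, using busybox and the claim name assumed earlier:

```yaml
# app.yaml
apiVersion: v1
kind: Pod
metadata:
  name: efs-writer
spec:
  containers:
    - name: app
      image: busybox
      command: ["/bin/sh", "-c"]
      args: ["while true; do date >> /var/log/app.log; sleep 5; done"]
      volumeMounts:
        - name: efs-volume
          mountPath: /var/log
  volumes:
    - name: efs-volume
      persistentVolumeClaim:
        claimName: efs-claim    # replace with your PVC name
```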
5.2. Deploy the application by running:
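Assuming the file name used in step 5.1:

```shell
kubectl apply -f app.yaml
```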
5.3. Check that the pod was successfully created by running:
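For example:

```shell
kubectl get pods
```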
You should see the following output:
If your pod fails to start, you can troubleshoot by running the following command:
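For example, with the pod name assumed in step 5.1:

```shell
kubectl describe pod efs-writer
```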
5.4. Now, verify that your application is successfully creating the log file by running:
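For example:

```shell
# New date entries should appear in the file every few seconds.
kubectl exec efs-writer -- tail /var/log/app.log
```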
A look into my Amazon EFS file system metrics shows the write activity from my application. This confirms that my logs are successfully stored in EFS and I no longer need to worry about these logs getting lost if my pod is shut down:
With the application logs safely persisted on Amazon EFS, you can now use Fluent Bit to collect and transfer these logs to your own log management solution. The logs can also be sent to other external sources for further analysis. You can learn how to forward these logs to Amazon CloudWatch in this blog.
Use case 2: Persist your container logs centrally on an Amazon EFS file system using the Fluent Bit file plugin
For the second use case, I configure the Fluent Bit file output plugin to write our Amazon EKS container logs to a file on Amazon EFS. I walk through setting up Fluent Bit as the log processor that collects the stdout from all the pods in Kubernetes and writes it to your file system on Amazon EFS. If your logs are lost under heavy load while being forwarded to an external source like Elasticsearch, you can have peace of mind knowing a copy of them is available on Amazon EFS.
You can enable a lifecycle policy to transition the logs from the Amazon EFS Standard storage class to the EFS Infrequent Access (IA) storage class to reduce costs by up to 92%.
The following is the example configuration for the file output plugin:
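A minimal file output section writes every matched record to a file under the given path; here the path is the EFS-backed directory used later in this walkthrough:

```
[OUTPUT]
    Name   file
    Match  *
    Path   /var/log/efs
```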
1. Create an Amazon EFS file system:
Since I am creating a separate namespace for this use case, I need a new Amazon EFS volume. Repeat step 2 in the “Configure the required infrastructure” section to create a new EFS file system and EFS access point.
2. Create a new namespace to deploy Fluent Bit DaemonSet:
First, create a namespace called "fluent-bit-efs-demo" to keep our deployment separate from other applications running in the Kubernetes cluster.
Next, create a service account named fluent-bit-efs in the fluent-bit-efs-demo namespace to give Fluent Bit the permissions it needs to collect logs from Kubernetes cluster components and applications running on the cluster.
In the ClusterRole, allow permissions to get, list, and watch namespaces and pods in your Kubernetes cluster. Bind the ServiceAccount to the ClusterRole using a ClusterRoleBinding resource.
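A sketch of these resources in a single manifest (resource names follow the text above):

```yaml
# rbac.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: fluent-bit-efs-demo
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluent-bit-efs
  namespace: fluent-bit-efs-demo
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-bit-efs-role
rules:
  - apiGroups: [""]
    resources: ["namespaces", "pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluent-bit-efs-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluent-bit-efs-role
subjects:
  - kind: ServiceAccount
    name: fluent-bit-efs
    namespace: fluent-bit-efs-demo
```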
Copy the preceding code to a file named rbac.yaml and create the resources by executing the following command:
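For example:

```shell
kubectl apply -f rbac.yaml
```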
3. Configure your Amazon EFS file system using the CSI driver
3.1. Download the spec to create a persistent volume (PV). Here specify a PV name, and the Amazon EFS file system and access point created earlier:
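A sketch of the PV spec; the IDs are placeholders for the file system and access point created in step 1 of this use case, and the storage class name is assumed from use case 1:

```yaml
# file-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-logs-pv
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-87654321::fsap-abcdef0123456789a
```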
Replace the volumeHandle value with the EFS file system ID and access point ID.
3.2. Next, download the spec to create a persistent volume claim:
Update the name and namespace as shown here:
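For example, placing the claim in the fluent-bit-efs-demo namespace (the claim name is an assumption):

```yaml
# file-claim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-logs-claim
  namespace: fluent-bit-efs-demo
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
```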
3.3. Next, deploy the persistent volume and persistent volume claim as shown here:
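Assuming the file names used above:

```shell
kubectl apply -f file-pv.yaml -f file-claim.yaml
```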
3.4. Check that the persistent volume and persistent volume claim were created using the following command:
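For example:

```shell
kubectl get pv
kubectl get pvc -n fluent-bit-efs-demo
```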
4. Create a Fluent Bit config map:
4.1. Next, create a config map file named file-configmap.yaml to define the log parsing and routing for Fluent Bit:
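A sketch of such a config map: a tail input reads the container log files on each node, the kubernetes filter enriches records with pod metadata, and the file output writes everything to the EFS-backed path. Names and paths are assumptions:

```yaml
# file-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: fluent-bit-efs-demo
data:
  fluent-bit.conf: |
    [SERVICE]
        Parsers_File  parsers.conf

    [INPUT]
        Name           tail
        Path           /var/log/containers/*.log
        Parser         docker
        Tag            kube.*
        Mem_Buf_Limit  5MB

    [FILTER]
        Name           kubernetes
        Match          kube.*
        Merge_Log      On

    [OUTPUT]
        Name           file
        Match          *
        Path           /var/log/efs

  parsers.conf: |
    [PARSER]
        Name        docker
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
```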
4.2. Deploy the config map by running:
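For example:

```shell
kubectl apply -f file-configmap.yaml
```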
4.3. Next, define the Kubernetes DaemonSet using the config map in a file called file-fluent-bit-daemonset.yaml.
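A sketch of the DaemonSet: it runs one Fluent Bit pod per node under the service account created earlier, mounts the node's log directories read-only, and mounts the EFS-backed claim at the path the file output writes to. The image tag and volume layout are assumptions:

```yaml
# file-fluent-bit-daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: fluent-bit-efs-demo
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      serviceAccountName: fluent-bit-efs
      containers:
        - name: fluent-bit
          image: fluent/fluent-bit:1.5
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
            - name: containers
              mountPath: /var/lib/docker/containers
              readOnly: true
            - name: config
              mountPath: /fluent-bit/etc/
            - name: efs-logs
              mountPath: /var/log/efs
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: containers
          hostPath:
            path: /var/lib/docker/containers
        - name: config
          configMap:
            name: fluent-bit-config
        - name: efs-logs
          persistentVolumeClaim:
            claimName: efs-logs-claim
```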
4.4. Launch the DaemonSet by executing:
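For example:

```shell
kubectl apply -f file-fluent-bit-daemonset.yaml
```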
4.5. Verify that the pod was successfully deployed by running:
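For example:

```shell
kubectl get pods -n fluent-bit-efs-demo
```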
The following is the output from the preceding command:
4.6. Verify the logs by running:
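For example, with the DaemonSet name assumed above:

```shell
kubectl logs daemonset/fluent-bit -n fluent-bit-efs-demo
```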
4.7. You can verify that the Amazon EFS file system was mounted successfully on the pod by running:
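For example, with the pod label assumed from the DaemonSet spec:

```shell
POD=$(kubectl get pods -n fluent-bit-efs-demo -l app=fluent-bit \
  -o jsonpath='{.items[0].metadata.name}')
# The EFS mount should appear in the output with its available capacity.
kubectl exec -n fluent-bit-efs-demo $POD -- df -h /var/log/efs
```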
5. Deploy an NGINX application
5.1. Copy the following code to a file named nginx.yaml:
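A sketch of the manifest: a small NGINX deployment plus a ClusterIP service so the load generator in step 6.1 has something to call (names are assumptions):

```yaml
# nginx.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    app: nginx
  ports:
    - port: 80
```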
5.2. Deploy the application by running:
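For example:

```shell
kubectl apply -f nginx.yaml
```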
5.3. Verify that your NGINX pods are running:
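For example:

```shell
kubectl get pods -l app=nginx
```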
6. Validate the logs on Amazon EFS
6.1. Generate some load for the NGINX containers:
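One way to generate load, assuming the nginx service from step 5.1:

```shell
# A short-lived pod that curls the NGINX service in a loop.
kubectl run load-gen --image=curlimages/curl --restart=Never -- \
  /bin/sh -c 'for i in $(seq 1 100); do curl -s -o /dev/null http://nginx; done'
```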
You can see the NGINX logs along with other container logs written to your Amazon EFS file system:
Tail one of the NGINX log files in a new window. You should see the requests coming from the load you are generating using curl:
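For example (the pod label is assumed from the DaemonSet spec; list the directory first to find the actual log file names, as the placeholder below is illustrative):

```shell
POD=$(kubectl get pods -n fluent-bit-efs-demo -l app=fluent-bit \
  -o jsonpath='{.items[0].metadata.name}')
kubectl exec -n fluent-bit-efs-demo $POD -- ls /var/log/efs
kubectl exec -n fluent-bit-efs-demo $POD -- tail -f /var/log/efs/<nginx-log-file>
```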
6.2. Check the Amazon EFS file system metrics. You can see I/O activity on your file system:
Cleaning up
If you no longer need the Amazon EKS cluster and Amazon EFS file system, delete them by running:
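A sketch, using the placeholder names and IDs from this walkthrough; note that a file system's mount targets and access points must be deleted before the file system itself, and this step should be repeated for the second file system created in use case 2:

```shell
eksctl delete cluster --name efs-logging-demo --region us-east-1
aws efs delete-file-system --file-system-id fs-12345678 --region us-east-1
```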
If you were using an existing Amazon EKS cluster and must clean up the individual components created during the demo, run the following command:
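Assuming the file and resource names used in the demo (deleting the namespace removes the DaemonSet, config map, and claim inside it):

```shell
kubectl delete -f nginx.yaml
kubectl delete namespace fluent-bit-efs-demo
kubectl delete -f file-pv.yaml
kubectl delete pod efs-writer
kubectl delete -f claim.yaml -f pv.yaml -f storageclass.yaml
```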
Summary
In this blog, I covered how to use an Amazon EFS file system to persist application logs from containers running on Amazon EKS or self-managed Kubernetes clusters. I explored two different use cases:
- Use case 1: Persist your application logs directly on an Amazon EFS file system when default stdout cannot be used. This use case applies to:
- Applications running on AWS Fargate for Amazon EKS. Fargate requires applications to write logs to a file system instead of stdout.
- Traditional applications that are containerized and need the ability to write application logs to a file.
By using Amazon EFS, you can persist your application logs from your AWS Fargate or Amazon EKS containers. You can then use Fluent Bit to collect these logs and forward them to your own log management server or to external sources like Amazon CloudWatch and Elasticsearch.
- Use case 2: Persist your container logs centrally on an Amazon EFS file system using the Fluent Bit file plugin.
When you route your container logs using Fluent Bit to external sources like Elasticsearch for centralized logging, there is a risk of losing logs when those external sources are under heavy load or must be restarted. Storing these logs on EFS gives developers and operations teams peace of mind, as they know a copy of their logs is available on EFS.
Thank you for reading this blog post. Please leave a comment if you have any questions or feedback.