How to set Fluentd and Fluent Bit input parameters in FireLens
This post was contributed by Ben Anscombe, DevOps Engineer at Space Ape Games and Wesley Pettit, Software Engineer at AWS.
September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details.
FireLens for Amazon Elastic Container Service (Amazon ECS) was launched last year to make it easy for ECS customers to send and process logs using standard open source logging tools: Fluentd and Fluent Bit. If you are not already familiar with FireLens, take a look at the documentation and the under the hood blog on why and how it was built.
Space Ape Games’ use of FireLens
Space Ape’s mission is to build the greatest games on mobile. Our games have been downloaded over 85 million times and have always needed scalable services. With our forthcoming titles, we will need to scale to higher levels than ever before. To achieve this, we chose Fargate to run our backend game services. Fargate allows us to scale up and down rapidly with the minimum of management overhead. It also allows us to deploy new versions of our backend services safely so we can keep our players’ fun uninterrupted.
We also have a centralized logging system comprising an EC2-based Logstash cluster, which filters and aggregates logs from both game servers and mobile clients and forwards the data to Amazon Elasticsearch Service.
FireLens allows us to send our container logs to Logstash in a simple, highly efficient way with near real time delivery. FireLens also scales linearly with our Fargate tasks by virtue of being part of the task itself, which reduces complexity because we only need to focus on scaling Logstash and Elasticsearch. Scaling is especially important to us because player activity can greatly fluctuate based on factors such as the time of day and whether we are running an in-game event.
One feature that we really like in Fluent Bit is the Memory Buffer Limit (Mem_Buf_Limit), which can be set as a parameter in the Input section of the Fluent Bit configuration. This is a feature not available by default in FireLens, but it was something we really wanted to implement for our Fargate tasks.
Why set Memory Buffer Limit?
Our entire logging pipeline must be resilient and fault tolerant; the collector (Fluent Bit) must be able to tolerate the logging destination being unavailable for a period of time. In this case, FireLens won’t be able to send its logs and will start to buffer these in memory, attempting to retry sending them up to the Retry Limit.
In load tests that simulated this situation, where the downstream logging system became unavailable, we found that with high log volumes the FireLens memory buffer could use up all the memory available to the Fargate task, causing the whole task to be killed due to an OutOfMemoryError.
Setting the Memory Buffer Limit means that Fluent Bit will buffer logs up to the Limit, and after that new logs will not be written to memory until the buffers start to clear again.
This is essentially a trade-off between the integrity of logs and the availability of the service, which may not be appropriate in all cases. Each user has to weigh the risk of their chosen destination being unavailable against the impact that losing logs during an outage would cause. For us, player experience comes first, so the availability of our game servers is paramount. The logs that we collect from the containers are only used to support debugging the servers, so the trade-off that maximizes server availability is clearly preferable.
Note that the Memory Buffer Limit is not a hard limit on the memory consumption of the FireLens container (as memory is also used for other purposes). We tested FireLens with Mem_Buf_Limit set to 100MB and the FireLens container has so far stayed below 250MB total memory usage in high load scenarios.
Background: how FireLens configures Fluentd and Fluent Bit
Before we learn how to set input parameters, we need to understand how FireLens works in detail, including how it generates the Input section of the Fluent Bit configuration.
As explained in “Under the Hood: FireLens for ECS Tasks”:
Fluentd and Fluent Bit are powerful, but large feature sets are always accompanied by complexity. When we designed FireLens, we envisioned two major segments of users:
1. Those who want a simple way to send logs anywhere, powered by Fluentd and Fluent Bit.
2. Those who want the full power of Fluentd and Fluent Bit, with AWS managing the undifferentiated labor that’s needed to pipe a Task’s logs to these log routers.
Thus, while FireLens simply enabled Fluentd and Fluent Bit in ECS, configuration management features were built to make using them easy. This involved two things:
- The Input plugin definitions to accept/collect logs from the runtime are generated by the ECS Agent.
- A config translation mechanism was built to translate options in a container’s log configuration to Output plugin definitions.
Consequently, the configuration file for Fluentd or Fluent Bit is “fully managed” by ECS. With the config-file-type option, you can import your own configuration. However, the input definitions are always generated by ECS, and your additional config is then imported using the Fluentd/Fluent Bit include statement. Internally, Fluentd and Fluent Bit concatenate the two config files together, so your config is appended to the generated config.
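To make the layout concrete, here is a rough sketch of what the combined Fluent Bit configuration might look like once ECS has generated it. This is an illustration only, not the exact file ECS produces: the include path, the output plugin, and the log group, region, and container name are all hypothetical, and the real generated config contains additional sections.

```ini
# Generated by ECS: input that receives container logs
# (socket path assumed from the FireLens documentation)
[INPUT]
    Name      forward
    unix_path /var/run/fluent.sock

# Your own config is pulled in with an include statement
# (hypothetical path; the real one depends on your config-file-value)
@INCLUDE /extra.conf

# Generated from the log configuration options of an application
# container named "app" (hypothetical plugin and values)
[OUTPUT]
    Name           cloudwatch
    Match          app-firelens*
    region         us-east-1
    log_group_name example-log-group
```

Because the input sections sit in the generated part of the file, an included config can add filters and outputs but cannot change the parameters of those inputs.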
The generated config is always mounted into your log routing container at set locations:
- Fluent Bit:
Most Fluentd and Fluent Bit images (including the Fluent OSS distributions and the AWS for Fluent Bit distribution) use these default configuration paths. These config paths are specified in the entrypoint definitions for the containers, for example:
However, you can override the default entrypoint by building your own Fluentd or Fluent Bit image and specifying a different config path. That is the method we will use to set input parameters.
Tutorial: setting input parameters
The input configurations for FireLens can be seen at the links below. The input definitions are always the same; they do not change based on user input:
As you can see, logs are always read from a Unix Socket mounted into the container at
As a FireLens user, you can set your own input configuration by overriding the default entry point command for the Fluent Bit container. For the purposes of this tutorial, we will focus on Fluent Bit and show how to set the Mem_Buf_Limit parameter. The same method can be applied to set other input parameters and could be used with Fluentd as well.
First, construct a Fluent Bit config file, with the following input section:
Your full Fluent Bit config file should additionally include your output(s) and any other Fluent Bit features that you want to enable.
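As a minimal sketch of such a file, the fragment below pairs an input section that sets Mem_Buf_Limit with a placeholder output. It assumes the FireLens Unix socket is mounted at /var/run/fluent.sock, as described in the FireLens documentation; check this against the generated input definitions before relying on it. The 100MB value mirrors the setting we tested, and the stdout output is only a stand-in for your real destination.

```ini
# Input: receive container logs over the FireLens Unix socket
# (socket path is an assumption; verify against the generated config)
[INPUT]
    Name          forward
    unix_path     /var/run/fluent.sock
    # Pause ingestion once roughly 100MB of records are buffered in memory
    Mem_Buf_Limit 100MB

# Placeholder output: replace with your real destination
[OUTPUT]
    Name  stdout
    Match *
```

Because this file replaces the generated config rather than being appended to it, it has to define the input itself; nothing is inherited from the ECS-generated file.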
Save this file as fluent-bit.conf in your project directory. Then, create a Dockerfile with the following contents:
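# Base image assumed to be the public AWS for Fluent Bit image; pin a specific tag in production
FROM amazon/aws-for-fluent-bit:latest
ADD fluent-bit.conf /fluent-bit/alt/fluent-bit.conf
CMD ["/fluent-bit/bin/fluent-bit", "-e", "/fluent-bit/firehose.so", "-e", "/fluent-bit/cloudwatch.so", "-e", "/fluent-bit/kinesis.so", "-c", "/fluent-bit/alt/fluent-bit.conf"]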
Notice that the command is the same as the one in the public AWS for Fluent Bit image, except that the config file path passed with -c is changed.
Use docker build to create your custom Fluent Bit image and push it to Amazon ECR. Reference the image in the container definition for your FireLens container. A portion of a container definition is shown below:
In the log configuration for your application containers, specify the awsfirelens log driver without any options:
Finally, note that to make your config dynamic at runtime, you can use environment variables in Fluent Bit config:
You can then set the values of those environment variables in the FireLens container, allowing you to dynamically change your config values at runtime.
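For instance, the buffer limit could be read from an environment variable using Fluent Bit's ${VARIABLE} syntax. In this sketch, MEM_BUF_LIMIT is a hypothetical variable name chosen for illustration, and the socket path is again assumed to be /var/run/fluent.sock.

```ini
# MEM_BUF_LIMIT is a hypothetical environment variable, set on the
# FireLens container in the task definition (for example, to 100MB)
[INPUT]
    Name          forward
    unix_path     /var/run/fluent.sock
    Mem_Buf_Limit ${MEM_BUF_LIMIT}
```

With this approach, the same image can be reused across tasks with different memory budgets; only the environment variable in each task definition changes.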
The AWS team is tracking an effort to support setting custom Fluent Bit input options in FireLens through the Task Definition interface at the public AWS Container roadmap. With the technique shown in this blog, you can set custom input options right now.