AWS Database Blog

Make your dashboards faster and more cost-effective with Grafana query caching and Amazon Timestream

This is a guest post by Michael Mandrus, Senior Software Engineer at Grafana Labs, co-authored with Igor Shvartser, Senior Technical Product Manager at Amazon Timestream.

For many organizations, performant and cost-effective application monitoring and analytics are a requirement for mission-critical applications. With this requirement comes the increasing use of operational dashboards and visualizations, especially during activity spikes often found in DevOps, security, and Internet of Things (IoT) applications, to name a few. These dashboards are often viewed by numerous analysts simultaneously and reloaded many times in a short period. This heavy use may lead to unnecessary spikes in cost and query latencies that slow teams down. In more time-sensitive situations, it’s vital that time is not lost waiting for a dashboard to load.

Grafana is a leading open observability platform for visualization that can integrate with time series databases to monitor software stacks. It provides a query caching feature that relieves your database of unnecessary load by serving frequently read data from a cache. Grafana conveniently integrates with Amazon Timestream, a fast, serverless, and secure time-series database and analytics service that can scale to process trillions of time series events per day.

Customers across a broad range of industry verticals have adopted Timestream with Grafana to derive real-time insights from dashboards, monitor and alert on critical business applications, and analyze millions of real-time events across websites and applications. Using Grafana with Timestream enables you to build operational dashboards and load results from the cache rather than the source Timestream table. This reduces dashboard load times, lowers query costs, and decreases the likelihood that query requests will be throttled.

In this post, we demonstrate how to create a Grafana dashboard with your data in Timestream and how to configure a query cache using Grafana query caching.

Solution overview

Grafana allows users to create dashboards built from a collection of panels to visualize data from a variety of data sources. It integrates with many data sources through data source plugins, which can be downloaded and installed from the Grafana Plugins Catalog. After you have installed a plugin, you can create a data source instance, which is a configured connection to a specific database. Refer to the Timestream plugin for information on configuring and using the data source. Please also be aware of costs accrued when using AWS services for this solution.

After you have configured a data source instance, you can use it to create dashboard panels, built by authoring queries and displaying the results using one of Grafana’s available visualizations. When you load a panel, the query that is run is a combination of your authored query and the time range specified on the dashboard.
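To make the combination of authored query and dashboard time range concrete, here is a minimal Python sketch. It assumes the authored query carries a time-filter placeholder (named `$__timeFilter` here, modeled on the macro described in the Timestream plugin documentation) that is replaced with a predicate built from the dashboard's range. The helper names, the exact expansion, and the `"sampleDB"."DevOps"` table are illustrative assumptions, not the plugin's actual implementation.

```python
from datetime import datetime, timezone

# Assumed placeholder name; check the plugin documentation for the macros it supports.
TIME_FILTER_MACRO = "$__timeFilter"


def to_epoch_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(moment.timestamp() * 1000)


def expand_time_filter(authored_query: str, start: datetime, end: datetime) -> str:
    """Replace the placeholder with a predicate covering the dashboard time range."""
    predicate = (
        f"time BETWEEN from_milliseconds({to_epoch_ms(start)}) "
        f"AND from_milliseconds({to_epoch_ms(end)})"
    )
    return authored_query.replace(TIME_FILTER_MACRO, predicate)


# Hypothetical authored query against a hypothetical database and table.
authored = (
    'SELECT time, measure_value::double FROM "sampleDB"."DevOps" '
    "WHERE $__timeFilter ORDER BY time"
)
range_start = datetime(2023, 1, 1, tzinfo=timezone.utc)
range_end = datetime(2023, 1, 2, tzinfo=timezone.utc)

final_query = expand_time_filter(authored, range_start, range_end)
print(final_query)
```

Because the time range becomes part of the final query text, two loads of the same panel with different dashboard ranges produce different queries against Timestream.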

Grafana’s query caching feature (available in Grafana Cloud and Grafana Enterprise) creates cache keys using a data source instance, a query, and a time range. When a panel is loaded, Grafana first checks a local cache for the requested data and, if found, returns it immediately. If not found, Grafana runs the query against the data source, then stores the results in the local cache. This means that although the initial load of a dashboard will take a typical amount of time, subsequent loads with similar time ranges will be nearly instantaneous. This is achieved by rounding time ranges to the nearest interval, increasing the likelihood of cache hits. You can configure query caching and its time-to-live (TTL) per data source instance.
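The lookup flow described above can be sketched as a small in-memory cache. This is a toy model and not Grafana's implementation: the `QueryCache` class, the 60-second rounding interval, the 300-second TTL, the key format, and the data source identifier are all assumptions chosen for illustration.

```python
import hashlib
import time
from typing import Callable, Dict, Tuple


class QueryCache:
    """Toy cache keyed by data source, query text, and rounded time range."""

    def __init__(self, ttl_seconds: float, round_to_seconds: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.round_to_seconds = round_to_seconds
        self.clock = clock
        self._entries: Dict[str, Tuple[float, object]] = {}

    def _round(self, epoch_seconds: int) -> int:
        # Round to the nearest multiple of the interval so nearby ranges share a key.
        step = self.round_to_seconds
        return (epoch_seconds + step // 2) // step * step

    def get_or_run(self, datasource_uid: str, query: str, start: int, end: int,
                   run_query: Callable[[str, int, int], object]):
        """Return (result, was_cache_hit) for the rounded time range."""
        start, end = self._round(start), self._round(end)
        raw_key = f"{datasource_uid}|{query}|{start}|{end}"
        key = hashlib.sha256(raw_key.encode("utf-8")).hexdigest()
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1], True  # fresh entry: skip the data source entirely
        result = run_query(query, start, end)  # miss or expired: query the source
        self._entries[key] = (now, result)
        return result, False


# A fake clock and a fake backend keep the demonstration deterministic.
fake_now = [0.0]
backend_calls = []


def run_query(query: str, start: int, end: int) -> str:
    backend_calls.append((start, end))
    return f"rows for {start}..{end}"


cache = QueryCache(ttl_seconds=300, round_to_seconds=60, clock=lambda: fake_now[0])
sql = 'SELECT avg(measure_value::double) FROM "sampleDB"."DevOps"'  # hypothetical

_, first_hit = cache.get_or_run("timestream-1", sql, 1000, 4600, run_query)
_, nearby_hit = cache.get_or_run("timestream-1", sql, 1010, 4610, run_query)
_, wider_hit = cache.get_or_run("timestream-1", sql, 1000, 8200, run_query)
fake_now[0] = 301.0  # move past the TTL
_, expired_hit = cache.get_or_run("timestream-1", sql, 1000, 4600, run_query)

print(first_hit, nearby_hit, wider_hit, expired_hit, len(backend_calls))
```

In this sketch the second request differs from the first by ten seconds at each end, rounds to the same range, and is served from the cache. The wider range and the request made after the TTL has elapsed both go back to the data source.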

Getting started with Timestream

To get started with Timestream, visit Getting Started, which provides a tutorial and sample applications. The tutorial shows you how to create a database populated with sample datasets and run sample queries. The fully functional sample application shows you how to create a database and table, populate the table with sample data, and run sample queries. You can also go directly to the AWS Console or use the AWS Command Line Interface (CLI) or AWS SDKs.

You can also try Amazon Timestream with a 1-month free trial when you use Timestream for the first time. The Timestream Free Tier gives you the opportunity to experiment with and adopt Timestream at zero cost for a duration of one month, while adhering to specific usage quotas.

Configure the Timestream plugin

In this section, we walk you through configuring and using the Timestream plugin with query caching. For prerequisites and instructions on setting up a Timestream database and querying it from Grafana, refer to Grafana.

  1. Install the Timestream plugin.
  2. Add a new data source.
  3. Enter the connection details.
  4. Enter additional details and choose Save & test to verify the connection. Note that the configuration details in the screenshot are examples and may differ from your details.
  5. On the Cache tab, choose Enable.
  6. Configure cache settings (optional).
  7. Create a panel by authoring a query and selecting a visualization. Note that the configuration details and SQL query in the screenshot are examples and may differ from your details.
  8. Reload the panel and observe that the response is now cached.

And that’s it! Continue building your dashboard until you have the visualizations you need. The following is an example Timestream dashboard with an especially wide time range selected. Although the size of the query (a month’s worth of Timestream data) initially caused a delay in fully loading this dashboard, a refresh using query caching took under 100 milliseconds (a 99% decrease) and required no interaction with the Timestream database.

Considerations

There are currently two noteworthy considerations when using query caching in Grafana:

  • Cache keys are driven by specific timestamps. This means if your time range doesn’t round to a time range already stored in the cache, Grafana will need to issue entirely new queries to the database. For example, if you query for t0 to t1, then query for t0 to t2, Grafana will run a query for t0 to t2 instead of just t1 to t2. The same applies to subsets of results.
  • If several users load the same dashboard simultaneously and the data is not currently cached, each query will be sent to the data source in parallel instead of being deduplicated, which may result in a cache stampede. One way to detect cache stampedes is to monitor the Grafana metric grafana_http_requests_in_flight. During a cache stampede, this metric will begin to increase based on the load. To limit the impact, configure Grafana using the max_conns_per_host and max_open_conns_default parameters.
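The stampede described in the second consideration can be illustrated with a small thread-based simulation. It is a sketch under stated assumptions: the function name, the single cache key, and the use of a semaphore as a stand-in for a connection limit are all hypothetical, and it does not reproduce how Grafana applies max_conns_per_host or max_open_conns_default.

```python
import threading
import time
from typing import Dict


def simulate_cold_load(viewers: int, max_connections: int) -> Dict[str, int]:
    """Simulate several viewers loading one uncached panel at the same moment."""
    cache: Dict[str, str] = {}
    cache_lock = threading.Lock()
    stats = {"backend_queries": 0, "in_flight": 0, "peak_in_flight": 0}
    stats_lock = threading.Lock()
    all_checked = threading.Barrier(viewers)
    # Stand-in for a connection limit: at most this many queries run at once.
    connection_slots = threading.Semaphore(max_connections)

    def backend_query() -> str:
        with connection_slots:
            with stats_lock:
                stats["backend_queries"] += 1
                stats["in_flight"] += 1
                stats["peak_in_flight"] = max(stats["peak_in_flight"], stats["in_flight"])
            time.sleep(0.01)  # pretend the data source takes a while to answer
            with stats_lock:
                stats["in_flight"] -= 1
        return "rows"

    def load_panel() -> None:
        with cache_lock:
            cached = cache.get("panel-key")
        # Every viewer checks the cold cache before anyone has filled it.
        all_checked.wait()
        if cached is None:
            rows = backend_query()  # no deduplication: each viewer issues a query
            with cache_lock:
                cache["panel-key"] = rows

    threads = [threading.Thread(target=load_panel) for _ in range(viewers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return stats


stats = simulate_cold_load(viewers=8, max_connections=2)
print(stats)
```

In this model every viewer still triggers its own query, so the limit does not remove the duplicate work. It only bounds how many of those queries reach the data source at the same time, which is the effect a connection cap is meant to have.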

The Grafana team is actively exploring potential enhancements for these limitations and considering them for inclusion in future versions of Grafana. Stay tuned for progress updates and any potential fixes in upcoming releases.

Conclusion

In this post, we described how to use Grafana query caching with Timestream. Query caching is a key feature for improving the performance and lowering the query costs of operational dashboards. For additional documentation covering the use of Grafana with Timestream, and to create a sample application and dashboard, check out the Timestream developer guide for Grafana.

Grafana Cloud is the easiest way to get started with metrics, logs, traces, and dashboards. Grafana recently added new features to our generous forever-free tier, including access to all Enterprise plugins for three users. Plus, there are plans for every use case. Sign up for free now.

To learn more about Timestream and get started, visit Amazon Timestream.


About the authors

Michael Mandrus is a Senior Software Engineer on the Operator Experience Team at Grafana Labs, focused on making Grafana instances easy to manage at any scale. He is driven by the need to give customers an excellent experience working with his team and products.

Igor Shvartser is a Senior Product Manager for Amazon Timestream. His fascination with data, working alongside customers, and building exceptional products has led him to AWS where he’s empowering teams with purpose-built databases.