AWS Database Blog
Timestream for InfluxDB 3 workload analysis and best practices
Selecting the right instance size for your Amazon Timestream for InfluxDB 3 deployment is one of the most impactful decisions you’ll make when architecting your time series infrastructure. An undersized instance can lead to degraded query performance and ingestion bottlenecks, while an oversized instance means paying for unused capacity.
In this post, we walk you through how to size your deployment. You can use this guide to align your business use case with the AWS offerings for Timestream for InfluxDB 3. Visit our documentation to get started with Amazon Timestream for InfluxDB 3 today.
Sizing guidance
Understanding performance characteristics
Choosing the right instance size for your Amazon Timestream for InfluxDB 3 deployment requires understanding how the database performs under different workload patterns.
Performance in time series databases is influenced by numerous interconnected factors. The complexity of your queries matters significantly: simple aggregations over recent data perform differently than complex analytical queries spanning months of historical data. Your data model plays a crucial role; high-cardinality datasets with millions of unique tag combinations behave differently than low-cardinality monitoring data. Write patterns affect performance as well, from the size of your batches to the rate of ingestion and the structure of your line protocol points.
Concurrent operations create their own dynamics. When multiple applications query the database simultaneously while data continues to flow in, resources must be balanced between ingestion and query processing. Query time ranges also have a significant impact: InfluxDB 3’s caching and in-memory optimizations make queries on recent data exceptionally fast, while historical queries (in Enterprise clusters) use the compactor and Parquet storage for efficient retrieval.
Our goal with this post isn’t to provide exact predictions for your workload, but rather to provide you with an informed starting point for your deployment.
Data and query considerations
When aiming to improve query and ingestion performance, consider the following:
If you have low cardinality (fewer unique tag combinations), you can potentially see improved performance, because queries need to filter through fewer distinct series. Although cardinality is no longer a limiting factor in InfluxDB 3, runaway cardinality still affects performance: query operations can use more memory, and compaction slows down.
If your queries are simple (single-field retrievals rather than aggregations), you can benefit from InfluxDB 3’s Last Value Cache and see significantly faster response times. If your queries perform multi-hour or multi-day aggregation on raw data, query performance will decrease as the size of your data grows. Always downsample using the processing engine when you can and pre-format your data according to how you want query results presented.
If your write batches are larger (5,000 points per batch for medium to 4xlarge instances and 10,000+ points per batch for larger instances), you can achieve better write throughput as the overhead per request decreases.
If your queries target longer time ranges or historical data beyond the recent cache window, downsampling (available in all editions) and the Enterprise edition’s compactor can help maintain good performance.
If you have fewer concurrent readers, you can see better per-query performance as resources aren’t divided among as many simultaneous operations.
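The batch-size guidance above can be sketched as a small helper that groups line protocol lines into request bodies. This is a minimal illustration, not an official client: the function names (`batch_size_for`, `batched`, `build_payloads`) and the sample `cpu` measurement are hypothetical, and the 5,000 and 10,000 figures are the starting points quoted in the text rather than tuned values.

```python
from itertools import islice
from typing import Iterable, Iterator, List

# Instance sizes the text groups under the 5,000-points-per-batch guidance.
SMALLER_SIZES = {"medium", "large", "xlarge", "2xlarge", "4xlarge"}


def batch_size_for(instance_size: str) -> int:
    """Return a starting batch size (points per write request)."""
    return 5_000 if instance_size in SMALLER_SIZES else 10_000


def batched(lines: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` line protocol lines."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def build_payloads(lines: Iterable[str], instance_size: str) -> Iterator[str]:
    """Join each batch into one newline-delimited request body."""
    for chunk in batched(lines, batch_size_for(instance_size)):
        yield "\n".join(chunk)


if __name__ == "__main__":
    # Hypothetical sample data: 12,000 points for a made-up measurement.
    points = (
        f"cpu,host=server{i % 10} usage={i % 100}i {1_700_000_000_000_000_000 + i}"
        for i in range(12_000)
    )
    payloads = list(build_payloads(points, "xlarge"))
    print([p.count("\n") + 1 for p in payloads])  # [5000, 5000, 2000]
```

Each payload can then be sent as the body of a single write request, so the per-request overhead is spread across thousands of points instead of being paid per point.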
Monitoring your deployment
After deploying your instance, monitoring performance is crucial for validating your sizing decisions and identifying optimization opportunities. Amazon CloudWatch provides essential metrics for Timestream for InfluxDB 3, including CPU utilization and memory usage. The System Metrics Plugin included with InfluxDB 3’s processing engine collects server-level performance data similar to CloudWatch, including detailed CPU statistics (overall and per-core), memory usage breakdowns, disk I/O performance, and network interface statistics. For detailed, long-term metrics, the System Metrics Plugin is a better option than CloudWatch: CloudWatch records metrics at a fixed 10-second granularity, while the System Metrics Plugin can be configured to run on a custom schedule.
For deeper insights into database-specific performance metrics, you can scrape the metrics endpoint to collect comprehensive internal metrics that help you closely monitor query performance, write throughput, and other service-level indicators. We provide a comprehensive metrics collection solution that scrapes the /metrics endpoint from your Timestream for InfluxDB instances. This solution deploys an Amazon EC2 instance running Telegraf to continuously collect internal engine metrics such as query performance, write throughput, and memory usage patterns, then ingests them into CloudWatch for visualization in a pre-configured Grafana dashboard. The dashboard includes panels for monitoring key performance indicators aligned with instance sizing specifications, helping you validate your sizing decisions and identify optimization opportunities.
For deployment instructions, configuration options, and example scripts, visit our GitHub repository. The repository includes a complete AWS CDK application that automates the setup process, from Telegraf configuration to Grafana dashboard creation.
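To give a feel for what scraped metrics look like once collected, here is a minimal sketch that totals samples per metric name. It assumes the /metrics endpoint returns the Prometheus text exposition format; the metric names in the sample (`example_write_lines_total`, `example_query_count`) are made up for illustration and are not actual engine metrics. In practice, Telegraf in the solution described above does this work for you.

```python
import re
from collections import defaultdict
from typing import Dict

# One sample line: name, optional {labels}, value, optional timestamp.
SAMPLE_RE = re.compile(
    r"^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)"  # metric name
    r"(?P<labels>\{.*\})?"                   # optional label set
    r"\s+(?P<value>\S+)"                     # sample value
    r"(?:\s+-?\d+)?$"                        # optional timestamp
)


def sum_by_metric(text: str) -> Dict[str, float]:
    """Sum sample values per metric name, ignoring labels and comments."""
    totals: Dict[str, float] = defaultdict(float)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # skip lines this simple parser does not understand
        try:
            totals[match.group("name")] += float(match.group("value"))
        except ValueError:
            continue  # skip samples with a non-numeric value
    return dict(totals)


# Hypothetical scrape output; real metric names will differ.
SAMPLE_SCRAPE = """\
# HELP example_write_lines_total Hypothetical counter of written lines
# TYPE example_write_lines_total counter
example_write_lines_total{db="sensors"} 1500
example_write_lines_total{db="logs"} 500
example_query_count 42
"""

if __name__ == "__main__":
    print(sum_by_metric(SAMPLE_SCRAPE))
```

Comparing two scrapes taken a known interval apart turns counters like these into rates, which is the form most useful when checking write throughput against your sizing target.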
Getting started with sizing
When planning your initial deployment, we recommend this approach:
- Estimate your baseline requirements: Calculate your expected write rate (points per second) and query concurrency (simultaneous queries). Add additional headroom to accommodate growth and handle usage spikes.
- Choose a starting instance size:
Use the following as a guide for determining instance size in ideal conditions. Deploy an instance that roughly meets your needs, then adjust according to performance:
| Writes (lines per second) | Reads (queries per second) | Instance class |
| --- | --- | --- |
| ~35,000 | ~40 | db.influx.medium |
| <100,000 | ~140 | db.influx.large |
| ~120,000 | ~290 | db.influx.xlarge |
| ~130,000 | ~400 | db.influx.2xlarge |
| ~140,000 | ~425 | db.influx.4xlarge |
| <200,000 | <430 | db.influx.8xlarge |
| ~200,000 | <430 | db.influx.12xlarge |
| ~210,000 | ~430 | db.influx.16xlarge |
| ~230,000 | ~430 | db.influx.24xlarge |
- For workloads focused on recent data monitoring (3-5 days), start with db.influx.xlarge
- For higher write throughput or query concurrency needs, consider db.influx.2xlarge or db.influx.4xlarge
- For maximum throughput requirements, evaluate db.influx.8xlarge or db.influx.24xlarge
- If you need historical data retention and analysis, choose Enterprise edition with appropriate instance size
- Create and configure a parameter group for your deployment according to your instance size and workload characteristics.
- Test with representative data: Load a dataset that matches your production cardinality and time range, then run your actual query patterns to validate performance.
- Monitor and adjust: Use CloudWatch metrics and the System Metrics Plugin to track CPU, memory, query latency, and write throughput. Scale up if you consistently see high resource utilization or degraded performance.
- A healthy instance, operating at production levels, should have a CPU and memory profile between 40% and 70% on average.
- If you are normally below 40% and your workload is stable, you are overprovisioning.
- If you have spikes that breach the 70% threshold, consider optimizations or scaling. Keep in mind that queries have a bigger impact on CPU usage than writes.
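The first two steps above (estimate a baseline with headroom, then pick a starting size) can be sketched as a simple lookup. The figures are copied from the sizing table with the "~" and "<" qualifiers dropped, so treat them as rough ceilings under ideal conditions. The function name `suggest_instance` and the 30% default headroom are assumptions for illustration, not AWS recommendations.

```python
from typing import List, Optional, Tuple

# (instance class, approx. writes in lines/s, approx. reads in queries/s)
SIZING_GUIDE: List[Tuple[str, int, int]] = [
    ("db.influx.medium", 35_000, 40),
    ("db.influx.large", 100_000, 140),
    ("db.influx.xlarge", 120_000, 290),
    ("db.influx.2xlarge", 130_000, 400),
    ("db.influx.4xlarge", 140_000, 425),
    ("db.influx.8xlarge", 200_000, 430),
    ("db.influx.12xlarge", 200_000, 430),
    ("db.influx.16xlarge", 210_000, 430),
    ("db.influx.24xlarge", 230_000, 430),
]


def suggest_instance(
    writes_per_sec: float,
    queries_per_sec: float,
    headroom: float = 0.3,  # assumed 30% buffer for growth and spikes
) -> Optional[str]:
    """Return the smallest class whose guide figures cover the padded load."""
    target_writes = writes_per_sec * (1 + headroom)
    target_reads = queries_per_sec * (1 + headroom)
    for name, max_writes, max_reads in SIZING_GUIDE:
        if target_writes <= max_writes and target_reads <= max_reads:
            return name
    return None  # load exceeds every figure in the guide


if __name__ == "__main__":
    print(suggest_instance(60_000, 100))   # db.influx.large
    print(suggest_instance(90_000, 250))   # db.influx.2xlarge
    print(suggest_instance(300_000, 10))   # None
```

A `None` result means the padded load is beyond the table, which is a cue to revisit batching, downsampling, or the workload split rather than to pick the largest class blindly. Whatever the lookup returns is only a starting point to validate with representative data.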
Remember that you can scale your instance size up or down as your workload evolves, with only a brief interruption. Start conservatively, monitor carefully, and adjust based on real-world performance data from your specific use case.
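The 40% and 70% utilization bands from the monitoring guidance can be turned into a quick check over a series of CPU or memory samples. This is a minimal sketch: the function name `assess_utilization` and the three labels are hypothetical, and treating any single sample above 70% as a spike is a simplifying assumption you may want to relax.

```python
from statistics import mean
from typing import Sequence


def assess_utilization(
    samples: Sequence[float],
    low: float = 40.0,   # below this on average: likely overprovisioned
    high: float = 70.0,  # spikes above this: consider optimizing or scaling
) -> str:
    """Classify utilization samples (in percent) against the 40-70% band."""
    if not samples:
        raise ValueError("at least one utilization sample is required")
    if max(samples) > high:
        return "scale-or-optimize"
    if mean(samples) < low:
        # Only meaningful if the workload is stable over the sampled window.
        return "overprovisioned"
    return "healthy"


if __name__ == "__main__":
    print(assess_utilization([45, 55, 60]))  # healthy
    print(assess_utilization([20, 25, 30]))  # overprovisioned
    print(assess_utilization([50, 65, 85]))  # scale-or-optimize
```

Run the check separately for CPU and memory, and over a window long enough to include your normal peaks, before deciding to resize.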
Summary
Sizing your Amazon Timestream for InfluxDB 3 deployment effectively requires balancing throughput requirements, query patterns, and cost considerations. Use this blog post as a starting point for determining how Amazon Timestream for InfluxDB 3 can meet your business needs. By analyzing your workload patterns, query complexity, and historical data needs, you can map your requirements to the appropriate configuration.
Visit the Amazon Timestream for InfluxDB documentation today and start building your time series workflows with the power, flexibility, and scalability that your business demands.