Amazon Timestream FAQs

General

Time series data is a sequence of data points recorded over a time interval, measuring events that change over time. Examples include stock prices over time, temperature measurements over time, and the CPU utilization of an EC2 instance over time. Each data point consists of a timestamp, one or more attributes, and a measure, the value that changes over time. This data is used to derive insights into the performance and health of an application, detect anomalies, and identify optimization opportunities. For example, DevOps engineers might want to view data that measures changes in infrastructure performance metrics, manufacturers might want to track IoT sensor data that measures changes in equipment across a facility, and online marketers might want to analyze clickstream data that captures how a user navigates a website over time. Time series data is generated from multiple sources in extremely high volumes; it needs to be collected cost-effectively in near real time, and it requires efficient storage that helps organize and analyze the data.
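For illustration, a single CPU-utilization data point could be represented as follows; the dimension and measure names are hypothetical and shown only to make the timestamp/attributes/measure structure concrete:

# A hypothetical time series data point: the dimensions identify the source,
# and the measure holds the value that changes over time.
data_point = {
    "time": "2023-06-01T12:00:00.000Z",                        # timestamp of the event
    "dimensions": {"region": "us-east-1", "host": "ec2-host-1"},  # attributes describing the source
    "measure_name": "cpu_utilization",
    "measure_value": 37.5,                                      # the value recorded at this time
}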

Amazon Timestream is a fast, scalable, and serverless time series database service for IoT and operational applications that makes it easy to store and analyze trillions of events per day up to 1,000 times faster and at as little as 1/10th the cost of relational databases. Amazon Timestream saves you time and cost in managing the lifecycle of time series data by keeping recent data in memory and moving historical data to a cost-optimized storage tier based on user-defined policies. Amazon Timestream’s adaptive query engine lets you access and analyze recent and historical data together, without having to specify its location. Amazon Timestream has built-in time series analytics functions, helping you identify trends and patterns in your data in near real time. Amazon Timestream is serverless and automatically scales up or down to adjust capacity and performance, so you don’t need to manage the underlying infrastructure, freeing you to focus on building your applications.

Amazon Timestream has been designed from the ground up to collect, store, and process time series data. Its serverless architecture supports fully decoupled data ingestion, storage, and query processing systems that can scale independently, enabling Amazon Timestream to offer virtually infinite scale for your application’s needs. Rather than pre-defining the schema at table creation time, a Timestream table’s schema is dynamically created based on the attributes of the incoming time series data, allowing for flexible and incremental schema definition. When data is stored, Amazon Timestream partitions it by time and by its attributes, accelerating data access using a purpose-built index. In addition, Amazon Timestream automates data lifecycle management by offering an in-memory store for recent data, a magnetic store for historical data, and by supporting configurable rules to automatically move data from the memory store to the magnetic store as it reaches a certain age. Amazon Timestream also simplifies data access through its purpose-built adaptive query engine that can seamlessly access and combine data across storage tiers without having to specify the data location, so you can quickly and easily derive insights from your data using SQL. Lastly, Amazon Timestream works seamlessly with your preferred data collection, visualization, analytics, and machine learning services, making it easy for you to include Amazon Timestream in your time series solutions.

You can get started with Amazon Timestream using the AWS Management Console, CLI, or SDKs. For more information, including tutorials and other getting started content, please see the developer guide.
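As a minimal getting-started sketch with the AWS SDK for Python (Boto3), you might create a database and a table along these lines; the database name, table name, and retention values below are placeholders:

import boto3

# Timestream uses a dedicated write client for database, table, and ingestion operations.
write_client = boto3.client("timestream-write")

# Create a database and a table with retention policies for the
# memory store (in hours) and the magnetic store (in days).
write_client.create_database(DatabaseName="my_iot_db")
write_client.create_table(
    DatabaseName="my_iot_db",
    TableName="sensor_readings",
    RetentionProperties={
        "MemoryStoreRetentionPeriodInHours": 24,
        "MagneticStoreRetentionPeriodInDays": 365,
    },
)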

Pricing and availability

With Amazon Timestream, you pay only for what you use. You are billed separately for writes, data stored, and data scanned by queries. Amazon Timestream automatically scales your writes, storage, and query capacity based on usage. You can set the data retention policy for each table and choose to store data in an in-memory store or magnetic store. For detailed pricing, see the pricing page.

For current AWS Region availability, see the pricing page.

Performance and scale

Amazon Timestream offers near real-time latencies for data ingestion. Amazon Timestream’s built-in memory store is optimized for rapid point-in-time queries, and the magnetic store is optimized to support fast analytical queries. With Amazon Timestream, you can run queries that analyze tens of gigabytes of time-series data from the memory store within milliseconds, and analytical queries that analyze terabytes of time-series data from the magnetic store within seconds. Scheduled queries further improve query performance by calculating and storing the aggregates, rollups, and other real-time analytics used to power frequently accessed operational dashboards, business reports, applications, and device-monitoring systems.

As your applications continue to send more data, Amazon Timestream automatically scales to accommodate their data ingestion and query needs. You can store exabytes of data in a single table. As your data grows over time, Amazon Timestream uses its distributed architecture and massive amounts of parallelism to process larger and larger volumes of data while keeping query latencies almost unchanged.

Amazon Timestream’s serverless architecture supports fully decoupled data ingestion, storage, and query processing systems that can scale independently, enabling Amazon Timestream to offer virtually infinite scale for your application’s needs.

For current limits and quotas, see the documentation.

Data ingestion

You can collect time series data from connected devices, IT systems, and industrial equipment, and write it into Amazon Timestream. You can send data to Amazon Timestream using data collection services such as AWS IoT Core, Amazon Kinesis Data Analytics for Apache Flink, or Telegraf, or through the AWS SDKs. For more information, see the documentation.
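As an illustration of writing through the AWS SDK for Python (Boto3), the sketch below sends a single record; the table, dimension, and measure names are placeholders:

import time
import boto3

write_client = boto3.client("timestream-write")

# One record per measurement. Time is the event timestamp as a string,
# interpreted according to TimeUnit (Timestream supports up to nanosecond granularity).
record = {
    "Dimensions": [
        {"Name": "region", "Value": "us-east-1"},
        {"Name": "host", "Value": "ec2-host-1"},
    ],
    "MeasureName": "cpu_utilization",
    "MeasureValue": "37.5",
    "MeasureValueType": "DOUBLE",
    "Time": str(int(time.time() * 1000)),
    "TimeUnit": "MILLISECONDS",
}

write_client.write_records(
    DatabaseName="my_iot_db",
    TableName="sensor_readings",
    Records=[record],
)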

You do not need to define a schema before writing data. Amazon Timestream dynamically creates a table’s schema based on a set of dimensional attributes and measures in the incoming data. This offers flexible and incremental schema definition that can be adjusted at any time without affecting availability.

Late-arriving data is data with a timestamp in the past. Future data is data with a timestamp in the future. Amazon Timestream lets you store and access both kinds.

To store late-arriving data, you simply write the data into Amazon Timestream, and the service automatically determines whether it is written to the memory store or to the magnetic store, based on the timestamp of the data and on the configured data retention for the memory and magnetic stores.

To store future data, model your data as a multi-measure record, and represent the future timestamp as a measure within the record.
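One possible way to model this with the AWS SDK for Python (Boto3) is sketched below: the record’s own time is the ingestion time, and the future timestamp is carried as a TIMESTAMP measure inside a multi-measure record. The table, dimension, and measure names are placeholders, and the assumed value format for the TIMESTAMP measure should be checked against the write API reference:

import time
import boto3

write_client = boto3.client("timestream-write")

# 24 hours from now, expressed as epoch milliseconds (an assumption for this sketch).
future_ms = int(time.time() * 1000) + 24 * 3600 * 1000

record = {
    "Dimensions": [{"Name": "device_id", "Value": "device-001"}],
    "MeasureName": "maintenance_schedule",
    "MeasureValueType": "MULTI",
    "MeasureValues": [
        # The future timestamp stored as a measure within the record.
        {"Name": "scheduled_at", "Value": str(future_ms), "Type": "TIMESTAMP"},
        {"Name": "task", "Value": "filter_replacement", "Type": "VARCHAR"},
    ],
    # The record itself is written with the current time.
    "Time": str(int(time.time() * 1000)),
    "TimeUnit": "MILLISECONDS",
}

write_client.write_records(
    DatabaseName="my_iot_db",
    TableName="maintenance",
    Records=[record],
)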

Amazon Timestream uses the timestamp of the time series event being written into the database. It supports timestamps with nanosecond granularity.

You can collect data from your IoT devices and store that data in Amazon Timestream using AWS IoT Core rule actions. For more detailed information, see the documentation.
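As a rough sketch of such a rule created with Boto3, the example below routes MQTT messages into a Timestream table; the rule SQL, role ARN, names, and the exact shape of the Timestream action fields are assumptions to be checked against the IoT rule action reference:

import boto3

iot_client = boto3.client("iot")

# Route messages published to an MQTT topic into a Timestream table.
iot_client.create_topic_rule(
    ruleName="sensor_to_timestream",
    topicRulePayload={
        "sql": "SELECT temperature, humidity FROM 'factory/sensors/+'",
        "actions": [
            {
                "timestream": {
                    "roleArn": "arn:aws:iam::123456789012:role/iot-timestream-role",
                    "databaseName": "my_iot_db",
                    "tableName": "sensor_readings",
                    # Use the third topic segment (the device ID) as a dimension.
                    "dimensions": [{"name": "device_id", "value": "${topic(3)}"}],
                }
            }
        ],
    },
)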

You can create AWS Lambda functions that interact with Amazon Timestream. For more detailed information, see the documentation.
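For instance, a minimal Lambda handler sketch that writes an incoming event into Timestream with Boto3 might look like the following; the event shape, database, table, and field names are hypothetical:

import time
import boto3

# The client can be created outside the handler and reused across invocations.
write_client = boto3.client("timestream-write")

def handler(event, context):
    # Assume the triggering event carries a device ID and a temperature reading.
    record = {
        "Dimensions": [{"Name": "device_id", "Value": event["device_id"]}],
        "MeasureName": "temperature",
        "MeasureValue": str(event["temperature"]),
        "MeasureValueType": "DOUBLE",
        "Time": str(int(time.time() * 1000)),
        "TimeUnit": "MILLISECONDS",
    }
    write_client.write_records(
        DatabaseName="my_iot_db",
        TableName="sensor_readings",
        Records=[record],
    )
    return {"status": "ok"}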

You can use Apache Flink to transfer your time series data from Amazon Kinesis directly into Amazon Timestream. For more detailed information, see the documentation.

You can use Apache Flink to send your time series data from Amazon MSK directly into Amazon Timestream. For more detailed information, see the documentation.

You can send time series data collected using open source Telegraf directly into Amazon Timestream using the Telegraf connector. For more detailed information, see the documentation.

Data storage

Amazon Timestream organizes and stores time-series data using its timestamp, and further organizes data across time based on its dimensional attributes. See Werner Vogels’ blog for more details. Using Amazon Timestream, you can automate data lifecycle management by simply configuring data retention policies to automatically move data from the memory store to the magnetic store as it reaches the configured age.
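For example, a table’s retention policies can be adjusted at any time with the AWS SDK for Python (Boto3); the values below are placeholders chosen only to illustrate the call:

import boto3

write_client = boto3.client("timestream-write")

# Keep the most recent 12 hours in the memory store and roughly 7 years
# in the magnetic store; data older than that is expired automatically.
write_client.update_table(
    DatabaseName="my_iot_db",
    TableName="sensor_readings",
    RetentionProperties={
        "MemoryStoreRetentionPeriodInHours": 12,
        "MagneticStoreRetentionPeriodInDays": 2555,
    },
)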

Amazon Timestream’s memory store is a write-optimized store that accepts and deduplicates the incoming time series data. It also accepts and processes late-arriving data from devices and applications with intermittent connectivity. The memory store is also optimized for latency-sensitive point-in-time queries.

Amazon Timestream’s magnetic store is a read-optimized store that contains historical data. The magnetic store is also optimized for fast analytical queries that scan hundreds of terabytes of data.

Data access, analytics, and machine learning

You can use SQL to query your time series data stored in Amazon Timestream. You can also use built-in time series analytics functions for interpolation, regression, and smoothing in your SQL queries. For more information, see the documentation.
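As a small sketch using the Boto3 query client, the query below aggregates hypothetical CPU measurements into 5-minute averages over the last hour; the database, table, and measure names are placeholders:

import boto3

query_client = boto3.client("timestream-query")

query = """
SELECT bin(time, 5m) AS binned_time,
       avg(measure_value::double) AS avg_cpu
FROM "my_iot_db"."sensor_readings"
WHERE measure_name = 'cpu_utilization'
  AND time > ago(1h)
GROUP BY bin(time, 5m)
ORDER BY binned_time
"""

response = query_client.query(QueryString=query)
for row in response["Rows"]:
    print([col.get("ScalarValue") for col in row["Data"]])
# Large result sets are paginated; follow response.get("NextToken") for additional pages.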

Amazon Timestream’s scheduled queries offer a fully managed, serverless, and scalable solution for calculating and storing aggregates, rollups, and other real-time analytics used to power frequently accessed operational dashboards, business reports, applications, and device monitoring systems.

With scheduled queries, you define the queries that calculate aggregates, rollups, and other real-time analytics on your incoming data, and Amazon Timestream periodically and automatically runs these queries and reliably writes the results into a separate table. You can then point your dashboards, reports, applications, and monitoring systems at the destination tables instead of querying the considerably larger source tables containing the incoming time-series data. Because the destination tables contain much less data than the source tables, they offer faster and cheaper data access and storage, reducing query cost and improving performance by an order of magnitude.
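A rough sketch of defining such a scheduled query with Boto3 is shown below; the rollup query, ARNs, names, and the nested mapping fields are placeholders under the assumption that they match the CreateScheduledQuery API, and should be checked against its reference before use:

import boto3

query_client = boto3.client("timestream-query")

# Roll up hourly CPU averages; @scheduled_runtime is supplied by Timestream at each run.
rollup_query = """
SELECT region, bin(time, 1h) AS hour,
       avg(measure_value::double) AS avg_cpu
FROM "my_iot_db"."sensor_readings"
WHERE measure_name = 'cpu_utilization'
  AND time BETWEEN @scheduled_runtime - 1h AND @scheduled_runtime
GROUP BY region, bin(time, 1h)
"""

query_client.create_scheduled_query(
    Name="hourly_cpu_rollup",
    QueryString=rollup_query,
    ScheduleConfiguration={"ScheduleExpression": "rate(1 hour)"},
    NotificationConfiguration={
        "SnsConfiguration": {"TopicArn": "arn:aws:sns:us-east-1:123456789012:sq-notify"}
    },
    TargetConfiguration={
        "TimestreamConfiguration": {
            "DatabaseName": "my_iot_db",
            "TableName": "cpu_hourly_rollup",
            "TimeColumn": "hour",
            "DimensionMappings": [{"Name": "region", "DimensionValueType": "VARCHAR"}],
            "MultiMeasureMappings": {
                "TargetMultiMeasureName": "cpu_stats",
                "MultiMeasureAttributeMappings": [
                    {"SourceColumn": "avg_cpu", "MeasureValueType": "DOUBLE"}
                ],
            },
        }
    },
    ScheduledQueryExecutionRoleArn="arn:aws:iam::123456789012:role/sq-execution-role",
    ErrorReportConfiguration={"S3Configuration": {"BucketName": "my-sq-error-reports"}},
)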

You can use a JDBC driver to connect Amazon Timestream to your preferred business intelligence tools and other applications. See the documentation for additional details.

You can visualize and analyze time-series data in Amazon Timestream using Amazon QuickSight and Grafana. You can also use Amazon SageMaker with Amazon Timestream for your ML needs.

You can create rich and interactive dashboards for your Amazon Timestream time-series data using Amazon QuickSight. For more information, see the documentation.

You can visualize your Amazon Timestream time-series data and create alerts using Grafana, a multi-platform, open-source analytics and interactive visualization tool. To learn more and find sample applications, see the documentation.

You can use Amazon SageMaker notebooks to integrate your ML models with Amazon Timestream. For more information, see the documentation.

Availability

Amazon Timestream provides 99.99% availability. Please refer to the service level agreement (SLA).

Security and compliance

In Amazon Timestream, data is always encrypted, whether at rest or in transit. Amazon Timestream also enables you to specify an AWS KMS customer managed key (CMK) for encrypting data in the magnetic store.
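For example, a customer managed key can be associated with a database at creation time using Boto3; the key ARN below is a placeholder, and this assumes the KmsKeyId parameter of CreateDatabase is the intended way to supply the key for magnetic store encryption:

import boto3

write_client = boto3.client("timestream-write")

# Associate a customer managed KMS key with the database at creation time.
write_client.create_database(
    DatabaseName="my_secure_db",
    KmsKeyId="arn:aws:kms:us-east-1:123456789012:key/11111111-2222-3333-4444-555555555555",
)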

Amazon Timestream is HIPAA eligible, ISO (9001, 27001, 27017, and 27018), PCI DSS, FedRAMP (Moderate), and Health Information Trust Alliance (HITRUST) Common Security Framework (CSF) compliant. Furthermore, Amazon Timestream is in scope for AWS’s SOC reports SOC 1, SOC 2, and SOC 3.

You can access Amazon Timestream from your Amazon VPC using VPC endpoints. Amazon VPC endpoints are easy to configure and provide reliable connectivity to Amazon Timestream APIs without requiring an internet gateway or a Network Address Translation (NAT) instance.

Data protection

You have two backup options available for your Timestream resources: on-demand backups and scheduled backups. On-demand backups are ad hoc, one-time backups that can be initiated either from the Timestream console or using AWS Backup. On-demand backups are useful when you want to create a backup prior to making a change to your table that may require you to revert the changes. Scheduled backups are recurring backups that you can configure, using AWS Backup policies, at desired frequencies (for example, every 12 hours, 1 day, or 1 week). Scheduled backups are useful when you want to create ongoing backups to meet your data protection goals.

The first backup of a table, whether on-demand or scheduled, is a full backup; every subsequent backup of the same table is incremental, copying only the data that has changed since the last backup.

Backups and restores are charged based on the backup storage size of the selected table, measured on a ‘GB-Month’ basis. Charges appear under ‘Backup’ in your AWS bill and include costs for backup storage, data transfers, restores, and early deletes. Because backups are incremental, the storage size of each subsequent backup of a table corresponds to the amount of data that has changed since the last backup. Please refer to AWS Backup pricing for additional details.

To get started, you need to enable AWS Backup to protect your Timestream resources (this is a one-time action). Once enabled, navigate to the AWS Management Console or use AWS Backup’s CLI or SDK to create on-demand or scheduled backups of your data, and copy those backups across accounts and Regions. You can configure your backup lifecycle management based on your data protection needs. For more information, refer to the creating a backup documentation.
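As an illustration, an on-demand backup of a Timestream table might be started with the Boto3 AWS Backup client as sketched below; the vault name, role ARN, and assumed table ARN format are placeholders:

import boto3

backup_client = boto3.client("backup")

# Start an on-demand backup job for a Timestream table.
backup_client.start_backup_job(
    BackupVaultName="my-backup-vault",
    # Assumed ARN format for a Timestream table; copy the exact ARN from the console.
    ResourceArn="arn:aws:timestream:us-east-1:123456789012:database/my_iot_db/table/sensor_readings",
    IamRoleArn="arn:aws:iam::123456789012:role/service-role/AWSBackupDefaultServiceRole",
)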

You can restore your Timestream tables through the AWS Management Console or using AWS Backup’s CLI or SDK. Select the recovery point ID for the resource you want to restore, and provide the required inputs, such as the destination database name, new table name, and retention properties, to start the restore process. Upon successful restore, you can access the data. When you restore the latest incremental backup of a table, the entire table’s data is restored. For more information, refer to the documentation.