What is time series data?
Time series data is a sequence of data points recorded over a time interval for measuring events that change over time. Examples are stock prices over time, temperature measurements over time, and the CPU utilization of an EC2 instance over time. With time-series data, each data point consists of a timestamp, one or more attributes, and the event that changes over time. This data is used to derive insights into the performance and health of an application, detect anomalies, and identify optimization opportunities. For example, DevOps engineers might want to view data that measures changes in infrastructure performance metrics, manufacturers might want to track IoT sensor data that measures changes in equipment across a facility, and online marketers might want to analyze clickstream data that captures how a user navigates a website over time. Time series data is generated from multiple sources in extremely high volumes, it needs to be cost-effectively collected in near real time, and it requires efficient storage that helps organize and analyze the data.
What is Amazon Timestream?
Amazon Timestream is a fast, scalable, and serverless time series database service for IoT and operational applications that makes it easy to store and analyze trillions of events per day up to 1,000 times faster and at as little as 1/10th the cost of relational databases. Amazon Timestream saves you time and cost in managing the lifecycle of time series data by keeping recent data in memory and moving historical data to a cost optimized storage tier based upon user defined policies. Amazon Timestream’s adaptive query engine lets you access and analyze recent and historical data together, without having to specify its location. Amazon Timestream has built-in time series analytics functions, helping you identify trends and patterns in your data in near real-time. Amazon Timestream is serverless and automatically scales up or down to adjust capacity and performance, so you don’t need to manage the underlying infrastructure, freeing you to focus on building your applications.
How does Amazon Timestream work?
Amazon Timestream has been designed from the ground up to collect, store, and process time series data. Its serverless architecture supports fully decoupled data ingestion, storage, and query processing systems that can scale independently, enabling Amazon Timestream to offer virtually infinite scale for your application’s needs. Rather than pre-defining the schema at table creation time, a Timestream table’s schema is dynamically created based on the attributes of the incoming time series data, allowing for flexible and incremental schema definition. When stored, Amazon Timestream partitions the data by time and attributes of the data, accelerating data access using a purpose-built index. In addition, Amazon Timestream automates data lifecycle management by offering an in-memory store for recent data, a magnetic store for historical data, and by supporting configurable rules to automatically move data from the memory store to the magnetic store as it reaches a certain age. Amazon Timestream also simplifies data access through its purpose-built adaptive query engine that can seamlessly access and combine data across storage tiers without having to specify the data location, so you can quickly and easily derive insights from your data using SQL. Lastly, Amazon Timestream works seamlessly with your preferred data collection, visualization, analytics, and machine learning services, making it easy for you to include Amazon Timestream in your time series solutions.
How do I get started with Amazon Timestream?
You can get started with Amazon Timestream using the AWS management console, CLI, or SDKs. For more information, including tutorials and other getting started content, please see the developer guide.
Pricing and availability
How much does Amazon Timestream cost?
With Amazon Timestream, you pay only for what you use. You are billed separately for writes, data stored, and data scanned by queries. Amazon Timestream automatically scales your writes, storage, and query capacity based on usage. You can set the data retention policy for each table and choose to store data in an in-memory store or magnetic store. For detailed pricing, see the pricing page.
In what regions is Amazon Timestream available?
For current region availability, see the pricing page.
Performance and scale
What performance can I expect from Amazon Timestream?
Amazon Timestream offers near real-time latencies for data ingestion. Amazon Timestream’s built-in memory store is optimized for rapid point-in-time queries, and the magnetic store is optimized to support fast analytical queries. With Amazon Timestream, you can run queries that analyze tens of gigabytes of time-series data from the memory store within milliseconds, and analytical queries that analyze terabytes of time-series data from the magnetic store within seconds. Scheduled queries further improve query performance by calculating and storing the aggregates, rollups, and other real-time analytics used to power frequently accessed operational dashboards, business reports, applications, and device-monitoring systems.
As your applications continue to send more data, Amazon Timestream automatically scales to accommodate their data ingestion and query needs. You can store exabytes of data in a single table. As your data grows over time, Amazon Timestream uses its distributed architecture and massive amounts of parallelism to process larger and larger volumes of data while keeping query latencies almost unchanged.
How does Amazon Timestream scale?
Amazon Timestream’s serverless architecture supports fully decoupled data ingestion, storage, and query processing systems that can scale independently, enabling Amazon Timestream to offer virtually infinite scale for your application’s needs.
What are the current limits and quotas?
For current limits and quotas, see the documentation.
How can I send data to Amazon Timestream?
You can collect time series data from connected devices, IT systems, and industrial equipment, and write it into Amazon Timestream. You can send data to Amazon Timestream using data collection services such as AWS IoT Core, Amazon Kinesis Data Analytics for Apache Flink, or Telegraf, or through the AWS SDKs. For more information, see the documentation.
Do I need to define a schema before sending data to Amazon Timestream?
No. Amazon Timestream dynamically creates a table’s schema based on a set of dimensional attributes and measures. This offers flexible and incremental schema definition that may be adjusted at any time without affecting availability.
How do I deal with late or future arrival data?
Late arrival data is data that has a timestamp in the past. Future data is data that has a timestamp in the future. Amazon Timestream lets you store and access both kinds.
To store late arrival data, you simply write the data into Amazon Timestream, and the service will automatically determine whether it gets written to the memory store or to the magnetic store, based on the timestamp of the data and on the configured data retention for the memory and magnetic stores.
To store future data, model your data as a multi-measure record, and represent the future timestamp as a measure within the record.
What is the time stamp used by Amazon Timestream when ingesting data?
Amazon Timestream uses the timestamp of the time series event being written into the database. It supports timestamps with nanosecond granularity.
How can I send data to Amazon Timestream using AWS IoT Core?
You can collect data from your IoT devices and store that data in Amazon Timestream using AWS IoT Core rule actions. For more detailed information, see the documentation.
How can I send data to Amazon Timestream using AWS Lambda?
You can create AWS Lambda functions that interact with Amazon Timestream. For more detailed information, see the documentation.
How can I send data to Amazon Timestream using Amazon Kinesis?
You can use Apache Flink to transfer your time series data from Amazon Kinesis directly into Amazon Timestream. For more detailed information, see the documentation.
How can I send data to Amazon Timestream using Amazon MSK?
You can use Apache Flink to send your time series data from Amazon MSK directly into Amazon Timestream. For more detailed information, see the documentation.
How can I send data to Amazon Timestream using open source Telegraf?
You can send time series data collected using open source Telegraf directly into Amazon Timestream using the Telegraf connector. For more detailed information, see the documentation.
How does Amazon Timestream store data?
Amazon Timestream organizes and stores time-series data using its timestamp and organizes data across time based on its dimensional attributes. See Werner Vogels blog for more details. Using Amazon Timestream, you can automate data lifecycle management by simply configuring data retention policies to automatically move data from the memory store to the magnetic store as it reaches the configured age.
What are the benefits of the memory store?
Amazon Timestream’s memory store is a write-optimized store that accepts and deduplicates the incoming time series data. It also accepts and processes late arriving data from devices and applications with intermittent connectivity. The memory store is also optimized for latency sensitive point-in-time queries.
What are the benefits of the magnetic store?
Amazon Timestream’s magnetic store is a read-optimized store that contains historical data. The magnetic store is also optimized for fast analytical queries that scan hundreds of terabytes of data.
Data access, analytics, and machine learning
How can I query data with Amazon Timestream?
You can use SQL to query your time series data stored in Amazon Timestream. You can also conduct time series analytics functions for interpolation, regression, and smoothing using SQL. For more information, see the documentation.
Can I automatically rollup or aggregate or preprocess my data with Amazon Timestream?
Amazon Timestream’s scheduled queries offer a fully managed, serverless, and scalable solution for calculating and storing aggregates, rollups, and other real-time analytics used to power frequently accessed operational dashboards, business reports, applications, and device monitoring systems.
With scheduled queries, you simply define the real-time analytics queries that calculate aggregates, rollups, and other real-time analytics on your incoming data, and Amazon Timestream periodically and automatically runs these queries and reliably writes the query results into a separate table. You can then point your dashboards, reports, applications, and monitoring systems to simply query the destination tables, instead of querying the considerably larger source tables containing the incoming time-series data. This leads to performance and cost reductions by an order of magnitude because the destination tables contain much less data than the source tables, thereby offering faster and cheaper data access and storage.
Can I use a JDBC driver with Amazon Timestream?
You can use a JDBC driver to connect Amazon Timestream to your preferred business intelligence tools and other applications. See the documentation for additional details.
What visualization, analytics, and machine learning (ML) tools can I use with Amazon Timestream?
You can visualize and analyze time-series data in Amazon Timestream using Amazon QuickSight and Grafana. You can also use Amazon SageMaker with Amazon Timestream for your ML needs.
How do I use Amazon Timestream with Amazon QuickSight?
You can create rich and interactive dashboards for your Amazon Timestream time-series data using Amazon QuickSight. For more information, see the documentation.
How do I use Amazon Timestream with Grafana?
You can visualize your Amazon Timestream time-series data and create alerts using Grafana, a multi-platform, open-source analytics and interactive visualization tool. To learn more and find sample applications, see the documentation.
How do I use Amazon Timestream with Amazon SageMaker?
You can use Amazon SageMaker notebooks to integrate your ML models with Amazon Timestream. For more information, see the documentation.
What availability does Amazon Timestream have?
Amazon Timestream provides 99.99% availability. Please refer to the service level agreement (SLA).
Security and compliance
Does Amazon Timestream support data encryption?
In Amazon Timestream, data is always encrypted, whether at rest or in transit. Amazon Timestream also enables you to specify an AWS KMS customer managed key (CMK) for encrypting data in the magnetic store.
What compliance certification readiness does Amazon Timestream meet?
Amazon Timestream is HIPAA eligible, ISO (9001, 27001, 27017, and 27018), PCI DSS, FedRAMP (Moderate), and Health Information Trust Alliance (HITRUST) Common Security Framework (CSF) compliant. Furthermore, Amazon Timestream is in scope for AWS’s SOC reports SOC 1, SOC 2, and SOC 3.
Can I use Amazon Timestream in an Amazon Virtual Private Cloud (VPC)?
You can access Amazon Timestream from your Amazon VPC using VPC endpoints. Amazon VPC endpoints are easy to configure and provide reliable connectivity to Amazon Timestream APIs without requiring an internet gateway or a Network Address Translation (NAT) instance.
What backup options are available for Timestream?
You have two backup options available for your Timestream resources: on-demand backups and scheduled backups. On-demand backups are ad-hoc, one-time backups that can be initiated either from the Timestream console or using AWS Backup. On-demand backups are useful when you want to create a backup prior to making a change to your table that may require you to revert the changes. Scheduled backups are recurring backups that you can configure, using AWS Backup policies, at desired frequencies (e.g. 12 hours, 1 day, 1 week, etc.). Scheduled backups are useful when you want to create ongoing backups to meet your data protection goals.
Are Timestream table backups full or incremental backups ?
The first backup, either on-demand or scheduled, of the table is a full backup and every subsequent backup of the same table is incremental, copying only the data that has changed since the last backup.
How will I be charged and billed for my use of backup and restore capabilities?
Backup and restores are charged based on the backup storage size of the selected table, measured on a ‘GB-Month’ basis. Charges will be shown under ‘Backup’ in your AWS bill and include costs for backup storage, data transfers, restores, and early deletes. As the backups are incremental in nature, the storage size of the subsequent backup of the table is the size of the amount of the data changed since the last backup. Please refer to AWS Backup pricing for additional details.
How do I get started protecting my Timestream data?
To get started, you need to enable AWS Backup to protect your Timestream resources (this is a one-time action). Once enabled, navigate to AWS Management Console or use AWS Backup’s CLI or SDK to create on-demand or scheduled backups of your data, and copy those backups across accounts and Regions. You can configure your backup lifecycle management based on your data protection needs. For more information, refer to creating a backup documentation.
How can I restore my Timestream data?
You can restore your Timestream tables through the AWS Management Console or using AWS Backup’s CLI or SDK. Select the recovery point ID for the resource you want to restore, and provide the required inputs such as destination database name, new table name, and retention properties to start the restore process. Upon successful restore, you can access the data. When you attempt to restore the latest incremental backup of your table, the entire table data is restored. For more information, refer to the documentation.
Visit the Amazon Timestream pricing page.
Instantly get access to the AWS Free Tier.
Get started building with Amazon Timestream in the AWS Management Console.