Amazon Kinesis Data Firehose is the easiest way to load streaming data into data stores and analytics tools. Kinesis Data Firehose is a fully managed service that makes it easy to capture, transform, and load massive volumes of streaming data from hundreds of thousands of sources into Amazon S3, Amazon Redshift, Amazon Elasticsearch Service, Kinesis Data Analytics, and Splunk, enabling near real-time analytics and insights.
Kinesis data delivery streams
A Kinesis data delivery stream is the underlying entity of Kinesis Data Firehose. You use Kinesis Data Firehose by creating a Kinesis data delivery stream and then sending data to it.
Easy launch and configuration
You can launch Amazon Kinesis Data Firehose and create a delivery stream to load data into Amazon S3, Amazon Redshift, Amazon Elasticsearch Service, or Splunk with just a few clicks in the AWS Management Console. You can send data to the delivery stream by calling the Firehose API or by running the Linux agent we provide on the data source. Kinesis Data Firehose then continuously loads the data into the destination you configured.
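As a rough illustration of sending data through the Firehose API, the sketch below batches JSON events and hands them to a client's put_record_batch call. The helper name send_events and the newline-delimited JSON encoding are assumptions made for this example. The client is passed in rather than created here, and is assumed to be a boto3 Firehose client or anything exposing the same method.

```python
import json

# PutRecordBatch accepts at most 500 records per call.
MAX_BATCH = 500


def send_events(client, stream_name, events):
    """Send JSON-serializable events in batches; return the number of failed records."""
    # Each record is a dict with a bytes payload under "Data".
    records = [{"Data": (json.dumps(e) + "\n").encode("utf-8")} for e in events]
    failed = 0
    for start in range(0, len(records), MAX_BATCH):
        response = client.put_record_batch(
            DeliveryStreamName=stream_name,
            Records=records[start:start + MAX_BATCH],
        )
        # The response reports how many records in this batch were rejected.
        failed += response.get("FailedPutCount", 0)
    return failed
```

With boto3 installed and credentials configured, this might be called as send_events(boto3.client("firehose"), "my-stream", events), where "my-stream" is a placeholder name. A production sender would also retry the rejected records.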
Load new data in near real-time
You can specify a batch size or batch interval to control how quickly data is uploaded to destinations. For example, you can set the batch interval to 60 seconds if you want to receive new data within 60 seconds of sending it to your delivery stream. Additionally, you can specify whether data should be compressed. The service supports common compression algorithms including GZIP and Snappy. Batching and compressing data before uploading lets you control how quickly you receive new data at the destinations.
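The interaction between batch size and batch interval can be pictured with a toy model. The sketch below is not how the service is implemented. It is a hypothetical Batcher class that flushes when either the buffered bytes reach a size limit or the oldest buffered record reaches an age limit, and optionally GZIP-compresses each flushed batch.

```python
import gzip


class Batcher:
    """Toy model: flush when the buffer reaches max_bytes or its oldest record is max_age_s old."""

    def __init__(self, max_bytes, max_age_s, compress=True):
        self.max_bytes = max_bytes
        self.max_age_s = max_age_s
        self.compress = compress
        self.buffer = []
        self.size = 0
        self.first_ts = None   # arrival time of the oldest buffered record
        self.delivered = []    # stands in for objects written to the destination

    def put(self, data, now):
        self.tick(now)         # flush a stale buffer before accepting new data
        if self.first_ts is None:
            self.first_ts = now
        self.buffer.append(data)
        self.size += len(data)
        if self.size >= self.max_bytes:
            self._flush()

    def tick(self, now):
        """Flush if the oldest buffered record has waited max_age_s or longer."""
        if self.buffer and now - self.first_ts >= self.max_age_s:
            self._flush()

    def _flush(self):
        payload = b"".join(self.buffer)
        self.delivered.append(gzip.compress(payload) if self.compress else payload)
        self.buffer, self.size, self.first_ts = [], 0, None
```

In this model a small size limit or a short interval means more frequent, smaller deliveries, while larger values mean fewer, larger ones.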
Elastic scaling to handle varying data throughput
Once launched, your delivery streams automatically scale up and down to handle input rates of gigabytes per second or more, and maintain data latency at the levels you specify for the stream. No intervention or maintenance is needed.
Support for built-in data format conversion
Columnar data formats such as Apache Parquet and Apache ORC are optimized for cost-effective storage and analytics using services such as Amazon Athena, Amazon Redshift Spectrum, Amazon EMR, and other Hadoop-based tools. Amazon Kinesis Data Firehose can convert incoming data from JSON to Parquet or ORC before storing it in Amazon S3, so you can save on storage and analytics costs.
Integrated data transformations
You can configure Amazon Kinesis Data Firehose to prepare your streaming data before it is loaded to data stores. Select an AWS Lambda function from the Amazon Kinesis Data Firehose delivery stream configuration tab in the AWS Management Console. Amazon Kinesis Data Firehose will automatically apply that function to every input data record and load the transformed data to destinations. Amazon Kinesis Data Firehose provides pre-built Lambda blueprints for converting common data sources such as Apache logs and system logs to JSON and CSV formats. You can use these pre-built blueprints without any change, customize them further, or write your own custom functions. You can also configure Amazon Kinesis Data Firehose to automatically retry failed jobs and back up the raw streaming data.
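A custom transformation function might look like the minimal sketch below. It follows the record contract Firehose uses for Lambda transformations: each incoming record carries a recordId and base64-encoded data, and each returned record carries the same recordId, a result of Ok, Dropped, or ProcessingFailed, and base64-encoded data. The input format (a log line of the form "LEVEL message") and the output field names are assumptions made purely for illustration.

```python
import base64
import json


def lambda_handler(event, context):
    """Convert each 'LEVEL message' log line into a JSON object, one per line."""
    output = []
    for record in event["records"]:
        try:
            raw = base64.b64decode(record["data"]).decode("utf-8")
            # Hypothetical input format: first word is the level, the rest is the message.
            level, _, message = raw.strip().partition(" ")
            transformed = json.dumps({"level": level, "message": message}) + "\n"
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(transformed.encode("utf-8")).decode("ascii"),
            })
        except ValueError:
            # Undecodable input: return the original payload marked as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Every incoming recordId must appear exactly once in the response, which is why failed records are returned rather than skipped.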
Support for multiple data destinations
Amazon Kinesis Data Firehose currently supports Amazon S3, Amazon Redshift, Amazon Elasticsearch Service, and Splunk as destinations. You can specify the destination Amazon S3 bucket, the Amazon Redshift table, the Amazon Elasticsearch domain, or the Splunk cluster into which data should be loaded.
Optional automatic encryption
Amazon Kinesis Data Firehose gives you the option to have your data automatically encrypted after it is uploaded to the destination. As part of the delivery stream configuration, you can specify an AWS Key Management Service (KMS) encryption key.
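When a delivery stream is created through the API rather than the console, the encryption key is part of the Amazon S3 destination settings. The sketch below builds such a settings dictionary. The helper name s3_destination_config is hypothetical and all ARNs are supplied by the caller. The key names follow the S3 destination configuration of the Firehose CreateDeliveryStream API as best understood here, so check them against the current API reference before relying on them.

```python
def s3_destination_config(role_arn, bucket_arn, kms_key_arn=None):
    """Build an S3 destination configuration, with KMS encryption if a key is given."""
    if kms_key_arn:
        encryption = {"KMSEncryptionConfig": {"AWSKMSKeyARN": kms_key_arn}}
    else:
        encryption = {"NoEncryptionConfig": "NoEncryption"}
    return {
        "RoleARN": role_arn,      # IAM role that Firehose assumes to write to the bucket
        "BucketARN": bucket_arn,  # destination bucket
        "EncryptionConfiguration": encryption,
    }
```

The resulting dictionary would typically be passed as the ExtendedS3DestinationConfiguration argument of a boto3 Firehose client's create_delivery_stream call.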
Metrics for monitoring performance
Amazon Kinesis Data Firehose exposes several metrics through the console, as well as Amazon CloudWatch, including volume of data submitted, volume of data uploaded to destination, time from source to destination, and upload success rate. You can use these metrics to monitor the health of your delivery streams, take any necessary actions such as modifying destinations, and ensure that the service is ingesting data and loading it to destinations.
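To read one of these metrics programmatically, you can query Amazon CloudWatch. The sketch below only builds the query parameters for the volume of data submitted, using the AWS/Firehose namespace and the IncomingBytes metric. The helper name incoming_bytes_query, the 5-minute period, and the 60-minute default window are choices made for this example.

```python
from datetime import datetime, timedelta, timezone


def incoming_bytes_query(stream_name, minutes=60, now=None):
    """Build get_metric_statistics parameters for bytes submitted to a delivery stream."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Firehose",
        "MetricName": "IncomingBytes",
        "Dimensions": [{"Name": "DeliveryStreamName", "Value": stream_name}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 300,           # one data point per 5 minutes
        "Statistics": ["Sum"],   # total bytes in each period
    }
```

With boto3 available, the result can be passed as boto3.client("cloudwatch").get_metric_statistics(**incoming_bytes_query("my-stream")), where "my-stream" is a placeholder name.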
With Amazon Kinesis Data Firehose, you pay only for the volume of data you transmit through the service. There are no minimum fees or upfront commitments. You don’t need staff to operate, scale, and maintain infrastructure or custom applications to capture and load streaming data.