Stream Real-Time Data in Apache Parquet or ORC Format Using Amazon Kinesis Data Firehose

Posted on: May 10, 2018

We have added support for Apache Parquet and Apache ORC formats in Amazon Kinesis Data Firehose, so you can stream real-time data into Amazon S3 for cost-effective storage and analytics.

Apache Parquet and Apache ORC are columnar data formats that let you store and query data more efficiently and cost-effectively. You can now configure your Kinesis Data Firehose delivery stream to automatically convert data into Parquet or ORC format before delivering it to your S3 bucket. There is no coding required, and you can query the resulting S3 data much faster using Amazon Athena and Amazon Redshift Spectrum, which saves on both storage and query costs. Usage-based charges apply for data format conversion in Kinesis Data Firehose. For more information, see the pricing page.
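The console needs no code, but if you script your delivery streams, the same conversion settings can be expressed through the API. The following Python is a minimal sketch, not a definitive setup. It assumes JSON input records and a table schema already defined in the AWS Glue Data Catalog, neither of which this announcement spells out. The function names, ARNs, database name, table name, and buffer sizes are hypothetical placeholders.

```python
import json


def build_conversion_config(bucket_arn, role_arn, glue_database, glue_table,
                            region, output_format="parquet"):
    """Build an ExtendedS3DestinationConfiguration dict with format conversion on.

    All argument values are supplied by the caller; nothing here is
    prescribed by the announcement.
    """
    serializers = {
        "parquet": {"ParquetSerDe": {}},
        "orc": {"OrcSerDe": {}},
    }
    if output_format not in serializers:
        raise ValueError("output_format must be 'parquet' or 'orc'")

    return {
        "RoleARN": role_arn,
        "BucketARN": bucket_arn,
        # Assumption: a generous buffer; conversion expects a larger
        # minimum buffer size than plain S3 delivery.
        "BufferingHints": {"SizeInMBs": 128, "IntervalInSeconds": 300},
        # Compression is handled by the Parquet/ORC serializer, so the
        # S3-level setting is left uncompressed.
        "CompressionFormat": "UNCOMPRESSED",
        "DataFormatConversionConfiguration": {
            "Enabled": True,
            # Assumption: the schema comes from an existing Glue table.
            "SchemaConfiguration": {
                "RoleARN": role_arn,
                "DatabaseName": glue_database,
                "TableName": glue_table,
                "Region": region,
                "VersionId": "LATEST",
            },
            # Assumption: incoming records are JSON.
            "InputFormatConfiguration": {"Deserializer": {"OpenXJsonSerDe": {}}},
            "OutputFormatConfiguration": {"Serializer": serializers[output_format]},
        },
    }


def create_stream(name, config):
    """Create the delivery stream; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the config builder runs without it

    firehose = boto3.client("firehose")
    return firehose.create_delivery_stream(
        DeliveryStreamName=name,
        DeliveryStreamType="DirectPut",
        ExtendedS3DestinationConfiguration=config,
    )


if __name__ == "__main__":
    # Placeholder identifiers for illustration only.
    example = build_conversion_config(
        bucket_arn="arn:aws:s3:::example-bucket",
        role_arn="arn:aws:iam::123456789012:role/example-firehose-role",
        glue_database="example_db",
        glue_table="example_table",
        region="us-east-1",
    )
    print(json.dumps(example, indent=2))
```

Passing output_format="orc" selects the ORC serializer instead. The builder is kept separate from create_stream so the configuration can be inspected or reviewed before any AWS call is made.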

Amazon Kinesis Data Firehose is the easiest way to load streaming data into AWS. To get started with Kinesis Data Firehose, visit the console and see the developer guide.

For a list of regions where Kinesis Data Firehose is available, see the AWS Region Table.