Amazon S3 Announces New Features for S3 Select

Posted on: Sep 5, 2018

Amazon S3 announces feature enhancements to S3 Select. S3 Select is an Amazon S3 capability designed to pull out only the data you need from an object, which can dramatically improve the performance and reduce the cost of applications that need to access data in S3.

Today, Amazon S3 Select works on objects stored in CSV and JSON format. Based on customer feedback, we’re happy to announce S3 Select support for Apache Parquet format, JSON Arrays, and BZIP2 compression for CSV and JSON objects. We are also adding support for CloudWatch Metrics for S3 Select, which lets you monitor S3 Select usage for your applications. 

Parquet is widely adopted because it works with a wide variety of query engines, such as Hive, Presto, and Impala, as well as multiple frameworks, including Spark and MapReduce. S3 Select Parquet allows you to use S3 Select to retrieve specific columns from Parquet data stored in S3, and it supports columnar compression using GZIP or Snappy. You can specify the format of the results as either CSV or JSON, and you can determine how the records in the result are delimited.
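As a rough illustration of the Parquet support described above, the following Python sketch builds a request that selects specific columns and chooses the result format and record delimiter. It assumes the boto3 SDK and its `select_object_content` call; the bucket, key, and column names are hypothetical placeholders, and the helper names are made up for this example.

```python
def build_parquet_select_request(bucket, key, columns, output="CSV", record_delimiter="\n"):
    """Build keyword arguments for an S3 Select query against a Parquet object.

    The helper name and its parameters are illustrative, not part of any SDK.
    """
    if output not in ("CSV", "JSON"):
        raise ValueError("output must be 'CSV' or 'JSON'")
    # Only the listed columns are requested, so only they are returned.
    column_list = ", ".join(f"s.{name}" for name in columns)
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": f"SELECT {column_list} FROM S3Object s",
        # The input object is Parquet; the output is CSV or JSON.
        "InputSerialization": {"Parquet": {}},
        "OutputSerialization": {output: {"RecordDelimiter": record_delimiter}},
    }


def run_select(request):
    """Send the request with boto3 and return the selected records as text.

    Assumes boto3 is installed and AWS credentials are configured.
    """
    import boto3  # third-party dependency, imported lazily

    s3 = boto3.client("s3")
    response = s3.select_object_content(**request)
    chunks = []
    # The response payload is an event stream; record data arrives in "Records" events.
    for event in response["Payload"]:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks).decode("utf-8")


# Hypothetical bucket, key, and column names.
request = build_parquet_select_request(
    "example-bucket", "data/readings.parquet", ["city", "temperature"], output="JSON"
)
```

Calling `run_select(request)` would then return one JSON record per line, containing only the two requested columns.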

With JSON Arrays support, you can iterate over inner nodes in JSON objects. You can query these nested JSON objects by specifying path navigation in the FROM clause of S3 Select queries.
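To make the path navigation concrete, here is a minimal Python sketch that builds an S3 Select request whose FROM clause walks into a nested array. The document shape (a top-level `Rules` array whose elements have an `id` field), the bucket, the key, and the helper name are all assumptions for illustration; the resulting dictionary is meant to be passed to boto3's `select_object_content`.

```python
def build_json_path_select_request(bucket, key, path, field):
    """Build keyword arguments for an S3 Select query over nested JSON.

    `path` is appended to S3Object in the FROM clause, for example "[*].Rules[*]".
    The helper name and parameters are illustrative only.
    """
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        # The alias "s" refers to each element reached by the path.
        "Expression": f"SELECT s.{field} FROM S3Object{path} s",
        # "DOCUMENT" is used here on the assumption that the object holds
        # a single JSON document rather than one JSON value per line.
        "InputSerialization": {"JSON": {"Type": "DOCUMENT"}},
        "OutputSerialization": {"JSON": {"RecordDelimiter": "\n"}},
    }


# Assumed object content: {"Rules": [{"id": "1"}, {"id": "2", "expr": "z = DEBUG"}]}
request = build_json_path_select_request(
    "example-bucket", "config/rules.json", "[*].Rules[*]", "id"
)
```

With boto3 installed and credentials configured, `boto3.client("s3").select_object_content(**request)` would submit the query.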

BZIP2 is a widely adopted compression format for textual data and typically achieves higher compression ratios on text than many other compression algorithms.
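A small Python sketch of how a BZIP2-compressed CSV object might be prepared and queried follows. It uses the standard-library `bz2` module for compression and assumes the boto3 `select_object_content` request shape, where `CompressionType` is set to `BZIP2`; the sample data, bucket, key, and helper names are hypothetical.

```python
import bz2


def compress_csv_for_upload(csv_text):
    """Compress CSV text with BZIP2 so it can be uploaded as a .bz2 object."""
    return bz2.compress(csv_text.encode("utf-8"))


def build_bzip2_csv_select_request(bucket, key, expression):
    """Build keyword arguments for an S3 Select query over a BZIP2-compressed CSV.

    The helper name and parameters are illustrative only.
    """
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        "InputSerialization": {
            # "USE" treats the first line as a header so columns can be named.
            "CSV": {"FileHeaderInfo": "USE"},
            # Tells S3 Select to decompress the object before scanning it.
            "CompressionType": "BZIP2",
        },
        "OutputSerialization": {"CSV": {}},
    }


# Hypothetical sample data, bucket, and key.
sample_csv = "city,temperature\nSeattle,12\nPhoenix,35\n"
compressed = compress_csv_for_upload(sample_csv)
request = build_bzip2_csv_select_request(
    "example-bucket",
    "data/readings.csv.bz2",
    "SELECT s.city FROM S3Object s WHERE CAST(s.temperature AS INT) > 20",
)
```

The compressed bytes would be uploaded to the given key first, for example with boto3's `put_object`, before the query is submitted.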

CloudWatch metrics for S3 let you track the health of your applications. These metrics are available at 1-minute intervals and let you quickly identify and act on operational issues. The new S3 Select-specific metrics include S3 Select request count, amount of data scanned, and amount of data returned.
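The sketch below shows one way these metrics might be read programmatically. It builds arguments for boto3's CloudWatch `get_metric_statistics` call with a 60-second period. The metric names (`SelectRequests`, `SelectBytesScanned`, `SelectBytesReturned`) and the `BucketName` and `FilterId` dimensions reflect my understanding of the S3 request metrics and should be checked against the CloudWatch documentation; the bucket name, filter ID, and helper name are placeholders.

```python
from datetime import datetime, timedelta, timezone

# Assumed metric names for S3 Select; verify against the S3 CloudWatch documentation.
SELECT_METRICS = ("SelectRequests", "SelectBytesScanned", "SelectBytesReturned")


def build_select_metric_queries(bucket, filter_id, minutes=60, now=None):
    """Build one get_metric_statistics argument set per S3 Select metric.

    The helper name and parameters are illustrative only.
    """
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(minutes=minutes)
    return [
        {
            "Namespace": "AWS/S3",
            "MetricName": name,
            "Dimensions": [
                {"Name": "BucketName", "Value": bucket},
                {"Name": "FilterId", "Value": filter_id},
            ],
            "StartTime": start,
            "EndTime": end,
            "Period": 60,  # one data point per minute
            "Statistics": ["Sum"],
        }
        for name in SELECT_METRICS
    ]


# Hypothetical bucket and metrics filter ID.
queries = build_select_metric_queries("example-bucket", "EntireBucket", minutes=30)
```

Each dictionary could then be passed to `boto3.client("cloudwatch").get_metric_statistics(**query)`, assuming request metrics have been enabled on the bucket.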

These features for Amazon S3 Select are available in all commercial AWS Regions starting today.

To learn more about Amazon S3 Select, please visit the Selecting Content from Objects page in the Amazon S3 Developer Guide. To learn more about Amazon CloudWatch Metrics for S3, please visit the Monitoring Metrics with Amazon CloudWatch page in the Amazon S3 Developer Guide. To get started, please visit the AWS Management Console.