Posted On: Nov 21, 2022
Amazon S3 improves performance of queries running on Trino by up to 9x when using Amazon S3 Select. Trino is an open source SQL query engine used to run interactive analytics on data stored in Amazon S3. With S3 Select, you “push down” the computational work to filter your S3 data instead of returning the entire object. By using Trino with S3 Select, you retrieve only a subset of data from an object, reducing the amount of data returned and accelerating query performance.
Starting today, with AWS’s upstream contribution to open source Trino, you can use Trino with S3 Select to improve your query performance. S3 Select offloads the heavy lifting of filtering and accessing data inside objects to Amazon S3, which reduces the amount of data that has to be transferred and processed by Trino. For example, if you have a data lake built on Amazon S3 and use Trino today, you can use S3 Select’s filtering capability to quickly and easily run interactive ad-hoc queries.