Quickly Filter Data in Amazon Redshift using Interleaved Sorting

Posted on: May 11, 2015

You can use Interleaved Sort Keys to quickly filter data without the need for indices or projections in Amazon Redshift. A table with interleaved keys arranges your data so each sort key column has equal importance. While Compound Sort Keys are more performant if you filter on the leading sort key columns, interleaved sort keys provide fast filtering no matter which sort key columns you specify in your WHERE clause. To create an interleaved sort, simply define your sort keys as INTERLEAVED in your CREATE TABLE statement.

The performance benefit of interleaved sorting increases with table size, and is most effective with highly selective queries that filter on multiple columns. For example, assume your table contains 1,000,000 blocks (1 TB per column) with an interleaved sort key of both customer ID and product ID. You will scan 1,000 blocks when you filter on a specific customer or a specific product, a 1000x increase in query speed compared to the unsorted case. If you filter on both customer and product, you will only need to scan a single block.

The interleaved sorting feature will be deployed in every region over the next seven days. The new cluster version will be 1.0.921.

For more information, please see our AWS Blog Post on Interleaved Sorting and review our documentation on Best Practices for Designing Tables.