Posted On: Apr 13, 2023
Today, Amazon Redshift introduced additional performance enhancements that speed up string-based data processing by 5x to 63x compared to alternative compression encodings such as LZO or ZSTD. Amazon Redshift achieves this through vectorized scans over lightweight, CPU-efficient, dictionary-encoded string columns, which allow the database engine to operate directly over compressed data. These techniques work best on low-cardinality string columns (CHAR or VARCHAR), that is, columns that have up to a few hundred unique string values.
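To illustrate why dictionary encoding lets an engine work on compressed data, here is a minimal Python sketch. It is not Amazon Redshift's implementation; the function names `dictionary_encode` and `count_equal` and the sample values are hypothetical, and the one-byte code width is an assumption chosen for the illustration.

```python
from typing import List, Tuple


def dictionary_encode(values: List[str]) -> Tuple[List[str], bytes]:
    """Replace each string with a one-byte index into a dictionary of distinct values."""
    dictionary: List[str] = []
    index = {}
    codes = bytearray()
    for value in values:
        if value not in index:
            # Assumption for this sketch: one byte per code, so at most 256 distinct values.
            if len(dictionary) == 256:
                raise ValueError("too many distinct values for one-byte codes")
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, bytes(codes)


def count_equal(dictionary: List[str], codes: bytes, needle: str) -> int:
    """Count rows equal to needle by comparing codes, without decoding any string."""
    if needle not in dictionary:
        return 0
    code = dictionary.index(needle)
    # One dictionary lookup, then a scan over small integers instead of strings.
    return codes.count(code)


regions = ["us-east-1", "eu-west-1", "us-east-1", "ap-south-1", "us-east-1"]
dictionary, codes = dictionary_encode(regions)
matches = count_equal(dictionary, codes, "us-east-1")
```

The equality predicate is resolved against the dictionary once, and the scan then compares fixed-width integer codes, which is the kind of work that vectorizes well.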
You can automatically benefit from this new high-performance string enhancement by enabling Automatic Table Optimization (ATO) in your Amazon Redshift data warehouse. If you do not have ATO enabled on your tables, you can receive recommendations from the Amazon Redshift Advisor in the Amazon Redshift Console on a string column’s suitability for BYTEDICT encoding. You can also define new tables that have low-cardinality string columns with BYTEDICT encoding. String enhancements in Amazon Redshift are now available in all AWS Regions where Amazon Redshift is available.
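As a rough sketch of defining a new table with BYTEDICT encoding, the Python helper below builds a `CREATE TABLE` statement and adds `ENCODE BYTEDICT` to columns whose sampled values look low-cardinality. The helper `create_table_ddl`, the table and column names, the `VARCHAR(64)` width, and the `MAX_DISTINCT` cut-off are all hypothetical choices for illustration; only the `ENCODE BYTEDICT` column attribute comes from Amazon Redshift itself.

```python
from typing import Dict, List

# Assumed cut-off for "low cardinality" in this sketch; tune it for your data.
MAX_DISTINCT = 255


def create_table_ddl(table: str, samples: Dict[str, List[str]], width: int = 64) -> str:
    """Build a CREATE TABLE statement, marking low-cardinality columns ENCODE BYTEDICT."""
    column_defs = []
    for name, values in samples.items():
        low_cardinality = len(set(values)) <= MAX_DISTINCT
        encoding = " ENCODE BYTEDICT" if low_cardinality else ""
        column_defs.append(f"    {name} VARCHAR({width}){encoding}")
    return f"CREATE TABLE {table} (\n" + ",\n".join(column_defs) + "\n);"


samples = {
    "order_status": ["shipped", "pending", "shipped", "cancelled"],
    "order_note": [f"note {i}" for i in range(1000)],
}
ddl = create_table_ddl("orders", samples)
print(ddl)
```

Columns left without an explicit encoding fall back to whatever default the warehouse applies. The sketch does not quote or validate identifiers, so it is only suitable for trusted, hand-written names.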