Optimize your workloads with Amazon Redshift Serverless AI-driven scaling and optimization

The current scaling approach of Amazon Redshift Serverless increases your compute capacity based on the query queue time and scales down when the queuing reduces on the data warehouse. However, you might need to automatically scale compute resources based on factors like query complexity and data volume to meet price-performance targets, irrespective of query queuing. To address this requirement, Redshift Serverless launched the artificial intelligence (AI)-driven scaling and optimization feature, which scales the compute not only based on the queuing, but also factoring data volume and query complexity.

In this post, we describe how Redshift Serverless utilizes the new AI-driven scaling and optimization capabilities to address common use cases. This post also includes example SQLs, which you can run on your own Redshift Serverless data warehouse to experience the benefits of this feature.

Solution overview

The AI-powered scaling and optimization feature in Redshift Serverless provides a user-friendly visual slider to set your desired balance between price and performance. By moving the slider, you can choose between optimized for cost, balanced performance and cost, or optimized for performance. Based on where you position the slider, Amazon Redshift will automatically add or remove resources to ensure better behavior and perform other AI-driven optimizations like automatic materialized views and automatic table design optimization to meet your selected price-performance target.

The slider offers the following options:

Optimizes for cost – Prioritizes cost savings. Redshift attempts to automatically scale up compute capacity when doing so and doesn’t incur additional charges. And it will also attempt to scale down compute for lower cost, despite longer runtime.
Balanced (Recommended) – Offers balance between performance and cost. Redshift scales for performance with a moderate cost increase.
Optimizes for performance – Prioritizes performance. Redshift scales aggressively for maximum performance, potentially incurring higher costs.

In the following sections, we illustrate how the AI-driven scaling and optimization feature can intelligently predict your workload compute needs and scale proactively for three scenarios:

Use case 1 – A long-running complex query. Compute scales based on query complexity.
Use case 2 – A sudden spike in ingestion volume (a three-fold increase, from 720 million to 2.1 billion). Compute scales based on data volume.
Use case 3 – A data lake query scanning large datasets (TBs). Compute scales based on the expected data to be scanned from the data lake. The expected data scan is predicted by machine learning (ML) models based on prior historical run statistics.

In the existing auto scaling mechanism, the use cases don’t increase compute capacity automatically unless queuing is identified across the instance.

Prerequisites

To follow along, complete the following prerequisites:

Create a Redshift Serverless workgroup. For instructions, see Creating a workgroup with a namespace.
While creating the workgroup, choose Performance and Cost Controls and Price-performance target, and adjust the slider to Optimized for performance. For more information, refer to Amazon Redshift adds new AI capabilities, including Amazon Q, to boost efficiency and productivity.
Set up an AWS Identity and Access Management (IAM) role as the default IAM role. Refer to Managing IAM roles created for a cluster using the console for instructions.
We use TPC-DS 1TB Cloud Data Warehouse Benchmark data to demonstrate this feature. Run the SQL statements to create tables and load the TPC-DS 1TB data.

Use case 1: Scale compute based on query complexity

The following query analyzes product sales across multiple channels such as websites, wholesale, and retail stores. This complex query typically takes about 25 minutes to run with the default 128 RPUs. Let’s run this workload on the workgroup created as part of prerequisites.

When a query is run for the first time, the AI scaling system may make a suboptimal decision regarding resource allocation or scaling as the system is still learning the query and data characteristics. However, the system learns from this experience, and when the same query is run again, it can make a more optimal scaling decision. Therefore, if the query didn’t scale during the first run, it is recommended to rerun the query. You can monitor the RPU capacity used on the Redshift Serverless console or by querying the SYS_SERVERLESS_USAGE system view.

The results cache is turned off in the following queries to avoid fetching results from the cache.

SET enable_result_cache_for_session TO off;
with /* TPC-DS demo query */
    ws as
    (select d_year AS ws_sold_year, ws_item_sk,    ws_bill_customer_sk
     ws_customer_sk,    sum(ws_quantity) ws_qty,    sum(ws_wholesale_cost) ws_wc,
        sum(ws_sales_price) ws_sp   from web_sales   left join web_returns on
     wr_order_number=ws_order_number and ws_item_sk=wr_item_sk   join date_dim
     on ws_sold_date_sk = d_date_sk   where wr_order_number is null   group by
     d_year, ws_item_sk, ws_bill_customer_sk   ),
    cs as  
    (select d_year AS cs_sold_year,
     cs_item_sk,    cs_bill_customer_sk cs_customer_sk,    sum(cs_quantity) cs_qty,
        sum(cs_wholesale_cost) cs_wc,    sum(cs_sales_price) cs_sp   from catalog_sales
       left join catalog_returns on cr_order_number=cs_order_number and cs_item_sk=cr_item_sk
       join date_dim on cs_sold_date_sk = d_date_sk   where cr_order_number is
     null   group by d_year, cs_item_sk, cs_bill_customer_sk   ),
    ss as  
    (select
     d_year AS ss_sold_year, ss_item_sk,    ss_customer_sk,    sum(ss_quantity)
     ss_qty,    sum(ss_wholesale_cost) ss_wc,    sum(ss_sales_price) ss_sp
       from store_sales left join store_returns on sr_ticket_number=ss_ticket_number
     and ss_item_sk=sr_item_sk   join date_dim on ss_sold_date_sk = d_date_sk
       where sr_ticket_number is null   group by d_year, ss_item_sk, ss_customer_sk
       ) 
       
       select 
       ss_customer_sk,round(ss_qty/(coalesce(ws_qty+cs_qty,1)),2)
     ratio,ss_qty store_qty, ss_wc store_wholesale_cost, ss_sp store_sales_price,
    coalesce(ws_qty,0)+coalesce(cs_qty,0) other_chan_qty,coalesce(ws_wc,0)+coalesce(cs_wc,0)
     other_chan_wholesale_cost,coalesce(ws_sp,0)+coalesce(cs_sp,0) other_chan_sales_price
    from ss left join ws on (ws_sold_year=ss_sold_year and ws_item_sk=ss_item_sk
     and ws_customer_sk=ss_customer_sk)left join cs on (cs_sold_year=ss_sold_year
     and cs_item_sk=cs_item_sk and cs_customer_sk=ss_customer_sk)where coalesce(ws_qty,0)>0
     and coalesce(cs_qty, 0)>0 order by   ss_customer_sk,  ss_qty desc, ss_wc
     desc, ss_sp desc,  other_chan_qty,  other_chan_wholesale_cost,  other_chan_sales_price,
      round(ss_qty/(coalesce(ws_qty+cs_qty,1)),2);

When the query is complete, run the following SQL to capture the start and end times of the query, which will be used in the next query:

select query_id,query_text,start_time,end_time, elapsed_time/1000000.0 duration_in_seconds
from sys_query_history
where query_text like '%TPC-DS demo query%'
and query_text not like '%sys_query_history%'
order by start_time desc

Let’s assess the compute scaled during the preceding start_time and end_time period. Replace start_time and end_time in the following query with the output of the preceding query:

select * from sys_serverless_usage
where end_time >= 'start_time'
and end_time <= DATEADD(minute,1,'end_time')
order by end_time asc

-- Example
--select * from sys_serverless_usage
--where end_time >= '2024-06-03 00:17:12.322353'
--and end_time <= DATEADD(minute,1,'2024-06-03 00:19:11.553218')
--order by end_time asc

The following screenshot shows an example output.

Use Case 1 output

You can notice the increase in compute over the duration of this query. This demonstrates how Redshift Serverless scales based on query complexity.

Use case 2: Scale compute based on data volume

Let’s consider the web_sales ingestion job. For this example, your daily ingestion job processes 720 million records and completes in an average of 2 minutes. This is what you ingested in the prerequisite steps.

Due to some event (such as month end processing), your volumes increased by three times and now your ingestion job needs to process 2.1 billion records. In an existing scaling approach, this would increase your ingestion job runtime unless the queue time is enough to invoke additional compute resources. But with AI-driven scaling, in performance optimized mode, Amazon Redshift automatically scales compute to complete your ingestion job within usual runtimes. This helps protect your ingestion SLAs.

Run the following job to ingest 2.1 billion records into the web_sales table:

copy web_sales from 's3://redshift-downloads/TPC-DS/2.13/3TB/web_sales/' iam_role default gzip delimiter '|' EMPTYASNULL region 'us-east-1';

Run the following query to compare the duration of ingesting 2.1 billion records and 720 million records. Both ingestion jobs completed in approximately a similar time, despite the three-fold increase in volume.

select query_id,table_name,data_source,loaded_rows,duration/1000000.0 duration_in_seconds , start_time,end_time
from sys_load_history
where
table_name='web_sales'
order by start_time desc

Run the following query with the start times and end times from the previous output:

select * from sys_serverless_usage
where end_time >= 'start_time'
and end_time <= DATEADD(minute,1,'end_time')
order by end_time asc

The following is an example output. You can notice the increase in compute capacity for the ingestion job that processes 2.1 billion records. This illustrates how Redshift Serverless scaled based on data volume.

Use Case 2 Output

Use case 3: Scale data lake queries

In this use case, you create external tables pointing to TPC-DS 3TB data in an Amazon Simple Storage Service (Amazon S3) location. Then you run a query that scans a large volume of data to demonstrate how Redshift Serverless can automatically scale compute capacity as needed.

In the following SQL, provide the ARN of the default IAM role you attached in the prerequisites:

-- Create external schema
create external schema ext_tpcds_3t
from data catalog
database ext_tpcds_db
iam_role '<ARN of the default IAM role attached>'
create external database if not exists;

Create external tables by running DDL statements in the following SQL file. You should see seven external tables in the query editor under the ext_tpcds_3t schema, as shown in the following screenshot.

External Tables

Run the following query using external tables. As mentioned in the first use case, if the query didn’t scale during the first run, it is recommended to rerun the query, because the system will have learned from the previous experience and can potentially provide better scaling and performance for the subsequent run.

The results cache is turned off in the following queries to avoid fetching results from the cache.

SET enable_result_cache_for_session TO off;

with /* TPC-DS demo data lake query */

ws as
(select d_year AS ws_sold_year, ws_item_sk, ws_bill_customer_sk
ws_customer_sk,    sum(ws_quantity) ws_qty,    sum(ws_wholesale_cost) ws_wc,
sum(ws_sales_price) ws_sp   from ext_tpcds_3t.web_sales   left join ext_tpcds_3t.web_returns on
wr_order_number=ws_order_number and ws_item_sk=wr_item_sk   join ext_tpcds_3t.date_dim
on ws_sold_date_sk = d_date_sk   where wr_order_number is null   group by
d_year, ws_item_sk, ws_bill_customer_sk   ),

cs as
(select d_year AS cs_sold_year,
cs_item_sk,    cs_bill_customer_sk cs_customer_sk,    sum(cs_quantity) cs_qty,
sum(cs_wholesale_cost) cs_wc,    sum(cs_sales_price) cs_sp   from ext_tpcds_3t.catalog_sales
left join ext_tpcds_3t.catalog_returns on cr_order_number=cs_order_number and cs_item_sk=cr_item_sk
join ext_tpcds_3t.date_dim on cs_sold_date_sk = d_date_sk   where cr_order_number is
null   group by d_year, cs_item_sk, cs_bill_customer_sk   ),

ss as
(select
d_year AS ss_sold_year, ss_item_sk,    ss_customer_sk,    sum(ss_quantity)
ss_qty,    sum(ss_wholesale_cost) ss_wc,    sum(ss_sales_price) ss_sp
from ext_tpcds_3t.store_sales left join ext_tpcds_3t.store_returns on sr_ticket_number=ss_ticket_number
and ss_item_sk=sr_item_sk   join ext_tpcds_3t.date_dim on ss_sold_date_sk = d_date_sk
where sr_ticket_number is null   group by d_year, ss_item_sk, ss_customer_sk)

SELECT           ss_customer_sk,round(ss_qty/(coalesce(ws_qty+cs_qty,1)),2)
ratio,ss_qty store_qty, ss_wc store_wholesale_cost, ss_sp store_sales_price,
coalesce(ws_qty,0)+coalesce(cs_qty,0) other_chan_qty,coalesce(ws_wc,0)+coalesce(cs_wc,0)    other_chan_wholesale_cost,coalesce(ws_sp,0)+coalesce(cs_sp,0) other_chan_sales_price
FROM ss left join ws on (ws_sold_year=ss_sold_year and ws_item_sk=ss_item_sk and ws_customer_sk=ss_customer_sk)left join cs on (cs_sold_year=ss_sold_year and cs_item_sk=cs_item_sk and cs_customer_sk=ss_customer_sk)
where coalesce(ws_qty,0)>0
and coalesce(cs_qty, 0)>0
order by   ss_customer_sk,  ss_qty desc, ss_wc desc, ss_sp desc,  other_chan_qty,  other_chan_wholesale_cost,  other_chan_sales_price,     round(ss_qty/(coalesce(ws_qty+cs_qty,1)),2);

Review the total elapsed time of the query. You need the start_time and end_time from the results to feed into the next query.

select query_id,query_text,start_time,end_time, elapsed_time/1000000.0 duration_in_seconds
from sys_query_history
where query_text like '%TPC-DS demo data lake query%'
and query_text not like '%sys_query_history%'
order by start_time desc

Run the following query to see how compute scaled during the preceding start_time and end_time period. Replace start_time and end_time in the following query from the output of the preceding query:

select * from sys_serverless_usage
where end_time >= 'start_time'
and end_time <= DATEADD(minute,1,'end_time')
order by end_time asc

The following screenshot shows an example output.

Use Case 3 Output

The increased compute capacity for this data lake query shows that Redshift Serverless can scale to match the data being scanned. This demonstrates how Redshift Serverless can dynamically allocate resources based on query needs.

Considerations when choosing your price-performance target

You can use the price-performance slider to choose your desired price-performance target for your workload. The AI-driven scaling and optimizations provide holistic optimizations using the following models:

Query prediction models – These determine the actual resource needs (memory, CPU consumption, and so on) for each individual query
Scaling prediction models – These predict how the query would behave on different capacity sizes

Let’s consider a query that takes 7 minutes and costs $7. The following figure shows the query runtimes and cost with no scaling.

Scaling Type Example

A given query might scale in a few different ways, as shown below. Based on the price-performance target you chose on the slider, AI-driven scaling predicts how the query trades off performance and cost, and scales it accordingly.

Scaling Types

The slider options yield the following results:

Optimizes for cost – When you choose Optimizes for cost, the warehouse scales up if there is no additional cost or lesser costs to the user. In the preceding example, the superlinear scaling approach demonstrates this behavior. Scaling will only occur if it can be done in a cost-effective manner according to the scaling model predictions. If the scaling models predict that cost-optimized scaling isn’t possible for the given workload, then the warehouse won’t scale.
Balanced (Recommended) – With the Balanced (Recommended) option, the system will scale in favor of performance and there will be a cost increase, but it will be a limited increase in cost. In the preceding example, the linear scaling approach demonstrates this behavior.
Optimizes for performance – With the Optimizes for performance option, the system will scale in favor of performance even though the costs are higher and non-linear. In the preceding example, the sublinear scaling approach demonstrates this behavior. The closer the slider position is to the Optimizes for performance position, the more sublinear scaling is permitted.

The following are additional points to note:

The price-performance slider options are dynamic and they can be changed anytime. However, the impact of these changes will not be realized immediately. The impact of this is effective as the system learns how to scale the current workload and any additional workloads better.
The price-performance slider options, Max capacity and Max RPU-hours are designed to work together. Max capacity and Max RPU-hours are the controls to limit maximum RPUs the data warehouse allowed to scale and maximum RPU hours allowed to consume respectively. These controls are always honored and enforced regardless of the settings on the price-performance target slider.
The AI-driven scaling and optimization feature dynamically adjusts compute resources to optimize query runtime speed while adhering to your price-performance requirements. It considers factors such as query queueing, concurrency, volume, and complexity. The system can either run queries on a compute resource with lower concurrent queries or spin up additional compute resources to avoid queueing. The goal is to provide the best price-performance balance based on your choices.

Monitoring

You can monitor the RPU scaling in the following ways:

Review the RPU capacity used graph on the Amazon Redshift console.
Monitor the ComputeCapacity metric under AWS/Redshift-Serverless and Workgroup in Amazon CloudWatch.
Query the SYS_QUERY_HISTORY view, providing the specific query ID or query text to identify the time period. Use this time period to query the SYS_SERVERLESS_USAGE system view to find the compute_capacity The compute_capacity field will show the RPUs scaled during the query runtime.

Refer to Configure monitoring, limits, and alarms in Amazon Redshift Serverless to keep costs predictable for the step-by-step instructions on using these approaches.

Clean up

Complete the following steps to delete the resources you created to avoid unexpected costs:

Delete the Redshift Serverless workgroup.
Delete the Redshift Serverless associated namespace.

Conclusion

In this post, we discussed how to optimize your workloads to scale based on the changes in data volume and query complexity. We demonstrated an approach to implement more responsive, proactive scaling with the AI-driven scaling feature in Redshift Serverless. Try this feature in your environment, conduct a proof of concept on your specific workloads, and share your feedback with us.

About the Authors

Satesh Sonti is a Sr. Analytics Specialist Solutions Architect based out of Atlanta, specialized in building enterprise data platforms, data warehousing, and analytics solutions. He has over 19 years of experience in building data assets and leading complex data platform programs for banking and insurance clients across the globe.

Ashish Agrawal is a Principal Product Manager with Amazon Redshift, building cloud-based data warehouses and analytics cloud services. Ashish has over 25 years of experience in IT. Ashish has expertise in data warehouses, data lakes, and platform as a service. Ashish has been a speaker at worldwide technical conferences.

Davide Pagano is a Software Development Manager with Amazon Redshift based out of Palo Alto, specialized in building cloud-based data warehouses and analytics cloud services solutions. He has over 10 years of experience with databases, out of which 6 years of experience tailored to Amazon Redshift.

Select your cookie preferences

AWS Big Data Blog