AWS Big Data Blog
Optimize your workloads with Amazon Redshift Serverless AI-driven scaling and optimization
The current scaling approach of Amazon Redshift Serverless increases your compute capacity based on query queue time and scales down when queuing on the data warehouse decreases. However, you might need to automatically scale compute resources based on factors like query complexity and data volume to meet price-performance targets, irrespective of query queuing. To address this requirement, Redshift Serverless launched the artificial intelligence (AI)-driven scaling and optimization feature, which scales compute based not only on queuing, but also on data volume and query complexity.
In this post, we describe how Redshift Serverless uses the new AI-driven scaling and optimization capabilities to address common use cases. This post also includes example SQL statements, which you can run on your own Redshift Serverless data warehouse to experience the benefits of this feature.
Solution overview
The AI-driven scaling and optimization feature in Redshift Serverless provides a user-friendly visual slider to set your desired balance between price and performance. By moving the slider, you can choose between optimized for cost, balanced performance and cost, or optimized for performance. Based on where you position the slider, Amazon Redshift automatically adds or removes compute resources and performs other AI-driven optimizations, such as automatic materialized views and automatic table design optimization, to meet your selected price-performance target.
The slider offers the following options:
- Optimizes for cost – Prioritizes cost savings. Redshift attempts to automatically scale up compute capacity only when doing so doesn’t incur additional charges. It will also attempt to scale down compute for lower cost, even at the expense of longer runtimes.
- Balanced (Recommended) – Offers a balance between performance and cost. Redshift scales for performance with a moderate increase in cost.
- Optimizes for performance – Prioritizes performance. Redshift scales aggressively for maximum performance, potentially incurring higher costs.
In the following sections, we illustrate how the AI-driven scaling and optimization feature can intelligently predict your workload compute needs and scale proactively for three scenarios:
- Use case 1 – A long-running complex query. Compute scales based on query complexity.
- Use case 2 – A sudden spike in ingestion volume (a three-fold increase, from 720 million to 2.1 billion records). Compute scales based on data volume.
- Use case 3 – A data lake query scanning large datasets (TBs). Compute scales based on the expected data to be scanned from the data lake. The expected data scan is predicted by machine learning (ML) models based on prior historical run statistics.
With the existing auto scaling mechanism, none of these use cases would receive additional compute capacity automatically unless queuing is detected on the data warehouse.
Prerequisites
To follow along, complete the following prerequisites:
- Create a Redshift Serverless workgroup. For instructions, see Creating a workgroup with a namespace.
- While creating the workgroup, choose Performance and Cost Controls and Price-performance target, and adjust the slider to Optimized for performance. For more information, refer to Amazon Redshift adds new AI capabilities, including Amazon Q, to boost efficiency and productivity.
- Set up an AWS Identity and Access Management (IAM) role as the default IAM role. Refer to Managing IAM roles created for a cluster using the console for instructions.
- We use TPC-DS 1TB Cloud Data Warehouse Benchmark data to demonstrate this feature. Run the SQL statements to create tables and load the TPC-DS 1TB data.
Use case 1: Scale compute based on query complexity
The following query analyzes product sales across multiple channels such as websites, wholesale, and retail stores. This complex query typically takes about 25 minutes to run with the default 128 RPUs. Let’s run this workload on the workgroup created as part of the prerequisites.
When a query is run for the first time, the AI scaling system may make a suboptimal decision regarding resource allocation or scaling as the system is still learning the query and data characteristics. However, the system learns from this experience, and when the same query is run again, it can make a more optimal scaling decision. Therefore, if the query didn’t scale during the first run, it is recommended to rerun the query. You can monitor the RPU capacity used on the Redshift Serverless console or by querying the SYS_SERVERLESS_USAGE system view.
The results cache is turned off in the following queries to avoid fetching results from the cache.
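The original post’s full query isn’t reproduced here; the following is a representative sketch of a complex cross-channel aggregation over the TPC-DS tables that you can adapt:

SET enable_result_cache_for_session TO off;

-- Illustrative complex query: aggregate sales across the web, catalog, and store channels
WITH channel_sales AS (
    SELECT 'web' AS channel, ws_item_sk AS item_sk, ws_quantity AS quantity, ws_net_paid AS net_paid
    FROM web_sales
    UNION ALL
    SELECT 'catalog', cs_item_sk, cs_quantity, cs_net_paid
    FROM catalog_sales
    UNION ALL
    SELECT 'store', ss_item_sk, ss_quantity, ss_net_paid
    FROM store_sales
)
SELECT channel,
       i_category,
       SUM(quantity) AS total_quantity,
       SUM(net_paid) AS total_net_paid
FROM channel_sales
JOIN item ON item_sk = i_item_sk
GROUP BY channel, i_category
ORDER BY channel, total_net_paid DESC;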
When the query is complete, run the following SQL to capture the start and end times of the query, which will be used in the next query:
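A minimal sketch using the SYS_QUERY_HISTORY view; the query_text filter is illustrative (it matches the sketch query above) and should be adjusted to match the query you actually ran:

SELECT query_id, start_time, end_time
FROM sys_query_history
WHERE query_text LIKE 'WITH channel_sales%'  -- adjust to match your query text
ORDER BY start_time DESC
LIMIT 1;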
Let’s assess how compute scaled during the preceding start_time and end_time period. Replace start_time and end_time in the following query with the output of the preceding query:
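The following is a sketch of that query against the SYS_SERVERLESS_USAGE system view; the quoted placeholders are where your captured values go:

SELECT start_time, end_time, compute_capacity
FROM sys_serverless_usage
WHERE end_time >= 'start_time'    -- replace with the captured start_time
  AND start_time <= 'end_time'    -- replace with the captured end_time
ORDER BY start_time;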
The following screenshot shows an example output.
You can see the increase in compute over the duration of this query. This demonstrates how Redshift Serverless scales based on query complexity.
Use case 2: Scale compute based on data volume
Let’s consider the web_sales ingestion job. For this example, your daily ingestion job processes 720 million records and completes in an average of 2 minutes. This is what you ingested in the prerequisite steps.
Due to some event (such as month-end processing), your volume increased three-fold, and your ingestion job now needs to process 2.1 billion records. With the existing scaling approach, this would increase your ingestion job runtime unless the queue time is enough to invoke additional compute resources. But with AI-driven scaling in performance optimized mode, Amazon Redshift automatically scales compute to complete your ingestion job within the usual runtime. This helps protect your ingestion SLAs.
Run the following job to ingest 2.1 billion records into the web_sales table:
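A minimal sketch of such an ingestion job, assuming the larger dataset sits in a hypothetical S3 prefix in Parquet format; replace the bucket and prefix with your own source:

COPY web_sales
FROM 's3://your-bucket/tpcds-1t/web_sales_3x/'  -- hypothetical location of the 2.1 billion record dataset
IAM_ROLE default
FORMAT AS PARQUET;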
Run the following query to compare the duration of ingesting 2.1 billion records versus 720 million records. You’ll find that both ingestion jobs completed in approximately the same time, despite the three-fold increase in volume.
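A sketch using the SYS_QUERY_HISTORY view; the LIKE filter is illustrative and should match your COPY statements:

SELECT query_id,
       start_time,
       end_time,
       DATEDIFF(second, start_time, end_time) AS duration_seconds
FROM sys_query_history
WHERE query_text LIKE 'COPY web_sales%'  -- adjust to match your ingestion statements
ORDER BY start_time DESC
LIMIT 2;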
Run the following query with the start times and end times from the previous output:
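This follows the same SYS_SERVERLESS_USAGE pattern as the first use case:

SELECT start_time, end_time, compute_capacity
FROM sys_serverless_usage
WHERE end_time >= 'start_time'    -- replace with the ingestion job's start_time
  AND start_time <= 'end_time'    -- replace with the ingestion job's end_time
ORDER BY start_time;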
The following is an example output. You can see the increase in compute capacity for the ingestion job that processed 2.1 billion records. This illustrates how Redshift Serverless scaled based on data volume.
Use case 3: Scale data lake queries
In this use case, you create external tables pointing to TPC-DS 3TB data in an Amazon Simple Storage Service (Amazon S3) location. Then you run a query that scans a large volume of data to demonstrate how Redshift Serverless can automatically scale compute capacity as needed.
In the following SQL, provide the ARN of the default IAM role you attached in the prerequisites:
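A minimal sketch, assuming a hypothetical AWS Glue Data Catalog database name; substitute your own account ID and role name in the ARN:

CREATE EXTERNAL SCHEMA ext_tpcds_3t
FROM DATA CATALOG
DATABASE 'ext_tpcds_3t_db'  -- hypothetical catalog database name
IAM_ROLE 'arn:aws:iam::<account-id>:role/<your-default-role>'
CREATE EXTERNAL DATABASE IF NOT EXISTS;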
Create external tables by running the DDL statements in the following SQL file. You should see seven external tables in the query editor under the ext_tpcds_3t schema, as shown in the following screenshot.
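The full SQL file isn’t reproduced here; the following is an illustrative DDL sketch for one of the seven external tables, with columns abbreviated and a hypothetical S3 path:

CREATE EXTERNAL TABLE ext_tpcds_3t.store_sales (
    ss_sold_date_sk INT,
    ss_item_sk      INT,
    ss_customer_sk  INT,
    ss_quantity     INT,
    ss_net_paid     DECIMAL(7,2)
)
STORED AS PARQUET
LOCATION 's3://your-bucket/tpcds-3t/store_sales/';  -- hypothetical location of the TPC-DS 3TB data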
Run the following query using external tables. As mentioned in the first use case, if the query didn’t scale during the first run, it is recommended to rerun the query, because the system will have learned from the previous experience and can potentially provide better scaling and performance for the subsequent run.
The results cache is turned off in the following queries to avoid fetching results from the cache.
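A representative sketch of a data lake query that scans a large external table:

SET enable_result_cache_for_session TO off;

-- Illustrative data lake query against the external schema created earlier
SELECT ss_sold_date_sk,
       COUNT(*)         AS transactions,
       SUM(ss_net_paid) AS total_net_paid
FROM ext_tpcds_3t.store_sales
GROUP BY ss_sold_date_sk
ORDER BY total_net_paid DESC
LIMIT 100;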
Review the total elapsed time of the query. You need the start_time and end_time from the results to feed into the next query.
Run the following query to see how compute scaled during the preceding start_time and end_time period. Replace start_time and end_time in the following query with the output of the preceding query:
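The same SYS_SERVERLESS_USAGE pattern applies here:

SELECT start_time, end_time, compute_capacity
FROM sys_serverless_usage
WHERE end_time >= 'start_time'    -- replace with the captured start_time
  AND start_time <= 'end_time'    -- replace with the captured end_time
ORDER BY start_time;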
The following screenshot shows an example output.
The increased compute capacity for this data lake query shows that Redshift Serverless can scale to match the data being scanned. This demonstrates how Redshift Serverless can dynamically allocate resources based on query needs.
Considerations when choosing your price-performance target
You can use the price-performance slider to choose your desired price-performance target for your workload. The AI-driven scaling and optimizations provide holistic optimizations using the following models:
- Query prediction models – These determine the actual resource needs (memory, CPU consumption, and so on) for each individual query
- Scaling prediction models – These predict how the query would behave on different capacity sizes
Let’s consider a query that takes 7 minutes and costs $7. The following figure shows the query runtimes and cost with no scaling.
A given query might scale in a few different ways, as shown below. Based on the price-performance target you chose on the slider, AI-driven scaling predicts how the query trades off performance and cost, and scales it accordingly.
The slider options yield the following results:
- Optimizes for cost – When you choose Optimizes for cost, the warehouse scales up only if it incurs no additional cost, or a lower cost, to the user. In the preceding example, the superlinear scaling approach demonstrates this behavior. Scaling will only occur if it can be done in a cost-effective manner according to the scaling model predictions. If the scaling models predict that cost-optimized scaling isn’t possible for the given workload, the warehouse won’t scale.
- Balanced (Recommended) – With the Balanced (Recommended) option, the system scales in favor of performance with a limited increase in cost. In the preceding example, the linear scaling approach demonstrates this behavior.
- Optimizes for performance – With the Optimizes for performance option, the system scales in favor of performance even though the costs are higher and non-linear. In the preceding example, the sublinear scaling approach demonstrates this behavior. The closer the slider is to Optimizes for performance, the more sublinear scaling is permitted.
The following are additional points to note:
- The price-performance slider options are dynamic and can be changed anytime. However, the impact of a change is not realized immediately; it takes effect as the system learns how to better scale the current workload and any additional workloads.
- The price-performance slider, Max capacity, and Max RPU-hours are designed to work together. Max capacity limits the maximum RPUs the data warehouse is allowed to scale to, and Max RPU-hours limits the maximum RPU hours it is allowed to consume. These controls are always honored and enforced regardless of the price-performance target slider setting.
- The AI-driven scaling and optimization feature dynamically adjusts compute resources to optimize query runtime speed while adhering to your price-performance requirements. It considers factors such as query queueing, concurrency, volume, and complexity. The system can either run queries on a compute resource with lower concurrent queries or spin up additional compute resources to avoid queueing. The goal is to provide the best price-performance balance based on your choices.
Monitoring
You can monitor the RPU scaling in the following ways:
- Review the RPU capacity used graph on the Amazon Redshift console.
- Monitor the ComputeCapacity metric under AWS/Redshift-Serverless and Workgroup in Amazon CloudWatch.
- Query the SYS_QUERY_HISTORY view, providing the specific query ID or query text to identify the time period. Use this time period to query the SYS_SERVERLESS_USAGE system view. The compute_capacity field will show the RPUs scaled during the query runtime.
Refer to Configure monitoring, limits, and alarms in Amazon Redshift Serverless to keep costs predictable for the step-by-step instructions on using these approaches.
Clean up
To avoid unexpected costs, delete the resources you created for this walkthrough, such as the Redshift Serverless workgroup and its associated namespace.
Conclusion
In this post, we discussed how to optimize your workloads to scale based on the changes in data volume and query complexity. We demonstrated an approach to implement more responsive, proactive scaling with the AI-driven scaling feature in Redshift Serverless. Try this feature in your environment, conduct a proof of concept on your specific workloads, and share your feedback with us.
About the Authors
Satesh Sonti is a Sr. Analytics Specialist Solutions Architect based out of Atlanta, specialized in building enterprise data platforms, data warehousing, and analytics solutions. He has over 19 years of experience in building data assets and leading complex data platform programs for banking and insurance clients across the globe.
Ashish Agrawal is a Principal Product Manager with Amazon Redshift, building cloud-based data warehouses and analytics cloud services. Ashish has over 25 years of experience in IT. Ashish has expertise in data warehouses, data lakes, and platform as a service. Ashish has been a speaker at worldwide technical conferences.
Davide Pagano is a Software Development Manager with Amazon Redshift based out of Palo Alto, specialized in building cloud-based data warehouses and analytics cloud services solutions. He has over 10 years of experience with databases, including 6 years focused on Amazon Redshift.