How do I correlate the query plan with the query report in Amazon Redshift?

Last updated: 2020-10-28

I'm trying to correlate the query plan with the query report in my Amazon Redshift cluster. How can I do that?

Short description

To determine the usage required to run a query in Amazon Redshift, use the EXPLAIN command. The EXPLAIN command displays the execution plan for a query statement without actually running the query. The execution plan outlines the query planning and execution steps involved.

Then, use the SVL_QUERY_REPORT system view to view query information at a cluster slice level. You can use the slice-level information for detecting uneven data distribution across the cluster, which impacts query performance.

Note: In the SVL_QUERY_REPORT, the rows column indicates the number of rows processed per cluster slice. The rows_pre_filter column indicates the total number of rows emitted before rows marked for deletion are filtered out.
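For example, you can compare the two columns side by side. This is a sketch; the query ID shown is a placeholder for your own query ID:

select query, slice, segment, step, rows, rows_pre_filter
from svl_query_report
where query = <query_id>
order by segment, step, slice;

A large gap between rows_pre_filter and rows on a slice suggests that many rows marked for deletion have not yet been vacuumed.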

Amazon Redshift processes the query plan and translates the plan into steps, segments, and streams. For more information, see Query planning and execution workflow.

Resolution

Creating tables and fetching the explain plan and SVL query report for the query

1.    Create two tables with different sort keys and distribution keys.
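For example, the sales and event tables from the TICKIT sample dataset might be created as follows. This is a minimal sketch with abbreviated column lists; the keys shown assume that the join column (eventid) is not the distribution key for either table:

create table sales (
    salesid integer,
    eventid integer,
    pricepaid decimal(8,2))
distkey(salesid) sortkey(salesid);

create table event (
    eventid integer,
    eventname varchar(200))
distkey(eventname) sortkey(eventname);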

2.    Run the following query, where the join is not performed on a distribution key:

select eventname, sum (pricepaid) from sales, event where sales.eventid = event.eventid group by eventname order by 2 desc;

This query distributes the inner table to all compute nodes.

3.    Retrieve the query plan:

EXPLAIN <query>;
                                               QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
 XN Merge  (cost=1002815368414.24..1002815368415.67 rows=571 width=27)
   Merge Key: sum(sales.pricepaid)
   ->  XN Network  (cost=1002815368414.24..1002815368415.67 rows=571 width=27)
         Send to leader
         ->  XN Sort  (cost=1002815368414.24..1002815368415.67 rows=571 width=27)
               Sort Key: sum(sales.pricepaid)
               ->  XN HashAggregate  (cost=2815368386.67..2815368388.10 rows=571 width=27)
                     ->  XN Hash Join DS_BCAST_INNER  (cost=109.98..2815367496.05 rows=178125 width=27)
                           Hash Cond: ("outer".eventid = "inner".eventid)
                           ->  XN Seq Scan on sales  (cost=0.00..1724.56 rows=172456 width=14)
                           ->  XN Hash  (cost=87.98..87.98 rows=8798 width=21)
                                 ->  XN Seq Scan on event  (cost=0.00..87.98 rows=8798 width=21)
(12 rows)

4.    Run the following query:

select * from svl_query_report where query = <query_id> order by segment, step, elapsed_time, rows;

Mapping the query plan with the query report

1.    Run the following query to obtain the SVL_QUERY_REPORT for segment 0:

select query,slice,segment,step,start_time,end_time,elapsed_time,rows,bytes,label from svl_query_report where query = 938787 and segment = 0 order by segment, step, elapsed_time, rows;

This segment corresponds to the following part of the EXPLAIN plan:

->  XN Hash  (cost=87.98..87.98 rows=8798 width=21)
      ->  XN Seq Scan on event  (cost=0.00..87.98 rows=8798 width=21)

Here's an example output:

 query  | slice | segment | step |         start_time         |         end_time          | elapsed_time | rows | bytes  |            label              
--------+-------+---------+------+----------------------------+---------------------------+--------------+------+--------+------------------------------
 938787 |     0 |       0 |    0 | 2020-05-22 11:11:48.828309 | 2020-05-22 11:11:48.82987 |         1561 | 4383 | 128626 | scan   tbl=278788 name=event 
 938787 |     1 |       0 |    0 | 2020-05-22 11:11:48.828309 | 2020-05-22 11:11:48.82987 |         1561 | 4415 | 128918 | scan   tbl=278788 name=event 
 938787 |     0 |       0 |    1 | 2020-05-22 11:11:48.828309 | 2020-05-22 11:11:48.82987 |         1561 | 4383 |      0 | project                      
 938787 |     1 |       0 |    1 | 2020-05-22 11:11:48.828309 | 2020-05-22 11:11:48.82987 |         1561 | 4415 |      0 | project                      
 938787 |     0 |       0 |    2 | 2020-05-22 11:11:48.828309 | 2020-05-22 11:11:48.82987 |         1561 | 4383 | 126660 | bcast                        
 938787 |     1 |       0 |    2 | 2020-05-22 11:11:48.828309 | 2020-05-22 11:11:48.82987 |         1561 | 4415 | 126920 | bcast                        

 (6 rows)

This output indicates that when the segment value is 0, Amazon Redshift performs a sequential scan operation to scan the event table.

2.    Run the following query to obtain the query report of segment 1:

select query,slice,segment,step,start_time,end_time,elapsed_time,rows,bytes,label from svl_query_report where query = 938787 and segment = 1 order by segment, step, elapsed_time, rows;

Here's an example output:

 query  | slice | segment | step |         start_time         |          end_time          | elapsed_time | rows | bytes  |                   label                   
--------+-------+---------+------+----------------------------+----------------------------+--------------+------+--------+-------------------------------------------
 938787 |     1 |       1 |    0 | 2020-05-22 11:11:48.826864 | 2020-05-22 11:11:48.830037 |         3173 |    0 |      0 | scan   tbl=376297 name=Internal Worktable 
 938787 |     0 |       1 |    0 | 2020-05-22 11:11:48.826864 | 2020-05-22 11:11:48.831142 |         4278 | 8798 | 253580 | scan   tbl=376297 name=Internal Worktable 
 938787 |     1 |       1 |    1 | 2020-05-22 11:11:48.826864 | 2020-05-22 11:11:48.830037 |         3173 |    0 |      0 | project                                   
 938787 |     0 |       1 |    1 | 2020-05-22 11:11:48.826864 | 2020-05-22 11:11:48.831142 |         4278 | 8798 |      0 | project                                   
 938787 |     1 |       1 |    2 | 2020-05-22 11:11:48.826864 | 2020-05-22 11:11:48.830037 |         3173 |    0 |      0 | hash   tbl=439                            
 938787 |     0 |       1 |    2 | 2020-05-22 11:11:48.826864 | 2020-05-22 11:11:48.831142 |         4278 | 8798 | 359156 | hash   tbl=439                            
(6 rows)

The query continues to run until the segment value is 1, where a hash operation is performed to build the hash table for the inner table in the join.

3.    Note the part of the EXPLAIN plan that corresponds to a segment value of 2:

->  XN Hash Join DS_BCAST_INNER  (cost=109.98..2815367496.05 rows=178125 width=27)
      Hash Cond: ("outer".eventid = "inner".eventid)
      ->  XN Seq Scan on sales  (cost=0.00..1724.56 rows=172456 width=14)

4.    Run the following query:

select query,slice,segment,step,start_time,end_time,elapsed_time,rows,bytes,label from svl_query_report where query = 938787 and segment = 2 order by segment, step, elapsed_time, rows;

Here's an example output:

  query  | slice | segment | step |         start_time         |          end_time          | elapsed_time | rows  |  bytes  |            label             
--------+-------+---------+------+----------------------------+----------------------------+--------------+-------+---------+------------------------------
 938787 |     1 |       2 |    0 | 2020-05-22 11:11:48.839297 | 2020-05-22 11:11:48.865857 |        26560 | 86519 | 1730380 | scan   tbl=278792 name=sales 
 938787 |     0 |       2 |    0 | 2020-05-22 11:11:48.838371 | 2020-05-22 11:11:48.865857 |        27486 | 85937 | 1718740 | scan   tbl=278792 name=sales 
 938787 |     1 |       2 |    1 | 2020-05-22 11:11:48.839297 | 2020-05-22 11:11:48.865857 |        26560 | 86519 |       0 | project                      
 938787 |     0 |       2 |    1 | 2020-05-22 11:11:48.838371 | 2020-05-22 11:11:48.865857 |        27486 | 85937 |       0 | project                      
 938787 |     1 |       2 |    2 | 2020-05-22 11:11:48.839297 | 2020-05-22 11:11:48.865857 |        26560 | 86519 |       0 | project                      
 938787 |     0 |       2 |    2 | 2020-05-22 11:11:48.838371 | 2020-05-22 11:11:48.865857 |        27486 | 85937 |       0 | project                      
 938787 |     1 |       2 |    3 | 2020-05-22 11:11:48.839297 | 2020-05-22 11:11:48.865857 |        26560 | 86519 |       0 | hjoin  tbl=439               
 938787 |     0 |       2 |    3 | 2020-05-22 11:11:48.838371 | 2020-05-22 11:11:48.865857 |        27486 | 85937 |       0 | hjoin  tbl=439               
 938787 |     1 |       2 |    4 | 2020-05-22 11:11:48.839297 | 2020-05-22 11:11:48.865857 |        26560 | 86519 |       0 | project                      
 938787 |     0 |       2 |    4 | 2020-05-22 11:11:48.838371 | 2020-05-22 11:11:48.865857 |        27486 | 85937 |       0 | project                      
 938787 |     1 |       2 |    5 | 2020-05-22 11:11:48.839297 | 2020-05-22 11:11:48.865857 |        26560 | 86519 |       0 | project                      
 938787 |     0 |       2 |    5 | 2020-05-22 11:11:48.838371 | 2020-05-22 11:11:48.865857 |        27486 | 85937 |       0 | project                      
 938787 |     1 |       2 |    6 | 2020-05-22 11:11:48.839297 | 2020-05-22 11:11:48.865857 |        26560 |   576 |   34916 | aggr   tbl=448               
 938787 |     0 |       2 |    6 | 2020-05-22 11:11:48.838371 | 2020-05-22 11:11:48.865857 |        27486 |   576 |   34916 | aggr   tbl=448               
 938787 |     1 |       2 |    7 | 2020-05-22 11:11:48.839297 | 2020-05-22 11:11:48.865857 |        26560 |   576 |   17728 | dist                         
 938787 |     0 |       2 |    7 | 2020-05-22 11:11:48.838371 | 2020-05-22 11:11:48.865857 |        27486 |   576 |   17768 | dist                         
(16 rows)   

In this example output, when the segment value is 2, the query performs a sequential scan operation to scan the sales table. In the same segment, an aggregate operation is performed to aggregate results, and a hash join operation is performed to join the tables. The join column for one of the tables is neither a distribution key nor a sort key. As a result, the inner table is broadcast to all the compute nodes, which appears as DS_BCAST_INNER in the EXPLAIN plan.

5.    Run the following query to get the SVL_QUERY_REPORT for a query with a segment value of 3:

select query,slice,segment,step,start_time,end_time,elapsed_time,rows,bytes,label from svl_query_report where query = 938787 and segment = 3 order by segment, step, elapsed_time, rows;

This segment corresponds to the following part of the EXPLAIN plan:

->  XN Sort  (cost=1002815368414.24..1002815368415.67 rows=571 width=27)
      Sort Key: sum(sales.pricepaid)
      ->  XN HashAggregate  (cost=2815368386.67..2815368388.10 rows=571 width=27)

Here's an example output:

 query  | slice | segment | step |         start_time         |          end_time          | elapsed_time | rows | bytes |                   label                   
--------+-------+---------+------+----------------------------+----------------------------+--------------+------+-------+-------------------------------------------
 938787 |     1 |       3 |    0 | 2020-05-22 11:11:48.836693 | 2020-05-22 11:11:48.866529 |        29836 |  546 | 28792 | scan   tbl=376298 name=Internal Worktable 
 938787 |     0 |       3 |    0 | 2020-05-22 11:11:48.836654 | 2020-05-22 11:11:48.866529 |        29875 |  606 | 31824 | scan   tbl=376298 name=Internal Worktable 
 938787 |     1 |       3 |    1 | 2020-05-22 11:11:48.836693 | 2020-05-22 11:11:48.866529 |        29836 |  273 | 16580 | aggr   tbl=451                            
 938787 |     0 |       3 |    1 | 2020-05-22 11:11:48.836654 | 2020-05-22 11:11:48.866529 |        29875 |  303 | 18336 | aggr   tbl=451                            
 938787 |     1 |       3 |    2 | 2020-05-22 11:11:48.836693 | 2020-05-22 11:11:48.866529 |        29836 |  273 |     0 | project                                   
 938787 |     0 |       3 |    2 | 2020-05-22 11:11:48.836654 | 2020-05-22 11:11:48.866529 |        29875 |  303 |     0 | project                                   
 938787 |     1 |       3 |    3 | 2020-05-22 11:11:48.836693 | 2020-05-22 11:11:48.866529 |        29836 |  273 | 14396 | sort   tbl=453                            
 938787 |     0 |       3 |    3 | 2020-05-22 11:11:48.836654 | 2020-05-22 11:11:48.866529 |        29875 |  303 | 15912 | sort   tbl=453                            
 938787 |     1 |       3 |    4 | 2020-05-22 11:11:48.836693 | 2020-05-22 11:11:48.866529 |        29836 |    0 |     0 | merge                                     
 938787 |     0 |       3 |    4 | 2020-05-22 11:11:48.836654 | 2020-05-22 11:11:48.866529 |        29875 |    0 |     0 | merge                                     
 938787 |     1 |       3 |    5 | 2020-05-22 11:11:48.836693 | 2020-05-22 11:11:48.866529 |        29836 |    0 |     0 | aggr   tbl=456                            
 938787 |     0 |       3 |    5 | 2020-05-22 11:11:48.836654 | 2020-05-22 11:11:48.866529 |        29875 |    0 |     0 | aggr   tbl=456                            
 938787 |     1 |       3 |    6 | 2020-05-22 11:11:48.836693 | 2020-05-22 11:11:48.866529 |        29836 |    0 |     0 | project                                   
 938787 |     0 |       3 |    6 | 2020-05-22 11:11:48.836654 | 2020-05-22 11:11:48.866529 |        29875 |    0 |     0 | project                                   
(14 rows)

The query continues to run until the segment value is 3, where a hash aggregate operation and a sort operation are performed. The hash aggregate operation aggregates the unsorted grouped rows, and the sort operation evaluates the ORDER BY clause.

6.    Run the following query to get the SVL_QUERY_REPORT for a query with segment values of 4 and 5:

select query,slice,segment,step,start_time,end_time,elapsed_time,rows,bytes,label from svl_query_report where query = 938787 and segment in (4, 5) order by segment, step, elapsed_time, rows;

Here's an example output:

 query  | slice | segment | step |         start_time         |          end_time          | elapsed_time | rows | bytes |                 label                 
--------+-------+---------+------+----------------------------+----------------------------+--------------+------+-------+----------------------------------------
 938787 |     1 |       4 |    0 | 2020-05-22 11:11:48.915148 | 2020-05-22 11:11:48.915215 |           67 |  273 | 14396 | scan   tbl=453 name=Internal Worktable
 938787 |     0 |       4 |    0 | 2020-05-22 11:11:48.915148 | 2020-05-22 11:11:48.915424 |          276 |  303 | 15912 | scan   tbl=453 name=Internal Worktable
 938787 |     1 |       4 |    1 | 2020-05-22 11:11:48.915148 | 2020-05-22 11:11:48.915215 |           67 |  273 |  8888 | return                                 
 938787 |     0 |       4 |    1 | 2020-05-22 11:11:48.915148 | 2020-05-22 11:11:48.915424 |          276 |  303 |  8864 | return                                 
 938787 |  6411 |       5 |    1 | 2020-05-22 11:11:48.914501 | 2020-05-22 11:11:48.916049 |         1548 |  576 |     0 | project
 938787 |  6411 |       5 |    2 | 2020-05-22 11:11:48.914501 | 2020-05-22 11:11:48.916049 |         1548 |  576 | 18360 | return
 938787 |  6411 |       5 |    0 | 2020-05-22 11:11:48.914501 | 2020-05-22 11:11:48.916049 |         1548 |    0 |     0 | merge
 (7 rows)

After all of the compute-node segments finish, the query runs network operations in segments 4 and 5 to send the intermediate results to the leader node for additional processing.

After the query is run, use the following query to check the execution time of the query in milliseconds:

select datediff (ms, exec_start_time, exec_end_time) from stl_wlm_query where query= 938787;

date_diff 
-----------
 101
(1 row)

Optimizing your query

To optimize your query while analyzing the query plan, perform the following steps:

1.    Identify the steps with the highest cost.

2.    Check if there are any high-cost sort operations. Note that query performance depends on the table's data distribution method, along with the amount of data the query scans. Be sure to select the proper distribution style for a table to minimize the impact of the redistribution step. Additionally, define sort keys on suitable columns to improve query speed and reduce the number of blocks that need to be scanned. For more information on how to choose distribution and sort keys, see Amazon Redshift Engineering’s advanced table design playbook: distribution styles and distribution keys.
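To check how evenly a table's rows are distributed across slices, you can query the SVV_TABLE_INFO system view. This is a sketch; the table names are from the earlier example:

select "table", diststyle, sortkey1, skew_rows
from svv_table_info
where "table" in ('sales', 'event');

A skew_rows value close to 1.0 indicates even distribution, while a high value indicates that some slices hold many more rows than others, which can cause the uneven per-slice row counts seen in SVL_QUERY_REPORT.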

The following examples use the STL_ALERT_EVENT_LOG table to identify and correct potential query performance issues:

select query, btrim(event) as event, btrim(solution) as solution from stl_alert_event_log where query = 940313;

 query  |              event               |          solution          
--------+----------------------------------+-------------------------
 940313 | Missing query planner statistics | Run the ANALYZE command
 (1 row)

In this example output, running the ANALYZE command can improve query performance because the table statistics used by the query planner are outdated.
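In this case, run the ANALYZE command on the tables involved to refresh the planner statistics, and then rerun EXPLAIN to confirm that the alert is gone:

analyze sales;
analyze event;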

You can also check the EXPLAIN plan to see if any alerts are populated for the query:

explain select eventname, sum (pricepaid) from sales, event where sales.eventid = event.eventid group by eventname order by 2 desc;
                                                QUERY PLAN                                                 
-----------------------------------------------------------------------------------------------------------
 XN Merge  (cost=1029210993681.72..1029210993682.22 rows=200 width=330)
   Merge Key: sum(sales.pricepaid)
   ->  XN Network  (cost=1029210993681.72..1029210993682.22 rows=200 width=330)
         Send to leader
         ->  XN Sort  (cost=1029210993681.72..1029210993682.22 rows=200 width=330)
               Sort Key: sum(sales.pricepaid)
               ->  XN HashAggregate  (cost=29210993673.57..29210993674.07 rows=200 width=330)
                     ->  XN Hash Join DS_BCAST_INNER  (cost=109.98..29210955741.88 rows=7586340 width=330)
                           Hash Cond: ("outer".eventid = "inner".eventid)
                           ->  XN Seq Scan on sales  (cost=0.00..1724.56 rows=172456 width=16)
                           ->  XN Hash  (cost=87.98..87.98 rows=8798 width=322)
                                 ->  XN Seq Scan on event  (cost=0.00..87.98 rows=8798 width=322)
 ----- Tables missing statistics: event, sales -----
 ----- Update statistics by running the ANALYZE command on these tables -----

3.    Check the join types.

Note: A nested loop is the least optimal join because it is mainly used for cross-joins and some inequality joins.

The following example shows a cross-join between two tables, which uses a nested loop join. The first cost value, 0.00, is the relative cost of returning the first row of the cross-join operation. The second value, 3901467082.32, is the relative cost of completing the cross-join operation. Note the large difference between the first and second cost values. Nested loop joins negatively impact your cluster's performance by overloading the queue with long-running queries:

explain select * from sales cross join event;
                                             QUERY PLAN                                              
---------------------------------------------------------------------------------------------------------
 XN Nested Loop DS_BCAST_INNER  (cost=0.00..3901467082.32 rows=1517267888 width=88)
   ->  XN Seq Scan on sales  (cost=0.00..1724.56 rows=172456 width=53)
   ->  XN Seq Scan on event  (cost=0.00..87.98 rows=8798 width=35)
 ----- Nested Loop Join in the query plan - review the join predicates to avoid Cartesian products -----
(4 rows)

Note: Amazon Redshift selects a join operator based on the distribution style of the table and location of the data required.

To optimize the query performance, the sort key and distribution key have been changed to "eventid" for both tables. In the following example, the merge join is being used instead of a hash join:
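If the tables already exist, one way to make this change is with ALTER TABLE, which Amazon Redshift supports for both distribution and sort keys. This sketch assumes the sales and event tables from the earlier example:

alter table sales alter distkey eventid;
alter table sales alter sortkey (eventid);
alter table event alter distkey eventid;
alter table event alter sortkey (eventid);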

explain select eventname, sum (pricepaid) from sales, event where sales.eventid = event.eventid group by eventname order by 2 desc;
                                            QUERY PLAN                                            
------------------------------------------------------------------------------------------------
 XN Merge  (cost=1000000004965.12..1000000004966.55 rows=571 width=27)
   Merge Key: sum(sales.pricepaid)
   ->  XN Network  (cost=1000000004965.12..1000000004966.55 rows=571 width=27)
         Send to leader
         ->  XN Sort  (cost=1000000004965.12..1000000004966.55 rows=571 width=27)
               Sort Key: sum(sales.pricepaid)
               ->  XN HashAggregate  (cost=4937.55..4938.98 rows=571 width=27)
                     ->  XN Merge Join DS_DIST_NONE  (cost=0.00..4046.93 rows=178125 width=27)
                           Merge Cond: ("outer".eventid = "inner".eventid)
                           ->  XN Seq Scan on sales  (cost=0.00..1724.56 rows=172456 width=14)
                           ->  XN Seq Scan on event  (cost=0.00..87.98 rows=8798 width=21)
(11 rows)

4.    Identify any broadcast operators with high-cost operations.

Note: For small tables, broadcast operators aren't necessarily non-optimal, because redistributing a small table has relatively little impact on query performance.

5.    Run the following query to check the execution time of the optimized query:

select datediff (ms, exec_start_time, exec_end_time) from stl_wlm_query where query= 940715;

 date_diff 
-----------
 34
 (1 row)

The reduced execution time for the optimized query (34 ms, compared with 101 ms earlier) confirms that the changes identified by correlating the query plan with the query report improved performance.

