AWS Database Blog

Amazon Aurora PostgreSQL parameters, Part 3: Optimizer parameters

Organizations today have a strategy to migrate from traditional databases, and as they plan their migration, they don’t want to compromise on performance, availability, or security. Amazon Aurora is a cloud-native relational database service that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases. The PostgreSQL-compatible edition of Aurora delivers up to 3X the throughput of standard PostgreSQL running on the same hardware, enabling existing PostgreSQL applications and tools to run without modification. The combination of PostgreSQL compatibility with Aurora enterprise database capabilities makes it an ideal target for commercial database migrations.

Amazon Aurora PostgreSQL-Compatible Edition has enhancements at the engine level that improve performance for highly concurrent OLTP (online transaction processing, which typically facilitates and manages transaction-oriented applications) workloads, and that help bridge the feature gap between commercial engines and open-source engines. Although the default parameter settings for Aurora PostgreSQL work well for most workloads, customers who migrate from commercial engines may need to tune some parameters to meet performance and other non-functional requirements. Even for workloads migrated from Amazon RDS for PostgreSQL to Aurora PostgreSQL, you may need to revisit some parameter settings because of architectural differences and engine-level optimizations.

In this four-part series, we explain parameters specific to Aurora PostgreSQL. We also delve into certain PostgreSQL database parameters that apply to Aurora PostgreSQL, how they behave differently, and how to set them to take advantage of, or control, additional features in Aurora PostgreSQL.

In part one of this series, we discussed the instance memory-related parameters and Query Plan Management parameters that can help you tune Amazon Aurora PostgreSQL. In part two, we discussed parameters related to replication, security, and logging. In this part, we cover Aurora PostgreSQL optimizer parameters that can improve query performance. In part four, we cover parameters that can align Aurora PostgreSQL more closely with American National Standards Institute (ANSI) standards and reduce the effort of migrating from commercial engines.

Query Optimizer

PostgreSQL uses a cost-based optimizer that works well in most cases. You can review the cost of different steps of a run plan by using the EXPLAIN command. EXPLAIN returns the query plan that was chosen by the optimizer, without running it. The optimizer uses statistics collected about the table’s data, metadata, and resource constraints to arrive at an optimal query plan.
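As a quick illustration, the following statements use the pg_class system catalog, which exists in every database. EXPLAIN prints only the estimated plan, whereas EXPLAIN ANALYZE (used throughout this post) also runs the query and reports actual row counts and timings:

EXPLAIN SELECT count(*) FROM pg_class;          -- estimated plan only
EXPLAIN ANALYZE SELECT count(*) FROM pg_class;  -- plan plus actual runtimes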

Certain metadata about table constraints, such as NOT NULL and FOREIGN KEY constraints, isn’t factored in by the PostgreSQL optimizer. If the optimizer factored these in during the plan generation phase, it could produce more efficient plans.

Amazon Aurora PostgreSQL introduced these enhancements in versions 2.4.0 (compatible with PostgreSQL 10.11) and 3.1.0 (compatible with PostgreSQL 11.6). All subsequent minor version releases for Aurora PostgreSQL 2.x and 3.x have these optimizations. These additional query optimizer features are also available in newer major version releases, such as Aurora PostgreSQL 4.x (compatible with PostgreSQL 12).

In this post, we discuss how these optimizations work by first going through the default PostgreSQL behavior and then the effect of the enhancements on the same plans on the Aurora platform. You can control all these optimizations with parameters, which allows you to revert to the default PostgreSQL behavior if you need to.

You can enable these parameters in the DB cluster parameter group or DB parameter group. Changes to these parameters are dynamic and don’t require a restart of the Aurora PostgreSQL DB instance. You can also change these parameters for a specific session, which is useful for testing their effect.
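For example, the following toggles one of the parameters covered in this post for the current session only (a minimal sketch; the examples later in this post use the same SET and SHOW pattern):

SET apg_enable_semijoin_push_down = on;   -- enable for this session only
SHOW apg_enable_semijoin_push_down;       -- verify the current value
RESET apg_enable_semijoin_push_down;      -- return to the parameter group value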

If you’re new to PostgreSQL or databases, we highly recommend that you run the commands and examples. This helps build familiarity with reading and understanding the output of the EXPLAIN command. For this purpose, we use the four tables of the pgbench schema, which the pgbench tool can easily recreate along with sample data:

pgbench -i -s 50000 -d pgbench --foreign-keys -U username

To show the impact of some of the optimizations present in Aurora PostgreSQL, we used a scale of 50,000, meaning pgbench creates data for 50,000 branches. This is roughly 700 GB of data. You can choose a smaller scale to suit your instance size.

apg_enable_not_in_transform

This parameter improves the query plan by rewriting a NOT IN clause as an anti join, similar to how the IN clause is converted to EXISTS or JOIN during query optimization. Converting NOT IN to NOT EXISTS is complex because of the possible nullability of the columns involved in the NOT IN. Consider the following query:

select count(bid) from pgbench_branches b 
          where not exists 
                (select 1 from pgbench_accounts a where a.bid=b.bid);

If bid in pgbench_accounts has null values, the result of this query is very different from the result of the equivalent NOT IN query:

pgtraining=> select count(bid) from pgbench_branches b where b.bid not in (select a.bid from pgbench_accounts a );
 count
-------
     0
(1 row)
pgtraining=> select count(bid) from pgbench_branches b where not exists (select 1 from pgbench_accounts a where a.bid=b.bid);
 count
-------
     2
(1 row)
pgtraining=> select count(*) from pgbench_accounts where bid is null;
 count
-------
     1
(1 row)

If the NOT IN list contains a NULL, the comparison never evaluates to true: rows that would otherwise qualify evaluate to NULL instead, so they aren’t part of your output. In contrast, NOT EXISTS returns all rows except the matching rows (hence, anti join).
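You can reproduce this three-valued logic directly, without any tables:

SELECT 2 NOT IN (1, 3);     -- true: 2 matches nothing in the list
SELECT 2 NOT IN (1, NULL);  -- NULL, not true, so such a row is filtered out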

But what if we can guarantee that the bid column in pgbench_accounts has no null values? We can ensure this by deleting the existing null values and adding a NOT NULL constraint to the column.
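The cleanup and constraint might look like the following (a sketch; the default pgbench schema doesn’t declare bid in pgbench_accounts as NOT NULL):

DELETE FROM pgbench_accounts WHERE bid IS NULL;             -- remove existing null rows
ALTER TABLE pgbench_accounts ALTER COLUMN bid SET NOT NULL; -- forbid new ones

After this change, the same check confirms that no null values remain: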

pgtraining=> select count(*) from pgbench_accounts where bid is null;
 count
-------
     0
(1 row)

Now both queries return the same result:

pgtraining=> select count(bid) from pgbench_branches b where not exists (select 1 from pgbench_accounts a where a.bid=b.bid);
 count
-------
     2
(1 row)
pgtraining=> select count(bid) from pgbench_branches b where b.bid not in (select a.bid from pgbench_accounts a );
 count
-------
     2
(1 row)

Next, let’s compare the query plans after setting the column to NOT NULL and adding an index on bid.
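The index might be created as follows (a sketch; the name matches the index that appears in the plans):

CREATE INDEX pgbennch_accounts_bid ON pgbench_accounts (bid);

The following is the query plan for each variant: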

pgtraining=> explain (analyze,timing) select count(bid) from pgbench_branches b where not exists (select 1 from pgbench_accounts a where a.bid=b.bid);
                                                                              QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=641.16..641.17 rows=1 width=8) (actual time=11.869..11.869 rows=1 loops=1)
   ->  Nested Loop Anti Join  (cost=0.57..641.16 rows=1 width=4) (actual time=11.866..11.867 rows=0 loops=1)
         ->  Seq Scan on pgbench_branches b  (cost=0.00..15.00 rows=1000 width=4) (actual time=0.005..0.084 rows=1000 loops=1)
         ->  Index Only Scan using pgbennch_accounts_bid on pgbench_accounts a  (cost=0.57..1933.72 rows=100000 width=4) (actual time=0.011..0.011 rows=1 loops=1000)
               Index Cond: (bid = b.bid)
               Heap Fetches: 0
 Planning Time: 0.359 ms
 Execution Time: 11.898 ms
(8 rows)
pgtraining=> explain (analyze,timing) select count(bid) from pgbench_branches b where b.bid not in (select a.bid from pgbench_accounts a );
                                                                               QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1457409519.32..1457409519.33 rows=1 width=8) (actual time=6725216.832..6725216.833 rows=1 loops=1)
   ->  Seq Scan on pgbench_branches b  (cost=0.57..1457409518.07 rows=500 width=4) (actual time=6725216.829..6725216.829 rows=0 loops=1)
         Filter: (NOT (SubPlan 1))
         Rows Removed by Filter: 1000
         SubPlan 1
           ->  Materialize  (cost=0.57..2664819.57 rows=100000000 width=4) (actual time=0.009..3717.008 rows=49950001 loops=1000)
                 ->  Index Only Scan using pgbennch_accounts_bid on pgbench_accounts a  (cost=0.57..1774194.57 rows=100000000 width=4) (actual time=0.033..9695.716 rows=99900001 loops=1)
                       Heap Fetches: 0
 Planning Time: 0.067 ms
 Execution Time: 6725383.759 ms
(10 rows)

The Execution Time reported in the EXPLAIN output for both queries shows that NOT IN runs for more than 1.5 hours, whereas NOT EXISTS returns the same result set in less than 12 milliseconds, an improvement of roughly 560,000 times over NOT IN.

Now let’s see how Aurora PostgreSQL rewrites the NOT IN query and generates an optimal plan that uses the index to perform a faster anti join when apg_enable_not_in_transform is on:

pgtraining=> set apg_enable_not_in_transform=on;
SET
pgtraining=> explain (analyze,timing) select count(bid) from pgbench_branches b where b.bid not in (select a.bid from pgbench_accounts a );
                                                                              QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=641.16..641.17 rows=1 width=8) (actual time=11.919..11.920 rows=1 loops=1)
   ->  Nested Loop Anti Join  (cost=0.57..641.16 rows=1 width=4) (actual time=11.917..11.917 rows=0 loops=1)
         ->  Seq Scan on pgbench_branches b  (cost=0.00..15.00 rows=1000 width=4) (actual time=0.005..0.084 rows=1000 loops=1)
         ->  Index Only Scan using pgbennch_accounts_bid on pgbench_accounts a  (cost=0.57..1933.72 rows=100000 width=4) (actual time=0.011..0.011 rows=1 loops=1000)
               Index Cond: (bid = b.bid)
               Heap Fetches: 0
 Planning Time: 0.564 ms
 Execution Time: 12.001 ms
(8 rows)

OLTP systems that also run some form of batch reporting or light analytics, where such queries are common, can gain noticeable performance improvements from this feature. We recommend rigorous testing before you enable this parameter at the instance level; alternatively, use the session-based approach as demonstrated.
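If testing shows a net benefit, you can change the parameter in your custom DB cluster parameter group, for example with the AWS CLI (a sketch; my-apg-cluster-params is a placeholder for your parameter group name):

aws rds modify-db-cluster-parameter-group \
    --db-cluster-parameter-group-name my-apg-cluster-params \
    --parameters "ParameterName=apg_enable_not_in_transform,ParameterValue=on,ApplyMethod=immediate"

Because the parameter is dynamic, ApplyMethod=immediate applies the change without a restart.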

apg_enable_semijoin_push_down

This Aurora PostgreSQL parameter enables the use of semi join filters for hash joins. Based on selectivity estimates, the optimizer builds a bloom filter from the values in the inner table of the hash join. For a hash join, the table with the lower cardinality is chosen for building the hash. After the hash is built, column values from the outer (typically bigger) table are used to evaluate the hash condition and generate the matching rows for the join output.

When apg_enable_semijoin_push_down is enabled, the optimizer avoids a second pass over rows in the result set that can’t satisfy the join condition. To do this, a bloom filter is applied during the sequential scan that builds the outer result set. This can slow down the sequential scan a bit, but the bloom filter reduces the result set that must be checked against the hash condition in subsequent stages.

For example, a join between the distinct branches in pgbench_accounts and those rows in pgbench_tellers where tbalance is greater than a certain amount results in a hash join. The following is the explain plan when this parameter is not enabled:

pgtraining=> show apg_enable_semijoin_push_down;
 apg_enable_semijoin_push_down
-------------------------------
 off
(1 row)
pgtraining=> explain (verbose, analyze ) select count(*) from (select distinct bid from pgbench_accounts) a inner join pgbench_tellers b on a.bid=b.bid where tbalance<90000;
                                                                               QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=144499212.92..144499212.93 rows=1 width=8) (actual time=3608401.672..3608401.673 rows=1 loops=1)
   Output: count(*)
   ->  Hash Join  (cost=144488077.36..144498687.16 rows=210303 width=0) (actual time=3608235.118..3608377.416 rows=500000 loops=1)
         Inner Unique: true
         Hash Cond: (b.bid = pgbench_accounts.bid)
         ->  Seq Scan on public.pgbench_tellers b  (cost=0.00..9298.95 rows=499330 width=4) (actual time=0.009..62.637 rows=500000 loops=1)
               Output: b.tid, b.bid, b.tbalance, b.filler
               Filter: (b.tbalance < 90000)
         ->  Hash  (cost=144487812.06..144487812.06 rows=21224 width=4) (actual time=3608234.981..3608234.981 rows=50000 loops=1)
               Output: pgbench_accounts.bid
               Buckets: 65536 (originally 32768)  Batches: 1 (originally 1)  Memory Usage: 2270kB
               ->  HashAggregate  (cost=144487387.57..144487599.81 rows=21224 width=4) (actual time=3608222.159..3608228.311 rows=50000 loops=1)
                     Output: pgbench_accounts.bid
                     Group Key: pgbench_accounts.bid
                     ->  Seq Scan on public.pgbench_accounts  (cost=0.00..131985642.06 rows=5000698206 width=4) (actual time=1.654..2355970.400 rows=5000000000 loops=1)
                           Output: pgbench_accounts.aid, pgbench_accounts.bid, pgbench_accounts.abalance, pgbench_accounts.filler
 Planning Time: 0.097 ms
 Execution Time: 3608402.563 ms
(18 rows)

If we allow Aurora PostgreSQL to apply a semi join filter when the result set from pgbench_tellers is being built, the subsequent join step becomes simpler. In the following explain plan, Aurora PostgreSQL can apply a bloom filter (with estimated selectivity of 0.42117), which reduces the number of rows that need to be evaluated for the hash condition:

pgtraining=> set apg_enable_semijoin_push_down=on ;
SET
pgtraining=> explain (verbose, analyze ) select count(*) from (select distinct bid from pgbench_accounts) a inner join pgbench_tellers b on a.bid=b.bid where tbalance<90000;
                                                                               QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=144499212.92..144499212.93 rows=1 width=8) (actual time=2891246.069..2891246.070 rows=1 loops=1)
   Output: count(*)
   ->  Hash Join  (cost=144488077.36..144498687.16 rows=210303 width=0) (actual time=2890934.583..2891213.472 rows=500000 loops=1)
         Inner Unique: true
         Hash Cond: (b.bid = pgbench_accounts.bid)
         SemiJoin Filter SJF1 Created Based on: (b.bid = pgbench_accounts.bid)
         SemiJoin Est Selectivity: 0.42117
         SemiJoin Est Rows Filtered: 289027.4
         SemiJoin Est Outer Distincts: 50393.0
         SemiJoin Est Inner Distincts: 21224.0
         ->  Seq Scan on public.pgbench_tellers b  (cost=0.00..9298.95 rows=499330 width=4) (actual time=0.891..172.662 rows=500000 loops=1)
               Output: b.tid, b.bid, b.tbalance, b.filler
               Filter: (b.tbalance < 90000)
               SemiJoin Filter Use: SJF1 Type: HashTable  Lookups: 29695  Removes: 0
         ->  Hash  (cost=144487812.06..144487812.06 rows=21224 width=4) (actual time=2890933.569..2890933.569 rows=50000 loops=1)
               Output: pgbench_accounts.bid
               Buckets: 65536 (originally 32768)  Batches: 1 (originally 1)  Memory Usage: 2270kB
               ->  HashAggregate  (cost=144487387.57..144487599.81 rows=21224 width=4) (actual time=2890919.922..2890926.349 rows=50000 loops=1)
                     Output: pgbench_accounts.bid
                     Group Key: pgbench_accounts.bid
                     ->  Seq Scan on public.pgbench_accounts  (cost=0.00..131985642.06 rows=5000698206 width=4) (actual time=0.697..1591058.525 rows=5000000000 loops=1)
                           Output: pgbench_accounts.aid, pgbench_accounts.bid, pgbench_accounts.abalance, pgbench_accounts.filler
 Planning Time: 0.100 ms
 Execution Time: 2891247.234 ms
(24 rows)

Even though the sequential scan runs slightly slower because of the semi join pushdown, overall query performance is better (less than 49 minutes as compared to a little over 1 hour when the semi join pushdown isn’t enabled). The optimization achieved depends on the selectivity of the inner result set.

Once again, although this is an instance-level parameter, it could directly affect quite a few of the SQL queries in your workload. Session-based usage lets you limit the change to the queries that benefit from it.

apg_enable_remove_redundant_inner_joins

This parameter improves query performance by removing a join operation that doesn’t impact the final result of the query (the result remains the same whether or not the join is run). An example use case is a join between a child and parent table using a foreign key, as in the following query:

SELECT teller.tid,teller.bid,teller.tbalance FROM pgbench_tellers teller
JOIN pgbench_branches branches
ON teller.bid=branches.bid WHERE tbalance>100;

If we look at the time spent in the different steps of the query plan, we can see that most of the time is spent running the join. The overall query takes 173.109 milliseconds, and the hash join itself takes 155.068 milliseconds to return all the rows from that step:

pgtraining=> EXPLAIN (ANALYZE, TIMING) SELECT teller.tid,teller.bid,teller.tbalance FROM pgbench_tellers teller
JOIN pgbench_branches branches
ON teller.bid=branches.bid WHERE tbalance>100;
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=1673.10..13298.21 rows=497136 width=12) (actual time=11.549..155.068 rows=497452 loops=1)
   Hash Cond: (teller.bid = branches.bid)
   ->  Seq Scan on pgbench_tellers teller  (cost=0.00..10320.03 rows=497136 width=12) (actual time=0.008..62.330 rows=497452 loops=1)
         Filter: (tbalance > 100)
         Rows Removed by Filter: 2550
   ->  Hash  (cost=1013.60..1013.60 rows=52760 width=4) (actual time=11.483..11.483 rows=50002 loops=1)
         Buckets: 65536  Batches: 1  Memory Usage: 2270kB
         ->  Seq Scan on pgbench_branches branches  (cost=0.00..1013.60 rows=52760 width=4) (actual time=0.005..5.510 rows=50002 loops=1)
 Planning Time: 0.143 ms
 Execution Time: 173.109 ms
(10 rows)

Now let’s look at the table structure for pgbench_tellers:

pgtraining=> \d pgbench_tellers
              Table "public.pgbench_tellers"
  Column  |     Type      | Collation | Nullable | Default
----------+---------------+-----------+----------+---------
 tid      | integer       |           | not null |
 bid      | integer       |           |          |
 tbalance | integer       |           |          |
 filler   | character(84) |           |          |
Indexes:
    "pgbench_tellers_pkey" PRIMARY KEY, btree (tid)
Foreign-key constraints:
    "pgbench_tellers_bid_fkey" FOREIGN KEY (bid) REFERENCES pgbench_branches(bid)
Referenced by:
    TABLE "pgbench_history" CONSTRAINT "pgbench_history_tid_fkey" FOREIGN KEY (tid) REFERENCES pgbench_tellers(tid)

In the preceding code, pgbench_tellers is a child table, and the FOREIGN KEY constraint makes sure that none of its rows have a bid value that doesn’t exist in the bid column of pgbench_branches. So in this case, we can avoid the join by using simpler SQL:

pgtraining=> EXPLAIN (ANALYZE, TIMING) SELECT teller.tid,teller.bid,teller.tbalance FROM pgbench_tellers teller WHERE tbalance>100;
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on pgbench_tellers teller  (cost=0.00..10320.03 rows=497136 width=12) (actual time=0.011..63.147 rows=497452 loops=1)
   Filter: (tbalance > 100)
   Rows Removed by Filter: 2550
 Planning Time: 0.037 ms
 Execution Time: 81.025 ms
(5 rows)

Most users are aware of these SQL optimization techniques and tune their queries accordingly. Unlike the previous two parameters, this parameter can catch a rogue, ill-performing, unoptimized query and avoid the additional I/O. For instance, application teams or owners of different services might abstract the underlying implementation and allow only view-based access to other consumers. For example, we might define the following view to restrict direct access to certain columns:

pgtraining=> CREATE OR REPLACE VIEW pgbench_v_teller AS
    SELECT
        teller.tid,
        teller.bid,
        teller.tbalance,
        branches.bbalance AS teller_branch_balance
    FROM pgbench_tellers teller
    JOIN pgbench_branches branches
        ON teller.bid=branches.bid;
CREATE VIEW

Now let’s see the default behavior of PostgreSQL by disabling the parameter apg_enable_remove_redundant_inner_joins:

pgtraining=> set apg_enable_remove_redundant_inner_joins =off;
SET
pgtraining=> explain (analyze, timing) select tid,bid,tbalance from pgbench_v_teller where tbalance>100;
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=1673.10..13298.21 rows=497136 width=12) (actual time=11.679..155.144 rows=497452 loops=1)
   Hash Cond: (teller.bid = branches.bid)
   ->  Seq Scan on pgbench_tellers teller  (cost=0.00..10320.03 rows=497136 width=12) (actual time=0.008..61.824 rows=497452 loops=1)
         Filter: (tbalance > 100)
         Rows Removed by Filter: 2550
   ->  Hash  (cost=1013.60..1013.60 rows=52760 width=4) (actual time=11.612..11.612 rows=50002 loops=1)
         Buckets: 65536  Batches: 1  Memory Usage: 2270kB
         ->  Seq Scan on pgbench_branches branches  (cost=0.00..1013.60 rows=52760 width=4) (actual time=0.005..5.468 rows=50002 loops=1)
 Planning Time: 0.153 ms
 Execution Time: 173.250 ms
(10 rows)

When we query a view, the view name is replaced with the view’s defining query as is. As we saw in the previous example, the join is performed even though we select columns from only one table. The join might seem necessary to apply a filter, that is, to select only those rows from pgbench_tellers that exist in pgbench_branches. But if we review the table structure of pgbench_tellers, it becomes clear that pgbench_tellers can only contain rows whose bid has a corresponding row in pgbench_branches (due to the referential integrity constraint). It’s preferable for the optimizer to consider this information, which can be fetched from the catalog tables.
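For reference, the foreign key relationship that the optimizer relies on can be inspected in the pg_constraint catalog (a sketch):

SELECT conname,
       conrelid::regclass  AS child_table,
       confrelid::regclass AS parent_table
FROM pg_constraint
WHERE contype = 'f'                             -- foreign key constraints
  AND conrelid = 'pgbench_tellers'::regclass;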

If we enable apg_enable_remove_redundant_inner_joins, the optimizer starts considering foreign key relationships to determine whether a table can be eliminated from an inner join:

pgtraining=> set apg_enable_remove_redundant_inner_joins =on;
SET
pgtraining=> explain (analyze,timing) select tid,bid,tbalance from pgbench_v_teller;
                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on pgbench_tellers teller  (cost=0.00..9070.02 rows=500002 width=12) (actual time=0.011..64.024 rows=500002 loops=1)
   Filter: (bid IS NOT NULL)
 Planning Time: 0.053 ms
 Execution Time: 82.013 ms
(4 rows)

The optimizer now fetches rows only from pgbench_tellers. Interestingly, the optimizer has introduced an additional filter that isn’t part of the view definition: bid IS NOT NULL. That’s because the column bid in pgbench_tellers allows null values, and the join criteria would have filtered out those rows.

Both the runtime and the planning time have been reduced. That’s because the optimizer no longer needs to check statistics for pgbench_branches or evaluate multiple join options.

Conclusion

In this post, we covered additional SQL optimizations that Aurora PostgreSQL can use to improve query performance. Aurora uses parameters to control these optimizations, which can be set at the Aurora cluster, DB instance, or session level.

For more information about Aurora PostgreSQL parameters, see part one and part two of this series. In part four, we discuss ANSI compatibility options that can reduce code changes when migrating applications from commercial databases.


About the authors

Sameer Kumar is a Database Specialist Technical Account Manager at Amazon Web Services. He focuses on Amazon RDS, Amazon Aurora, and Amazon DocumentDB. He works with enterprise customers, providing technical assistance on database operational performance and sharing database best practices.

Gopalakrishnan Subramanian is a Database Specialist Solutions Architect at Amazon Web Services. He works with customers to provide guidance and technical assistance on database projects, helping them improve the value of their solutions when using AWS.