AWS Database Blog

PostgreSQL 18 on Amazon Aurora and Amazon RDS: Performance enhancements

Managing query performance across composite indexes, diagnosing memory spills in materialized CTEs, and upgrading major versions without plan regressions are everyday challenges for PostgreSQL users. PostgreSQL 18 addresses each of these directly with skip scan optimization that removes redundant indexes, EXPLAIN enhancements that expose storage behavior, and optimizer statistics that survive major version upgrades. You can try these features today on both Amazon Aurora PostgreSQL-Compatible Edition and Amazon Relational Database Service (Amazon RDS) for PostgreSQL.

This is Part 1 of a two-part series covering the key features in PostgreSQL 18. In this post, we focus on performance enhancements: skip scan optimization for multicolumn indexes, enhanced EXPLAIN output, automatic removal of unnecessary self-joins, and several vacuum and autovacuum improvements that help keep your database running efficiently. In Part 2, we cover security, monitoring, developer, and logical replication enhancements. For more information, see the PostgreSQL 18 release notes.

Prerequisites

You can read this post without running any commands, but if you want to try the examples, you must have these prerequisites:

  1. Create an Aurora PostgreSQL cluster or RDS for PostgreSQL instance if you don’t already have one. For instructions, see Create an Amazon Aurora PostgreSQL-Compatible DB cluster or Create a PostgreSQL DB instance, respectively.
  2. Create an Amazon Elastic Compute Cloud (Amazon EC2) instance to install the PostgreSQL client to access the Aurora PostgreSQL or RDS for PostgreSQL instance. For instructions, see Create your EC2 resources and launch your EC2 instance. Or you can set up connectivity between your RDS database and EC2 compute instance in 1-click.
  3. Install the PostgreSQL client. On Amazon Linux 2023, you can use the following commands to download the psql command line tool:

    Install all required build dependencies:

    sudo dnf install -y gcc readline-devel libicu-devel zlib-devel openssl-devel make bison flex perl-core

    Download PostgreSQL 18 source code. Replace 18.x in the commands below with the current minor version (for example, 18.4):

    wget https://ftp.postgresql.org/pub/source/v18.x/postgresql-18.x.tar.gz

    Extract the source:

    tar -xvzf postgresql-18.x.tar.gz

    Navigate to the source directory:

    cd postgresql-18.x

    Configure the build (set bindir to /usr/bin for system-wide installation):

    ./configure --bindir=/usr/bin --with-openssl

    Build and install only the client binaries, libraries, and headers:

    sudo make -C src/bin install
    sudo make -C src/include install
    sudo make -C src/interfaces install

    Verify the installation:

    psql --version

Skip scan optimization for multicolumn indexes

PostgreSQL 18 introduces skip scan optimization, which allows the query planner to use multicolumn B-tree indexes more efficiently when the leading columns are not specified in the WHERE clause. This optimization can improve query performance in scenarios where you have composite indexes, but queries don’t always filter the leading columns.

Understanding the skip scan optimization

In previous PostgreSQL versions, a multicolumn index was used efficiently if the query included the leading column(s) in the WHERE clause. For example, with an index on (status, created_date), a query filtering only on created_date could not use the index.

Skip scan allows PostgreSQL to skip through the distinct values of the leading column(s) and use the index for subsequent columns, effectively treating the multicolumn index as multiple single-column indexes.

Example scenario

Let’s create a table with a multicolumn index and observe the skip scan optimization:

-- Create a table for customer orders
CREATE TABLE customer_orders (
    order_id serial PRIMARY KEY,
    status varchar(20),
    created_date date,
    customer_id integer,
    amount numeric(10,2)
);

-- Create a multicolumn index
CREATE INDEX idx_orders_status_date ON customer_orders(status, created_date);

-- Insert sample data with a few distinct status values
INSERT INTO customer_orders (status, created_date, customer_id, amount)
SELECT
    CASE (random() * 4)::integer
        WHEN 0 THEN 'pending'
        WHEN 1 THEN 'processing'
        WHEN 2 THEN 'shipped'
        WHEN 3 THEN 'delivered'
        ELSE 'cancelled'
    END,
    current_date - (random() * 365)::integer,
    (random() * 10000)::integer,
    (random() * 1000)::numeric(10,2)
FROM generate_series(1, 1000000);

-- Analyze the table
ANALYZE customer_orders;

Now let’s run a query that filters only on created_date (not the leading column):

-- Query filtering on created_date only
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM customer_orders
WHERE created_date = '2026-03-15';

PostgreSQL 18 output (with skip scan):

Bitmap Heap Scan on customer_orders  (cost=47.66..5422.68 rows=2726 width=27) (actual time=0.581..3.223 rows=2659 loops=1)
   Recheck Cond: (created_date = '2026-03-15'::date)
   Heap Blocks: exact=2229
   Buffers: shared hit=2266
   ->  Bitmap Index Scan on idx_orders_status_date  (cost=0.00..46.98 rows=2726 width=0) (actual time=0.305..0.305 rows=2659 loops=1)
         Index Cond: (created_date = '2026-03-15'::date)
         Index Searches: 11
         Buffers: shared hit=37
Planning Time: 0.065 ms
Execution Time: 3.374 ms

In PostgreSQL 17 and earlier, this query would likely use a sequential scan or require a separate index on created_date:

PostgreSQL 17 output (without skip scan):

Gather  (cost=1000.00..13833.43 rows=2721 width=27) (actual time=0.995..72.942 rows=2700 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   Buffers: shared hit=7353
   ->  Parallel Seq Scan on customer_orders  (cost=0.00..12561.33 rows=1134 width=27) (actual time=0.213..61.421 rows=900 loops=3)
         Filter: (created_date = '2026-03-15'::date)
         Rows Removed by Filter: 332433
Planning Time: 0.091 ms
Execution Time: 73.200 ms

When skip scan is beneficial

Skip scan optimization is most effective when:

  • The leading columns have low cardinality (few distinct values).
  • The query filters on non-leading columns.
  • The selectivity of the non-leading column filter is high.

Performance comparison

In our testing with Aurora PostgreSQL 17.10 and Aurora PostgreSQL 18.4, both on db.r6g.large with 1 million rows and 5 distinct values in the leading status column (warm buffer cache):

Without skip scan (Aurora PostgreSQL 17), the planner cannot use the composite index on (status, created_date) because the leading column is not in the WHERE clause. It falls back to a Parallel Seq Scan, reading all 7,353 blocks and finishing in 73.2 ms.

With skip scan (Aurora PostgreSQL 18), the planner uses the existing composite index with 11 index searches, reading only 2,266 blocks and finishing in 3.4 ms. That’s a 21x improvement with no additional index required.

The skip scan provides performance comparable to a dedicated single-column index without the storage overhead and maintenance cost of maintaining one.

Enhanced EXPLAIN

PostgreSQL 18 brings substantial improvements to the EXPLAIN command, making query performance analysis more intuitive and comprehensive. These enhancements provide deeper insights into query execution, resource usage, and optimization opportunities, helping developers and database administrators identify performance bottlenecks more effectively.

Automatic buffer statistics in EXPLAIN ANALYZE

OneOne of the significant usability improvements in PostgreSQL 18 is that EXPLAIN ANALYZE now includes buffer statistics by default, removing the need to explicitly specify the BUFFERS option. This change addresses a long-standing pain point where critical I/O information was hidden unless you remember to add the BUFFERS flag.

Prior to PostgreSQL 18, analyzing query performance required this syntax:

-- PostgreSQL 17 and earlier: BUFFERS option required
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM customer_orders
WHERE created_date >= '2025-06-01'
  AND created_date < '2025-07-01';

Without the BUFFERS option, you would see execution time and row counts but miss information about cache hits, disk reads, and I/O patterns. In PostgreSQL 18, buffer statistics are included automatically:

-- PostgreSQL 18: Buffer statistics included by default
EXPLAIN ANALYZE
SELECT * FROM customer_orders
WHERE created_date >= '2025-06-01'
  AND created_date < '2025-07-01';

The output now shows:

Seq Scan on customer_orders  (cost=0.00..22353.00 rows=84546 width=27)
  (actual time=0.009..53.400 rows=82676 loops=1)
  Filter: ((created_date >= '2025-06-01'::date) AND (created_date < '2025-07-01'::date))
  Rows Removed by Filter: 917324
  Buffers: shared hit=18234 read=1245
Planning Time: 0.073 ms
Execution Time: 56.664 ms

The buffer statistics immediately reveal that most data was served from cache (18,234 shared buffer hits) with only 1,245 blocks read from disk. This helps you understand whether performance issues stem from I/O bottlenecks or other factors.

Memory and disk usage for materialized nodes

PostgreSQL 18 introduces storage visibility for materialized nodes in query plans. When the executor materializes a CTE for repeated access, the plan now reports whether the intermediate results were held in memory or spilled to disk, along with the peak storage consumed. This information was completely invisible in PostgreSQL 17 and earlier. You had to guess based on work_mem settings and row width estimates.

Example 1: Memory storage (default)

EXPLAIN (ANALYZE, BUFFERS)
WITH top_products AS MATERIALIZED (
    SELECT product_id, product_name, price
    FROM products
    WHERE price > 500
)
SELECT * FROM top_products
ORDER BY price DESC
LIMIT 10;

Output:

 Limit  (cost=394.97..394.99 rows=10 width=238) (actual time=3.028..3.031 rows=10.00 loops=1)
   Buffers: shared hit=64
   CTE top_products
     ->  Seq Scan on products  (cost=0.00..189.00 rows=4950 width=22) (actual time=0.012..1.535 rows=4948.00 loops=1)
           Filter: (price > '500'::numeric)
           Rows Removed by Filter: 5052
           Buffers: shared hit=64
   ->  Sort  (cost=205.97..218.34 rows=4950 width=238) (actual time=3.027..3.028 rows=10.00 loops=1)
         Sort Key: top_products.price DESC
         Sort Method: top-N heapsort  Memory: 26kB
         Buffers: shared hit=64
         ->  CTE Scan on top_products  (cost=0.00..99.00 rows=4950 width=238) (actual time=0.013..2.273 rows=4948.00 loops=1)
               Storage: Memory  Maximum Storage: 296kB
               Buffers: shared hit=64
Planning:
   Buffers: shared hit=36
Planning Time: 0.193 ms
Execution Time: 3.057 ms
(18 rows)

The line to focus on is Storage: Memory Maximum Storage: 296kB on the CTE Scan node. This tells you the entire materialized result set, all 4,948 rows, was held in RAM with a peak footprint of 296kB. No temporary files were created. No disk I/O was needed beyond reading the source table from shared buffers.

Example 2: Disk storage (low work_mem)

Now let’s see what happens when we artificially constrain work_mem so the materialized CTE can no longer fit in memory:

-- Set low work_mem to force disk storage
SET work_mem = '64kB';

EXPLAIN (ANALYZE, BUFFERS)
WITH top_products AS MATERIALIZED (
    SELECT product_id, product_name, price
    FROM products
    WHERE price > 500
)
SELECT * FROM top_products
ORDER BY price DESC
LIMIT 10;

Output:

 Limit  (cost=394.97..394.99 rows=10 width=238) (actual time=3.184..3.187 rows=10.00 loops=1)
   Buffers: shared hit=64, temp written=20
   I/O Timings: temp write=0.127
   CTE top_products
     ->  Seq Scan on products  (cost=0.00..189.00 rows=4950 width=22) (actual time=0.013..1.391 rows=4948.00 loops=1)
           Filter: (price > '500'::numeric)
           Rows Removed by Filter: 5052
           Buffers: shared hit=64
   ->  Sort  (cost=205.97..218.34 rows=4950 width=238) (actual time=3.183..3.184 rows=10.00 loops=1)
         Sort Key: top_products.price DESC
         Sort Method: top-N heapsort  Memory: 26kB
         Buffers: shared hit=64, temp written=20
         I/O Timings: temp write=0.127
         ->  CTE Scan on top_products  (cost=0.00..99.00 rows=4950 width=238) (actual time=0.015..2.411 rows=4948.00 loops=1)
               Storage: Disk  Maximum Storage: 160kB
               Buffers: shared hit=64, temp written=20
               I/O Timings: temp write=0.127
Planning Time: 0.102 ms
Execution Time: 3.268 ms
(19 rows)

Notice the differences. Storage: Disk replaces Storage: Memory, and several new indicators appear. temp written=20 shows temporary file I/O occurred, and I/O Timings: temp write=0.127 quantifies the time spent writing to those temp files. Execution time increased from 3.057 ms to 3.268 ms, modest here because the dataset is small. On larger result sets, the penalty from disk spills compounds significantly.

Why this matters in practice

This storage reporting transforms how you diagnose and tune materialized CTEs. When a query that normally runs in milliseconds suddenly slows down, checking whether a previously in-memory CTE has flipped to Storage: Disk gives you an immediate explanation. The dataset grew past the memory threshold. The Maximum Storage value tells you exactly how much space the materialization needed, so you can make a targeted work_mem adjustment rather than guessing.

That said, the right response isn’t always to increase work_mem. Remember that work_mem is allocated per-operation and per-query within each session, so setting it too high on a system with many concurrent connections can lead to memory pressure. A better approach is to use the Maximum Storage value as a guide. If a CTE needs 296kB and your work_mem is 4 MB, you have plenty of room. If it needs 3.9 MB on a 4 MB setting, a modest bump to 8 MB keeps it in memory without being reckless. For queries where materialization overhead is the bottleneck, you might also consider whether the MATERIALIZED hint is truly necessary. Sometimes, letting the planner inline the CTE produces a better plan altogether.

The combination of Storage, Maximum Storage, temp written, and I/O Timings gives you a full picture of materialization cost that was unavailable before PostgreSQL 18. This turns what used to be an educated guess into a data-driven tuning decision.

Index searches metric in EXPLAIN ANALYZE

PostgreSQL 18 introduces a new metric in EXPLAIN ANALYZE output: Index Searches. This metric shows how many times the database traversed the index tree during query execution, providing valuable insights into index usage patterns.

What is the Index Searches counter?

The Index Searches counter appears in Index Scan, Bitmap Index Scan, and Index-Only Scan nodes. It reports the total number of index tree traversals across all node executions and loops.

Example: IN clause with multiple values

Consider a query searching for products with specific prices:

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM products
WHERE price IN (100.50, 500.75, 750.25, 999.99);

Output:

Bitmap Heap Scan on products  (cost=17.17..30.51 rows=4 width=29)
                              (actual time=0.046..0.047 rows=1.00 loops=1)
  Recheck Cond: (price = ANY ('{100.50,500.75,750.25,999.99}'::numeric[]))
  Heap Blocks: exact=1
  Buffers: shared hit=4 read=5
  ->  Bitmap Index Scan on idx_products_price  (cost=0.00..17.17 rows=4 width=0)
                                                (actual time=0.039..0.039 rows=1.00 loops=1)
        Index Cond: (price = ANY ('{100.50,500.75,750.25,999.99}'::numeric[]))
        Index Searches: 4
        Buffers: shared hit=3 read=5
Planning Time: 1.907 ms
Execution Time: 0.064 ms

Notice Index Searches: 4 in the output. The query searches for 4 different price values, and PostgreSQL performed 4 separate index tree traversals, one for each value in the IN clause. Each search starts from the index root and navigates to the appropriate leaf page.

Why this matters

Before PostgreSQL 18, you saw that an index was used but not how many times the executor descended into the tree. Now you can. A high search count on an IN clause tells you each value triggered a separate root-to-leaf traversal. If those values happen to be contiguous, rewriting the predicate as a BETWEEN range might cut the number of searches down. In nested loop joins, the metric shows the total searches across all loop iterations. You can quickly tell whether the inner index is being hammered and whether a different join strategy would be cheaper.

The three enhancements covered here (automatic buffer statistics, materialized CTE storage and memory reporting, and index search counts) are only a subset of the EXPLAIN improvements shipped in PostgreSQL 18. For the full list of EXPLAIN-related enhancements, see the PostgreSQL 18 release notes.

Automatic removal of unnecessary self-joins

PostgreSQL 18 introduces Self-Join Elimination (SJE), a query planner optimization that automatically detects and removes inner joins of a table to itself when the join is probably redundant. This applies to SELECT queries, where the planner can safely determine that one side of the self-join contributes no additional rows or columns to the result and eliminate it entirely.

When a self-join is unnecessary

A self-join can be eliminated when two conditions are met:

  1. The join condition is on a unique or primary key column. This means at most one matching row per outer row, so the join can’t produce duplicates or filter rows.
  2. No columns from the inner table alias are referenced in the SELECT list or WHERE clause that couldn’t be satisfied by the outer alias alone.

When these conditions hold, the planner rewrites the query to scan the table once, removing the join entirely.

Example: ORM-generated self-join

Self-joins on primary keys are commonly generated by ORMs and query builders that compose queries from reusable fragments without awareness of redundancy. Consider an employees table with a primary key on id:

CREATE TABLE employees (
    id     serial PRIMARY KEY,
    name   text,
    dept   text,
    salary numeric
);

A query joining the table to itself on the primary key:

EXPLAIN (ANALYZE, COSTS OFF)
SELECT e1.id, e1.name, e1.salary
FROM employees e1
JOIN employees e2 ON e1.id = e2.id;

PostgreSQL 17 and earlier (executes a real join):

Hash Join
  Hash Cond: (e1.id = e2.id)
  ->  Seq Scan on employees e1
  ->  Hash
        ->  Seq Scan on employees e2

PostgreSQL 18 (join is eliminated, single scan):

Seq Scan on employees e2  (actual time=0.017..8.749 rows=100000 loops=1)
  Buffers: shared hit=770
Planning Time: 0.318 ms
Execution Time: 12.768 ms

The planner recognizes that joining employees to itself on the primary key id is redundant. Every row matches exactly one row in the other alias, and that row is the same physical row. The join is rewritten to a simple scan.

With a WHERE clause

SJE also works when additional filters are present:

EXPLAIN (ANALYZE, COSTS OFF)
SELECT e1.id, e1.name, e1.dept, e1.salary
FROM employees e1
JOIN employees e2 ON e1.id = e2.id
WHERE e1.dept = 'Engineering';
Seq Scan on employees e2  (actual time=0.031..6.985 rows=33333 loops=1)
  Filter: (dept = 'Engineering'::text)
  Rows Removed by Filter: 66667
  Buffers: shared hit=770
Planning Time: 0.357 ms
Execution Time: 8.486 ms

Again, a single scan. The self-join is eliminated and the filter is applied directly.

When self-join elimination does not apply

SJE requires a unique or primary key join condition. A self-join on a non-unique column, such as finding colleagues in the same department, can’t be eliminated because multiple rows can match:

EXPLAIN (ANALYZE, COSTS OFF)
SELECT e1.id, e1.name, e2.name AS colleague
FROM employees e1
JOIN employees e2 ON e1.dept = e2.dept
WHERE e1.id = 1
LIMIT 5;
Nested Loop
  Join Filter: (e1.dept = e2.dept)
  ->  Index Scan using employees_pkey on employees e1
        Index Cond: (id = 1)
  ->  Seq Scan on employees e2

Here the join is preserved. dept is not unique, so one employee can match many colleagues, and the join genuinely changes the result set.

Scope: SELECT only

The PostgreSQL 18 implementation applies SJE only to SELECT queries. Self-joins in UPDATE, DELETE, and MERGE statements aren’t eliminated. This is intentional. DML queries involve EvalPlanQual (EPQ) re-evaluation for concurrent row updates, where the two table aliases might need to reference different row versions (snapshot compared to latest). Eliminating the join in that context would produce incorrect results. The PostgreSQL community has scoped the initial implementation conservatively, with DML support planned for a future release.

Fine-tuned autovacuum control with autovacuum_vacuum_max_threshold

PostgreSQL 18 introduces autovacuum_vacuum_max_threshold, a new parameter that caps the number of dead tuples that can accumulate before autovacuum triggers a VACUUM, regardless of table size. This addresses a long-standing problem with very large tables where the scale-factor-based formula can delay autovacuum for far too long.

The problem with large tables

Autovacuum decides when to vacuum a table by comparing the number of updated or deleted tuples against a threshold calculated as:

vacuum threshold = autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor × table_rows

With the defaults (threshold = 50, scale_factor = 0.2), this works well for small and medium tables. But for very large tables, the formula produces thresholds that allow enormous amounts of bloat to accumulate before autovacuum fires:

Table size Dead tuples needed to trigger autovacuum (before PG18)
1 million rows 200,050
100 million rows 20,000,050
500 million rows 100,000,050
1 billion rows 200,000,050

On a 1 billion row table, autovacuum waits for 200 million dead tuples before running. That’s significant table bloat, degraded query performance, and increased risk of transaction ID wraparound pressure.

The workaround before PG18 was to manually set autovacuum_vacuum_scale_factor = 0 and autovacuum_vacuum_threshold = N per table, but this required identifying every large table and maintaining those settings as tables grew.

How autovacuum_vacuum_max_threshold works

PostgreSQL 18 changes the formula to:

vacuum threshold = MIN(autovacuum_vacuum_max_threshold,
                       autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor × table_rows)

The new parameter acts as a ceiling. When the scale-factor calculation exceeds the max threshold, autovacuum fires at the cap instead. The default is 100,000,000 tuples, meaning no table will accumulate more than 100 million dead tuples before autovacuum runs, regardless of how large the table is.

SHOW autovacuum_vacuum_max_threshold;
 autovacuum_vacuum_max_threshold
---------------------------------
 100000000
(1 row)

For the 1 billion row table example, the effective threshold is now capped at 100 million instead of 200 million.

Configuring in Amazon RDS for PostgreSQL and Aurora PostgreSQL

autovacuum_vacuum_max_threshold is a dynamic parameter (sighup-level). It can be changed in your parameter group and applied without a reboot.

Amazon RDS for PostgreSQL: Modify the DB instance parameter group.

Aurora PostgreSQL: Modify the DB cluster parameter group (applies to all instances in the cluster).

In both cases, the parameter change is applied without a reboot. You can verify using SHOW autovacuum_vacuum_max_threshold; from any connected session.

Lower the global cap for all tables:

-- In parameter group: set to 10 million dead tuples max
autovacuum_vacuum_max_threshold = 10000000

Override per table:

-- Cap a specific high-churn table at 500,000 dead tuples
ALTER TABLE orders SET (autovacuum_vacuum_max_threshold = 500000);

-- Disable the cap for a specific table (use scale factor only)
ALTER TABLE archive_data SET (autovacuum_vacuum_max_threshold = -1);

Disable the global cap entirely:

-- In parameter group: revert to pre-PG18 behavior
autovacuum_vacuum_max_threshold = -1

Choosing the right value

The default of 100 million is a reasonable safety net for most workloads. For databases with large tables with high churn across both old and new rows, lowering it to 10 million can help autovacuum stay ahead of dead tuple accumulation while balancing for freeze activity. For append-only or read-heavy tables where dead tuples accumulate slowly, the default or even -1 (disabled) may be appropriate.

You can check the current effective threshold for any table by querying pg_stat_user_tables alongside pg_class:

SELECT
    relname,
    n_dead_tup,
    n_live_tup,
    LEAST(
        current_setting('autovacuum_vacuum_max_threshold')::bigint,
        current_setting('autovacuum_vacuum_threshold')::bigint +
        (current_setting('autovacuum_vacuum_scale_factor')::float * reltuples)::bigint
    ) AS effective_vacuum_threshold
FROM pg_stat_user_tables
JOIN pg_class ON relname = pg_class.relname
WHERE schemaname = 'public'
ORDER BY n_dead_tup DESC
LIMIT 10;

vacuum_truncate server variable

PostgreSQL 18 introduces vacuum_truncate as a server-wide GUC (Grand Unified Configuration) parameter. This gives database administrators a single control point to disable VACUUM’s file truncation behavior across all tables, without modifying individual table storage options.

Understanding VACUUM truncation

When VACUUM runs, it goes through multiple phases, including scanning the heap, vacuuming indexes and the heap, and eventually truncating the heap. The truncation phase removes empty pages from the end of a table data file and returns that disk space to the operating system. While this sounds beneficial, it comes with an important cost: truncation requires an AccessExclusiveLock on the table, the same lock level used by ALTER TABLE, DROP TABLE, and similar DDL operations.

During the first three phases, VACUUM holds only a ShareUpdateExclusiveLock, which allows concurrent reads and writes. But when it enters the truncation phase, it escalates to AccessExclusiveLock, briefly blocking all other access to the table. On busy OLTP systems with large tables, this lock escalation can cause noticeable query latency spikes.

Impact on read replicas

The truncation lock has a particularly significant effect on Aurora PostgreSQL and RDS for PostgreSQL read replicas. When the primary instance truncates a table, it writes an XLOG_STANDBY_LOCK record to the WAL stream. The replica’s WAL replay process must acquire the same AccessExclusiveLock on the relation before it can apply this record. If a long-running query on the replica is holding a conflicting lock, the replica will cancel that query after a 30-second wait in order to proceed with WAL replay. Users on the read replica see:

ERROR:  canceling statement due to conflict with recovery
DETAIL:  User was holding a relation lock for too long.

This is a common source of unexpected query cancellations on read replicas, particularly when autovacuum is actively truncating large tables on the primary.

Before PostgreSQL 18: per-table control only

Prior to PostgreSQL 18, vacuum_truncate existed only as a storage-level parameter set per table:

-- Disable truncation for a specific table (PostgreSQL 12+)
ALTER TABLE orders SET (vacuum_truncate = off);

This approach required identifying every table where truncation was causing issues and modifying each one individually, a maintenance burden in databases with hundreds of tables.

PostgreSQL 18: server-wide GUC

PostgreSQL 18 promotes vacuum_truncate to a server-wide GUC, so you can control truncation behavior globally through your parameter group. The default value remains on (truncation enabled), preserving backward compatibility.

Check the current setting:

SHOW vacuum_truncate;
 vacuum_truncate
-----------------
 on
(1 row)

Configuring vacuum_truncate in Amazon RDS for PostgreSQL

To disable truncation globally, create a custom parameter group with vacuum_truncate = off and apply it to your instance. This is a dynamic parameter and no reboot is required. You can verify the setting after the reboot:

SHOW vacuum_truncate;
 vacuum_truncate
-----------------
 off
(1 row)

Demonstrating the behavior difference

You can observe the effect of vacuum_truncate directly in the VACUUM verbose output, specifically how many pages were removed, how many remain, and whether a truncation line appears.

With vacuum_truncate = on (default):

VACUUM VERBOSE vacuum_demo;
INFO:  vacuuming "postgres.public.vacuum_demo"
INFO:  table "vacuum_demo": truncated 4425 to 45 pages
INFO:  finished vacuuming "postgres.public.vacuum_demo": index scans: 0
pages: 4380 removed, 45 remain, 4425 scanned (100.00% of total)
tuples: 990000 removed, 10000 remain, 0 are dead but not yet removable

The line truncated 4425 to 45 pages confirms that VACUUM removed 4,380 empty pages and returned the space to the OS. This truncation requires an AccessExclusiveLock.

With vacuum_truncate = off:

VACUUM VERBOSE vacuum_demo;
INFO:  vacuuming "postgres.public.vacuum_demo"
INFO:  finished vacuuming "postgres.public.vacuum_demo": index scans: 0
pages: 0 removed, 4425 remain, 4425 scanned (100.00% of total)
tuples: 990000 removed, 10000 remain, 0 are dead but not yet removable

The truncation line is completely absent. Dead tuples are still removed (990,000 rows cleaned up), but the empty pages remain in the file. No AccessExclusiveLock is acquired, and no WAL truncation record is sent to replicas.

When to use vacuum_truncate = off

Disabling truncation is appropriate when:

  • Read replicas experience frequent query cancellations because of canceling statement due to conflict with recovery errors caused by VACUUM truncation on the primary.
  • Tables have high write churn where deleted space will quickly be reused by new inserts. Truncating and then re-extending the file wastes I/O.
  • Large tables on busy OLTP systems where even a brief AccessExclusiveLock causes visible latency spikes.

Keep truncation enabled (the default) when:

  • Tables are periodically bulk-deleted and the space genuinely won’t be reused, truncation returns disk space to the OS.
  • Storage costs matter and you want VACUUM to actively reclaim space.
  • No read replicas are present, removing the WAL conflict concern.

Per-table override

The server-wide GUC can still be overridden at the table level. If you set vacuum_truncate = off globally but want truncation enabled for a specific table:

ALTER TABLE archive_data SET (vacuum_truncate = on);

Conversely, if the global setting is on but a specific high-traffic table needs truncation disabled:

ALTER TABLE orders SET (vacuum_truncate = off);

This layered control (server default using parameter group, per-table override using storage options) gives you precise control over truncation behavior across your database.

autovacuum_worker_slots

PostgreSQL 18 introduces autovacuum_worker_slots, a new parameter that separates the concept of allocating autovacuum worker slots from using them. This separation allows autovacuum_max_workers to be adjusted at runtime without a server restart, a significant operational improvement for tuning autovacuum under changing workloads.

The problem before PostgreSQL 18

Before PostgreSQL 18, autovacuum_max_workers was the only parameter controlling autovacuum concurrency. Because it directly controlled how many backend slots were reserved at startup, it had a postmaster context, meaning any change required a full server restart. If you needed to temporarily increase autovacuum workers to catch up on table bloat during a high-write period, you had no choice but to restart the instance.

How it works in PostgreSQL 18

PostgreSQL 18 introduces a two-level model:

  • autovacuum_worker_slots sets the maximum number of backend slots reserved for autovacuum workers at startup. This is a postmaster-level parameter and still requires a restart to change. It acts as a hard ceiling.
  • autovacuum_max_workers controls how many of those slots are actually used at any given time. It’s now a dynamic parameter (sighup context), meaning it can be changed at runtime by reloading the configuration, and no restart required.
SELECT name, setting, context,
       CASE context
           WHEN 'postmaster' THEN 'Requires restart'
           WHEN 'sighup'     THEN 'Runtime (no restart)'
       END AS change_method
FROM pg_settings
WHERE name IN ('autovacuum_worker_slots', 'autovacuum_max_workers')
ORDER BY name;
          name           | setting |  context   |    change_method
-------------------------+---------+------------+----------------------
 autovacuum_max_workers  | 3       | sighup     | Runtime (no restart)
 autovacuum_worker_slots | 16      | postmaster | Requires restart
(2 rows)

By default, autovacuum_worker_slots is 16 and autovacuum_max_workers is 3. You can increase autovacuum_max_workers up to 16 at any time without restarting the instance.

Configuring in Amazon RDS for PostgreSQL

In RDS and Aurora PostgreSQL, you can update autovacuum_max_workers in your parameter group with Apply immediately, no reboot needed. To increase the ceiling (autovacuum_worker_slots), update the parameter group and reboot the instance.

Example: temporarily increase autovacuum workers during a high-write period

Update autovacuum_max_workers in your parameter group from 3 to 6, apply immediately, and verify:

SHOW autovacuum_max_workers;
 autovacuum_max_workers
------------------------
 6
(1 row)

Revert it back when the workload normalizes.

Important: autovacuum_max_workers can’t exceed autovacuum_worker_slots. If you attempt to set it higher, PostgreSQL caps it at the slots value. To allow higher autovacuum_max_workers, first increase autovacuum_worker_slots (which requires a restart), then adjust autovacuum_max_workers freely within the new ceiling.

Per-table vacuum and analyze timing

PostgreSQL 18 adds four new columns to pg_stat_all_tables (and its variants pg_stat_user_tables, pg_stat_sys_tables) that track the cumulative time spent on maintenance operations per table:

Column Description
total_vacuum_time Cumulative time spent in manual VACUUM operations (milliseconds)
total_autovacuum_time Cumulative time spent in autovacuum operations (milliseconds)
total_analyze_time Cumulative time spent in manual ANALYZE operations (milliseconds)
total_autoanalyze_time Cumulative time spent in autoanalyze operations (milliseconds)

Before PostgreSQL 18, pg_stat_user_tables showed when the last vacuum or analyze ran (last_vacuum, last_autovacuum) but gave no indication of how long it took. Identifying which tables consumed the most maintenance time required either parsing log files or using pg_stat_progress_vacuum during an active operation. These new columns make that information permanently available and cumulative.

Example

Create a table with 5 million rows, delete half of them, then run VACUUM VERBOSE and ANALYZE VERBOSE:

CREATE TABLE timing_demo (id serial, val text);
INSERT INTO timing_demo SELECT i, md5(i::text) FROM generate_series(1, 5000000) i;
DELETE FROM timing_demo WHERE id % 2 = 0;

VACUUM VERBOSE timing_demo;
INFO:  vacuuming "postgres.public.timing_demo"
INFO:  finished vacuuming "postgres.public.timing_demo": index scans: 0
pages: 0 removed, 41667 remain, 41667 scanned (100.00% of total)
tuples: 2500000 removed, 2500000 remain, 0 are dead but not yet removable
...
system usage: CPU: user: 0.18 s, system: 0.00 s, elapsed: 0.18 s
ANALYZE VERBOSE timing_demo;
INFO:  analyzing "public.timing_demo"
INFO:  "timing_demo": scanned 30000 of 41667 pages, containing 1799980 live rows and 0 dead rows
INFO:  finished analyzing table "postgres.public.timing_demo"
...
system usage: CPU: user: 0.10 s, system: 0.00 s, elapsed: 0.10 s

Now query the new timing columns:

SELECT relname,
       round(total_vacuum_time::numeric, 2)     AS total_vacuum_ms,
       round(total_analyze_time::numeric, 2)    AS total_analyze_ms
FROM pg_stat_user_tables
WHERE relname = 'timing_demo';
   relname   | total_vacuum_ms | total_analyze_ms
-------------+-----------------+------------------
 timing_demo |          189.00 |           107.00
(1 row)

The elapsed: 0.18 s from VACUUM VERBOSE maps directly to 189ms in total_vacuum_time, and elapsed: 0.10 s from ANALYZE VERBOSE maps to 107ms in total_analyze_time. The view accumulates these values across every manual and automatic operation, giving you a running total of maintenance time per table.

Because these columns are cumulative, running VACUUM and ANALYZE again adds to the totals, giving you a running picture of maintenance overhead over the lifetime of the table.

Practical use cases

Identify tables with high maintenance overhead

SELECT relname,
       round(total_vacuum_time::numeric, 2)      AS vacuum_ms,
       round(total_autovacuum_time::numeric, 2)  AS autovacuum_ms,
       round(total_analyze_time::numeric, 2)     AS analyze_ms,
       round(total_autoanalyze_time::numeric, 2) AS autoanalyze_ms
FROM pg_stat_user_tables
ORDER BY (total_vacuum_time + total_autovacuum_time) DESC
LIMIT 10;

Compare manual and automatic maintenance time

Tables where total_autovacuum_time is very high relative to total_vacuum_time might benefit from tuning autovacuum thresholds. Tables where both are high are candidates for investigating bloat, index health, or write patterns.

Clean up resources

If you created database objects while following along with this post, run the following commands to remove them:

-- Remove tables created in this post
DROP TABLE IF EXISTS customer_orders;
DROP TABLE IF EXISTS employees;
DROP TABLE IF EXISTS products;
DROP TABLE IF EXISTS vacuum_demo;
DROP TABLE IF EXISTS timing_demo;

-- Reset any session-level parameter changes
RESET work_mem;

If you created an Aurora PostgreSQL cluster or RDS for PostgreSQL instance specifically for this walkthrough and no longer need it, delete the instance to avoid ongoing charges. For instructions, refer to Deleting a DB instance or Deleting an Aurora DB cluster. Similarly, terminate any Amazon EC2 instance that you created solely for the PostgreSQL client connection.

Conclusion

With skip scan, your existing composite indexes now handle queries they previously could not without maintaining additional indexes. Enhanced EXPLAIN turns CTE tuning from guesswork into data-driven decisions. Self-join elimination automatically fixes a class of ORM-generated inefficiency. And the new autovacuum controls help keep your largest tables healthy without manual per-table configuration. These improvements are available today on both Aurora PostgreSQL-Compatible Edition and Amazon RDS for PostgreSQL.

In Part 2, we cover security enhancements such as MD5 password support, monitoring improvements for parallel workers and replication, the new uuidv7() function for developers, optimizer statistics preservation during upgrades, and logical replication enhancements. To get started, create an Aurora PostgreSQL DB cluster or an RDS for PostgreSQL DB instance on PostgreSQL 18 and try these features today.


About the authors

Nazneen Jafri

Nazneen Jafri

Nazneen is a Senior Database Engineer at AWS, where she specializes in Amazon RDS and Aurora PostgreSQL. She drives engineering features that enhance database services and develops innovative solutions for customers navigating complex, high-stakes database challenges. Nazneen is a PostgreSQL contributor and regularly speaks at PostgreSQL and AWS conferences.

Baji Shaik

Baji Shaik

Baji is a Sr. Database Engineer at AWS. He focuses on enhancing database services, developing innovative solutions for complex customer scenarios, and helping customers with critical database challenges. He is an eminent author, having written several books on PostgreSQL.

Ranjan Burman

Ranjan Burman

Ranjan is a Senior Database Specialist Solutions Architect at AWS. His deep expertise in Amazon RDS and Amazon Aurora enables him to help organizations design mission critical, robust, enterprise-grade database solutions that optimize performance, scalability, and cost-efficiency.

Sukhpreet Kaur Bedi

Sukhpreet Kaur Bedi

Sukhpreet is a Senior Database Specialist Solutions Architect with AWS focusing on Amazon RDS for PostgreSQL and Aurora PostgreSQL engines. She helps customers innovate on the AWS platform by building highly available, scalable, and secure database architectures. She is a PostgreSQL enthusiast and a regular speaker at PostgreSQL and AWS conferences.