Parallel vacuuming in Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL

PostgreSQL is a powerful open-source object-relational database system with over 30 years of active development that has earned it a strong reputation for reliability, feature robustness, and performance. AWS offers Amazon Relational Database Service (Amazon RDS) and Amazon Aurora as fully managed relational database services. Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL-compatible edition makes it easy to set up, operate, and scale PostgreSQL deployments in the cloud.

In this post, we talk about how parallel vacuuming works in Amazon RDS for PostgreSQL, which is a feature that was introduced in PostgreSQL 13. We also cover the parallel VACUUM parameters and demonstrate running parallel VACUUM in Amazon RDS for PostgreSQL.

In a PostgreSQL database, an important feature for reclaiming the storage occupied by dead tuples (a version of a row of a table) and indexes is called VACUUM. In a normal PostgreSQL operation, tuples that are deleted or obsoleted by an update aren’t physically removed from their table; they remain present until a VACUUM is performed. Frequent UPDATE and DELETE operations can leave a lot of unused space in the table or index files on the disk. The expired rows that UPDATE and DELETE cause are dead rows, which create more dead tuples or a bloated table. Therefore, you need to run a maintenance task like VACUUM periodically to minimize table bloat.

VACUUM phases:

While autovacuum does take care of automating the VACUUM process, running VACUUM periodically is particularly efficient for frequently updated tables to minimize table bloat. This is because only when a certain number of dead rows have accumulated in the table are the cleanup steps taken by the VACUUM process. However, as datasets get larger, vacuuming takes longer to complete, which would lead to tables and indexes growing larger than ideal. VACUUM is always performed by a single process and one index at a time, which is why the runtime is longer for large tables.

Parallelism in PostgreSQL

Today, PostgreSQL can already devise query plans that can use multiple CPUs to answer queries faster, which is parallelism of a SQL query. VACUUM being one of the most important and critical maintenance operations to help DBAs control bloat, the introduction of parallel vacuuming helps speed up the process.

When we run VACUUM without the FULL option, it first scans the table from the top and collects garbage tuples in memory. If the table has more than one index, VACUUM runs for all of them one by one. Once the table is vacuumed it is followed by cleanup for indexes and truncating empty pages. To learn more about VACUUM and its various options and limitations, refer to the PostgreSQL documentation.

As VACUUM process runs in phases, in the heap scan phase, VACUUM can bypass the pages that don’t have dead tuples by using a visibility map. This map provides information on which table blocks are known to only have tuples that are visible to all transactions, but the index VACUUM and index cleanup phases demand a complete scan. With parallel vacuuming enabled, when you run VACUUM on a table, any indexes are processed in parallel, thereby making the VACUUM operation run faster and reduce the bloat as early as possible. It also allows you to use multiple CPUs to process indexes because VACUUM can perform the index VACUUM phase and index cleanup phase with parallel workers.

Parallel VACUUM parameters in RDS for PostgreSQL

The PARALLEL option was introduced in PostgreSQL 13 and is disabled by default. When we run VACUUM with the PARALLEL option, parallel workers are used for the index VACUUM and index cleanup phase. The number of parallel workers (degree of parallelism) is decided based on the number of indexes in the table or as specified by the user. If you’re running parallel VACUUM without an integer argument, it calculates the degree of parallelism based on the number of indexes in the table.

The following are important parameters for parallel vacuuming in RDS for PostgreSQL and Aurora PostgreSQL:

max_worker_processes – Sets the maximum number of concurrent worker processes
min_parallel_index_scan_size – The minimum amount of index data that must be scanned in order for a parallel scan to be considered
max_parallel_maintenance_workers – The maximum number of parallel workers that can be started by a single utility command

You can add these parameter settings in RDS for PostgreSQL using DB parameter groups. In our case, we modify an existing DB parameter group called rds-pg-13, as shown in the following screenshots. For instructions, refer to Modifying parameters in a DB parameter group.

How parallel VACUUM works

When we enable or request parallel VACUUM using the PARALLEL option, it’s not always certain that the number of parallel workers specified in the PARALLEL option will be used during the process. For parallel VACUUM to work, the table must have more than one index, and the index size must be greater than min_parallel_index_scan_size. Therefore, there also may be cases where we notice that VACUUM runs with fewer workers than specified in the PARALLEL option, or even with no workers at all.

Because we can use one worker per index, parallel workers are launched only when a table has at least two indexes. It’s also worth noting that this option can’t be used with the VACUUM FULL option.

Run parallel VACUUM in PostgreSQL

We now look at an example to see how to run parallel VACUUM in PostgreSQL:

Connect to a running instance of PostgreSQL via psql and check the aforementioned settings:

postgres=> SHOW MAX_WORKER_PROCESSES;
 max_worker_processes 
----------------------
 16
(1 row)

postgres=> SHOW MAX_PARALLEL_MAINTENANCE_WORKERS;
 max_parallel_maintenance_workers 
----------------------------------
 2
(1 row)

postgres=> SHOW MIN_PARALLEL_INDEX_SCAN_SIZE;
 min_parallel_index_scan_size 
------------------------------
 512kB
(1 row)

In our configuration, we use two parallel workers for the indexes, which exceeds the size of 512 KB.

Create the parallel_vacuum database for demonstration purposes and connect to the database:

postgres=> create database parallel_vacuum;
CREATE DATABASE

postgres=> \l
                                      
List of databases


      Name             |   Owner       | Encoding |   Collate   |    Ctype    |    Access privileges    
-----------------+-----------+----------+-------------+-------------+-------------------------
 parallel_vacuum       | adminuser     | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 postgres              | adminuser     | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 rdsadmin              | rdsadmin      | UTF8     | en_US.UTF-8 | en_US.UTF-8 | rdsadmin=CTc/rdsadmin
 template0             | rdsadmin      | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/rdsadmin            +
                       |               |          |             |             | rdsadmin=CTc/rdsadmin
 template1             | adminuser     | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/adminuser           +
                       |               |          |             |             | adminuser=CTc/adminuser
(5 rows)

postgres=> \c parallel_vacuum;
psql (13.7, server 13.7)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
You are now connected to database "parallel_vacuum" as user "adminuser".

Create a table parallel_vacuum and populate it with test data:
Note: autovacuum has been turned off for demonstration purpose only, never do this in production.

parallel_vacuum=> create table parallel_vacuum (i INT, b INT[]) WITH (autovacuum_enabled = off);
CREATE TABLE

parallel_vacuum=> INSERT INTO parallel_vacuum SELECT i, array[3,8,9] FROM generate_series(1,1000000) i;
INSERT 0 1000000

Add three indexes to the table, one for each column and another for both columns together:

parallel_vacuum=> CREATE INDEX idx_i_parallel_vacuum ON parallel_vacuum USING btree (i);
CREATE INDEX

parallel_vacuum=> CREATE INDEX idx_b_parallel_vacuum ON parallel_vacuum USING btree (b);
CREATE INDEX

parallel_vacuum=> CREATE INDEX idx_combined_parallel_vacuum ON parallel_vacuum USING btree (i, b);
CREATE INDEX

List the index statistics using the following query. This query is available on the PostgreSQL wiki.

parallel_vacuum=> SELECT
parallel_vacuum->     t.schemaname,
parallel_vacuum->     t.tablename,
parallel_vacuum->     c.reltuples::bigint                            AS num_rows,
parallel_vacuum->     pg_size_pretty(pg_relation_size(c.oid))        AS table_size,
parallel_vacuum->     psai.indexrelname                              AS index_name,
parallel_vacuum->     pg_size_pretty(pg_relation_size(i.indexrelid)) AS index_size,
parallel_vacuum->     CASE WHEN i.indisunique THEN 'Y' ELSE 'N' END  AS "unique",
parallel_vacuum->     psai.idx_scan                                  AS number_of_scans,
parallel_vacuum->     psai.idx_tup_read                              AS tuples_read,
parallel_vacuum->     psai.idx_tup_fetch                             AS tuples_fetched
parallel_vacuum-> FROM
parallel_vacuum->     pg_tables t
parallel_vacuum->     LEFT JOIN pg_class c ON t.tablename = c.relname
parallel_vacuum->     LEFT JOIN pg_index i ON c.oid = i.indrelid
parallel_vacuum->     LEFT JOIN pg_stat_all_indexes psai ON i.indexrelid = psai.indexrelid
parallel_vacuum-> WHERE
parallel_vacuum->     t.schemaname NOT IN ('pg_catalog', 'information_schema')
parallel_vacuum-> ORDER BY 1, 2;
 

schemaname |    tablename    | num_rows | table_size |          index_name          | index_size | unique | number_of_scans | tuples_read | tuples_fe
tched 
------------+-----------------+----------+------------+------------------------------+------------+--------+-----------------+-------------+----------
------
 public     | parallel_vacuum |  1000000 | 65 MB      | idx_i_parallel_vacuum        | 21 MB      | N      |               0 |           0 |         0
 public     | parallel_vacuum |  1000000 | 65 MB      | idx_b_parallel_vacuum        | 56 MB      | N      |               0 |           0 |         0
 public     | parallel_vacuum |  1000000 | 65 MB      | idx_combined_parallel_vacuum | 56 MB      | N      |               0 |           0 |         0

Now that we have the list of indexes with us, we can set the parallel workers accordingly. Next, we update the parallel_vacuum table to demonstrate the data being modified.

Update the parallel_vacuum table and set the maximum maintenance parallel workers to four, then run VACUUM with the PARALLEL 4 parameter:

parallel_vacuum=> UPDATE parallel_vacuum SET i = i WHERE i < 50000;
UPDATE 49999

parallel_vacuum=> SET MAX_PARALLEL_MAINTENANCE_WORKERS=4;
SET

parallel_vacuum=> VACUUM (PARALLEL 4, VERBOSE) parallel_vacuum;
INFO:  vacuuming "public.parallel_vacuum"
INFO:  launched 2 parallel vacuum workers for index vacuuming (planned: 2)
INFO:  scanned index "idx_i_parallel_vacuum" to remove 49999 row versions
DETAIL:  CPU: user: 0.05 s, system: 0.00 s, elapsed: 0.05 s
INFO:  scanned index "idx_combined_parallel_vacuum" to remove 49999 row versions
DETAIL:  CPU: user: 0.04 s, system: 0.01 s, elapsed: 0.05 s
INFO:  scanned index "idx_b_parallel_vacuum" to remove 49999 row versions
DETAIL:  CPU: user: 0.06 s, system: 0.01 s, elapsed: 0.08 s
INFO:  "parallel_vacuum": removed 49999 row versions in 417 pages
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO:  index "idx_i_parallel_vacuum" now contains 1000000 row versions in 2883 pages
DETAIL:  49999 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "idx_b_parallel_vacuum" now contains 1000000 row versions in 7560 pages
DETAIL:  49999 index row versions were removed.
360 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "idx_combined_parallel_vacuum" now contains 1000000 row versions in 7890 pages
DETAIL:  49999 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "parallel_vacuum": found 49999 removable, 50080 nonremovable row versions in 834 out of 8750 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 1455
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.06 s, system: 0.01 s, elapsed: 0.10 s.
INFO:  vacuuming "pg_toast.pg_toast_16426"
INFO:  "pg_toast_16426": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 1455
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

In our example, we have three indexes in the table; therefore, the maximum number of vacuum workers can be two and a leader process. Here in the demonstration 3 processes worked on this command.

Now let’s examine the performance benefits of parallel VACUUM.

Observations of parallel VACUUM and its performance benefits

To verify the performance benefits of parallel VACUUM, let’s first turn on timing:

parallel_vacuum=> \timing
Timing is on.

Now we update the parallel_vacuum table and run VACUUM with and without the PARALLEL option:

parallel_vacuum=> UPDATE parallel_vacuum SET i = i WHERE i < 50000;
UPDATE 49999
Time: 428.887 ms
parallel_vacuum=> VACUUM parallel_vacuum;
VACUUM
Time: 97.066 ms
parallel_vacuum=> UPDATE parallel_vacuum SET i = i WHERE i < 50000;
UPDATE 49999
Time: 474.511 ms
parallel_vacuum=> VACUUM (PARALLEL 4) parallel_vacuum;
VACUUM
Time: 83.729 ms

In this example, we’re able to visualize the parallel VACUUM process and how it uses multiple workers, thereby improving the runtime.

Conclusion

In this post, we showed how the VACUUM process’s ability to execute in parallel helped shorten its overall runtime. According to our analysis, running VACUUM with the PARALLEL option resulted in a performance improvement of 14 seconds (17%). Before using in production, you should thoroughly evaluate your settings, including the number of indexes as indicated in this blog post, as well as your tables. The performance improvement is influenced by the dataset being used and the parameter settings.

If you have any comments or questions about this post, submit them in the comments section.

About the Authors

Kanchan Bhattacharyya is a Specialist Technical Account Manager – Databases with Enterprise Support, India. He joined AWS in 2021 and works with ES customers to provide consultative technical guidance, reviews, and recommendations on their databases running in the AWS Cloud. He also helps ES customers adopt best practices for their databases running in the AWS Cloud. Kanchan specializes in Amazon RDS for SQL Server, RDS for PostgreSQL, Amazon RDS for MySQL, and Amazon Aurora.

Mohammed Asadulla Baig is a Technical Account Manager with Enterprise Support, India. He joined AWS in 2017 and helps customers plan and build highly scalable, resilient, and secure solutions. Along with his work as a TAM, he specializes in databases like Amazon Aurora and RDS for PostgreSQL. He has assisted multiple enterprise customers by enabling them with various AWS services and provided guidance on achieving operational excellence.

AWS Database Blog

Parallel vacuuming in Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL

Parallelism in PostgreSQL

Parallel VACUUM parameters in RDS for PostgreSQL

How parallel VACUUM works

Run parallel VACUUM in PostgreSQL

Observations of parallel VACUUM and its performance benefits

Conclusion

About the Authors

Resources

Blog Topics

Follow

Learn

Resources

Developers

Help