Fast switchovers with PgBouncer on Amazon RDS Multi-AZ deployments with two readable standbys for PostgreSQL

Amazon RDS Multi-AZ deployments with two readable standbys (Amazon RDS Multi-AZ DB cluster) for PostgreSQL now supports typically 35 seconds or less of write downtime for minor version upgrades and system maintenance updates. By using patches for PgBouncer from AWS, you can further reduce this write downtime to typically one second or less. This greatly reduces disruption to applications relying on the database during the upgrade window.

PgBouncer is an open-source connection pooler for PostgreSQL. To offer fast switchovers, the latest patches for PgBouncer from AWS leverages the newly introduced topology metadata from Amazon Relational Database Service (Amazon RDS) for PostgreSQL to pre-create connection pools to all nodes in the Amazon RDS Multi-AZ DB cluster for PostgreSQL on engine startup.

In this post, we set up a PgBouncer docker container to minimize downtime for writes during minor version upgrades of Amazon RDS Multi-AZ DB clusters. We also discuss the API introduced in Amazon RDS for PostgreSQL that exposes the topology of all nodes in the Amazon RDS Multi-AZ DB cluster.

What is Switchover?

A switchover is a planned maintenance activity when customer application traffic is routed from one database host to another database host in a controlled manner. It is initiated in Amazon RDS Multi-AZ DB clusters during minor version upgrades.

Clients have the option to connect to either the cluster writer endpoint or the cluster reader endpoint. Normally, they would connect to the cluster writer endpoint. This endpoint is directed to the current writer instance. When a switchover occurs due to a minor version upgrade, one of the readers is promoted to be the new writer. Amazon RDS automatically updates the cluster endpoint to point to the new writer by updating the instance endpoint DNS record. Due to inherent delays in DNS propagation, this can take up to 35 seconds to resolve. Normally, the application would require logic to reconnect after the temporary lack of availability of the database. The patches for PgBouncer from AWS automatically detect and switch the connection to the new writer. If you use PgBouncer in front of RDS for PostgreSQL Multi-AZ DB cluster, your application can switch over the connection in typically one second or less when performing a minor version upgrade provided that the reader instances have no replica lag.

Prerequisites

For achieving one second or less downtime with minor version upgrades or system maintenance updates you need to be on Amazon RDS for PostgreSQL 15.4 and above with rds_tools extension 1.4 or higher. If you have an existing cluster on this version, upgrade to at least the R3 release. In addition, you need to create a PgBouncer container using the Dockerfile in the PgBouncer fast switchover repository from AWS.

Additionally, you need an Amazon Elastic Compute Cloud (Amazon EC2) instance in the same VPC as your cluster if your database isn’t publicly accessible. For the fastest switchovers, we recommend provisioning your Amazon EC2 instance and Amazon RDS Multi-AZ DB cluster for PostgreSQL in the same Region.

Install psql, git, and docker. In addition, create a new Docker group so you can run the container without root access. For this post, we are using an Amazon Linux 2023 OS.

sudo yum install postgresql15 docker git -y
sudo usermod -a -G docker ec2-user
newgrp docker
sudo systemctl enable --now docker.service

Now that your Amazon EC2 instance and Amazon RDS Multi-AZ DB cluster are created, let’s access the topology API and create a PgBouncer container.

Connect to the writer instance using EC2 instance with psql client and log in as an rds_superuser (the account that you originally created with the database which defaults to postgres). Once you successfully log in, create the rds_tools extension with a version of 1.4 or higher. This extension provides useful data about your Amazon RDS Multi-AZ DB cluster for PostgreSQL. Use the following SQL command to create the rds_tools extension and verify the extension version.

postgres=> CREATE EXTENSION IF NOT EXISTS rds_tools;
CREATE EXTENSION

postgres=> SELECT extversion FROM pg_extension WHERE extname like 'rds_tools';
 extversion
------------
 1.4
(1 row)

Inspect Amazon RDS Multi-AZ DB cluster for PostgreSQL metadata

To enable fast switchovers, we have implemented a metadata function within Amazon RDS for PostgreSQL. You can access this function by installing the latest rds_tools extension, with a version of 1.4 or higher, and also meet the supported engine version requirements outlined in the prerequisites section. After successfully installing the rds_tools extension, you can inspect topology metadata with the following command on any node in the Amazon RDS Multi-AZ DB cluster:

postgres=> SELECT * FROM rds_tools.show_topology('pgbouncer');
              id               |                                     endpoint                            | port
-------------------------------+-------------------------------------------------------------------------+------
 db-KGLXG75BGVIWKQT7NQ4EXAMPLE1 | my-cluster-instance-3.123456789012.us-west-2.rds.amazonaws.com | 5432
 db-KGLXG75BGVIWKQT7NQ4EXAMPLE2 | my-cluster-instance-2.123456789012.us-west-2.rds.amazonaws.com | 5432
 db-KGLXG75BGVIWKQT7NQ4EXAMPLE3 | my-cluster-instance-1.123456789012.us-west-2.rds.amazonaws.com | 5432
(3 rows)

Here’s a breakdown of each column:

id – This is the DBI Resource ID for each instance in the Amazon RDS Multi-AZ DB cluster.
endpoint – This lists the Amazon Route 53 CNAME for each node in the Amazon RDS Multi-AZ DB cluster.
port – This displays the ports each node listens on in the Amazon RDS Multi-AZ DB cluster.

The argument in the show_topology function is optional but recommended. Passing a proxy identifier allows Amazon RDS to track which proxies are actively querying the topology. This provides useful telemetry for Amazon RDS and helps guide support for additional proxies in the future.

PgBouncer utilizes this metadata to pre-create connection pools to all nodes in the Amazon RDS Multi-AZ DB cluster on startup. This ensures that when there is a new writer, time doesn’t need to be spent establishing a new connection. Additionally, PgBouncer automatically self-heals by recreating a connection pool to the previous writer when the node transitions back to a read standby role.

The PgBouncer patches work with any Amazon RDS Multi-AZ DB cluster for PostgreSQL that can return the topology from a configurable query.

The new metadata is available in Amazon RDS for PostgreSQL versions 15.4, 14.9, 13.12, or higher. If you have an existing Amazon RDS Multi-AZ DB cluster on these versions, upgrade to at least the R3 release, prior to setting up PgBouncer.

Set up a PgBouncer container

Now that you have completed the prerequisites, you are ready to create the PgBouncer container and test the connection to the writer of the Amazon RDS Multi-AZ DB cluster. Connect to the Amazon EC2 instance and then complete the following steps:

Clone the PgBouncer fast switchover repository and build the container:

git clone https://github.com/awslabs/pgbouncer-fast-switchover.git
cd pgbouncer-fast-switchover
docker build -f Dockerfile.local . -t pgbouncer

Get the writer endpoint and ensure you can connect to it from the EC2 instance. In this example, the Amazon RDS Multi-AZ DB cluster name is my-cluster. The writer endpoint returned is my-cluster.cluster-123456789012.us-west-2.rds.amazonaws.com. You can run the following AWS CLI command to get the cluster writer endpoint.
```
aws rds describe-db-clusters --region=us-west-2 | \
    jq '.DBClusters[] | select(.DBClusterIdentifier == "my-cluster") |
     .Endpoint'
```

Test the connection to the writer endpoint by using psql, and confirm that you can successfully query the cluster topology.

$ psql -h <your cluster endpoint> -U <your user> postgres
Password for user testuser:
psql (15.4, server 15.4)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, compression: off)
Type "help" for help.


postgres=> SELECT * FROM rds_tools.show_topology('pgbouncer');
              id               |                                     endpoint                            | port
-------------------------------+-------------------------------------------------------------------------+------
 db-KGLXG75BGVIWKQT7NQ4EXAMPLE1 | my-cluster-instance-3.123456789012.us-west-2.rds.amazonaws.com | 5432
 db-KGLXG75BGVIWKQT7NQ4EXAMPLE2 | my-cluster-instance-2.123456789012.us-west-2.rds.amazonaws.com | 5432
 db-KGLXG75BGVIWKQT7NQ4EXAMPLE3 | my-cluster-instance-1.123456789012.us-west-2.rds.amazonaws.com | 5432
(3 rows)

If your connection times out, make sure that your Amazon EC2 instance and Amazon RDS Multi-AZ DB cluster are in the same subnet, or adjust your security group settings accordingly. For more information, refer to Scenarios for accessing a DB instance in a VPC and Control traffic to your AWS resources using security groups. If these instances are in different regions, you need to take additional steps. For details, refer to What is VPC peering?
Modify your PgBouncer config to match your cluster writer endpoint and credentials. Update the following in pgbouncer-fast-switchover/pgbouncer.ini:
1. <change-to-cluster-endpoint> to your writer endpoint
2. <change-dbuser> to your cluster user
3. <change-dbpassword> to the password for the cluster

In this example, the configuration is the following:

postgres = host=my-cluster.cluster-123456789012.us-west-2.rds.amazonaws.com 
port=5432 
user=<username> 
password=<password> 
dbname=postgres 
topology_query='select endpoint from rds_tools.show_topology(\'pgbouncer\')'

Now start the container and test connecting through PgBouncer:

docker run -v $(pwd)/pgbouncer.ini:/home/pgbouncer/pgbouncer.ini \
    -v $(pwd)/userlist.txt:/home/pgbouncer/userlist.txt \
    --network host --rm -it pgbouncer /start.sh

In a new terminal window or tab, test connecting to PgBouncer. You can get the default credentials from pgbouncer-fast-switchover/userlist.txt. You can update them from the defaults.
```
psql -h localhost -U testuser postgres
Password for user testuser:
psql (15.4, server 15.4)
Type "help" for help.

postgres=> select 1;
 ?column?
----------
        1
(1 row)
```

Test a fast switchover

Now that you have successfully created a PgBouncer container and connected it to your Amazon RDS Multi-AZ DB cluster, let’s test a fast switchover. To measure downtime, we have included a helper Python script:

#!/usr/bin/python3

import time
import subprocess
from datetime import datetime,timezone

t0 = time.time()

while True:
    ret = subprocess.call("PGPASSWORD=<password> psql -h localhost -p 5432 -U testuser -c 'SELECT 1' -d postgres", shell=True, stdout=subprocess.DEVNULL,)
    t1 = time.time()
    time_diff = (t1 - t0)*1000
    if time_diff > 200:
        print(f"time since last update: {time_diff:.2f}ms")
    t0 = time.time()

You can test the switchover by issuing a minor version upgrade command and using the apply immediately setting. The upgrade can take a bit of time to complete. In this example, we upgrade from version 15.4 to 15.5 using the AWS CLI.

aws rds modify-db-cluster \
    --db-cluster-identifier my-cluster \
    --engine-version 15.5 \
    --auto-minor-version-upgrade \
    --apply-immediately \
    --region us-west-2

While the upgrade is running, run the Python script simultaneously and monitor for downtime. After the writer switches to the upgraded node in the Amazon RDS Multi-AZ DB cluster, the downtime begins. This is measured in the script as milliseconds, in our example, the downtime was 741.55ms.

$ python3 helper.py
time since last update: 741.55ms

New configuration options

The PgBouncer patches from AWS, introduce a few configuration options. Let’s go through them in detail.

topology_query

The topology_query is used to get the endpoints from the Amazon RDS Multi-AZ DB cluster at PgBouncer startup. If you would like to use fast switchovers outside of Amazon RDS for PostgreSQL, the topology table must have a column named endpoint. Each endpoint is used to create a connection pool. When the writer endpoint is detected as down, PgBouncer polls each endpoint using pg_is_in_recovery() to identify the new writer, before resuming client queries.

polling_frequency

The polling_frequency allows you to tune how frequently pg_is_in_recovery is called during a switchover. The default and recommended value is 100 milliseconds.

server_failed_delay

The server_failed_delay allows you to modify how long to wait between failed attempts to connect to a node in the Amazon RDS Multi-AZ DB cluster. By default, this is set to 30 seconds. If this value is too low, it can potentially slow down how long the node takes to recover due to a connection storm.

recreate_disconnected_pools

Connection pools can close during a switchover or because of a timeout. If recreate_disconnected_pools is enabled and topology_query is set for a database, the closed connection pools are automatically recreated by PgBouncer. The default for recreate_disconnected_pools is enabled.

Benchmarks

We created Amazon RDS Multi-AZ cluster for PostgreSQL on an r6gd.2xlarge instance type and ran a read-only workload. Comparing the patched PgBouncer against version 1.19.1, the performance is nearly identical.

The following code shows PgBouncer 1.19.1:

$ pgbench -c 500 -T 900 postgres -p 5432 -U test -j 2 -S -h localhost
pgbench (15.4, server 15.4)
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 1
query mode: simple
number of clients: 500
number of threads: 2
duration: 900s
number of transactions actually processed: 23577114
latency average = 19.081 ms
initial connection time = 229.633 ms
tps = 26203.419683 (without initial connection time)

The following code shows the patched PgBouncer:

$ pgbench -c 500 -T 900 postgres -p 5432 -U test -j 2 -S -h localhost
pgbench (15.4, server 15.4)
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 1
query mode: simple
number of clients: 500
number of threads: 2
duration: 900 s
number of transactions actually processed: 23792655
latency average = 18.907 ms
initial connection time = 296.011 ms
tps = 26444.833154 (without initial connection time)

Acknowledgements

The fast switchovers capability is built on the pgbouncer-rr project described in Query Routing and Rewrite: Introducing pgbouncer-rr for Amazon Redshift and PostgreSQL. Review the original post for additional details, context, and information on other use cases enabled by the project.

Clean up

If you don’t plan on using your PgBouncer container and Amazon RDS Multi-AZ DB cluster beyond testing, make sure to clean up your EC2 instance and your Multi-AZ DB cluster.

Conclusion

In this post, we demonstrated how you can typically achieve less than one second of write downtime during a minor version upgrade on an Amazon RDS Multi-AZ cluster for PostgreSQL using the patches for PgBouncer from AWS. This is a significant improvement over connecting your application directly to the writer, which can have a write downtime up to 35 seconds due to DNS propagation delays. Using PgBouncer in front of your Amazon RDS Multi-AZ DB cluster for PostgreSQL has your application supporting writes after a switchover faster than ever before.

Give the patches for PgBouncer from AWS a try and let us know your feedback in the comments section.

About the authors

Yuli Khodorkovskiy is a Senior Software Engineer working on Amazon RDS for PostgreSQL with a focus on security and is based in Maryland.

Rajat Jain is a Software Development Engineer within the RDS Opensource Engines team. He specializes in the design and development of Control Plane components for open-source engines.