Achieve one second or less of downtime with ProxySQL when upgrading Amazon RDS Multi-AZ deployments with two readable standbys

In this post, we explore how to use ProxySQL to achieve a downtime of typically 1 second or less when performing a minor version upgrade on Amazon RDS for MySQL Multi-AZ deployments with two readable standbys (Amazon RDS Multi-AZ DB cluster). ProxySQL is an open source proxy for MySQL.

Currently, minor version upgrades or system maintenance updates for Amazon Relational Database Service (Amazon RDS) for MySQL databases can take several minutes and require a planned downtime. Custom-built upgrade solutions are not fully automated and are complex to use with database fleets because they need coordination among various application users, require additional infrastructure, are resource intensive, and are expensive to build.

You can now shorten the downtime required to perform minor version upgrades or system maintenance updates to typically 35 seconds or less by using Amazon RDS Multi-AZ DB cluster. When combined with ProxySQL, this downtime can be further reduced to typically 1 second or less.

Solution overview

When upgrading the minor version of Amazon RDS Multi-AZ DB cluster, the process begins by applying the patch to each of the readers in sequence. After both the readers are upgraded, one of them is promoted to be the new writer, followed by an upgrade of the old writer. This process of switching to the new writer takes approximately 1 second or less. However, applications can still see a downtime of at least 35 seconds. This is due to Domain Name System (DNS) propagation delays of the cluster endpoint because the cluster endpoint changes after the new writer is promoted during the upgrade process.

By setting up ProxySQL with your Amazon RDS Multi-AZ DB cluster, you can now reduce the downtime for minor version upgrades and system maintenance updates to typically 1 second or less. ProxySQL has the ability to monitor your database for changes in the writer and reroute traffic to the new writer. When configured properly for Amazon RDS Multi-AZ DB clusters, ProxySQL is not impacted by DNS propagation delays of the cluster endpoint, and quickly detects the change to the new writer to keep your application downtime to typically under 1 second during minor version upgrades.

In the following sections, we demonstrate how to configure ProxySQL for Amazon RDS for MySQL Multi-AZ DB clusters to reduce your application downtime to typically 1 second or less during minor version upgrades or system maintenance updates.

Prerequisites

Before walking through the steps involved to set up ProxySQL, you need the following:

An Amazon RDS for MySQL Multi-AZ DB cluster. You can use an existing cluster or create a new one. Remember to use a cluster running an older MySQL version, if you want to perform a minor version upgrade to test the downtime of 1 second or less.
An Amazon Elastic Compute Cloud (Amazon EC2) instance for ProxySQL. The EC2 instance should be in the same region and VPC as your Amazon RDS Multi-AZ DB cluster. It’s recommended to use Amazon Linux 2 or higher for the EC2 instance. ProxySQL version 2.5.1 or higher is recommended. If you already have an EC2 instance with ProxySQL running in the same region and VPC as your Amazon RDS Multi-AZ DB cluster, you can use the same instance and follow along with this post to set it up. Otherwise, on the new EC2 instance, install ProxySQL and complete the initial setup for configuring ProxySQL. For more information, refer to Getting started.

Collect the endpoint and port of all instances in the Amazon RDS Multi-AZ DB cluster

An Amazon RDS Multi-AZ DB cluster has one writer instance and two reader instances. Each DB instance in the cluster has its own built-in endpoint. You can find the endpoint and port of all the cluster instances from the AWS Management Console, AWS Command Line Interface (AWS CLI), or Amazon RDS API.

Find the instance endpoint and port using the console

To use the console, complete the following steps:

On the Amazon RDS console, choose the region in which DB cluster is created.
In the navigation pane, choose Databases to display the list of all your DB instances.
Choose the name of the DB cluster.
Choose the name of instance-1 of the DB cluster to display all its details.
On the Connectivity & security tab, note down the endpoint and port number of the instance.
Repeat these steps for instance-2 and instance-3 of the DB cluster to collect their endpoint and port information.

In this example, our Amazon RDS Multi-AZ DB cluster mycluster has the following instance endpoints and port.

Instance endpoints	Port
mycluster-instance-1.csempbee1cem.us-west-2.rds.amazonaws.com	3306
mycluster-instance-2.csempbee1cem.us-west-2.rds.amazonaws.com	3306
mycluster-instance-3.csempbee1cem.us-west-2.rds.amazonaws.com	3306

Find the instance endpoint and port using the AWS CLI

To use the AWS CLI, use the describe-db-clusters command to find the DBInstanceIdentifier of all three instances of the cluster. Then use the describe-db-instances command to find the instance endpoint and port of each instance.

Find the instance endpoint and port using the Amazon RDS API

To use the Amazon RDS API, call the DescribeDBClusters API to find the DBInstanceIdentifier of all the instances of the cluster. Then call the DescribeDBInstances API to find the instance endpoint and port of each instance.

Onboard the Amazon RDS Multi-AZ DB cluster onto ProxySQL using instance endpoints

Use the instance endpoints and port information you retrieved in the previous step to onboard the Amazon RDS Multi-AZ DB cluster onto ProxySQL. This will prevent ProxySQL from being impacted by the DNS propagation delay issues described earlier in this post.

Connect to the EC2 instance running ProxySQL.

Connect to ProxySQL through the admin interface to configure it:

mysql -u proxy_admin -p password -h 127.0.0.1 -P6032 --prompt='ProxySQL Admin>'

Insert the endpoints and port of all the instances in the Amazon RDS Multi-AZ DB cluster you retrieved earlier to the mysql_servers table in the ProxySQL configuration. In the following example, we onboarded all the instances of the cluster with hostgroup_id 1. You can use a different ID if needed, but all the instances should be onboarded to the same hostgroup_id.

Add the instance-1 endpoint:

ProxySQL Admin> INSERT INTO mysql_servers(hostgroup_id,hostname,port) VALUES(1,'mycluster-instance-1.csempbee1cem.us-west-2.rds.amazonaws.com',3306);

Add the instance-2 endpoint:

ProxySQL Admin> INSERT INTO mysql_servers(hostgroup_id,hostname,port) VALUES(1,'mycluster-instance-2.csempbee1cem.us-west-2.rds.amazonaws.com',3306);

Add the instance-3 endpoint:

ProxySQL Admin> INSERT INTO mysql_servers(hostgroup_id,hostname,port) VALUES(1,'mycluster-instance-3.csempbee1cem.us-west-2.rds.amazonaws.com',3306);

Verify all the three instances of the cluster were added to the ProxySQL configuration:

ProxySQL Admin> select hostgroup_id, hostname, port, status from mysql_servers;
+--------------+---------------------------------------------------------------+------+--------+
| hostgroup_id | hostname                                                      | port | status |
+--------------+---------------------------------------------------------------+------+--------+
| 1            | mycluster-instance-1.csempbee1cem.us-west-2.rds.amazonaws.com | 3306 | ONLINE |
| 1            | mycluster-instance-2.csempbee1cem.us-west-2.rds.amazonaws.com | 3306 | ONLINE |
| 1            | mycluster-instance-3.csempbee1cem.us-west-2.rds.amazonaws.com | 3306 | ONLINE |
+--------------+---------------------------------------------------------------+------+--------+
3 rows in set (0.00 sec)

Load the mysql_servers table configuration to runtime and persist it to the disk:

ProxySQL Admin> LOAD MYSQL SERVERS TO RUNTIME;
ProxySQL Admin> SAVE MYSQL SERVERS TO DISK;

Configure ProxySQL monitoring for the Amazon RDS Multi-AZ DB cluster

This helps ProxySQL monitor the Amazon RDS Multi-AZ DB cluster for changes in the writer, detect the new writer in 1 second or less, and reroute traffic to the new writer.

In this example, we use the default ProxySQL monitor user monitor and password monitor to configure monitoring. You can use a different monitor user and password if you want.

First, you configure the ProxySQL monitor user on the Amazon RDS Multi-AZ DB cluster.

Create the ProxySQL monitor user and grant it privileges to monitor:

mysql> CREATE USER 'monitor'@'%' IDENTIFIED BY 'monitor';
mysql> GRANT PROCESS, REPLICATION CLIENT ON *.* TO 'monitor'@'%';

Now you configure ProxySQL to monitor the Amazon RDS Multi-AZ DB cluster.

Connect to the EC2 instance running ProxySQL.

Connect to ProxySQL through the admin interface to configure:

mysql -u proxy_admin -p password -h 127.0.0.1 -P6032 --prompt='ProxySQL Admin> '

Configure the monitor user and password:

ProxySQL Admin> UPDATE global_variables SET variable_value='monitor' WHERE variable_name='mysql-monitor_username';

ProxySQL Admin> UPDATE global_variables SET variable_value='monitor' WHERE variable_name='mysql-monitor_password';

Configure the monitoring intervals by setting mysql-monitor_read_only_interval to 100 milliseconds. This helps ProxySQL detect the change to the new writer within 100 milliseconds.
```
ProxySQL Admin> UPDATE global_variables SET variable_value=100 WHERE variable_name='mysql-monitor_read_only_interval';
```

Load the configuration to runtime and persist it to the disk:

Proxy Admin> LOAD MYSQL VARIABLES TO RUNTIME;
Proxy Admin> SAVE MYSQL VARIABLES TO DISK;

Verify ProxySQL can connect to the Amazon RDS Multi-AZ DB cluster and monitor it. There should be no connection or ping errors.

ProxySQL Admin> SELECT * FROM monitor.mysql_server_connect_log ORDER BY time_start_us DESC LIMIT 3;
+---------------------------------------------------------------+------+------------------+-------------------------+---------------+
| hostname                                                      | port | time_start_us    | connect_success_time_us | connect_error |
+---------------------------------------------------------------+------+------------------+-------------------------+---------------+
| mycluster-instance-3.csempbee1cem.us-west-2.rds.amazonaws.com | 3306 | 1699340963758529 | 1776                    | NULL          |
| mycluster-instance-2.csempbee1cem.us-west-2.rds.amazonaws.com | 3306 | 1699340963326166 | 806                     | NULL          |
| mycluster-instance-1.csempbee1cem.us-west-2.rds.amazonaws.com | 3306 | 1699340962893914 | 3000                    | NULL          |
+---------------------------------------------------------------+------+------------------+-------------------------+---------------+
3 rows in set (0.00 sec)

ProxySQL Admin> SELECT * FROM monitor.mysql_server_ping_log ORDER BY time_start_us DESC LIMIT 3;
+---------------------------------------------------------------+------+------------------+----------------------+------------+
| hostname                                                      | port | time_start_us    | ping_success_time_us | ping_error |
+---------------------------------------------------------------+------+------------------+----------------------+------------+
| mycluster-instance-2.csempbee1cem.us-west-2.rds.amazonaws.com | 3306 | 1699341052591621 | 300                  | NULL       |
| mycluster-instance-3.csempbee1cem.us-west-2.rds.amazonaws.com | 3306 | 1699341052591377 | 576                  | NULL       |
| mycluster-instance-1.csempbee1cem.us-west-2.rds.amazonaws.com | 3306 | 1699341052591211 | 752                  | NULL       |
+---------------------------------------------------------------+------+------------------+----------------------+------------+
3 rows in set (0.00 sec)

Configure ProxySQL MySQL replication hostgroups

ProxySQL uses a MySQL replication hostgroup to understand the replication topology, identify the reader and writer instances in the cluster, and monitor the cluster for topology changes.

Each row in mysql_replication_hostgroups table represents a pair of writer and reader hostgroups. ProxySQL monitors the read_only value for all the servers in a hostgroup pair and assigns each server to the writer or reader hostgroup.

In this example, we create writer_hostgroup 1 and reader_hostgroup 2 because we onboarded all the instances in the cluster with hostgroup_id 1 in a previous step.

Create the writer and reader replication hostgroups on ProxySQL:

ProxySQL Admin> INSERT INTO mysql_replication_hostgroups (writer_hostgroup,reader_hostgroup,comment) VALUES (1,2,'aws-1sec-downtime');

ProxySQL Admin> select * from mysql_replication_hostgroups;
+------------------+------------------+------------+-------------------------+
| writer_hostgroup | reader_hostgroup | check_type | comment                 |
+------------------+------------------+------------+-------------------------+
| 1                | 2                | read_only  | aws-1sec-downtime       |
+------------------+------------------+------------+-------------------------+

To enable the replication hostgroup, run the following:

ProxySQL Admin> LOAD MYSQL SERVERS TO RUNTIME;
ProxySQL Admin> SAVE MYSQL SERVERS TO DISK;

Verify both reader instances (read_only=1) and one writer instance (read_only=0) in the cluster have moved to the right host groups. There should be no errors in the log.

ProxySQL Admin> SELECT * FROM monitor.mysql_server_read_only_log ORDER BY time_start_us DESC LIMIT 3;
+---------------------------------------------------------------+------+------------------+-----------------+-----------+-------+
| hostname                                                      | port | time_start_us    | success_time_us | read_only | error |
+---------------------------------------------------------------+------+------------------+-----------------+-----------+-------+
| mycluster-instance-2.csempbee1cem.us-west-2.rds.amazonaws.com | 3306 | 1699346010885708 | 317             | 1         | NULL  |
| mycluster-instance-3.csempbee1cem.us-west-2.rds.amazonaws.com | 3306 | 1699346010885442 | 601             | 1         | NULL  |
| mycluster-instance-1.csempbee1cem.us-west-2.rds.amazonaws.com | 3306 | 1699346010885308 | 746             | 0         | NULL  |
+---------------------------------------------------------------+------+------------------+-----------------+-----------+-------+
3 rows in set (0.00 sec)

ProxySQL is now monitoring the read_only value for all the instances in the cluster. It has created hostgroup 2 automatically, and moved the two reader instances from hostgroup 1 (writer hostgroup) to hostgroup 2 (reader hostgroup).

ProxySQL Admin> select hostgroup_id, hostname, port, status from mysql_servers;
+--------------+---------------------------------------------------------------+------+--------+
| hostgroup_id | hostname                                                      | port | status |
+--------------+---------------------------------------------------------------+------+--------+
| 1            | mycluster-instance-1.csempbee1cem.us-west-2.rds.amazonaws.com | 3306 | ONLINE |
| 2            | mycluster-instance-3.csempbee1cem.us-west-2.rds.amazonaws.com | 3306 | ONLINE |
| 2            | mycluster-instance-2.csempbee1cem.us-west-2.rds.amazonaws.com | 3306 | ONLINE |
+--------------+---------------------------------------------------------------+------+--------+
rows in set (0.00 sec)

As a final step, persist all config to disk:

ProxySQL Admin> SAVE MYSQL SERVERS TO DISK;
ProxySQL Admin> SAVE MYSQL VARIABLES TO DISK;

Configure ProxySQL mysql_users

Next, we configure the MySQL users that client should use to connect to ProxySQL and in turn connect to the cluster DB instances to send traffic.

Insert the user into the mysql_users table on ProxySQL:
```
Proxy Admin> INSERT INTO mysql_users(username,password,default_hostgroup) VALUES ('meena','password',1);
```
Note that this user should exist on the cluster as well, otherwise ProxySQL can’t establish the connection to the cluster DB on behalf of the client to send traffic.

In this example, we specified the default_hostgroup as 1 because our writer_hostgroup is 1 and we intend to send write traffic from this user.

Load the configuration to runtime and disk:

Proxy Admin> LOAD MYSQL USERS TO RUNTIME;
Proxy Admin> SAVE MYSQL USERS TO DISK;

We’re done configuring ProxySQL.

Exit from the ProxySQL admin interface:
```
ProxySQL Admin> exit
```

Validate ProxySQL connection to the Amazon RDS Multi-AZ DB cluster

ProxySQL is now ready to serve application traffic for the Amazon RDS Multi-AZ cluster. You can test this on the ProxySQL EC2 instance by running the following command and verifying it connects to the cluster. Update the arguments -P <port>, -u <user>, and -p<password> as applicable:

mysql -h 127.0.0.1 -P 3306 -u meena -ppassword

Test 1 second or less downtime

Now that you have successfully configured ProxySQL and connected it to your Amazon RDS for MySQL Multi-AZ DB cluster, let’s test for 1 second or less of downtime during a minor version upgrade of the cluster.

Perform a minor version upgrade. In this example, we upgrade the cluster from MySQL 8.0.33 to 8.0.34:

aws rds modify-db-cluster \
  --db-cluster-identifier mycluster \
  --engine-version 8.0.34 \
  --apply-immediately \
  --region us-west-2

The upgrade itself may take time to complete. However, you can monitor the application downtime during the upgrade. The downtime is typically under 1 second, provided the replica lag is near zero when the new writer is elected during the upgrade process.

Here’s a simple bash you can run to measure the downtime:

mysql -h ec2-proxysql.us-west-2.compute.amazonaws.com -P 3306 -u meena -ppassword -e 'CREATE DATABASE testdb';
mysql -h ec2-proxysql.us-west-2.compute.amazonaws.com -P 3306 -u meena -ppassword -e 'CREATE TABLE testdb.testtable (`id` int NOT NULL, `current_timestamp` VARCHAR(255), `server_id` VARCHAR(255), `read_only` int, PRIMARY KEY (`id`))';
while true; do mysql -h ec2-proxysql.us-west-2.compute.amazonaws.com -P 3306 -u meena -ppassword -e 'replace into testdb.testtable values (1, current_timestamp(6), @@server_id, @@read_only); select * from testdb.testtable' >> downtime_test.txt; sleep 0.1; done

Remember to update the arguments -h <ProxySQL_host>, -P <port>, -u <user>, and -p<password> as applicable. This bash creates a table ‘testtable’ under a new database ‘testdb’ on your cluster via ProxySQL and keeps writing entries to the ‘testtable’ while also logging them to a log file downtime_test.txt.

Once your upgrade is done, open the downtime_test.txt file and find the timestamp at which the server_id value changes. This is the timestamp at which the new writer first started handling write traffic and the timestamp before it is the timestamp at which the old writer handled the last write. Subtract the two timestamps, that’s your downtime.

id      current_timestamp               server_id       read_only
1       2023-11-07 23:47:43.721050      1018386611      0         // old writer: last write
id      current_timestamp               server_id       read_only
1       2023-11-07 23:47:44.451208      1325011101      0	      // new writer: first write

In this example, the downtime is 730ms.

Clean up

If you don’t plan on using the EC2 ProxySQL instance or the Amazon RDS Multi-AZ DB cluster beyond testing, delete the resources to avoid incurring future charges.

Summary

In this post, we examined in detail how you can configure open source ProxySQL to achieve typically under 1 second of downtime while upgrading Amazon RDS for MySQL Multi-AZ DB cluster. This is a significant improvement from the typical 35 seconds of downtime you will experience when you perform minor version upgrades or system maintenance updates on your Amazon RDS Multi-AZ DB cluster due to DNS propagation delay issues described earlier in the post.

Go try open source ProxySQL with Amazon RDS for MySQL Multi-AZ deployments with two readable standbys. Let me know how it works!

About the author

Meenakshi Garg is a software engineer at AWS working on building Amazon RDS for MySQL and MariaDB. She is passionate about technology and is based in California.