Recover from a disaster with delayed replication in Amazon RDS for MySQL
Amazon RDS for MySQL now supports a delayed replication, which allows you to set a time period that a replica database lags behind a source database. In a standard MySQL replication configuration, there is minimal delay between the source and the replica. Now you have the option to introduce an intentional delay.
A delay is very helpful when you need to recover from a human error. For example, if someone accidentally drops a table from your primary database, that accident doesn’t have to be saved in a replica. Now, you can stop the replication just before the point at which the table was dropped and promote the replica to become a standalone instance. In this blog post, I walk you through how to use delayed replication to save yourself from just such a scenario.
The following diagram illustrates how you can use a replica with a delay set to 3600 (one hour) to recover from a human error. First, stop replication; second, discover the error in the logs; next, replay the transactions up to the mistake; and last, promote the replica to the master.
Before you get started checking out delayed replication, you need an Amazon RDS for MySQL source database instance running MySQL 5.6.40 or 5.7.22 or later. You also need a MySQL client to connect to your instances and the proper security groups that allow you access to your database.
Make sure you’re retaining your binary logs for a sufficient amount of time. For more information about binary logs, see MySQL Binary Logs. The following example command shows setting the retention value to 24 hours.
Setting up the scenario
Use your own existing Amazon RDS for MySQL database with an existing read replica, or create a new database. For this blog post, I use an existing RDS for MySQL database and create a new read replica.
Create a database
If you don’t already have an Amazon RDS for MySQL instance, create one. Be sure to configure the database with a security group that allows access from your client machine. If you already have a MySQL database that you want to work with, you can skip this step.
Create an RDS for MySQL database by using the AWS Management Console, AWS CLI, AWS SDK, or an AWS CloudFormation template. If you need help creating a MySQL instance, follow the steps in Create and Connect to a MySQL Database with Amazon RDS. The following screenshot shows one database instance already set up and ready to use.
Connect to the database
After your master database instance is created and available, connect to it. If you’re using an Amazon EC2 Linux machine, you might want to set up some environment variables to save some extra typing, as shown in the following commands.
Connect to MySQL using the MySQL command line tool and variables, as follows.
After you’re connected, create some test tables and some data. As a demonstration, I added a few tables to a schema called
myschema. I added the following code.
Create the read replica
Add a read replica for disaster recovery. Use the AWS CLI or the AWS Management Console to add your read replica. The replica can be in the same Availability Zone, a different Availability Zone, or a different AWS Region. Here is how I create a read replica.
After your replica is shown as available, configure a replica delay for it behind your master instance.
Configure the replica delay
You configure replication delay on the source instance either before or after a read replica is created. With delayed replication, you specify the minimum amount of time, in seconds, to delay replication from the source to the read-replica. To configure the replication delay, connect to the replica and call the stored procedure
mysql.rds_set_source_delay with a parameter value between 0 and 86400. The maximum value is 1 day (86400 seconds). We recommend that you set the value to at least 3600 (1 hour) to allow enough time to diagnose and recover from an erroneous SQL statement. Here is how I set this:
Recovering from a dropped table
In my example, 3 hours after creating the replica, I dropped the table mytesttable11 when I actually meant to drop mytesttable01. Because the replica is delayed by one hour, I have time to figure out what happened and keep the mistake from propagating to the replica. I realized my mistake immediately, so I took the following recovery steps.
- Connect to the read replica
blogger-mm-rrand called the stored procedure
If you can stop replication when you first realize an issue, then you can find the logs with the mistake and take action to prevent them from getting replayed on the replica.
- Find the binlog location of the error using
mysqlbinlogutility. For more information about
mysqlbinlogwith RDS, see Log Access Concepts.
- Set your replication delay to zero so that the replication from this point happens as quickly as possible. To do this, run
- After you have found the log and the location that contains your change, make a note of the log name and the location. My error occurred just after location
mysql-bin-changelog.000263. To start replication until it reaches your location in your binary log file, use the procedure
call.rds_start_replication_until. Warning – Any changes that were made after the location specified in the recovery have to be manually reapplied or are permanently lost. Here is how I did this:
- Wait for the RDS Event “Replication has been stopped since the replica reached the stop point specified by the rds_start_replication_until stored procedure. Use
rds_start_replicationto resume replication.” Connect to the replica and verify that all changes have been applied by checking if the Relay_Master_Log_File and Exec_Master_Log_Pos are at least the positions that you passed. Here is how I checked this:
- Promote the replica after it has all the changes prior to the dropped table. Here is how I promoted:
- Remove the old master instance. Here is a command for this:
- Rename the former read replica after the old master is gone by using a command like this:
- Connect to the new master database and resume normal operations.
Flexibility with other scenarios
In addition to setting the replication delay on an existing RDS for MySQL read replica, you might want to set the delay value on the source database instance before you even create a new read replica. Do you have a non-RDS source database? You can also configure a delayed replica in RDS with a non-RDS source. Likewise, you can configure an RDS source for replication with a delayed non-RDS replica.
The ability to use a delayed read replica with Amazon RDS for MySQL opens up new options for disaster recovery. As you can see, a replica delay is very helpful if you experience a situation such as someone accidentally dropping a needed table from your database. You can stop the replication, recover the database up to the point of the accident, promote the replica to a standalone instance, and then resume your application.
About the Author
Wendy Neu has worked as a Data Architect with Amazon since January 2015. Prior to joining Amazon, she worked as a consultant in Cincinnati, OH, helping customers integrate and manage their data from different unrelated data sources.