Why is it taking a long time to perform a point-in-time recovery of my Amazon RDS for MySQL instance?

3 minute read

I've initiated a point-in-time recovery (PITR) in Amazon Relational Database Service (Amazon RDS) for MySQL, and it's taking longer than expected. Why is this happening?

Short description

Point-in-time recovery (PITR) is the process of restoring a database to the state that it was in at a specified date and time. When you initiate a PITR, the most recent backup (either automated or manual) is restored. The transactional logs are then applied to roll the Amazon RDS database forward to the PITR time.

Resolution

Best practices for avoiding a long point-in-time recovery

To avoid a long point-in-time recovery, follow these best practices:

Create a disaster recovery strategy.
Use smaller transactions and run the COMMIT command more frequently.
To run a large transaction, take a snapshot before and after the large transactions. However, transactions that are larger than the max_allowed_packet parameter cause PITR to fail.
Minimize snapshot restore times. Snapshot restores are initiated as part of the point-in-time recovery process. A longer snapshot restore can contribute to a longer point-in-time recovery session. For more information, see Why is it taking so long to restore a snapshot of my Amazon RDS for MySQL DB instance?
A log apply process can take more time depending on the number of logs to be applied. To reduce the number of logs to be applied, consider taking a manual snapshot in-between the automated backups. As point-in-time recovery automatically selects automatic or manual snapshots created near the PITR time, having intermediate manual snapshots can reduce the number of logs to be applied. If you're dealing with a large volume of changes, take a manual snapshot every 3-4 hours.
If you replay any large transactions, a low wait_timeout value can break point-in-time restore processes in Amazon RDS for MySQL. For example, breaks occur if you're performing a large row-based bulk update, insert, or delete and the replay takes longer than wait_timeout. To prevent any disruptions in the PITR process, set the wait_timeout value to "600" (10 minutes) or more. For more information, see the wait_timeout section in Best practices for configuring parameters for Amazon RDS for MySQL.
When row-based binary logging is used, consider setting the binlog_row_image parameter value to "MINIMAL" instead of "FULL". This updated value will reduce the size of the binary logs, minimizing binlog recovery time.
Unless you need a specific binlog format, consider using the MIXED logging format. With mixed logging, statement-based logging is used by default, but the logging mode switches automatically to row-based as needed. This switching can help reduce binlog sizes. For more information about MIXED logging, see Binary logging formats on the MySQL website.

Point-in-time recovery failures

The following scenarios will cause point-in-time recovery to fail:

DB instances that are in a "storage full" state.
Using MyISAM tables in MySQL. MyISAM storage engines do not support reliable crash recovery processes. The MyISAM tables prevent the point-in-time recovery process from working properly.
Using temporary tables, distributed transactions, and in-memory tables.

Topics

Migration & Modernization Analytics Database

Relevant content

[RDS, Aurora MySQL] Frequent "Recovery of the DB instance has started. Recovery time will vary with the amount of data to be recovered."
User28306523088602983
asked 2 years ago
Disaster recovery of AWS RDS
Accepted Answer
rePost-User-1028852
asked a year ago
PITR testing for disaster recovery
rePost-User-1893682
asked 2 months ago
AWS CLI - RDS Point in time recovery
Accepted Answer
Wiljer
asked 6 months ago
Recovery of the DB instance has started - AWS RDS MySQL
rePost-User-4671366
asked a year ago
Why is the point-in-time restore of my Amazon RDS DB instance taking a long time?
AWS OFFICIALUpdated 2 years ago
Why is the restore of my Amazon DynamoDB table taking a long time?
AWS OFFICIALUpdated 2 years ago
Why is it taking so long to restore a snapshot of my Amazon RDS for MySQL DB instance?
AWS OFFICIALUpdated 3 years ago
How do I avoid the long point-in-time recovery period of my RDS for MySQL DB instance?
AWS OFFICIALUpdated 9 months ago
AWS Elastic Disaster Recovery with VMware Raw Device Mappings
EXPERT
Patrick Kremer
published a year ago