Why is it taking so long to restore a snapshot of my Amazon RDS for MySQL DB instance?

I'm trying to restore a snapshot of my Amazon Relational Database Service (Amazon RDS) for MySQL DB instance. Why is it taking so long?

Short description

Long snapshot restore times are typically caused by long database recoveries. The recovery time depends on the workload on your instance when the snapshot was taken. If your source DB instance has binary logging enabled, then recovery might take longer, which in turn increases the snapshot restore duration.

Resolution

When you restore a snapshot, Amazon RDS performs a recovery process and starts the MySQL DB engine on a new DB instance. Starting the new DB instance can take up to a few minutes, depending on how long the recovery session runs during instance startup. For more information, see InnoDB crash recovery on the MySQL website.

Note: After the restore, you'll experience some latency until the volume is fully hydrated from Amazon Simple Storage Service (Amazon S3), because blocks are loaded lazily when they are first accessed. For more information about lazy loading, see Restoring from a snapshot.

To reduce the snapshot restore completion time in Amazon RDS, consider the following approaches:

  • Schedule a backup window or take a manual snapshot of your DB instance during off-peak hours. Activity on the source DB instance while the snapshot is taken affects the database recovery time and, in turn, the snapshot restore time.
  • If the source instance uses the magnetic storage type when the snapshot is taken, then the newly restored instance will be in a modifying state. For example, when you restore the DB snapshot to a General Purpose SSD (gp2) or Provisioned IOPS (PIOPS) storage type, the underlying volume is converted, and the new instance shows a "modifying" state. During this time, you can still connect to the Amazon RDS instance, although you might experience some performance degradation.
  • Temporarily restore your instance to a larger DB instance class (such as an instance class that has more memory). The additional resources can help speed up the overall crash recovery time. After the snapshot restore completes, you can scale the instance class back down.
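
The last approach above can be sketched with the AWS SDK for Python (boto3). This is a minimal sketch, not a definitive procedure: it assumes `rds` is a `boto3.client("rds")` object, and the identifiers and instance classes in the usage comment are hypothetical placeholders. The request arguments are built by small helper functions so that you can inspect them before you call the API.

```python
def build_restore_request(snapshot_id, new_instance_id, temporary_class):
    """Arguments for restoring a snapshot onto a temporarily larger class."""
    return {
        "DBSnapshotIdentifier": snapshot_id,
        "DBInstanceIdentifier": new_instance_id,
        "DBInstanceClass": temporary_class,
    }


def build_scale_down_request(instance_id, target_class):
    """Arguments for moving the restored instance to its long-term class."""
    return {
        "DBInstanceIdentifier": instance_id,
        "DBInstanceClass": target_class,
        # Apply now instead of waiting for the maintenance window.
        # Changing the instance class causes a brief outage.
        "ApplyImmediately": True,
    }


def restore_then_scale_down(rds, snapshot_id, new_instance_id,
                            temporary_class, target_class):
    """Restore on a larger class, wait until available, then scale down.

    `rds` is assumed to be a boto3 RDS client.
    """
    rds.restore_db_instance_from_db_snapshot(
        **build_restore_request(snapshot_id, new_instance_id, temporary_class)
    )
    # Block until the restored instance reports the "available" status.
    rds.get_waiter("db_instance_available").wait(
        DBInstanceIdentifier=new_instance_id
    )
    rds.modify_db_instance(
        **build_scale_down_request(new_instance_id, target_class)
    )


# Hypothetical usage (requires boto3 and configured AWS credentials):
#   import boto3
#   restore_then_scale_down(
#       boto3.client("rds"),
#       snapshot_id="my-snapshot",          # placeholder name
#       new_instance_id="my-restored-db",   # placeholder name
#       temporary_class="db.r5.2xlarge",    # illustrative larger class
#       target_class="db.r5.large",         # illustrative final class
#   )
```

Because the scale-down is applied immediately in this sketch, the instance is briefly unavailable while the class changes. Run that step when a short interruption is acceptable.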

To reduce the snapshot restore completion time for snapshots with binary logging enabled in Amazon RDS, consider the following:

  • When binary logging is enabled (such as when the source instance has automated backups enabled), binary logs directly affect the snapshot restore time. During crash recovery, the snapshot restore process also performs a binary log recovery.
  • To decrease binlog recovery times, avoid large transactions and large binlog files. The more data that is logged in the binary logs, the more data the restore process must read during a binlog recovery. As a result, recovery time increases, which also increases snapshot restore times.
  • Keep transactions appropriately sized whenever possible. A large transaction is written to the binary log file all at once and isn't split across files. As a result, the binary log file grows large, which increases the crash recovery time.
  • The binary logging format can also affect the size of the binary logs and the efficiency of recovery. Some formats (such as row-based logging) log more information than others. Statements that modify a large number of rows in a table cause the DB engine to generate binlog entries for every modified row. As a result, you get a large binlog file. For more information about the row-based logging format, see Usage of row-based logging and replication on the MySQL website. For more information about the different types of binary logging formats, see Advantages and disadvantages of statement-based and row-based replication on the MySQL website.
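
As a rough illustration of keeping transactions small, the sketch below deletes old rows in primary-key ranges and commits after each range, so that no single transaction covers the whole table. The table name `events`, the columns `id` and `created_at`, and the batch size are hypothetical. The function is written against Python's generic DB-API and uses `?` as the default parameter placeholder (the style that the standard-library `sqlite3` module accepts, which makes it easy to try locally). MySQL drivers typically expect `%s` instead, so the placeholder is a parameter.

```python
def delete_in_batches(conn, table, cutoff, batch_size=1000, placeholder="?"):
    """Delete rows with created_at < cutoff, one id range per transaction.

    `conn` is any DB-API connection. `table` is interpolated into the SQL
    text, so it must be a trusted identifier, never user input.
    Returns the number of committed batches.
    """
    cur = conn.cursor()
    cur.execute(f"SELECT MIN(id), MAX(id) FROM {table}")
    low, high = cur.fetchone()
    if low is None:  # empty table, nothing to delete
        return 0

    sql = (
        f"DELETE FROM {table} "
        f"WHERE id >= {placeholder} AND id < {placeholder} "
        f"AND created_at < {placeholder}"
    )
    batches = 0
    start = low
    while start <= high:
        end = start + batch_size
        cur.execute(sql, (start, end, cutoff))
        # Committing here keeps each transaction (and the binlog data
        # written for it) bounded by batch_size rows.
        conn.commit()
        batches += 1
        start = end
    return batches
```

With row-based logging, a batch of at most `batch_size` rows also bounds the number of row images logged per transaction. If the `id` values are sparse, some ranges match no rows. That is harmless, but a smaller or adaptive range might suit such tables better.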

AWS OFFICIAL, updated 3 years ago