AWS Database Blog

Migration options for MySQL to Amazon RDS for MySQL or Amazon Aurora MySQL

MySQL is one of the most popular open-source relational databases in the world. Many customers choose to run their MySQL workload in Amazon Relational Database Service (Amazon RDS) for MySQL because it takes care of heavy lifting tasks such as hardware provisioning, database setup, patching, configuration, monitoring, backups, and scaling.

Amazon Aurora is a great option for any enterprise application that can use a relational database. Aurora is a MySQL and PostgreSQL-compatible relational database built for the cloud. If you want a modern relational database with full MySQL compatibility that provides high availability and performance at global scale, you can migrate your MySQL database to Amazon Aurora MySQL-Compatible Edition. With Aurora, you can have up to 15 low-latency read replicas for each cluster. It provides enterprise-grade advanced features like fast database clones, auto scaling replicas, and backtracking, to name a few. In addition to those features, Aurora database cluster storage spans across three Availability Zones in a Region, providing out-of-the-box durability and fault tolerance to your data across physical data centers. With Amazon Aurora Global Database, you can extend this across Regions by adding secondary database clusters in different Regions. It replicates your data with negligible impact on database performance, enables fast local reads with low latency in each Region, and provides disaster recovery from Region-wide outages.

Migrations are among the most time-consuming and critical tasks handled by DBAs. Large-scale database migrations require adequate planning and implementation to make sure that the workload being migrated performs the same way or better. There are two common ways to migrate a database: physical and logical. Physical migration allows you to copy the physical database files to migrate the database. Logical migration allows you to create a logical backup of your database by generating SQL statements in text or SQL files and executing them on your target database to generate the same schema and data structure. Amazon RDS for MySQL and Aurora MySQL support both physical and logical migration using Percona XtraBackup and native MySQL-compatible tools.

In this post, we discuss different options to migrate data from a self-managed MySQL database to AWS-managed Amazon RDS for MySQL or Aurora MySQL.

Migration options from a self-managed MySQL database

Given the popularity of MySQL databases, you can choose from a wide range of options to migrate data from one MySQL-compatible database to another. Choosing the right migration strategy based on ease of use, data size, and downtime requirements is critical for a successful migration. Let’s look at the migration options available in more detail.

Physical migration

Percona XtraBackup allows you to perform consistent backups on running databases without affecting your application. It works by copying the physical InnoDB data files from the source and restoring them on the target. Because there is no logical data movement involved (selecting data and inserting it again), physical migration using Percona XtraBackup is more performant than logical migration. You can use this method for data of any size.

Amazon RDS for MySQL supports migration from Percona XtraBackup files stored in an Amazon Simple Storage Service (Amazon S3) bucket from source MySQL versions 5.6, 5.7, and 8.0. Aurora MySQL supports migration from Percona XtraBackup files that are stored in an S3 bucket from the source MySQL version 5.6 and 5.7. If you’re already using Percona XtraBackup for database backups in the database you’re planning to migrate, you can copy the full and incremental backup files to Amazon S3 and restore them to Amazon RDS for MySQL or Aurora MySQL.

For a MySQL 8.0 migration, you must use Percona XtraBackup 8.0.12 and higher. For MySQL 5.7 and 5.6 migrations, you can also use Percona XtraBackup 2.4 or 8.0.12 and higher versions.

The following diagram shows the high-level steps involved in migrating data using the Percona XtraBackup method:

Migrating using Percona XtraBackup involves the following steps:

  1. Use the Percona XtraBackup tool to create a backup of the source MySQL database.
  2. Upload backup files to an S3 bucket.
  3. Restore backup files through the AWS Management Console, AWS Command Line Interface (AWS CLI), or API.
  4. (Optional) For minimal downtime, set up replication between the source and Aurora MySQL cluster using binlog replication.

For detailed step-by-step instructions on using the Percona Xtrabackup method and setting up binary log replication, refer to Migrating data from MySQL by using an Amazon S3 bucket. For more information about limitations, see Limitations and recommendations for importing backup files from Amazon S3 to Amazon RDS.

Logical migration

In this section, we discuss the different tools available for a logical migration.

mysqldump

mysqldump is a popular logical backup tool for MySQL-compatible database engines. With this tool, you can export schema definitions and data from a MySQL server. To import a logical backup to the target MySQL server, you can use the MySQL command line client or the SOURCE SQL command from inside the database. In this section, we discuss two methods to perform a dump and import to the target RDS for MySQL or Aurora MySQL database.

The first method involves exporting data from the existing MySQL-compatible database and importing it directly into RDS for MySQL or Aurora MySQL using mysqldump and mysql commands. This is a one-step migration because it loads the data to the target database without creating any intermediate dump files. This method is simple and straightforward as long as you have a high-speed, low-latency, and stable network connection between the source and the target server. Use the following example command on a host with the MySQL client installed and has network communication to RDS MySQL instance:

mysqldump --host=<host_name> --user=<source_user> \
--password=<source_user_password> --databases <database_name> \ 
--single-transaction --compress --order-by-primary \
| mysql -u <RDS_user> --port=<rds_port_numer> \
--host=<db_identifier> -p<RDS_password>

The second method involves creating a data dump of an existing MySQL-compatible database using the mysqldump tool and restoring the SQL dump file to an RDS for MySQL or Aurora MySQL DB cluster using the SOURCE SQL command. This approach is ideal for smaller datasets because as the data size increases, the amount of time it takes for the backup and restore increases. Use the following command to create a SQL dump of the database:

mysqldump --host=<host_name> --user=<source_user> \
--password=<source_user_password> --databases <database_name> \
--single-transaction --compress --order-by-primary \
--master-data=2 -p<local_password> -r backup.sql

The --single-transaction option sets the transaction isolation mode to repeatable read and sends a START TRANSACTION SQL statement to the server before dumping data. This creates a consistent backup by dumping all tables in a single transaction. To avoid the impact on your production database, you can create a read replica and take a backup from the read replica.

After you create the MySQL dump file of your database, import it using the source SQL command after connecting to the target database server:

source backup.sql;

If you require a near-zero downtime migration, you can use mysqldump to create backup files that not only include the data, but also the binary log position at the time when the consistent backup is taken. To do so, use the parameter --master-data=2. The --master-data parameter with value 2 captures the binary log and its position in the dump file as a part of the CHANGE MASTER TO statement to set up the binary log replication for near-zero downtime migration. You can use the binary log and its position obtained from the dump file to set up replication between a self-managed MySQL source and Amazon RDS for MySQL or Aurora MySQL. For step-by-step instructions on setting up this replication, Replicate between your external database and new Amazon RDS DB instance.

If you’re using a read replica to take a backup, you can stop the replication between the source and read replica. Capture the binary log and its position using SHOW REPLICA STATUS, and then take the backup using mysqldump. After successfully importing the data to Amazon RDS for MySQL or Aurora MySQL, you can set up the binary log replication to the source MySQL database using ‘Relay_Master_Log_File‘ and ‘Exec_Master_Log_Pos‘.

Multi-threaded migration using mydumper and myloader

mydumper and myloader are open-source MySQL export/import tools designed to address performance issues associated with the legacy mysqldump utility. They offer advanced features such as dumping and loading data using parallel threads, creating dump files in a file-per-table fashion, creating chunked dumps in a multiple-files-per-table fashion, dumping data and metadata into separate files for easier parsing and management, configurable transaction size during import, and more.

This approach is suitable for both small and large-scale migrations (data size ranging from hundreds of GBs to TBs) because these tools have flexible configuration options. You can use these tools with both Amazon RDS for MySQL and Aurora MySQL.

Use the following command to create a data dump file of DbName1 and DbName2 databases using mydumper with basic configurations:

mydumper --host=<host_name> --user=<source_user> \
--password=<source_user_passwprd> --directory=<output-dir> \
--threads=4 --compress --events --triggers --routines \
--regex ‘^(DbName1\.|DbName2\.)’ \
--logfile /<output_dir>/mydumper-logs.txt

For a complete list of parameters and usage, see Mydumper Usage.

Use the following command to restore the backup using myloader with basic configurations:

myloader --host=<host_name> --user=<source_user> \
--password=<source_user_passwprd> --directory=<output-dir> \
--queries-per-transaction=500 –threads=4 --verbose=3 \
–-logfile /<output_dir>/myloader-logs.txt

For a complete list of parameters and usage, see Myloader Usage.

If you have a requirement of near-zero downtime migration, mydumper captures the binary log and its positions before taking the backup and stores them in the metadata file in the target directory where the dump files are stored. Read the metadata file using the cat Linux command, and the results should appear similar to the following output:

$cat metadata 
Started dump at: 2021-10-28 23:29:13
SHOW MASTER STATUS:
	Log: mysql-bin-changelog.000003
	Pos: 37993751
	GTID:

You can use the binary log and its position to set up replication between the self-managed MySQL source and Amazon RDS for MySQL or Aurora MySQL.

Flat file migration using files in CSV format

Migrating with flat files is a preferred choice for many database administrators over the SQL format file because flat file dumps are easier to process using command line tools like split or combine. The lack of SQL encapsulation results in smaller dump files and reduces processing overhead during import, and it creates flat files in a file-per-table fashion, which makes it easy to import them in parallel.

The one disadvantage of using the flat file method is that the server uses a single transaction to import data from each dump file. For example, if you’re importing a large dump file and a crash occurs for any reason, it rolls back the transaction, which can be very time-consuming. To have more control over the size of import transactions, you need to manually split very large dump files into smaller files, and then import one file at a time. Files are produced in comma-separated value (CSV) or tab-separated value (TSV) format. This migration method uses a combination of the mysqldump tool and SQL commands.

First, use mysqldump to create a schema-only dump in SQL format, as shown in the following code. This dump describes the structure of schema objects like tables, views, and functions but doesn’t contain data.

mysqldump --host=<host_name> --user=<source_user> \
--password=<source_user_password> --databases <database_name> \
--single-transaction –-no-data > myschema_dump.sql

Next, create flat file data dumps using one of the following options:

  • The SELECT … INTO OUTFILE statement, which dumps table contents (but not table structure) into a file located in the server’s local file system.
  • The mysqldump command with the –tab parameter, which also dumps table contents to a file and creates the relevant metadata files with CREATE TABLE statements. The command uses SELECT … INTO OUTFILE internally, so it also creates dump files on the server’s local file system.

When you create the files using any of these methods, you can import it to the target RDS for MySQL or Aurora MySQL database using one of the following methods:

AWS Database Migration Service

AWS Database Migration Service (AWS DMS) is a managed database migration service that helps you quickly and securely migrate a database to AWS. It offers a wide range of options to migrate the data from source to target, such as remapping schema or table names, advanced data filtering, and migrating and replicating multiple database servers into a single Aurora MySQL cluster. With AWS DMS, you can perform simple migrations with near-zero downtime using change data capture (CDC) replication. AWS DMS can be an option if your source is a MySQL-compatible database such as MariaDB and can’t be migrated using any of the methods described previously, such as Percona XtraBackup or native export/import tools. For more information, see Using a MySQL-compatible database as a source for AWS DMS.

Migrate from Amazon RDS for MySQL to Aurora MySQL

To ease the migration burden, Amazon RDS provides two additional options to migrate from Amazon RDS for MySQL to Aurora MySQL. With a few clicks, you can migrate data to Aurora MySQL using the physical migration methods described in this section.

Aurora read replicas

With Aurora read replicas, you can migrate from Amazon RDS for MySQL to Aurora MySQL with near-zero downtime. This method is ideal if you want to migrate to Aurora MySQL in the same account and Region as the RDS instance. You can do this via the console or the AWS CLI. When you create an Aurora read replica of a MySQL DB instance, Amazon RDS creates a DB snapshot of your source MySQL DB instance and migrates the data from the DB snapshot to an Aurora read replica. After that, Amazon RDS for MySQL starts the binary log replication between Amazon RDS for MySQL and the Aurora MySQL DB cluster. When the Aurora MySQL read replica catches up with the source RDS for MySQL instance, stop the application and wait for any inflight transactions to catch up by monitoring the AuroraBinlogReplicaLag Amazon CloudWatch metric until it reaches zero. Point the application to the Aurora DB cluster by replacing the RDS for MySQL instance endpoint with the Aurora cluster endpoint in your application code.

Note that you can only create an Aurora MySQL 3 read replica DB cluster of an Amazon RDS for MySQL 8.0 DB instance (MySQL version 8.0.23 and below).

The following diagram shows the migration process from Amazon RDS for MySQL to Aurora MySQL using the Aurora read replica method:

Amazon RDS snapshot

This option is ideal if you want to migrate to another AWS account or Region. You can migrate either a manual or automated DB snapshot of the RDS for MySQL instance. You can perform this operation through the console or using the AWS CLI or AWS SDK. You can only restore an RDS for MySQL 5.7 snapshot to Aurora MySQL version 2 compatible with MySQL 5.7 or an RDS for MySQL 8.0 (MySQL version 8.0.23 and below) snapshot to Aurora MySQL version 3 compatible with MySQL 8.0. If you need to migrate from Amazon RDS for MySQL 5.7 version to Aurora MySQL 3, you can perform the migration in the following ways:

The following diagram shows the migration process from Amazon RDS for MySQL to Aurora MySQL using the RDS snapshot migration method:

Summary of available migration methods

The following table summarizes the common migration methods that are available to migrate small and large-scale, self-managed MySQL databases efficiently to Amazon RDS for MySQL or Aurora MySQL. For more information, see the Amazon Aurora Migration Handbook.

Source database Target database Suggested migration tool options
Self-managed MySQL Amazon RDS for MySQL or Aurora MySQL Percona XtraBackup (preferred)
mysqldump or mydumper/myloader
AWS DMS
Amazon RDS for MySQL Aurora MySQL Aurora read replica (preferred)
Amazon RDS snapshot restore
MySQL compatible (for example, MariaDB) Amazon RDS for MySQL or Aurora MySQL Logical migration using native tools (preferred)
AWS DMS

Conclusion

Identifying and choosing the right tool for migration can greatly reduce the cost and complexity of database migrations. In this post, we discussed different methods to migrate data from a self-managed MySQL-compatible database to Amazon RDS for MySQL or Aurora MySQL. In the next post, we dive deep into the best practices of migration tools that we covered in this post. Leave your comments and feedback in the comments section.


About the Authors

Surendar Munimohan is a Sr. Database Solutions Architect with Amazon Web Services.

Vijay Karumajji is a Database Solutions Architect with Amazon Web Services. He works with our customers to provide guidance and technical assistance on database projects, helping them improve the value of their solutions when using AWS.