AWS Partner Network (APN) Blog
Simplify Mission-Critical Workloads by Migrating to CockroachDB with AWS DMS
By Oliver Tan, Member of Technical Staff – Cockroach Labs
By Ryan Kuo, Staff Technical Writer – Cockroach Labs
By Pranav Deshmukh, Sr. Partner Solutions Architect – AWS
Anyone who’s migrated from one platform to another knows that migrations can be challenging. Database migrations, in particular, take a tremendous amount of technical and logistical preparation, time, energy, troubleshooting, and optimization. All the while, there’s business-level pressure to get the application up and running as soon as possible.
CockroachDB is a cloud-native, distributed SQL database designed for applications with data-intensive workloads. CockroachDB’s migration toolset, known as MOLT, makes it easier and faster to migrate from other databases to CockroachDB by simplifying the end-to-end migration lifecycle.
AWS Database Migration Service (AWS DMS) is a managed migration and replication service that helps move database and analytics workloads to Amazon Web Services (AWS) quickly, securely, and with minimal downtime and zero data loss. AWS DMS can now be used with CockroachDB as a target database.
Note that CockroachDB is not officially supported as a target, but it works when presented as PostgreSQL.
In this post, we’ll explore what CockroachDB is, describe how AWS DMS can help migrate data to CockroachDB, and walk through an example migration.
Cockroach Labs is an AWS Data and Analytics Competency Partner and AWS Marketplace Seller. Cockroach Labs is the creator of CockroachDB, which is in use at some of the world’s largest enterprises and some of the largest companies in banking, retail, and media.
A Short CockroachDB Primer
CockroachDB is a PostgreSQL-compatible database backed by a highly scalable distributed backend. Using the PostgreSQL dialect you know and love, you can also enjoy features found in powerful distributed systems:
- Scale fast: Data is sharded, enabling elastic scaling to fit storage and compute needs.
- Disaster recovery: The database will continue to operate if nodes, availability zones, or whole regions are taken down.
- Thrive anywhere: With powerful multi-region abstractions, data at the row level can be controlled to comply with regional privacy laws.
Figure 1 – Data is sharded while presenting a single logical database.
Migrating Mission-Critical Workloads with AWS DMS
Traditionally, database migrations are performed by taking a database dump from the source database with the relevant database tool (such as mysqldump or pg_dump), translating that dump into the target database’s SQL dialect, and then ingesting the data into the target database.
This approach has significant challenges, especially for large databases:
- Large databases produce large dumps. These can be cumbersome to manage, especially when database dumps are split into smaller pieces.
- Translating dumps from a source to a target can be tricky, resource-intensive, and error-prone.
- Database dumps take a snapshot at a certain moment in time, and they can take time to complete. Ingesting the dump can also take time, which represents downtime on your database that your mission-critical workload can’t handle.
- It’s worth noting that you can use logical feeds (PostgreSQL logical replication, MySQL binary logs) to stream the changes made between the time the dump is taken and the time it finishes loading into the target. This minimizes downtime, but requires even more work.
Running mission-critical applications means keeping your database online as much as possible. When migrating databases, the above compromises are probably not an option. Here’s where AWS DMS can help.
Figure 2 – Database migration overview.
Using AWS DMS is simple—just tell AWS DMS your database credentials and it handles the schema and data migration.
Even better, you can migrate from a large selection of database engines—such as MySQL, PostgreSQL, and Oracle—to a target database that doesn’t even have to use the same database engine. This makes it a great companion for migrating your application into CockroachDB.
Under the hood, AWS DMS does the following:
- Iterates over all your source tables and converts the schema to an equivalent form for the target database.
- Initiates a snapshot of all data in your source database, translates the rows if the source and target use different dialects, and loads the data into the target database.
- Replicates any subsequent changes on the source database to the target database using change data capture (CDC) over the source’s logical change feed. If starting from an initial load, AWS DMS knows how to reconcile new changes and move the data over.
These steps together make for a painless, error-free, and minimal-downtime migration. All you have to coordinate is flipping your application over to using your new target database, although you may also want to take a look at CockroachDB’s best practices for maximizing performance.
Figure 3 – Less downtime required when using CDC.
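The snapshot-then-replicate pattern described above can be illustrated with a toy model. This is only a sketch of the general idea, not how AWS DMS is implemented: the `Change` record, the `lsn` log position, and the in-memory dictionaries standing in for tables are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Change:
    """One event from the source's change feed (hypothetical shape)."""
    lsn: int                    # position in the source's change log
    op: str                     # "insert", "update", or "delete"
    key: int                    # primary key of the affected row
    row: Optional[dict] = None  # new row contents; None for deletes


def full_load(snapshot: Dict[int, dict]) -> Dict[int, dict]:
    """Copy every row of the snapshot into a fresh target table."""
    return {key: dict(row) for key, row in snapshot.items()}


def apply_changes(target: Dict[int, dict], changes: List[Change], snapshot_lsn: int) -> int:
    """Replay changes newer than the snapshot, in log order.

    Returns the last log position applied, so replication could resume there.
    """
    last_applied = snapshot_lsn
    for change in sorted(changes, key=lambda c: c.lsn):
        if change.lsn <= snapshot_lsn:
            continue  # already reflected in the snapshot
        if change.op == "delete":
            target.pop(change.key, None)
        else:
            # Inserts and updates are both applied as upserts.
            target[change.key] = dict(change.row)
        last_applied = change.lsn
    return last_applied


# Source table as of a snapshot taken at log position 100.
snapshot = {1: {"name": "alice"}, 2: {"name": "bob"}}
changes = [
    Change(lsn=99, op="update", key=1, row={"name": "alice"}),  # predates the snapshot
    Change(lsn=101, op="update", key=2, row={"name": "robert"}),
    Change(lsn=102, op="insert", key=3, row={"name": "carol"}),
    Change(lsn=103, op="delete", key=1),
]

target = full_load(snapshot)
resume_from = apply_changes(target, changes, snapshot_lsn=100)
print(target)       # {2: {'name': 'robert'}, 3: {'name': 'carol'}}
print(resume_from)  # 103
```

The application keeps writing to the source during the full load. Only the changes recorded after the snapshot position need to be replayed, which is why the cutover window can be short.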
Migrate to CockroachDB Using AWS DMS
Before migrating, create a SQL user on CockroachDB to use when creating your AWS DMS target endpoint.
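As a rough sketch, the SQL for that user might be generated as follows. The user name `dms_user`, the database `defaultdb`, and the broad `GRANT ALL` are assumptions chosen for illustration; your cluster may call for narrower privileges, so check the CockroachDB documentation before running anything like this.

```python
from typing import List


def quote_ident(name: str) -> str:
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Quote a SQL string literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def migration_user_sql(user: str, password: str, database: str) -> List[str]:
    """Build the statements that create a migration user and grant it access.

    The privilege set here is an assumption, not a documented requirement.
    """
    return [
        f"CREATE USER {quote_ident(user)} WITH PASSWORD {quote_literal(password)};",
        f"GRANT ALL ON DATABASE {quote_ident(database)} TO {quote_ident(user)};",
    ]


# Hypothetical names; run the printed statements in a CockroachDB SQL shell.
for statement in migration_user_sql("dms_user", "change-me", "defaultdb"):
    print(statement)
```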
Open AWS DMS in your AWS Management Console. Then, click Replication instances and create a replication instance to perform the AWS DMS migration. This performs the initial load of data onto the target database.
Navigate to the Endpoints tab, and create a source endpoint that points to the database you are migrating from.
Create a target endpoint that points to your CockroachDB database:
- Select PostgreSQL as the Target engine.
- Select a CockroachDB SQL user to use for the migration.
- Specify the connection parameters for your database. To find this information, refer to the CockroachDB documentation. Note that the database name must be in the form {host}.{database} if you are running a CockroachDB Serverless cluster.
Figure 4 – Setting the Endpoint configuration.
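If you prefer scripting to the console, the same target endpoint can be sketched with the AWS SDK for Python (boto3). The identifier, host, and credentials below are placeholders, and the port 26257 and `SslMode` value are assumptions about a typical CockroachDB cluster; treat this as a starting point rather than a reference configuration.

```python
from typing import Any, Dict


def cockroachdb_target_endpoint(
    identifier: str,
    host: str,
    database: str,
    user: str,
    password: str,
    port: int = 26257,  # CockroachDB's default SQL port (assumed for your cluster)
) -> Dict[str, Any]:
    """Build keyword arguments for the DMS CreateEndpoint API call."""
    return {
        "EndpointIdentifier": identifier,
        "EndpointType": "target",
        "EngineName": "postgres",  # CockroachDB is presented to DMS as PostgreSQL
        "ServerName": host,
        "Port": port,
        "DatabaseName": database,
        "Username": user,
        "Password": password,
        "SslMode": "require",  # assumption: the cluster requires TLS
    }


def create_endpoint(params: Dict[str, Any]) -> str:
    """Create the endpoint and return its ARN. Needs boto3 and AWS credentials."""
    import boto3  # third-party; imported here so the builder works without it

    response = boto3.client("dms").create_endpoint(**params)
    return response["Endpoint"]["EndpointArn"]


# Placeholder values for illustration only.
params = cockroachdb_target_endpoint(
    "crdb-target", "my-cluster.example.com", "defaultdb", "dms_user", "change-me"
)
print(params["EngineName"], params["Port"])
```

Keeping the parameter builder separate from the API call makes it easy to review the configuration before anything is created in your account.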
After creating the source and target endpoints, click Database migration tasks.
Create a replication task to migrate data from the source database to the target:
- Under Task configuration, select the Replication instance, Source database endpoint, and Target database endpoint you created earlier.
- For Migration type, choose Migrate existing data and replicate ongoing changes. This instructs AWS DMS to perform the initial load of data, and then continuously replicate any changes to the data to CockroachDB.
- Under Task settings, we recommend selecting the Enable CloudWatch logs option for detailed log messages about your migration. Set Target Load to Detailed debug.
- Under Table mappings, click Add new selection rule and specify a schema and tables to include in the migration.
Figure 5 – Configuring the task.
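The task configuration above maps onto the parameters of the DMS CreateReplicationTask API. The following sketch builds those parameters in Python; the ARNs, task identifier, and schema name are placeholders, and the settings shown are only the subset discussed in this post.

```python
import json
from typing import Any, Dict


def selection_rule(rule_id: int, schema: str, table: str = "%") -> Dict[str, Any]:
    """Build a table-mapping selection rule; "%" matches every table in the schema."""
    return {
        "rule-type": "selection",
        "rule-id": str(rule_id),
        "rule-name": f"include-{schema}",
        "object-locator": {"schema-name": schema, "table-name": table},
        "rule-action": "include",
    }


def replication_task_params(
    identifier: str, source_arn: str, target_arn: str, instance_arn: str, schema: str
) -> Dict[str, Any]:
    """Build keyword arguments for the DMS CreateReplicationTask API call."""
    table_mappings = {"rules": [selection_rule(1, schema)]}
    task_settings = {
        "Logging": {
            "EnableLogging": True,  # the "Enable CloudWatch logs" option
            "LogComponents": [
                # "Target Load" set to "Detailed debug"
                {"Id": "TARGET_LOAD", "Severity": "LOGGER_SEVERITY_DETAILED_DEBUG"},
            ],
        }
    }
    return {
        "ReplicationTaskIdentifier": identifier,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        # "Migrate existing data and replicate ongoing changes"
        "MigrationType": "full-load-and-cdc",
        "TableMappings": json.dumps(table_mappings),
        "ReplicationTaskSettings": json.dumps(task_settings),
    }


# Placeholder ARNs for illustration only.
task = replication_task_params(
    "to-cockroachdb", "arn:source", "arn:target", "arn:instance", schema="public"
)
print(task["MigrationType"])
```

With boto3 installed and credentials configured, the resulting dictionary could be passed as `boto3.client("dms").create_replication_task(**task)`.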
After creating the task, watch the migration happen. Follow the status of the migration in the Table statistics display for the database migration task.
Figure 6 – Monitoring table statistics.
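The same statistics are available programmatically through the DMS DescribeTableStatistics API. Here is a minimal sketch that summarizes one page of such a response; the sample data is hand-written for illustration, not real output, and a real script would also need to follow the pagination marker.

```python
from typing import Any, Dict, List


def pending_tables(response: Dict[str, Any]) -> List[str]:
    """Return schema-qualified names of tables that have not finished loading."""
    return [
        f"{t['SchemaName']}.{t['TableName']}"
        for t in response["TableStatistics"]
        if t.get("TableState") != "Table completed"
    ]


def rows_loaded(response: Dict[str, Any]) -> int:
    """Total rows copied so far during the full load, across all tables."""
    return sum(t.get("FullLoadRows", 0) for t in response["TableStatistics"])


# Hand-written sample shaped like a DescribeTableStatistics response.
sample = {
    "TableStatistics": [
        {"SchemaName": "public", "TableName": "users",
         "FullLoadRows": 1200, "TableState": "Table completed"},
        {"SchemaName": "public", "TableName": "orders",
         "FullLoadRows": 300, "TableState": "Full load"},
    ]
}

print(pending_tables(sample))  # ['public.orders']
print(rows_loaded(sample))     # 1500
```

With boto3, a live response could be fetched with `boto3.client("dms").describe_table_statistics(ReplicationTaskArn=...)` and passed to the same functions.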
When the migration is complete, simply point your application over to the new database. Congratulations, you are now running on CockroachDB!
For detailed instructions on setting up an AWS DMS migration, see the CockroachDB documentation.
Conclusion
AWS DMS can be used to seamlessly migrate mission-critical workloads. With minimal-to-zero downtime, you can migrate over to CockroachDB in a few simple steps and enjoy the best of what distributed SQL has to offer.
Cockroach Labs has overseen numerous migrations covering billions of rows—from fledgling startups to large enterprise data leaders—all with minimal downtime and a whole lot of effort saved.
If you’re ready to give CockroachDB a red-hot go, check out CockroachDB in AWS Marketplace.
Cockroach Labs – AWS Partner Spotlight
Cockroach Labs is an AWS Partner and the creator of CockroachDB, which is a cloud-native distributed SQL database in use at some of the world’s largest enterprises and some of the largest companies in banking, retail, and media.