AWS Database Blog
Perform a side-by-side upgrade in AWS DMS by moving tasks to minimize business impact
You can use AWS Database Migration Service (AWS DMS) for many use cases, such as migrating from legacy or on-premises databases to managed cloud services, replicating ongoing data changes from online transaction processing (OLTP) databases such as Amazon Relational Database Service (Amazon RDS) to an online analytical processing (OLAP) data warehouse such as Amazon Redshift, building data lakes, and performing real-time processing on change data from data stores.
AWS DMS regularly releases new versions with new features, bug and security fixes, and performance improvements. If you’re using AWS DMS in critical environments, especially for ongoing data replication use cases, you need to maintain business continuity and minimize business impact while performing AWS DMS version upgrades.
In this post, we discuss common AWS DMS version upgrade approaches and dive into a side-by-side upgrade approach by moving a task to minimize business impact.
AWS DMS version upgrade options
A simple AWS DMS version upgrade approach is an in-place upgrade, where you modify the AWS DMS replication instance in place and change the engine version to a newer version. This is a suitable upgrade approach for AWS DMS replication instances with fewer replication tasks or ones that can tolerate a certain amount of downtime.
If you have change data capture (CDC) tasks with supported database endpoints, you can stop the task, create a new replication instance with a new version of the AWS DMS engine, and create a CDC-only task with the recovery checkpoint. However, if you need to reload the tables, you need to create a separate full load plus CDC task.
If the AWS DMS version is deprecated, AWS will upgrade the replication instances on the deprecated version to the default version during the next maintenance window.
If you don’t want to manage replication instances, you can consider AWS DMS Serverless, which automatically sets up, scales, and manages migration resources to make your database migrations straightforward and more cost-effective. AWS DMS Serverless uses the default engine version.
To minimize downtime and maintain a stable replication environment for your business-critical workload, you can use a side-by-side upgrade approach by moving replication tasks from the existing instance to a new instance. This is a safer upgrade approach, especially for replication instances with a large number of tasks, because you upgrade tasks in a controlled manner and a failure affects only the task being moved. For example, the side-by-side approach is preferable to an in-place upgrade when stopping AWS DMS CDC tasks for a long time would cause high storage consumption on a PostgreSQL source.
Solution overview
In this post, we show how to use the move task option to upgrade your AWS DMS replication instance version using the AWS Command Line Interface (AWS CLI), AWS Management Console, or AWS DMS API. The following diagram illustrates the solution architecture.
Prerequisites
Before you get started, we recommend reviewing the information regarding moving a task and the move-replication-task command.
Performing an upgrade with API calls allows you to perform cascading upgrades of your AWS DMS replication tasks, which makes sure all your replication tasks in the replication instance are upgraded in a controlled manner to minimize business impact. For a seamless upgrade, complete the following before upgrading or moving the replication task:
- Create a new replication instance using the latest AWS DMS version. For optimal performance, make sure the new instance has the same instance class as the existing AWS DMS replication instance.
- Store the replication task ARN and new replication instance ARN details. The following are some sample values:
- Stop the tasks that will be moved to the new instance:
- When the task is stopped, capture the recovery checkpoint from this task:
The recovery checkpoint can be useful if the task fails for some reason and you need to restart the task from the point where it last stopped.
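The stop and checkpoint-capture steps can be sketched with Boto3 as follows. This is a minimal sketch rather than the post's original commands: the helper name stop_and_capture_checkpoint, the polling interval, and the timeout are our own assumptions, and dms is expected to be a client created with boto3.client("dms").

```python
import time


def stop_and_capture_checkpoint(dms, task_arn, poll_seconds=15, timeout_seconds=900):
    """Stop a replication task, wait until it is stopped, and return its
    recovery checkpoint (None if the task does not report one).

    `dms` is a Boto3 DMS client. The helper name and timing defaults are
    illustrative assumptions, not part of the AWS DMS API.
    """
    # Note: this call fails if the task is not in a state that can be stopped.
    dms.stop_replication_task(ReplicationTaskArn=task_arn)

    deadline = time.monotonic() + timeout_seconds
    while True:
        response = dms.describe_replication_tasks(
            Filters=[{"Name": "replication-task-arn", "Values": [task_arn]}],
            WithoutSettings=True,
        )
        task = response["ReplicationTasks"][0]
        if task["Status"] == "stopped":
            # Store this value somewhere durable before moving the task.
            return task.get("RecoveryCheckpoint")
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Task {task_arn} did not stop in time (status: {task['Status']})"
            )
        time.sleep(poll_seconds)
```

Keeping the returned checkpoint outside AWS DMS (for example, in your runbook or a parameter store) means it is still available if the task itself has to be recreated.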
After you complete these steps, you can move your AWS DMS replication task to a new replication instance running the newer version. This feature is available through the AWS DMS API, the AWS CLI, and the console. We go over the steps for each option in the following sections.
Move a replication task using the AWS CLI
Complete the following steps to move your task using the AWS CLI:
- After the replication task ARN and replication instance ARN are stored and you have completed the prerequisites, enter the following AWS CLI command to move the replication task:
The response of this API call is the replication task object.
- Retrieve or monitor the task status using the describe-replication-task AWS CLI command:
This should return a response like the following:
The target replication task status changes to Stopped when the move is complete.
- Resume the replication task to continue the migration:
At this point, the upgrade is complete. The replication task resumes in the upgraded AWS DMS replication instance.
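If you script the move, the monitoring step above amounts to polling the task status until it leaves the moving state. The following is a minimal Boto3 sketch of that polling loop under our own assumptions: the helper name wait_for_move and the timing defaults are hypothetical, and dms is a client created with boto3.client("dms").

```python
import time


def wait_for_move(dms, task_arn, poll_seconds=15, timeout_seconds=1800):
    """Poll a replication task until its move finishes.

    Returns the task description once the status is "stopped" and raises
    RuntimeError if the task enters "failed-move". The helper name and
    timing defaults are illustrative assumptions.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        task = dms.describe_replication_tasks(
            Filters=[{"Name": "replication-task-arn", "Values": [task_arn]}],
            WithoutSettings=True,
        )["ReplicationTasks"][0]
        status = task["Status"]

        if status == "stopped":
            return task
        if status == "failed-move":
            raise RuntimeError(f"Move failed for task {task_arn}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Task {task_arn} is still in status {status}")
        time.sleep(poll_seconds)
```

Before resuming, you may also want to confirm that the ReplicationInstanceArn in the returned description is the new instance, because a task that was never moved also reports a stopped status.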
Move a replication task using the console
To use the AWS DMS console to transfer your replication task to another instance, complete the following steps:
- After the replication task is stopped, on the AWS DMS console, choose Database migration tasks in the navigation pane.
- Choose the task that you want to move.
- On the Actions menu, choose Move.
- For Replication instance, choose the new instance to host the task.
- Choose Move database migration task.
- Resume the replication task.
Move a replication task using the AWS DMS API
In this example, we present a sample API call to move the replication task to a new instance using the Boto3 client. The initial steps of stopping the task and storing the replication instance, replication task, and checkpoint details remain the same. See the following code:
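The original listing is not reproduced here, so the following is a minimal sketch of what the move and resume calls could look like with Boto3. The function names move_task and resume_task are our own; move_replication_task and start_replication_task are the Boto3 DMS client operations, and dms is assumed to be a client created with boto3.client("dms").

```python
def move_task(dms, task_arn, target_instance_arn):
    """Request that a stopped task be moved to another replication instance.

    Returns the replication task object from the API response.
    """
    response = dms.move_replication_task(
        ReplicationTaskArn=task_arn,
        TargetReplicationInstanceArn=target_instance_arn,
    )
    return response["ReplicationTask"]


def resume_task(dms, task_arn):
    """Resume a moved task from where it stopped.

    Call this only after the task status has returned to "stopped",
    which indicates that the move is complete.
    """
    response = dms.start_replication_task(
        ReplicationTaskArn=task_arn,
        StartReplicationTaskType="resume-processing",
    )
    return response["ReplicationTask"]
```

Both helpers take the client as an argument, so you can exercise them against a stub in a lower environment before pointing them at production tasks.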
Considerations
With the introduction of moving replication tasks, two new states are introduced:
- Moving – The task is in the process of being moved to another replication instance. The replication task remains in this state until the move is complete. While a task is being moved, the only operation allowed on it is deleting the task.
- Failed-Move – A replication task enters this state when the move fails for any reason, such as not having enough storage space on the target replication instance. A replication task can be started, modified, moved, or deleted when in this state.
If the task is in the failed-move state and you are unable to resume the task from the new replication instance, consider using the recovery checkpoint captured in the previous step to create an ongoing replication (CDC-only) task and restart it from the checkpoint information captured. For more information, refer to How to work with native CDC support in AWS DMS and Using a checkpoint as a CDC start point.
Even with a failed-move state, you are able to perform the move replication task again or resume the task from the older replication instance. The recovery checkpoint is useful in some scenarios where the task is unable to resume after it is moved to the new instance.
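The checkpoint-based fallback described above could be sketched with Boto3 as follows. This is a hedged illustration, not a prescribed recovery procedure: the helper names are hypothetical, stopped_task is assumed to be the task description returned by describe_replication_tasks (called without WithoutSettings so that the task settings are included), and checkpoint is the recovery checkpoint you captured earlier.

```python
def build_cdc_task_params(stopped_task, checkpoint, new_task_id, instance_arn):
    """Build create_replication_task parameters for a CDC-only task that
    starts from a recovery checkpoint, reusing the endpoints and table
    mappings of the original task."""
    params = {
        "ReplicationTaskIdentifier": new_task_id,
        "SourceEndpointArn": stopped_task["SourceEndpointArn"],
        "TargetEndpointArn": stopped_task["TargetEndpointArn"],
        "ReplicationInstanceArn": instance_arn,
        "MigrationType": "cdc",
        "TableMappings": stopped_task["TableMappings"],
        # The recovery checkpoint is passed as the CDC start position.
        "CdcStartPosition": checkpoint,
    }
    # Task settings are absent when the description was fetched without them.
    settings = stopped_task.get("ReplicationTaskSettings")
    if settings:
        params["ReplicationTaskSettings"] = settings
    return params


def create_cdc_task_from_checkpoint(dms, stopped_task, checkpoint, new_task_id, instance_arn):
    """Create (but do not start) the CDC-only replacement task."""
    params = build_cdc_task_params(stopped_task, checkpoint, new_task_id, instance_arn)
    return dms.create_replication_task(**params)["ReplicationTask"]
```

The sketch only creates the task. Review the copied settings and table mappings, then start the new task once it is ready.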
AWS DMS version upgrade best practices
Consider the following best practices when upgrading your AWS DMS version:
- Test in a lower environment. The test environment should match the production environment in terms of both data volumes and representative datasets.
- Create a Multi-AZ deployment for high availability and failover support. In the event of an Availability Zone failure, a replication instance from another Availability Zone will step in and continue data replication.
- Have proper monitoring in place for the AWS DMS tasks, and notify stakeholders of any upgrade errors. For more information on setting up monitoring at the task level, refer to Automating database migration monitoring with AWS DMS. You should understand important replication instance and task metrics such as FreeableMemory, SwapUsage, CDCLatencySource, and CDCLatencyTarget.
- If there is a table error, check the control table (dmslogs.awsdms_apply_exceptions) on the target database.
- For business-critical tables, review the table schema and data, and set proper error handling task settings.
- For large tables, consider creating more than one task before conducting the upgrade, to reduce the effect of a single task issue.
- If the move fails, capture the recovery checkpoint of the AWS DMS task and create a new CDC-only task with the recovery checkpoint.
- Consider using the automatic version upgrade option for use cases such as using the current default engine version when you create a replication instance.
- Enable AWS DMS data validation for supported source and target endpoints to verify that your data was migrated accurately, especially in production environments.
Conclusion
AWS DMS offers multiple ways to assist with data movement and migration. When upgrading to a new version of AWS DMS, you can choose from different options depending on your configuration and business requirements. You could opt for an in-place upgrade, use recovery checkpoints with CDC tasks, use AWS DMS Serverless to manage the migration, or move AWS DMS tasks to minimize downtime with a side-by-side upgrade approach. The side-by-side approach can be done via the console, AWS CLI, or AWS SDK (Boto3). You can achieve a successful upgrade by following the best practices we discussed and by balancing the different upgrade behaviors with your business impact and technical requirements.
If you have questions or suggestions, please leave a comment below.
About the Authors
Eddie Yao is an Enterprise Support Lead at AWS. He helps AWS customers build and run production workloads at scale in the cloud. With over a decade of experience in tech, from web application engineering and consulting to digital platform solutions architecture, Eddie currently focuses on the Media & Entertainment industry and AI/ML (including generative AI).
Aritra Biswas is a Senior Cloud Support DBE with Amazon Web Services and Subject Matter Expert for AWS DMS. He has over a decade of experience in working with relational databases. At AWS, he works with service teams, Technical Account Managers, and Solutions Architects, and helps customers migrate database workloads to AWS. Outside of work, he enjoys playing racquetball and spending time with family and friends.
Scott St. Martin is a Solutions Architect at AWS who is passionate about helping customers build modern applications. Scott uses his decade of experience in the cloud to guide organizations in adopting best practices around operational excellence and reliability, with a focus on the manufacturing and financial services spaces. Outside of work, Scott enjoys traveling, spending time with family, and playing piano.