How can I use the DMS batch apply feature to improve CDC replication performance?

I'm running a full load and a change data capture (CDC) AWS Database Migration Service (AWS DMS) task. The source latency isn't high, but the target latency is high or it's increasing. How can I speed up the CDC replication phase?

Short description

AWS DMS uses the following methods to replicate data in the change data capture (CDC) phase:

Transactional apply
Batch apply

By default, the AWS DMS CDC process is single-threaded (transactional apply). This is the same method used for SQL replication in other online transactional processing (OLTP) database engines. DMS CDC replication depends on the source database transaction logs. During the ongoing replication phase, DMS applies changes using the transactional apply method, as follows:

DMS reads changes from the source transaction log into the memory of the replication DB instance.
DMS translates changes, and then passes them on to a sorter component.
The sorter component sorts transactions in commit order, and then forwards them to the target, sequentially.
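
The transactional apply steps above can be pictured with a minimal sketch. It is an illustration only, not the DMS implementation: the transaction layout, the commit_seq field, and the apply callback are hypothetical names chosen for this example.

```python
def transactional_apply(transactions, apply):
    """Apply every change one at a time, in source commit order.

    Hypothetical model: each transaction is a dict with a "commit_seq"
    number and a list of "changes"; apply() stands in for the target write.
    """
    # The sorter step: order transactions by commit sequence.
    for txn in sorted(transactions, key=lambda t: t["commit_seq"]):
        # Sequential apply: one change after another, on a single thread.
        for change in txn["changes"]:
            apply(change)


# Transactions arrive out of order; the sketch restores commit order.
transactions = [
    {"commit_seq": 2, "changes": ["UPDATE t SET qty = 2 WHERE id = 1"]},
    {"commit_seq": 1, "changes": ["INSERT INTO t (id, qty) VALUES (1, 1)"]},
]
applied = []
transactional_apply(transactions, applied.append)
print(applied)
```

Because every change passes through the same sequential loop, the time to drain a backlog grows with the number of changes, which is why target latency can climb under a heavy source workload.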

If the rate of change is high on the source DB, then this process can take time. You might see a spike in CDC target latency metrics when DMS receives a high incoming workload from the source DB.

Because DMS uses a single-threaded replication method to process CDC changes, it provides the task-level setting BatchApplyEnabled to process changes on the target quickly in batches. BatchApplyEnabled is useful if you have a high workload on the source DB and a task with high target CDC latency. By default, BatchApplyEnabled is deactivated. You can activate it using the AWS Command Line Interface (AWS CLI) or the AWS DMS console.

How batch apply works

If you run a task with BatchApplyEnabled, DMS processes changes in the following way:

DMS collects the changes in a batch from the source DB transaction logs.
DMS creates a table called the net changes table, with all changes from the batch.
This table resides in the memory of the replication DB instance, and is passed on to the target DB instance.
DMS applies a net changes algorithm that nets out all changes from the net changes table to the actual target table.

For example, if you run a DMS task with BatchApplyEnabled, and you have a new row insert, ten updates to that row, and a delete for that row in a single batch, then DMS nets out all these transactions and doesn’t carry them over. It does this because the row is eventually deleted and no longer exists. This process reduces the number of actual transactions that are applied on the target.

BatchApplyEnabled applies the net changes algorithm at the row level of a table, within a batch of a particular task. So, if the source database has frequent changes (updates, deletes, and inserts, or a combination of these) on the same rows, then you get the most benefit from BatchApplyEnabled because it minimizes the changes to be applied to the target. If every change in the collected batch is unique (update, delete, or insert changes for different rows), then the net changes algorithm can't filter any events. As a result, all batch events are applied on the target in batch mode. Tables must have either a primary key or a unique key for batch apply to work.
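
To make the netting idea concrete, here is a minimal sketch of a row-level net changes pass. It is a simplified model written for illustration, not the algorithm DMS actually runs: the function name net_changes, the (operation, key, values) tuple layout, and the "replace" marker are all assumptions made for this example.

```python
def net_changes(batch):
    """Collapse a batch to at most one pending change per primary key.

    Hypothetical model: each change is (op, key, values), where op is
    "insert", "update", or "delete" and key is the primary key value.
    """
    net = {}
    for op, key, values in batch:
        if key not in net:
            net[key] = (op, dict(values))
            continue
        prev_op, prev_values = net[key]
        if op == "update" and prev_op in ("insert", "update", "replace"):
            # Fold the update into the earlier change for the same row.
            net[key] = (prev_op, {**prev_values, **values})
        elif op == "delete" and prev_op == "insert":
            # Row was created and removed inside the batch: nothing to apply.
            del net[key]
        elif op == "delete":
            net[key] = ("delete", {})
        elif op == "insert" and prev_op == "delete":
            # Assumption of this sketch: delete + insert becomes one replace.
            net[key] = ("replace", dict(values))
        else:
            raise ValueError(f"unexpected {op} after {prev_op} for key {key}")
    return net


# Row 1: one insert, ten updates, one delete. Rows 2 and 3: one change each.
batch = [("insert", 1, {"qty": 0})]
batch += [("update", 1, {"qty": n}) for n in range(1, 11)]
batch += [("delete", 1, {})]
batch += [("insert", 2, {"qty": 5}), ("update", 3, {"qty": 7})]

result = net_changes(batch)
print(len(batch), "changes collected,", len(result), "to apply")
```

In this model the twelve changes to row 1 cancel out entirely, while the single changes to rows 2 and 3 pass through unchanged. That mirrors the two cases described above: repeated changes to the same row shrink the batch, and changes that each touch a different row do not. The lookup by key is also why the sketch, like batch apply, needs a primary key or unique key to identify a row.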

DMS also provides the BatchApplyPreserveTransaction setting for change-processing tuning. If you activate BatchApplyEnabled, then BatchApplyPreserveTransaction turns on by default. When BatchApplyPreserveTransaction is true, transactional integrity is preserved, and a batch is guaranteed to contain all the changes within a transaction from the source. This setting applies only to Oracle target endpoints.

Note: Pay attention to the advantages and disadvantages of this setting. When the BatchApplyPreserveTransaction setting is true, DMS captures the entire long-running transaction in the memory of the replication DB instance. It does this according to the task settings MemoryLimitTotal and MemoryKeepTime, and swaps as needed, before it sends changes to the net changes table. When the BatchApplyPreserveTransaction setting is false, changes from a single transaction can span across multiple batches. This can lead to data loss when partially applied, for example, due to target database unavailability.
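
As a rough illustration of where these settings sit in the task settings JSON, the following sketch builds a settings fragment with Python's json module. The setting names come from this article and the DMS task settings documentation; the numeric values for MemoryLimitTotal and MemoryKeepTime are placeholders for this example only, so check the Change processing tuning settings documentation for defaults and units before you use them.

```python
import json

# Fragment of DMS task settings. Only the keys shown here are set; the
# numeric values are illustrative placeholders, not recommendations.
task_settings = {
    "TargetMetadata": {
        "BatchApplyEnabled": True,
    },
    "ChangeProcessingTuning": {
        # Keep all changes of a source transaction in the same batch.
        "BatchApplyPreserveTransaction": True,
        # Memory limits that govern how long transactions are held
        # in memory before they are swapped.
        "MemoryLimitTotal": 1024,
        "MemoryKeepTime": 60,
    },
}

settings_json = json.dumps(task_settings, indent=2)
print(settings_json)
```

The printed JSON can be pasted into the task's JSON editor or passed to the modify-replication-task command.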

For more information about DMS latency and the batch apply process, see Part 2 and Part 3 of the Debugging your AWS DMS migrations blogs.

Use cases for batch apply

You can use batch apply in the following circumstances:

The task captures a high number of transactions from the source, and this causes target latency.
The task has a workload from the source that is a combination of inserts, updates, and deletes on the same rows.
There is no requirement to keep strict referential integrity on the target (foreign keys are disabled).

Limitations

Batch apply currently has the following limitations:

The Amazon Redshift target uses batch apply, by default. The Amazon Simple Storage Service (Amazon S3) target is forced to use transactional apply.
Batch apply works only on tables with a primary key or unique index. For tables with no primary key or unique index, batch apply applies only the inserts in bulk mode, and performs updates and deletes one by one. If the table has a primary key or unique index but you observe a switch to one-by-one mode, see How can I troubleshoot why Amazon Redshift switched to one-by-one mode because a bulk operation failed during an AWS DMS task?
When LOB columns are included in the replication, you can use BatchApplyEnabled only in limited LOB mode. For more information, see Target metadata task settings.
When BatchApplyEnabled is set to true, AWS DMS generates an error message if a target table has a unique constraint.

Resolution

Note: If you receive errors when running AWS Command Line Interface (AWS CLI) commands, make sure that you’re using the most recent AWS CLI version.

BatchApplyEnabled is disabled by default. You can activate this setting using either the AWS CLI or the AWS DMS console. Complete the following setup tasks on your system before you enable batch apply:

Install and configure the latest version of the AWS CLI.
Create an AWS Identity and Access Management (IAM) user with programmatic access.

Check the batch setting status of an existing task

Open the AWS DMS console.
From the navigation panel, choose Database migration tasks.
Choose your task, and then choose Task settings (JSON). In the JSON, BatchApplyEnabled is listed as disabled (false).
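
If you save or export the task settings JSON, the same check can be scripted. The sketch below is a minimal example under that assumption: the helper name batch_apply_enabled and the sample JSON string are hypothetical, and the helper treats a missing key as false because batch apply is off by default.

```python
import json


def batch_apply_enabled(task_settings_json):
    """Return True if the task settings JSON turns batch apply on.

    Hypothetical helper: expects the task settings as a JSON string.
    A missing section or key is treated as False (the default).
    """
    settings = json.loads(task_settings_json)
    return bool(settings.get("TargetMetadata", {}).get("BatchApplyEnabled", False))


# Sample settings string, shortened for the example.
sample = '{"TargetMetadata": {"BatchApplyEnabled": false, "SupportLobs": true}}'
print(batch_apply_enabled(sample))
```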

Activate batch setting using the AWS CLI

Open a terminal on the system where the AWS CLI is installed.
Run the aws configure command to start the AWS CLI configuration prompts.
Enter your AWS access key ID, and then press Enter.
Enter your AWS secret access key, and then press Enter.
Enter the Region name of your DMS resources, and then press Enter.
Enter the output format, and then press Enter.
Run the modify-replication-task command with the task ARN and the batch apply setting.

Note: Confirm that the task is in the stopped state before you modify it. Replace the ARN in the following command with the ARN of your task, and then run the command to change the task setting.

aws dms modify-replication-task --replication-task-arn arn:aws:dms:us-east-1:123456789123:task:4VUCZ6ROH4ZYRIA25M3SE6NXCM --replication-task-settings "{\"TargetMetadata\":{\"BatchApplyEnabled\":true}}"

After the command runs successfully, open the DMS console and check the batch setting status of your task again. BatchApplyEnabled is now listed as enabled (true) in Task settings (JSON).

You can now start the DMS task and observe the migration performance.
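
The escaped quotation marks in the modify-replication-task command are easy to get wrong by hand. As one possible approach, the sketch below assembles the same command in Python and lets json.dumps and shlex.join handle the quoting. The helper name build_modify_command is hypothetical, and the ARN is the placeholder used in this article; the sketch only prints the command and doesn't run it.

```python
import json
import shlex


def build_modify_command(task_arn, enabled=True):
    """Build the argument list for aws dms modify-replication-task.

    Hypothetical helper: returns the arguments as a list so that no
    manual escaping of the JSON settings string is needed.
    """
    settings = json.dumps({"TargetMetadata": {"BatchApplyEnabled": enabled}})
    return [
        "aws", "dms", "modify-replication-task",
        "--replication-task-arn", task_arn,
        "--replication-task-settings", settings,
    ]


# Placeholder ARN from this article; replace it with your task ARN.
argv = build_modify_command(
    "arn:aws:dms:us-east-1:123456789123:task:4VUCZ6ROH4ZYRIA25M3SE6NXCM"
)
# Print a shell-safe version of the command for review before running it.
print(shlex.join(argv))
```

To run the command from the script instead of printing it, you can pass the argv list to subprocess.run. As with the CLI steps, the task must be stopped first.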

Activate batch setting using the AWS DMS Console

Open the AWS DMS console.
From the navigation panel, choose Database migration tasks.
Choose your task, and then choose Modify.
From the Task settings section, choose JSON editor.
Modify the task settings that you want to change. For example, from the TargetMetadata section, change BatchApplyEnabled to true (default is false).
Choose Save to modify the task.

Verify that the changes have taken effect by following these steps:

From the Task list page, choose the task you modified.
From the Overview details tab, expand Task settings (JSON).
Review the task settings for the task.

Troubleshoot high CDCLatencyTarget after running the task in batch mode

If CDCLatencyTarget is high after you run the task in batch mode, then the latency might be caused by the following:

Long-running transactions on the target because of missing primary or secondary indexes
Insufficient resources on the target to process the workload
High resource contention on the DMS replication instance

Follow the DMS best practices to troubleshoot these issues.

Related information

Monitoring AWS DMS tasks

How to script a database migration

Automating AWS DMS migration tasks

How do I create source or target endpoints using AWS DMS?

Change processing tuning settings
