How can I troubleshoot high target latency on an AWS DMS task?

Last updated: 2019-10-02

I'm running a full load and change data capture (CDC) AWS Database Migration Service (AWS DMS) task. The source latency is not high, but the target latency is high or it is increasing. How can I troubleshoot high target latency on an AWS DMS migration task?

Short Description

You can use Amazon CloudWatch metrics to monitor your replication task's statistics. Specifically, monitor CDCLatencySource and CDCLatencyTarget to identify replication latency in the ongoing replication (CDC) phase. The CDCLatencySource metric is the latency between the source and the replication instance. The CDCLatencyTarget metric is the latency between the replication instance and the target. For more information, see Replication Task Metrics.
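As a rough sketch, both metrics can be pulled with the AWS CLI. The helper name dms_cdc_latency and all identifiers in the usage lines are hypothetical. The sketch assumes that the metrics are published in the AWS/DMS namespace with the ReplicationInstanceIdentifier and ReplicationTaskIdentifier dimensions; confirm the exact dimension values in the CloudWatch console for your task.

```shell
# Hypothetical helper: fetch one CDC latency metric for a task.
# Assumption: the metric is keyed by the replication instance identifier
# and the task's resource identifier in the AWS/DMS namespace.
dms_cdc_latency() {
  local metric="$1" instance_id="$2" task_id="$3" start="$4" end="$5"
  aws cloudwatch get-metric-statistics \
    --namespace AWS/DMS \
    --metric-name "$metric" \
    --dimensions "Name=ReplicationInstanceIdentifier,Value=$instance_id" \
                 "Name=ReplicationTaskIdentifier,Value=$task_id" \
    --start-time "$start" \
    --end-time "$end" \
    --period 300 \
    --statistics Average Maximum \
    --output table
}

# Example usage (placeholder identifiers and time window):
# dms_cdc_latency CDCLatencySource my-instance MYTASKRESOURCEID 2019-10-02T00:00:00Z 2019-10-02T01:00:00Z
# dms_cdc_latency CDCLatencyTarget my-instance MYTASKRESOURCEID 2019-10-02T00:00:00Z 2019-10-02T01:00:00Z
```

Comparing the two outputs over the same window shows whether the delay starts at the source or only appears on the target side.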

High CDCLatencySource means that the process of capturing changes from the source is delayed. High CDCLatencyTarget means that the process of applying the change events to the target is delayed. If both CDCLatencySource and CDCLatencyTarget are high, investigate CDCLatencySource first, because target latency is always equal to or greater than source latency. In that case, high CDCLatencyTarget is most likely the result of the delay in capturing change events from the source. If CDCLatencySource isn't high but CDCLatencyTarget is high, the cause might be one of the following:

  • There are no primary keys or indexes in the target
  • There are resource bottlenecks in the target
  • There are resource bottlenecks in the replication instance
  • There is a network issue between the replication instance and the target

To resolve these issues, see Best practices and troubleshooting.
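The triage order described above can be sketched as a small shell function. This is an illustration only: the function name, the whole-second inputs, and the threshold are hypothetical, and what counts as "high" depends on your workload.

```shell
# Hypothetical triage helper. Arguments are whole seconds:
#   $1 = CDCLatencySource, $2 = CDCLatencyTarget, $3 = threshold you treat as "high"
triage_cdc_latency() {
  local source_latency="$1" target_latency="$2" threshold="$3"
  if [ "$source_latency" -ge "$threshold" ]; then
    # Target latency includes source latency, so start at the source.
    echo "investigate-source"
  elif [ "$target_latency" -ge "$threshold" ]; then
    # Source is fine, so the delay is in applying changes to the target.
    echo "investigate-target"
  else
    echo "ok"
  fi
}

triage_cdc_latency 5 900 60   # prints: investigate-target
```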

Resolution

No primary keys or indexes in the target

By default, AWS DMS writes changes to the target using data manipulation language (DML) statements, such as INSERT, UPDATE, or DELETE, like any other application. If the required indexes aren't in place, then changes like UPDATEs and DELETEs can result in full table scans. Full table scans can cause performance issues on the target and result in target latency. Check your target database schemas, especially if you created the target schema manually. Identify slow queries using target database mechanisms, like the slow query log for MySQL, pg_stat_activity for PostgreSQL, or a query plan. If your target is Amazon Redshift, also check the distribution style for your table. The ALL distribution style can cause target latency because it takes longer to INSERT or UPDATE data in tables that are copied to every node.
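One way to spot tables that are missing a primary key is to query the information_schema views on the target. The following is a minimal sketch, not an AWS-provided check: the function name and connection values are hypothetical, and on MySQL you would also want to exclude system schemas such as mysql, sys, and performance_schema.

```shell
# Hypothetical sketch: print SQL that lists base tables with no PRIMARY KEY
# constraint, using the standard information_schema views.
no_pk_sql() {
  cat <<'SQL'
SELECT t.table_schema, t.table_name
FROM information_schema.tables AS t
LEFT JOIN information_schema.table_constraints AS tc
  ON tc.table_schema = t.table_schema
 AND tc.table_name = t.table_name
 AND tc.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN ('pg_catalog', 'information_schema')
  AND tc.constraint_name IS NULL
ORDER BY t.table_schema, t.table_name;
SQL
}

# Example usage against a PostgreSQL target (placeholder connection values):
# no_pk_sql | psql "host=target.example.com dbname=mydb user=dms_user"
```

Any table that this query returns is a candidate for full table scans when AWS DMS applies UPDATE or DELETE statements.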

Resource bottlenecks in the target

If your target doesn't have sufficient resources, then it can’t accept changes at the rate that AWS DMS sends them, and target latency increases. This can also happen if other processes are consuming resources on the target. If the target is hosted on AWS, then check its resource statistics in the CloudWatch metrics.
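For example, if the target is an Amazon RDS DB instance, its resource metrics can be read from the AWS/RDS namespace. This is a sketch under that assumption; the helper name and the instance identifier are hypothetical.

```shell
# Hypothetical helper: average and peak of one Amazon RDS metric
# for the target DB instance over a time window.
rds_target_metric() {
  local metric="$1" db_instance_id="$2" start="$3" end="$4"
  aws cloudwatch get-metric-statistics \
    --namespace AWS/RDS \
    --metric-name "$metric" \
    --dimensions "Name=DBInstanceIdentifier,Value=$db_instance_id" \
    --start-time "$start" \
    --end-time "$end" \
    --period 300 \
    --statistics Average Maximum \
    --output table
}

# Example usage (placeholder identifier and time window):
# rds_target_metric CPUUtilization my-target-db 2019-10-02T00:00:00Z 2019-10-02T01:00:00Z
# Other metrics worth checking the same way: FreeableMemory, WriteIOPS,
# WriteLatency, DiskQueueDepth.
```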

Resource bottlenecks in the replication instance

Choose a replication instance that has enough resources (CPU, memory, network, and IOPS) to handle your migration. You can monitor your replication instance resources using CloudWatch metrics.
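A quick way to scan several replication instance metrics at once is to loop over them. The sketch below assumes that the instance-level metrics are published in the AWS/DMS namespace under the ReplicationInstanceIdentifier dimension; the function name and identifier are hypothetical.

```shell
# Hypothetical helper: print time-ordered average and maximum values for
# a handful of replication instance metrics.
replication_instance_metrics() {
  local instance_id="$1" start="$2" end="$3" metric
  for metric in CPUUtilization FreeableMemory SwapUsage WriteIOPS ReadIOPS; do
    echo "== $metric =="
    aws cloudwatch get-metric-statistics \
      --namespace AWS/DMS \
      --metric-name "$metric" \
      --dimensions "Name=ReplicationInstanceIdentifier,Value=$instance_id" \
      --start-time "$start" \
      --end-time "$end" \
      --period 300 \
      --statistics Average Maximum \
      --query 'sort_by(Datapoints,&Timestamp)[].[Timestamp,Average,Maximum]' \
      --output text
  done
}

# Example usage (placeholder identifier and time window):
# replication_instance_metrics my-instance 2019-10-02T00:00:00Z 2019-10-02T01:00:00Z
```

Sustained high CPU, low freeable memory, or growing swap usage during the latency window suggests that the instance class is too small for the workload.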

Network issue between replication instance and target

Limited network bandwidth or high network latency between the replication instance and the target can also increase target latency, especially when your target is an on-premises database or when you use AWS DMS for cross-Region replication.

Best practices and troubleshooting

If your target is Amazon Relational Database Service (Amazon RDS), follow the best practices for Improving the Performance of an AWS DMS Migration. Amazon RDS has an automated backup mechanism that starts within the backup window, and it backs up the migrated data. If a snapshot of the target DB instance is being taken, AWS DMS can have issues applying changes to the target. As a result, the target latency increases until the snapshot is completed. If your target is Amazon Elastic Compute Cloud (Amazon EC2) or an on-premises database, check your target database's backup mechanism.

Some task settings can cause changes to be written slowly to the target. If you run ongoing replication from a source where the rate of change is high, consider using BatchApplyEnabled. For more information, see the BatchApplyEnabled section of Debugging Your AWS DMS Migrations: What to Do When Things Go Wrong?

To set BatchApplyEnabled to True, run the modify-replication-task command using the AWS Command Line Interface (AWS CLI):

aws dms modify-replication-task --replication-task-arn arn:aws:dms:ap-northeast-1:123456789012:task:ABCDEFGHIJKLMNOPQRSTUVWXYZ --replication-task-settings "{\"TargetMetadata\":{\"BatchApplyEnabled\":true}}"
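AWS DMS generally expects a task to be stopped before its settings are modified, so stop the task first if it is running. To confirm afterward that the setting took effect, you can read the task settings back. The following is a minimal sketch; the function name is hypothetical, and the grep pattern assumes that the settings are returned as a JSON string in which the key appears as "BatchApplyEnabled".

```shell
# Hypothetical helper: print the BatchApplyEnabled entry from a task's
# current settings, which describe-replication-tasks returns as a JSON string.
show_batch_apply() {
  local task_arn="$1"
  aws dms describe-replication-tasks \
    --filters "Name=replication-task-arn,Values=$task_arn" \
    --query 'ReplicationTasks[0].ReplicationTaskSettings' \
    --output text |
    grep -o '"BatchApplyEnabled": *[a-z]*'
}

# Example usage (same placeholder ARN as the command above):
# show_batch_apply arn:aws:dms:ap-northeast-1:123456789012:task:ABCDEFGHIJKLMNOPQRSTUVWXYZ
```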
