AWS Database Blog

Python code to download DMS Task Logs using the AWS DMS Task ID

With AWS Database Migration Service (AWS DMS), you can migrate databases to AWS quickly and securely. In this post, we walk through the sample Python code required to download AWS DMS task logs on to your local computer using the AWS DMS task ID.

Overview

The DMS task logs contain task information logged during the migration process. These logs can be used to troubleshoot issues while using AWS DMS to migrate data, as suggested in the post Debugging Your AWS DMS Migrations: What to Do When Things Go Wrong (Part 1).

Today, DMS sends task logs to log streams in Amazon CloudWatch Logs so that you can get information about the DMS migration tasks. But what if you wanted to download the logs locally to speed up troubleshooting?

Find the complete Python code to download AWS DMS task logs using the DMS task ID in the Download DMS Task logs GitHub repo. In the remainder of this post, I break down the Python code and show you how to use the solution:

  • Import the required libraries.
  • Read arguments for the DMS task ID and the time range for the logs.
  • Convert the time to milliseconds since epoch.
  • Get the replication tasks based on the AWS DMS task ID.
  • Get the replication instance ARN.
  • Get the replication instance information.
  • Retrieve the replication instance ID.
  • Construct the log group name.
  • Retrieve the CloudWatch log events for the DMS task.
  • Print out the CloudWatch logs.
  • Use the solution.

Prerequisites

Here are the prerequisites for using the solution:

  1. Download and install Python.
  2. Install the boto3 and maya libraries using the following pip commands:
    pip install boto3
    pip install maya
  3. Install the AWS CLI.
  4. Configure the AWS CLI.
  5. Download a code editor for Python. In this post, I use Visual Studio Code.

Import the required libraries

Begin by importing the boto3, sys, and maya libraries.

import boto3
import sys
import maya

Read arguments for the DMS task ID and the time range for the logs

Read the arguments for the AWS DMS task ID and the start and end time ranges for the logs.

replication_task_id = sys.argv[1]
time_string = sys.argv[2]
end_time_string = sys.argv[3]
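Reading sys.argv by position raises an IndexError when an argument is missing. As a minimal sketch of one alternative, the following uses argparse from the standard library so that a missing argument produces a usage message instead. The function name parse_args and the argument names are illustrative assumptions and are not part of the original script.

```python
import argparse


def parse_args(argv=None):
    """Parse the three positional arguments that the script expects.

    The function name and argument names are illustrative assumptions.
    """
    parser = argparse.ArgumentParser(
        description="Download AWS DMS task logs from CloudWatch Logs."
    )
    parser.add_argument("replication_task_id", help="AWS DMS task ID")
    parser.add_argument("start_time", help="start of the range, e.g. 2019-01-21T00:00")
    parser.add_argument("end_time", help="end of the range, e.g. 2019-01-22T00:00")
    # When argv is None, argparse reads the arguments from sys.argv[1:].
    return parser.parse_args(argv)
```

Calling parse_args() with no arguments reads the command line, and the parsed values are then available as args.replication_task_id, args.start_time, and args.end_time.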

Convert the time to milliseconds since epoch

Using the following functions, convert the start and end times to milliseconds since epoch, which is the unit that CloudWatch Logs expects for time filters. Use the maya library for the conversion.

def start_time_milliseconds_since_epoch(time_string):
    datetime = maya.when(time_string)
    seconds = datetime.epoch
    return seconds * 1000

start_time = start_time_milliseconds_since_epoch(time_string)

def end_time_milliseconds_since_epoch(end_time_string):
    datetime = maya.when(end_time_string)
    seconds = datetime.epoch
    return seconds * 1000

end_time = end_time_milliseconds_since_epoch(end_time_string)
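The two conversion functions above do the same work, so a single helper could serve both the start and end times. The following is a minimal sketch that swaps maya for the standard library datetime module. It assumes that the input is an ISO 8601 string such as 2019-01-21T00:00 and that a string without a UTC offset should be read as UTC. The name to_epoch_milliseconds is hypothetical.

```python
from datetime import datetime, timezone


def to_epoch_milliseconds(time_string):
    """Convert an ISO 8601 string to integer milliseconds since the Unix epoch.

    Assumption: a string without a UTC offset is interpreted as UTC.
    """
    parsed = datetime.fromisoformat(time_string)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return int(parsed.timestamp() * 1000)


start_ms = to_epoch_milliseconds("2019-01-21T00:00")  # 1548028800000
end_ms = to_epoch_milliseconds("2019-01-22T00:00")    # 1548115200000
```

Unlike maya.when, datetime.fromisoformat does not accept free-form date strings, so this sketch fits only when the time filters are always passed in ISO 8601 form.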

Get the replication tasks based on the AWS DMS task ID

Get the replication tasks that match the AWS DMS task ID that you provided as an argument. Use the describe_replication_tasks method of the boto3 DMS client, which returns information about replication tasks for your account in the current Region. The function then returns the ReplicationTasks list from the response.

def get_replication_tasks():
    client = boto3.client('dms')
    response = client.describe_replication_tasks(
        Filters=[
            {
                'Name': 'replication-task-id',
                'Values': [replication_task_id],
            },
        ],
        MaxRecords=100,
        Marker='',
    )
    return response['ReplicationTasks']

Get the replication instance ARN

Get the replication instance ARN from the replication tasks that you retrieved using the previous method. Return the replication instance ARN and assign it to a variable, rep_instance_arn, for future use.

def get_replication_instance_arn():
    for ReplicationTasks in get_replication_tasks():
        ReplicationInstanceArn = ReplicationTasks['ReplicationInstanceArn']

        return ReplicationInstanceArn

rep_instance_arn = get_replication_instance_arn()
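If the list of replication tasks is empty, the for loop in get_replication_instance_arn never runs and the function implicitly returns None, so the failure surfaces only in a later step. As a minimal sketch, the following variant takes the task list as a parameter and raises a descriptive error instead. The function name first_replication_instance_arn is hypothetical, and the ARN in the usage note is a placeholder.

```python
def first_replication_instance_arn(replication_tasks, replication_task_id):
    """Return the ReplicationInstanceArn of the first task in the list.

    replication_tasks is expected to have the shape of
    response['ReplicationTasks'] from describe_replication_tasks.
    """
    if not replication_tasks:
        raise LookupError(
            "No AWS DMS task found with ID " + repr(replication_task_id)
        )
    return replication_tasks[0]["ReplicationInstanceArn"]
```

With a list such as [{"ReplicationInstanceArn": "arn:aws:dms:us-east-1:123456789012:rep:EXAMPLE"}], the function returns the placeholder ARN. With an empty list, it raises LookupError.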

Get the replication instance information

Next, get the replication instance information by providing the replication instance ARN to the describe_replication_instances method, as shown in the following code example.

def get_replication_instances():
    client = boto3.client('dms')
    response = client.describe_replication_instances(
        Filters=[
            {
                'Name': 'rep-instance-arn',
                'Values': [rep_instance_arn],
            },
        ],
        MaxRecords=100,
        Marker='',
    )
    return response['ReplicationInstances']

Retrieve the replication instance ID

From the replication instance result obtained using the previous method, retrieve the replication instance identifier to be used to construct the log group name for the AWS DMS task.

def get_replication_instance_id():
    for ReplicationInstances in get_replication_instances():
        ReplicationInstanceIdentifier = ReplicationInstances['ReplicationInstanceIdentifier']

        return ReplicationInstanceIdentifier

Construct the log group name

After you have the replication instance identifier, construct the log group name by prefixing the replication instance ID with dms-tasks-.

log_group = "dms-tasks-" + get_replication_instance_id()
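If get_replication_instance_id finds no instance, it implicitly returns None and the string concatenation above raises a TypeError. The following minimal sketch wraps the same dms-tasks- prefix rule in a small function that fails with a clearer message. The function name build_log_group_name is hypothetical.

```python
def build_log_group_name(replication_instance_id):
    """Return the CloudWatch log group name for a replication instance ID."""
    if not replication_instance_id:
        raise ValueError("replication instance identifier is missing")
    # The log group name is the instance ID prefixed with "dms-tasks-".
    return "dms-tasks-" + replication_instance_id
```

For an instance ID of my-instance, the function returns dms-tasks-my-instance.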

Retrieve the CloudWatch log events for the DMS task

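Pass the log group name that you just constructed to the get_cloudwatch_log_events function, which retrieves the CloudWatch log events for the DMS task by calling the filter_log_events method. This method lists log events from the specified log group. Because the results are paginated, the function keeps calling the method with the returned nextToken until the response no longer contains one.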

def get_cloudwatch_log_events(log_group):
    
    client = boto3.client('logs')
    kwargs = {
        'logGroupName': log_group,
        'limit': 10000,
        'startTime': start_time,
        'endTime': end_time
    }
    while True:
        response = client.filter_log_events(**kwargs)
        yield from response['events']
        try:
            kwargs['nextToken'] = response['nextToken']
        except KeyError:
            break
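To see how the nextToken loop behaves without calling AWS, the following sketch passes the client in as a parameter and runs the same loop against a stand-in object. FakeLogsClient and iter_log_events are hypothetical names for illustration only. The fake implements only the two response keys that the loop reads and does not reproduce the real service's token format.

```python
def iter_log_events(client, **kwargs):
    """Yield events from every page of filter_log_events results."""
    while True:
        response = client.filter_log_events(**kwargs)
        yield from response["events"]
        if "nextToken" not in response:
            break
        kwargs["nextToken"] = response["nextToken"]


class FakeLogsClient:
    """Hypothetical stand-in for boto3.client('logs'), for illustration only."""

    def __init__(self, pages):
        self._pages = pages

    def filter_log_events(self, **kwargs):
        # In this fake, the token is simply the index of the next page.
        index = int(kwargs.get("nextToken", 0))
        response = {"events": self._pages[index]}
        if index + 1 < len(self._pages):
            response["nextToken"] = str(index + 1)
        return response


pages = [[{"message": "first"}], [{"message": "second"}, {"message": "third"}]]
client = FakeLogsClient(pages)
messages = [
    event["message"]
    for event in iter_log_events(client, logGroupName="dms-tasks-example")
]
print(messages)  # ['first', 'second', 'third']
```

Taking the client as a parameter is a design choice made here for testability. The function in the post creates its own client with boto3.client('logs').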

Print out the CloudWatch logs

Finally, in the main function, write the message of each CloudWatch log event for the DMS task to standard output.

def main():
    for event in get_cloudwatch_log_events(log_group):
        sys.stdout.write(event['message'].rstrip() + '\n')

if __name__ == '__main__':
    main()
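If you prefer the script to write the log file itself rather than relying on shell redirection, the loop in main can be pointed at any text stream. The following is a minimal sketch of that variation. The function name write_messages is hypothetical, and the example uses an in-memory stream in place of a real file.

```python
import io


def write_messages(events, output):
    """Write each event's message to a text stream, one per line.

    Returns the number of messages written.
    """
    count = 0
    for event in events:
        output.write(event["message"].rstrip() + "\n")
        count += 1
    return count


# Demonstration with an in-memory stream and two sample events.
buffer = io.StringIO()
written = write_messages(
    [{"message": "task started\n"}, {"message": "task stopped"}], buffer
)
print(written)
print(buffer.getvalue(), end="")
```

To write to disk, open a file with open("dmslogs.log", "w") and pass the file object as output together with the generator returned by get_cloudwatch_log_events(log_group).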

Use the solution

To use the solution, download the code file from the aws-database-migration-tools GitHub repo into a folder on your computer. To run the Python code, execute the following command from a terminal or command prompt.

python GetCWLogData.py <Your DMS Task ID> <Start Time Filter> <End Time Filter> > dmslogs.log

Replace <Your DMS Task ID> in the command with your DMS task ID. The task ID can be found using the describe-replication-tasks AWS CLI command.

You can also get the DMS task ID from the AWS Management Console, as shown in the following screenshot:

Replace <Start Time Filter> with the start date and time of the log entries to include. Replace <End Time Filter> with the cutoff date and time for the log entries to include in the log file. When you run the command, log entries are downloaded into the file called dmslogs.log.

The following example command includes log entries from January 21 to January 22:

python GetCWLogData.py [DMS task ID] 2019-01-21T00:00 2019-01-22T00:00 > dmslogs.log

In the folder from which you executed the command that ran the Python code, you should see a file named dmslogs.log. If you open the file, there are log entries, as in the following screenshot.

On Linux- and UNIX-based systems, you can also see the end of the dmslogs.log file using the tail command. In the following screenshot, the last 50 lines of the dmslogs.log file are shown:

Conclusion

You may run into issues while using AWS DMS to migrate data between different data stores, whether you are migrating into the cloud, between on-premises instances (through an AWS Cloud setup), or between cloud and on-premises environments. Use the Python code provided in this post to download the DMS task logs and troubleshoot issues without having to log on to the AWS Management Console.

If you have comments or questions about implementing the solution outlined in this post, submit them in the Comments section below.

About the Authors

Zafar Kapadia is a Sr. Solutions Architect at Amazon Web Services. He specializes in Application Development and Optimization. He is also an avid cricketer and plays in various local leagues.

David Rader is a Sr. Practice Manager at Amazon Web Services. He specializes in Application Modernization. He loves to run, camp, and roast marshmallows.