AWS Database Blog

Create AWS CloudFormation templates for AWS DMS tasks using Microsoft Excel

This blog post talks about a tool that automates creation of AWS CloudFormation templates for AWS Database Migration Service (AWS DMS) tasks. If you are looking for how to create CloudFormation templates for your DMS tasks and don’t have knowledge of CloudFormation, read on.

Many times during database migration activities where AWS DMS is involved, you encounter the common task of creating AWS CloudFormation templates to provision DMS tasks. Creating AWS CloudFormation templates requires knowledge of writing CloudFormation templates in either JSON or YAML.

Good news! You don’t need to know how to write CloudFormation templates at all, and still you can create CloudFormation templates to provision AWS DMS tasks. All you need is some basic knowledge of Microsoft Excel and how to run a Python script using a terminal or command window.

In this post, we present a small command line tool written in Python. The tool takes a Microsoft Excel workbook having names of tables to be migrated, Amazon Resource Names (ARNs) of DMS endpoints, and DMS replication instances to be used as input. The tool generates CloudFormation templates for required DMS tasks as output. However, this tool doesn’t address creation of DMS endpoints and replication instances.

Prerequisites and assumptions

To use this tool, you need the following:

  • Microsoft Excel 2016 or later installed on your machine.
  • Python, version 2.7 or later. To install Python 2.7.15, go to The Python.org download page.
  • The xlrd Python module. If you don’t have it installed already, install it using pip from the command prompt or terminal with this command: pip install xlrd
  • Make sure that both the DMS source and target endpoints and DMS replication instance are already created.

About the tool

All information about DMS tasks that need to be created should be incorporated in the Excel template. Before jumping into the details for the Excel template, let’s look at the list of items that are needed to create a DMS task.

The basic information needed to create a DMS task is this:

  • DMS task name
  • Source endpoint
  • Target endpoint
  • Replication instance to be used
  • Schema name of tables to be migrated
  • Names of tables to be migrated

You provide this information to the tool as an Excel template.

About the Excel template

The Excel template to provide as input to the tool should include two worksheets named DMS-Tasks and DMS-Tags. The rest of each worksheet represents a DMS task whose CloudFormation template you plan to create.

DMS-Tasks sheet

The DMS-Tasks sheet contains information about DMS tasks to be generated by the tool. To illustrate, if you plan to generate CloudFormation templates for two DMS tasks, there are two rows in the sheet. There is one row for each DMS task, which holds the information mentioned in each column, as described following.

For this sheet, provide details of DMS tasks, that is DMS endpoints, replication instance, and so on, in the designated columns of this sheet as described following.

Attribute name Description
Name Name of the DMS task. This name should be unique among DMS tasks; no duplicates are allowed.
Description Description of the DMS task.
SourceEndPointARN ARN of the DMS source endpoint.
TargetEndPointARN ARN of the DMS target endpoint.
RIARN ARN of the DMS replication instance.
CDCStartTime Start time of change data capture. This attribute is mandatory if the task migration mode is CDC. In the case of FULL-LOAD mode, the tool ignores this attribute value.
Schema Name Name of the source database schema that the tables for the DMS task indicated by the current row belong to.

DMS-Tags sheet

In the DMS-Tags sheet, you detail the AWS resource tags to attach to the DMS tasks whose CloudFormation templates the tool creates. As in the sample Excel template provided, the following table illustrates two tags, Resource Owner and EnvironmentType. The table also shows provided values for these tags, to be associated with all DMS tasks defined in the DMS-Tasks sheet.

Tag name Tag value
Resource Owner dev@aws.com
EnvironmentType Development

DMS Tasks sheets

By now, you probably see that the Excel template’s two sheets, DMS-Tasks and DMS-Tags, contain metadata for DMS tasks. Specifically, they contain metadata for the tasks for which the tool is to create CloudFormation templates.

If there are N number of CloudFormation templates to create for DMS tasks, then the Excel should have N+2 worksheets. In other words, it needs one worksheet for each DMS task and two worksheets for metadata of all of the tasks, which are DMS-Tasks and DMS-Tags.

The name of the worksheet for each DMS task should match the name mentioned in the name column in the DMS-Tasks sheet.

DMS Tasks sheet details

The name of each DMS task’s details sheet should match the name mentioned in the name column in the DMS-Tasks sheet.

All tables to be part of the DMS task should be listed in this sheet. This sheet has two columns (apart from Sno column, kept for future development purposes), Table Name and Excluded Columns, described following.

In some situations, certain columns should not be migrated. Note those columns as comma-separated column names in the Excluded Columns column in the sheet.

Column name Column value
Table Name Name of the table to be part of the DMS task.
Excluded Columns Columns names of the table whose columns should be excluded by DMS during either full-load or CDC mode. If nothing is mentioned here, all columns are considered for data migration. Column names should be separated by commas.

Generating DMS CloudFormation templates

To generate CloudFormation templates for your DMS tasks, download the tool from this Amazon GitHub Repository in the form of archive file (.zip), extract the downloaded archive file to a folder, and take the following steps:

  1. Make sure that you have an installed version of Python 2.7 or later.
  2. Make sure that you have an installed version of the xlrd Python module. If not, install it using pip with this command: pip install xlrd
  3. Open a terminal window in Mac OS or a command window in Microsoft Windows, and navigate to the folder where the tool is extracted.
  4. Run the tool.

The downloaded tool ships with a sample Excel template that generates two DMS tasks’ CloudFormation templates, DMS-CHILD and DMS-PARENT.

Running the tool

When you run the tool, only two arguments are mandatory. Those are path and type, whose details are mentioned following.

Argument name Argument value
path Location of the Excel template that contains DMS task details for which CloudFormation templates need to be generated.
type

This argument accepts three different values:

·         cdc – If the DMS tasks to create are for change data capture mode only

·         full-load – If the DMS tasks to create are for full-load mode only

·         full-load-and-cdc – If the DMS tasks to create are for full-load followed by CDC

Use the following command with to generate DMS task templates, providing the path to the Excel template:

python create_task.py --path [PATH_OF_THE_ EXCEL_TEMPLATE] --type [cdc | full-load | full-load-and-cdc]

Following is an example of tool execution.

user_name$ python create_task.py --path dms-tasks.xlsx --type cdc
Created Task Template for DMS-CHILD
Created Task Template for DMS-PARENT
user_name$

The preceding example illustrates running the tool with the provided Excel template, and CloudFormation templates for two DMS tasks are generated. The tool places them in a folder named output, which is one of the subfolders where the main tool’s archive is extracted.

Limitations

The tool has the following limitations:

  • The names of schema, tables, and their columns are transformed into lowercase at the destination endpoint.
  • Currently, the tool supports only one schema for each DMS task.

Summary

We found that during the migration of one customer database having 500-plus tables, shuffling tables from one DMS task to the other became a common exercise. This proved true because the number of partitions in a few tables grew significantly in the production/staging environment compared to the dev environment. From the requirement to shuffle tasks, the idea of creating this tool was born.

This tool drastically reduces the amount of time required to create CloudFormation templates for DMS tasks.

 


About the authors

Ismail Shaik is a consultant with AWS Professional Services. He works with AWS DMS, SCT, Aurora PostgreSQL, and Microsoft Workloads related AWS services to bring the best possible experience to their customers.

 

 

 

Venkata Naveen Koppula is an associate consultant with AWS Professional Services. He works with AWS DMS, SCT, Aurora PostgreSQL to bring the best possible experience to their customers.