AWS Database Blog

Accelerate data migration using AWS DMS and AWS CDK

Deploying and configuring AWS Database Migration Service (AWS DMS) across multiple environments involves configuring, testing, and provisioning many AWS DMS resources. This can be time-consuming and error-prone because of the large number of settings involved.

The AWS Cloud Development Kit (AWS CDK) lets you define your cloud infrastructure as code in one of five (as of this writing) supported programming languages. It is intended for moderately to highly experienced AWS users.

In this post, we describe an AWS CDK-based approach that uses a higher-level AWS CDK construct to set up AWS DMS resources. The construct provisions AWS DMS components such as replication tasks, replication instances, database endpoints, and replication task settings in a consistent and generic way across your environments.

Prerequisites

To follow along with this post, you must have the following prerequisites:

  • Basic knowledge of AWS CDK (for more information, visit Getting started with the AWS CDK)
  • AWS CDK version 1.100.x or later (note: AWS CDK v2 is already available, but this solution currently works only with AWS CDK v1)
  • AWS DMS v3.4.x (the GitHub code base was tested with this version)
  • Node.js v14.x
  • A source and target database (MySQL 5.7.4+)

Solution overview

The solution described in this post consists mainly of two TypeScript classes, DMSReplication and DMSStack, and uses the out-of-the-box AWS CDK construct library @aws-cdk/aws-dms.

This solution is primarily designed for Amazon Relational Database Service (Amazon RDS) for MySQL databases, and you can extend it for other databases like PostgreSQL or Oracle. The solution assumes all database credentials are stored inside AWS Secrets Manager.

The following diagram illustrates the solution architecture.

The DMSReplication class is the main construct responsible for creating resources such as the replication instance, task settings, subnet group, and AWS Identity and Access Management (IAM) role for accessing Secrets Manager.

The DMSStack class is the stack that uses DMSReplication to provision the actual AWS DMS components, such as the replication instance, endpoints, and replication tasks, based on the parameters you provide in the cdk.json file.

ContextProps is a type that helps map input parameters from the cdk.json file in a type-safe manner. The cdk.json file includes settings related to AWS DMS resources, such as the replication instance, task settings, and logging, for each of your environments and stages. ContextProps also defines default values for AWS DMS, which you can override by defining the settings in the cdk.json file.
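
The default-and-override behavior can be sketched in plain TypeScript as follows. This is only an illustration: the field names, the default values, and the withDefaults helper are assumptions made for this example, and the ContextProps type in the repository contains more settings than shown here.

```typescript
// Hypothetical, simplified version of ContextProps (illustration only).
type MigrationType = 'full-load' | 'cdc' | 'full-load-and-cdc';

interface ContextProps {
  replicationInstanceIdentifier: string;
  replicationInstanceClass: string;
  migrationType: MigrationType;
  enableLogging: boolean; // assumed field name
}

// Everything except the identifier has a default in this sketch.
type ContextDefaults = Omit<ContextProps, 'replicationInstanceIdentifier'>;
type ContextInput = Pick<ContextProps, 'replicationInstanceIdentifier'> &
  Partial<ContextDefaults>;

// Assumed default values, not the ones used in the repository.
const DEFAULTS: ContextDefaults = {
  replicationInstanceClass: 'dms.t3.medium',
  migrationType: 'full-load',
  enableLogging: true,
};

// Values read from cdk.json win over the defaults.
function withDefaults(input: ContextInput): ContextProps {
  return { ...DEFAULTS, ...input };
}

const resolved = withDefaults({
  replicationInstanceIdentifier: 'dms-dev-eu',
  migrationType: 'full-load-and-cdc',
});
console.log(resolved);
```

Because the input type marks every defaulted field as optional, the compiler still rejects misspelled keys or values of the wrong type.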

Walkthrough

The following sample DemoDMSStack class instantiates the higher-level AWS CDK construct DMSReplication class. The DMSReplication class contains several functions to provision the AWS DMS related resources.

In the following snippet, we use DMSReplication to create the source and target endpoint and an AWS DMS replication task:

import * as cdk from '@aws-cdk/core';
import { Stack } from '@aws-cdk/core';
import * as ec2 from '@aws-cdk/aws-ec2';
import * as dms from '@aws-cdk/aws-dms';
import { DMSReplication } from '../lib/dms-replication';

export class DemoDMSStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const vpc = new ec2.Vpc(this, 'DMSVpc', {});

    const dmsProps = {
      subnetIds: ['subnet-1', 'subnet-2'],
      replicationSubnetGroupIdentifier: 'dms-subnet-private',
      replicationInstanceClass: 'dms.t3.medium',
      replicationInstanceIdentifier: 'dms-mysql-dev',
      vpcSecurityGroupIds: ['sg-xxxx'],
      engineName: 'mysql',
      region: Stack.of(this).region
    };

    const dmsReplication = new DMSReplication(this, 'DMSReplicationService', dmsProps);
    const source = dmsReplication.createMySQLEndpoint(
      'db-on-source',
      'source',
      'sourceSecretsManagerSecretId',
      'sourceSecretsManagerRoleArn'
    );
    const target = dmsReplication.createMySQLEndpoint(
      'rds-target',
      'target',
      'targetSecretsManagerSecretId',
      'targetSecretsManagerRoleArn'
    );

    dmsReplication.createReplicationTask('platform-replication-task', 'platform', source, target);
  }
}

You might need to make the AWS DMS tasks more configurable and customizable. For example, you can create several AWS DMS replication tasks dynamically based on the schemas of your source database. Also, you might want to specify the VPC, subnets, and schemas for creating an AWS DMS subnet group.

In the following code, we define the source and target database settings under the schemas section of the cdk.json file. Based on these settings, the DMSStack class provisions several separate AWS DMS replication tasks.

cdk.json:

{
  "app": "npx ts-node --prefer-ts-exts bin/dms-cdk.ts",
  "aws-cdk:enableDiffNoFail": "true",
  "@aws-cdk/aws-ecr-assets:dockerIgnoreSupport": true,
  "@aws-cdk/aws-secretsmanager:parseOwnedSecretName": true,
  "@aws-cdk/aws-kms:defaultKeyPolicies": true,
  "@aws-cdk/aws-s3:grantWriteWithoutAcl": true,

  "context": {
    "environment": "dev",
    "account": "111111111111",

    "dev": {
      "region": "eu-central-1",
      "vpcId": "vpc-xxxxxxxxxxxxx",
      "subnetIds": [
        "subnet-xxxxxxxxxxxxxxxx",
        "subnet-xxxxxxxxxxxxxxxx"
      ],
      "vpcSecurityGroupIds": [
        "sg-xxxxxxxxxxx"
      ],
      "schemas": [
        {
          "name": "demo-schema1-mig-task1",
          "sourceSecretsManagerSecretId": "arn:aws:secretsmanager:eu-central-1:111111111111:secret:dev/mysql/source-db1",
          "targetSecretsManagerSecretId": "arn:aws:secretsmanager:eu-central-1:111111111111:secret:dev/mysql/target-db1"
        },
        {
          "name": "demo-schema2-mig-task2",
          "sourceSecretsManagerSecretId": "arn:aws:secretsmanager:eu-central-1:111111111111:secret:dev/mysql/source-db2",
          "targetSecretsManagerSecretId": "arn:aws:secretsmanager:eu-central-1:111111111111:secret:dev/mysql/target-db2"
        }
      ],
      "replicationInstanceClass": "dms.r5.4xlarge",
      "replicationInstanceIdentifier": "dms-dev-eu",
      "replicationSubnetGroupIdentifier": "dms-dev-subnet-eu",
      "replicationTaskSettings": {
      },
      "migrationType": "full-load-and-cdc"
    }
  }
}
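
One plausible way to select the active environment block from this context is sketched below. The resolveEnvironment function and the interfaces are hypothetical and not taken from the repository; the sketch only assumes the layout shown above, where the environment key names the block to use.

```typescript
// Hypothetical types that mirror a subset of the cdk.json context above.
interface SchemaSettings {
  name: string;
  sourceSecretsManagerSecretId: string;
  targetSecretsManagerSecretId: string;
}

interface EnvironmentSettings {
  region: string;
  vpcId: string;
  subnetIds: string[];
  schemas: SchemaSettings[];
}

// Picks the block named by context.environment and runs basic sanity checks.
function resolveEnvironment(context: Record<string, unknown>): EnvironmentSettings {
  const envName = context['environment'];
  if (typeof envName !== 'string') {
    throw new Error('context.environment must name the active environment');
  }
  const settings = context[envName] as EnvironmentSettings | undefined;
  if (settings === undefined) {
    throw new Error(`no settings block found for environment "${envName}"`);
  }
  if (!Array.isArray(settings.schemas)) {
    throw new Error(`environment "${envName}" has no schemas array`);
  }
  for (const schema of settings.schemas) {
    if (!schema.sourceSecretsManagerSecretId || !schema.targetSecretsManagerSecretId) {
      throw new Error(`schema "${schema.name}" is missing a Secrets Manager secret ID`);
    }
  }
  return settings;
}

const env = resolveEnvironment({
  environment: 'dev',
  dev: {
    region: 'eu-central-1',
    vpcId: 'vpc-xxxxxxxxxxxxx',
    subnetIds: ['subnet-a', 'subnet-b'],
    schemas: [
      {
        name: 'demo-schema1-mig-task1',
        sourceSecretsManagerSecretId: 'source-secret-id',
        targetSecretsManagerSecretId: 'target-secret-id',
      },
    ],
  },
});
console.log(env.region, env.schemas.length);
```

Failing early on a missing block or secret ID surfaces configuration mistakes at synthesis time rather than during deployment.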

Based on the preceding cdk.json settings, the DMSStack class creates several AWS DMS tasks and endpoints in the forEach(…) loop section.

ContextProps and DmsProps map the input parameters defined in the cdk.json file in a type-safe manner. ContextProps is composed of several other types, such as TaskSettings and TargetMetadataSetting, with the goal of supporting the various AWS DMS-related settings (for more information, see Specifying task settings for AWS Database Migration Service tasks).
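
As a rough sketch of how such composed types might look, the following example models a small subset of the AWS DMS task settings and serializes it to the JSON string that the replication task resource expects. The interfaces cover only a few documented settings, and the toTaskSettingsJson helper is a hypothetical name; the types in the repository are more complete.

```typescript
// Hypothetical subset of the AWS DMS task settings (illustration only).
interface TargetMetadataSetting {
  SupportLobs?: boolean;
  LimitedSizeLobMode?: boolean;
  LobMaxSize?: number; // in KB
}

interface FullLoadSetting {
  TargetTablePrepMode?: 'DO_NOTHING' | 'DROP_AND_CREATE' | 'TRUNCATE_BEFORE_LOAD';
  MaxFullLoadSubTasks?: number;
}

interface TaskSettings {
  TargetMetadata?: TargetMetadataSetting;
  FullLoadSettings?: FullLoadSetting;
  Logging?: { EnableLogging: boolean };
}

// The task settings are passed to AWS DMS as a JSON document,
// so the typed object is serialized before it is handed to the task.
function toTaskSettingsJson(settings: TaskSettings): string {
  return JSON.stringify(settings);
}

const settingsJson = toTaskSettingsJson({
  TargetMetadata: { SupportLobs: true, LimitedSizeLobMode: true, LobMaxSize: 32 },
  FullLoadSettings: { TargetTablePrepMode: 'DO_NOTHING', MaxFullLoadSubTasks: 8 },
  Logging: { EnableLogging: true },
});
console.log(settingsJson);
```

Typing the settings this way lets the compiler catch an invalid TargetTablePrepMode value or a misspelled key before anything is deployed.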

The network-related settings for the AWS DMS stack, such as the subnets and VPC, are mapped through DmsProps. By default, the deployed AWS DMS resources are only accessible privately. The resource-importer class looks up the VPC based on the vpcId parameter in the cdk.json file. The full code for the AWS DMS stack is available on GitHub.

export class DMSStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props: DmsProps) {
    super(scope, id, props);
    const context: ContextProps = propsWithDefaults(props.context);

    const dmsProps = {
      subnetIds: context.subnetIds,
      replicationSubnetGroupIdentifier: context.replicationSubnetGroupIdentifier,
      replicationInstanceClass: context.replicationInstanceClass,
      replicationInstanceIdentifier: context.replicationInstanceIdentifier,
      vpcSecurityGroupIds: context.vpcSecurityGroupIds,
      engineName: 'mysql',
      region: cdk.Stack.of(this).region,
    };

    const dmsReplication = new DMSReplication(this, 'Replication', dmsProps);
    const suffix = context.replicationInstanceIdentifier;

    context.schemas.forEach(schema => {
      const source = dmsReplication.createMySQLEndpoint(
        'source-' + schema.name + '-' + suffix,
        'source',
        schema.sourceSecretsManagerSecretId,
        schema.sourceSecretsManagerRoleArn
      );

      const target = dmsReplication.createMySQLEndpoint(
        'target-' + schema.name + '-' + suffix,
        'target',
        schema.targetSecretsManagerSecretId,
        schema.targetSecretsManagerRoleArn
      );

      dmsReplication.createReplicationTask(
        schema.name + '-replication-' + suffix,
        schema.name,
        source,
        target,
        context.migrationType,
        context.replicationTaskSettings
      );
    });
  }
}
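
The naming scheme used in the loop can be previewed without deploying anything. The following sketch reproduces the string concatenation from the stack as a standalone function. The planIdentifiers and isValidDmsIdentifier helpers are hypothetical, and the validation rule (letters, digits, and hyphens only, starting with a letter, no trailing or doubled hyphen, at most 255 characters) reflects our understanding of the AWS DMS identifier constraints, so verify it against the AWS DMS API reference.

```typescript
interface PlannedIdentifiers {
  sourceEndpointId: string;
  targetEndpointId: string;
  taskId: string;
}

// Mirrors the identifiers built inside the forEach loop of the stack.
function planIdentifiers(schemaNames: string[], suffix: string): PlannedIdentifiers[] {
  return schemaNames.map(name => ({
    sourceEndpointId: `source-${name}-${suffix}`,
    targetEndpointId: `target-${name}-${suffix}`,
    taskId: `${name}-replication-${suffix}`,
  }));
}

// Assumed rule set for AWS DMS identifiers (see the lead-in above).
function isValidDmsIdentifier(id: string): boolean {
  return (
    id.length <= 255 &&
    /^[A-Za-z][A-Za-z0-9-]*$/.test(id) &&
    !id.endsWith('-') &&
    !id.includes('--')
  );
}

const planned = planIdentifiers(
  ['demo-schema1-mig-task1', 'demo-schema2-mig-task2'],
  'dms-dev-eu'
);
for (const plan of planned) {
  console.log(plan.taskId, isValidDmsIdentifier(plan.taskId));
}
```

A check like this could run in a unit test so that an unsuitable schema name in cdk.json fails fast.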

To access the database credentials stored in Secrets Manager for creating source and target endpoints, you either need to create a role or pass the existing role’s ARN. You also need to create VPC endpoints for Secrets Manager beforehand to access Secrets Manager through the AWS backbone.
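
If you create the role yourself, its two policy documents might look like the following sketch. It builds them as plain objects rather than with the AWS CDK IAM constructs, and the dmsTrustPolicy and secretsReadPolicy helper names are made up for this example. The sketch assumes that AWS DMS assumes the role through the regional service principal dms.REGION.amazonaws.com and only needs secretsmanager:GetSecretValue on the secrets. If your secrets are encrypted with a customer managed key, additional AWS KMS permissions are likely required.

```typescript
interface Statement {
  Effect: 'Allow';
  Action: string[];
  Resource?: string[];
  Principal?: { Service: string };
}

interface PolicyDocument {
  Version: '2012-10-17';
  Statement: Statement[];
}

// Trust policy: lets the regional AWS DMS service principal assume the role (assumption).
function dmsTrustPolicy(region: string): PolicyDocument {
  return {
    Version: '2012-10-17',
    Statement: [
      {
        Effect: 'Allow',
        Principal: { Service: `dms.${region}.amazonaws.com` },
        Action: ['sts:AssumeRole'],
      },
    ],
  };
}

// Permissions policy: read access limited to the given secrets.
function secretsReadPolicy(secretArns: string[]): PolicyDocument {
  if (secretArns.length === 0) {
    throw new Error('at least one secret ARN is required');
  }
  return {
    Version: '2012-10-17',
    Statement: [
      {
        Effect: 'Allow',
        Action: ['secretsmanager:GetSecretValue'],
        Resource: secretArns,
      },
    ],
  };
}

const trust = dmsTrustPolicy('eu-central-1');
const access = secretsReadPolicy([
  'arn:aws:secretsmanager:eu-central-1:111111111111:secret:dev/mysql/source-db1',
]);
console.log(JSON.stringify(trust, null, 2));
console.log(JSON.stringify(access, null, 2));
```

Scoping the Resource list to the individual secrets, rather than using a wildcard, keeps the role limited to the databases that take part in the migration.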

Deploy the solution

To deploy the complete solution, make sure you have completed the prerequisites we described. The source code is available on the GitHub repo.

  1. Install Node.js and npm (for more information, check out Installing Node.js via package manager), then verify the installation:
    node --version
  2. Install the AWS CDK using the following command. You must bootstrap the AWS CDK if this is the first time you’re deploying the solution.
    npm install -g aws-cdk
    cdk --version
    cdk bootstrap aws://ACCOUNT-NUMBER/REGION
  3. Create an AWS profile (if you don’t have one):
    aws configure --profile dms
  4. Download the source code from the GitHub repo:
    git clone git@github.com:aws-samples/dms-cdk.git   # clones over SSH
  5. Compile the code and run the unit tests:
    npm install   # compiles the code and installs the necessary dependencies
    npm test      # runs the unit tests
  6. Create entries in Secrets Manager to store the database credentials for your source and target database for migration. Update the cdk.json file settings for the sourceSecretsManagerSecretId and targetSecretsManagerSecretId entries to point to the Secrets Manager secret ID.
  7. Provide the vpcId, subnetIds, and schema names in the cdk.json file.
  8. Deploy the solution using the profile you created earlier. Provide the account where AWS DMS resources should be provisioned by AWS CDK.
    cdk deploy --profile dms

Clean up

To avoid incurring future charges, delete the stack and related resources when you’re done using the solution:

cdk destroy --profile dms

Conclusion

In this post, we showed you how to use the AWS CDK to quickly configure and deploy AWS DMS across multiple environments. You can also extend the solution to fit your own migration requirements, such as creating AWS DMS tasks and endpoints and configuring settings. Furthermore, you could integrate the solution with your CI/CD pipeline and deploy it across multiple environments. Share your feedback in the comments section. To learn more about this solution, visit AWS CDK and the GitHub repository.


About the Authors

Prasanna Tuladhar is a cloud infrastructure architect at AWS Professional Services based in Germany. He likes to explore new challenges, be it databases, containers, or cloud infrastructure. Outside of work, he likes jogging, hiking, and spending time with his family.

Ramy Nasreldin is a DevOps Architect at AWS, based in Sweden. Ramy helps customers design and implement their systems to run on the AWS Cloud. He also preaches best practices by automating everything, from infrastructures to application delivery, to achieve the most resilient and scalable solutions that best serve end users in a sustainable way. In his spare time, he enjoys swimming, playing football, and spending time with his family.

Rolando Santamaria Maso is a senior cloud application development consultant at AWS Professional Services, based in Germany. He helps customers migrate and modernize workloads in the AWS Cloud, with a special focus on modern application architectures and development best practices, but he also creates IaC using AWS CDK. Outside work, he maintains open-source projects and enjoys spending time with family and friends.