How can I migrate my DynamoDB tables from one AWS account to another?
Last updated: 2021-03-24
I want to perform a cross-account Amazon DynamoDB table migration.
You can migrate your DynamoDB tables to a different AWS account by doing the following:
- Export the DynamoDB table data into an Amazon Simple Storage Service (Amazon S3) bucket in the other account.
- Use an AWS Glue job to import the data.
You can also use AWS Data Pipeline or Amazon EMR to move DynamoDB tables to another AWS account. Data Pipeline is the easiest method, but it provides fewer options for customization. Amazon EMR is a better choice for users with more technical expertise who want more control over the process.
Amazon S3 and AWS Glue
You can migrate your DynamoDB table to a different AWS account using an Amazon S3 bucket and an AWS Glue job.
- You can perform the initial migration of the DynamoDB table by exporting the tables to an Amazon S3 bucket in the other account. For more information, see Exporting DynamoDB table data to Amazon S3.
When you export your tables from Account A to an S3 bucket in Account B, the objects are still owned by Account A. The AWS Identity and Access Management (IAM) users in Account B can't access the objects by default, because the export function doesn't write the data with the bucket-owner-full-control access control list (ACL). To work around this object ownership issue, use the PutObjectAcl API to apply the bucket-owner-full-control ACL to every exported object after the export is complete. This grants the bucket owner in Account B access to all of the exported objects. For more information, see Why can't I access an object that was uploaded to my Amazon S3 bucket by another AWS account?
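The workaround can be scripted. The following is a minimal Python sketch that applies the bucket-owner-full-control ACL to every object under the export prefix; the bucket name and prefix are placeholders, and the function takes a boto3 S3 client created with Account A credentials:

```python
def grant_bucket_owner_full_control(s3_client, bucket, prefix):
    """Apply the bucket-owner-full-control ACL to every object under prefix.

    Run with Account A credentials after the export completes, so that the
    bucket owner (Account B) gains access to the exported objects.
    Returns the number of objects updated.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    updated = 0
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            s3_client.put_object_acl(
                Bucket=bucket,
                Key=obj["Key"],
                ACL="bucket-owner-full-control",
            )
            updated += 1
    return updated
```

For example, call `grant_bucket_owner_full_control(boto3.client("s3"), "account-b-export-bucket", "AWSDynamoDB/")`, where the bucket name is a placeholder and "AWSDynamoDB/" is the default prefix that the export feature writes under.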
- Use an AWS Glue job to read the files from the S3 bucket and write them to the target DynamoDB table. For more information, see Connection types and options for ETL in AWS Glue.
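In a real Glue job you would use the DynamoDB connection types linked above; as an illustration of the underlying import logic, here is a hedged Python sketch. It assumes the native DynamoDB export format: gzipped files in which each line is a JSON document of the form {"Item": {...}} with attribute values in DynamoDB's typed format:

```python
import gzip
import json

def parse_export_line(line):
    """Parse one line of a native DynamoDB S3 export file.

    Each line looks like {"Item": {"pk": {"S": "a"}, ...}}, where the item
    is already in DynamoDB's typed attribute-value format.
    """
    return json.loads(line)["Item"]

def import_export_file(dynamodb_client, table_name, gzipped_bytes):
    """Write every item from one gzipped export file into the target table.

    Returns the number of items written. A production job would batch the
    writes and handle throttling; this sketch writes one item at a time.
    """
    written = 0
    for line in gzip.decompress(gzipped_bytes).splitlines():
        if not line.strip():
            continue
        dynamodb_client.put_item(TableName=table_name, Item=parse_export_line(line))
        written += 1
    return written
```

Because the export is already in DynamoDB's attribute-value format, no type conversion is needed before calling PutItem.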
- After exporting the tables to the Amazon S3 bucket, use Amazon DynamoDB Streams and AWS Lambda to replicate the data insertions and updates in the source table to the destination table in the other account. For more information, see Cross-account replication with Amazon DynamoDB.
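The core of such a Lambda function is applying each stream record to the destination table. The following is a hedged sketch, assuming the stream view type includes new images (NEW_IMAGE or NEW_AND_OLD_IMAGES) and that the function's execution role can call DynamoDB in the destination account; the table name is a placeholder:

```python
def replicate_stream_records(event, dynamodb_client, table_name):
    """Apply a batch of DynamoDB Streams records to the destination table.

    Inserts and updates carry the full new item in NewImage, which is
    already in DynamoDB's typed format, so it can be passed to PutItem
    directly. Deletes are replayed with DeleteItem using the record's Keys.
    """
    for record in event["Records"]:
        change = record["dynamodb"]
        if record["eventName"] in ("INSERT", "MODIFY"):
            dynamodb_client.put_item(TableName=table_name, Item=change["NewImage"])
        elif record["eventName"] == "REMOVE":
            dynamodb_client.delete_item(TableName=table_name, Key=change["Keys"])

def lambda_handler(event, context):
    # Placeholder wiring: in practice, create the client once outside the
    # handler, with credentials for the destination account.
    import boto3  # imported here so the sketch above stays testable without AWS
    client = boto3.client("dynamodb")
    replicate_stream_records(event, client, "destination-table")  # hypothetical name
```

This last-writer-wins replay is a simplification; the cross-account replication article linked above covers the full setup, including the cross-account IAM role.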
AWS Data Pipeline
To move a DynamoDB table to a different account using Data Pipeline, see How can I use Data Pipeline to back up a DynamoDB table to an S3 bucket that is in a different account?
Note: The destination account can't access the DynamoDB data in the Amazon S3 bucket directly. To work with the data, restore it to a DynamoDB table.
Amazon EMR
When you use Amazon EMR to migrate DynamoDB tables, you have two options, depending on your use case:
- If you can afford downtime during the migration, then stop write operations to the source table to ensure that the target table stays in sync with the source table.
- If you can't afford downtime, then you must store all transactions that happen during the migration in a staging table. After the original table is migrated to the other AWS account, push the new transactions from the staging table to the target table.
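Pushing the captured writes from the staging table to the target table is a straightforward scan-and-copy. The following is a minimal Python sketch of that final step, assuming both tables are reachable through one DynamoDB client; table names are placeholders:

```python
def replay_staging_table(dynamodb_client, staging_table, target_table):
    """Copy every item captured in the staging table into the target table.

    Run this after the bulk EMR migration finishes, then cut writes over to
    the target table. Scan results are paginated, so the loop follows
    LastEvaluatedKey until the staging table is exhausted. Returns the
    number of items copied.
    """
    copied = 0
    scan_kwargs = {"TableName": staging_table}
    while True:
        page = dynamodb_client.scan(**scan_kwargs)
        for item in page.get("Items", []):
            dynamodb_client.put_item(TableName=target_table, Item=item)
            copied += 1
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return copied
        scan_kwargs["ExclusiveStartKey"] = last_key
```

A production version would also replay deletes (for example, by storing tombstone markers in the staging table) and batch the writes.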
Note: The time required to migrate tables with Amazon EMR can vary significantly depending on network performance, the DynamoDB table's provisioned throughput, the amount of data stored in the table, and so on.
To migrate a DynamoDB table using Amazon EMR:
1. Launch EMR clusters in both the source and destination accounts. In the Software configuration section, be sure that you choose an option that includes Apache Hive.
Note: It's a security best practice to launch Amazon EMR clusters into private subnets. The private subnets must have an Amazon S3 VPC endpoint and a route to DynamoDB. For more information, see Private subnets. If the clusters need to access the internet, use a NAT gateway that resides in a public subnet. For more information, see VPC with public and private subnets (NAT).
2. Be sure that the EMR_EC2_DefaultRole IAM roles in both accounts have permission to write to the S3 bucket in the destination account. For more information, see Configure IAM service roles for Amazon EMR permissions to AWS services and resources.
3. In the source account, connect to the master node using SSH.
4. In the source account, use Hive commands to export the DynamoDB table data to the S3 bucket in the destination account.
5. In the destination account, import the Amazon S3 data to the new DynamoDB table.
6. If you're using a staging table to capture writes that happened during the migration, repeat steps 4 and 5 on the staging table.