How can I back up a DynamoDB table to Amazon S3?

I want to back up my Amazon DynamoDB table using Amazon Simple Storage Service (Amazon S3).

Short description

DynamoDB offers two built-in backup methods:

On-demand: Create backups when you want.
Point-in-time recovery: Turn on automatic and continuous backups.
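As a rough illustration, both built-in methods can be requested through the AWS SDK for Python (Boto3) with the CreateBackup and UpdateContinuousBackups API operations. The helper names and the table and backup names in this sketch are hypothetical, and the sketch assumes that your credentials and Region are already configured.

```python
def on_demand_backup_request(table_name, backup_name):
    """Build the arguments for the DynamoDB CreateBackup operation."""
    return {"TableName": table_name, "BackupName": backup_name}


def enable_pitr_request(table_name):
    """Build the arguments for the DynamoDB UpdateContinuousBackups operation."""
    return {
        "TableName": table_name,
        "PointInTimeRecoverySpecification": {"PointInTimeRecoveryEnabled": True},
    }


def protect_table(table_name, backup_name):
    """Create one on-demand backup, then turn on point-in-time recovery."""
    # boto3 is imported here so that the request builders above can be used
    # and inspected on a machine without the SDK installed.
    import boto3

    client = boto3.client("dynamodb")
    client.create_backup(**on_demand_backup_request(table_name, backup_name))
    client.update_continuous_backups(**enable_pitr_request(table_name))


# Hypothetical usage:
# protect_table("Orders", "orders-before-migration")
```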

Both of these methods are suitable for backing up your tables for disaster recovery purposes. However, you can't use backups created with these methods for data analysis or for extract, transform, and load (ETL) jobs. The DynamoDB Export to S3 feature is the easiest way to create backups that you can download locally or use with another AWS service. To customize the process of creating backups, you can use Amazon EMR or AWS Glue.

Resolution

DynamoDB Export to S3 feature

With this feature, you can export data from a DynamoDB table to an S3 bucket as of any point in time within your point-in-time recovery window. Point-in-time recovery must be turned on for the table before you can export it. For more information, see DynamoDB data export to Amazon S3.

For an example of how to use this feature, see Export Amazon DynamoDB table data to your data lake in Amazon S3, no code writing required.
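If you prefer the SDK to the console, the following is a minimal sketch of an export request that uses the Boto3 export_table_to_point_in_time call. The function names, table ARN, bucket, and prefix are placeholders chosen for illustration. When no export time is passed, the sketch leaves that field out and relies on the service default.

```python
def export_request(table_arn, bucket, prefix, export_time=None,
                   export_format="DYNAMODB_JSON"):
    """Build the arguments for the ExportTableToPointInTime operation."""
    if export_format not in ("DYNAMODB_JSON", "ION"):
        raise ValueError("export_format must be DYNAMODB_JSON or ION")
    request = {
        "TableArn": table_arn,
        "S3Bucket": bucket,
        "S3Prefix": prefix,
        "ExportFormat": export_format,
    }
    if export_time is not None:
        # A datetime inside the table's point-in-time recovery window.
        request["ExportTime"] = export_time
    return request


def start_export(table_arn, bucket, prefix, export_time=None):
    """Start the export and return its ARN so that you can track progress."""
    # Imported lazily so that export_request() works without the SDK.
    import boto3

    client = boto3.client("dynamodb")
    response = client.export_table_to_point_in_time(
        **export_request(table_arn, bucket, prefix, export_time)
    )
    return response["ExportDescription"]["ExportArn"]


# Hypothetical usage (placeholder ARN and bucket):
# start_export("arn:aws:dynamodb:us-east-1:111122223333:table/Orders",
#              "amzn-s3-demo-bucket", "exports/orders")
```

The export runs asynchronously, so the returned ARN is only a handle. You can pass it to describe_export to check the status.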

After you export your data with the Export to S3 feature, you can use the data in other ways, including the following:

Perform ETL against the exported data on S3, and then import the data back to DynamoDB
Retain historical snapshots for auditing
Integrate the data with other services or applications
Build an S3 data lake from the DynamoDB data, and then analyze the data from various services, such as Amazon Athena, Amazon Redshift, or Amazon SageMaker
Run as-needed queries on your data from Athena or Amazon EMR without affecting your DynamoDB capacity

Note the following pros and cons when using this feature:

Pros: This feature allows you to export data across AWS Regions and accounts without building custom applications or writing code. The exports don't affect the read capacity or the availability of your production tables.
Cons: This feature exports the table data in DynamoDB JSON or Amazon Ion format only. To import the data from an S3 bucket back into DynamoDB natively, see DynamoDB data import from Amazon S3. You can also create a new template or use AWS Glue, Amazon EMR, or the AWS SDK to reimport the data.
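Because the export uses DynamoDB JSON, downstream tools often need to convert the type-tagged values into plain values first. The following sketch assumes that you have downloaded a data file locally and that the file is gzip-compressed with one {"Item": ...} object per line. The function names are hypothetical, and binary attributes (B and BS) are deliberately left unhandled.

```python
import gzip
import json
from decimal import Decimal


def from_dynamodb_json(value):
    """Convert one type-tagged DynamoDB JSON value into a plain Python value."""
    type_tag, inner = next(iter(value.items()))
    if type_tag == "S":
        return inner
    if type_tag == "N":
        return Decimal(inner)  # numbers are exported as strings
    if type_tag == "BOOL":
        return inner
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [from_dynamodb_json(element) for element in inner]
    if type_tag == "M":
        return {key: from_dynamodb_json(element) for key, element in inner.items()}
    if type_tag == "SS":
        return set(inner)
    if type_tag == "NS":
        return {Decimal(number) for number in inner}
    raise ValueError(f"unsupported DynamoDB type tag in this sketch: {type_tag}")


def read_export_file(path):
    """Yield each item in a downloaded export data file as a plain dict."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                record = json.loads(line)
                # The top-level item is a map of attribute names to tagged values.
                yield from_dynamodb_json({"M": record["Item"]})
```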

Amazon EMR

Use Amazon EMR to export your data to an S3 bucket. You can do so with either of these methods:

Run Hive or Spark queries against DynamoDB tables using DynamoDBStorageHandler. For more information, see Exporting data from DynamoDB.
Use the open-source emr-dynamodb-tool on GitHub to export and import DynamoDB tables.

Note the following pros and cons when using these methods:

Pros: If you're an active Amazon EMR user and are comfortable with Hive or Spark, then you can manage your configurations better with these methods than with the native Export to S3 function. You can also use existing clusters for this purpose.
Cons: These methods require you to create and maintain an EMR cluster. If you use DynamoDBStorageHandler, then you must be familiar with Hive or Spark.

AWS Glue

Use AWS Glue to copy your table to Amazon S3. For more information, see Using AWS Glue and Amazon DynamoDB export.

Pros: Because AWS Glue is a serverless service, you don't need to create and maintain resources. You can directly write back to DynamoDB. You can add custom ETL logic for use cases, such as filtering and converting, when exporting data. You can also choose your preferred format from CSV, JSON, Parquet, or ORC. For more information, see Data format options for inputs and outputs in AWS Glue.
Cons: If you choose this option, you must know how to use Spark. You also must maintain the source code for your AWS Glue ETL job. For more information, see "connectionType": "dynamodb".
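As an illustration of the AWS Glue approach, the following sketch reads a table with the "dynamodb" connection type and writes it to Amazon S3 in a chosen format. It is meant to run inside an AWS Glue Spark job, where the awsglue and pyspark modules are available. The function names, table name, and S3 path are hypothetical, and the read percentage shown is only an example value.

```python
SUPPORTED_FORMATS = ("csv", "json", "parquet", "orc")


def dynamodb_read_options(table_name, read_percent="0.5"):
    """Connection options for reading a table with connection_type="dynamodb"."""
    return {
        "dynamodb.input.tableName": table_name,
        # Fraction of the table's read capacity that the job may consume.
        "dynamodb.throughput.read.percent": read_percent,
    }


def run_job(table_name, s3_path, output_format="parquet"):
    """Copy the table to Amazon S3; call this from the AWS Glue job script."""
    if output_format not in SUPPORTED_FORMATS:
        raise ValueError(f"choose one of {SUPPORTED_FORMATS}")
    # These modules exist only in the AWS Glue runtime, so import them lazily.
    from awsglue.context import GlueContext
    from pyspark.context import SparkContext

    glue_context = GlueContext(SparkContext.getOrCreate())
    frame = glue_context.create_dynamic_frame.from_options(
        connection_type="dynamodb",
        connection_options=dynamodb_read_options(table_name),
    )
    # Custom ETL logic, such as filtering or type conversion, would go here.
    glue_context.write_dynamic_frame.from_options(
        frame=frame,
        connection_type="s3",
        connection_options={"path": s3_path},
        format=output_format,
    )


# Hypothetical usage inside a Glue job:
# run_job("Orders", "s3://amzn-s3-demo-bucket/glue-export/orders/")
```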

If none of these options offer the flexibility that you need, then you can use the DynamoDB API to create your own solution.
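One possible shape for such a custom solution is a paginated Scan that writes each page to Amazon S3 as JSON lines. This is only a sketch. The function name and key layout are hypothetical, the clients are passed in by the caller, and the code assumes that the table has no binary attributes because raw bytes aren't JSON serializable. Unlike the Export to S3 feature, a Scan consumes the table's read capacity.

```python
import json


def export_with_scan(dynamodb_client, s3_client, table_name, bucket, prefix):
    """Scan the whole table and write one JSON-lines object per page to S3.

    Returns the list of S3 keys that were written.
    """
    paginator = dynamodb_client.get_paginator("scan")
    written_keys = []
    for number, page in enumerate(paginator.paginate(TableName=table_name)):
        # The low-level client returns items in DynamoDB JSON, for example
        # {"pk": {"S": "a"}}, which serializes directly when no binary
        # attributes are present.
        body = "\n".join(json.dumps(item, sort_keys=True) for item in page["Items"])
        key = f"{prefix}/page-{number:05d}.json"
        s3_client.put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
        written_keys.append(key)
    return written_keys


# Hypothetical usage with real clients:
# import boto3
# export_with_scan(boto3.client("dynamodb"), boto3.client("s3"),
#                  "Orders", "amzn-s3-demo-bucket", "custom-backup/orders")
```

Passing the clients in as arguments keeps the function easy to exercise with stand-in objects before you point it at a production table.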

Related information

Requesting a table export in DynamoDB

How to export an Amazon DynamoDB table to Amazon S3 using AWS Step Functions and AWS Glue
