How do I issue a bulk upload to a DynamoDB table?
Last updated: 2022-10-20
I want to upload data in bulk to my Amazon DynamoDB table. How can I do this?
Use one of the following options to upload data to DynamoDB in bulk.
BatchWriteItem
Use the BatchWriteItem API operation to send multiple put requests in a single call. Each call can contain up to 25 put or delete requests. To make the data load faster, you can also use parallel processes or threads in your code to issue multiple BatchWriteItem calls at the same time.
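As an illustration of the BatchWriteItem approach, the following is a minimal Python sketch, not a definitive implementation. It assumes the boto3 low-level DynamoDB client; the table name ExampleTable, the attribute names pk and score, and the helper names chunked, batch_write, and load_example are hypothetical. The sketch splits the items into groups of 25 and resends any UnprocessedItems that DynamoDB returns.

```python
import time
from typing import Any, Dict, Iterable, Iterator, List

MAX_BATCH_SIZE = 25  # BatchWriteItem accepts at most 25 put/delete requests per call


def chunked(items: Iterable[Any], size: int = MAX_BATCH_SIZE) -> Iterator[List[Any]]:
    """Yield successive lists of at most `size` items."""
    batch: List[Any] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def batch_write(client: Any, table_name: str,
                items: Iterable[Dict[str, Any]], max_retries: int = 5) -> int:
    """Write items with BatchWriteItem and resend unprocessed ones.

    `client` is expected to behave like boto3.client("dynamodb").
    Returns the number of BatchWriteItem calls that were made.
    """
    calls = 0
    for batch in chunked(items):
        request = {table_name: [{"PutRequest": {"Item": item}} for item in batch]}
        attempt = 0
        while request:
            response = client.batch_write_item(RequestItems=request)
            calls += 1
            # DynamoDB returns the writes it could not complete (for example, when throttled)
            request = response.get("UnprocessedItems") or {}
            if request:
                attempt += 1
                if attempt > max_retries:
                    raise RuntimeError("items still unprocessed after retries")
                time.sleep(min(0.05 * 2 ** attempt, 1.0))  # simple exponential backoff
    return calls


def load_example() -> None:
    """Hypothetical usage; needs AWS credentials and an existing table."""
    import boto3

    dynamodb = boto3.client("dynamodb")
    rows = ({"pk": {"S": f"user#{i}"}, "score": {"N": str(i)}} for i in range(100))
    batch_write(dynamodb, "ExampleTable", rows)
```

To parallelize the load, you could split the input into partitions and run batch_write for each partition in its own thread or process.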
AWS Data Pipeline
If the data is in Amazon Simple Storage Service (Amazon S3), then you can use Data Pipeline to load it into DynamoDB. Data Pipeline automates the process of creating an Amazon EMR cluster and copying your data from Amazon S3 to DynamoDB in parallel BatchWriteItem requests. When you use Data Pipeline, you don't have to write the code for the parallel transfer. For more information, see Importing data from Amazon S3 to DynamoDB.
Import Table feature
If the data is stored in Amazon S3, then you can upload the data to a new DynamoDB table with the Import Table feature. This feature supports the CSV, DynamoDB JSON, and Amazon Ion formats, either compressed (GZIP or ZSTD) or uncompressed. For more information, see DynamoDB data import from Amazon S3: how it works.
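The Import Table feature can also be started from code. The following Python sketch shows one possible shape of the request, assuming the boto3 DynamoDB client and its import_table operation. The helper names build_import_request and start_import are hypothetical, and so are the choices of a single string partition key and on-demand billing for the new table.

```python
from typing import Any, Dict

VALID_FORMATS = {"CSV", "DYNAMODB_JSON", "ION"}
VALID_COMPRESSION = {"GZIP", "ZSTD", "NONE"}


def build_import_request(bucket: str, prefix: str, table_name: str, partition_key: str,
                         input_format: str = "CSV",
                         compression: str = "NONE") -> Dict[str, Any]:
    """Build keyword arguments for the DynamoDB ImportTable operation."""
    if input_format not in VALID_FORMATS:
        raise ValueError(f"unsupported input format: {input_format}")
    if compression not in VALID_COMPRESSION:
        raise ValueError(f"unsupported compression type: {compression}")
    return {
        "S3BucketSource": {"S3Bucket": bucket, "S3KeyPrefix": prefix},
        "InputFormat": input_format,
        "InputCompressionType": compression,
        # The import always creates a new table, so its schema is part of the request.
        # Assumption: one string partition key and on-demand capacity.
        "TableCreationParameters": {
            "TableName": table_name,
            "AttributeDefinitions": [
                {"AttributeName": partition_key, "AttributeType": "S"}
            ],
            "KeySchema": [{"AttributeName": partition_key, "KeyType": "HASH"}],
            "BillingMode": "PAY_PER_REQUEST",
        },
    }


def start_import(client: Any, **kwargs: Any) -> str:
    """Start the import and return its ARN.

    `client` is expected to behave like boto3.client("dynamodb").
    """
    response = client.import_table(**build_import_request(**kwargs))
    return response["ImportTableDescription"]["ImportArn"]
```

With real credentials, a call might look like start_import(boto3.client("dynamodb"), bucket="example-bucket", prefix="data/", table_name="NewTable", partition_key="pk", compression="GZIP"), where the bucket, prefix, and table names are placeholders.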
Amazon EMR
To upload data to DynamoDB with Amazon EMR and Apache Hive:
- Create an EMR cluster:
  - For Release, choose emr-5.30.0 or later.
  - For Applications, choose an option that includes Hive.
- Create an external Hive table that points to the Amazon S3 location for your data.
- Create another external Hive table, and point it to the DynamoDB table.
- Use the INSERT OVERWRITE command to write data from Amazon S3 to DynamoDB. For more information, see Importing data to DynamoDB.
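The Hive steps above can be sketched as a small Python helper that generates the three HiveQL statements. This is only an illustration under stated assumptions: the source data is comma-delimited text, the Hive column names match the DynamoDB attribute names, and the names s3_source, ddb_target, ExampleTable, the S3 path, and build_hive_script are all hypothetical. The storage handler class and the dynamodb.table.name and dynamodb.column.mapping properties are the ones that the EMR DynamoDB connector uses.

```python
from typing import Dict, List


def build_hive_script(s3_location: str, dynamodb_table: str,
                      columns: Dict[str, str]) -> List[str]:
    """Return HiveQL statements that copy delimited S3 data into DynamoDB.

    `columns` maps Hive column names to Hive types, for example {"id": "string"}.
    Assumption: each Hive column maps to a DynamoDB attribute of the same name.
    """
    col_defs = ", ".join(f"{name} {hive_type}" for name, hive_type in columns.items())
    mapping = ",".join(f"{name}:{name}" for name in columns)
    return [
        # External table over the files in Amazon S3
        f"CREATE EXTERNAL TABLE s3_source ({col_defs})\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{s3_location}'",
        # External table that points to the DynamoDB table
        f"CREATE EXTERNAL TABLE ddb_target ({col_defs})\n"
        "STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'\n"
        f'TBLPROPERTIES ("dynamodb.table.name" = "{dynamodb_table}", '
        f'"dynamodb.column.mapping" = "{mapping}")',
        # Write the S3 data to DynamoDB
        "INSERT OVERWRITE TABLE ddb_target SELECT * FROM s3_source",
    ]


script = build_hive_script(
    "s3://example-bucket/data/",  # placeholder location
    "ExampleTable",               # placeholder DynamoDB table name
    {"id": "string", "score": "bigint"},
)
print(";\n\n".join(script) + ";")
```

You could paste the printed statements into the Hive shell on the primary node of the cluster, or save them to a file and run that file with hive -f.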
AWS Database Migration Service (AWS DMS)
You can use AWS DMS to export data from a relational database to a DynamoDB table. For more information, see Using an Amazon DynamoDB database as a target for AWS Database Migration Service.
AWS Glue
Use this option if the data that you're uploading was originally exported to Amazon S3 from a different DynamoDB table with the DynamoDB export feature. This option is efficient for uploading large datasets because the export feature uses the DynamoDB backup functionality and doesn't scan the source table, so there is no impact on the performance or availability of the source table. For more information, see Using AWS Glue and Amazon DynamoDB export.