How can I copy all objects from one Amazon S3 bucket to another bucket?

Last updated: 2021-05-19

I want to copy or move all my objects from one Amazon Simple Storage Service (Amazon S3) bucket to another bucket. How can I migrate objects between my S3 buckets?

Short description

To copy objects from one S3 bucket to another, follow these steps:

1.    Create a new S3 bucket.

2.    Install and configure the AWS Command Line Interface (AWS CLI).

3.    Copy the objects between the S3 buckets.

Note: On large buckets (10 million objects or more), the aws s3 ls and aws s3 sync commands make many list calls, which can be expensive and can cause the commands to time out. If you encounter timeouts because of a large bucket, consider using Amazon CloudWatch metrics to calculate the size and number of objects in the bucket. Also, consider using S3 Batch Operations to copy the objects.

4.    Verify that the objects are copied.

5.    Update existing API calls to the target bucket name.

Before you begin, consider the following:

Resolution

Create a new S3 bucket

1.    Open the Amazon S3 console.

2.    Choose Create Bucket.

3.    Choose a DNS-compliant name for your new bucket.

4.    Select your AWS Region.

Tip: To avoid performance issues caused by cross-Region traffic, create the target bucket in the same Region as the source bucket.

5.    Optionally, choose Copy settings from an existing bucket to mirror the configuration of the source bucket.
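
Before creating the bucket, a candidate name can be checked locally. The following is a minimal sketch, assuming the main naming rules for general purpose buckets: 3 to 63 characters; lowercase letters, numbers, periods, and hyphens only; a letter or number at each end; no adjacent periods; and not formatted like an IP address. The function name is_valid_bucket_name is hypothetical, and the check is not exhaustive (for example, it doesn't check reserved prefixes or suffixes).

```shell
# Rough local check of the main S3 bucket naming rules (hypothetical helper).
# Returns 0 if the name passes these checks, 1 otherwise.
is_valid_bucket_name() {
  name=$1
  len=${#name}

  # Length must be between 3 and 63 characters.
  [ "$len" -ge 3 ] && [ "$len" -le 63 ] || return 1

  # Lowercase letters, digits, periods, and hyphens only,
  # beginning and ending with a letter or digit.
  printf '%s\n' "$name" |
    LC_ALL=C grep -Eq '^[a-z0-9][a-z0-9.-]*[a-z0-9]$' || return 1

  # No two adjacent periods.
  case $name in
    *..*) return 1 ;;
  esac

  # Must not be formatted like an IPv4 address (for example, 192.168.5.4).
  if printf '%s\n' "$name" |
    LC_ALL=C grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$'; then
    return 1
  fi

  return 0
}
```

For example, is_valid_bucket_name my-target-bucket && echo "name looks valid" prints the message only if the name passes every check. The uppercase DOC-EXAMPLE-BUCKET names used in this article are placeholders and wouldn't pass.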

Install and configure the AWS CLI

1.    Install the AWS CLI.

2.    Configure the AWS CLI by running the following command:

aws configure

Note: If you receive errors when running AWS CLI commands, make sure that you’re using the most recent version of the AWS CLI.

3.    Enter your access keys (access key ID and secret access key).

4.    Press Enter to skip the default Region and default output options. For more information about Amazon S3 Region parameters, see AWS service endpoints.

Note: The AWS CLI outputs are JSON, text, or table, but not all the commands support each type of output. For more information, see Controlling command output from the AWS CLI.
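
After configuration, it can help to confirm that the AWS CLI is installed and that the configured access keys work before copying anything. The following is a minimal sketch. The function name check_aws_cli is hypothetical; aws --version and aws sts get-caller-identity are standard AWS CLI commands.

```shell
# Confirm that the AWS CLI is available and that the configured
# credentials are accepted (hypothetical helper).
check_aws_cli() {
  if ! command -v aws >/dev/null 2>&1; then
    echo "AWS CLI not found on PATH" >&2
    return 1
  fi

  # Print the installed AWS CLI version.
  aws --version || return 1

  # Print the account ID and ARN that the configured access keys belong to.
  # This call fails if the keys are missing or invalid.
  aws sts get-caller-identity --output json
}
```

Running check_aws_cli prints the CLI version followed by the caller identity. A non-zero exit status suggests that the installation or the access keys need attention.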

Copy the objects between the S3 buckets

1.    If you archived S3 objects in the Amazon Simple Storage Service Glacier storage class, restore the objects.

2.    Copy the objects between the source and target buckets by running the following sync command using the AWS CLI:

aws s3 sync s3://DOC-EXAMPLE-BUCKET-SOURCE s3://DOC-EXAMPLE-BUCKET-TARGET

Note: Update the sync command to include your source and target bucket names.

The sync command uses the CopyObject API to copy objects between S3 buckets. The sync command lists the source and target buckets to identify objects that are in the source bucket but that aren't in the target bucket. The command also identifies objects in the source bucket that have different LastModified dates than the corresponding objects in the target bucket.

When you use the sync command on a versioned bucket, only the current version of each object is copied. Previous versions aren't copied. By default, the sync command preserves object metadata. However, the access control lists (ACLs) are set to FULL_CONTROL for your AWS account, which removes any additional ACLs.

If the operation fails, you can run the sync command again without duplicating previously copied objects. To troubleshoot issues with the sync operation, see Why can't I copy an object between two Amazon S3 buckets?

3.    (Optional) If you encounter a timeout, use the cloudwatch get-metric-statistics command to calculate the number of objects in your bucket:

aws cloudwatch get-metric-statistics --namespace AWS/S3 --metric-name NumberOfObjects --dimensions Name=BucketName,Value=DOC-EXAMPLE-BUCKET-SOURCE Name=StorageType,Value=AllStorageTypes --start-time 2021-05-11T00:00 --end-time 2021-05-11T00:10 --period 600 --statistics Average --output json

4.    (Optional) If you encounter a timeout, use the cloudwatch get-metric-statistics command to retrieve your bucket size:

aws cloudwatch get-metric-statistics --namespace AWS/S3 --metric-name BucketSizeBytes --dimensions Name=BucketName,Value=DOC-EXAMPLE-BUCKET-SOURCE Name=StorageType,Value=StandardStorage --start-time 2021-05-11T00:00 --end-time 2021-05-11T00:10 --period 3600 --statistics Average --unit Bytes --output json

Note: List calls can be very expensive and can cause the command to time out. For large buckets, consider using Amazon CloudWatch metrics to calculate the size of the bucket and total number of objects instead. However, because Amazon S3 reports these storage metrics to CloudWatch only once a day, the reported object count and bucket size can differ from the list command results.
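
The two get-metric-statistics commands above differ only in the metric name and storage type, so they can be wrapped in one helper. The following is a minimal sketch. The function name s3_daily_metric and its argument order are hypothetical. It assumes a one-day period (86400 seconds) because the storage metrics are reported once a day, and it uses the AWS CLI --query option to print only the Average of the most recent data point.

```shell
# Print the most recent daily average of an S3 storage metric
# (hypothetical helper).
# Arguments: bucket, metric name, storage type, start time, end time
# (times in ISO 8601 format, UTC).
s3_daily_metric() {
  aws cloudwatch get-metric-statistics \
    --namespace AWS/S3 \
    --metric-name "$2" \
    --dimensions Name=BucketName,Value="$1" Name=StorageType,Value="$3" \
    --start-time "$4" \
    --end-time "$5" \
    --period 86400 \
    --statistics Average \
    --query 'sort_by(Datapoints, &Timestamp)[-1].Average' \
    --output text
}
```

For example, s3_daily_metric DOC-EXAMPLE-BUCKET-SOURCE NumberOfObjects AllStorageTypes 2021-05-09T00:00 2021-05-11T00:00 asks for the object count over a two-day window, and passing BucketSizeBytes with StandardStorage asks for the bucket size. If no data point falls inside the window, the text output is None.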

Verify that the objects are copied

1.    Verify the contents of the source and target buckets by running the following commands:

aws s3 ls --recursive s3://DOC-EXAMPLE-BUCKET-SOURCE --summarize > bucket-contents-source.txt
        
aws s3 ls --recursive s3://DOC-EXAMPLE-BUCKET-TARGET --summarize > bucket-contents-target.txt

Note: Update the list command to include your source and target bucket names.

2.    Compare the objects in the source and target buckets by using the output files, which are saved in the directory where you ran the commands. See the following example output:

$ aws s3 ls --recursive s3://DOC-EXAMPLE-BUCKET --summarize
2017-11-20 21:17:39      15362 s3logo.png

  Total Objects: 1        Total Size: 15362
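
The two saved listings can be compared with standard text tools. The following is a minimal sketch. The function names normalize_listing and compare_listings are hypothetical. It assumes the listing format shown in the example output above (date, time, size, key) and drops the date and time columns before comparing, because the LastModified values of copied objects in the target bucket generally differ from those in the source bucket.

```shell
# Reduce a saved "aws s3 ls --recursive --summarize" listing to sorted
# "size key" lines (hypothetical helper).
normalize_listing() {
  # Keep only object lines (they start with a date), which skips blank
  # lines and the "Total Objects" / "Total Size" summary lines.
  # Then strip the date and time columns and sort the result.
  grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2} ' "$1" |
    sed -E 's/^[^ ]+ +[^ ]+ +//' |
    LC_ALL=C sort
}

# Compare two saved listings by size and key (hypothetical helper).
# Prints the differences; returns 0 only if the listings match.
compare_listings() {
  normalize_listing "$1" > "$1.normalized"
  normalize_listing "$2" > "$2.normalized"
  diff "$1.normalized" "$2.normalized"
}
```

For example, compare_listings bucket-contents-source.txt bucket-contents-target.txt prints nothing and returns 0 when every key appears in both listings with the same size. Lines prefixed with < appear only in the source listing, and lines prefixed with > appear only in the target listing.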

Update existing API calls to the target bucket name

Update any existing applications or workloads so that they use the target bucket name. You might need to run sync commands to address discrepancies between source and target buckets if you have frequent writes.
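
A follow-up sync can be previewed before it runs. The following is a minimal sketch. The function name resync_if_needed is hypothetical. It relies on the --dryrun option of aws s3 sync, which prints the operations that would be performed without performing them, and it assumes that each of those lines starts with (dryrun).

```shell
# Preview a follow-up sync and run it only if objects are pending
# (hypothetical helper). Arguments: source bucket, target bucket.
resync_if_needed() {
  src=$1
  dst=$2

  # Count the operations that a sync would perform, without copying anything.
  # "|| true" keeps a count of 0 from being treated as a failure,
  # because grep exits with status 1 when nothing matches.
  pending=$(aws s3 sync "s3://$src" "s3://$dst" --dryrun |
    grep -c '^(dryrun)' || true)
  echo "Objects pending copy: $pending"

  # Run the real sync only when the preview found differences.
  if [ "$pending" -gt 0 ]; then
    aws s3 sync "s3://$src" "s3://$dst"
  fi
}
```

For example, resync_if_needed DOC-EXAMPLE-BUCKET-SOURCE DOC-EXAMPLE-BUCKET-TARGET reports how many objects are out of date in the target bucket and copies them if there are any. On very large buckets, the preview itself lists both buckets and is subject to the same timeouts as any other sync.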