How can I copy all objects from one Amazon S3 bucket to another bucket?
Last updated: 2021-05-19
I want to copy or move all my objects from one Amazon Simple Storage Service (Amazon S3) bucket to another bucket. How can I migrate objects between my S3 buckets?
To copy objects from one S3 bucket to another, follow these steps:
1. Create a new S3 bucket.
2. Install and configure the AWS Command Line Interface (AWS CLI).
3. Copy the objects between the S3 buckets.
Note: Using the aws s3 ls or aws s3 sync commands on large buckets (with 10 million objects or more) can be expensive, resulting in a timeout. If you encounter timeouts because of a large bucket, consider using Amazon CloudWatch metrics to calculate the size and number of objects in a bucket. Also, consider using S3 Batch Operations to copy the objects.
4. Verify that the objects are copied.
5. Update existing API calls to the target bucket name.
Before you begin, consider the following:
- If you have many objects in your S3 bucket (more than 10 million objects), consider using S3 Batch Operations. You can use S3 Batch Operations to automate the copy process.
- To copy objects across AWS accounts, set up the correct cross-account permissions on the bucket and the relevant AWS Identity and Access Management (IAM) role.
- If you're using AWS CLI version 2 to copy objects across buckets, your IAM role must also have tagging permissions: s3:GetObjectTagging on the source objects and s3:PutObjectTagging on the destination objects.
- To increase the performance of the sync process, tune the AWS CLI to use a higher concurrency. You can also split sync commands across different prefixes to optimize your S3 bucket performance. For more information about optimizing the performance of your workload, see Best practices design patterns: Optimizing Amazon S3 performance.
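As a rough sketch of the last point, the following functions raise the AWS CLI's S3 concurrency and run one sync per prefix in parallel. The function names, the concurrency value, and the prefixes are illustrative assumptions; max_concurrent_requests is the AWS CLI S3 setting that controls concurrency, and its default is 10.

```shell
# Raise the number of concurrent S3 requests the AWS CLI may issue.
# $1: desired concurrency (illustrative; pick a value your host and network can sustain)
tune_s3_concurrency() {
  aws configure set default.s3.max_concurrent_requests "$1"
}

# Run one "aws s3 sync" per prefix in the background, then wait for all of them.
# $1: source bucket, $2: target bucket, remaining arguments: prefixes such as "logs/"
sync_by_prefix() {
  src_bucket="$1"
  dst_bucket="$2"
  shift 2
  for prefix in "$@"; do
    aws s3 sync "s3://${src_bucket}/${prefix}" "s3://${dst_bucket}/${prefix}" &
  done
  wait  # blocks until every background sync has exited
}
```

For example, running tune_s3_concurrency 20 and then sync_by_prefix DOC-EXAMPLE-BUCKET-SOURCE DOC-EXAMPLE-BUCKET-TARGET logs/ images/ would start two syncs at the same time. A plain wait doesn't report whether an individual sync failed, so review each command's output.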
Create a new S3 bucket
1. Open the Amazon S3 console.
2. Choose Create Bucket.
3. Choose a DNS-compliant name for your new bucket.
4. Select your AWS Region.
Tip: To avoid performance issues caused by cross-Region traffic, create the target bucket in the same Region as the source bucket.
5. Optionally, choose Copy settings from an existing bucket to mirror the configuration of the source bucket.
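If you prefer the AWS CLI to the console, the steps above can be approximated as follows. This is a minimal sketch: the function names are hypothetical, and the name check covers only a subset of the bucket naming rules (length, allowed characters, first and last character, and adjacent periods). It doesn't check every rule, such as names formatted like IP addresses.

```shell
# Return success if the name passes a subset of the S3 bucket naming rules.
is_valid_bucket_name() (
  LC_ALL=C  # byte-wise ranges, so a-z never matches uppercase letters
  name="$1"
  # Names must be 3 to 63 characters long.
  [ "${#name}" -ge 3 ] && [ "${#name}" -le 63 ] || exit 1
  case "$name" in
    *[!a-z0-9.-]*) exit 1 ;;     # only lowercase letters, digits, periods, hyphens
    [.-]* | *[.-]) exit 1 ;;     # must begin and end with a letter or digit
    *..*) exit 1 ;;              # no adjacent periods
  esac
  exit 0
)

# Create the target bucket in the given Region after checking its name.
# $1: bucket name, $2: AWS Region (ideally the same Region as the source bucket)
create_target_bucket() {
  if ! is_valid_bucket_name "$1"; then
    echo "invalid bucket name: $1" >&2
    return 1
  fi
  aws s3 mb "s3://$1" --region "$2"
}
```

For example, create_target_bucket my-target-bucket us-east-1 runs aws s3 mb for that bucket, and a name that contains uppercase letters is rejected before any request is sent. Unlike the console option in step 5, this sketch doesn't copy settings from an existing bucket.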
Install and configure the AWS CLI
1. Install the AWS CLI.
2. Configure the AWS CLI by running the following command:
aws configure
Note: If you receive errors when running AWS CLI commands, make sure that you’re using the most recent version of the AWS CLI.
3. Enter your access keys (access key ID and secret access key).
4. Press Enter to skip the default Region and default output options. For more information about Amazon S3 Region parameters, see AWS service endpoints.
Note: The AWS CLI outputs are JSON, text, or table, but not all the commands support each type of output. For more information, see Controlling command output from the AWS CLI.
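After you configure the AWS CLI, it can help to confirm the setup before you copy anything. The following sketch groups three existing AWS CLI commands in a hypothetical helper function: one prints the installed version, one shows the active configuration, and one returns the identity that your access keys resolve to.

```shell
# Print the CLI version, the active configuration, and the caller identity.
# Stops at the first command that fails.
check_cli_setup() {
  aws --version &&
  aws configure list &&
  aws sts get-caller-identity --output json
}
```

If the last command returns an error, the access keys are likely missing or invalid, and later copy commands would fail for the same reason.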
Copy the objects between the S3 buckets
1. If you archived S3 objects in the Amazon Simple Storage Service Glacier storage class, restore the objects.
2. Copy the objects between the source and target buckets by running the following sync command using the AWS CLI:
aws s3 sync s3://DOC-EXAMPLE-BUCKET-SOURCE s3://DOC-EXAMPLE-BUCKET-TARGET
Note: Update the sync command to include your source and target bucket names.
The sync command uses the CopyObject APIs to copy objects between S3 buckets. It lists the source and target buckets to identify objects that are in the source bucket but aren't in the target bucket. The command also identifies objects in the source bucket that have different LastModified dates than the objects in the target bucket.
When you use the sync command on a versioned bucket, only the current version of each object is copied; previous versions are not copied. By default, this behavior preserves object metadata, although the access control lists (ACLs) are set to FULL_CONTROL for your AWS account, which removes any additional ACLs.
If the operation fails, you can run the sync command again without duplicating previously copied objects. To troubleshoot issues with the sync operation, see Why can't I copy an object between two Amazon S3 buckets?
3. (Optional) If you encounter a timeout, use the cloudwatch get-metric-statistics command to calculate the number of objects in your bucket:
$ aws cloudwatch get-metric-statistics --namespace AWS/S3 --metric-name NumberOfObjects --dimensions Name=BucketName,Value=DOC-EXAMPLE-BUCKET-SOURCE Name=StorageType,Value=AllStorageTypes --start-time 2021-05-11T00:00 --end-time 2021-05-11T00:10 --period 600 --statistics Average --output json
4. (Optional) If you encounter a timeout, use the cloudwatch get-metric-statistics command to retrieve your bucket size:
$ aws cloudwatch get-metric-statistics --namespace AWS/S3 --metric-name BucketSizeBytes --dimensions Name=BucketName,Value=DOC-EXAMPLE-BUCKET-SOURCE Name=StorageType,Value=StandardStorage --start-time 2021-05-11T00:00 --end-time 2021-05-11T00:10 --period 3600 --statistics Average --unit Bytes --output json
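Because the sync command copies only the objects that are missing or changed in the target bucket, you can preview a run, or check what remains after a failed run, with the command's --dryrun option. The following is a minimal sketch: the function names are hypothetical, and the count assumes that each pending copy is reported on a line that starts with "(dryrun) copy:".

```shell
# Show what sync would copy, without copying anything.
# $1: source bucket, $2: target bucket
preview_sync() {
  aws s3 sync "s3://$1" "s3://$2" --dryrun
}

# Count the objects that a sync would still copy.
# Assumes dry-run lines start with "(dryrun) copy:".
# Note: grep -c prints 0 but exits with status 1 when nothing is pending.
count_pending_copies() {
  preview_sync "$1" "$2" | grep -c '^(dryrun) copy:'
}
```

For example, count_pending_copies DOC-EXAMPLE-BUCKET-SOURCE DOC-EXAMPLE-BUCKET-TARGET prints 0 when the target bucket already contains every object that the sync command would copy.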
Verify that the objects are copied
1. Verify the contents of the source and target buckets by running the following commands:
aws s3 ls --recursive s3://DOC-EXAMPLE-BUCKET-SOURCE --summarize > bucket-contents-source.txt
aws s3 ls --recursive s3://DOC-EXAMPLE-BUCKET-TARGET --summarize > bucket-contents-target.txt
Note: Update the list command to include your source and target bucket names.
2. Compare the objects in the source and target buckets by using the outputs that are saved to the two files in the directory where you ran the commands. See the following example output:
$ aws s3 ls --recursive s3://DOC-EXAMPLE-BUCKET --summarize
2017-11-20 21:17:39      15362 s3logo.png
Total Objects: 1
Total Size: 15362
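Copied objects get new timestamps in the target bucket, so a line-by-line comparison of the two saved files reports every object as different. One possible approach, sketched below with hypothetical function names, is to compare only the size and key of each object. The sketch assumes that every object line starts with a date and time in the format shown in the example output.

```shell
# Print "size key" for every object line in a saved "aws s3 ls --recursive" listing.
# Summary lines and blank lines don't start with a timestamp, so they are dropped.
listing_keys() {
  sed -n -E 's/^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2} +//p' "$1" | sort
}

# Compare two saved listings by size and key, ignoring timestamps.
# Prints diff output and returns 1 if the listings differ, 0 if they match.
compare_listings() {
  src_tmp=$(mktemp) && tgt_tmp=$(mktemp) || return 2
  listing_keys "$1" > "$src_tmp"
  listing_keys "$2" > "$tgt_tmp"
  diff "$src_tmp" "$tgt_tmp"
  status=$?
  rm -f "$src_tmp" "$tgt_tmp"
  return "$status"
}
```

For example, compare_listings bucket-contents-source.txt bucket-contents-target.txt prints nothing when both buckets hold the same keys with the same sizes. Lines that start with "<" are objects found only in the source listing.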
Update existing API calls to the target bucket name
Update any existing applications or workloads so that they use the target bucket name. You might need to run sync commands to address discrepancies between source and target buckets if you have frequent writes.