I want to copy a large file to an Amazon Simple Storage Service (Amazon S3) bucket in multiple parts by using a multipart upload. How can I do this using the AWS Command Line Interface (AWS CLI)?

You can upload large files to Amazon S3 using the AWS CLI with either aws s3 commands (high level) or aws s3api commands (low level). For more information about these two command tiers, see Using Amazon S3 with the AWS Command Line Interface.
 
The recommended method is to use aws s3 commands (such as aws s3 cp) for multipart uploads and downloads, because these commands automatically perform multipart uploading and downloading based on the file size. Use aws s3api commands, such as aws s3api create-multipart-upload, only when aws s3 commands don't support a specific upload need: for example, when the multipart upload involves multiple servers, when you manually stop a multipart upload and resume it later, or when the aws s3 command doesn't support a required request parameter.

Before you upload the file, you can calculate the file's MD5 checksum value as a reference for integrity checks after the upload.
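
One way to calculate that value is with openssl, which produces the base64-encoded MD5 digest format that Amazon S3 expects (a sketch, assuming openssl is available on your system):

$ openssl md5 -binary large_test_file | base64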

(Recommended) Upload the file using high-level (aws s3) commands

To use a high-level aws s3 command for your multipart upload, run this command:

$ aws s3 cp large_test_file s3://exampleawsbucket/

This example uses the command aws s3 cp, but other aws s3 commands that involve uploading objects into an S3 bucket (for example, aws s3 sync or aws s3 mv) also automatically perform a multipart upload when the object is large.

Objects that are uploaded to Amazon S3 using multipart uploads have a different ETag format than objects that are uploaded using a traditional PUT request, so the ETag of a multipart object is not the MD5 digest of the entire file. To keep the MD5 checksum of the source file as a reference for integrity checks, you can upload the file with the checksum value as custom metadata. To add the MD5 checksum value as custom metadata, include the optional parameter --metadata md5="examplemd5value1234/4Q" in the upload command, similar to the following:

$ aws s3 cp large_test_file s3://exampleawsbucket/ --metadata md5="examplemd5value1234/4Q"

To use more of your host's bandwidth and resources during the upload, increase the maximum number of concurrent requests in your AWS CLI configuration. By default, the AWS CLI allows 10 concurrent requests. This command raises the maximum to 20:

$ aws configure set default.s3.max_concurrent_requests 20

For more information on configuring the AWS CLI with Amazon S3, see AWS CLI S3 Configuration.
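
The same configuration mechanism also controls when the AWS CLI switches to a multipart upload and how large each part is. For example, these commands set the multipart threshold and part size; the specific sizes here are illustrative assumptions, not recommendations:

$ aws configure set default.s3.multipart_threshold 64MB
$ aws configure set default.s3.multipart_chunksize 16MB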

Upload the file in multiple parts using low-level (aws s3api) commands

Important: Use this aws s3api procedure only when aws s3 commands don't support a specific upload need: for example, when the multipart upload involves multiple servers, when you manually stop a multipart upload and resume it later, or when the aws s3 command doesn't support a required request parameter. For other multipart uploads, use aws s3 cp or other high-level aws s3 commands.

1.    Split the file that you want to upload into multiple parts.
Tip: If you're using a Linux operating system, you can use the split command.
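
For example, this GNU coreutils split command (the 100 MB part size is illustrative) produces parts named large_test_file.001, large_test_file.002, and so on, matching the part names used in the following steps. Note that every part except the last must be at least 5 MB:

split -b 100MB --numeric-suffixes=1 --suffix-length=3 large_test_file large_test_file.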

2.    Run this command to initiate a multipart upload and to retrieve the associated upload ID. The command returns a response that contains the UploadId:

aws s3api create-multipart-upload --bucket exampleawsbucket --key large_test_file
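
The response is similar to the following (the UploadId shown here is the illustrative value used throughout the rest of this article):

{
    "Bucket": "exampleawsbucket",
    "Key": "large_test_file",
    "UploadId": "exampleTUVGeKAk3Ob7qMynRKqe3ROcavPRwg92eA6JPD4ybIGRxJx9R0VbgkrnOVphZFK59KCYJAO1PXlrBSW7vcH7ANHZwTTf0ovqe6XPYHwsSp7eTRnXB1qjx40Tk"
}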

3.    Copy the UploadId value as a reference for later steps.

4.    Run this command to upload the first part of the file. Be sure to replace all values with the values for your bucket, file, and multipart upload. The command returns a response that contains an ETag value for the part of the file that you uploaded. For more information on each parameter, see upload-part.

aws s3api upload-part --bucket exampleawsbucket --key large_test_file --part-number 1 --body large_test_file.001 --upload-id exampleTUVGeKAk3Ob7qMynRKqe3ROcavPRwg92eA6JPD4ybIGRxJx9R0VbgkrnOVphZFK59KCYJAO1PXlrBSW7vcH7ANHZwTTf0ovqe6XPYHwsSp7eTRnXB1qjx40Tk --content-md5 exampleaAmjr+4sRXUwf0w==
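
Note that the optional --content-md5 value is the base64-encoded MD5 digest of the individual part, not of the whole file. One way to compute it for the first part (a sketch, assuming openssl):

openssl md5 -binary large_test_file.001 | base64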

5.    Copy the ETag value as a reference for later steps.

6.    Repeat steps 4 and 5 for each part of the file. Be sure to increase the part number with each new part that you upload.
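
If the file has many parts, you can script this step instead of running the command by hand. The following sketch assumes four parts named large_test_file.001 through large_test_file.004 and a shell variable UPLOAD_ID that holds the upload ID from step 2; it omits the optional --content-md5 parameter:

# Upload parts 1-4 in sequence; each response prints the part's ETag
for i in 1 2 3 4; do
    part=$(printf "large_test_file.%03d" "$i")
    aws s3api upload-part --bucket exampleawsbucket --key large_test_file \
        --part-number "$i" --body "$part" --upload-id "$UPLOAD_ID"
done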

7.    After you upload all the file parts, run this command to list the uploaded parts and confirm that the list is complete:

aws s3api list-parts --bucket exampleawsbucket --key large_test_file --upload-id exampleTUVGeKAk3Ob7qMynRKqe3ROcavPRwg92eA6JPD4ybIGRxJx9R0VbgkrnOVphZFK59KCYJAO1PXlrBSW7vcH7ANHZwTTf0ovqe6XPYHwsSp7eTRnXB1qjx40Tk

8.    Compile the ETag values for each file part that you uploaded into a JSON-formatted file that is similar to the following:

{
    "Parts": [{
        "ETag": "example8be9a0268ebfb8b115d4c1fd3",
        "PartNumber": 1
    },
    ...
    {
        "ETag": "example246e31ab807da6f62802c1ae8",
        "PartNumber": 4
    }]
}

9.    Name the file fileparts.json.
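
Tip: Rather than compiling fileparts.json by hand, you can generate it from the list-parts output by adding a --query expression, as in the following sketch (this assumes JSON output, which is forced here with --output json):

aws s3api list-parts --bucket exampleawsbucket --key large_test_file --upload-id exampleTUVGeKAk3Ob7qMynRKqe3ROcavPRwg92eA6JPD4ybIGRxJx9R0VbgkrnOVphZFK59KCYJAO1PXlrBSW7vcH7ANHZwTTf0ovqe6XPYHwsSp7eTRnXB1qjx40Tk --query '{Parts: Parts[].{PartNumber: PartNumber, ETag: ETag}}' --output json > fileparts.json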

10.    Run this command to complete the multipart upload. Replace the value for --multipart-upload with the path to the JSON-formatted file with ETags that you created.

aws s3api complete-multipart-upload --multipart-upload file://fileparts.json --bucket exampleawsbucket --key large_test_file --upload-id exampleTUVGeKAk3Ob7qMynRKqe3ROcavPRwg92eA6JPD4ybIGRxJx9R0VbgkrnOVphZFK59KCYJAO1PXlrBSW7vcH7ANHZwTTf0ovqe6XPYHwsSp7eTRnXB1qjx40Tk

11.    If the previous command is successful, you receive a response similar to the following:

{
    "ETag": "\"exampleae01633ff0af167d925cad279-2\"",
    "Bucket": "exampleawsbucket",
    "Location": "https://exampleawsbucket.s3.amazonaws.com/large_test_file",
    "Key": "large_test_file"
}

Resolve upload failures

If you use the high-level aws s3 commands for a multipart upload and the upload fails (due to either a timeout or a manual cancellation), you must start a new multipart upload. In most cases, the AWS CLI automatically aborts the failed multipart upload and removes any parts that were already uploaded. This process can take several minutes.

If you use aws s3api commands for a multipart upload and the process is interrupted, you must remove incomplete parts of the upload, and then re-upload the parts.

To remove the incomplete parts, you can use the AbortIncompleteMultipartUpload lifecycle action (see the example configuration after the following steps). Or, you can use aws s3api commands to remove the incomplete parts by following these steps:

1.    Run this command to list incomplete multipart file uploads. Replace the value for --bucket with the name of your bucket.

aws s3api list-multipart-uploads --bucket exampleawsbucket

2.    The command returns a response that lists any incomplete multipart uploads, similar to the following:

{
    "Uploads": [
        {
            "Initiator": {
                "DisplayName": "multipartmessage",
                "ID": "290xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
            },
            "Initiated": "2016-03-31T06:13:15.000Z",
            "UploadId": "examplevQpHp7eHc_J5s9U.kzM3GAHeOJh1P8wVTmRqEVojwiwu3wPX6fWYzADNtOHklJI6W6Q9NJUYgjePKCVpbl_rDP6mGIr2AQJNKB_A-",
            "StorageClass": "STANDARD",
            "Key": "",
            "Owner": {
                "DisplayName": "multipartmessage",
                "ID": "290xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
            }
        }
    ]
}

3.    Run this command to remove the incomplete parts:

aws s3api abort-multipart-upload --bucket exampleawsbucket --key large_test_file --upload-id examplevQpHp7eHc_J5s9U.kzM3GAHeOJh1P8wVTmRqEVojwiwu3wPX6fWYzADNtOHklJI6W6Q9NJUYgjePKCVpbl_rDP6mGIr2AQJNKB_A-
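
Alternatively, to clean up incomplete uploads automatically, you can attach a lifecycle configuration that uses the AbortIncompleteMultipartUpload action. The following is a minimal sketch; the rule ID, the lifecycle.json file name, and the seven-day window are illustrative assumptions:

{
    "Rules": [
        {
            "ID": "abort-incomplete-multipart-uploads",
            "Status": "Enabled",
            "Filter": {
                "Prefix": ""
            },
            "AbortIncompleteMultipartUpload": {
                "DaysAfterInitiation": 7
            }
        }
    ]
}

Save the configuration as lifecycle.json, and then apply it to the bucket:

aws s3api put-bucket-lifecycle-configuration --bucket exampleawsbucket --lifecycle-configuration file://lifecycle.json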


Published: 2016-01-25

Updated: 2018-09-25