How can I optimize performance when I upload large files to Amazon S3?

I want to upload large files (1 GB or larger) to Amazon Simple Storage Service (Amazon S3). How can I optimize the performance of this upload?

Short description

When you upload large files to Amazon S3, it's a best practice to use multipart uploads. If you're using the AWS Command Line Interface (AWS CLI), then all high-level aws s3 commands, including aws s3 cp and aws s3 sync, automatically perform a multipart upload for large objects, based on the multipart_threshold setting (8 MB by default).

Consider the following options for improving the performance of uploads and optimizing multipart uploads:

  • If you're using the AWS CLI, customize the upload configurations.
  • Enable Amazon S3 Transfer Acceleration.

Resolution

If you're using the AWS CLI, customize the upload configurations

You can customize the following AWS CLI configurations for Amazon S3:

  • max_concurrent_requests: This value sets the number of requests that can be sent to Amazon S3 at a time. The default value is 10.
    Note: Running more threads consumes more resources on your machine. You must be sure that your machine has enough resources to support the maximum number of concurrent requests that you want.
  • max_queue_size: This value sets the maximum number of tasks in the queue. The default value is 1,000.
  • multipart_threshold: This value sets the size threshold for multipart uploads of individual files. The default value is 8 MB.
  • multipart_chunksize: This value sets the size of each part that the AWS CLI uploads in a multipart upload for an individual file. This setting allows you to break down a larger file (for example, 300 MB) into smaller parts for quicker upload speeds. The default value is 8 MB.
    Note: A multipart upload can contain at most 10,000 parts for a single file. Make sure that the multipart_chunksize value that you set balances the size of each part against the total number of parts.
  • max_bandwidth: This value sets the maximum bandwidth for uploading data to Amazon S3. There is no default value.
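
To make the 10,000-part limit concrete, the following is a minimal sketch of how a part size could be estimated for a given file size. The function name min_chunk_mib is hypothetical and not part of the AWS CLI, and the sketch assumes that the CLI's MB suffix means 1024 × 1024 bytes. It never returns less than the 8 MB default mentioned above.

```shell
# Hypothetical helper (not part of the AWS CLI): prints the smallest
# whole-MB part size that keeps a file within 10,000 parts.
min_chunk_mib() {
  local size_bytes=$1
  local max_parts=10000
  local mib=$((1024 * 1024))
  # Ceiling division: bytes each part must hold to fit in max_parts parts.
  local bytes_per_part=$(( (size_bytes + max_parts - 1) / max_parts ))
  # Round up to a whole number of MB (assumed to be 1024 * 1024 bytes).
  local chunk=$(( (bytes_per_part + mib - 1) / mib ))
  # Do not go below the 8 MB default part size.
  if [ "$chunk" -lt 8 ]; then chunk=8; fi
  echo "$chunk"
}

# Example: estimate the part size for a 200 GiB file.
size=$((200 * 1024 * 1024 * 1024))
echo "multipart_chunksize should be at least $(min_chunk_mib "$size")MB"
```

A larger part size than this estimate is also valid; the result is only a lower bound implied by the part limit.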

Note: If you receive errors when running AWS CLI commands, make sure that you’re using the most recent version of the AWS CLI.
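
As an illustration, the settings listed above can be written to the default profile with aws configure set. The values below are examples only, not recommendations, and large-file.bin and amzn-s3-demo-bucket are placeholder names.

```shell
# Example values only; tune them for your machine and network.
aws configure set default.s3.max_concurrent_requests 20
aws configure set default.s3.max_queue_size 10000
aws configure set default.s3.multipart_threshold 64MB
aws configure set default.s3.multipart_chunksize 16MB
aws configure set default.s3.max_bandwidth 50MB/s

# High-level commands such as aws s3 cp then pick up these settings.
# The file and bucket names are placeholders.
aws s3 cp large-file.bin s3://amzn-s3-demo-bucket/
```

These commands store the values under the s3 key of the default profile in the AWS CLI config file, so they apply to later aws s3 commands that use that profile.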

Enable Amazon S3 Transfer Acceleration

Amazon S3 Transfer Acceleration can provide fast and secure transfers over long distances between your client and Amazon S3. Transfer Acceleration uses Amazon CloudFront's globally distributed edge locations.

Transfer Acceleration incurs additional charges, so be sure to review pricing. To determine if Transfer Acceleration might improve the transfer speeds for your use case, review the Amazon S3 Transfer Acceleration Speed Comparison tool.

Note: Transfer Acceleration doesn't support cross-Region copies using CopyObject.
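
One possible way to turn on Transfer Acceleration from the AWS CLI is sketched below. The bucket and file names are placeholders, and the sketch assumes that your credentials are allowed to change the bucket's accelerate configuration.

```shell
# Enable Transfer Acceleration on the bucket (placeholder bucket name).
aws s3api put-bucket-accelerate-configuration \
    --bucket amzn-s3-demo-bucket \
    --accelerate-configuration Status=Enabled

# Tell the high-level aws s3 commands to use the accelerate endpoint.
aws configure set default.s3.use_accelerate_endpoint true

# Uploads from this profile now go through the accelerate endpoint.
aws s3 cp large-file.bin s3://amzn-s3-demo-bucket/
```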


AWS OFFICIAL
Updated 2 years ago
4 comments

What is the object size above which the S3 CLI uses multipart upload?

AWS
Replied 1 year ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
Moderator
Replied 1 year ago

I had to switch the AWS CLI to use the CRT (AWS Common Runtime) library for S3, which performs better than the Python library. This is explained in more detail in https://awscli.amazonaws.com/v2/documentation/api/latest/topic/s3-config.html#preferred-transfer-client. It would be nice to have a link to that article here.

The following two commands helped me improve the performance significantly:

aws configure set default.s3.preferred_transfer_client crt
aws configure set default.s3.target_bandwidth 100Gb/s

Adjusting the multipart_chunksize variable can help as well.

AWS
Replied 4 months ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
Moderator
Replied 4 months ago