How can I improve the transfer speeds for copying data between my S3 bucket and EC2 instance?
Last updated: 2019-12-17
I want to transfer data from my Amazon Elastic Compute Cloud (Amazon EC2) instance to my Amazon Simple Storage Service (Amazon S3) bucket. How can I improve the transfer speeds?
The transfer speeds for copying, moving, or syncing data from Amazon EC2 to Amazon S3 depend on several factors. The following methods are best practices for improving the transfer speed when you copy, move, or sync data between an EC2 instance and an S3 bucket:
- Use enhanced networking on the EC2 instance.
- Use parallel workloads for the data transfer.
- Customize the upload configurations on the AWS Command Line Interface (AWS CLI).
- Use an Amazon Virtual Private Cloud (Amazon VPC) endpoint for Amazon S3.
- Use S3 Transfer Acceleration between geographically distant AWS Regions.
- Upgrade your EC2 instance type.
- Use chunked transfers.
Use enhanced networking on the EC2 instance
Enhanced networking provides higher bandwidth, higher packets-per-second (PPS) performance, and lower inter-instance latency. You can enable enhanced networking at no additional charge.
If your EC2 instance's PPS rate seems to have reached its ceiling, the instance has likely reached the upper thresholds of the virtual network interface driver. If this happens, consider enabling enhanced networking.
Note: Be sure to review the instance requirements for enhanced networking.
Use parallel workloads for the data transfer
To potentially reduce the overall time it takes to complete the transfer, consider splitting the transfer into multiple mutually exclusive operations. For example, if you're using the AWS CLI, you can run multiple instances of aws s3 cp (copy), aws s3 mv (move), or aws s3 sync (synchronize) at the same time, each covering a different set of files.
Note: As a best practice, confirm that you're using the latest version of the AWS CLI.
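As a rough sketch of splitting one transfer into mutually exclusive operations, the following Python builds one aws s3 sync command per key prefix by using the --exclude and --include filters, and can run the commands side by side. The source path, bucket name, prefixes, and both function names are hypothetical placeholders, and the example only prints the commands because running them requires the AWS CLI and credentials.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_sync_commands(source, destination, prefixes):
    """Return one `aws s3 sync` command per prefix.

    Each command excludes everything and then re-includes a single prefix,
    so no file is handled by two commands. Files that match none of the
    prefixes are not transferred at all.
    """
    # Overlapping prefixes (for example "a" and "ab") would not be mutually exclusive.
    for i, first in enumerate(prefixes):
        for j, second in enumerate(prefixes):
            if i != j and second.startswith(first):
                raise ValueError(f"prefixes overlap: {first!r} and {second!r}")
    return [
        ["aws", "s3", "sync", source, destination,
         "--exclude", "*", "--include", f"{prefix}*"]
        for prefix in prefixes
    ]


def _run(command):
    # Each command runs as its own AWS CLI process.
    return subprocess.run(command, check=False).returncode


def run_in_parallel(commands, max_workers=4):
    """Run the commands concurrently and return their exit codes in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(_run, commands))


if __name__ == "__main__":
    # Hypothetical source directory, bucket, and prefixes.
    commands = build_sync_commands(
        "/data/export", "s3://amzn-s3-demo-bucket/export", ["a", "b", "c"]
    )
    for command in commands:
        print(" ".join(command))
```

Calling run_in_parallel(commands) would start the transfers. The right number of parallel processes depends on the instance's CPU, memory, and network capacity, so it is worth increasing it gradually.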
Customize the upload configurations on the AWS CLI
You can customize the following AWS CLI configurations for Amazon S3 to speed up the data transfer:
- multipart_chunksize: This value sets the size of each part that the AWS CLI uploads in a multipart upload for an individual file. This setting allows you to break down a larger file (for example, 300 MB) into smaller parts for quicker upload speeds.
Note: A multipart upload requires that a single file is uploaded in no more than 10,000 distinct parts. Choose a multipart_chunksize value that balances the size of each part against the total number of parts.
- max_concurrent_requests: By default, the AWS CLI supports multithreading. You can change the max_concurrent_requests value to increase the number of requests that can be sent to Amazon S3 at a time. The default value is 10. Increasing this value on its own might not produce a noticeable improvement in transfer speed. However, when you combine a higher max_concurrent_requests value with parallel workloads, you can achieve better transfer speeds overall.
Note: Running more threads consumes more resources on your machine. You must be sure that your machine has enough resources to support the maximum number of concurrent requests that you want.
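To make the trade-off between part size and part count concrete, here is a small arithmetic sketch. It assumes the 10,000-part limit noted above and the Amazon S3 minimum part size of 5 MiB (which applies to every part except the last). The function names are illustrative and are not part of the AWS CLI.

```python
MIB = 1024 * 1024
MAX_PARTS = 10_000        # maximum parts in one multipart upload
MIN_PART_SIZE = 5 * MIB   # S3 minimum part size (except the last part)


def part_count(file_size_bytes, chunksize_bytes):
    """Number of parts a file is split into at the given chunk size."""
    return -(-file_size_bytes // chunksize_bytes)  # integer ceiling division


def min_chunksize_bytes(file_size_bytes):
    """Smallest chunk size that keeps the upload within MAX_PARTS."""
    needed = -(-file_size_bytes // MAX_PARTS)
    return max(needed, MIN_PART_SIZE)


if __name__ == "__main__":
    size = 300 * MIB
    print(part_count(size, 16 * MIB), "parts at 16 MiB per part")
    print(min_chunksize_bytes(1024 ** 4) / MIB, "MiB minimum chunk size for a 1 TiB file")
```

A 300 MiB file uploaded with 16 MiB chunks is split into 19 parts, far below the limit, whereas a 1 TiB file needs chunks of roughly 105 MiB or more. The chosen values can then be applied with aws configure set default.s3.multipart_chunksize 16MB and aws configure set default.s3.max_concurrent_requests 20; both numbers are illustrative starting points, not recommendations.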
Use a VPC endpoint for Amazon S3
If your EC2 instance is in the same Region as the S3 bucket, then consider using a VPC endpoint for Amazon S3. VPC endpoints can help improve overall performance and reduce the load on your network address translation (NAT) device.
Another benefit of using a VPC endpoint is that you can privately connect your VPC to Amazon S3 without an internet gateway, NAT device, or VPN connection. Instances in the VPC don't require public IP addresses to communicate with resources like an Amazon S3 bucket. When you use a VPC endpoint, the data traffic between the VPC and Amazon S3 is routed on the AWS network.
Use S3 Transfer Acceleration between geographically distant AWS Regions
The data transfer speed can be higher if the EC2 instance and the S3 bucket are geographically closer to each other. If the instance and the bucket are in geographically distant AWS Regions, consider enabling Amazon S3 Transfer Acceleration. Transfer Acceleration provides fast and secure transfers over long distances using Amazon CloudFront's globally distributed edge locations.
Transfer Acceleration incurs additional charges, so be sure to review pricing. To determine if Transfer Acceleration will improve the transfer speeds for your use case, review the Amazon S3 Transfer Acceleration Speed Comparison tool.
Upgrade your EC2 instance type
If your EC2 instance's CPU utilization is high, it can be a bottleneck for your overall transfer speeds. You can upgrade your instance to another instance type that provides more memory and higher network performance. Larger instance sizes for an instance type typically provide better network performance than smaller instance sizes of the same type.
Note: As a best practice, choose an instance type with at least 10 Gbps network connectivity for sustained and reliable network bandwidth between the EC2 instance and Amazon S3.
Use chunked transfers
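If you're transferring large files, multipart uploads (for uploads to Amazon S3) and ranged GETs (for downloads from Amazon S3) can help improve overall transfer performance, because each part or byte range can be transferred in parallel and retried independently.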
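As an illustration of how a ranged download is divided, the following sketch computes the HTTP Range header values that cover an object of a known size. The function name and the sizes used are hypothetical. Each value would be sent on a separate GET request, and the responses written to the matching offsets of the local file.

```python
def byte_ranges(object_size, part_size):
    """Return Range header values that cover the object without gaps or overlap."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    ranges = []
    for start in range(0, object_size, part_size):
        # HTTP byte ranges are inclusive, so the last byte index is end - 1.
        end = min(start + part_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
    return ranges


if __name__ == "__main__":
    # Hypothetical 25-byte object fetched in 10-byte ranges.
    for header_value in byte_ranges(25, 10):
        print(header_value)
```

Aligning the range size with the part size that was used for the original multipart upload is a common choice, but the best value depends on the object size and the available bandwidth.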