AWS Developer Tools Blog

Introducing multipart download support for AWS Tools for PowerShell v5

The new multipart download support in AWS Tools for PowerShell v5 improves the performance of downloading large objects from Amazon Simple Storage Service (Amazon S3) compared to single-stream downloads. The Read-S3Object and Copy-S3Object cmdlets now deliver faster downloads through an opt-in switch parameter, -UseMultipartDownload, which removes the need for complex code to manage concurrent connections, handle retries, and coordinate multiple download streams. Under the hood, the cmdlets use the AWS SDK for .NET v4 S3 Transfer Manager to execute the downloads.

In this post, we’ll show you how to configure and use these new multipart download capabilities, including downloading single objects and directories, choosing between download strategies, customizing parallelism settings, and migrating your existing download methods to take advantage of these performance improvements.

Parallel download using part numbers and byte-ranges

For download operations, Read-S3Object and Copy-S3Object now support both part number and byte-range fetches. Part number fetches download the object in parts, using the part number that Amazon S3 assigned to each part during upload. Byte-range fetches download the object in byte ranges and work on any object, regardless of whether it was originally uploaded using multipart upload. In both cases, the transfer manager splits your GetObject request into multiple smaller requests, each of which retrieves a specific portion of the object, and executes those requests through concurrent connections to Amazon S3.
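
To make the byte-range idea concrete, here is a minimal sketch of how an object of a given size could be divided into fixed-size ranges. Get-ByteRange is a hypothetical helper written for this illustration; it is not part of AWS Tools for PowerShell and does not reproduce the transfer manager's actual logic. It only shows the arithmetic behind the HTTP Range header values that such requests would carry.

```powershell
# Hypothetical helper for illustration only; not part of AWS Tools for PowerShell.
function Get-ByteRange {
    param(
        [Parameter(Mandatory)] [long] $ObjectSize,
        [long] $PartSize = 8MB   # matches the default part size listed later in this post
    )

    for ([long] $start = 0; $start -lt $ObjectSize; $start += $PartSize) {
        # The last range is clamped to the final byte of the object.
        $end = [math]::Min($start + $PartSize, $ObjectSize) - 1
        [pscustomobject]@{
            Start  = $start
            End    = $end
            Header = "bytes=$start-$end"   # HTTP Range header value (inclusive offsets)
        }
    }
}

# A 20 MB object split into 8 MB ranges yields three ranges, the last one shorter.
Get-ByteRange -ObjectSize 20MB -PartSize 8MB
```

Each returned row corresponds to one smaller request; the final range is shorter whenever the object size is not an exact multiple of the part size.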

Choosing between part number and byte-range strategies

Choose between part number and byte-range downloads based on how the object was uploaded. Part number downloads (the default) work best for objects uploaded with standard multipart upload part sizes. If the object was not uploaded with multipart upload, choose byte-range downloads. Byte-range downloads also allow greater parallelism when an object has large parts, and they work with any S3 object regardless of the upload method that was used.

Keep in mind that smaller range sizes result in more S3 requests. Each API call incurs a request cost beyond the data transfer itself, so balance parallelism benefits against the number of requests for your use case.
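
As a rough worked example of this trade-off, the sketch below counts how many range requests a download would need at different part sizes. Get-RangeRequestCount is a hypothetical helper and the 5 GB object size is an assumed value chosen for illustration; the calculation is simply the object size divided by the part size, rounded up.

```powershell
# Hypothetical helper for illustration only.
function Get-RangeRequestCount {
    param(
        [Parameter(Mandatory)] [long] $ObjectSize,
        [Parameter(Mandatory)] [long] $PartSize
    )
    # One request per full part, plus one for any remainder.
    [long] [math]::Ceiling($ObjectSize / $PartSize)
}

$objectSize = 5GB   # assumed object size for this example
foreach ($partSize in 8MB, 16MB, 64MB) {
    '{0,3} MB parts: {1} requests' -f ($partSize / 1MB), (Get-RangeRequestCount -ObjectSize $objectSize -PartSize $partSize)
}
```

For a 5 GB object this works out to 640 requests with 8 MB parts, 320 with 16 MB parts, and 80 with 64 MB parts.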

Now that you understand the download strategies, let’s get started.

Getting started

To get started with multipart downloads in AWS Tools for PowerShell, follow these steps:

Update your module

Update your AWS Tools for PowerShell modules to the latest version. Multipart download support is available in version 5.0.208 and later.

# Modular (recommended)
PS> Update-Module AWS.Tools.S3

# Or monolithic
PS> Update-Module AWSPowerShell.NetCore

# Or Windows PowerShell monolithic
PS> Update-Module AWSPowerShell
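
After updating, you may want to confirm that the installed module meets the minimum version. The following is a small sketch for the modular AWS.Tools.S3 variant, assuming the module was installed from a gallery so that Get-Module -ListAvailable can find it. Because Update-Module can leave older versions installed side by side, the sketch picks the highest version it finds.

```powershell
# Minimum version that includes multipart download support, per this post.
$minimum = [version]'5.0.208'

# Pick the newest installed copy of the module, if any.
$installed = Get-Module -ListAvailable -Name AWS.Tools.S3 |
    Sort-Object -Property Version -Descending |
    Select-Object -First 1

if ($null -eq $installed) {
    Write-Warning 'AWS.Tools.S3 is not installed.'
}
elseif ($installed.Version -lt $minimum) {
    Write-Warning "AWS.Tools.S3 $($installed.Version) is older than $minimum; update the module."
}
else {
    "AWS.Tools.S3 $($installed.Version) meets the minimum version for -UseMultipartDownload."
}
```

For the monolithic modules, substitute AWSPowerShell.NetCore or AWSPowerShell for the module name.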

Download an object to file

To download an object from an Amazon S3 bucket to a local file with multipart support, add the -UseMultipartDownload switch to your Read-S3Object command. You must provide the source bucket, the S3 object key, and the destination file path.

# Download a large file with multipart support (part number strategy, the default)
PS> $response = Read-S3Object -BucketName amzn-s3-demo-bucket -Key "data/large-dataset.zip" `
    -File "C:\downloads\large-dataset.zip" `
    -UseMultipartDownload

# Inspect the S3 response metadata
PS> $response.ContentRange
PS> $response.ETag
PS> $response.ChecksumType
# ...plus 33 other response properties.

# Download using byte-range strategy (works with any S3 object)
PS> Read-S3Object -BucketName amzn-s3-demo-bucket -Key "data/any-object.dat" `
    -File "C:\downloads\any-object.dat" `
    -UseMultipartDownload `
    -MultipartDownloadType RANGE `
    -PartSize 16MB 

You can also customize the part size and the number of concurrent connections:

# Custom number of concurrent connections (default is 10), shown with a Linux path
PS> Read-S3Object -BucketName amzn-s3-demo-bucket -Key "data/large-file.bin" `
    -File "/home/user/downloads/large-file.bin" `
    -UseMultipartDownload `
    -MultipartDownloadType RANGE `
    -PartSize 64MB `
    -ConcurrentServiceRequest 20

Experiment with these values to find the best configuration for your use case. Factors like object size and available network bandwidth will influence which settings work best.
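
One way to run such an experiment is to time the same download under a few candidate settings. The sketch below is only a starting point under stated assumptions: the bucket, key, and destination path are the placeholder values used elsewhere in this post, the candidate part sizes and connection counts are arbitrary, and each run overwrites the same local file. The guard skips the downloads when Read-S3Object is not available in the session.

```powershell
# Placeholder values; substitute your own bucket, key, and destination.
$common = @{
    BucketName            = 'amzn-s3-demo-bucket'
    Key                   = 'data/large-file.bin'
    File                  = '/home/user/downloads/large-file.bin'
    UseMultipartDownload  = $true
    MultipartDownloadType = 'RANGE'
}

# Arbitrary candidate settings to compare.
$candidates = foreach ($partSize in 16MB, 64MB) {
    foreach ($connections in 10, 20) {
        @{ PartSize = $partSize; ConcurrentServiceRequest = $connections }
    }
}

# Only attempt the downloads when the cmdlet is available.
if (Get-Command Read-S3Object -ErrorAction SilentlyContinue) {
    foreach ($candidate in $candidates) {
        # Time one download with this combination of settings.
        $elapsed = Measure-Command { Read-S3Object @common @candidate | Out-Null }
        '{0,3} MB parts, {1,2} connections: {2:N1} s' -f ($candidate.PartSize / 1MB), $candidate.ConcurrentServiceRequest, $elapsed.TotalSeconds
    }
}
```

A single run per setting is noisy, so repeating each measurement a few times gives a more reliable comparison.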

Download a directory

To download multiple objects from an S3 bucket prefix to a local directory, use the -KeyPrefix and -Folder parameters with -UseMultipartDownload. The cmdlet automatically applies multipart download to each individual object in the directory.

# Download entire directory with multipart support for large files
PS> Read-S3Object -BucketName amzn-s3-demo-bucket -KeyPrefix "data/" `
    -Folder "C:\downloads\data" `
    -UseMultipartDownload `
    -ConcurrentServiceRequest 10 `
    -DownloadFilesConcurrently

The -DownloadFilesConcurrently parameter enables file-level parallelism, downloading multiple files at the same time. When combined with -UseMultipartDownload, each individual file also benefits from part-level parallelism, providing high throughput for directory downloads that contain many large files.

Using Copy-S3Object

The same multipart download parameters are available on Copy-S3Object for S3-to-local download operations.

PS> $response = Copy-S3Object -BucketName amzn-s3-demo-bucket -Key "data/large-file.bin" `
    -LocalFile "C:\downloads\large-file.bin" `
    -UseMultipartDownload `
    -MultipartDownloadType RANGE `
    -PartSize 16MB

# Inspect the S3 response metadata
PS> $response.ContentRange
PS> $response.ETag
PS> $response.ChecksumType
# ...plus 33 other response properties.

Note: The multipart download parameters only apply to S3-to-local download operations in Copy-S3Object. They are not available for S3-to-S3 copy operations.

New parameters at a glance

Parameter                    Description
-UseMultipartDownload        Opt-in switch for multipart parallel download
-MultipartDownloadType       PART (default) or RANGE
-PartSize                    Part size for RANGE mode (e.g., 8MB, 64MB, 1GB). Default is 8 MB
-ConcurrentServiceRequest    Maximum number of parallel HTTP connections. Default is 10
-DownloadFilesConcurrently   File-level parallelism for directory downloads
-FailurePolicy               AbortOnFailure (default) or ContinueOnFailure for directory downloads
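
The earlier examples do not show -FailurePolicy, so here is a brief sketch of a directory download that uses it. It assumes the parameter accepts the value ContinueOnFailure exactly as listed in the table above, and it reuses the placeholder bucket, prefix, and folder from the earlier directory example. What the cmdlet returns and how individual failures are reported are not covered here.

```powershell
# Placeholder values; substitute your own bucket, prefix, and folder.
$directoryDownload = @{
    BucketName                = 'amzn-s3-demo-bucket'
    KeyPrefix                 = 'data/'
    Folder                    = 'C:\downloads\data'
    UseMultipartDownload      = $true
    DownloadFilesConcurrently = $true
    FailurePolicy             = 'ContinueOnFailure'   # assumed value, taken from the table above
}

# Only attempt the download when the cmdlet is available.
if (Get-Command Read-S3Object -ErrorAction SilentlyContinue) {
    Read-S3Object @directoryDownload
}
```

Splatting the parameters from a hashtable keeps long commands readable and makes it easy to switch a single setting, such as the failure policy, between runs.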

Migration path

The new -UseMultipartDownload parameter provides both multipart performance and access to S3 response metadata. Here’s how to migrate your existing code:

# Existing code (still works but returns legacy response object System.IO.FileInfo) 
PS> Read-S3Object -BucketName amzn-s3-demo-bucket -Key "data/large-dataset.zip" -File "C:\downloads\large-dataset.zip" 
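
For comparison, the multipart equivalent of this single-file command follows the same pattern as the earlier download example. The sketch below also shows one possible way to handle scripts that relied on the System.IO.FileInfo return value: fetch the file information separately with Get-Item after the download. That workaround is an assumption about what a migrating script might need, not a documented migration step.

```powershell
$destination = 'C:\downloads\large-dataset.zip'

$download = @{
    BucketName           = 'amzn-s3-demo-bucket'
    Key                  = 'data/large-dataset.zip'
    File                 = $destination
    UseMultipartDownload = $true
}

# Only attempt the download when the cmdlet is available.
if (Get-Command Read-S3Object -ErrorAction SilentlyContinue) {
    # Enhanced version with multipart support (returns S3 response metadata).
    $response = Read-S3Object @download
    $response.ETag

    # Scripts that previously used the returned FileInfo can look it up explicitly.
    $fileInfo = Get-Item -LiteralPath $destination
    $fileInfo.Length
}
```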

For directory downloads:

# Existing code (still works but returns legacy response object System.IO.DirectoryInfo) 
PS> Read-S3Object -BucketName amzn-s3-demo-bucket -KeyPrefix "data/" -Folder "C:\downloads\data"

# Enhanced version with multipart support (returns S3 response metadata) 
PS> Read-S3Object -BucketName amzn-s3-demo-bucket -KeyPrefix "data/" `
    -Folder "C:\downloads\data" `
    -UseMultipartDownload `
    -ConcurrentServiceRequest 10 `
    -DownloadFilesConcurrently

Conclusion

The multipart download support in AWS Tools for PowerShell provides performance improvements for downloading large objects from Amazon S3. By using parallel byte-range or part-number fetches, you can reduce transfer times. This feature is fully opt-in and available in all three module variants: AWS.Tools.S3, AWSPowerShell.NetCore, and AWSPowerShell.

Next steps: Try implementing multipart downloads in your scripts and measure the performance improvements for your specific use cases.

To learn more about AWS Tools for PowerShell, visit the AWS Tools for PowerShell documentation. For questions or feedback about this feature, visit the GitHub issues page. For more details on the underlying multipart download engine, see the AWS SDK for .NET blog post.

Sanket Tangade

Sanket Tangade is a Software Development Engineer at AWS. As an SDE on the AWS SDK for .NET/PowerShell team, he builds tools that improve the developer experience for customers using .NET & PowerShell to manage their AWS infrastructure. You can find him on LinkedIn @sanket-tangade.