AWS Developer Tools Blog

Introducing CRT-based S3 Client and the S3 Transfer Manager in the AWS SDK for Java 2.x

We are excited to announce the general availability of two new features in the AWS SDK for Java 2.x that enable accelerated object transfer with Amazon Simple Storage Service (Amazon S3): an AWS Common Runtime (CRT)-based S3 client and the S3 Transfer Manager.

The CRT-based S3 client allows you to transfer objects to and from Amazon S3 with enhanced performance and reliability by automatically using the Amazon S3 multipart upload API and byte-range fetches. It implements the same interface as the existing S3 async client and offers improved throughput out of the box.

The S3 Transfer Manager is a high-level transfer utility built on top of the S3 client. It provides a simple API for transferring files and directories between your application and Amazon S3. The S3 Transfer Manager also enables you to monitor a transfer’s progress in real time, as well as pause a transfer and resume it later.

The following table shows the use cases for the two new features.

This post walks you through the key features of these additions and shows you how to use them.

Key Features

Improved throughput for upload, download and copy

For an upload or copy request, the CRT-based S3 client can convert a single PutObject or CopyObject request into multiple multipart upload requests and send those requests in parallel to improve performance.

For a download S3 object request, the CRT-based S3 client can split a GetObject request into multiple smaller requests when it will improve performance. These requests will transfer different portions of the object in parallel to achieve higher aggregated throughput.
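
To make the splitting concrete, here is a minimal sketch of how an object could be divided into byte ranges, one per parallel request. It is an illustration only, not the CRT's actual implementation; the class name ByteRangeSplitter, its methods, and the 8 MiB part size are assumptions chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: splits an object into inclusive byte ranges, one per part. */
final class ByteRangeSplitter {

    /** An inclusive byte range that can be formatted as an HTTP Range header value. */
    static final class ByteRange {
        final long start;
        final long end;

        ByteRange(long start, long end) {
            this.start = start;
            this.end = end;
        }

        String toHeaderValue() {
            return "bytes=" + start + "-" + end;
        }
    }

    /** Returns consecutive ranges of at most partSize bytes that cover [0, objectSize). */
    static List<ByteRange> split(long objectSize, long partSize) {
        if (objectSize < 0 || partSize <= 0) {
            throw new IllegalArgumentException("objectSize must be >= 0 and partSize must be > 0");
        }
        List<ByteRange> ranges = new ArrayList<>();
        for (long start = 0; start < objectSize; start += partSize) {
            // The last part may be shorter than partSize.
            long end = Math.min(start + partSize, objectSize) - 1;
            ranges.add(new ByteRange(start, end));
        }
        return ranges;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        List<ByteRange> ranges = split(256 * mib, 8 * mib);
        System.out.println(ranges.size() + " parts, first: " + ranges.get(0).toHeaderValue());
    }
}
```

For a 256 MiB object and 8 MiB parts, this produces 32 ranges, the first being bytes=0-8388607. Each range could then be fetched by its own request and written to the matching offset in the destination file.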

In addition, in the event of a network failure, the CRT-based S3 client retries individual failed parts without starting the whole transfer over, improving transfer reliability.
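
The idea of retrying only the failed part can be sketched with plain Java concurrency utilities. This is a simplified model under stated assumptions, not the CRT's retry logic: the class PartRetrySketch, the transferPart callback, the pool size of 8, and the fixed attempt limit are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.IntFunction;

/** Illustrative only: transfers parts in parallel and retries each failed part on its own. */
final class PartRetrySketch {

    /** Runs a single part, retrying only that part up to maxAttempts times. */
    static <T> T transferPartWithRetry(int partNumber, IntFunction<T> transferPart, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return transferPart.apply(partNumber);
            } catch (RuntimeException e) {
                // Only this part is retried; parts that already completed are untouched.
                lastFailure = e;
            }
        }
        throw lastFailure;
    }

    /** Submits every part to a thread pool and collects the results in part order. */
    static <T> List<T> transferAll(int partCount, IntFunction<T> transferPart, int maxAttempts) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, Math.min(partCount, 8)));
        try {
            List<CompletableFuture<T>> futures = new ArrayList<>();
            for (int part = 0; part < partCount; part++) {
                final int partNumber = part;
                futures.add(CompletableFuture.supplyAsync(
                        () -> transferPartWithRetry(partNumber, transferPart, maxAttempts), pool));
            }
            List<T> results = new ArrayList<>();
            for (CompletableFuture<T> future : futures) {
                // join() throws CompletionException if a part exhausted all of its attempts.
                results.add(future.join());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

In this model, a transient failure in one part costs only one extra attempt for that part, while the other parts proceed unaffected.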

We have conducted performance tests comparing the CRT-based S3 client with the existing S3 async client for getObject, putObject and copyObject operations. In our tests, the CRT-based S3 client outperformed the existing client for a 256 MB object. The following table shows you the improvement percentage for those three operations. We expect customers to observe even greater improvements for larger objects.

Enhanced connection management

The CRT-based S3 client offers enhanced connection pooling and Domain Name System (DNS) load balancing. Behind the scenes, the CRT-based S3 client builds a pool of Amazon S3 server IPs and load balances across multiple Amazon S3 endpoints. This improves throughput beyond what a single server IP can provide and allows for automatic failover in the unlikely event of a slow endpoint or server outage.
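
As a rough mental model of load balancing with failover across a pool of addresses, consider the following sketch. It is not how the CRT manages connections; the class EndpointPoolSketch, the round-robin policy, and the markUnhealthy method are assumptions made purely for illustration, and the addresses used below come from the reserved documentation range.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative only: round-robin selection over a pool of addresses, skipping unhealthy ones. */
final class EndpointPoolSketch {
    private final List<String> addresses;
    private final Set<String> unhealthy = ConcurrentHashMap.newKeySet();
    private final AtomicInteger cursor = new AtomicInteger();

    EndpointPoolSketch(List<String> addresses) {
        if (addresses.isEmpty()) {
            throw new IllegalArgumentException("at least one address is required");
        }
        this.addresses = new ArrayList<>(addresses);
    }

    /** Returns the next healthy address in round-robin order. */
    String next() {
        for (int i = 0; i < addresses.size(); i++) {
            int index = Math.floorMod(cursor.getAndIncrement(), addresses.size());
            String candidate = addresses.get(index);
            if (!unhealthy.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("no healthy addresses available");
    }

    /** Takes a slow or failing address out of rotation. */
    void markUnhealthy(String address) {
        unhealthy.add(address);
    }

    public static void main(String[] args) {
        List<String> ips = new ArrayList<>();
        ips.add("192.0.2.1");
        ips.add("192.0.2.2");
        ips.add("192.0.2.3");
        EndpointPoolSketch pool = new EndpointPoolSketch(ips);
        pool.markUnhealthy("192.0.2.2");
        System.out.println(pool.next() + ", " + pool.next() + ", " + pool.next());
    }
}
```

Spreading requests over several addresses is what lets aggregate throughput exceed what one server IP can deliver, and skipping an unhealthy address is the simplest form of failover.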

Faster startup time

You can benefit from faster startup time by using the CRT-based S3 client. We have observed an improvement of up to 68% in AWS Lambda startup performance between the existing S3 client and the new CRT-based S3 client. Note that the results may vary based on your application configuration.

The following chart compares the AWS Lambda cold start duration when invoking the ListBuckets API using the existing S3 async client and the new CRT-based S3 client.

Getting Started

CRT-Based S3 Client

Add a dependency for the CRT-based S3 Client

First, you need to add two dependencies to your project. Search the Maven central repository for the most recent versions of the s3 and aws-crt artifacts.

<dependency>
  <groupId>software.amazon.awssdk</groupId>
  <artifactId>s3</artifactId>
  <version>${aws.sdk.version}</version>
</dependency>
<dependency>
  <groupId>software.amazon.awssdk.crt</groupId>
  <artifactId>aws-crt</artifactId>
  <version>${aws.crt.version}</version>
</dependency>

Instantiate the CRT-based S3 Client

You can easily instantiate the CRT-based S3 client using the default settings.

S3AsyncClient s3AsyncClient = S3AsyncClient.crtCreate();

If you need to configure the client, you can use the client builder. You can easily switch from a Java-based S3 client to the CRT-based client by changing the builder method. Note that some of the settings that are available in the standard builder may not currently be supported in the CRT builder.

S3AsyncClient s3AsyncClient = 
       S3AsyncClient.crtBuilder()
                    .credentialsProvider(DefaultCredentialsProvider.create())
                    .region(Region.US_WEST_2)
                    .targetThroughputInGbps(20.0)
                    .minimumPartSizeInBytes(8L * 1024 * 1024) // 8 MiB
                    .build();

Invoke S3 APIs

You can use the CRT-based S3 client to invoke any Amazon S3 API. The following example shows how to call PutObject and GetObject.

S3AsyncClient s3Client = S3AsyncClient.crtCreate();

// Upload a local file to Amazon S3
PutObjectResponse putObjectResponse = 
      s3Client.putObject(req -> req.bucket("myBucket")
                                   .key("myKey"),
                        AsyncRequestBody.fromFile(Paths.get("myFile.txt")))
              .join();

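// Download an object from Amazon S3 to a local file.
// AsyncResponseTransformer.toFile fails if the destination file already exists,
// so this example writes to a different path than the uploaded file.
GetObjectResponse getObjectResponse = 
     s3Client.getObject(req -> req.bucket("myBucket")
                                  .key("myKey"),
                        AsyncResponseTransformer.toFile(Paths.get("myDownloadedFile.txt")))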
             .join();
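
The examples above block on join() for brevity. Because the async client's operations return a CompletableFuture, you can also attach a handler instead of blocking. The sketch below uses only the JDK and a stand-in future, so it does not call the SDK; the class AsyncCompletionSketch and the describeOutcome helper are hypothetical names for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

/** Illustrative only: reacts to success or failure of a future without blocking the caller. */
final class AsyncCompletionSketch {

    /** Maps the outcome of a future to a log-friendly message. */
    static <T> CompletableFuture<String> describeOutcome(String operation, CompletableFuture<T> future) {
        return future.handle((response, error) -> {
            if (error != null) {
                // Dependent stages may wrap the failure in a CompletionException; unwrap it if so.
                Throwable cause = (error instanceof CompletionException && error.getCause() != null)
                        ? error.getCause()
                        : error;
                return operation + " failed: " + cause.getMessage();
            }
            return operation + " succeeded: " + response;
        });
    }

    public static void main(String[] args) {
        // A stand-in for a future returned by an async client call.
        CompletableFuture<String> standIn = CompletableFuture.completedFuture("response");
        System.out.println(describeOutcome("PutObject", standIn).join());
    }
}
```

With a real client, the future passed in would be the one returned by a call such as putObject or getObject.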

S3 Transfer Manager

Add a dependency for the S3 Transfer Manager

First, you need to include the s3-transfer-manager and aws-crt dependencies in your project. Search the Maven central repository for the most recent versions of the s3-transfer-manager and aws-crt artifacts.

<dependency>
  <groupId>software.amazon.awssdk</groupId>
  <artifactId>s3-transfer-manager</artifactId>
  <version>${aws.sdk.version}</version>
</dependency>
<dependency>
  <groupId>software.amazon.awssdk.crt</groupId>
  <artifactId>aws-crt</artifactId>
  <version>${aws.crt.version}</version>
</dependency>

Instantiate the S3 Transfer Manager

You can instantiate the Transfer Manager with the default settings using the following snippet.

S3TransferManager transferManager = S3TransferManager.create();

If you wish to configure settings, or use an underlying CRT-based S3 client you have already constructed, we recommend using the builder instead:

S3AsyncClient s3AsyncClient = 
    S3AsyncClient.crtBuilder()
                 .credentialsProvider(DefaultCredentialsProvider.create())
                 .region(Region.US_WEST_2)
                 .targetThroughputInGbps(20.0)
                 .minimumPartSizeInBytes(8L * 1024 * 1024) // 8 MiB
                 .build();

S3TransferManager transferManager =
    S3TransferManager.builder()
                     .s3Client(s3AsyncClient)
                     .build();

Transfer a single object

Upload a file to S3 and log the upload’s progress with a TransferListener

To upload a file to Amazon S3, you need to provide the source file path and a PutObjectRequest specifying the target bucket and key. Optionally, you can monitor the progress of the transfer by attaching a TransferListener. The provided LoggingTransferListener logs a basic progress bar; you can also implement your own listeners.

S3TransferManager transferManager = S3TransferManager.create();

UploadFileRequest uploadFileRequest = 
    UploadFileRequest.builder()
                     .putObjectRequest(req -> req.bucket("bucket").key("key"))
                     .addTransferListener(LoggingTransferListener.create())
                     .source(Paths.get("myFile.txt"))
                     .build();

FileUpload upload = transferManager.uploadFile(uploadFileRequest);

// Wait for the transfer to complete
upload.completionFuture().join();
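
If you implement your own listener, most of the work is turning byte counts into something readable. The helper below is a minimal, JDK-only sketch of that step; it does not implement the SDK's TransferListener interface, and the class ProgressBarSketch, its render method, and the bar format are assumptions for illustration rather than the output of LoggingTransferListener.

```java
import java.util.Locale;

/** Illustrative only: renders a text progress bar from transferred and total byte counts. */
final class ProgressBarSketch {

    /** Returns a bar such as "|=====     | 50.0%" for the given counts and bar width. */
    static String render(long transferredBytes, long totalBytes, int width) {
        if (totalBytes <= 0 || width <= 0) {
            throw new IllegalArgumentException("totalBytes and width must be > 0");
        }
        // Clamp to [0, 1] so unexpected counts never overflow the bar.
        double ratio = Math.min(1.0, Math.max(0.0, (double) transferredBytes / totalBytes));
        int filled = (int) Math.floor(ratio * width);
        StringBuilder bar = new StringBuilder("|");
        for (int i = 0; i < width; i++) {
            bar.append(i < filled ? '=' : ' ');
        }
        // Locale.ROOT keeps the decimal separator stable across environments.
        return bar.append("| ").append(String.format(Locale.ROOT, "%.1f%%", ratio * 100)).toString();
    }

    public static void main(String[] args) {
        System.out.println(render(50, 100, 20));
    }
}
```

A custom listener could call a helper like this each time it is notified that more bytes have been transferred.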

Download an S3 object to a local file and log the download’s progress with a TransferListener

To download an object, you need to provide the destination file path and a GetObjectRequest specifying the source bucket and key. As with uploads, you can monitor the progress of the transfer by attaching a TransferListener.

S3TransferManager transferManager = S3TransferManager.create();

DownloadFileRequest downloadFileRequest = 
    DownloadFileRequest.builder()
                       .getObjectRequest(req -> req.bucket("bucket").key("key"))
                       .destination(Paths.get("myFile.txt"))
                       .addTransferListener(LoggingTransferListener.create())
                       .build();

FileDownload download = transferManager.downloadFile(downloadFileRequest);

// Wait for the transfer to complete
download.completionFuture().join();

Copy an S3 object from one location to another

To copy an object, you need to provide a CopyObjectRequest with a source and destination location.

S3TransferManager transferManager = S3TransferManager.create();
CopyObjectRequest copyObjectRequest = CopyObjectRequest.builder()
                                                       .sourceBucket("source_bucket")
                                                       .sourceKey("source_key")
                                                       .destinationBucket("dest_bucket")
                                                       .destinationKey("dest_key")
                                                       .build();
CopyRequest copyRequest = CopyRequest.builder()
                                     .copyObjectRequest(copyObjectRequest)
                                     .build();

Copy copy = transferManager.copy(copyRequest);

// Wait for the transfer to complete
CompletedCopy completedCopy = copy.completionFuture().join();

Transfer multiple objects in the same directory

Upload a local directory to an S3 bucket

To upload a local directory recursively to an S3 bucket, you need to provide the source directory and the target bucket.

S3TransferManager transferManager = S3TransferManager.create();
DirectoryUpload directoryUpload =
    transferManager.uploadDirectory(UploadDirectoryRequest.builder()
                                                          .source(Paths.get("source/directory"))
                                                          .bucket("bucket")
                                                          .build());

// Wait for the transfer to complete
CompletedDirectoryUpload completedDirectoryUpload = directoryUpload.completionFuture().join();

// Print out the failed uploads
completedDirectoryUpload.failedTransfers().forEach(System.out::println);

Download S3 objects within the same bucket to a local directory

To download all S3 objects within the same bucket, you need to provide the destination directory and the source bucket.

S3TransferManager transferManager = S3TransferManager.create();
DirectoryDownload directoryDownload =
    transferManager.downloadDirectory(DownloadDirectoryRequest.builder()
                                                              .destination(Paths.get("destination/directory"))
                                                              .bucket("bucket")
                                                              .build());
// Wait for the transfer to complete
CompletedDirectoryDownload completedDirectoryDownload = directoryDownload.completionFuture().join();

// Print out the failed downloads
completedDirectoryDownload.failedTransfers().forEach(System.out::println);

Conclusion

In this blog post, we went over the key features of the CRT-based S3 client and the S3 Transfer Manager in the AWS SDK for Java 2.x. We showed you how easy it is to use them to transfer objects. To learn more about how to set up and begin using these features, visit our Developer Guide and API Reference. Try it out today and let us know what you think! You can reach out to us by creating an issue on our GitHub repo.

Zoe Wang


Zoe is a Software Development Engineer working on the AWS SDK for Java. She is passionate about building tools to improve the developer experience. You can find her on GitHub @zoewangg.