Category: Java

Amazon S3 TransferManager

by Jason Fulghum | in Java

One of the great APIs inside the AWS SDK for Java is a class called TransferManager that makes working with uploads and downloads from Amazon S3 easy and convenient.

TransferManager provides asynchronous management for uploads and downloads between your application and Amazon S3. You can easily check on the status of your transfers, add handlers to run code when a transfer completes, cancel transfers, and more.

But perhaps the best thing about TransferManager is how it hides the complexity of transferring files behind an extremely simple API. TransferManager is essentially two operations: upload and download. From there you just work with your upload and download objects to interact with your transfers. The following example shows how easy it is to create a TransferManager instance, upload a file, and print out its progress as a percent while it’s transferring.

// Each instance of TransferManager maintains its own thread pool
// where transfers are processed, so share an instance when possible
TransferManager tx = new TransferManager(credentials);

// The upload and download methods return immediately, while
// TransferManager processes the transfer in the background thread pool
Upload upload = tx.upload(bucketName, myFile.getName(), myFile);

// While the transfer is processing, you can work with the transfer object
while (!upload.isDone()) {
    System.out.println(upload.getProgress().getPercentTransferred() + "%");
}

Behind this simple API, TransferManager is doing a lot of work for you. Depending on the size and data source for your upload, TransferManager adjusts the algorithm it uses to process your transfer, in order to get the best performance and reliability. Whenever possible, uploads are broken up into multiple pieces, so that several pieces can be sent in parallel to provide better throughput. In addition to higher throughput, this approach also enables more robust transfers, since an I/O error in any individual piece means the SDK only needs to retransmit the one affected piece, and not the entire transfer.
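To make the "multiple pieces" idea concrete, here is a minimal sketch of how a transfer could be divided into byte ranges. It is not TransferManager's actual algorithm: the `PartPlanner` class, its `plan` method, and the 5 MB part size in the demo are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of the SDK): plans byte ranges for a multipart transfer. */
class PartPlanner {

    /** One piece of the transfer: a 1-based part number and the byte range [offset, offset + length). */
    static final class Part {
        final int number;
        final long offset;
        final long length;

        Part(int number, long offset, long length) {
            this.number = number;
            this.offset = offset;
            this.length = length;
        }

        @Override
        public String toString() {
            return "part " + number + ": offset=" + offset + ", length=" + length;
        }
    }

    /** Splits totalBytes into consecutive parts of at most partSize bytes each. */
    static List<Part> plan(long totalBytes, long partSize) {
        if (totalBytes < 0 || partSize <= 0) {
            throw new IllegalArgumentException("totalBytes must be >= 0 and partSize must be > 0");
        }
        List<Part> parts = new ArrayList<>();
        long offset = 0;
        int number = 1;
        while (offset < totalBytes) {
            long length = Math.min(partSize, totalBytes - offset);
            parts.add(new Part(number++, offset, length));
            offset += length;
        }
        return parts;
    }

    public static void main(String[] args) {
        // Assumed sizes: a 12 MB payload split into parts of at most 5 MB.
        long mb = 1024L * 1024L;
        for (Part part : plan(12 * mb, 5 * mb)) {
            System.out.println(part);
        }
    }
}
```

Because each part is an independent byte range, a failed part can be retried on its own, and several parts can be in flight at once.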

TransferManager includes several more advanced features, such as recursively downloading entire sections of S3 buckets, or the ability to clean up pieces of failed multipart uploads. One of the more commonly used options is the ability to attach a progress listener to your uploads and downloads, which can run custom code at different points in the transfer’s lifecycle. The following example demonstrates using a progress listener to periodically print out the transfer’s progress, and print a final message when the transfer completes.

TransferManager tx = new TransferManager(credentials);
Upload upload = tx.upload(bucketName, myFile.getName(), myFile);

// You can set a progress listener directly on a transfer, or you can pass one into
// the upload object to have it attached to the transfer as soon as it starts
upload.setProgressListener(new ProgressListener() {
    // This method is called periodically as your transfer progresses
    public void progressChanged(ProgressEvent progressEvent) {
        System.out.println(upload.getProgress().getPercentTransferred() + "%");

        if (progressEvent.getEventCode() == ProgressEvent.COMPLETED_EVENT_CODE) {
            System.out.println("Upload complete!!!");
        }
    }
});

// waitForCompletion blocks the current thread until the transfer completes
// and will throw an AmazonClientException or AmazonServiceException if
// anything went wrong.
upload.waitForCompletion();

For a complete example of using Amazon S3 TransferManager and progress listeners, see the AmazonS3TransferManager sample that ships with the SDK for Java.

Are you using TransferManager in any of your projects yet? What custom code do you run in your progress listeners? Let us know in the comments!

Asynchronous Requests with the AWS SDK for Java

by Jason Fulghum | in Java

In addition to the standard, blocking/synchronous clients in the AWS SDK for Java that you’re probably already familiar with, the SDK also contains non-blocking/asynchronous clients that are just as easy to use, and often more convenient for certain types of applications.

When you call an operation with one of the standard, synchronous clients in the SDK, your code is blocked while the SDK sends your request, waits for the service to process it, and parses the response. This is an easy way to work with the SDK, but there are some situations where you just want to kick off the request, and let your code continue executing. The asynchronous clients in the SDK allow you to do exactly that. Kick off your requests, and check back later to see if they completed.

AmazonDynamoDBAsync dynamoDB = new AmazonDynamoDBAsyncClient(myCredentials);
dynamoDB.deleteTableAsync(new DeleteTableRequest(myTableName));
// Your code immediately continues executing, while your request runs in the background

Now that you know how to kick off your asynchronous request, how do you handle the response when it arrives? All of the asynchronous operations return a Future object that you can poll to see if your request has completed processing and if a response object is available. But sitting around polling a Future defeats the purpose of freeing up your code to continue executing after you kick off the request.

Usually, what you really want to do is, when the request finishes, execute some code to process the response. The asynchronous operations allow you to pass in an AsyncHandler implementation, which the SDK automatically runs as soon as your request finishes processing.

For example, the following piece of code kicks off an asynchronous request to describe an Amazon DynamoDB table. It passes in an AsyncHandler implementation, and when the request completes, the SDK runs the onSuccess method, which updates a UI label with the table’s status. AsyncHandler also provides an onError method that allows you to handle any errors that occur while processing your request.

AmazonDynamoDBAsync dynamoDB = new AmazonDynamoDBAsyncClient(myCredentials);
dynamoDB.describeTableAsync(new DescribeTableRequest().withTableName(myTableName),
    new AsyncHandler<DescribeTableRequest, DescribeTableResult>() {
        public void onSuccess(DescribeTableRequest request, DescribeTableResult result) {
            // myLabel is assumed to be a UI label defined elsewhere in your application
            myLabel.setText(result.getTable().getTableStatus());
        }

        public void onError(Exception exception) {
            System.out.println("Error describing table: " + exception.getMessage());
            // Callers can also test if exception is an instance of
            // AmazonServiceException or AmazonClientException and cast
            // it to get additional information
        }
    });

Using the asynchronous clients in the SDK is easy and convenient. There are a lot of applications where processing requests in the background makes sense. UI applications are a great fit for asynchronous clients, since you don’t want to lock up your main UI thread, and consequently, the entire UI, while the SDK processes a request. Network issues could result in longer processing times, and an unresponsive UI that results in unhappy customers.

Another great use for the asynchronous clients is when you want to kick off a large batch of requests. If the requests don’t need to be executed serially, then you can gain a lot of throughput in your application by using the asynchronous clients to kick off many requests, all from a single thread.
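The batch pattern can be sketched with nothing but the JDK. In this example, which is an illustration rather than the SDK's async client, `fakeDescribe` is a hypothetical stand-in for a service call and `AsyncBatchSketch` is an assumed name. A single thread submits every request up front, then collects the results as they become available.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class AsyncBatchSketch {

    /** Hypothetical stand-in for a service call; a real client would send a request here. */
    static String fakeDescribe(String tableName) {
        return tableName + ":ACTIVE";
    }

    /** Submits one task per table name without blocking, then gathers the results in submission order. */
    static List<String> describeAll(List<String> tableNames, ExecutorService pool) {
        List<Future<String>> pending = new ArrayList<>();
        for (String name : tableNames) {
            Callable<String> request = () -> fakeDescribe(name);
            pending.add(pool.submit(request)); // returns immediately
        }

        List<String> results = new ArrayList<>();
        for (Future<String> future : pending) {
            try {
                results.add(future.get()); // blocks only until this one result is ready
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for a result", e);
            } catch (ExecutionException e) {
                throw new IllegalStateException("Request failed", e.getCause());
            }
        }
        return results;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            System.out.println(describeAll(Arrays.asList("users", "orders", "events"), pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

All of the requests are in flight before the first `get()` call, so the total wait is closer to the slowest request than to the sum of all of them.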

Have you tried the asynchronous clients in the AWS SDK for Java yet? What kinds of applications are you using them for? Let us know how they’re working for you in the comments below.

More information on asynchronous programming with the AWS SDK for Java

Managing Multiple AWS Accounts with the AWS Toolkit for Eclipse

When you’re building the next great application with AWS services, you’ll probably end up with several different AWS accounts. You may have one account for your production application’s resources, another for your development environment, and a couple more for personal testing. It can be really helpful to switch between these various accounts during your development, either to move resources between accounts, to compare configuration values, or to debug a problem that only occurs in one environment.

The AWS Toolkit for Eclipse makes it painless to work with multiple AWS accounts. You can configure the toolkit to store as many different accounts as you like using the toolkit preferences page:

The previous screenshot illustrates configuring each account with a name to help you remember what it’s for, as well as its access credentials. If you’re importing your credentials into Eclipse for the first time, you can follow the links in the preferences dialog box to the credentials page, where you can copy and paste them.

To configure multiple accounts, simply click the “Add account” button and fill in the account’s name and credentials. You can use the drop-down menu to edit the details of any individual account, as well as select the active account the toolkit will use.

Once you have all your accounts configured, you can quickly switch between them using the triangle drop-down menu in the upper-right-hand corner of the AWS Explorer view. It’s easy to miss this menu in Eclipse’s UI, so here’s a screenshot illustrating where to find it. The same drop-down menu also contains a shortcut to the accounts preferences page.

Switching the active account will cause the AWS Explorer view to refresh, showing you the AWS resources for whichever account you select. The active account will also be used for any actions you select from the orange AWS cube menu, such as launching a new Amazon EC2 instance.

How are you using the AWS Toolkit for Eclipse to manage your AWS accounts? Is the interface easy to understand? Does it work well for your use case? Let us know in the comments!

Working with AWS CloudFormation in Eclipse

One of the latest features we’ve added to the AWS Toolkit for Eclipse is support for working with AWS CloudFormation.

If you’re not familiar with AWS CloudFormation yet, it gives developers and systems administrators an easy way to create and manage a collection of related AWS resources, provisioning and updating them in an orderly and predictable fashion. Templates describe the AWS resources, and any associated dependencies or runtime parameters, required to run your application. For example, your template might describe a set of Amazon EC2 instances, all located in an Auto Scaling group, configured behind an elastic load balancer, and an elastic IP. You don’t need to figure out the order in which AWS services need to be provisioned or the subtleties of how to make those dependencies work. CloudFormation takes care of this for you. Once your AWS resources are deployed, you can modify and update them in a controlled and predictable way allowing you to version control your AWS infrastructure in the same way you version control your software.

AWS CloudFormation is a powerful tool for building applications on the AWS platform, and the integration in Eclipse makes it easy to harness.

When you launch an AWS CloudFormation template, you create a stack, which is all your running infrastructure, as defined by your template. You can quickly see all the AWS CloudFormation stacks running in your currently selected account and active region by opening the AWS Explorer view in Eclipse.

If you don’t have any stacks running yet, you might want to start by launching one of the many sample templates. These sample templates are a great way to get a feel for what’s possible with AWS CloudFormation, and also to learn the AWS CloudFormation template syntax. Often, you can find a sample template that’s close to your application’s architecture, and use it as a starting point for your own custom template.

To launch a new AWS CloudFormation stack from Eclipse, right-click the AWS CloudFormation node in the AWS Explorer view, and then click Create Stack. The New Stack wizard allows you to specify your own custom template, or the URL for one of the sample templates.

Once you launch your stack, you can open the stack editor by double-clicking your stack listed under the AWS CloudFormation node in the AWS Explorer view. The stack editor shows you all the information about your running stack. While your stack is launching, you can use the stack editor to view the different events for your stack as AWS CloudFormation brings up all the pieces of your infrastructure and configures them for you. You can also view the various AWS resources that are part of your stack through the stack editor, and see the parameters and outputs declared in your template.

When you’re ready to start writing your own templates, or editing existing templates, the AWS Toolkit for Eclipse has a template editor that makes it easy to work with CloudFormation templates. Just copy your template into one of your projects, and open it in the template editor. You’ll get syntax highlighting, integration with Eclipse’s outline view, content assist, and JSON syntax error reporting. There’s a lot of functionality available in the template editor, and lots more that we plan to add over time. Stay tuned to the AWS Java Blog for more updates and in-depth examples of the various features.

Are you already using AWS CloudFormation in any of your projects? Have you tried creating your own custom templates yet? Tell us how it’s going in the comments below.

Understanding Auto-Paginated Scan with DynamoDBMapper

by zachmu | in Java

The DynamoDBMapper framework is a simple way to get Java objects into Amazon DynamoDB and back out again. In a blog post a few months ago, we outlined a simple use case for saving an object to DynamoDB, loading it, and then deleting it. If you haven’t used the DynamoDBMapper framework before, you should take a few moments to read the previous post, since the use case we’re examining today is more advanced.

Reintroducing the User Class

For this example, we’ll be working with the same simple User class as the last post. The class has been properly annotated with the DynamoDBMapper annotations so that it works with the framework. The only difference is that, this time, the class has a @DynamoDBRangeKey attribute.

@DynamoDBTable(tableName = "users")
public static class User {
    private Integer id;
    private Date joinDate;
    private Set<String> friends;
    private String status;
    @DynamoDBHashKey
    public Integer getId() { return id; }
    public void setId(Integer id) { this.id = id; }

    // Range key (assumed here to be the join date)
    @DynamoDBRangeKey
    public Date getJoinDate() { return joinDate; }
    public void setJoinDate(Date joinDate) { this.joinDate = joinDate; }

    @DynamoDBAttribute(attributeName = "allFriends")
    public Set<String> getFriends() { return friends; }
    public void setFriends(Set<String> friends) { this.friends = friends; }

    @DynamoDBAttribute
    public String getStatus() { return status; }
    public void setStatus(String status) { this.status = status; }
}

Let’s say that we want to find all active users that are friends with someone named Jason. To do so, we can issue a scan request like so:

DynamoDBMapper mapper = new DynamoDBMapper(dynamo);

DynamoDBScanExpression scanExpression = new DynamoDBScanExpression();
Map<String, Condition> filter = new HashMap<String, Condition>();
filter.put("allFriends", new Condition().withComparisonOperator(ComparisonOperator.CONTAINS)
                .withAttributeValueList(new AttributeValue().withS("Jason")));
filter.put("status",
                new Condition().withComparisonOperator(ComparisonOperator.EQ).withAttributeValueList(
                        new AttributeValue().withS("active")));
scanExpression.setScanFilter(filter);

List<User> scanResult = mapper.scan(User.class, scanExpression);

Note the "allFriends" attribute on line 5. Even though the Java object property is called "friends," the @DynamoDBAttribute annotation overrides the name of the attribute to be "allFriends." Also notice that we’re using the CONTAINS comparison operator, which will check to see if a set-typed attribute contains a given value. The scan method on DynamoDBMapper immediately returns a list of results, which we can iterate over like so:

int usersFound = 0;
for ( User user : scanResult ) {
    System.out.println("Found user with id: " + user.getId());
    usersFound++;
}
System.out.println("Found " + usersFound + " users.");

So far, so good. But if we run this code on a large table, one with thousands or millions of items, we might notice some strange behavior. For one thing, our logging statements may not come at regular intervals—the program would seem to pause unpredictably in between chunks of results. And if you have wire-level logging turned on, you might notice something even stranger.

Found user with id: 5
DEBUG com.amazonaws.request - Sending Request: POST ... 
DEBUG com.amazonaws.request - Sending Request: POST ...
DEBUG com.amazonaws.request - Sending Request: POST ...
DEBUG com.amazonaws.request - Sending Request: POST ...
Found user with id: 6

Why does it take four service calls to iterate from user 5 to user 6? To answer this question, we need to understand how the scan operation works in DynamoDB, and what the scan operation in DynamoDBMapper is doing for us behind the scenes.

The Limit Parameter and Provisioned Throughput

In DynamoDB, the scan operation takes an optional limit parameter. Many new customers of the service get confused by this parameter, assuming that it’s used to limit the number of results that are returned by the operation, as is the case with the query operation. This isn’t the case at all. The limit for a scan doesn’t apply to how many results are returned, but to how many table items are examined. Because scan works on arbitrary item attributes, not the indexed table keys like query does, DynamoDB has to scan through every item in the table to find the ones you want, and it can’t predict ahead of time how many items it will have to examine to find a match.

The limit parameter is there so that you can control how much of your table’s provisioned throughput to consume with the scan before returning the results collected so far, which may be empty. That’s why it took four service calls to find user 6 after finding user 5: DynamoDB had to scan through three full pages of the table before it found another item that matched the filters we specified.

The List object returned by DynamoDBMapper.scan() hides this complexity from you and magically returns all the matching items in your table, no matter how many service calls it takes, so that you can concentrate on working with the domain objects in your search, rather than writing service calls in a loop. But it’s still helpful to understand what’s going on behind the scenes, so that you know how the scan operation can affect your table’s available provisioned throughput.
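To see why a limit on examined items leads to empty pages, here is a small in-memory simulation. It makes no DynamoDB calls: `ScanLimitSketch`, `scanPage`, and `callsUntilNextMatch` are hypothetical names, the "table" is just a list of status strings, and the limit of 2 is an arbitrary assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ScanLimitSketch {

    /** Result of one simulated scan call. */
    static final class Page {
        final List<String> matches; // items that passed the filter (may be empty)
        final int nextStart;        // index to resume from, or -1 when the table is exhausted

        Page(List<String> matches, int nextStart) {
            this.matches = matches;
            this.nextStart = nextStart;
        }
    }

    /** Examines at most `limit` items starting at `start` and returns only those equal to `wanted`. */
    static Page scanPage(List<String> table, int start, int limit, String wanted) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be > 0");
        }
        int end = Math.min(start + limit, table.size());
        List<String> matches = new ArrayList<>();
        for (int i = start; i < end; i++) {
            if (table.get(i).equals(wanted)) {
                matches.add(table.get(i));
            }
        }
        return new Page(matches, end < table.size() ? end : -1);
    }

    /** Counts the calls made, beginning at `start`, until a page has a match or the table ends. */
    static int callsUntilNextMatch(List<String> table, int start, int limit, String wanted) {
        int calls = 0;
        int next = start;
        while (next != -1) {
            Page page = scanPage(table, next, limit, wanted);
            calls++;
            if (!page.matches.isEmpty()) {
                break;
            }
            next = page.nextStart;
        }
        return calls;
    }

    public static void main(String[] args) {
        // Ten items; only the first and the last are "active".
        List<String> table = new ArrayList<>(Collections.nCopies(10, "inactive"));
        table.set(0, "active");
        table.set(9, "active");

        // After the first page (items 0-1), three pages come back empty before the next match.
        System.out.println("Calls to reach the next match: " + callsUntilNextMatch(table, 2, 2, "active"));
    }
}
```

Each call in the simulation "consumes" the same number of examined items whether or not anything matches, which mirrors how a scan uses provisioned throughput even when a page comes back empty.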

Auto-Pagination to the Rescue

The scan method returns a PaginatedList, which lazily loads more results from DynamoDB as necessary. The list will make as many service calls as necessary to load the next item in the list. In the example above, it had to make four service calls to find the next matching user between user 5 and user 6. Importantly, not all methods from the List interface can take advantage of lazy loading. For example, if you call get(), the list will try to load as many items as the index you specified, if it hasn’t loaded that many already. If you call the size() method, the list will load every single result in order to give you an accurate count. This can result in lots of provisioned throughput being consumed without you intending to, so be careful. On a very large table, it could even exhaust all the memory in your JVM.
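The lazy-loading behavior can be sketched with a plain iterator. This is an illustration of the idea, not the SDK's PaginatedList implementation: `LazyPageIterator` is a hypothetical class, and a list of in-memory pages stands in for the service, with each page access counted as one "service call."

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class LazyPagesSketch {

    /** Walks items page by page, fetching the next page only when the current one runs out. */
    static final class LazyPageIterator<T> implements Iterator<T> {
        private final List<List<T>> pages; // stand-in for the remote table, one entry per service call
        private Iterator<T> current = Collections.emptyIterator();
        private int nextPageIndex = 0;
        int fetchCount = 0;                // number of simulated service calls made so far

        LazyPageIterator(List<List<T>> pages) {
            this.pages = pages;
        }

        @Override
        public boolean hasNext() {
            // Keep fetching until a non-empty page turns up or no pages remain.
            while (!current.hasNext() && nextPageIndex < pages.size()) {
                current = pages.get(nextPageIndex++).iterator();
                fetchCount++;
            }
            return current.hasNext();
        }

        @Override
        public T next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return current.next();
        }
    }

    public static void main(String[] args) {
        // One match, three empty pages, then another match.
        List<List<Integer>> pages = Arrays.asList(
                Arrays.asList(5),
                Collections.<Integer>emptyList(),
                Collections.<Integer>emptyList(),
                Collections.<Integer>emptyList(),
                Arrays.asList(6));

        LazyPageIterator<Integer> users = new LazyPageIterator<>(pages);
        while (users.hasNext()) {
            int id = users.next();
            System.out.println("Found user with id: " + id
                    + " (pages fetched so far: " + users.fetchCount + ")");
        }
    }
}
```

Iterating only pays for the pages it actually needs. An operation like size(), by contrast, has no choice but to drain every page before it can answer.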

We’ve had customer requests to provide manually paginated scan and query methods for DynamoDBMapper to enable more fine-tuned control of provisioned throughput consumption, and we’re working on getting those out in a future release. In the meantime, tell us how you’re using the auto-paginated scan and query functionality, and what you would like to see improved, in the comments!

Subscribing Queues to Topics

by Jason Fulghum | in Java

Amazon Simple Notification Service (Amazon SNS) is a terrific service for publishing notification messages and having them automatically delivered to all your subscribers. You simply send a message to your SNS topic, and it gets delivered to all the subscribers for that topic. Amazon SNS supports many different types of subscribers for topics:

  • HTTP/HTTPS endpoints
  • Email addresses (with text or JSON format messages)
  • SMS/text-message addresses
  • Amazon SQS queues

Each type of subscriber is useful, but one of the most versatile for building systems is connecting an Amazon SQS queue directly to your Amazon SNS topic. This is a really handy and common architecture pattern when building applications on the AWS platform, and for good reason. The Amazon SNS topic provides you with a common point for sending messages and having them published to a dynamically managed list of subscribers, and the Amazon SQS queue provides you with a scalable, robust storage location for those delivered messages, while your application pulls them off the queue to process them.

Now that we’ve convinced you about the value of this pattern, let’s take a look at how to execute it in code, using the AWS SDK for Java. The first thing we need to do is create our Amazon SQS queue, and our Amazon SNS topic.

AmazonSNS sns = new AmazonSNSClient(credentials);
AmazonSQS sqs = new AmazonSQSClient(credentials);

String myTopicArn = sns.createTopic(new CreateTopicRequest("topicName")).getTopicArn();
String myQueueUrl = sqs.createQueue(new CreateQueueRequest("queueName")).getQueueUrl();

In order for a queue to receive messages from a topic, it needs to be subscribed and also needs a custom security policy to allow the topic to deliver messages to the queue. The following code in the SDK handles both of these for you automatically, without you ever having to deal with the details around building that custom policy.

Topics.subscribeQueue(sns, sqs, myTopicArn, myQueueUrl);

Now that your queue is connected to your topic, you’re ready to send messages to your topic, then pull them off of your queue. Note that it may take a few moments for the queue’s policy to be updated when the queue is initially subscribed to the topic.

sns.publish(new PublishRequest(myTopicArn, "Hello SNS World").withSubject("Subject"));

List<Message> messages = sqs.receiveMessage(new ReceiveMessageRequest(myQueueUrl)).getMessages();
if (messages.size() > 0) {
    byte[] decodedBytes = Base64.decodeBase64((messages.get(0)).getBody().getBytes());
    System.out.println("Message: " +  new String(decodedBytes));
}

For more information on using this new method to subscribe an Amazon SQS queue to an Amazon SNS topic, including an explanation of the policy that is applied to your queue, see the AWS SDK for Java API documentation for Topics.subscribeQueue(…).

AWS Java Meme Generator Sample Application

If you couldn’t make it to AWS re:Invent this year, you can watch all of the presentations on the AWS YouTube channel. My talk was about using the AWS Toolkit for Eclipse to develop and deploy a simple meme generation app.

The application uses a common AWS architectural design pattern to process its workload and serve content. All the binary image data is stored in an Amazon S3 bucket; the image metadata is stored in Amazon DynamoDB; and the image processing jobs are managed using an Amazon SQS queue.

Here’s what happens when a customer creates a new meme image:

  1. The JSP page running in AWS Elastic Beanstalk asks Amazon S3 for a set of all the images in the bucket, and displays them to the customer.
  2. The customer selects their image and a caption to write onto it, then initiates a post.
  3. The JSP page inserts a new item into DynamoDB containing the customer’s choices, such as the S3 key of the blank image and the caption to write onto it.
  4. The JSP page inserts a message into the SQS queue containing the ID of the DynamoDB item inserted in the previous step.
  5. The JSP page polls the DynamoDB item periodically, waiting for the state to become “DONE”.
  6. A back-end image processing node on Amazon EC2 polls the SQS queue for work to do and finds the message inserted by the JSP page.
  7. The back-end worker loads the appropriate item from DynamoDB, downloads the blank macro image from Amazon S3, writes the caption onto the image, then uploads it back to the bucket.
  8. The back-end worker marks the DynamoDB item as “DONE”.
  9. The JSP page notices the work is done and displays the finished image to the customer.
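The hand-off between the front end and the back-end worker in the steps above can be sketched with in-memory stand-ins. Nothing here is taken from the actual sample application: `MemeWorkflowSketch` is a hypothetical class, a map plays the role of the DynamoDB table, a `BlockingQueue` plays the role of the SQS queue, and the "PENDING" state name is an assumption (the steps only name "DONE").

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class MemeWorkflowSketch {
    // In-memory stand-ins: a map for the metadata table and a queue for the job queue.
    private final Map<String, String> statusById = new ConcurrentHashMap<>();
    private final BlockingQueue<String> jobQueue = new LinkedBlockingQueue<>();

    /** Front end (steps 3-4): record the job, then enqueue its ID for a worker. */
    void submit(String jobId) {
        statusById.put(jobId, "PENDING");
        jobQueue.add(jobId);
    }

    /**
     * Back end (steps 6-8): wait up to timeoutMillis for a job ID, "process" it, and mark it DONE.
     * Returns false if no job arrived in time.
     */
    boolean processOne(long timeoutMillis) {
        String jobId;
        try {
            jobId = jobQueue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        if (jobId == null) {
            return false;
        }
        // The real worker would download the image, write the caption, and upload the result here.
        statusById.put(jobId, "DONE");
        return true;
    }

    /** Front end (steps 5 and 9): check the job's current state. */
    String status(String jobId) {
        return statusById.get(jobId);
    }

    public static void main(String[] args) throws InterruptedException {
        MemeWorkflowSketch app = new MemeWorkflowSketch();
        Thread worker = new Thread(() -> app.processOne(1000));
        worker.start();

        app.submit("meme-1");
        worker.join();
        System.out.println("meme-1 is " + app.status("meme-1"));
    }
}
```

The front end and the worker never call each other directly. They communicate only through the queue and the status record, which is what lets the real application scale its image-processing nodes independently of the web tier.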

Several customers in attendance expressed interest in the source code for the application, so we have released it on GitHub. It takes a little work to set up, mostly because you need to add the SDK and its third-party libraries to the project’s classpath. Follow the instructions in the README file, and please let us know how we can improve them!

Iterating Over Your Objects with Amazon S3

by Jason Fulghum | in Java

There are a lot of hidden gems inside the AWS SDK for Java, and we’ll be highlighting as many as we can through this blog.

Today, we look at how to interact with paginated object and version listings from Amazon S3. Normally, when you list the contents of your Amazon S3 bucket, you’re responsible for understanding that Amazon S3 returns paginated results. This means that you get a page of results (not necessarily the entire result set), and then have to use a nextToken parameter to request the next page of results, and repeat this process until you’ve read the complete data set.

Fortunately, the AWS SDK for Java provides some utilities to automatically handle these paginated result sets for you. The S3Objects and S3Versions classes allow you to easily iterate over objects and object versions in your Amazon S3 buckets, without having to explicitly deal with pagination.

Using these iterators to traverse objects in your bucket is easy. Instead of calling s3.listObjects(...) directly, just use one of the static methods in S3Objects, such as withPrefix(...) to get back an iterable list. This allows you to easily traverse all the object summaries for the objects in your bucket, without ever having to explicitly deal with pagination.

AmazonS3Client s3 = new AmazonS3Client(myCredentials);
for ( S3ObjectSummary summary : S3Objects.withPrefix(s3, "my-bucket", "photos/") ) {
    System.out.printf("Object with key '%s'\n", summary.getKey());
}

If you’ve enabled object versioning for your buckets, then you can use the S3Versions class in exactly the same way to iterate through all the object versions in your buckets.

AmazonS3Client s3 = new AmazonS3Client(myCredentials);
for ( S3VersionSummary summary : S3Versions.withPrefix(s3, "my-bucket", "photos/") ) {
    System.out.printf("Version '%s' of key '%s'\n",
                      summary.getVersionId(), summary.getKey());
}

Sending Email with JavaMail and AWS

by Jason Fulghum | in Java

The Amazon Simple Email Service and the JavaMail API are a natural match for each other. Amazon Simple Email Service (Amazon SES) provides a highly scalable and cost-effective solution for bulk and transactional email-sending. JavaMail provides a standard and easy-to-use API for sending mail from Java applications. The AWS SDK for Java brings these two together with an AWS JavaMail provider, which gives developers the power of Amazon SES, combined with the ease of use and standard interface of the JavaMail API.

Using the AWS JavaMail provider from the SDK is easy. The following code shows how to set up a JavaMail session and send email using the AWS JavaMail transport.

/*
 * Setup JavaMail to use Amazon SES by specifying
 * the "aws" protocol and our AWS credentials.
 */
Properties props = new Properties();
props.setProperty("mail.transport.protocol", "aws");
props.setProperty("mail.aws.user", credentials.getAWSAccessKeyId());
props.setProperty("mail.aws.password", credentials.getAWSSecretKey());

Session session = Session.getInstance(props);

// Create a new Message
Message msg = new MimeMessage(session);
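// Placeholder addresses; replace them with your own verified addresses
msg.setFrom(new InternetAddress("sender@example.com"));
msg.addRecipient(Message.RecipientType.TO, new InternetAddress("recipient@example.com"));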
msg.setSubject("Hello AWS JavaMail World");
msg.setText("Sending email with the AWS JavaMail provider is easy!");

// Reuse one Transport object for sending all your messages
// for better performance
Transport t = new AWSJavaMailTransport(session, null);
t.connect();
t.sendMessage(msg, null);

// Close your transport when you're completely done sending
// all your messages.
t.close();

You can find the complete source code for this sample in the samples directory of the SDK, or go directly to it on GitHub.

The full sample also demonstrates how to verify email addresses using Amazon SES, a necessary prerequisite for sending email to those addresses until you request full production access to Amazon SES. See the Amazon Simple Email Service Developer Guide for more information on Verifying Email Addresses and Requesting Production Access.

Running the AWS SDK for Android S3Uploader sample with Eclipse

As we announced previously, the AWS Toolkit for Eclipse now supports creating AWS-enabled Android projects, making it easier to get started talking to AWS services from your Android app. The Toolkit will also optionally create a sample Android application that talks to S3. Let’s walk through creating a new AWS Android project and running the sample.

First, make sure that you have the newest AWS Toolkit for Eclipse, available at

To create a new AWS-enabled Android project, choose File > New > Project… and find the AWS Android Project wizard.

The wizard will ask you to choose a project name and an Android target. If you haven’t set up your Android SDK yet, you’ll be able to do so from this wizard. Also make sure the option to create a sample application is checked.

That’s it! The newly created project is configured with the AWS SDK for Android and the sample application. You’ll want to edit the file to fill in your AWS credentials and choose an S3 bucket name before running the application.

If this is your first time using the Android Eclipse plug-in, you may need to create an Android Virtual Device at this point using the AVD Manager view. On Windows 7, I found that I couldn’t start the emulator with the default memory settings, as referenced in this Stack Overflow question, so I had to change them:

With this change, the emulator started right up for me, and I was able to see the S3Uploader application in the device’s application list.

Finally, there’s one last trick you might find useful in using the sample application: it relies on images in the Android image gallery of the emulated device. If you can’t be bothered with mounting a file system, a simple way to get some images in there is to save them from the web browser. Just start the web browser, then tap-hold on an image and choose “Save Image”.

We’re excited by how much easier it is to get this sample running now that Eclipse does most of the setup for you. Give it a try, and let us know how it works for you!