AWS Official Blog

High Performance Multithreaded Access to Amazon SimpleDB

by Jeff Barr | in Coding Tip

The code samples described in this post are no longer available.

Please consider using Amazon DynamoDB for new applications.

We have just released a new code sample.

Written in Java, this new sample shows how Amazon SimpleDB can be used as a repository for metadata that describes objects stored in Amazon S3. The code was written to illustrate best practices for indexing S3 data and for getting the highest indexing and query performance from SimpleDB.

Indexing is implemented at two levels. At the first level, multiple threads (managed with Java's Executor framework) ensure that several S3 reads and several SimpleDB writes are in flight at the same time. At the second level, Amazon SQS is used to coordinate indexing tasks running on multiple systems, leading to an even higher degree of concurrency.
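To make the first level concrete, here is a minimal sketch of thread-pool indexing. It is not the code from the sample: the class name `ConcurrentIndexer`, the `indexAll` method, and the `readMetadata` function are hypothetical, the function stands in for an S3 read, and a `ConcurrentHashMap` stands in for a SimpleDB domain so that the example runs without AWS credentials.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical sketch: one task per object key, run on a fixed-size thread pool.
class ConcurrentIndexer {

    // readMetadata is an assumed stand-in for an S3 read; it must not return null.
    // The returned map is an assumed stand-in for a SimpleDB domain.
    static Map<String, Map<String, String>> indexAll(
            List<String> keys,
            Function<String, Map<String, String>> readMetadata,
            int threads) {
        Map<String, Map<String, String>> domain = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        // Each task reads one object's metadata and writes one item,
        // so up to `threads` reads and writes overlap.
        for (String key : keys) {
            pool.execute(() -> domain.put(key, readMetadata.apply(key)));
        }

        // Stop accepting work and wait for the queued tasks to finish.
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                pool.shutdownNow();
                throw new IllegalStateException("indexing did not finish in time");
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while indexing", e);
        }
        return domain;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> domain = indexAll(
                List.of("photos/a.jpg", "photos/b.jpg", "photos/c.jpg"),
                key -> Map.of("key", key, "type", "image/jpeg"),
                2);
        System.out.println(domain.size() + " items indexed");
    }
}
```

In a real indexer the two stand-ins would be replaced by calls to S3 and SimpleDB, and the pool size would be tuned so that network latency, not the CPU, is the limiting factor.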

Bulk queries are implemented using a pair of thread pools. The first pool runs SimpleDB queries and the second retrieves SimpleDB attributes. With the proper balance between the two pools, a Small Amazon EC2 instance was able to make over 300 requests per second.
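The two-pool arrangement can be sketched as follows. This is an illustration under stated assumptions, not the sample's implementation: `TwoPoolQuery`, `run`, `runQuery`, and `getAttributes` are hypothetical names, and the two functions stand in for a SimpleDB query (returning item names) and a SimpleDB attribute lookup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical sketch: one pool runs queries, a second pool fetches attributes.
class TwoPoolQuery {

    // runQuery (query -> item names) and getAttributes (item name -> attributes)
    // are assumed stand-ins for SimpleDB calls; getAttributes must not return null.
    static Map<String, Map<String, String>> run(
            List<String> queries,
            Function<String, List<String>> runQuery,
            Function<String, Map<String, String>> getAttributes,
            int queryThreads,
            int attributeThreads) {
        ExecutorService queryPool = Executors.newFixedThreadPool(queryThreads);
        ExecutorService attributePool = Executors.newFixedThreadPool(attributeThreads);
        Map<String, Map<String, String>> results = new ConcurrentHashMap<>();
        try {
            List<CompletableFuture<Void>> pending = new ArrayList<>();
            for (String query : queries) {
                pending.add(CompletableFuture
                        // Stage 1: run the query on the first pool.
                        .supplyAsync(() -> runQuery.apply(query), queryPool)
                        // Stage 2: fan each returned item name out to the second pool.
                        .thenCompose(names -> {
                            List<CompletableFuture<Void>> fetches = new ArrayList<>();
                            for (String name : names) {
                                fetches.add(CompletableFuture.runAsync(
                                        () -> results.put(name, getAttributes.apply(name)),
                                        attributePool));
                            }
                            return CompletableFuture.allOf(
                                    fetches.toArray(new CompletableFuture[0]));
                        }));
            }
            // Block until every query and every attribute fetch has completed.
            CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        } finally {
            queryPool.shutdown();
            attributePool.shutdown();
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> results = run(
                List.of("q1", "q2"),
                q -> List.of(q + "-item1", q + "-item2"),
                name -> Map.of("name", name),
                2, 4);
        System.out.println(results.size() + " items fetched");
    }
}
```

The balance mentioned above corresponds to the two pool sizes: each query typically yields many item names, so the attribute pool is usually given more threads than the query pool. The right ratio depends on the workload and is best found by measurement.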

Check it out!

– Jeff;