Building for Performance and Reliability with Amazon SimpleDB

What programming techniques should you use when building an application on top of Amazon SimpleDB? How can you achieve the best throughput and performance? How should you partition data among multiple domains? These topics and more are covered in this article, to help you on your way to building great applications using Amazon SimpleDB.


Submitted By: Bruce@AWS
AWS Products Used: Amazon SimpleDB
Created On: April 11, 2008


By Aleksey Martynov, Amazon Web Services

Amazon SimpleDB is a web service for running queries on structured data in real time. This service works in close conjunction with Amazon Simple Storage Service (Amazon S3) and Amazon Elastic Compute Cloud (Amazon EC2), collectively providing the ability to store, process and query data sets in the cloud. These services are designed to make web-scale computing easier and more cost-effective for developers.

Using BatchPutAttributes to improve put performance

With the BatchPutAttributes operation, you can perform multiple PutAttributes operations in a single call. This saves round trips and reduces latency, and it enables Amazon SimpleDB to optimize requests, which generally improves throughput. When you have a large amount of data to load, we highly recommend using BatchPutAttributes instead of PutAttributes.
Note: The BatchPutAttributes operation succeeds or fails in its entirety. There are no partial puts.

In general, the more items included in one BatchPutAttributes request, the more throughput you get. However, the BatchPutAttributes operation is subject to some limits (a usage sketch follows the list):

  • 256 attribute name-value pairs per item
  • 1 MB total request size
  • 1 billion attributes per domain
  • 10 GB of total user data storage per domain
  • 25 items per BatchPutAttributes operation
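
For illustration, here is a minimal sketch of a batched put using the AWS SDK for Java's SimpleDB client (a later toolkit than the sample libraries mentioned in this article; the domain mydomain and the item data are illustrative, and credential configuration is omitted):

 import java.util.Arrays;
 import java.util.List;

 import com.amazonaws.services.simpledb.AmazonSimpleDBClient;
 import com.amazonaws.services.simpledb.model.BatchPutAttributesRequest;
 import com.amazonaws.services.simpledb.model.ReplaceableAttribute;
 import com.amazonaws.services.simpledb.model.ReplaceableItem;

 public class BatchPutExample {
   public static void main(String[] args) {
     // The client reads credentials from the environment by default.
     AmazonSimpleDBClient sdb = new AmazonSimpleDBClient();

     // Build up to 25 items per request; each item carries at most 256 name-value pairs.
     List<ReplaceableItem> items = Arrays.asList(
         new ReplaceableItem("item001").withAttributes(
             new ReplaceableAttribute("Color", "Blue", true),
             new ReplaceableAttribute("Size", "Medium", true)),
         new ReplaceableItem("item002").withAttributes(
             new ReplaceableAttribute("Color", "Red", true),
             new ReplaceableAttribute("Size", "Large", true)));

     // The batch succeeds or fails as a unit; there are no partial puts.
     sdb.batchPutAttributes(new BatchPutAttributesRequest("mydomain", items));
   }
 }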

Taking advantage of request parallelization

The Amazon SimpleDB service was built to support massively parallel operations. To achieve maximum performance and throughput, you should understand how to use request parallelization.

Submitting requests to the Amazon SimpleDB service serially (i.e. issuing requests one after another, with each request starting only when the previous one completes) magnifies the Internet round-trip latencies present in every web service call. Picture your browser downloading a typical web page: if all elements of that page (images, scripts, style sheets) were loaded serially, the page would be noticeably slow to render. Most current browsers parallelize the process by loading multiple page elements simultaneously, which greatly improves performance. This same concept can be extended to applications built on top of Amazon SimpleDB.

In this section, we discuss parallelizing requests against a single domain; however, the same techniques pertain when your application needs to execute high request rates against multiple domains.

Parallelizing puts

As described above, although BatchPutAttributes optimizes put latency and throughput, it is constrained by the request size and item count limits listed above. In this section, we show how to further optimize put throughput by using request parallelization.

To improve put performance, you can distribute the data among multiple threads or processes that execute BatchPutAttributes calls in parallel. Ideally, the improvement is proportional to the number of threads. However, a large number of concurrent BatchPutAttributes calls can result in Service Unavailable (503) responses due to resource limitations. This technique can be combined with the exponential backoff algorithm described later in this article to find the optimal throughput level for your application and data set.
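
As a sketch of this pattern (again assuming the AWS SDK for Java client used in the earlier example; the pool size and batch list are illustrative), an application can split its items into batches of at most 25 and submit them from a fixed-size thread pool:

 import java.util.List;
 import java.util.concurrent.ExecutorService;
 import java.util.concurrent.Executors;
 import java.util.concurrent.TimeUnit;

 import com.amazonaws.services.simpledb.AmazonSimpleDBClient;
 import com.amazonaws.services.simpledb.model.BatchPutAttributesRequest;
 import com.amazonaws.services.simpledb.model.ReplaceableItem;

 public class ParallelBatchPut {
   // SDK clients are thread-safe, so a single instance can be shared across threads.
   private static final AmazonSimpleDBClient SDB = new AmazonSimpleDBClient();

   // 'batches' holds pre-built groups of at most 25 items each.
   static void loadInParallel(final String domain, List<List<ReplaceableItem>> batches)
       throws InterruptedException {
     // Tune the pool size empirically; too many concurrent calls provokes 503 responses.
     ExecutorService pool = Executors.newFixedThreadPool(8);
     for (final List<ReplaceableItem> batch : batches) {
       pool.submit(() -> SDB.batchPutAttributes(new BatchPutAttributesRequest(domain, batch)));
     }
     pool.shutdown();
     pool.awaitTermination(1, TimeUnit.HOURS);
   }
 }

In a production loader, each submitted task would also apply the retry-with-backoff logic described at the end of this article.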


Parallelizing queries

The pattern for executing parallel queries that are independent of each other is very similar to executing any other request: the application spawns multiple threads or processes, each responsible for processing Select calls against Amazon SimpleDB independently.

There is one additional interesting use case for parallelizing queries that involves retrieving a large number of pages for a single query expression. At first glance, it seems that this use case is not well suited for parallelization, since the application needs to page sequentially through the result. However, by splitting the search range into multiple segments, it is possible to parallelize across these segments.

Consider the following Clickstream domain.

Domain: Clickstream

Item name   URL Hash                               Hits
00001       c791751b-8d74-414a-8838-0074dc7858e2   0012
00002       55f1add7-44b7-4165-82bf-ebb611c9a9cf   0000
08764       f11b0b27-e5be-4a1b-86ed-ada040a2bbed   5364
08765       6a56024f-1b59-451c-9b17-1cd15c857850   0763
59187       39d12161-6d19-48c3-b6bf-2bd27abf9794   1211
59188       0621d81e-f20b-49f6-b626-02fa455387bb   0897

The following scenarios demonstrate how parallel queries compare favorably against the sequential method.

Scenario 1:

A single thread invokes a query (select * from Clickstream where Hits between '0000' and '1000'), which produces 10 pages of results retrieved sequentially.

Scenario 2:

The application starts 5 threads of execution as follows:

Thread 1: select * from Clickstream where Hits between '0000' and '0200' retrieves 2 pages of results
Thread 2: select * from Clickstream where Hits between '0200' and '0400' retrieves 2 pages of results
Thread 3: select * from Clickstream where Hits between '0400' and '0600' retrieves 2 pages of results
Thread 4: select * from Clickstream where Hits between '0600' and '0800' retrieves 2 pages of results
Thread 5: select * from Clickstream where Hits between '0800' and '1000' retrieves 2 pages of results

Assuming an even distribution of values over the result set, splitting the search range into multiple segments can yield up to a fivefold reduction in overall execution time in this particular example. (Note that because between is inclusive, items whose Hits value falls exactly on a segment boundary are returned by two threads; adjust the ranges or deduplicate as needed.)
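
Here is a sketch of this segmented-query pattern using the ranges from Scenario 2 (again assuming the AWS SDK for Java client; result processing is left as a comment):

 import java.util.Arrays;
 import java.util.List;
 import java.util.concurrent.ExecutorService;
 import java.util.concurrent.Executors;

 import com.amazonaws.services.simpledb.AmazonSimpleDBClient;
 import com.amazonaws.services.simpledb.model.SelectRequest;
 import com.amazonaws.services.simpledb.model.SelectResult;

 public class SegmentedSelect {
   private static final AmazonSimpleDBClient SDB = new AmazonSimpleDBClient();

   public static void main(String[] args) {
     // One segment per thread, mirroring Scenario 2 (between is inclusive at both ends).
     List<String[]> segments = Arrays.asList(
         new String[] {"0000", "0200"}, new String[] {"0200", "0400"},
         new String[] {"0400", "0600"}, new String[] {"0600", "0800"},
         new String[] {"0800", "1000"});
     ExecutorService pool = Executors.newFixedThreadPool(segments.size());
     for (final String[] seg : segments) {
       pool.submit(() -> {
         String expr = "select * from Clickstream where Hits between '"
             + seg[0] + "' and '" + seg[1] + "'";
         String nextToken = null;
         do {
           // Each thread pages through its own segment sequentially.
           SelectResult result = SDB.select(new SelectRequest(expr).withNextToken(nextToken));
           // ... process result.getItems() here ...
           nextToken = result.getNextToken();
         } while (nextToken != null);
       });
     }
     pool.shutdown();
   }
 }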

Parallelizing with Perl and PHP

There are several popular languages that do not natively support multi-threading. PHP and Perl are examples of such languages. Does that mean that developers using PHP or Perl cannot take advantage of the massively parallel programming model that Amazon SimpleDB supports? Absolutely not!

Perl provides an Async.pm module for asynchronous processing of requests. Below is the description of the module from the CPAN documentation:

Async executes some code in a separate process and retrieves the result. Since the code is running in a separate process, your main program can continue with whatever it was doing while the separate code is executing. This separate code is called an “asynchronous computation”.

For PHP, executing multiple requests against Amazon SimpleDB can be achieved through the curl_multi set of functions, which are part of the cURL module. The article "Simultaneous HTTP Requests in PHP with cURL" by Stoyan Stefanov provides an example of how to use curl_multi to issue simultaneous HTTP requests in PHP.

Partitioning data among multiple domains

A typical mechanism for improving the overall throughput of a database application is horizontal partitioning of the data. Amazon SimpleDB allows each subscriber to create multiple domains under their account. Application designers can use this feature to their advantage by partitioning their data set among multiple domains in order to parallelize their requests and improve overall throughput.

Factors that can influence the decision to partition your data:

  • Application requires higher overall put throughput than a single domain can provide. For example, your application requires 500 puts per second and with your data set you are only able to achieve 50 puts per second on a single domain. Partitioning your data across 10 domains can help you reach your goal.
  • Application requires executing queries that may require most of the items in the domain to be examined. In this case, the size of your domain will likely influence the query performance.
  • Application needs to exceed the maximum storage limit for a domain. For example, your application logs 2 GB of data per day and you are limited to 10 GB per domain, so a single domain can only hold five days' worth of data. Partitioning your data across multiple domains can significantly increase your overall storage capacity.

There are several options for partitioning your data set. Let’s consider a sample data set that contains some basic statistics about four NATO countries.

Domain: NATO

Item name   Country       Year   Personnel   Equipment
00001       France        2000   026460      008262
00002       France        2001   026436      008466
00003       France        2002   027057      008520
00004       France        2003   027045      009413
00005       France        2004   027080      009860
00006       France        2005   026813      009830
00007       Germany       2000   021853      004867
00008       Germany       2001   021371      004970
00009       Germany       2002   021124      004996
00010       Germany       2003   021068      004838
00011       Germany       2004   020150      005029
00012       Germany       2005   019680      005011
00013       Netherlands   2000   004086      001370
00014       Netherlands   2001   003960      001375
00015       Netherlands   2002   004221      001308
00016       Netherlands   2003   004395      001245
00017       Netherlands   2004   004321      001423
00018       Netherlands   2005   004331      001528
00019       USA           2000   121593      070636
00020       USA           2001   117645      083408
00021       USA           2002   131525      100067
00022       USA           2003   149896      101730
00023       USA           2004   155680      111330
00024       USA           2005   147844      114446

You may be able to find a particular dimension in your data that is well suited for partitioning. For example, "Country" may be a good partitioning dimension. In the following example, the application has created a separate domain for each country (France, Germany, Netherlands and USA):

Domain: France

Item name   Year   Personnel   Equipment
00001       2000   026460      008262
00002       2001   026436      008466
00003       2002   027057      008520
00004       2003   027045      009413
00005       2004   027080      009860
00006       2005   026813      009830

Domain: Germany

Item name   Year   Personnel   Equipment
00007       2000   021853      004867
00008       2001   021371      004970
00009       2002   021124      004996
00010       2003   021068      004838
00011       2004   020150      005029
00012       2005   019680      005011

Domain: Netherlands

Item name   Year   Personnel   Equipment
00013       2000   004086      001370
00014       2001   003960      001375
00015       2002   004221      001308
00016       2003   004395      001245
00017       2004   004321      001423
00018       2005   004331      001528

Domain: USA

Item name   Year   Personnel   Equipment
00019       2000   121593      070636
00020       2001   117645      083408
00021       2002   131525      100067
00022       2003   149896      101730
00023       2004   155680      111330
00024       2005   147844      114446

Another option for partitioning is related to the date and time associated with your data. You may be able to create separate domains for each day of the month or for each month of the year. For this particular data set, we could partition on the “Year” attribute, creating separate domains for 2000, 2001, 2002 and so on, as shown in the example domains below.

Domain: 2000

Item name   Country       Personnel   Equipment
00001       France        026460      008262
00007       Germany       021853      004867
00013       Netherlands   004086      001370
00019       USA           121593      070636

Domain: 2001

Item name   Country       Personnel   Equipment
00002       France        026436      008466
00008       Germany       021371      004970
00014       Netherlands   003960      001375
00020       USA           117645      083408

Domain: 2002

Item name   Country       Personnel   Equipment
00003       France        027057      008520
00009       Germany       021124      004996
00015       Netherlands   004221      001308
00021       USA           131525      100067

There are cases where data sets do not naturally present an easy parameter for partitioning; typical examples are event logs or web-crawler data. In such cases, a hashing algorithm can be used to create a uniform distribution of items across multiple domains. Below are sample steps for partitioning such a data set into four domains (a code sketch follows the list):

  1. Determine the hash of an item name using a well-behaved hash function, such as MD5
  2. Consider the last 2 bits of the resulting hash value and place each item in the corresponding domain:
    1. Last two bits equal to 00 – place item in Domain0
    2. Last two bits equal to 01 – place item in Domain1
    3. Last two bits equal to 10 – place item in Domain2
    4. Last two bits equal to 11 – place item in Domain3
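
A minimal sketch of steps 1 and 2 in Java, using the standard library's MessageDigest (the Domain0 through Domain3 names follow the list above):

 import java.nio.charset.StandardCharsets;
 import java.security.MessageDigest;
 import java.security.NoSuchAlgorithmException;

 public class HashPartitioner {
   // Maps an item name to one of four domains using the last 2 bits of its MD5 hash.
   static String domainFor(String itemName) throws NoSuchAlgorithmException {
     MessageDigest md5 = MessageDigest.getInstance("MD5");
     byte[] digest = md5.digest(itemName.getBytes(StandardCharsets.UTF_8));
     int lastTwoBits = digest[digest.length - 1] & 0x03; // 0, 1, 2 or 3
     return "Domain" + lastTwoBits;
   }

   public static void main(String[] args) throws NoSuchAlgorithmException {
     // The MD5 hash of "00001" ends in ...40, so the last two bits are 00.
     System.out.println(domainFor("00001")); // prints Domain0
   }
 }

Masking with 0x07 instead of 0x03 would spread items across eight domains, matching the scaling property described below.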

The outlined algorithm distributes items across domains, with uniformity directly controlled by the hash function used. An additional advantage of this scheme is the ease with which it can be adjusted to partition your data across a larger number of domains by considering more bits of the hash value (i.e. 3 bits distribute across 8 domains, 4 bits across 16 domains, and so on).

Using our example data set, let’s walk through the steps of partitioning by a hash of an item name:

  1. Calculate MD5 hash of “00001” to equal “4c68cea7e58591b579fd074bcdaff740”
  2. Determine that the last two bits of the hash value are equal to "00"
  3. Place item “00001” in Domain0

After repeating the above steps for all item names in the original dataset, we get the following distribution:

Domain: Domain0

Item name   Country       Year   Personnel   Equipment
00001       France        2000   026460      008262
00002       France        2001   026436      008466
00007       Germany       2000   021853      004867
00012       Germany       2005   019680      005011
00013       Netherlands   2000   004086      001370
00017       Netherlands   2004   004321      001423
00022       USA           2003   149896      101730
00024       USA           2005   147844      114446

Domain: Domain1

Item name   Country   Year   Personnel   Equipment
00021       USA       2002   131525      100067

Domain: Domain2

Item name   Country   Year   Personnel   Equipment
00003       France    2002   027057      008520
00005       France    2004   027080      009860
00006       France    2005   026813      009830
00009       Germany   2002   021124      004996
00019       USA       2000   121593      070636
00023       USA       2004   155680      111330

Domain: Domain3

Item name   Country       Year   Personnel   Equipment
00004       France        2003   027045      009413
00008       Germany       2001   021371      004970
00010       Germany       2003   021068      004838
00011       Germany       2004   020150      005029
00014       Netherlands   2001   003960      001375
00015       Netherlands   2002   004221      001308
00016       Netherlands   2003   004395      001245
00018       Netherlands   2005   004331      001528
00020       USA           2001   117645      083408

Given that our sample data set is small, the distribution is not optimal. With larger data sets, however, the distribution evens out.

Query performance optimization

Avoiding is null comparison

Depending on your data set, the is null comparison can be an expensive operation. Because Amazon SimpleDB is a schema-less data store, it does not index the absence of a value for a particular attribute. Although using is null is perfectly valid and necessary in certain situations, for use cases that require frequent checks for the absence of a particular attribute value, it is advisable to populate the attribute with a predefined value such as "Null" or "None."

Consider an example automotive application that stores information about different cars for its users. One of the attributes for each car is Make, and for some items that information is not known. The developer has two choices: either do not store the Make attribute for those items, or store the value "None" instead.

Scenario 1:

Do not store any value for the attribute Make for items that do not have that information.

In order to retrieve all cars that do not have Make, execute the query select * from mydomain where Make is null.

Scenario 2:

Store the value “None” for all cars that do not have Make information.

In order to retrieve all cars that do not have Make, execute the query select * from mydomain where Make = 'None'.

The second scenario is more advisable and is likely to show better performance.
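
As a sketch of the second scenario (again assuming the AWS SDK for Java client; the domain and item names are illustrative), the application writes the placeholder at put time so the value is indexed and can be matched with a plain equality predicate:

 import java.util.Arrays;

 import com.amazonaws.services.simpledb.AmazonSimpleDBClient;
 import com.amazonaws.services.simpledb.model.PutAttributesRequest;
 import com.amazonaws.services.simpledb.model.ReplaceableAttribute;

 public class NonePlaceholder {
   public static void main(String[] args) {
     AmazonSimpleDBClient sdb = new AmazonSimpleDBClient();
     // Store an explicit "None" rather than omitting the attribute entirely.
     sdb.putAttributes(new PutAttributesRequest("mydomain", "car123",
         Arrays.asList(new ReplaceableAttribute("Make", "None", true))));
   }
 }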

Reducing the number of predicates by creating composite attributes

One way to reduce the overall number of predicates in your query expressions is to create attributes that represent a combination of commonly queried values in your domain. Let's consider a product catalog application:

Domain: Products

ItemName     Title               Type   Price
520246306    Insomniac           Book   0017.97
0312341423   Cross               Book   0014.37
B0013BNY2Q   Accelerate          CD     0009.99
B0012QGP00   Keep It Simple      CD     0009.99
0374299250   Lush Life           Book   0015.60
B000ZN8036   Into The Wild       DVD    0024.99
B0011HOEY4   American Gangster   DVD    0014.99

A typical query against this domain might be select * from mydomain where `Type` = 'CD' intersection `Price` < '0010.00' (show all CDs under $10). Because this is a two-predicate query, an equivalent single-predicate query may perform better. To improve performance, we can create a composite attribute of Type and Price (i.e. TypePrice) for every item in the domain:

Domain: Products

ItemName     Title               Type   Price     TypePrice
520246306    Insomniac           Book   0017.97   Book0017.97
0312341423   Cross               Book   0014.37   Book0014.37
B0013BNY2Q   Accelerate          CD     0009.99   CD0009.99
B0012QGP00   Keep It Simple      CD     0009.99   CD0009.99
0374299250   Lush Life           Book   0015.60   Book0015.60
B000ZN8036   Into The Wild       DVD    0024.99   DVD0024.99
B0011HOEY4   American Gangster   DVD    0014.99   DVD0014.99

After creating the new composite attribute, we can achieve the same goal with a single-predicate query (select * from mydomain where TypePrice > 'CD' and TypePrice < 'CD0010.00'). While this example is extremely simple, composite attributes are a great way to improve the performance of multi-predicate queries.
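
As a sketch, the composite value can be produced at write time by zero-padding the price, so that lexicographic comparison behaves like numeric comparison (the seven-character format matches the Price values above):

 public class CompositeAttribute {
   // Builds a TypePrice value such as "CD0009.99" from its parts.
   static String typePrice(String type, double price) {
     // %07.2f zero-pads to a total width of 7, e.g. 9.99 -> "0009.99".
     return type + String.format("%07.2f", price);
   }

   public static void main(String[] args) {
     System.out.println(typePrice("CD", 9.99));   // CD0009.99
     System.out.println(typePrice("DVD", 24.99)); // DVD0024.99
   }
 }

Because composite values for CDs under $10 sort between 'CD' and 'CD0010.00', the single-predicate range query above returns exactly those items.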

Accessing Amazon SimpleDB from Amazon EC2

The latencies for individual operations against Amazon SimpleDB are often under 100 milliseconds; this is especially true for query operations. At such low latencies, the network round-trip time from the client application starts to play a significant role in the overall speed of execution. For example, if a given query executes in 30 milliseconds within Amazon SimpleDB, but the network round trip from the client application takes 200 milliseconds, then over 85% of the overall execution time is spent on network communication.

The Amazon EC2 service provides a solution to this network latency issue. Because an application running on Amazon EC2 lives within the same "cloud" as Amazon SimpleDB, network latencies are significantly reduced. Developers have reported network round-trip times of 2 milliseconds when accessing Amazon SimpleDB from EC2, which lets client applications realize the service's full performance benefits. As an additional bonus of running your application on Amazon EC2, all network bandwidth usage between different Amazon Web Services is free!

Retries and Exponential Backoff

Building applications that work well in a networked environment with unpredictable latencies opens up a new set of potential error conditions compared to typical desktop or client-server applications. Invoking requests against a web service such as Amazon SimpleDB, whether over the Internet or from an EC2 instance within the Amazon cloud, requires application designers to consider issues such as network connectivity, request timeouts, and remote errors due to overload or throttling. Numerous components on the network, such as DNS servers, switches, and load balancers, can generate errors anywhere in the life of a given request.

Once a request arrives at the service, Amazon SimpleDB uses standard HTTP error codes to communicate server and overload errors through the REST interface:

  • HTTP 500 – Internal Server Error
  • HTTP 503 – Service Unavailable (system overload)

Amazon SimpleDB uses the following error codes to communicate server and overload errors through the SOAP interface:

  • Server.InternalError – Internal Server Error (equivalent to HTTP 500)
  • Server.RequestThrottled – Request was denied due to throttling (equivalent to HTTP 503)
  • Server.ServiceOverload – Service is currently overloaded (equivalent to HTTP 503)
  • Server.ServiceUnavailable – Service is currently unavailable (equivalent to HTTP 503)

The usual technique for dealing with such error responses in a networked environment is to implement retries in the client application layer, regardless of whether those errors come from the service itself or any component on the network between Amazon SimpleDB and client application. This technique increases the reliability of the applications and reduces operational costs for the developer. One of the first products built on top of Amazon SimpleDB, the Alexa Thumbnail service, was implemented with retry logic in place and as a result has had very little operational overhead. The Alexa application maintains an excellent level of performance and availability because it automatically handles the server and overload errors.

In addition to simple retries, we recommend using an exponential backoff algorithm for better flow control. The idea behind exponential backoff is to wait progressively longer between retries for consecutive error responses: up to 400 milliseconds before the first retry, up to 1,600 milliseconds before the second, up to 6,400 milliseconds before the third, and so on. We outline the basic algorithm below:

 currentRetry = 0
 DO
   status = execute Amazon SimpleDB request
   IF status = success OR status = client error (4xx)
     set retry to false
     process the response or client error as appropriate
   ELSE
     set retry to true
     currentRetry = currentRetry + 1
     wait for a random delay between 0 and (4^currentRetry * 100) milliseconds
   END-IF
 WHILE (retry = true AND currentRetry < MaxNumberOfRetries) 

Following is a snippet of Java code that implements the exponential backoff (the HttpStatus constants are from the Apache Commons HttpClient library; submitRequest() and MAX_NUMBER_OF_RETRIES stand in for application-specific pieces):

 boolean shouldRetry = true;
 int retries = 0;
 do {
   try {
     // Submit the request to Amazon SimpleDB and capture the HTTP status code;
     // submitRequest() is a placeholder for your HTTP call.
     int status = submitRequest();
     if (status == HttpStatus.SC_OK) {
       shouldRetry = false;
       /* Process successful response from Amazon SimpleDB */
     } else {
       if (status == HttpStatus.SC_INTERNAL_SERVER_ERROR
              || status == HttpStatus.SC_SERVICE_UNAVAILABLE) {
         // Server error or overload: back off for a random delay of up to
         // 4^retries * 100 milliseconds, then retry.
         shouldRetry = true;
         long delay = (long) (Math.random() * (Math.pow(4, retries++) * 100L));
         try {
           Thread.sleep(delay);
         } catch (InterruptedException iex) {
           log.error("Caught InterruptedException exception", iex);
         }
       } else {
         shouldRetry = false;
         /* Process 4xx (Client) error */
       }
     }
   } catch (IOException ioe) {
     log.error("Caught IOException exception", ioe);
   } catch (Exception e) {
     log.error("Caught Exception", e);
   } finally {
     /* Perform clean-up as necessary */
   }
 } while (shouldRetry && retries < MAX_NUMBER_OF_RETRIES);

You can download a complete Java code sample that implements the exponential backoff algorithm from the Amazon sample libraries at https://developer.amazonwebservices.com/connect/entry.jspa?externalID=1132.

You can also download sample libraries in other languages from the AWS Resource Center for Amazon SimpleDB (Community Code section).