Building for Performance and Reliability with Amazon SimpleDB
What programming techniques should you use when building an application on top of Amazon SimpleDB? How can you achieve the best throughput and performance? How should you partition data among multiple domains? These topics and more are covered in this article, to help you on your way to building great applications using Amazon SimpleDB.
Submitted By: Bruce@AWS
AWS Products Used: Amazon SimpleDB
Created On: April 11, 2008
By Aleksey Martynov, Amazon Web Services
Amazon SimpleDB is a web service for running queries on structured data in real time. This service works in close conjunction with Amazon Simple Storage Service (Amazon S3) and Amazon Elastic Compute Cloud (Amazon EC2), collectively providing the ability to store, process and query data sets in the cloud. These services are designed to make web-scale computing easier and more cost-effective for developers.
Using BatchPutAttributes to improve put performance
With the BatchPutAttributes operation, you can perform multiple PutAttributes operations in a single call. This saves round trips, reduces latency, and enables Amazon SimpleDB to optimize requests, which generally yields better throughput. When you have a large amount of data to load, we highly recommend using BatchPutAttributes instead of PutAttributes.
Note: The BatchPutAttributes operation succeeds or fails in its entirety. There are no partial puts.
In general, the more operations included in one BatchPutAttributes request, the more throughput you get. However, the BatchPutAttributes operation has some limitations:
- 256 attribute name-value pairs per item
- 1 MB request size
- 1 billion attributes per domain
- 10 GB of total user data storage per domain
- 25 item limit per BatchPutAttributes operation
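As a sketch of how an application might respect the 25-item limit, the following hypothetical helper (the class and method names are illustrative, not part of any SimpleDB client library) splits a list of items into batches of at most 25, each of which would then be submitted as one BatchPutAttributes request. A production version would also track the 1 MB request-size limit.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative helper: split items into batches that respect the
// 25-item-per-call limit of BatchPutAttributes.
public class BatchChunker {
    static final int MAX_ITEMS_PER_BATCH = 25;

    // Returns sublist views of at most MAX_ITEMS_PER_BATCH items each.
    static <T> List<List<T>> chunk(List<T> items) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += MAX_ITEMS_PER_BATCH) {
            batches.add(items.subList(i, Math.min(i + MAX_ITEMS_PER_BATCH, items.size())));
        }
        return batches;
    }
}
```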
Request parallelization
The Amazon SimpleDB service was built to support massively parallel operations. To achieve maximum performance and throughput, you should understand how to use request parallelization.
Submitting requests to the Amazon SimpleDB service serially (i.e. issuing requests one after another, with each request starting only when the previous one completes) magnifies the Internet roundtrip latencies present in every web service call. Picture your browser downloading a typical web page; if all elements of that page (images, scripts, style sheets) were loaded serially, the time to render would be noticeably slow. Most current browsers parallelize the process by loading multiple page elements simultaneously, which greatly improves the performance. This same concept can be extended to applications built on top of Amazon SimpleDB.
In this section, we discuss parallelizing requests against a single domain; however, the same techniques pertain when your application needs to execute high request rates against multiple domains.
Parallelizing BatchPutAttributes
As described above, although BatchPutAttributes optimizes put latency and throughput, it is constrained by request size limits. In this section, we show how to further optimize put throughput through request parallelization.
To improve the put performance, you can distribute the data among multiple threads, or processes, that execute BatchPutAttributes calls in parallel. Ideally, the improvement is proportional to the number of threads. However, large numbers of concurrent BatchPutAttributes calls can result in Service Unavailable (503) responses due to resource limitation. This technique could be combined with the exponential backoff algorithm we describe later in this article to find the optimal throughput level for your application and data set.
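A minimal sketch of this pattern appears below, using a fixed thread pool. The batchPut method is a stand-in for a real BatchPutAttributes call, and all names are illustrative rather than part of an actual SimpleDB client API.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch: fan batches out to a pool of worker threads.
public class ParallelPut {
    static final AtomicInteger putsCompleted = new AtomicInteger();

    // Stand-in for one BatchPutAttributes request; replace with a real client call.
    static void batchPut(List<String> batch) {
        putsCompleted.addAndGet(batch.size());
    }

    // Submits every batch to a fixed-size thread pool and waits for completion.
    static void putAll(List<List<String>> batches, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (List<String> batch : batches) {
            pool.submit(() -> batchPut(batch));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```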
Parallelizing queries
The pattern for executing parallel queries that are independent of each other is very similar to executing any other request: the application spawns multiple threads or processes, and each one is responsible for processing Select calls against Amazon SimpleDB independently.
There is one additional interesting use case for parallelizing queries that involves retrieving a large number of pages for a single query expression. At first glance, it seems that this use case is not well suited for parallelization, since the application needs to page sequentially through the result. However, by splitting the search range into multiple segments, it is possible to parallelize across these segments.
Consider the following Clickstream domain.
Domain: Clickstream
| Item name | URL Hash | Hits |
|---|---|---|
| 00001 | c791751b-8d74-414a-8838-0074dc7858e2 | 0012 |
| 00002 | 55f1add7-44b7-4165-82bf-ebb611c9a9cf | 0000 |
| … | … | … |
| 08764 | f11b0b27-e5be-4a1b-86ed-ada040a2bbed | 5364 |
| 08765 | 6a56024f-1b59-451c-9b17-1cd15c857850 | 0763 |
| … | … | … |
| 59187 | 39d12161-6d19-48c3-b6bf-2bd27abf9794 | 1211 |
| 59188 | 0621d81e-f20b-49f6-b626-02fa455387bb | 0897 |
The following scenarios demonstrate how parallel queries compare favorably against the sequential method.
Scenario 1:
Single thread invokes a query (select * from Clickstream where Hits between '0000' and '1000'), which produces 10 pages of results retrieved sequentially.
Scenario 2:
Application starts 5 threads of execution as follows:
Thread 1: select * from Clickstream where Hits between '0000' and '0200' retrieves 2 pages of results
Thread 2: select * from Clickstream where Hits between '0200' and '0400' retrieves 2 pages of results
Thread 3: select * from Clickstream where Hits between '0400' and '0600' retrieves 2 pages of results
Thread 4: select * from Clickstream where Hits between '0600' and '0800' retrieves 2 pages of results
Thread 5: select * from Clickstream where Hits between '0800' and '1000' retrieves 2 pages of results
Assuming an even distribution of values over the result set, splitting the search range into multiple segments can provide up to a fivefold reduction in overall execution time for this particular example.
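The five segment queries above can be generated mechanically. The following sketch (class and method names are illustrative) builds one Select expression per segment; a separate thread can then page through each segment's results independently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative sketch: split the Hits range into equal segments and build
// one Select expression per segment for a worker thread to page through.
public class RangeSplitter {
    static List<String> segmentQueries(String domain, int lo, int hi, int segments) {
        List<String> queries = new ArrayList<>();
        int step = (hi - lo) / segments;
        for (int i = 0; i < segments; i++) {
            int start = lo + i * step;
            int end = (i == segments - 1) ? hi : start + step;
            queries.add(String.format(Locale.ROOT,
                    "select * from %s where Hits between '%04d' and '%04d'",
                    domain, start, end));
        }
        return queries;
    }
}
```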
Parallelizing with Perl and PHP
There are several popular languages that do not natively support multi-threading. PHP and Perl are examples of such languages. Does that mean that developers using PHP or Perl cannot take advantage of the massively parallel programming model that Amazon SimpleDB supports? Absolutely not!
Perl provides an Async.pm module for asynchronous processing of requests. Below is the description of the module from the CPAN documentation:
Async executes some code in a separate process and retrieves the result. Since the code is running in a separate process, your main program can continue with whatever it was doing while the separate code is executing. This separate code is called an “asynchronous computation”.
For PHP, executing multiple requests against Amazon SimpleDB can be achieved through the use of the curl_multi set of functions, which are part of the cURL module. The article Simultaneous HTTP Requests in PHP with cURL, by Stoyan Stefanov, provides an example of how to use curl_multi to issue simultaneous HTTP requests in PHP.
Partitioning your data set
A typical mechanism for improving overall throughput of a database application is horizontal partitioning of the data. Amazon SimpleDB allows each subscriber to create multiple domains under their account. Application designers can use this feature to their advantage by partitioning their data set among multiple domains in order to parallelize their requests and improve overall throughput.
Factors that can influence the decision to partition your data:
- Application requires higher overall put throughput than a single domain can provide. For example, your application requires 500 puts per second and with your data set you are only able to achieve 50 puts per second on a single domain. Partitioning your data across 10 domains can help you reach your goal.
- Application requires executing queries that may require most of the items in the domain to be examined. In this case, the size of your domain will likely influence the query performance.
- Application needs to exceed the maximum storage limitation for your domain. For example, your application logs 2 GB of data per day and you are limited to 10 GB per domain. A single domain can therefore only handle five days worth of data. Consequently, partitioning your data across multiple domains can significantly increase your overall storage capacity.
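The sizing arithmetic behind the throughput and storage examples above can be sketched as follows (the class and method names are illustrative). For instance, 500 puts per second at 50 puts per second per domain requires 10 domains, and a month of logging at 2 GB per day (60 GB total) requires 6 domains of 10 GB each.

```java
// Illustrative sketch of the partitioning arithmetic: how many domains are
// needed to meet a put-throughput target or a storage target.
public class DomainSizing {
    // Domains needed to reach a required aggregate put rate.
    static int domainsForThroughput(int requiredPutsPerSec, int putsPerSecPerDomain) {
        return (int) Math.ceil((double) requiredPutsPerSec / putsPerSecPerDomain);
    }

    // Domains needed to hold a total data volume at a fixed per-domain cap.
    static int domainsForStorage(double totalGb, double gbPerDomain) {
        return (int) Math.ceil(totalGb / gbPerDomain);
    }
}
```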
There are several options for partitioning your data set. Let’s consider a sample data set that contains some basic statistics about four NATO countries.
Domain: NATO
| Item name | Country | Year | Personnel | Equipment |
|---|---|---|---|---|
| 00001 | France | 2000 | 026460 | 008262 |
| 00002 | France | 2001 | 026436 | 008466 |
| 00003 | France | 2002 | 027057 | 008520 |
| 00004 | France | 2003 | 027045 | 009413 |
| 00005 | France | 2004 | 027080 | 009860 |
| 00006 | France | 2005 | 026813 | 009830 |
| 00007 | Germany | 2000 | 021853 | 004867 |
| 00008 | Germany | 2001 | 021371 | 004970 |
| 00009 | Germany | 2002 | 021124 | 004996 |
| 00010 | Germany | 2003 | 021068 | 004838 |
| 00011 | Germany | 2004 | 020150 | 005029 |
| 00012 | Germany | 2005 | 019680 | 005011 |
| 00013 | Netherlands | 2000 | 004086 | 001370 |
| 00014 | Netherlands | 2001 | 003960 | 001375 |
| 00015 | Netherlands | 2002 | 004221 | 001308 |
| 00016 | Netherlands | 2003 | 004395 | 001245 |
| 00017 | Netherlands | 2004 | 004321 | 001423 |
| 00018 | Netherlands | 2005 | 004331 | 001528 |
| 00019 | USA | 2000 | 121593 | 070636 |
| 00020 | USA | 2001 | 117645 | 083408 |
| 00021 | USA | 2002 | 131525 | 100067 |
| 00022 | USA | 2003 | 149896 | 101730 |
| 00023 | USA | 2004 | 155680 | 111330 |
| 00024 | USA | 2005 | 147844 | 114446 |
You may be able to find a particular dimension in your data that is well suited for partitioning. For example, “Country” may be a good partitioning dimension. In the following example, the application has created different domains for each country (France, Germany, Netherlands and USA):
Domain: France
| Item name | Year | Personnel | Equipment |
|---|---|---|---|
| 00001 | 2000 | 026460 | 008262 |
| 00002 | 2001 | 026436 | 008466 |
| 00003 | 2002 | 027057 | 008520 |
| 00004 | 2003 | 027045 | 009413 |
| 00005 | 2004 | 027080 | 009860 |
| 00006 | 2005 | 026813 | 009830 |
Domain: Germany
| Item name | Year | Personnel | Equipment |
|---|---|---|---|
| 00007 | 2000 | 021853 | 004867 |
| 00008 | 2001 | 021371 | 004970 |
| 00009 | 2002 | 021124 | 004996 |
| 00010 | 2003 | 021068 | 004838 |
| 00011 | 2004 | 020150 | 005029 |
| 00012 | 2005 | 019680 | 005011 |
Domain: Netherlands
| Item name | Year | Personnel | Equipment |
|---|---|---|---|
| 00013 | 2000 | 004086 | 001370 |
| 00014 | 2001 | 003960 | 001375 |
| 00015 | 2002 | 004221 | 001308 |
| 00016 | 2003 | 004395 | 001245 |
| 00017 | 2004 | 004321 | 001423 |
| 00018 | 2005 | 004331 | 001528 |
Domain: USA
| Item name | Year | Personnel | Equipment |
|---|---|---|---|
| 00019 | 2000 | 121593 | 070636 |
| 00020 | 2001 | 117645 | 083408 |
| 00021 | 2002 | 131525 | 100067 |
| 00022 | 2003 | 149896 | 101730 |
| 00023 | 2004 | 155680 | 111330 |
| 00024 | 2005 | 147844 | 114446 |
Another option for partitioning is related to the date and time associated with your data. You may be able to create separate domains for each day of the month or for each month of the year. For this particular data set, we could partition on the “Year” attribute, creating separate domains for 2000, 2001, 2002 and so on, as shown in the example domains below.
Domain: 2000
| Item name | Country | Personnel | Equipment |
|---|---|---|---|
| 00001 | France | 026460 | 008262 |
| 00007 | Germany | 021853 | 004867 |
| 00013 | Netherlands | 004086 | 001370 |
| 00019 | USA | 121593 | 070636 |
Domain: 2001
| Item name | Country | Personnel | Equipment |
|---|---|---|---|
| 00002 | France | 026436 | 008466 |
| 00008 | Germany | 021371 | 004970 |
| 00014 | Netherlands | 003960 | 001375 |
| 00020 | USA | 117645 | 083408 |
Domain: 2002
| Item name | Country | Personnel | Equipment |
|---|---|---|---|
| 00003 | France | 027057 | 008520 |
| 00009 | Germany | 021124 | 004996 |
| 00015 | Netherlands | 004221 | 001308 |
| 00021 | USA | 131525 | 100067 |
There are cases where a data set does not naturally present an easy parameter for partitioning; event logs and web-crawler data are typical examples. In such cases, a hash function can be used to create a uniform distribution of items across multiple domains. Below are sample steps for partitioning such a data set into four different domains:
- Determine the hash of an item name using a well-behaved hash function, such as MD5
- Consider the last 2 bits of the resulting hash value and place each item in the corresponding domain:
- Last two bits equal to 00 – place item in Domain0
- Last two bits equal to 01 – place item in Domain1
- Last two bits equal to 10 – place item in Domain2
- Last two bits equal to 11 – place item in Domain3
The outlined algorithm distributes items across domains with a uniformity that is directly controlled by the hash function used. An additional advantage of this scheme is the ease with which it can be adjusted to partition your data across a larger number of domains by considering more bits of the hash value (i.e. 3 bits will distribute to 8 domains, 4 bits to 16 domains and so on).
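The steps above can be sketched as follows (the class and method names are illustrative). The helper hashes the item name with MD5 and keeps the last few bits of the digest to pick a domain; passing 2 for `bits` yields four domains, 3 yields eight, and so on.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative sketch of hash partitioning: MD5 the item name and use the
// last `bits` bits of the digest to choose the destination domain.
public class HashPartitioner {
    static String domainFor(String itemName, int bits) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(itemName.getBytes(StandardCharsets.UTF_8));
            int lastByte = digest[digest.length - 1] & 0xFF;
            int bucket = lastByte & ((1 << bits) - 1);   // keep only the last `bits` bits
            return "Domain" + bucket;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```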
Using our example data set, let’s walk through the steps of partitioning by a hash of an item name:
- Calculate MD5 hash of “00001” to equal “4c68cea7e58591b579fd074bcdaff740”
- Determine that the last two bits of the hash value are equal to “00”
- Place item “00001” in Domain0
After repeating the above steps for all item names in the original dataset, we get the following distribution:
Domain: Domain0
| Item name | Country | Year | Personnel | Equipment |
|---|---|---|---|---|
| 00001 | France | 2000 | 026460 | 008262 |
| 00002 | France | 2001 | 026436 | 008466 |
| 00007 | Germany | 2000 | 021853 | 004867 |
| 00012 | Germany | 2005 | 019680 | 005011 |
| 00013 | Netherlands | 2000 | 004086 | 001370 |
| 00017 | Netherlands | 2004 | 004321 | 001423 |
| 00022 | USA | 2003 | 149896 | 101730 |
| 00024 | USA | 2005 | 147844 | 114446 |
Domain: Domain1
| Item name | Country | Year | Personnel | Equipment |
|---|---|---|---|---|
| 00021 | USA | 2002 | 131525 | 100067 |
Domain: Domain2
| Item name | Country | Year | Personnel | Equipment |
|---|---|---|---|---|
| 00003 | France | 2002 | 027057 | 008520 |
| 00005 | France | 2004 | 027080 | 009860 |
| 00006 | France | 2005 | 026813 | 009830 |
| 00009 | Germany | 2002 | 021124 | 004996 |
| 00019 | USA | 2000 | 121593 | 070636 |
| 00023 | USA | 2004 | 155680 | 111330 |
Domain: Domain3
| Item name | Country | Year | Personnel | Equipment |
|---|---|---|---|---|
| 00004 | France | 2003 | 027045 | 009413 |
| 00008 | Germany | 2001 | 021371 | 004970 |
| 00010 | Germany | 2003 | 021068 | 004838 |
| 00011 | Germany | 2004 | 020150 | 005029 |
| 00014 | Netherlands | 2001 | 003960 | 001375 |
| 00015 | Netherlands | 2002 | 004221 | 001308 |
| 00016 | Netherlands | 2003 | 004395 | 001245 |
| 00018 | Netherlands | 2005 | 004331 | 001528 |
| 00020 | USA | 2001 | 117645 | 083408 |
Because our sample data set is small, the distribution is uneven. With larger data sets, however, the distribution evens out.
Query performance optimization
Avoiding is null comparison
Depending on your data set, the is null comparison may be an expensive operation to perform. Because Amazon SimpleDB is a schema-less data store, it does not index the absence of a value for a particular attribute. Although in certain situations using the is null comparison is perfectly valid and necessary, for use cases that require frequent checks for the absence of a particular attribute value, it is advisable to populate the attribute with a predefined sentinel value such as “Null” or “None.”
Consider an example automotive application that stores information about different cars for its users. One of the attributes for each car is Make and for some items that information is not known. The developer has two choices – either do not store the attribute Make for those items that do not have it or store the value “None” instead.
Scenario 1:
Do not store any value for the attribute Make for items that do not have that information.
In order to retrieve all cars that do not have Make, execute the query select * from mydomain where Make is null.
Scenario 2:
Store the value “None” for all cars that do not have Make information.
In order to retrieve all cars that do not have Make, execute query select * from mydomain where Make = 'None'.
The second scenario is more advisable and is likely to show better performance.
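The sentinel approach from Scenario 2 can be sketched as a small normalization step before each PutAttributes call (the class and method names are illustrative): substitute the literal string "None" whenever Make is unknown, so the attribute is always present and indexable.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: ensure the Make attribute is always populated by
// substituting the sentinel value "None" when it is missing.
public class MakeNormalizer {
    static Map<String, String> withMakeSentinel(Map<String, String> attributes) {
        Map<String, String> normalized = new HashMap<>(attributes);
        normalized.putIfAbsent("Make", "None");
        return normalized;
    }
}
```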
Reducing the number of predicates by creating composite attributes
Another way to reduce the overall number of predicates in your query expression is to create attributes that represent a combination of commonly queried values in your domain. Let’s consider a product catalog application:
Domain: Products
| ItemName | Title | Type | Price |
|---|---|---|---|
| 520246306 | Insomniac | Book | 0017.97 |
| 0312341423 | Cross | Book | 0014.37 |
| B0013BNY2Q | Accelerate | CD | 0009.99 |
| B0012QGP00 | Keep It Simple | CD | 0009.99 |
| 0374299250 | Lush Life | Book | 0015.60 |
| B000ZN8036 | Into The Wild | DVD | 0024.99 |
| … | … | … | … |
| B0011HOEY4 | American Gangster | DVD | 0014.99 |
A typical query against this domain may be select * from mydomain where `Type` = 'CD' intersection `Price` < '0010.00' (show all CDs under $10). Because this is a two-predicate query, a single-predicate query may perform better. To improve performance, we can create a composite attribute of Type and Price (i.e. TypePrice) for every item in the domain:
Domain: Products
| ItemName | Title | Type | Price | TypePrice |
|---|---|---|---|---|
| 520246306 | Insomniac | Book | 0017.97 | Book0017.97 |
| 0312341423 | Cross | Book | 0014.37 | Book0014.37 |
| B0013BNY2Q | Accelerate | CD | 0009.99 | CD0009.99 |
| B0012QGP00 | Keep It Simple | CD | 0009.99 | CD0009.99 |
| 0374299250 | Lush Life | Book | 0015.60 | Book0015.60 |
| B000ZN8036 | Into The Wild | DVD | 0024.99 | DVD0024.99 |
| … | … | … | … | … |
| B0011HOEY4 | American Gangster | DVD | 0014.99 | DVD0014.99 |
After creating the new composite attribute, we can achieve the same goal with a single-predicate query (select * from mydomain where TypePrice > 'CD' and TypePrice < 'CD0010.00'). While this example is extremely simple, composite attributes can be a great way to improve the performance of multi-predicate queries.
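Building the TypePrice value can be sketched as follows (names are illustrative). The key detail is zero-padding the price to a fixed width, so that lexicographic string comparison in the query matches numeric order.

```java
import java.util.Locale;

// Illustrative sketch: build the composite TypePrice value from Type and a
// zero-padded price so lexicographic comparison matches numeric order.
public class CompositeAttribute {
    // Formats the price in the fixed 7-character form used above (e.g. 0009.99).
    static String typePrice(String type, double price) {
        return type + String.format(Locale.ROOT, "%07.2f", price);
    }
}
```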
Colocating your application with Amazon SimpleDB
The latencies for individual operations against Amazon SimpleDB are often under 100 milliseconds. This is specifically the case for query operations. With such low latencies for a web service, the network roundtrip time from the client application starts playing a significant role in the overall speed of execution. For example, if a given query call executes in 30 milliseconds within Amazon SimpleDB, but the network roundtrip time from the client application is 200 milliseconds, then over 85% of the overall execution time is spent on network communication.
The Amazon EC2 service provides a solution to the network latency issue. Since an application running on Amazon EC2 lives within the same “cloud” as Amazon SimpleDB, the network latencies are significantly reduced. Developers have reported network roundtrip times of 2 milliseconds while accessing Amazon SimpleDB from EC2, which allows client applications to realize the service's full performance. As an additional bonus of running your application on Amazon EC2, all network bandwidth usage between different Amazon Web Services is free!
Retries and Exponential Backoff
Building applications that work well in a networked environment with unpredictable latencies opens a new set of potential error conditions when compared to typical desktop or client-server applications. Invoking requests to a web service such as Amazon SimpleDB, whether over the Internet or from an EC2 instance within the Amazon cloud, requires application designers to consider issues such as network connectivity, request timeouts and remote errors due to overload or throttling. There are numerous components on the network, such as DNS servers, switches, load-balancers, and others that could generate errors anywhere in the life of a given request.
Once a request arrives at the service, Amazon SimpleDB uses standard HTTP error codes to communicate server and overload errors through the REST interface:
HTTP 500 – Internal Server Error
HTTP 503 – Service Unavailable (system overload message)
Amazon SimpleDB uses the following error codes to communicate server and overload errors through the SOAP interface:
Server.InternalError – Internal Server Error (equivalent to HTTP 500 error)
Server.RequestThrottled – Request was denied due to throttling (equivalent to HTTP 503 error)
Server.ServiceOverload – Service is currently overloaded (equivalent to HTTP 503 error)
Server.ServiceUnavailable – Service is currently unavailable (equivalent to HTTP 503 error)
The usual technique for dealing with such error responses in a networked environment is to implement retries in the client application layer, regardless of whether those errors come from the service itself or any component on the network between Amazon SimpleDB and client application. This technique increases the reliability of the applications and reduces operational costs for the developer. One of the first products built on top of Amazon SimpleDB, the Alexa Thumbnail service, was implemented with retry logic in place and as a result has had very little operational overhead. The Alexa application maintains an excellent level of performance and availability because it automatically handles the server and overload errors.
In addition to simple retries, we recommend using an exponential backoff algorithm for better flow control. The concept behind exponential backoff is to use progressively longer waits between retries for consecutive error responses: up to 400 milliseconds before the first retry, up to 1,600 milliseconds before the second, up to 6,400 milliseconds before the third, and so on. We outline the basic algorithm below:
```
currentRetry = 0
DO
    status = execute Amazon SimpleDB request
    IF status = success OR status = client error (4xx)
        set retry to false
        process the response or client error as appropriate
    ELSE
        set retry to true
        currentRetry = currentRetry + 1
        wait for a random delay between 0 and (4^currentRetry * 100) milliseconds
    END-IF
WHILE (retry = true AND currentRetry < MaxNumberOfRetries)
```
Following is a snippet of Java code that implements the exponential backoff:
```java
boolean shouldRetry = true;
int retries = 0;
do {
    try {
        /* Submit request to Amazon SimpleDB */
        if (status == HttpStatus.SC_OK) {
            shouldRetry = false;
            /* Process successful response from Amazon SimpleDB */
        } else if (status == HttpStatus.SC_INTERNAL_SERVER_ERROR
                || status == HttpStatus.SC_SERVICE_UNAVAILABLE) {
            shouldRetry = true;
            long delay = (long) (Math.random() * (Math.pow(4, retries++) * 100L));
            try {
                Thread.sleep(delay);
            } catch (InterruptedException iex) {
                log.error("Caught InterruptedException exception", iex);
            }
        } else {
            shouldRetry = false;
            /* Process 4xx (client) error */
        }
    } catch (IOException ioe) {
        log.error("Caught IOException exception", ioe);
    } catch (Exception e) {
        log.error("Caught Exception", e);
    } finally {
        /* Perform clean-up as necessary */
    }
} while (shouldRetry && retries < MAX_NUMBER_OF_RETRIES);
```
You can download a complete Java code sample that implements the exponential backoff algorithm from the Amazon sample libraries at https://developer.amazonwebservices.com/connect/entry.jspa?externalID=1132.
You can also download sample libraries in other languages from the AWS Resource Center for Amazon SimpleDB (Community Code section):
- Perl at https://developer.amazonwebservices.com/connect/entry.jspa?externalID=1136
- PHP at https://developer.amazonwebservices.com/connect/entry.jspa?externalID=1135
- C# at https://developer.amazonwebservices.com/connect/entry.jspa?externalID=1133
- VB.NET at https://developer.amazonwebservices.com/connect/entry.jspa?externalID=1134