Amazon SimpleDB provides a simple web services interface to create and store multiple data sets, query your data easily, and return the results. The service manages infrastructure provisioning, hardware and software maintenance, replication and indexing of data items, and performance tuning. This enables you to focus on application development and simply pay for the resources you actually consume storing your data and issuing requests. Amazon SimpleDB also enables scalability by allowing you to partition your workload across multiple domains. If your workload exceeds the storage and request throughput provided by a single domain, you can obtain higher throughput by creating additional domains and spreading your data and requests across them. By spreading your data and requests across multiple domains (and thus, machine resources), you benefit from a greater “surface area” of compute resources to perform requests and queries against. For example, if you spread your data across 10 domains and execute 10 queries in parallel, you will get much higher throughput than performing 10 queries sequentially against a single domain that contains all of your data.
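The partitioning idea above can be sketched in a few lines. This is a minimal illustration, not a prescribed scheme: the `customers_N` domain naming, the hash-based placement, and the `run_query` callback are all assumptions made for the example, and no SimpleDB call is made here.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

NUM_DOMAINS = 10  # assumed shard count, matching the 10-domain example above


def domain_for(item_name: str, prefix: str = "customers",
               num_domains: int = NUM_DOMAINS) -> str:
    """Deterministically map an item name to one of num_domains domain names."""
    digest = hashlib.sha256(item_name.encode("utf-8")).hexdigest()
    return f"{prefix}_{int(digest, 16) % num_domains}"


def query_all_domains(run_query, prefix: str = "customers",
                      num_domains: int = NUM_DOMAINS) -> list:
    """Call run_query(domain_name) for every domain in parallel and merge the rows.

    run_query is a hypothetical callback that issues one query against one
    domain and returns a list of results.
    """
    domains = [f"{prefix}_{i}" for i in range(num_domains)]
    with ThreadPoolExecutor(max_workers=num_domains) as pool:
        per_domain = list(pool.map(run_query, domains))
    return [row for rows in per_domain for row in rows]
```

Writes for a given item always go to `domain_for(item_name)`, while a query that must cover all of the data fans out through `query_all_domains`. Changing the number of domains later changes the mapping, so re-partitioning has to be planned by the application.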
The flexibility of Amazon SimpleDB allows you to change your data model on the fly, adding or removing attributes without breaking a rigid schema. As a result, you can reflect changes to your application and business quickly without costly refactoring or painful schema updates. You can also choose between consistent or eventually consistent read requests, gaining the flexibility to match read performance (latency and throughput) and consistency requirements to the demands of your application, or even disparate parts within your application. With Amazon SimpleDB, what the service doesn’t require you to do is equally important. Amazon SimpleDB automatically manages infrastructure provisioning, hardware and software maintenance, replication and indexing of data items, and performance tuning.
Amazon DynamoDB and Amazon SimpleDB are both non-relational databases that remove the work of database administration. Amazon DynamoDB focuses on providing seamless scalability and fast, predictable performance. It runs on solid state disks (SSDs) for low-latency response times, and there are no limits on the request capacity or storage size for a given table. This is because Amazon DynamoDB automatically partitions your data and workload over a sufficient number of servers to meet the scale requirements you specify. In contrast, a domain (the Amazon SimpleDB counterpart of a table) has a strict storage limitation of 10 GB and is limited in the request capacity it can achieve (typically under 25 writes/second); it is up to you to manage the partitioning and re-partitioning of your data over additional SimpleDB domains if you need additional scale. While SimpleDB has scaling limitations, it may be a good fit for smaller workloads that require query flexibility. Amazon SimpleDB automatically indexes all item attributes and thus supports greater query functionality at the cost of performance and scale.
Please see Running Databases on AWS for additional guidance on which solution is best for you.
AWS provides a number of database alternatives for developers. Amazon SimpleDB provides simple index and query capabilities. Amazon RDS enables you to run a fully featured relational database while offloading database administration. And, using one of our many relational database AMIs on Amazon EC2 and Amazon EBS allows you to operate your own relational database in the cloud. There are important differences between these alternatives that may make one more appropriate for your use case.
Please see Running Databases on AWS for additional guidance on which solution is best for you.
Amazon S3 stores raw data. Amazon SimpleDB takes your data as input and indexes all the attributes, enabling you to quickly query that data. Additionally, Amazon S3 and Amazon SimpleDB use different types of physical storage. Amazon S3 uses dense storage drives that are optimized for storing larger objects inexpensively. Amazon SimpleDB stores smaller bits of data and uses less dense drives that are optimized for data access speed.
In order to optimize your costs across AWS services, large objects or files should be stored in Amazon S3, while smaller data elements or file pointers (possibly to Amazon S3 objects) are best saved in Amazon SimpleDB. Because of the close integration between services and the free data transfer within the AWS environment, developers can easily take advantage of both the speed and querying capabilities of Amazon SimpleDB as well as the low cost of storing data in Amazon S3, by integrating both services into their applications. To learn more about the benefits of using Amazon SimpleDB in conjunction with Amazon S3, follow this link.
The Amazon SimpleDB data model is comprised of domains, items, attributes and values. Domains are collections of items that are described by attribute-value pairs.
Think of these terms as analogous to concepts in a traditional spreadsheet table. For example, take the details of a customer management database shown in the table below and consider how they would be represented in Amazon SimpleDB. The whole table would be your domain named “customers.” Individual customers would be rows in the table or items in your domain. The contact information would be described by column headers (attributes). Values are in individual cells.
| CustomerID | First name | Last name | Street address | City | State | Zip | Telephone |
|---|---|---|---|---|---|---|---|
| 123 | Bob | Smith | 123 Main St | Springfield | MO | 65801 | 222-333-4444 |
| 456 | James | Johnson | 456 Front St | Seattle | WA | 98104 | 333-444-5555 |
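As a rough illustration of the mapping, the table can be written as nested Python dictionaries. Using the CustomerID as the item name is an assumption made for this sketch, and `get_attributes` here is a local helper named after the operation, not a call to the service.

```python
# The domain "customers": item name -> {attribute name: value}.
# Assumption: the CustomerID column is used as the item name.
customers = {
    "123": {
        "First name": "Bob",
        "Last name": "Smith",
        "Street address": "123 Main St",
        "City": "Springfield",
        "State": "MO",
        "Zip": "65801",
        "Telephone": "222-333-4444",
    },
    "456": {
        "First name": "James",
        "Last name": "Johnson",
        "Street address": "456 Front St",
        "City": "Seattle",
        "State": "WA",
        "Zip": "98104",
        "Telephone": "333-444-5555",
    },
}


def get_attributes(domain: dict, item_name: str, attribute_names=None) -> dict:
    """Return all attributes of an item, or only the named ones if given."""
    item = domain.get(item_name, {})
    if attribute_names is None:
        return dict(item)
    return {name: item[name] for name in attribute_names if name in item}
```

For instance, `get_attributes(customers, "123", ["City", "State"])` returns the city and state cells of the first row.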
There are several factors to consider based on your specific application. You may want to store your data in a Region that…
Amazon SimpleDB supports two read consistency options: eventually consistent reads and consistent reads.
Eventually Consistent Reads (Default). The eventually consistent read option maximizes your read performance (in terms of low latency and high throughput). However, an eventually consistent read (using Select or GetAttributes) might not reflect the results of a recently completed write (using PutAttributes, BatchPutAttributes, DeleteAttributes). Consistency across all copies of data is usually reached within a second; repeating a read after a short time should return the updated data.
Consistent Reads. In addition to eventually consistent reads, Amazon SimpleDB also gives you the flexibility and control to request a consistent read if your application, or an element of your application, requires it. A consistent read (using Select or GetAttributes with ConsistentRead=true) returns a result that reflects all writes that received a successful response prior to the read. By default, GetAttributes and Select perform an eventually consistent read. Since a consistent read can potentially incur higher latency and lower read throughput, it is best to use it only when an application scenario mandates that a read operation absolutely needs to read all writes that received a successful response prior to that read. For all other scenarios, the default eventually consistent read will yield the best performance. To learn more about consistency options with Amazon SimpleDB, please see our Developer Guide.
As previously mentioned, the flexibility Amazon SimpleDB provides in specifying your read consistency requirements is important because different types of applications and use cases may have different requirements in terms of performance and consistency. Note also that Amazon SimpleDB allows you to specify consistency settings for each individual read request, so the same application could have disparate parts following different consistency settings. Here is some guidance on times when each read consistency option may be most appropriate:
Eventually Consistent Reads:
Any application (or part of an application) that values read performance (latency and throughput) more than strong consistency is well suited to eventually consistent reads. Data with a high read-to-write ratio often fits this description: for example, friend/follower lists, photo tags, and personal details within a social network. More generally, these are use cases where providing an answer quickly is more important than providing the most up-to-date answer. An example might be an ad network, where showing users an ad from inventory as fast as possible matters more than showing the ad chosen by logic updated within the past second. Another guideline is whether your application can rely on user-perceived consistency. Imagine an application that involves direct user interaction rather than programmatic access: a user updates a blog post and then hits refresh, or another user posts a comment to the blog. This wait time, between the write and the moment a user next views the data, is what we refer to as user-perceived consistency; as long as the data is consistent in time for the end user to see it, the application can use eventual consistency. In these scenarios, the time required for a write to reach all copies of the data is shorter than the time lag before the customer expects the new data to be visible (e.g., refreshes the page). As mentioned previously, Amazon SimpleDB usually reaches consistency within a second. If end users of your application will not notice or care whether updates are reflected within a second, eventual consistency makes sense for its general read performance benefits.
When an item is updated, an eventually consistent read may return the current value or the old value. When an item is inserted, an eventually consistent read may not return the item.
Consistent Reads:
Depending on your application, you may need users who read a data item to view the most recently updated version from amongst many concurrent write updates. For example, you may be running a statistics or reporting application where you can’t accept the risk that a recent write operation is not reflected in the results of a GetAttributes call or Select query. In such a case, passing the ConsistentRead=true parameter will provide consistent results.
Storing application in-memory state in SimpleDB is another example. As the value of the application state changes, the application can update SimpleDB. If the application goes down and needs to be restarted then the application can issue a consistent GetAttributes or Select call to SimpleDB to obtain the last updated application state.
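The difference between the two read options can be illustrated with a toy model. This is only a sketch under strong assumptions: one primary copy, one lagging replica, and an explicit `replicate()` step standing in for background propagation. The class and method names are hypothetical and say nothing about how the service is actually implemented.

```python
class ToyReplicatedStore:
    """Toy model with one primary copy and one lagging replica. Not the SimpleDB API."""

    def __init__(self):
        self.primary = {}
        self.replica = {}

    def put(self, item_name, value):
        # In this model a write is acknowledged once the primary copy has it.
        self.primary[item_name] = value

    def replicate(self):
        # Stands in for the propagation that, per the text, usually completes within a second.
        self.replica = dict(self.primary)

    def get(self, item_name, consistent_read=False):
        # A consistent read sees every acknowledged write; the default read may lag.
        source = self.primary if consistent_read else self.replica
        return source.get(item_name)


store = ToyReplicatedStore()
store.put("post-1", "draft")
store.replicate()
store.put("post-1", "published")                          # not yet propagated

stale = store.get("post-1")                               # may still be the old value
fresh = store.get("post-1", consistent_read=True)         # reflects the latest write
store.replicate()
settled = store.get("post-1")                             # now matches the latest write
```

In the model, the default read returns the old value until propagation has run, while the consistent read always returns the latest acknowledged write, which is the behavior an application restoring its saved state after a restart would want.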
Amazon SimpleDB is not a relational database and sacrifices complex transactions and relations (i.e., joins) in order to provide unique functionality and performance characteristics. However, Amazon SimpleDB does offer transactional semantics such as:
Conditional Puts/Deletes — enable you to insert, replace, or delete values for one or more attributes of an item if the existing value of an attribute matches the value you specify. If the value does not match or is not present, the update is rejected. Conditional Puts/Deletes are useful for preventing lost updates when different sources write concurrently to the same item.
Conditional puts and deletes are exposed via the PutAttributes and DeleteAttributes APIs by specifying an optional condition with an expected value. For example, if your application was reserving seats or selling tickets to an event, you might allow a purchase (i.e., write update) only if the specified seat was still available (the optional condition). These semantics can also be used to implement functionality such as counters, inserting an item only if it does not already exist, and optimistic concurrency control (OCC). An application can implement OCC by maintaining a version number (or a timestamp) attribute as part of an item and by performing a conditional put/delete based on the value of this version number.
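A minimal in-memory sketch of the seat reservation example with a version attribute follows. `conditional_put`, `reserve_seat`, and `ConditionalCheckFailed` are hypothetical names for this illustration; they mimic the semantics described above on a plain dictionary rather than calling PutAttributes.

```python
class ConditionalCheckFailed(Exception):
    """Raised when the expected attribute is missing or holds a different value."""


def conditional_put(item: dict, updates: dict, expected_name: str, expected_value) -> None:
    """Apply updates only if item[expected_name] exists and equals expected_value."""
    if expected_name not in item or item[expected_name] != expected_value:
        raise ConditionalCheckFailed(expected_name)
    item.update(updates)


def reserve_seat(seat: dict, buyer: str) -> bool:
    """Optimistic concurrency control: read the version, write only if it is unchanged."""
    version = seat["version"]
    if seat["status"] != "available":
        return False
    try:
        conditional_put(
            seat,
            {"status": "sold", "buyer": buyer, "version": version + 1},
            expected_name="version",
            expected_value=version,
        )
    except ConditionalCheckFailed:
        # Another writer changed the item after we read it; the caller can re-read and retry.
        return False
    return True
```

Because every successful write increments the version, a writer holding a stale version number has its update rejected instead of silently overwriting the newer one.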
To learn more about transactional semantics with Amazon SimpleDB, please refer to the Amazon SimpleDB Developer Guide.
You can get started with SimpleDB for free and without risk. Under the free tier program, you pay no charges for the first 25 Machine Hours and 1 GB of Storage that you consume each month. Amazon SimpleDB lets developers pay only for what they consume, and there is no minimum fee.
For full Amazon SimpleDB pricing, please click here.
The following examples refer to charges for usage beyond the free usage levels described above. As previously described, usage below the monthly free tier is provided at no charge.
Amazon SimpleDB measures the machine utilization of each request and charges based on the amount of machine capacity used to complete the particular request (QUERY, GET, PUT, etc.), normalized to the hourly capacity of a circa 2007 1.7 GHz Xeon processor. Machine utilization is driven by the amount of data (# of attributes, length of attributes) processed by each request. A GET operation that retrieves 256 attributes will use more resources than a GET that retrieves only 1 attribute. A multi-predicate QUERY that examines 100,000 attributes will cost more than a single predicate query that examines 250.
In the response message for each request, Amazon SimpleDB returns a field called Box Usage. Box Usage is the measure of machine resources consumed by each request. It does not include bandwidth or storage. Box usage is reported as the portion of a machine hour used to complete a particular request. For the US East (Northern Virginia) Region and US West (Oregon) Region, the cost of an individual request is Box Usage (expressed in hours) * $0.14 per Amazon SimpleDB Machine hour. The cost of all your requests is the sum of Box Usage (expressed in hours) * $0.14.
For example, if over the course of a month, the sum of the Box Usage for your requests uses the equivalent of one 1.7 GHz Xeon processor for 9 hours, your charge will be:
9 hours * $0.14 per Amazon SimpleDB Machine hour = $1.26.
If your query domains are located in the EU (Ireland) Region, Asia Pacific (Singapore) Region, Asia Pacific (Sydney) Region, or US West (Northern California) Region, Amazon SimpleDB Machine Hours are priced at $0.154 per Machine Hour. If your query domains are located in the Asia Pacific (Tokyo) Region, Amazon SimpleDB Machine Hours are priced at $0.162 per Machine Hour. If your query domains are located in the South America (Sao Paulo) Region, Amazon SimpleDB Machine Hours are priced at $0.19 per Machine Hour. All cost calculations should be adjusted to reflect pricing in the relevant Region.
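The machine hour arithmetic can be reproduced with a small helper. The prices are copied from the text above; the function name, the dictionary keys, and the idea of summing a list of per-request Box Usage values are choices made for this sketch, and the free tier is not modeled.

```python
# Price per Amazon SimpleDB Machine Hour in USD, as quoted in the text above.
MACHINE_HOUR_PRICE = {
    "US East (Northern Virginia)": 0.14,
    "US West (Oregon)": 0.14,
    "EU (Ireland)": 0.154,
    "Asia Pacific (Singapore)": 0.154,
    "Asia Pacific (Sydney)": 0.154,
    "US West (Northern California)": 0.154,
    "Asia Pacific (Tokyo)": 0.162,
    "South America (Sao Paulo)": 0.19,
}


def machine_hour_charge(box_usage_hours, region: str) -> float:
    """Sum per-request Box Usage (in hours) and price it for the given Region.

    Covers usage beyond the free tier only; rounds to whole cents.
    """
    total_hours = sum(box_usage_hours)
    return round(total_hours * MACHINE_HOUR_PRICE[region], 2)


# The 9-hour example from the text, split across two batches of requests.
example_charge = machine_hour_charge([4.5, 4.5], "US East (Northern Virginia)")
```

In practice the list would hold the Box Usage value returned with each response, accumulated over the billing month.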
Data Transfer Example:
You transfer 500 MB of data out of Amazon SimpleDB each day during the month of March in the US East (Northern Virginia) Region.
Total Data Transfer Out for the month = 500 MB x (1 GB / 1,024 MB) x 31 days = 15.14 GB
Total charge = 15.14 GB x ($0.12 / GB) = $1.82
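The same calculation as a short function, for checking other volumes. The $0.12 per GB rate and the 1,024 MB per GB conversion come from the example above; the function name and parameters are illustrative.

```python
def data_transfer_out_charge(mb_per_day: float, days: int, price_per_gb: float = 0.12):
    """Return (total GB transferred out, charge in USD) for a constant daily volume."""
    total_gb = mb_per_day / 1024 * days
    return total_gb, total_gb * price_per_gb


# March example from the text: 500 MB per day for 31 days.
total_gb, charge = data_transfer_out_charge(500, 31)
```

Rounded to two decimals, `total_gb` and `charge` match the 15.14 GB and $1.82 figures worked out above.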
The best way to predict the size of your structured data storage is as follows:
Raw byte size (GB) of all item IDs + 45 bytes per item + Raw byte size (GB) of all attribute names + 45 bytes per attribute name + Raw byte size (GB) of all attribute-value pairs + 45 bytes per attribute-value pair
To calculate your estimated monthly storage cost for the US East (Northern Virginia) Region or US West (Oregon) Region, take the resulting size in GB and multiply by $0.25. For the EU (Ireland) Region, Asia Pacific (Singapore) Region, Asia Pacific (Sydney) Region, or the US West (Northern California) Region, take the resulting size in GB and multiply by $0.275. For the Asia Pacific (Tokyo) Region, take the resulting size in GB and multiply by $0.29. For the South America (Sao Paulo) Region, take the resulting size in GB and multiply by $0.34.
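The storage estimate can be sketched as follows. The 45-byte overheads and the $0.25 rate come from the text; the function names, the aggregate inputs, the sample workload, and the choice of 1 GB = 1,024^3 bytes are assumptions for illustration.

```python
OVERHEAD_BYTES = 45          # per item, per attribute name, and per attribute-value pair
BYTES_PER_GB = 1024 ** 3     # assumption: binary gigabytes


def estimate_storage_bytes(item_id_bytes: int, item_count: int,
                           attr_name_bytes: int, attr_name_count: int,
                           pair_bytes: int, pair_count: int) -> int:
    """Raw sizes plus 45 bytes of overhead for each item, attribute name, and pair."""
    return (item_id_bytes + OVERHEAD_BYTES * item_count
            + attr_name_bytes + OVERHEAD_BYTES * attr_name_count
            + pair_bytes + OVERHEAD_BYTES * pair_count)


def monthly_storage_cost(total_bytes: int, price_per_gb: float = 0.25) -> float:
    """Monthly cost in USD at the given per-GB price (default: the $0.25 rate)."""
    return total_bytes / BYTES_PER_GB * price_per_gb


# Hypothetical workload: 1,000,000 items with 10-byte IDs, 10 attribute names
# totalling 100 bytes, and 10,000,000 attribute-value pairs totalling 200,000,000 bytes.
size_bytes = estimate_storage_bytes(10_000_000, 1_000_000, 100, 10,
                                    200_000_000, 10_000_000)
cost = monthly_storage_cost(size_bytes)
```

In this made-up workload the per-pair overhead (450,000,000 bytes) is larger than the raw data itself, which is typical when values are short.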
We charge less where our costs are less. For example, our costs are lower in the Northern Virginia Region than in the Northern California Region. Similarly, our bandwidth costs are higher in the Singapore Region than in the Northern California Region.
You organize your structured data into domains and can run queries across all of the data stored in a particular domain. Domains are comprised of items, and items are described by attribute-value pairs. To understand these elements, consider the metaphor of data stored in a spreadsheet table. An Amazon SimpleDB domain is like a worksheet, items are like rows of data, attributes are like column headers, and values are the data entered in each of the cells. However, unlike a spreadsheet, Amazon SimpleDB allows for multiple values to be associated with each “cell” (e.g., for item “123,” the attribute “color” can have both value “blue” and value “red”). Additionally, in Amazon SimpleDB, each item can have its own unique set of associated attributes (e.g., item “123” might have attributes “description” and “color” whereas item “789” has attributes “description,” “color” and “material”). Amazon SimpleDB automatically indexes your data, making it easy to quickly find the information that you need. There is no need to pre-define a schema or change a schema if new data is added later.
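The two departures from the spreadsheet metaphor, multi-valued cells and per-item attribute sets, can be modeled in memory as shown here. `ToyDomain` is a hypothetical stand-in written for this illustration, with a simple lookup table playing the role of the automatic index; the "wool" value is made up.

```python
from collections import defaultdict


class ToyDomain:
    """In-memory stand-in for a domain. Not the SimpleDB API."""

    def __init__(self):
        # item name -> attribute name -> set of values (a "cell" may hold several)
        self.items = defaultdict(lambda: defaultdict(set))
        # (attribute name, value) -> item names, maintained on every put
        self.index = defaultdict(set)

    def put(self, item_name: str, attribute: str, value: str) -> None:
        self.items[item_name][attribute].add(value)
        self.index[(attribute, value)].add(item_name)

    def find(self, attribute: str, value: str) -> list:
        """Return the sorted names of items that have this attribute-value pair."""
        return sorted(self.index.get((attribute, value), set()))


domain = ToyDomain()
domain.put("123", "color", "blue")
domain.put("123", "color", "red")        # second value for the same attribute
domain.put("789", "color", "red")
domain.put("789", "material", "wool")    # attribute that item "123" does not have

red_items = domain.find("color", "red")
```

No schema is declared anywhere: adding the `material` attribute to one item required no change to the other.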
The service runs within Amazon’s high-availability data centers to provide strong and consistent performance. To prevent data from being lost or becoming unavailable, your fully indexed data is stored redundantly across multiple servers and data centers. This reliability is consistent across all Amazon SimpleDB Regions.
Anyone can use Amazon SimpleDB. You just have to decide which Region you want Amazon SimpleDB to store your data in.