Amazon SimpleDB™ - Limited Beta
Amazon SimpleDB is a web service for running queries on structured data in real time. This service works in close conjunction with Amazon Simple Storage Service (Amazon S3) and Amazon Elastic Compute Cloud (Amazon EC2), collectively providing the ability to store, process and query data sets in the cloud. These services are designed to make web-scale computing easier and more cost-effective for developers.
Traditionally, this type of functionality has been accomplished with a clustered relational database that requires a sizable upfront investment, brings more complexity than is typically needed, and often requires a DBA to maintain and administer. In contrast, Amazon SimpleDB is easy to use and provides the core functionality of a database - real-time lookup and simple querying of structured data - without the operational complexity. Amazon SimpleDB requires no schema, automatically indexes your data and provides a simple API for storage and access. This eliminates the administrative burden of data modeling, index maintenance, and performance tuning. Developers gain access to this functionality within Amazon's proven computing environment, are able to scale instantly, and pay only for what they use.
Amazon SimpleDB Functionality
Amazon SimpleDB provides a simple web services interface to create and store multiple data sets, query your data easily, and return the results.
You organize your structured data into domains and can run queries across all of the data stored in a particular domain. Domains are comprised of items, and items are described by attribute-value pairs. To understand these elements, consider the metaphor of data stored in a spreadsheet table. An Amazon SimpleDB domain is like a worksheet, items are like rows of data, attributes are like column headers, and values are the data entered in each of the cells.
However, unlike a spreadsheet, Amazon SimpleDB allows multiple values to be associated with each "cell" (e.g., for item "123," the attribute "color" can have both value "blue" and value "red"). Additionally, in Amazon SimpleDB, each item can have its own unique set of associated attributes (e.g., item "123" might have attributes "description" and "color" whereas item "789" has attributes "description," "color" and "material"). Amazon SimpleDB automatically indexes your data, making it easy to quickly find the information that you need. There is no need to pre-define a schema or change a schema if new data is added later.
To use Amazon SimpleDB you:
- CREATE a new domain to house your unique set of structured data.
- GET, PUT or DELETE items in your domain, along with the attribute-value pairs that you associate with each item. Amazon SimpleDB automatically indexes data as it is added to your domain so that it can be quickly retrieved; there is no need to pre-define a schema or change a schema if new data is added later. Each item can have up to 256 attribute values. Each attribute value can range from 1 to 1,024 bytes.
- QUERY your data set using this simple set of operators: =, !=, <, >, <=, >=, STARTS-WITH, AND, OR, NOT, INTERSECTION and UNION. Query execution time is currently limited to 5 seconds. Amazon SimpleDB is designed for real-time applications and is optimized for those use cases.
- Pay only for the resources that you consume.
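The per-item limits quoted in the list above (up to 256 attribute values per item, each value from 1 to 1,024 bytes) could be checked on the client before a PUT. The following is a minimal Python sketch, not part of any Amazon SimpleDB library: the name `validate_item` and the constants are hypothetical, and it assumes a value's size is its UTF-8 encoded length, which this page does not state.

```python
MAX_VALUES_PER_ITEM = 256   # "Each item can have up to 256 attribute values"
MIN_VALUE_BYTES = 1         # "Each attribute value can range from 1 to 1,024 bytes"
MAX_VALUE_BYTES = 1024


def validate_item(attributes):
    """Return a list of limit violations for one item (empty list means OK).

    `attributes` maps attribute name -> list of string values.
    Assumption: value size is measured as UTF-8 bytes.
    """
    problems = []
    total_values = 0
    for name, values in attributes.items():
        for value in values:
            total_values += 1
            size = len(value.encode("utf-8"))
            if not MIN_VALUE_BYTES <= size <= MAX_VALUE_BYTES:
                problems.append(f"{name}: value is {size} bytes")
    if total_values > MAX_VALUES_PER_ITEM:
        problems.append(f"item has {total_values} attribute values")
    return problems


# An item with two colors and a description is within both limits.
print(validate_item({"description": ["sweater"], "color": ["blue", "red"]}))
```

Checking locally like this only avoids sending a request that would be rejected; the service remains the authority on what it accepts.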
Service Highlights
- Simple to use
Amazon SimpleDB provides streamlined access to the lookup and query functions that traditionally are achieved using a relational database cluster - while leaving out other complex, often-unused database operations. The service allows you to quickly add data and easily retrieve or edit that data through a simple set of API calls. Accessing these capabilities through a web service also eliminates the complexity of maintaining and scaling these operations.
- Flexible
With Amazon SimpleDB, it is not necessary to pre-define all of the data formats you will need to store; simply add new attributes to your Amazon SimpleDB data set when needed, and the system will automatically index your data accordingly. The ability to store structured data without first defining a schema provides developers with greater flexibility when building applications.
- Scalable
Amazon SimpleDB allows you to easily scale your application. You can quickly create new domains as your data grows or your request throughput increases. For the Beta release, a single domain is limited in size to 10 GB and you are limited to a maximum of 100 domains; however, over time these limits may be raised.
- Fast
Amazon SimpleDB provides quick, efficient storage and retrieval of your data to support high performance web applications.
- Reliable
The service runs within Amazon's high-availability data centers to provide strong and consistent performance. To prevent data from being lost or becoming unavailable, your fully indexed data is stored redundantly across multiple servers and data centers.
- Designed for use with other Amazon Web Services
Amazon SimpleDB is designed to integrate easily with other web-scale services such as Amazon EC2 and Amazon S3. For example, developers can run their applications in Amazon EC2 and store their data objects in Amazon S3. Amazon SimpleDB can then be used to query the object metadata from within the application in Amazon EC2 and return pointers to the objects stored in Amazon S3.
- Inexpensive
Amazon SimpleDB passes on to you the financial benefits of Amazon's scale. You pay only for resources you actually consume. Compare this with the significant up-front expenditures traditionally required to obtain software licenses and purchase and maintain hardware, either in-house or hosted. This frees you from many of the complexities of capacity planning, transforms large capital expenditures into much smaller operating costs, and eliminates the need to over-buy "safety net" capacity to handle periodic traffic spikes.
Pricing
Pay only for what you use. There is no minimum fee.
Machine Utilization - $0.14 per Amazon SimpleDB Machine Hour consumed
Amazon SimpleDB measures the machine utilization of each request and charges based on the amount of machine capacity used to complete the particular request (QUERY, GET, PUT, etc.), normalized to the hourly capacity of a circa 2007 1.7 GHz Xeon processor.
Data Transfer
$0.100 per GB - all data transfer in
$0.170 per GB - first 10 TB / month data transfer out
$0.130 per GB - next 40 TB / month data transfer out
$0.110 per GB - next 100 TB / month data transfer out
$0.100 per GB - data transfer out / month over 150 TB
Data transfer "in" and "out" refers to transfer into and out of Amazon SimpleDB. Data transferred between Amazon SimpleDB and other Amazon Web Services is free of charge (i.e., $0.00 per GB).
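The tiered data transfer rates above can be turned into a small calculator. This is an illustrative Python sketch only: the function names are hypothetical, and it assumes 1 TB = 1,024 GB, since this page does not say whether a terabyte is counted as 1,000 or 1,024 GB.

```python
GB_PER_TB = 1024  # assumption: not stated on this page

TRANSFER_IN_PER_GB = 0.10

# (tier size in GB, price per GB); None marks the open-ended top tier.
TRANSFER_OUT_TIERS = [
    (10 * GB_PER_TB, 0.17),    # first 10 TB / month
    (40 * GB_PER_TB, 0.13),    # next 40 TB / month
    (100 * GB_PER_TB, 0.11),   # next 100 TB / month
    (None, 0.10),              # over 150 TB / month
]


def transfer_out_cost(gb_out):
    """Walk the tiers in order, billing each slice at its own rate."""
    remaining = gb_out
    cost = 0.0
    for tier_size, price in TRANSFER_OUT_TIERS:
        if remaining <= 0:
            break
        billed = remaining if tier_size is None else min(remaining, tier_size)
        cost += billed * price
        remaining -= billed
    return cost


def monthly_transfer_cost(gb_in, gb_out):
    """Total monthly charge for data moved into and out of the service."""
    return gb_in * TRANSFER_IN_PER_GB + transfer_out_cost(gb_out)


print(f"${monthly_transfer_cost(gb_in=10, gb_out=100):.2f}")
```

Under these assumptions, 100 GB out in a month falls entirely in the first tier (100 * $0.17 = $17.00), and 10 GB in adds $1.00. Transfer to and from other Amazon Web Services is free and would simply be left out of both totals.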
Structured Data Storage - $1.50 per GB-month
Amazon SimpleDB measures the size of your billable data by adding the raw byte size of the data you upload + 45 bytes of overhead for each item, attribute name and attribute-value pair.
Amazon SimpleDB is designed to store relatively small amounts of data and is optimized for fast data access and flexibility in how that data is expressed. In order to minimize your costs across AWS services, large objects or files should be stored in Amazon S3, while the pointers and the meta-data associated with those files can be stored in Amazon SimpleDB. This will allow you to quickly search for and access your files, while minimizing overall storage costs. See below for detailed descriptions on calculating your own structured data storage requirements and for a more detailed explanation of how storage in Amazon SimpleDB and storage in Amazon S3 differ.
(Amazon SimpleDB is sold by Amazon Web Services LLC.)
Resources
Detailed Description
The Data Model: Domains, Items, Attributes and Values
The data model used by Amazon SimpleDB makes it easy to input, manage and query your structured data. Developers organize their data set into domains and can run queries across all of the data stored in a particular domain. Domains are collections of items that are described by attribute-value pairs. Think of these terms as analogous to concepts in a traditional spreadsheet table. For example, take the details of a product catalog shown in the table below and consider how they would be represented in Amazon SimpleDB. The whole table/catalog would be your domain named "clothing." Individual products would be rows in the table or items in your domain. The characteristics of items would be described by column headers (attributes). Values are in individual cells. Now consider that the items below - a sweater, a dress shirt and a pair of shoes - are new products you would like to add to your domain.
In Amazon SimpleDB, to add the items above, you would PUT the three itemIDs into your domain along with the attribute-value pairs for each of the items. Without the specific syntax, it would look something like this:
PUT (item, 123), (description, sweater), (color, blue), (color, red)
PUT (item, 456), (description, dress shirt), (color, white), (color, blue)
PUT (item, 789), (description, shoes), (color, black), (material, leather)
Amazon SimpleDB differs from tables of traditional databases in several important ways. First, you have the flexibility to easily go back later on and add new attributes that only apply to certain items - for example, sleeve length for dress shirts. Additionally there is no need to pre-define data types. If you have an attribute called "size," it can have values of 10.5 for shoes and XL for sweaters, making the service extremely flexible and easy to use.
Amazon SimpleDB automatically indexes all of your data, enabling you to easily query for an item based on attributes and their values. In the above example, you could submit a query for items where (color = blue INTERSECTION description = dress shirt), and Amazon SimpleDB would quickly return item 456 as the result.
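To make the clothing example concrete, here is a small in-memory Python sketch of the data model and the query described above. It is an illustration only, not the service's API or implementation: the helpers `put` and `items_where` are hypothetical, and INTERSECTION is modeled as plain set intersection, which matches the result quoted in the text.

```python
from collections import defaultdict

# domain: item ID -> attribute name -> set of values
# (a set, because one attribute may hold several values)
clothing = defaultdict(lambda: defaultdict(set))


def put(domain, item_id, *pairs):
    """Add attribute-value pairs to an item, creating the item if needed."""
    for name, value in pairs:
        domain[item_id][name].add(value)


def items_where(domain, name, value):
    """Return the IDs of items whose attribute `name` includes `value`."""
    return {
        item_id
        for item_id, attrs in domain.items()
        if value in attrs.get(name, set())
    }


put(clothing, "123", ("description", "sweater"), ("color", "blue"), ("color", "red"))
put(clothing, "456", ("description", "dress shirt"), ("color", "white"), ("color", "blue"))
put(clothing, "789", ("description", "shoes"), ("color", "black"), ("material", "leather"))

# color = blue INTERSECTION description = dress shirt
result = items_where(clothing, "color", "blue") & items_where(
    clothing, "description", "dress shirt"
)
print(result)  # {'456'}
```

Note that no schema is declared anywhere: item 789 carries a "material" attribute that the other items lack, and item 123 holds two values for "color".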
API Summary
Amazon SimpleDB provides a small number of simple API calls which implement writing, indexing and querying data. The interface and feature set are intentionally focused on core functionality, providing a basic API for developers to build upon and making the service easy to learn and simple to use.
- CreateDomain - Create a domain that contains your dataset.
- DeleteDomain - Delete a domain.
- ListDomains - List all domains and associated metadata.
- PutAttributes - Add or update an item and its attributes, or add attribute-value pairs to items that exist already. Items are automatically indexed as they are received.
- GetAttributes - Retrieve an item and all or a subset of its attributes and values.
- DeleteAttributes - Delete an item, an attribute, or an attribute value.
- Query - Query the dataset using a query expression which specifies value tests on one or more attributes. Supported value tests are: =, !=, <, >, <=, >=, starts-with. Example: ["price" < "12.00"] INTERSECTION ["color" = "green"]
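The behavior of the non-query operations listed above can be mimicked with a toy in-memory class. This Python sketch is a hypothetical stand-in that only borrows the operation names; it is not the Amazon SimpleDB client, does not make web service calls, and simplifies PutAttributes to "add values" without modeling updates. Query is left out.

```python
class InMemorySimpleDB:
    """Hypothetical in-memory stand-in mirroring the operation names above."""

    def __init__(self):
        # domain name -> {item name -> {attribute name -> set of values}}
        self._domains = {}

    def create_domain(self, domain):
        self._domains.setdefault(domain, {})

    def delete_domain(self, domain):
        self._domains.pop(domain, None)

    def list_domains(self):
        return sorted(self._domains)

    def put_attributes(self, domain, item, pairs):
        """Add attribute-value pairs, creating the item if it does not exist."""
        attrs = self._domains[domain].setdefault(item, {})
        for name, value in pairs:
            attrs.setdefault(name, set()).add(value)

    def get_attributes(self, domain, item, names=None):
        """Return all attributes, or only those listed in `names`."""
        attrs = self._domains[domain].get(item, {})
        return {n: set(v) for n, v in attrs.items() if names is None or n in names}

    def delete_attributes(self, domain, item, name=None, value=None):
        """Delete a whole item, one attribute, or one attribute value."""
        items = self._domains[domain]
        if name is None:
            items.pop(item, None)
            return
        attrs = items.get(item, {})
        if value is None:
            attrs.pop(name, None)
        else:
            attrs.get(name, set()).discard(value)
            if not attrs.get(name):
                attrs.pop(name, None)
        if not attrs:
            items.pop(item, None)  # an item with no attributes disappears


db = InMemorySimpleDB()
db.create_domain("clothing")
db.put_attributes("clothing", "123",
                  [("description", "sweater"), ("color", "blue"), ("color", "red")])
db.delete_attributes("clothing", "123", "color", "red")
print(db.get_attributes("clothing", "123", names={"color"}))  # {'color': {'blue'}}
```

Treating an item with no remaining attributes as deleted is a design choice of this sketch, made because an item here is nothing more than its attribute-value pairs.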
Amazon SimpleDB and Relational Databases within AWS
Today, many developers correlate the word "database" with Relational Database Management Systems (RDBMS). While RDBMS offerings provide deep functionality, for many use cases, they introduce more complexity (and more cost) than is necessary. Many developers simply want to store, process, and query their data without worrying about managing schemas, maintaining indexes, tuning performance or scaling access to their data. Amazon SimpleDB removes the need to maintain a schema, while your attributes are automatically indexed to provide fast real-time lookup and querying capabilities. This flexibility minimizes the performance tuning required as the demands for your data increase.
Amazon SimpleDB eliminates administrative complexity by providing a simple set of APIs focused on the core functionality necessary to store, process, and query your data. The simplicity of this set of APIs, and the ability to access this service "in the cloud," allow you to quickly develop sophisticated applications without employing a DBA. Amazon SimpleDB allows you to easily scale your application based on your needs. You can quickly create new domains as your data grows or your request throughput increases. You no longer have to be concerned about obtaining software licenses, purchasing and maintaining hardware, and managing capacity. You pay only for what you use.
Some developers do require a complex schema or broader functionality, and will undertake the extra work required to run their own relational database. Many developers are doing just this by hosting their own databases inside the Amazon EC2 compute environment. This provides them complete control over whatever database they choose to run, while still accessing the benefits of Amazon's computing infrastructure and the ability to scale capacity up and down instantly.
Either choice is fine with us. Over time, we plan to continue to add features that make it as easy as possible for developers to pursue whichever option they prefer for obtaining database functionality.
Data Storage in Amazon SimpleDB vs. Data Storage in Amazon S3
Unlike Amazon S3, Amazon SimpleDB does not simply store raw data. Rather, it takes your data as input and expands it to create indices across multiple dimensions, which enables you to quickly query that data. Additionally, Amazon S3 and Amazon SimpleDB use different types of physical storage. Amazon S3 uses dense storage drives that are optimized for storing larger objects inexpensively. Amazon SimpleDB stores smaller bits of data and uses less dense drives that are optimized for data access speed.
In order to optimize your costs across AWS services, large objects or files should be stored in Amazon S3, while smaller data elements or file pointers (possibly to Amazon S3 objects) are best saved in Amazon SimpleDB. Because of the close integration between services and the free data transfer within the AWS environment, developers can easily take advantage of both the speed and querying capabilities of Amazon SimpleDB as well as the low cost of storing data in Amazon S3, by integrating both services into their applications.
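The split described above, with large objects in one store and small queryable metadata plus pointers in the other, can be sketched in Python as follows. Both stores here are plain dictionaries standing in for the roles of Amazon S3 and Amazon SimpleDB; no real service calls are made, and the names `save_file`, `find_keys` and the `s3_key` attribute are hypothetical.

```python
# Stand-ins only: neither dictionary talks to a real service.
object_store = {}  # object key -> raw bytes (the role Amazon S3 plays above)
metadata = {}      # item name -> {attribute -> set of values} (Amazon SimpleDB's role)


def save_file(key, data, **attributes):
    """Store the bytes in the object store and the searchable facts as metadata."""
    object_store[key] = data
    item = {name: {str(value)} for name, value in attributes.items()}
    item["s3_key"] = {key}  # hypothetical attribute holding the pointer
    metadata[key] = item


def find_keys(name, value):
    """Query the metadata and return pointers to the matching objects."""
    return sorted(
        next(iter(item["s3_key"]))
        for item in metadata.values()
        if value in item.get(name, set())
    )


save_file("photos/cat.jpg", b"<large image bytes>", owner="alice", kind="photo")
save_file("docs/report.pdf", b"<large pdf bytes>", owner="bob", kind="document")
save_file("photos/dog.jpg", b"<large image bytes>", owner="alice", kind="photo")

keys = find_keys("owner", "alice")
print(keys)  # ['photos/cat.jpg', 'photos/dog.jpg']
```

Only the short attribute values and the pointer are held in the queryable store, so the expensive, indexed storage stays small while the bulk of the bytes sits in the cheaper object store.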
Calculating Your Storage Needs
The best way to predict the size of your structured data storage is as follows:
Raw byte size (GB) of all item IDs + 45 bytes per item
+ Raw byte size (GB) of all attribute names + 45 bytes per attribute name
+ Raw byte size (GB) of all attribute-value pairs + 45 bytes per attribute-value pair
To calculate your estimated monthly storage cost, take the resulting size in GB and multiply by $1.50.
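The storage estimate can be worked through in a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, the input figures are invented for illustration, and 1 GB is taken as 1,024^3 bytes because this page does not define it. The raw byte totals are passed in directly, so the sketch takes no position on exactly how the service measures each component.

```python
OVERHEAD_BYTES = 45            # per item, per attribute name, per attribute-value pair
PRICE_PER_GB_MONTH = 1.50
BYTES_PER_GB = 1024 ** 3       # assumption: GB is not defined on this page


def billable_bytes(item_id_bytes, item_count,
                   attr_name_bytes, attr_name_count,
                   pair_bytes, pair_count):
    """Raw bytes plus 45 bytes of overhead for every counted element."""
    raw = item_id_bytes + attr_name_bytes + pair_bytes
    overhead = OVERHEAD_BYTES * (item_count + attr_name_count + pair_count)
    return raw + overhead


def monthly_storage_cost(total_bytes):
    """Convert bytes to GB and apply the per GB-month price."""
    return total_bytes / BYTES_PER_GB * PRICE_PER_GB_MONTH


# Invented example: 1,000,000 items with 10-byte IDs, 20 attribute names
# totalling 160 bytes, and 5,000,000 attribute-value pairs averaging 30 bytes.
total = billable_bytes(
    item_id_bytes=10_000_000, item_count=1_000_000,
    attr_name_bytes=160, attr_name_count=20,
    pair_bytes=150_000_000, pair_count=5_000_000,
)
print(total, f"${monthly_storage_cost(total):.2f}")
```

In this invented case the 45-byte overhead is larger than the raw data itself, which is typical when many short values are stored.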
Machine Utilization Example
Amazon SimpleDB measures the machine utilization of each request and charges based on the amount of machine capacity used to complete the particular request (QUERY, GET, PUT, etc.), normalized to the hourly capacity of a circa 2007 1.7 GHz Xeon processor. Machine utilization is driven by the amount of data (# of attributes, length of attributes) processed by each request. A GET operation that retrieves 256 attributes will use more resources than a GET that retrieves only 1 attribute. A multi-predicate QUERY that examines 100,000 attributes will cost more than a single predicate query that examines 250.
In the response message for each request, Amazon SimpleDB returns a field called Box Usage. Box Usage is the measure of machine resources consumed by each request. It does not include bandwidth or storage. Box Usage is reported as the portion of a machine hour used to complete a particular request. The cost of an individual request is its Box Usage (expressed in hours) * $0.14 per Amazon SimpleDB Machine Hour. The cost of all your requests is the sum of their Box Usage values (expressed in hours) * $0.14.
For example, if over the course of a month, the sum of the Box Usage for your requests uses the equivalent of one 1.7 GHz Xeon processor for 9 hours, your charge will be:
9 hours * $0.14 per Amazon SimpleDB Machine hour = $1.26.
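The same arithmetic can be expressed as a short Python sketch that sums the Box Usage reported for each request. The function names are hypothetical, and the per-request values in the example are invented so that they add up to the 9 hours used above.

```python
PRICE_PER_MACHINE_HOUR = 0.14  # $ per Amazon SimpleDB Machine Hour


def request_cost(box_usage_hours):
    """Cost of a single request, given its Box Usage in machine hours."""
    return box_usage_hours * PRICE_PER_MACHINE_HOUR


def total_cost(box_usages):
    """Cost of many requests: sum of Box Usage values times the hourly price."""
    return sum(box_usages) * PRICE_PER_MACHINE_HOUR


# Invented month: three batches of requests whose Box Usage sums to 9 hours.
month = [4.0, 3.5, 1.5]
print(f"${total_cost(month):.2f}")  # $1.26
```

An application could accumulate the Box Usage field from every response in this way to track its machine utilization charge as the month progresses.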
Getting Started
The best way to understand Amazon SimpleDB is to work through the Getting Started Guide, part of our Technical Documentation. Within a few minutes, you will be able to create your domain and start building your index!
Intended Usage and Restrictions
Your use of this service is subject to the Amazon Web Services Customer Agreement.