- Getting Started
- Provisioned Throughput
- Data Model & APIs
- Scale, Availability, and Durability
- Local Secondary Indexes
- Fine-Grained Access Control
- Additional Features
- Reserved Capacity
Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability. Amazon DynamoDB enables customers to offload the administrative burdens of operating and scaling distributed databases to AWS, so they don’t have to worry about hardware provisioning, setup and configuration, replication, software patching, or cluster scaling.
Q: What does Amazon DynamoDB manage on my behalf?
Amazon DynamoDB takes away one of the main stumbling blocks of scaling databases, the management of the database software and the provisioning of hardware needed to run it. Customers can deploy a non-relational database in a matter of minutes. DynamoDB automatically partitions and re-partitions your data and provisions additional server capacity as your table size grows or you increase your provisioned throughput. In addition, Amazon DynamoDB synchronously replicates data across three facilities in an AWS Region, giving you high availability and data durability.
Q: What does read consistency mean? Why should I care?
Amazon DynamoDB stores three geographically distributed replicas of each table to enable high availability and data durability. Read consistency represents the manner and timing in which the successful write or update of a data item is reflected in a subsequent read operation of that same item. Amazon DynamoDB exposes logic that enables you to specify the consistency characteristics you desire for each read request within your application.
Q: What is the consistency model of Amazon DynamoDB?
When reading data from Amazon DynamoDB, users can specify whether they want the read to be eventually consistent or strongly consistent:
Eventually Consistent Reads (Default) – the eventual consistency option maximizes your read throughput. However, an eventually consistent read might not reflect the results of a recently completed write. Consistency across all copies of data is usually reached within a second. Repeating a read after a short time should return the updated data.
Strongly Consistent Reads – in addition to eventual consistency, Amazon DynamoDB also gives you the flexibility and control to request a strongly consistent read if your application, or an element of your application, requires it. A strongly consistent read returns a result that reflects all writes that received a successful response prior to the read.
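As a rough illustration, the consistency choice is made on each read request. The sketch below only builds the parameters for a GetItem call and does not contact the service. The table name "Users", the key attribute "UserID", and the helper name are hypothetical, and the dictionary shape is assumed to match what a client such as boto3's get_item accepts.

```python
def build_get_item_request(table_name, key, strongly_consistent=False):
    """Build GetItem parameters; eventually consistent unless asked otherwise."""
    return {
        "TableName": table_name,
        "Key": key,
        # ConsistentRead=True requests a strongly consistent read.
        "ConsistentRead": strongly_consistent,
    }


# Hypothetical table and key, for illustration only.
request = build_get_item_request(
    "Users", {"UserID": {"S": "user-123"}}, strongly_consistent=True
)
```

Leaving strongly_consistent at its default keeps the read eventually consistent, which matches the default described above.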
Q: Does DynamoDB support in-place atomic updates?
Amazon DynamoDB supports fast in-place updates. You can increment or decrement a numeric attribute in a row using a single API call. Similarly, you can atomically add values to or remove values from a set of strings. View our documentation for more information on atomic updates.
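A minimal sketch of what such a single-call increment might look like follows. It only builds UpdateItem parameters. It assumes the expression-based form of the API (UpdateExpression with an ADD action), and the table, key, and attribute names are hypothetical.

```python
def build_increment_request(table_name, key, attribute, delta):
    """Build UpdateItem parameters that add `delta` to a numeric attribute."""
    return {
        "TableName": table_name,
        "Key": key,
        # ADD on a number attribute increments it; a negative delta decrements.
        "UpdateExpression": "ADD #attr :delta",
        "ExpressionAttributeNames": {"#attr": attribute},
        # Numbers are sent as strings in the low-level attribute format.
        "ExpressionAttributeValues": {":delta": {"N": str(delta)}},
    }


# Hypothetical page-view counter, decremented by one.
decrement = build_increment_request(
    "Pages", {"PageID": {"S": "home"}}, "ViewCount", -1
)
```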
Q: Why is Amazon DynamoDB built on Solid State Drives?
Amazon DynamoDB runs exclusively on Solid State Drives (SSDs). SSDs help us achieve our design goals of predictable low-latency response times for storing and accessing data at any scale. The high I/O performance of SSDs also enables us to serve high-scale request workloads cost efficiently, and to pass this efficiency along in low request pricing.
Q: DynamoDB’s storage cost seems high. Is this a cost-effective service for my use case?
As with any product, we encourage potential customers of Amazon DynamoDB to consider the total cost of a solution, not just a single pricing dimension. The total cost of servicing a database workload is a function of the request traffic requirements and the amount of data stored. Most database workloads are characterized by a requirement for high I/O (high reads/sec and writes/sec) per GB stored. Amazon DynamoDB is built on SSD drives, which raises the cost per GB stored, relative to spinning media, but it also allows us to offer very low request costs. Based on what we see in typical database workloads, we believe that the total bill for using the SSD-based DynamoDB service will usually be lower than the cost of using a typical spinning media-based relational or non-relational database. If you have a use case that involves storing a large amount of data that you rarely access, then DynamoDB may not be right for you. We recommend that you use S3 for such use cases.
It should also be noted that the storage cost reflects the cost of storing multiple copies of each data item across multiple facilities within an AWS Region.
Q: Is DynamoDB only for high-scale applications?
No. DynamoDB offers seamless scaling so you can start small and scale up and down in line with your requirements. If you need fast, predictable performance at any scale then DynamoDB may be the right choice for you.
Click “Sign Up” to get started with Amazon DynamoDB today. From there, you can begin interacting with Amazon DynamoDB using either the AWS Management Console or Amazon DynamoDB APIs. If you are using the AWS Management Console, you can create a table with Amazon DynamoDB and begin exploring with just a few clicks.
Q: What kind of query functionality does DynamoDB support?
Amazon DynamoDB supports key-value GET/PUT operations using a user-defined primary key. The primary key is the only required attribute for items in a table and it uniquely identifies each item. You specify the primary key when you create a table.
A primary key can either be a single-attribute hash key or a composite hash-range key. A single attribute hash primary key could be, for example, “UserID”. This would allow you to quickly read and write data for an item associated with a given user ID.
A composite hash-range key is indexed as a hash key element and a range key element. This multi-part key maintains a hierarchy between the first and second element values. For example, a composite hash-range key could be a combination of “UserID” (hash) and “Timestamp” (range). Holding the hash key element constant, you can search across the range key element to retrieve items. This would allow you to use the Query API to, for example, retrieve all items for a single UserID across a range of timestamps.
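The idea of holding the hash key constant and searching across the range key can be sketched with a toy in-memory model. This is not how DynamoDB is implemented. The class name, the integer timestamps, and the sample data are all assumptions made for illustration.

```python
import bisect
from collections import defaultdict


class HashRangeTable:
    """Toy in-memory model of a composite hash-range key (not DynamoDB itself)."""

    def __init__(self):
        # hash key -> sorted list of range keys, plus a lookup of full items
        self._range_keys = defaultdict(list)
        self._items = {}

    def put(self, hash_key, range_key, item):
        if (hash_key, range_key) not in self._items:
            bisect.insort(self._range_keys[hash_key], range_key)
        self._items[(hash_key, range_key)] = item

    def query(self, hash_key, low, high):
        """Return items for one hash key with low <= range key <= high."""
        keys = self._range_keys.get(hash_key, [])
        start = bisect.bisect_left(keys, low)
        end = bisect.bisect_right(keys, high)
        return [self._items[(hash_key, k)] for k in keys[start:end]]


table = HashRangeTable()
table.put("alice", 100, {"event": "login"})
table.put("alice", 200, {"event": "purchase"})
table.put("alice", 300, {"event": "logout"})
table.put("bob", 150, {"event": "login"})

# All of alice's items with a timestamp between 150 and 300, in range-key order.
recent = table.query("alice", 150, 300)
```

Because the range keys are kept sorted within each hash key, a range lookup touches only the matching slice rather than every item stored for that user.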
Q: How do I update and query data items with Amazon DynamoDB?
After you have created a table using the AWS Management Console or CreateTable API, you can use the PutItem or BatchWriteItem APIs to insert items. Then you can use the GetItem, BatchGetItem, or, if composite primary keys are enabled and in use in your table, the Query API to retrieve the item(s) you added to the table.
Q: Does Amazon DynamoDB support conditional operations?
Yes, you can specify a condition that must be satisfied for a PUT, update, or delete operation on an item to be completed. For example, you could choose to update an item only if it has a certain value. You could also choose to PUT an item into the table only if no record exists for the primary key you have specified. Conditional operations allow users to implement optimistic concurrency control systems on DynamoDB. For more information on conditional operations, please see our documentation.
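The optimistic concurrency pattern mentioned above can be sketched as follows. A plain Python dictionary stands in for a table, and the "version" attribute is a hypothetical convention chosen for this example, not something DynamoDB requires.

```python
class ConditionFailed(Exception):
    """Raised when the condition attached to a write is not satisfied."""


def put_if_absent(store, key, item):
    """Insert only if no record exists for the key."""
    if key in store:
        raise ConditionFailed(f"item {key!r} already exists")
    store[key] = dict(item)


def update_if_version(store, key, expected_version, changes):
    """Apply changes only if the stored version matches what the caller read."""
    current = store[key]
    if current["version"] != expected_version:
        raise ConditionFailed("item was modified by another writer")
    current.update(changes)
    current["version"] = expected_version + 1


store = {}
put_if_absent(store, "user-1", {"name": "Ann", "version": 1})
update_if_version(store, "user-1", 1, {"name": "Anne"})

try:
    # A writer still holding version 1 loses the race and must re-read.
    update_if_version(store, "user-1", 1, {"name": "Anna"})
    stale_write_rejected = False
except ConditionFailed:
    stale_write_rejected = True
```

In a real application, the check and the write would be a single conditional request, so no other writer could slip in between them.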
Q: Does Amazon DynamoDB support increment or decrement operations?
Yes, Amazon DynamoDB allows atomic increment and decrement operations on scalar values.
Q: When should I use Amazon DynamoDB vs a relational database engine on Amazon RDS or Amazon EC2?
Today’s web-based applications generate and consume massive amounts of data. For example, an online game might start out with only a few thousand users and a light database workload consisting of 10 writes per second and 50 reads per second. However, if the game becomes successful, it may rapidly grow to millions of users and generate tens (or even hundreds) of thousands of writes and reads per second. It may also create terabytes or more of data per day. Developing your applications against Amazon DynamoDB enables you to start small and simply dial up your request capacity for a table as your requirements scale, without incurring downtime. You pay highly cost-efficient rates for the request capacity you provision, and let Amazon DynamoDB do the work of partitioning your data and traffic over sufficient server capacity to meet your needs. Amazon DynamoDB does the database management and administration, and you simply store and request your data. Automatic replication and failover provide built-in fault tolerance, high availability, and data durability. Amazon DynamoDB gives you the peace of mind that your database is fully managed and can grow with your application requirements.
While Amazon DynamoDB tackles the core problems of database scalability, management, performance, and reliability, it does not have all the functionality of a relational database. It does not support complex relational queries (e.g. joins) or complex transactions. If your workload requires this functionality, or you are looking for compatibility with an existing relational engine, you may wish to run a relational engine on Amazon RDS or Amazon EC2. While relational database engines provide robust features and functionality, scaling a workload beyond a single relational database instance is highly complex and requires significant time and expertise. As such, if you anticipate scaling requirements for your new application and do not need relational features, Amazon DynamoDB may be the best choice for you.
Q: How does Amazon DynamoDB differ from Amazon SimpleDB? Which should I use?
Both services are non-relational databases that remove the work of database administration. Amazon DynamoDB focuses on providing seamless scalability and fast, predictable performance. It runs on solid state disks (SSDs) for low-latency response times, and there are no limits on the request capacity or storage size for a given table. This is because Amazon DynamoDB automatically partitions your data and workload over a sufficient number of servers to meet the scale requirements you provide. In contrast, a table in Amazon SimpleDB has a strict storage limitation of 10 GB and is limited in the request capacity it can achieve (typically under 25 writes/second); it is up to you to manage the partitioning and re-partitioning of your data over additional SimpleDB tables if you need additional scale. While SimpleDB has scaling limitations, it may be a good fit for smaller workloads that require query flexibility. Amazon SimpleDB automatically indexes all item attributes and thus supports query flexibility at the cost of performance and scale.
Amazon CTO Werner Vogels' DynamoDB blog post provides additional context on the evolution of non-relational database technology at Amazon.
Q: When should I use Amazon DynamoDB vs Amazon S3?
Amazon DynamoDB stores structured data, indexed by primary key, and allows low latency read and write access to items ranging from 1 byte up to 64KB. Amazon S3 stores unstructured blobs and is suited for storing large objects up to 5 TB. In order to optimize your costs across AWS services, large objects or infrequently accessed data sets should be stored in Amazon S3, while smaller data elements or file pointers (possibly to Amazon S3 objects) are best saved in Amazon DynamoDB.
Amazon DynamoDB lets you specify the request throughput you want your table to be able to achieve. Behind the scenes, the service handles the provisioning of resources to achieve the requested throughput rate. Rather than asking you to think about instances, hardware, memory, and other factors that could affect your throughput rate, we simply ask you to provision the throughput level you want to achieve. This is the provisioned throughput model of service.
Amazon DynamoDB lets you specify your throughput needs in terms of units of read capacity and write capacity for your table. During creation of a table, you specify your required read and write capacity needs and Amazon DynamoDB automatically partitions and reserves the appropriate amount of resources to meet your throughput requirements. To decide on the required read and write throughput values, consider the number of read and write data plane API calls you expect to perform per second. If at any point you anticipate traffic growth that may exceed your provisioned throughput, you can simply update your provisioned throughput values via the AWS Management Console or Amazon DynamoDB APIs. You can also reduce the provisioned throughput value for a table as demand decreases. Amazon DynamoDB will remain available while scaling its throughput level up or down.
Q: How does selection of primary key influence the scalability I can achieve?
When storing data, Amazon DynamoDB divides a table into multiple partitions and distributes the data based on the hash key element of the primary key. While allocating capacity resources, Amazon DynamoDB assumes a relatively random access pattern across all primary keys. You should set up your data model so that your requests result in a fairly even distribution of traffic across primary keys. If a table has a very small number of heavily accessed hash key elements, possibly even a single very heavily used hash key element, traffic is concentrated on a small number of partitions – potentially only one partition. If the workload is heavily unbalanced, meaning disproportionately focused on one or a few partitions, the operations will not achieve the overall provisioned throughput level. To get the most out of Amazon DynamoDB throughput, build tables where the hash key element has a large number of distinct values, and values are requested fairly uniformly, as randomly as possible. An example of a good primary key is CustomerID if the application has many customers and requests made to various customer records tend to be more or less uniform. An example of a heavily skewed primary key is “Product Category Name” where certain product categories are more popular than the rest.
Q: What is a read/write capacity unit? How do I estimate how many read and write capacity units I need for my application?
A unit of Write Capacity enables you to perform one write per second for items of up to 1KB in size. Similarly, a unit of Read Capacity enables you to perform one strongly consistent read per second (or two eventually consistent reads per second) of items of up to 1KB in size. Larger items will require more capacity. You can calculate the number of units of read and write capacity you need by estimating the number of reads or writes you need to do per second and multiplying by the size of your items (rounded up to the nearest KB).
Units of Capacity required for writes = Number of item writes per second x item size (rounded up to the nearest KB)
Units of Capacity required for reads* = Number of item reads per second x item size (rounded up to the nearest KB)
* If you use eventually consistent reads you’ll get twice the throughput in terms of reads per second.
If your items are less than 1KB in size, then each unit of Read Capacity will give you 1 read/second of capacity and each unit of Write Capacity will give you 1 write/second of capacity. For example, if your items are 512 bytes and you need to read 100 items per second from your table, then you need to provision 100 units of Read Capacity.
If your items are larger than 1KB in size, then you should calculate the number of units of Read Capacity and Write Capacity that you need. For example, if your items are 1.5KB and you want to do 100 reads/second, then you would need to provision 100 (reads per second) x 2 (1.5KB rounded up to the nearest whole number) = 200 units of Read Capacity.
Note that the required number of units of Read Capacity is determined by the number of items being read per second, not the number of API calls. For example, if you need to read 500 items per second from your table, and if your items are 1KB or less, then you need 500 units of Read Capacity. It doesn’t matter if you do 500 individual GetItem calls or 50 BatchGetItem calls that each return 10 items.
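The formulas and worked examples above can be captured in a short sketch. It uses the 1KB unit size stated in this FAQ. The function names are hypothetical, and the rounding applied to eventually consistent reads is an assumption.

```python
import math


def write_capacity_units(writes_per_second, item_size_kb):
    """Write units = writes/second x item size rounded up to the next KB."""
    return writes_per_second * math.ceil(item_size_kb)


def read_capacity_units(reads_per_second, item_size_kb, eventually_consistent=False):
    """Read units for strongly consistent reads; half as many if eventually consistent."""
    units = reads_per_second * math.ceil(item_size_kb)
    if eventually_consistent:
        # Each unit yields two eventually consistent reads (rounding up is assumed).
        units = math.ceil(units / 2)
    return units


small_items = read_capacity_units(100, 0.5)   # 512-byte items, 100 reads/second
large_items = read_capacity_units(100, 1.5)   # 1.5KB items, 100 reads/second
relaxed = read_capacity_units(100, 1.5, eventually_consistent=True)
```

The first two calls reproduce the 100-unit and 200-unit examples given above.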
Q: Will I always be able to achieve my level of provisioned throughput?
Amazon DynamoDB assumes a relatively random access pattern across all primary keys. You should set up your data model so that your requests result in a fairly even distribution of traffic across primary keys. If you have a highly uneven or skewed access pattern, you may not be able to achieve your level of provisioned throughput.
When storing data, Amazon DynamoDB divides a table into multiple partitions and distributes the data based on the hash key element of the primary key. The provisioned throughput associated with a table is also divided among the partitions; each partition's throughput is managed independently based on the quota allotted to it. There is no sharing of provisioned throughput across partitions. Consequently, a table in Amazon DynamoDB is best able to meet the provisioned throughput levels if the workload is spread fairly uniformly across the hash key values. Distributing requests across hash key values distributes the requests across partitions, which helps achieve your full provisioned throughput level.
If you have an uneven workload pattern across primary keys and are unable to achieve your provisioned throughput level, you may be able to meet your throughput needs by increasing your provisioned throughput level further, which will give more throughput to each partition. However, it is recommended that you consider modifying your request pattern or your data model in order to achieve a relatively random access pattern across primary keys.
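The effect of a skewed workload can be illustrated with a simplified model. It assumes that provisioned throughput is split evenly across partitions and never shared, which is a simplification made for this sketch. The partition count and demand figures are invented.

```python
def achievable_throughput(provisioned_units, demand_per_partition):
    """Sum the traffic each partition can serve when capacity is split evenly."""
    partitions = len(demand_per_partition)
    quota = provisioned_units / partitions  # assumed even split, no sharing
    return sum(min(demand, quota) for demand in demand_per_partition)


# 1,000 units across 4 partitions gives each partition a quota of 250.
uniform = achievable_throughput(1000, [250, 250, 250, 250])
skewed = achievable_throughput(1000, [700, 100, 100, 100])
```

In this model both workloads request 1,000 operations per second in total, but the skewed one is capped by the quota of its single hot partition and falls well short of the provisioned level.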
Q: What is the maximum throughput I can provision for a single DynamoDB table?
DynamoDB is designed to scale without limits. However, if you wish to exceed throughput rates of 10,000 write capacity units or 10,000 read capacity units for an individual table, you must first contact Amazon through this online form. If you wish to provision more than 20,000 write capacity units or 20,000 read capacity units from a single subscriber account you must first contact us using the form described above.
Q: What is the minimum throughput I can provision for a single DynamoDB table?
The smallest provisioned throughput you can request is 1 write capacity unit and 1 read capacity unit.
This falls within the free tier which allows for 5 units of write capacity and 10 units of read capacity. The free tier applies at the account level, not the table level. In other words, if you add up the provisioned capacity of all your tables, and if the total capacity is no more than 5 units of write capacity and 10 units of read capacity, your provisioned capacity would fall into the free tier.
Q: Is there any limit on how much I can change my provisioned throughput with a single request?
Yes. Amazon DynamoDB allows you to change your provisioned throughput level by up to 100% with a single UpdateTable API call. If you wish to increase your throughput by more than 100%, you can simply call UpdateTable again.
For example, if your table has 1,000 units of write capacity provisioned, you could not update your table to 3,000 with a single API call as that is more than the maximum allowed change for a single UpdateTable operation. To increase your throughput from 1,000 to 3,000 units of write capacity, simply call UpdateTable to first double your throughput to 2,000, then call UpdateTable a second time to reach 3,000 writes/second.
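The sequence of calls in this example can be computed with a small helper. This is only a sketch of the arithmetic: the function name is hypothetical, it handles increases only, and it makes no API calls.

```python
def update_table_steps(current_units, target_units):
    """List the capacity values to request, at most doubling on each call."""
    if current_units < 1:
        raise ValueError("provisioned capacity must be at least 1 unit")
    steps = []
    while current_units < target_units:
        # A single call may raise capacity by at most 100% of the current value.
        current_units = min(current_units * 2, target_units)
        steps.append(current_units)
    return steps


steps = update_table_steps(1000, 3000)
```

For the example above, the helper yields two steps: first 2,000 units, then 3,000.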
Q: How am I charged for provisioned throughput?
Every Amazon DynamoDB table has pre-provisioned the resources it needs to achieve the throughput rate you asked for. You are billed at an hourly rate for as long as your table holds on to those resources. For a complete list of prices with examples, see the DynamoDB pricing page.
Q: How do I change the provisioned throughput for an existing DynamoDB table?
There are two ways to update the provisioned throughput of an Amazon DynamoDB table. You can either make the change in the management console, or else you can use the UpdateTable API call. You may change your throughput by up to 100% with a single API call, as described above: “Is there any limit on how much I can change my provisioned throughput with a single request?”
Amazon DynamoDB will remain available while your provisioned throughput level increases or decreases.
Q: How often can I change my provisioned throughput?
You can increase your provisioned throughput as often as you want. You can decrease it twice per day. A day is defined according to the GMT time zone. For example, if you decrease the provisioned throughput for your table twice on December 12th, you won’t be able to decrease the provisioned throughput for that table again until 12:01am GMT on December 13th.
Keep in mind that you can’t change your provisioned throughput if your Amazon DynamoDB table is still in the process of responding to your last request to change provisioned throughput. Use the management console or the DescribeTable API to check the status of your table. If the status is “CREATING”, “DELETING”, or “UPDATING”, you won’t be able to adjust the throughput of your table. Please wait until you have a table in “ACTIVE” status and try again.
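Waiting for the “ACTIVE” status can be sketched as a simple polling loop. The get_status callable is a hypothetical stand-in for whatever code checks the table status, and the polling interval and retry limit are arbitrary assumptions.

```python
import time


def wait_until_active(get_status, poll_seconds=5.0, max_polls=60, sleep=time.sleep):
    """Poll `get_status()` until it returns "ACTIVE"; return the number of polls."""
    for attempt in range(1, max_polls + 1):
        if get_status() == "ACTIVE":
            return attempt
        sleep(poll_seconds)
    raise TimeoutError("table did not become ACTIVE in time")


# Stand-in for a status check: two UPDATING responses, then ACTIVE.
fake_statuses = iter(["UPDATING", "UPDATING", "ACTIVE"])
polls = wait_until_active(lambda: next(fake_statuses), sleep=lambda seconds: None)
```

Passing the sleep function as a parameter keeps the example fast to run; real code would leave the default in place.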
Q: Does the consistency level affect the throughput rate?
Yes. For a given allocation of resources, the read-rate that a DynamoDB table can achieve is different for strongly consistent and eventually consistent reads. If you request “1,000 read capacity units”, DynamoDB will allocate sufficient resources to achieve 1,000 strongly consistent reads per second of items up to 1KB. If you want to achieve 1,000 eventually consistent reads per second of items up to 1KB, you will need only half of that capacity, i.e., 500 read capacity units. For additional guidance on choosing the appropriate throughput rate for your table, see our provisioned throughput guide.
Q: Does the item size affect the throughput rate?
Yes. For a given allocation of resources, the read-rate that a DynamoDB table can achieve does depend on the size of an item. When you specify the provisioned throughput you would like to achieve, DynamoDB provisions its resources on the assumption that items will be less than 1KB in size. Every increase of up to 1KB will linearly increase the resources you need to achieve the same throughput rate. For example, if you have provisioned a DynamoDB table with 100 units of write capacity, that means that it can handle 100 1KB writes per second, or 50 2KB writes per second, or 25 4KB writes per second, and so on. For additional guidance on choosing the appropriate throughput rate for your table, see our provisioned throughput guide.
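The relationship in this example runs in the opposite direction from the capacity estimate: given a fixed number of write capacity units, it shows how many writes per second fit. A minimal sketch follows, with a hypothetical function name and the 1KB unit size stated above.

```python
import math


def max_writes_per_second(write_capacity_units, item_size_kb):
    """Writes/second a table can sustain for items of the given size."""
    # Each write consumes one unit per KB of item size, rounded up.
    return write_capacity_units // math.ceil(item_size_kb)


rates = [max_writes_per_second(100, size_kb) for size_kb in (1, 2, 4)]
```

The three rates correspond to the 1KB, 2KB, and 4KB cases described above.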
Q: What happens if my application performs more reads or writes than my provisioned capacity?
If your application performs more reads/second or writes/second than your table’s provisioned throughput capacity allows, requests above your provisioned capacity will be throttled and you will receive 400 error codes. For instance, if you had asked for 1,000 write capacity units and try to do 1,500 writes/second of 1 KB items, DynamoDB will only allow 1,000 writes/second to go through and you will receive error code 400 on your extra requests. You should use CloudWatch to monitor your request rate to ensure that you always have enough provisioned throughput to achieve the request rate that you need.
Q: How do I know if I am exceeding my provisioned throughput capacity?
DynamoDB publishes your consumed throughput capacity as a CloudWatch metric. You can set an alarm on this metric so that you will be notified if you get close to your provisioned capacity.
Q: How long does it take to change the provisioned throughput level of a table?
In general, decreases in throughput will take anywhere from a few seconds to a few minutes, while increases in throughput will typically take anywhere from a few minutes to a few hours.
We strongly recommend that you do not try and schedule increases in throughput to occur at almost the same time when that extra throughput is needed. We recommend provisioning throughput capacity sufficiently far in advance to ensure that it is there when you need it.
Data Models and APIs
The data model for Amazon DynamoDB is as follows:
Table: A table is a collection of data items – just like a table in a relational database is a collection of rows. Each table can have an infinite number of data items. Amazon DynamoDB is schema-less, in that the data items in a table need not have the same attributes or even the same number of attributes. Each table must have a primary key. The primary key can be a single attribute key or a “composite” attribute key that combines two attributes. The attribute(s) you designate as a primary key must exist for every item as primary keys uniquely identify each item within the table.
Item: An Item is composed of a primary or composite key and a flexible number of attributes. There is no explicit limitation on the number of attributes associated with an individual item, but the aggregate size of an item, including all the attribute names and attribute values, cannot exceed 64KB.
Attribute: Each attribute associated with a data item is composed of an attribute name (e.g. “Color”) and a value or set of values (e.g. “Red” or “Red, Yellow, Green”). Individual attributes have no explicit size limit, but the total value of an item (including all attribute names and values) cannot exceed 64KB.
Q: Is there a limit on the size of an item?
The total size of an item, including attribute names and attribute values, cannot exceed 64KB.
Q: Is there a limit on the number of attributes an item can have?
There is no limit to the number of attributes that an item can have. However, the total size of an item, including attribute names and attribute values, cannot exceed 64KB.
Q: What are the APIs?
- CreateTable – Creates a table and specifies the primary index used for data access.
- UpdateTable – Updates the provisioned throughput values for the given table.
- DeleteTable – Deletes a table.
- DescribeTable – Returns table size, status, and index information.
- ListTables – Returns a list of all tables associated with the current account and endpoint.
- PutItem – Creates a new item, or replaces an old item with a new item (including all the attributes). If an item already exists in the specified table with the same primary key, the new item completely replaces the existing item. You can also use conditional operators to replace an item only if its attribute values match certain conditions, or to insert a new item only if that item doesn’t already exist.
- BatchWriteItem – Inserts, replaces, and deletes multiple items across multiple tables in a single request, but not as a single transaction. Supports batches of up to 25 items to Put or Delete, with a maximum total request size of 1 MB.
- UpdateItem – Edits an existing item's attributes. You can also use conditional operators to perform an update only if the item’s attribute values match certain conditions.
- DeleteItem – Deletes a single item in a table by primary key. You can also use conditional operators to delete an item only if the item’s attribute values match certain conditions.
- GetItem – The GetItem operation returns a set of Attributes for an item that matches the primary key. The GetItem operation provides an eventually consistent read by default. If eventually consistent reads are not acceptable for your application, use ConsistentRead.
- BatchGetItem – The BatchGetItem operation returns the attributes for multiple items from multiple tables using their primary keys. A single response has a size limit of 1 MB and returns a maximum of 100 items. Supports both strong and eventual consistency.
- Query – Gets one or more items using the table primary key, or from a secondary index using the secondary index key. You can narrow the scope of the query by using comparison operators on the range key value, or on the secondary index key. Supports both strong and eventual consistency. A single response has a size limit of 1 MB.
- Scan – Gets one or more items and attributes by performing a full scan across the table. The items returned may be limited by your specifying filters against one or more attributes. This API can thus be used to enable ad-hoc querying of a table against attributes which are not the table’s primary key. However, since it is a full table scan without an index, it should not be used for any application query use case that requires predictable performance. The result set of a scan API request will be eventually consistent. You can think of the Scan API as an iterator. Once the aggregate size of items scanned for a given Scan API request exceeds a 1 MB limit, the given request will terminate and fetched results will be returned along with a LastEvaluatedKey (to continue the scan in a subsequent operation).
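The iterator-like behavior of Scan described in the last item above can be sketched as a loop that follows LastEvaluatedKey from one page to the next. The scan_page callable and the fake two-page data are hypothetical. With a real client, the start key would be passed back in the request, which is assumed here to be the ExclusiveStartKey parameter.

```python
def scan_all(scan_page):
    """Collect every item by following LastEvaluatedKey across pages."""
    items = []
    start_key = None
    while True:
        page = scan_page(start_key)
        items.extend(page["Items"])
        # A missing LastEvaluatedKey means the scan has reached the end.
        start_key = page.get("LastEvaluatedKey")
        if start_key is None:
            return items


# Fake two-page scan standing in for the real Scan API.
def fake_scan_page(start_key):
    if start_key is None:
        return {"Items": [{"id": 1}, {"id": 2}], "LastEvaluatedKey": {"id": 2}}
    return {"Items": [{"id": 3}]}


all_items = scan_all(fake_scan_page)
```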
Q: What data types does DynamoDB support?
Amazon DynamoDB supports three scalar data types: Number, String, and Binary. Additionally, Amazon DynamoDB supports multi-valued types: Number Set, String Set, and Binary Set.
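As an illustration, the six types can be mapped onto tagged values. The sketch assumes the low-level type tags S, N, B, SS, NS, and BS. The helper name is hypothetical, and binary values are left as raw bytes here, whereas the JSON wire format is assumed to carry them base64-encoded.

```python
def to_attribute_value(value):
    """Encode a Python value using the type tags S, N, B, SS, NS, and BS."""
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, bytes):
        return {"B": value}
    if isinstance(value, bool):
        raise TypeError("booleans are outside the six types listed here")
    if isinstance(value, (int, float)):
        # Numbers travel as strings to avoid losing precision.
        return {"N": str(value)}
    if isinstance(value, (set, frozenset)) and value:
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}
        if all(isinstance(v, bytes) for v in value):
            return {"BS": sorted(value)}
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in value):
            return {"NS": [str(v) for v in sorted(value)]}
    raise TypeError(f"unsupported value: {value!r}")


color = to_attribute_value("Red")
colors = to_attribute_value({"Red", "Yellow", "Green"})
sizes = to_attribute_value({10, 8, 12})
```

Set members are sorted only to make the output deterministic; sets themselves carry no ordering.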
Scale, Availability & Durability
No. You can store any amount of data in an Amazon DynamoDB table. As the size of your data set grows, Amazon DynamoDB will automatically spread your data over sufficient machine resources to meet your storage requirements.
Q: Is there a limit to how much throughput I can get out of a single table?
No, you can increase the throughput you have provisioned for your table using UpdateTable API or in the AWS Management Console. DynamoDB is able to operate at massive scale and there is no theoretical limit on the maximum throughput you can achieve. DynamoDB automatically divides your table across multiple partitions, where each partition is an independent parallel computation unit. DynamoDB can achieve increasingly high throughput rates by adding more partitions.
If you wish to exceed throughput rates of 10,000 writes/second or 10,000 reads/second, you must first contact Amazon through this online form.
Q: Does Amazon DynamoDB remain available when I ask it to scale up or down by changing the provisioned throughput?
Yes. Amazon DynamoDB is designed to scale its provisioned throughput up or down while still remaining available.
Q: Do I need to manage client-side partitioning on top of Amazon DynamoDB?
No. Amazon DynamoDB removes the need to partition across database tables for throughput scalability.
Q: How highly available is Amazon DynamoDB?
The service runs across Amazon’s proven, high-availability data centers. The service replicates data across three facilities in an AWS Region to provide fault tolerance in the event of a server failure or Availability Zone outage.
Q: How does Amazon DynamoDB achieve high uptime and durability?
To achieve high uptime and durability, Amazon DynamoDB synchronously replicates data across three facilities within an AWS Region.
Local Secondary Indexes
Local secondary indexes enable some common queries to run more quickly and cost-efficiently. Without them, these queries would require retrieving a large number of items and then filtering the results. This means your applications can rely on more flexible queries based on a wider range of attributes.
Before the launch of local secondary indexes, if you wanted to find specific items within a hash key bucket (items that share the same hash key), you had to fetch all objects that share that hash key and then filter the results. For instance, consider an e-commerce application that stores customer order data in a DynamoDB table with a hash-range schema of customer id-order timestamp. Without LSI, to find an answer to the question “Display all orders made by Customer X with shipping date in the past 30 days, sorted by shipping date”, you had to use the Query API to retrieve all the objects under the hash key “X”, sort the results by shipping date, and then filter out older records.
With local secondary indexes, we are simplifying this experience. Now, you can create an index on the “shipping date” attribute and execute this query efficiently, retrieving only the necessary items. This significantly reduces the latency and cost of your queries as you will retrieve only items that meet your specific criteria. Moreover, it also simplifies the programming model for your application as you no longer have to write custom logic to filter the results. We call this new secondary index a ‘local’ secondary index because it is used along with the hash key and hence allows you to search locally within a hash key bucket. So while previously you could only search using the hash key and the range key, now you can also search using a secondary index in place of the range key, thus expanding the number of attributes that can be used for efficient queries.
Redundant copies of data attributes are copied into the local secondary indexes you define. These attributes include the table hash and range key, plus the alternate range key you define. You can also redundantly store other data attributes in the local secondary index, in order to access those other attributes without having to access the table itself.
Local secondary indexes are not appropriate for every application. They introduce some constraints on the volume of data you can store within a single hash key value. For more information, see the FAQ items below about item collections.
Q: What are Projections?
The set of attributes that is copied into a local secondary index is called a projection. The projection determines the attributes that you will be able to retrieve with the most efficiency. When you query a local secondary index, Amazon DynamoDB can access any of the projected attributes, with the same performance characteristics as if those attributes were in a table of their own. If you need to retrieve any attributes that are not projected, Amazon DynamoDB will automatically fetch those attributes from the table.
When you define a local secondary index, you need to specify the attributes that will be projected into the index. At a minimum, each index entry consists of: (1) the table hash key value, (2) an attribute to serve as the index range key, and (3) the table range key value.
Beyond the minimum, you can also choose a user-specified list of other non-key attributes to project into the index. You can even choose to project all attributes into the index, in which case the index replicates the same data as the table itself, but the data is organized by the alternate range key you specify.
Q: How can I create an LSI?
You need to create an LSI at the time of table creation. It can’t currently be added later on. To create an LSI, specify the following two parameters:
Indexed Range key – the attribute that will be indexed and queried on.
Projected Attributes – the list of attributes from the table that will be copied directly into the local secondary index, so they can be returned more quickly without fetching data from the primary index, which contains all the items of the table. Without projected attributes, a local secondary index contains only the primary and secondary index keys.
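As a hedged sketch, the two parameters above might map onto a CreateTable request as follows. The table name, attribute names, index name, and throughput values are hypothetical; the parameter names follow the CreateTable API.

```python
def orders_table_definition():
    """Build CreateTable parameters for a table with one local secondary index."""
    return {
        "TableName": "Orders",  # hypothetical table name
        "AttributeDefinitions": [
            {"AttributeName": "CustomerId", "AttributeType": "S"},
            {"AttributeName": "OrderTimestamp", "AttributeType": "S"},
            {"AttributeName": "ShippingDate", "AttributeType": "S"},
        ],
        # Primary index: hash-range composite key.
        "KeySchema": [
            {"AttributeName": "CustomerId", "KeyType": "HASH"},
            {"AttributeName": "OrderTimestamp", "KeyType": "RANGE"},
        ],
        "LocalSecondaryIndexes": [
            {
                "IndexName": "ShippingDateIndex",  # hypothetical index name
                # Same hash key as the table; ShippingDate is the indexed range key.
                "KeySchema": [
                    {"AttributeName": "CustomerId", "KeyType": "HASH"},
                    {"AttributeName": "ShippingDate", "KeyType": "RANGE"},
                ],
                # Projected attributes: copy one extra non-key attribute into the index.
                "Projection": {
                    "ProjectionType": "INCLUDE",
                    "NonKeyAttributes": ["OrderTotal"],
                },
            }
        ],
        "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
    }
```

Assuming the boto3 library is installed and credentials are configured, these parameters could be passed as `boto3.client("dynamodb").create_table(**orders_table_definition())`.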
Q: What is the consistency model for LSI?
Local secondary indexes are updated automatically when the primary index is updated. Similar to reads from a primary index, an LSI supports both strongly consistent and eventually consistent read options.
Q: Do local secondary indexes contain references to all items in the table?
No, not necessarily. Local secondary indexes only reference those items that contain the indexed range key specified for that LSI. DynamoDB’s flexible schema means that not all items will necessarily contain all attributes.
This means a local secondary index can be sparsely populated, compared with the primary index. Because local secondary indexes are sparse, they efficiently support queries on attributes that are uncommon.
For example, in the Orders example described above, a customer may have some additional attributes in an item that are included only if the order is canceled (such as CanceledDateTime, CanceledReason). For queries related to canceled items, a local secondary index on either of these attributes would be efficient, since the only items referenced in the index would be those that have these attributes present.
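A small in-memory sketch of the sparse behavior described above. The item contents and the helper name `build_sparse_index` are invented for illustration; real index maintenance happens inside DynamoDB.

```python
def build_sparse_index(items, index_range_key):
    """Return index entries only for items that carry the indexed attribute."""
    entries = [item for item in items if index_range_key in item]
    return sorted(entries, key=lambda item: item[index_range_key])

# Hypothetical orders for one customer; only one of them was canceled.
orders = [
    {"CustomerId": "X", "OrderTimestamp": "2013-03-01T09:30"},
    {"CustomerId": "X", "OrderTimestamp": "2013-03-05T11:00",
     "CanceledDateTime": "2013-03-06T08:00", "CanceledReason": "duplicate"},
    {"CustomerId": "X", "OrderTimestamp": "2013-03-20T17:45"},
]

canceled_index = build_sparse_index(orders, "CanceledDateTime")
```

Here the table holds three items but the index on CanceledDateTime references only the canceled one, so a query against it touches far less data.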
Q: How do I query local secondary indexes?
Local secondary indexes can only be queried via the Query API.
To query a local secondary index, explicitly reference the index in addition to the name of the table you’d like to query. You must specify the index hash key attribute name and value. You can optionally specify a condition against the index range key attribute.
Your query can retrieve non-projected attributes stored in the primary index by performing a table fetch operation, with a cost of additional read capacity units.
Both strongly consistent and eventually consistent reads are supported for queries that use a local secondary index.
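The query described above might be expressed with parameters along these lines. This is a sketch only: the table, index, and attribute names are hypothetical, and the expression-based parameters shown are one of the request styles the Query API accepts.

```python
def shipped_since_query(customer_id, since_date, consistent=False):
    """Build Query parameters that target a local secondary index."""
    return {
        "TableName": "Orders",             # hypothetical table
        "IndexName": "ShippingDateIndex",  # hypothetical LSI on ShippingDate
        # Hash key equality is required; the range key condition is optional.
        "KeyConditionExpression": "CustomerId = :c AND ShippingDate >= :d",
        "ExpressionAttributeValues": {
            ":c": {"S": customer_id},
            ":d": {"S": since_date},
        },
        # True requests a strongly consistent read, False an eventually consistent one.
        "ConsistentRead": consistent,
        # Return results in ascending order of the index range key.
        "ScanIndexForward": True,
    }
```

With boto3 installed, the parameters could be sent as `boto3.client("dynamodb").query(**shipped_since_query("X", "2013-03-01"))`.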
Q: How do I create local secondary indexes?
Local secondary indexes must be defined at time of table creation. The primary index of the table must use a hash-range composite key.
Q: Can I add local secondary indexes to an existing table?
No, it’s not possible to add local secondary indexes to existing tables at this time. We are working on adding this capability and will be releasing it in the future. When you create a table with a local secondary index, you may decide to create a local secondary index for future use by defining a range key element that is currently not used. Since local secondary indexes are sparse, such an index costs nothing until you decide to use it.
Q: How many local secondary indexes can I create on one table?
Each table can have up to five local secondary indexes.
Q: How many projected non-key attributes can I create on one table?
Each table can have up to 20 projected non-key attributes, in total across all local secondary indexes within the table. Each index may also specify that all non-key attributes from the primary index are projected.
Q: Can I modify the index once it is created?
No, an index cannot be modified once it is created. We are working to add this capability in the future.
Q: Can I delete local secondary indexes?
No. At this time, local secondary indexes cannot be removed from a table once they are created. Of course, they are deleted if you delete the entire table. We are working on adding this capability and will be releasing it in the future.
Q: How do local secondary indexes consume provisioned capacity?
You don’t need to explicitly provision capacity for a local secondary index. It consumes provisioned capacity as part of the table with which it is associated.
Reads and writes to LSIs consume capacity by the standard formula of 1 unit per 1KB of data read or written each second, with the following differences:
When writes contain data that are relevant to one or more local secondary indexes, those writes are mirrored to the appropriate local secondary indexes. In these cases, write capacity will be consumed for the table itself, and additional write capacity will be consumed for each relevant LSI.
Updates that overwrite an existing item can result in two operations (a delete and an insert) and thereby consume extra units of write capacity per 1KB of data.
When a read query requests attributes that are not projected into the LSI, DynamoDB will fetch those attributes from the primary index. This implicit GetItem request consumes one read capacity unit per 1KB of item data fetched.
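The three rules above can be turned into a rough capacity estimate. The sketch below makes several assumptions that are not stated in the text: sizes are rounded up to the next whole KB, every operation costs at least one unit, and an overwrite that replaces an index entry is modeled as exactly double the per-index cost. All function names are illustrative.

```python
import math

def units(size_bytes):
    """Capacity units for one operation at 1 unit per 1 KB, rounded up (assumed)."""
    return max(1, math.ceil(size_bytes / 1024))

def write_cost(item_bytes, lsi_entry_bytes, replaces_index_entry=False):
    """Table write plus one mirrored write for each relevant LSI.

    lsi_entry_bytes lists the entry size in each relevant index.
    replaces_index_entry models the delete + insert case by doubling
    the per-index cost (a simplification).
    """
    factor = 2 if replaces_index_entry else 1
    return units(item_bytes) + sum(factor * units(b) for b in lsi_entry_bytes)

def query_cost_with_fetch(index_bytes_read, fetched_item_bytes):
    """Index read plus one implicit table fetch per item with non-projected attributes."""
    return units(index_bytes_read) + sum(units(b) for b in fetched_item_bytes)
```

For instance, under these assumptions a 1,500-byte item written to a table with two relevant indexes whose entries are a few hundred bytes each costs `write_cost(1500, [200, 300])`, that is 2 units for the table plus 1 unit per index.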
Q: How much storage will local secondary indexes consume?
Local secondary indexes consume storage for the attribute name and value of each LSI’s primary and index keys, for all projected non-key attributes, plus 100 bytes per item reflected in the LSI.
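A back-of-the-envelope version of this storage rule is sketched below. The way attribute sizes are measured here (UTF-8 bytes of the name plus UTF-8 bytes of the value rendered as a string) is an assumption for illustration and may not match how DynamoDB measures every data type.

```python
LSI_OVERHEAD_BYTES = 100  # per item reflected in the index, as stated above

def lsi_entry_bytes(item, index_attributes):
    """Estimate the storage one item consumes in a local secondary index.

    index_attributes should list the primary key attributes, the index
    range key, and any projected non-key attributes.
    """
    size = LSI_OVERHEAD_BYTES
    for name in index_attributes:
        if name in item:
            # Assumed size: attribute name bytes plus value bytes.
            size += len(name.encode("utf-8")) + len(str(item[name]).encode("utf-8"))
    return size
```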
Q: What data types can be indexed?
All scalar data types (Number, String, Binary) can be used for the range key element of the local secondary index key. Set types cannot be used.
Q: What data types can be projected into a local secondary index?
All data types (including set types) can be projected into a local secondary index.
Q: What are item collections and how are they related to LSI?
In Amazon DynamoDB, an item collection is any group of items that have the same hash key, across a table and all of its local secondary indexes. Traditional partitioned (or sharded) relational database systems call these shards or partitions, referring to all database items or rows stored under a hash key.
Item collections are automatically created and maintained for every table that includes local secondary indexes. DynamoDB stores each item collection within a single disk partition.
Q: Are there limits on the size of an item collection?
Every item collection in Amazon DynamoDB is subject to a maximum size limit of 10 gigabytes. For any distinct hash key value, the sum of the item sizes in the table plus the sum of the item sizes across all of that table's local secondary indexes must not exceed 10 GB.
The 10 GB limit for item collections does not apply to tables without local secondary indexes; only tables that have one or more local secondary indexes are affected.
Although individual item collections are limited in size, the storage size of an overall table with local secondary indexes is not limited. The total size of an indexed table in Amazon DynamoDB is effectively unlimited, provided the total storage size (table and indexes) for any one hash key does not exceed the 10 GB threshold.
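The limit applies per hash key value, summed over the table and its indexes. A minimal sketch of that bookkeeping follows; it assumes 1 GB means 2**30 bytes, and the input format (pairs of hash key and size) is invented for illustration.

```python
from collections import defaultdict

LIMIT_BYTES = 10 * 1024**3  # 10 GB, treating 1 GB as 2**30 bytes (assumption)

def collection_sizes(table_items, index_entries):
    """Sum sizes per hash key across the table and all of its LSIs.

    Both arguments are iterables of (hash_key, size_in_bytes) pairs.
    """
    totals = defaultdict(int)
    for hash_key, size in list(table_items) + list(index_entries):
        totals[hash_key] += size
    return dict(totals)

def over_limit(totals, limit=LIMIT_BYTES):
    """Return the hash keys whose item collection exceeds the limit."""
    return sorted(key for key, size in totals.items() if size > limit)
```

In this model a table can hold any number of hash keys, and only the individual keys whose combined table and index data passes the threshold are reported.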
Q: How can I track the size of an item collection?
DynamoDB’s write APIs (PutItem, UpdateItem, DeleteItem, and BatchWriteItem) include an option that allows the API response to include an estimate of the relevant item collection’s size. This estimate includes lower and upper size estimates for the data in a particular item collection, measured in gigabytes.
We recommend that you instrument your application to monitor the sizes of your item collections. Your applications should examine the API responses regarding item collection size, and log an error message whenever an item collection exceeds a user-defined limit (8 GB, for example). This would provide an early warning system, letting you know that an item collection is growing larger, but giving you enough time to do something about it.
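One way to sketch this early warning check is shown below. It assumes the write request was sent with `ReturnItemCollectionMetrics` set to `SIZE`, and that the single-item write response carries an `ItemCollectionMetrics` entry with `ItemCollectionKey` and `SizeEstimateRangeGB` fields. The function name, logger name, and 8 GB default are illustrative. BatchWriteItem responses group metrics by table name, so they would need an extra loop that is not shown here.

```python
import logging

logger = logging.getLogger("item_collections")

def check_item_collection(response, warn_above_gb=8.0):
    """Log an error and return True if the upper size estimate exceeds the threshold."""
    metrics = response.get("ItemCollectionMetrics")
    if not metrics:
        # Tables without local secondary indexes return no metrics.
        return False
    lower, upper = metrics["SizeEstimateRangeGB"]
    if upper > warn_above_gb:
        logger.error(
            "Item collection %s is estimated at %.1f to %.1f GB",
            metrics["ItemCollectionKey"], lower, upper,
        )
        return True
    return False
```

Calling this on every write response gives a log entry as soon as a collection grows past the chosen threshold, well before writes start to fail.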
Q: What if I exceed the 10GB limit for an item collection?
If a particular item collection exceeds the 10GB limit, then you will not be able to write new items, or increase the size of existing items, for that particular hash key. Read and write operations that shrink the size of the item collection are still allowed. Other item collections in the table are not affected.
To address this problem, you can remove items or reduce item sizes in the collection that has exceeded 10GB. Alternatively, you can introduce new items under a new hash key value to work around this problem. If your table includes historical data that is infrequently accessed, consider archiving the historical data to Amazon S3, Amazon Glacier, or another data store.
Fine-Grained Access Control
Fine-Grained Access Control (FGAC) gives a DynamoDB table owner a high degree of control over data in the table. Specifically, the table owner can indicate who (the caller) can access which items or attributes of the table and which actions (read or write) the caller can perform. FGAC is used in concert with AWS Identity and Access Management (IAM), which manages the security credentials and the associated permissions.
Q: What are the common use cases for DynamoDB FGAC?
FGAC can benefit any application that tracks information in a DynamoDB table, where the end user (or application client acting on behalf of an end user) wants to read or modify the table directly, without a middle-tier service. For instance, a developer of a mobile app named Acme can use FGAC to track the top score of every Acme user in a DynamoDB table. FGAC allows the application client to modify only the top score for the user that is currently running the application.
Q: Without FGAC, how can a developer achieve item level access control?
To achieve this level of control without FGAC, a developer would have to choose from a few potentially onerous approaches. Some of these are:
- Proxy: The application client sends a request to a brokering proxy that performs the authentication and authorization. Such a solution increases the complexity of the system architecture and can result in a higher total cost of ownership (TCO).
- Per Client Table: Every application client is assigned its own table. Since application clients access different tables, they would be protected from one another. This could potentially require a developer to create millions of tables, thereby making database management extremely painful.
- Per-Client Embedded Token: A secret token is embedded in the application client. The shortcoming of this is the difficulty in changing the token and handling its impact on the stored data. Here, the key of the items accessible by this client would contain the secret token.
Q: How does DynamoDB FGAC work?
With FGAC, an application requests a security token that authorizes the application to access only specific items in a specific DynamoDB table. With this token, the end user application agent can make requests to DynamoDB directly. When DynamoDB receives a request, it first evaluates the request’s credentials, using IAM to authenticate the request and determine the capabilities allowed for the user. If the user’s request is not permitted, FGAC prevents the data from being accessed.
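As an illustrative sketch, an access policy that limits each caller to items whose hash key equals their own identity could be assembled like this. The table ARN, the list of actions, and the choice of Facebook as the identity provider are assumptions; `dynamodb:LeadingKeys` is the IAM condition key that matches on the hash key value.

```python
import json

def per_user_policy(table_arn, identity_variable="${graph.facebook.com:id}"):
    """Build an access policy that restricts a caller to their own items."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": [
                "dynamodb:GetItem",
                "dynamodb:PutItem",
                "dynamodb:UpdateItem",
                "dynamodb:Query",
            ],
            "Resource": [table_arn],
            "Condition": {
                # Every hash key in the request must equal the caller's identity.
                "ForAllValues:StringEquals": {
                    "dynamodb:LeadingKeys": [identity_variable]
                }
            },
        }],
    }

# Hypothetical account number and table name.
policy_json = json.dumps(
    per_user_policy("arn:aws:dynamodb:us-east-1:123456789012:table/AcmeTopScores"),
    indent=2,
)
```

IAM substitutes the policy variable with the authenticated user's id at evaluation time, so one policy serves every user of the app.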
Q: How much does DynamoDB FGAC cost?
There is no additional charge for using FGAC. As always, you only pay for the provisioned throughput and storage associated with the DynamoDB table.
Q: How do I get started?
Refer to the Fine-Grained Access Control section of the DynamoDB Developer Guide to learn how to create an access policy, create an IAM role for your app (e.g. a role named AcmeFacebookUsers for a Facebook app_id of 34567), and assign your access policy to the role. The trust policy of the role determines which identity providers are accepted (e.g. Login with Amazon, Facebook, or Google), and the access policy describes which AWS resources can be accessed (e.g. a DynamoDB table). Using the role, your app can now obtain temporary credentials for DynamoDB by calling the AssumeRoleWithWebIdentity API of the AWS Security Token Service (STS).
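A trust policy for the Facebook example above might look roughly like the following sketch. The function name is hypothetical; the app id is the one used in the example, and the principal and condition key follow the web identity federation conventions for Facebook.

```python
def facebook_trust_policy(app_id):
    """Build a role trust policy that accepts users of one Facebook app."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            # Accept identities federated through Facebook.
            "Principal": {"Federated": "graph.facebook.com"},
            "Action": "sts:AssumeRoleWithWebIdentity",
            # Only tokens issued for this specific app are trusted.
            "Condition": {
                "StringEquals": {"graph.facebook.com:app_id": app_id}
            },
        }],
    }

acme_trust = facebook_trust_policy("34567")
```

Assuming boto3 is available, an app holding a Facebook token could then request temporary credentials with `boto3.client("sts").assume_role_with_web_identity(RoleArn=..., RoleSessionName=..., WebIdentityToken=..., ProviderId="graph.facebook.com")`, where the role ARN is that of the AcmeFacebookUsers role.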
Q: How do I allow users to Query a Local Secondary Index, but prevent them from causing a table fetch to retrieve non-projected attributes?
Some Query operations on a Local Secondary Index can be more expensive than others if they request attributes that are not projected into an index. You can restrict such potentially expensive “fetch” operations by limiting the permissions to only projected attributes, using the "dynamodb:Attributes" context key.
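A hedged sketch of such a restriction follows. The index ARN and attribute names are hypothetical. The sketch also pairs `dynamodb:Attributes` with a `dynamodb:Select` condition so that callers must name the attributes they want rather than asking for all of them; that pairing is an addition to what the answer describes.

```python
def projected_only_policy(index_arn, projected_attributes):
    """Allow Query on an index, limited to attributes projected into it."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["dynamodb:Query"],
            "Resource": [index_arn],
            "Condition": {
                # Every requested attribute must be in the projected list.
                "ForAllValues:StringEquals": {
                    "dynamodb:Attributes": sorted(projected_attributes)
                },
                # Require an explicit attribute list in the request.
                "StringEquals": {"dynamodb:Select": "SPECIFIC_ATTRIBUTES"},
            },
        }],
    }

# Hypothetical account number, table, and index.
example = projected_only_policy(
    "arn:aws:dynamodb:us-east-1:123456789012:table/Orders/index/ShippingDateIndex",
    ["CustomerId", "OrderTimestamp", "ShippingDate", "OrderTotal"],
)
```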
Q: How do I prevent users from accessing specific attributes?
The recommended approach to preventing access to specific attributes is to follow the principle of least privilege, and Allow access to only specific attributes.
Alternatively, you can use a Deny policy to specify attributes that are disallowed. However, this is not recommended for the following reasons:
- With a Deny policy, it is possible for the user to discover the hidden attribute names by issuing repeated requests for every possible attribute name, until the user is ultimately denied access.
- Deny policies are more fragile, since DynamoDB could introduce new API functionality in the future that might allow an access pattern that you had previously intended to block.
Q: How do I prevent users from adding invalid data to a table?
The available FGAC controls determine which items can be changed or read, and which attributes can be changed or read. They do not validate attribute values: users can add new items without the blocked attributes, and can change any value of any attribute that is modifiable.
Q: Can I grant access to multiple attributes without listing all of them?
The IAM policy language supports a rich set of comparison operations, including StringLike, StringNotLike, and many others. For example, a StringLike condition on the "dynamodb:Attributes" key with the pattern “public_*” matches all attributes beginning with “public_”.
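A prefix match of this kind might be written as the condition block below. This is a sketch: the helper names are hypothetical, and `all_match` is only a rough local approximation of how a `ForAllValues:StringLike` condition behaves, built on Python's `fnmatchcase`, not IAM's actual evaluator.

```python
from fnmatch import fnmatchcase

def attribute_prefix_condition(prefix):
    """Build a condition block matching every attribute name with the given prefix."""
    return {"ForAllValues:StringLike": {"dynamodb:Attributes": [prefix + "*"]}}

def all_match(requested_attributes, patterns):
    """Approximate ForAllValues: each requested name must match some pattern."""
    return all(
        any(fnmatchcase(name, pattern) for pattern in patterns)
        for name in requested_attributes
    )

condition = attribute_prefix_condition("public_")
patterns = condition["ForAllValues:StringLike"]["dynamodb:Attributes"]
```

Under this approximation a request for `public_name` and `public_score` passes, while a request that also names a non-public attribute does not.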
Q: How do I create an appropriate policy?
We recommend that you use the DynamoDB Policy Generator from the DynamoDB console. You may also compare your policy to those listed in the Amazon DynamoDB Developer Guide to make sure you are following a recommended pattern. You can post policies to the AWS Forums to get thoughts from the DynamoDB community.
Q: Can I grant access based on a canonical user id instead of separate ids for the user based on the identity provider they logged in with?
Not without running a “token vending machine”. If a user retrieves federated access to your IAM role directly using Facebook credentials with STS, those temporary credentials only have information about that user’s Facebook login, and not their Amazon login, or Google login. If you want to internally store a mapping of each of these logins to your own stable identifier, you can run a service that the user contacts to log in, and then call STS and provide them with credentials scoped to whatever hash key value you come up with as their canonical user id.
Q: What information cannot be hidden from callers using FGAC?
Certain information cannot currently be blocked from the caller about the items in the table:
- Item collection metrics. The caller can ask for the estimated number of items and size in bytes of the item collection.
- Consumed throughput. The caller can ask for the detailed breakdown or summary of the provisioned throughput consumed by operations.
- Validation cases. In certain cases, the caller can learn about the existence and primary key schema of a table when you did not intend to give them access. To prevent this, follow the principle of least privilege and only allow access to the tables and actions that you intended to allow access to.
- If you deny access to specific attributes instead of whitelisting access to specific attributes, the caller can theoretically determine the names of the hidden attributes, because the policy uses “allow all except for” logic. It is safer to whitelist specific attribute names instead.
You have the ability to monitor table performance for free using Amazon CloudWatch in the AWS Management Console. You have access to information such as: latencies for each operation type, total amount of data stored in the table, request throughput for each API, and any throttled requests in a given time period. You can use this data to proactively scale your database table resources ahead of expected traffic increases.
Q: Does Amazon DynamoDB support IAM permissions?
Yes, DynamoDB supports API-level permissions through integration with the AWS Identity and Access Management (IAM) service.
For more information about IAM, go to:
- AWS Identity and Access Management
- AWS Identity and Access Management Getting Started Guide
- Using AWS Identity and Access Management
Q: Does Amazon DynamoDB support transactions?
DynamoDB supports implicit item-level transactions. When you use UpdateItem, PutItem, or DeleteItem, the operation is guaranteed to either succeed or fail atomically. The atomicity of these operations is guaranteed at the item level. Atomicity is also guaranteed for conditional operations and for increment/decrement operations.
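Conditional and increment operations of this kind might be expressed with UpdateItem parameters such as the following. The table and attribute names are hypothetical, and the expression-based syntax is one of the request styles the API accepts.

```python
def top_score_update(user_id, new_score):
    """UpdateItem parameters: set TopScore only if it is missing or lower."""
    return {
        "TableName": "AcmeTopScores",  # hypothetical table
        "Key": {"UserId": {"S": user_id}},
        "UpdateExpression": "SET TopScore = :s",
        # The write is applied atomically, and only if this condition holds.
        "ConditionExpression": "attribute_not_exists(TopScore) OR TopScore < :s",
        "ExpressionAttributeValues": {":s": {"N": str(new_score)}},
    }

def play_count_increment(user_id):
    """UpdateItem parameters: atomically add 1 to a numeric attribute."""
    return {
        "TableName": "AcmeTopScores",
        "Key": {"UserId": {"S": user_id}},
        "UpdateExpression": "ADD PlayCount :one",
        "ExpressionAttributeValues": {":one": {"N": "1"}},
    }
```

Because each request touches a single item, two clients submitting scores at the same time cannot leave the item half-updated; one write wins and the other either applies afterward or fails its condition.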
Each DynamoDB table has provisioned read-throughput and write-throughput associated with it. You are billed by the hour for that throughput capacity if you exceed the free tier.
Please note that you are charged by the hour for the throughput capacity that you provision for your table, whether or not you are sending requests to your table. If you would like to change your table’s provisioned throughput capacity, you can do so using the AWS Management Console or the UpdateTable API.
In addition, DynamoDB charges for indexed data storage as well as the standard internet data transfer fees.
To learn more about DynamoDB pricing, please visit the DynamoDB pricing page.
Q: What are some pricing examples?
Here is an example of how to calculate your throughput costs using US East (Northern Virginia) Region pricing. To view prices for other regions, visit our pricing page.
If you create a table and request 10 units of write capacity and 200 units of read capacity of provisioned throughput, you would be charged:
$0.01 + (4 x $0.01) = $0.05 per hour
If your throughput needs changed and you increased your provisioned throughput to 10,000 units of write capacity and 50,000 units of read capacity, your bill would then change to:
(1,000 x $0.01) + (1,000 x $0.01) = $20/hour
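The arithmetic in both examples can be reproduced with a short calculation. The rates below are not quoted in the text; they are inferred from the two worked examples ($0.01 per hour for every 10 write units and for every 50 read units), and rounding partial blocks up is a further assumption. Check the pricing page for actual rates.

```python
import math

WRITE_UNITS_PER_BLOCK = 10   # inferred from the worked examples
READ_UNITS_PER_BLOCK = 50    # inferred from the worked examples
PRICE_PER_BLOCK_HOUR = 0.01  # USD, inferred from the worked examples

def hourly_cost(write_units, read_units):
    """Hourly throughput charge in USD under the inferred block pricing."""
    write_blocks = math.ceil(write_units / WRITE_UNITS_PER_BLOCK)
    read_blocks = math.ceil(read_units / READ_UNITS_PER_BLOCK)
    return round((write_blocks + read_blocks) * PRICE_PER_BLOCK_HOUR, 2)
```

With these inferred rates, `hourly_cost(10, 200)` gives 0.05 and `hourly_cost(10000, 50000)` gives 20.0, matching the two examples.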
Q: Do your prices include taxes?
Except as otherwise noted, our prices are exclusive of applicable taxes and duties, including VAT and applicable sales tax. For example, our prices for the Asia Pacific (Tokyo) Region are inclusive of Japan consumption tax.
Reserved Capacity is a billing feature that allows you to obtain discounts on your provisioned throughput capacity in exchange for:
- A one-time up-front payment
- A commitment to a minimum monthly usage level for the duration of the term of the agreement.
Reserved Capacity applies within a single AWS Region and can be purchased with 1-year or 3-year terms. Every DynamoDB table has provisioned throughput capacity associated with it. When you create or update a table, you specify how much read or write capacity you want it to have. This capacity is what determines the read and write throughput rate that your DynamoDB table can achieve. Reserved Capacity is a billing arrangement and has no direct impact on the performance or capacity of your DynamoDB tables. For example, if you buy 5,000 write capacity units of Reserved Capacity, you have agreed to pay for that much capacity for the duration of the agreement (1 or 3 years) in exchange for discounted pricing.
Q: How do I buy Reserved Capacity?
Log into the AWS Management Console, go to the DynamoDB console page, and then click on “Purchase Reserved Capacity”. This will take you to a form you can fill out to purchase Reserved Capacity. Make sure you have selected the AWS Region in which your Reserved Capacity will be used. Please allow up to two weeks for your purchase request to be processed. You will be notified by email when your purchase request has been processed.
Q: Can I cancel a Reserved Capacity purchase?
No, you cannot cancel your Reserved Capacity and the one-time payment is not refundable. You will continue to pay for every hour during your Reserved Capacity term regardless of your usage.
Q: What is the smallest amount of Reserved Capacity that I can buy?
The smallest Reserved Capacity offering is 5,000 write capacity units and 5,000 read capacity units.
Q: Are there APIs that I can use to buy Reserved Capacity?
Not yet. We will provide APIs and add more Reserved Capacity options over time.
Q: How many Reserved Capacity purchases can I make?
Currently, each AWS account can make one Reserved Capacity purchase. We will expand our Reserved Capacity offering in the future to allow more Reserved Capacity purchases per account.
Q: Can I move Reserved Capacity from one Region to another?
No. Reserved Capacity is associated with a single Region.
Q: Can I provision more throughput capacity than my Reserved Capacity?
Yes. When you purchase Reserved Capacity, you are agreeing to a minimum usage level and you pay a discounted rate for that usage level. If you provision more capacity than that minimum level, you will be charged at standard rates for the additional capacity.
Q: How do I use my Reserved Capacity?
Reserved Capacity is automatically applied to your bill. For example, if you purchase 5,000 write capacity units of Reserved Capacity and you have provisioned 6,000, then your Reserved Capacity purchase will automatically cover the cost of 5,000 write capacity units and you will pay standard rates for the remaining 1,000 write capacity units.
Q: What happens if I provision less throughput capacity than my Reserved Capacity?
A Reserved Capacity purchase is an agreement to pay for a minimum amount of provisioned throughput capacity, for the duration of the term of the agreement, in exchange for discounted pricing. If you use less than your Reserved Capacity, you will still be charged each month for that minimum amount of provisioned throughput capacity.
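The billing behavior in the last few answers can be summarized in a small sketch. The function name is hypothetical, and the sketch counts capacity units only; it does not model actual prices.

```python
def billed_write_units(provisioned, reserved):
    """Split billed capacity into (units at reserved rate, units at standard rate)."""
    # The reserved amount is paid for whether or not it is fully provisioned.
    at_reserved_rate = reserved
    # Anything provisioned beyond the reservation is charged at standard rates.
    at_standard_rate = max(0, provisioned - reserved)
    return at_reserved_rate, at_standard_rate
```

For example, `billed_write_units(6000, 5000)` returns `(5000, 1000)`, while `billed_write_units(3000, 5000)` returns `(5000, 0)`: provisioning less than the reservation does not reduce the committed amount.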
Q: Can I use my Reserved Capacity for multiple DynamoDB tables?
Yes. Reserved Capacity is applied to the total provisioned capacity within the Region in which you purchased your Reserved Capacity. For example, if you purchased 5,000 write capacity units of Reserved Capacity, then you can apply that to one table with 5,000 write capacity units, or 100 tables with 50 write capacity units, or 1,000 tables with 5 write capacity units, etc.