Using Amazon DynamoDB Object Persistence Framework - An Introduction
An introduction to using Amazon DynamoDB with the AWS SDK for .NET object persistence framework.
Submitted By: Pavel@AWS
AWS Products Used: AWS SDK for .NET, Amazon DynamoDB
Language(s): C#
Created On: March 16, 2012
The Amazon DynamoDB support provided in the AWS SDK for .NET can be divided into three layers:
- Base API
- Functionality found under the namespaces Amazon.DynamoDB and Amazon.DynamoDB.Model. This is the API most closely related to the service model, with little overhead or helper functionality.
- Document Model
- Functionality found under the namespace Amazon.DynamoDB.DocumentModel. This is an API based around the Table and Document classes. The Document class represents an arbitrary mapping of attributes to values.
- Data Model (object persistence layer)
- Functionality found under the namespace Amazon.DynamoDB.DataModel. This API provides a way to mark up .NET classes for easy marshalling and unmarshalling of data. The functionality in this namespace removes much of the complexity of using Amazon DynamoDB.
Amazon DynamoDB Object Persistence Framework
The object persistence functionality in the AWS SDK for .NET enables you to easily map .NET classes to Amazon DynamoDB items. By using your own classes to store and retrieve Amazon DynamoDB data, you can use Amazon DynamoDB without worrying about data conversion or developing middle-layer solutions that interface with the Amazon DynamoDB service.
The example in this article requires that you have AWS credentials. If you haven't already signed up for Amazon Web Services (AWS), you will need to do that first in order to receive AWS credentials. After you have signed up, you can retrieve your credentials from your AWS account.
Scenario
In this article, we design a simple data model representing movies and actors. These are stored in two different Amazon DynamoDB tables: Movies and Actors.
To start, let's first model the .NET data types that represent this data on the client:
    public class Movie
    {
        public string Title { get; set; }
        public DateTime ReleaseDate { get; set; }
        public List
Movie and Actor are their own classes. The Movie class does not reference Actor objects. Instead, it only lists actor names.
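As a minimal sketch of what the two client-side types might look like before any markup is applied, consider the following. The member names Name, Biography, HeightInMeters and Comment are taken from later sections of this article, and the constructors mirror the way objects are created in the CRUD section. The property name ActorNames is hypothetical, and the parameterless constructors are an assumption (they let a framework create empty instances and fill them in).

```csharp
using System;
using System.Collections.Generic;

public class Movie
{
    // Parameterless constructor: assumed, so instances can be created empty.
    public Movie()
    {
        ActorNames = new List<string>();
    }

    public Movie(string title, DateTime releaseDate) : this()
    {
        Title = title;
        ReleaseDate = releaseDate;
    }

    public string Title { get; set; }
    public DateTime ReleaseDate { get; set; }

    // Hypothetical property name: the movie lists actor names only,
    // it holds no references to Actor objects.
    public List<string> ActorNames { get; set; }
}

public class Actor
{
    public Actor() { }

    public Actor(string name)
    {
        Name = name;
    }

    public string Name { get; set; }
    public string Biography { get; set; }
    public float HeightInMeters { get; set; }
    public string Comment { get; set; }
}
```

Keeping only names in the Movie class means a movie can be saved or loaded without touching the Actors table at all.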
Attribute Markup
To begin using the object persistence functionality in the AWS SDK for .NET, a class must be annotated with (or must inherit) at least the following attributes: DynamoDBTable and DynamoDBHashKey. The class might need to have the DynamoDBRangeKey attribute applied as well. With the exception of the DynamoDBTable attribute, which is applied to a class, all attributes are applied to fields or properties. These members must be public and must support both read and write operations.
Primary Key
In Amazon DynamoDB, a primary key can have one or two components. The first component is called the Hash Key. The second - optional - component is called the Range Key. Whether to use one or two components depends on how the data is structured and how you expect to interact with it.
If your data has a single attribute, the value of which can uniquely identify each item, and one item is not related to another item, you can use just the Hash Key. In our example, we are going to assume that actors have distinct names, and thus this name can be our single unique identifier - or Hash Key - for the Actors table.
If items in your data set are related to each other in some way, the Hash Key can be an attribute that links related data together, while the Range Key uniquely identifies a particular item in that subset. In our example, we expect multiple movies to share the same name (for instance, there are at least 8 movies named "Titanic"), but to have been released at different times. So the Movies table will use the movie name as the Hash Key and the movie release date as the Range Key.
A different example involving Hash and Range Keys would be one for forum posts: one table would contain Questions; another would contain Replies. The Questions table would have only a Hash Key. The Replies table would have both a Hash Key and a Range Key: the Hash Key is the question, and the Range Key is the time of the response. This way we can easily query all replies to a question and retrieve chronological results.
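One way to picture a two-component key is as a dictionary of sorted dictionaries. The sketch below is an in-memory analogy only. It does not call Amazon DynamoDB and says nothing about how the service stores data, and the class name ReplyIndex and its methods are hypothetical. The outer key plays the role of the Hash Key (the question) and the inner, sorted key plays the role of the Range Key (the time of the reply).

```csharp
using System;
using System.Collections.Generic;

// In-memory analogy only: not DynamoDB code.
public class ReplyIndex
{
    // question (Hash Key role) -> reply time (Range Key role) -> reply text
    private readonly Dictionary<string, SortedDictionary<DateTime, string>> items =
        new Dictionary<string, SortedDictionary<DateTime, string>>();

    // Stores a reply. The (question, postedAt) pair identifies it uniquely,
    // so writing the same pair again overwrites the earlier reply.
    public void Put(string question, DateTime postedAt, string text)
    {
        SortedDictionary<DateTime, string> replies;
        if (!items.TryGetValue(question, out replies))
        {
            replies = new SortedDictionary<DateTime, string>();
            items[question] = replies;
        }
        replies[postedAt] = text;
    }

    // Returns all replies to one question, oldest first.
    public List<string> RepliesTo(string question)
    {
        SortedDictionary<DateTime, string> replies;
        return items.TryGetValue(question, out replies)
            ? new List<string>(replies.Values)
            : new List<string>();
    }
}
```

Looking up one question never touches the replies to any other question, and the results come back in time order without a separate sort. That is the property the Replies table gets from its Hash and Range Keys.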
Required attributes
- DynamoDBTable
- This attribute designates the target Table and must be applied to a class. This is the table that will hold our data items. All data operations (save, load, query, etc.) will be run against this table.
- We must define this attribute on the Actor and Movie classes. Actor objects are going to be stored in the Actors table, while the Movie objects will go into the Movies table.
    [DynamoDBTable("Movies")]
    public class Movie {...}
- DynamoDBHashKey
- This attribute marks the Hash Key, that is, the hash component of the primary key.
- On the Movie class, we're going to place it on the Title property. For the Actor class, it will be placed on the Name property.
    [DynamoDBHashKey]
    public string Title { get; set; }
- DynamoDBRangeKey
- This attribute marks the Range Key, that is, the range component of the primary key. If the target table defines only a hash primary key, then this attribute should not be present.
- On the Movie class, this attribute will be placed on the ReleaseDate property. We're not going to add this attribute to the Actor class because the Actors table does not have a Range Key component.
    [DynamoDBRangeKey]
    public DateTime ReleaseDate { get; set; }
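Putting the three required attributes together, a marked-up Movie class might look like the sketch below. It assumes the AWS SDK for .NET assembly is referenced and uses the Amazon.DynamoDB.DataModel namespace named in this article (later SDK releases expose the same attributes under Amazon.DynamoDBv2.DataModel). The ActorNames property name is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using Amazon.DynamoDB.DataModel;

// Items of this type are stored in, and loaded from, the Movies table.
[DynamoDBTable("Movies")]
public class Movie
{
    // Hash component of the primary key.
    [DynamoDBHashKey]
    public string Title { get; set; }

    // Range component of the primary key.
    [DynamoDBRangeKey]
    public DateTime ReleaseDate { get; set; }

    // Hypothetical property name. It carries no attribute, but it is
    // public and read-write, so it is stored under its own name.
    public List<string> ActorNames { get; set; }
}
```

The Actor class would carry DynamoDBTable("Actors") and a DynamoDBHashKey on Name, with no DynamoDBRangeKey.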
Optional Attributes
Now that we've defined where our data should be stored and what the primary key is on each class, we can customize our data a bit more.
- DynamoDBIgnore
- Some properties that we have in our class definitions are there for ease of use, but don't really need to be stored in Amazon DynamoDB. By default, all public, read-write properties and fields are stored in Amazon DynamoDB. We can mark up those fields and properties that should be ignored.
- This attribute, DynamoDBIgnore, forces the member to be ignored. The data in the target member will not be stored to DynamoDB, and will not be set upon item retrieval.
- On the Actor class, we have a property called Comment. We can use this property on the client side, but we don't want to store it. Thus, we apply the DynamoDBIgnore attribute to it.
    [DynamoDBIgnore]
    public string Comment { ... }
- DynamoDBProperty
- We might prefer to store some properties under names that differ from those used in our class definition. For instance, a name that is helpful to a .NET developer may be too verbose to store in Amazon DynamoDB, or it may not match what a different client or SDK expects. We can rename these properties without altering how the client sees them.
- By default, attribute data is stored in DynamoDB under an attribute name matching the name of the member (field or property) that is defined on the client-side type. This attribute, DynamoDBProperty, allows a user to specify an alternate attribute name, which is the name that will be used in the target table.
    [DynamoDBProperty(AttributeName="Bio")]
    public string Biography { get; set; }
    [DynamoDBProperty(AttributeName="Height")]
    public float HeightInMeters { get; set; }
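Combining the required and optional attributes, the complete Actor class could be sketched as follows. Every name here comes from this article. The only assumption is that the AWS SDK for .NET assembly providing Amazon.DynamoDB.DataModel is referenced.

```csharp
using Amazon.DynamoDB.DataModel;

[DynamoDBTable("Actors")]
public class Actor
{
    // The Actors table has a hash-only primary key, so there is no Range Key.
    [DynamoDBHashKey]
    public string Name { get; set; }

    // Stored in the table under the shorter attribute name "Bio".
    [DynamoDBProperty(AttributeName = "Bio")]
    public string Biography { get; set; }

    // Stored in the table under the attribute name "Height".
    [DynamoDBProperty(AttributeName = "Height")]
    public float HeightInMeters { get; set; }

    // Client-side only: never written to, or read from, the table.
    [DynamoDBIgnore]
    public string Comment { get; set; }
}
```

With this markup, an item in the Actors table would carry the attributes Name, Bio and Height, while client code keeps working with Biography, HeightInMeters and Comment.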
DynamoDBContext
Now that we've prepared our data to be stored in Amazon DynamoDB, let's talk about how to actually handle the data.
All operations using the object persistence framework are managed using the DynamoDBContext class. This class exposes methods to Save, Load, Delete and Query/Scan items.
A DynamoDBContext object needs to be constructed with an AmazonDynamoDB client. All data operations are then executed against the provided client. The DynamoDBContext object is thread-safe, so it is a good idea to construct it once per application and reuse it.
The sample below illustrates the construction of an AmazonDynamoDB client and a DynamoDBContext. Note that this approach retrieves both Access and Secret Key values from ConfigurationManager.AppSettings. The values must be stored under "AWSAccessKey" and "AWSSecretKey", respectively.
    AmazonDynamoDBClient client = new AmazonDynamoDBClient();
    DynamoDBContext context = new DynamoDBContext(client);
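One possible way to follow the construct-once advice is a small static holder that builds the context on first use. This is only a sketch. The class DynamoDBContextHolder is hypothetical and not part of the SDK, and it assumes .NET 4.0 or later for Lazy<T>.

```csharp
using System;
using Amazon.DynamoDB;
using Amazon.DynamoDB.DataModel;

// Hypothetical helper, not part of the SDK: creates a single DynamoDBContext
// on first access and returns that same instance to every caller.
public static class DynamoDBContextHolder
{
    // Lazy<T> defaults to thread-safe initialization, so concurrent first
    // accesses still produce exactly one context.
    private static readonly Lazy<DynamoDBContext> LazyContext =
        new Lazy<DynamoDBContext>(CreateContext);

    public static DynamoDBContext Context
    {
        get { return LazyContext.Value; }
    }

    private static DynamoDBContext CreateContext()
    {
        // Credentials come from the "AWSAccessKey" and "AWSSecretKey"
        // entries in ConfigurationManager.AppSettings, as described above.
        AmazonDynamoDBClient client = new AmazonDynamoDBClient();
        return new DynamoDBContext(client);
    }
}
```

Application code would then call DynamoDBContextHolder.Context wherever it needs to save or load an object, instead of constructing a new client and context each time.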
CRUD (Create, Read, Update, Delete)
We have almost reached the point where we can store and retrieve our data in Amazon DynamoDB. Before the next step, make sure that the tables you will be working with are already created. The object persistence layer does not attempt to create tables. But it will verify that the table you're using matches the schema of your data. For instance, if we define the Actors table to contain a Range Key, but don't add one to our Actor class, we will encounter a runtime exception. For an example of the code that creates a table, refer to the Amazon DynamoDB documentation or take a look at the sample app that is provided with the AWS SDK for .NET.
The sample below illustrates basic CRUD operations: an Actor object is Created, saved to Amazon DynamoDB, Reloaded from Amazon DynamoDB, Updated on the client and saved, then Deleted.
    Actor actor = new Actor("John Doe");
    context.Save(actor);
    actor = context.Load
Here is a similar example with a Movie object. The difference here is that Movie has a Range Key, so we need to specify that in addition to the Hash Key when loading an item.
    Movie movie = new Movie("Casablanca", new DateTime(1943, 1, 23));
    context.Save(movie);
    movie = context.Load
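The full create, read, update, delete cycle for both types could be sketched roughly as below. Save, Load and Delete are the DynamoDBContext operations named in this article. The generic call shapes Load<T>(hashKey) and Load<T>(hashKey, rangeKey) are assumptions about their signatures, the class CrudSketch and the biography text are made up for illustration, and the parameterless constructors are assumed to be needed so that loaded objects can be instantiated. The sketch also assumes that the Actors and Movies tables already exist.

```csharp
using System;
using Amazon.DynamoDB.DataModel;

[DynamoDBTable("Actors")]
public class Actor
{
    public Actor() { }                           // assumed to be required for Load
    public Actor(string name) { Name = name; }

    [DynamoDBHashKey]
    public string Name { get; set; }

    [DynamoDBProperty(AttributeName = "Bio")]
    public string Biography { get; set; }
}

[DynamoDBTable("Movies")]
public class Movie
{
    public Movie() { }                           // assumed to be required for Load
    public Movie(string title, DateTime releaseDate)
    {
        Title = title;
        ReleaseDate = releaseDate;
    }

    [DynamoDBHashKey]
    public string Title { get; set; }

    [DynamoDBRangeKey]
    public DateTime ReleaseDate { get; set; }
}

// Hypothetical wrapper class for the illustration.
public static class CrudSketch
{
    public static void Run(DynamoDBContext context)
    {
        // Create: build the object on the client and save it.
        Actor actor = new Actor("John Doe");
        context.Save(actor);

        // Read: the Actors table has a hash-only key, so one value is enough.
        actor = context.Load<Actor>("John Doe");

        // Update: change the object on the client, then save it again.
        actor.Biography = "Placeholder biography text.";
        context.Save(actor);

        // Delete: remove the item that matches the object's key.
        context.Delete(actor);

        // Same cycle for a Movie: loading needs the Hash Key and the Range Key.
        DateTime released = new DateTime(1943, 1, 23);
        Movie movie = new Movie("Casablanca", released);
        context.Save(movie);
        movie = context.Load<Movie>("Casablanca", released);
        context.Delete(movie);
    }
}
```

Calling CrudSketch.Run(context) with a context built as shown earlier would issue real requests against both tables, so it is best tried against test tables first.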
As you can see, it's fairly simple to interact with Amazon DynamoDB once we've sufficiently marked up our classes. This is the power of the object persistence layer: the marshalling and the unmarshalling of data occurs behind the scenes and the developer doesn't need to concern themselves with this "plumbing" in order to talk to Amazon DynamoDB.
Query/Scan
If we are looking for items in a table, there are two ways to search: a Query and a Scan. Both are supported by the object persistence model, but there are limitations to each:
- Query
- Query only operates on a table that has both a Hash Key and a Range Key.
- Query must specify the Hash Key and can define a Range Key filter.
- Scan
- Scan can operate on a table with or without a Range Key.
- Scan processes all items in a table, returning only those that match the specified criteria.
- Scan is an expensive operation and should be used with care to avoid disrupting your higher priority production traffic on the table.
- See the Amazon DynamoDB developer guide for more recommendations for safely using the Scan operation.
When should we use Query and when Scan? If our data has a Hash Key and a Range Key and we already know the Hash component, we should use Query. Otherwise, we could use Scan. However, if we find ourselves in a situation that requires frequent Scans on large tables, we should rethink how our data is stored or used. Perhaps the data could be restructured to make less-expensive Query operations possible.
Suppose we want to find the movie "Casablanca", with the added condition that it must have been released before 1960. Below is an example of a Query that would accomplish this.
    DateTime date = new DateTime(1960, 1, 1);
    IEnumerable
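A Query of this kind might be written along the following lines. This is a sketch under stated assumptions. It takes Query<T> to accept the Hash Key value, a QueryOperator from the Amazon.DynamoDB.DocumentModel namespace, and the value to compare the Range Key against. The class QuerySketch and the method FindReleasedBefore are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using Amazon.DynamoDB.DataModel;
using Amazon.DynamoDB.DocumentModel;

[DynamoDBTable("Movies")]
public class Movie
{
    [DynamoDBHashKey]
    public string Title { get; set; }

    [DynamoDBRangeKey]
    public DateTime ReleaseDate { get; set; }
}

// Hypothetical helper class for the illustration.
public static class QuerySketch
{
    // Returns the movies with the given title (Hash Key) whose release
    // date (Range Key) is earlier than the cutoff.
    public static List<Movie> FindReleasedBefore(
        DynamoDBContext context, string title, DateTime cutoff)
    {
        IEnumerable<Movie> matches =
            context.Query<Movie>(title, QueryOperator.LessThan, cutoff);

        // Copy the results into a list so callers can enumerate them repeatedly.
        return new List<Movie>(matches);
    }
}
```

For the scenario above, the call would be QuerySketch.FindReleasedBefore(context, "Casablanca", new DateTime(1960, 1, 1)). Because the Hash Key is fixed to a single title, the condition is applied only to items that share that title rather than to the whole table.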
Examples of using Scan in the object persistence framework will be shown in a future Amazon DynamoDB article.
Conclusion
This article described how to prepare your .NET classes to work with the object persistence functionality of Amazon DynamoDB, and how to use object persistence to create, retrieve, update, and delete data in Amazon DynamoDB. We also saw how to Query tables and retrieve matching items. Aside from these basic operations, the object persistence functionality supports a number of advanced features, such as custom data conversion, optimistic locking, batch item retrieval, scan operations, and working with both the document model and base APIs. These will be covered in a future article.
References
Find more information about Amazon DynamoDB here.
A sample app that includes this code is provided with the AWS SDK for .NET. The download links can be found here: AWS SDK for .NET.
Questions?
Please feel free to ask questions or provide comments in the .NET Development Forum.