General Availability Release of the aws-record Gem

by Alex Wood | in Ruby

Today, we’re pleased to announce the GA release of version 1.0.0 of the aws-record gem.

What Is aws-record?

In version 1 of the AWS SDK for Ruby, the AWS::Record class provided a data mapping abstraction over Amazon DynamoDB operations. Earlier this year, we released the aws-record developer preview as a separately packaged library to provide a similar data mapping abstraction for DynamoDB, built on top of the AWS SDK for Ruby version 2. After customer feedback and some more development work, we’re pleased to move the library out of developer preview to general availability.

How to Include the aws-record Gem in Your Project

The aws-record gem is available now from RubyGems:


gem install aws-record


You can also include it in your project’s Gemfile:


# Gemfile
gem 'aws-record', '~> 1.0'


This automatically includes a dependency on the aws-sdk-resources gem, major version 2. Be sure to include the aws-sdk or aws-sdk-resources gem in your Gemfile if you need to lock to a specific version, like so:


# Gemfile
gem 'aws-record', '~> 1.0'
gem 'aws-sdk-resources', '~> 2.5'


Working with DynamoDB Tables Using the aws-record Gem

Defining an Aws::Record Model

The aws-record gem provides the Aws::Record module, which you can include in a class definition. This decorates your class with a variety of helper methods that can simplify interactions with Amazon DynamoDB. For example, the following model uses a variety of preset attribute definition helper methods and attribute options:


require 'aws-record'

class Forum
  include Aws::Record  

  string_attr     :forum_uuid, hash_key: true
  integer_attr    :post_id,    range_key: true
  string_attr     :author_username
  string_attr     :post_title
  string_attr     :post_body
  string_set_attr :tags,       default_value: Set.new 
  datetime_attr   :created_at, database_attribute_name: "PostCreatedAtTime"
  boolean_attr    :moderation, default_value: false
end


Using Validation Libraries with an Aws::Record Model

The aws-record gem does not come with a built-in validation process. Rather, it is designed to be a persistence layer, and to allow you to bring your own validation library. For example, the following model includes the popular ActiveModel::Validations module, and defines a set of validations that will be run when we attempt to save an item:


require 'aws-record'
require 'active_model'

class Forum
  include Aws::Record
  include ActiveModel::Validations

  string_attr     :forum_uuid, hash_key: true
  integer_attr    :post_id,    range_key: true
  string_attr     :author_username
  string_attr     :post_title
  string_attr     :post_body
  string_set_attr :tags,       default_value: Set.new 
  datetime_attr   :created_at, database_attribute_name: "PostCreatedAtTime"
  boolean_attr    :moderation, default_value: false 


  validates_presence_of :forum_uuid, :post_id, :author_username
  validates_length_of :post_title, within: 4..30
  validates_length_of :post_body,  within: 2..5000
end
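
A quick usage sketch of those validations (hypothetical values, relying on ActiveModel's default error messages):

post = Forum.new(post_id: 1, post_title: "Hi")
post.valid? # => false, the title is too short and required attributes are missing
post.errors[:author_username] # => ["can't be blank"]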


Creating a DynamoDB Table for a Model with Aws::Record::TableMigration

The aws-record gem provides a helper class for table operations, such as migrations. If we wanted to create a table for our Forum model in DynamoDB, we would run the following migration:


migration = Aws::Record::TableMigration.new(Forum)
migration.create!(
  provisioned_throughput: {
    read_capacity_units: 5,
    write_capacity_units: 2
  }
)
migration.wait_until_available


You can write these migrations in your Rakefile or as standalone helper scripts for your application. Because you don’t need to update your table definition for additions of non-key attributes, you may find that you’re not running migrations as often for your Aws::Record models.
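
For example, a migration wrapped in a Rake task might look like the following sketch (the task name and require path are illustrative, not part of aws-record):

# Rakefile
require 'aws-record'
require_relative 'app/models/forum' # hypothetical location of the Forum model

namespace :dynamodb do
  desc 'Create the Forum table'
  task :create_forum_table do
    migration = Aws::Record::TableMigration.new(Forum)
    migration.create!(
      provisioned_throughput: {
        read_capacity_units: 5,
        write_capacity_units: 2
      }
    )
    migration.wait_until_available
  end
end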

Working with DynamoDB Items Using the aws-record Gem

Creating and Persisting a New Item

Using the example model above, once its table has been created in DynamoDB with Aws::Record::TableMigration (or if the table already exists), it is simple to create and save a new item:


post = Forum.new(
  forum_uuid: FORUM_UUID,
  post_id: 1,
  author_username: "Author One",
  post_title: "Hello!",
  post_body: "Hello, world!"
)
post.created_at = Time.now
post.save # Performs a put_item call.


You can set attributes when you initialize a new item, or later through the setter methods that are defined for you automatically.

Finding and Modifying an Item

A class-level method #find is provided to look up items from DynamoDB using your model's key attributes. After you set a few new attribute values, calling #save makes an update call to DynamoDB that reflects only the changes you've made. This matters if you fetch items with projections (which may not include all attributes), or use single-table inheritance patterns (where your model may not define every attribute present in a remote item), because it avoids clobbering attribute values you haven't modeled or included.


post = Forum.find(forum_uuid: FORUM_UUID, post_id: 1)
post.post_title = "(Removed)"
post.post_body = "(Removed)"
post.moderation = true
post.save # Performs an update_item call on dirty attributes only.


There is also a class-level method to directly build and make an update call to DynamoDB, using key attributes to identify the item and non-key attributes to form the update expression:


Forum.update(
  forum_uuid: FORUM_UUID,
  post_id: 1,
  post_title: "(Removed)",
  post_body: "(Removed)",
  moderation: true
)


The preceding two code examples are functionally equivalent. You’ll have the same database state after running either snippet.

A Note on Dirty Tracking

In our last example, we talked about how item updates only reflect changes to modified attributes. Users of ActiveRecord or similar libraries will be familiar with tracking dirty attribute values, but aws-record is a bit different, because DynamoDB supports collection attribute types, and in Ruby, collections are often modified through object mutation. To properly track changes when objects can be altered through mutable state, Aws::Record items will, by default, keep deep copies of your attribute values when loading from DynamoDB. Attribute changes through mutation, as in the following example, will work the way you expect:


post = Forum.find(forum_uuid: FORUM_UUID, post_id: 1)
post.tags.add("First")
post.dirty? # => true
post.save # Will call update_item with the new tags collection.


Tracking deep copies of attribute values has implications for performance and memory. You can turn off mutation tracking at the model level. If you do so, dirty tracking will still work for new object references, but will not work for mutated objects:


class NoMTModel
  include Aws::Record
  disable_mutation_tracking
  string_attr :key, hash_key: true
  string_attr :body
  map_attr    :map
end

item = NoMTModel.new(key: "key", body: "body", map: {})
item.save # Will call put_item
item.map[:key] = "value"
item.dirty? # => false, because we won't track mutations to objects
item.body = "New Body"
item.dirty? # => true, because we will still notice reassignment
# Will call update_item, but only update :body unless we mark map as dirty explicitly.
item.save


Try the aws-record Gem Today!

We’re excited to hear about what you’re building with aws-record. Feel free to leave your feedback in the comments, or open an issue in our GitHub repo. Read through the documentation and get started!

Introducing the Aws::Record Developer Preview

by Alex Wood | in Ruby

We are happy to announce that the aws-record gem is now in Developer Preview and available for you to try.

What Is Aws::Record?

In version 1 of the AWS SDK for Ruby, the AWS::Record class provided a data mapping abstraction over Amazon DynamoDB operations. As version 2 of the AWS SDK for Ruby was being developed, many of you asked for an updated version of the library.

The aws-record gem provides a data mapping abstraction for DynamoDB built on top of the AWS SDK for Ruby version 2.

Using Aws::Record

You can download the aws-record gem from RubyGems by including the --pre flag in a gem installation:

gem install 'aws-record' --pre

You can also include it in your Gemfile. Do not include a version lock yet, so that bundler can find the pre-release version:

# Gemfile
gem 'aws-record'

Defining a Model

To create an aws-record model, include the Aws::Record module in your class definition:

require 'aws-record'

class Forum
  include Aws::Record
end

This will decorate your class with helper methods you can use to create a model compatible with DynamoDB’s table schemas. You might define keys for your table:

require 'aws-record'

class Forum
  include Aws::Record
  string_attr  :forum_uuid, hash_key: true
  integer_attr :post_id,    range_key: true
end

When you use these helper methods, you do not need to worry about how to define these attributes and types in DynamoDB. The helper methods and marshaler classes are able to define your table and item operations for you. The aws-record gem comes with predefined attribute types that cover a variety of potential use cases:

require 'aws-record'

class Forum
  include Aws::Record
  string_attr   :forum_uuid, hash_key: true
  integer_attr  :post_id,    range_key: true
  string_attr   :author_username
  string_attr   :post_title
  string_attr   :post_body
  datetime_attr :created_at
  map_attr      :post_metadata
end

Creating a DynamoDB Table

The aws-record gem provides a helper class for table operations, such as migrations. If we wanted to create a table for our Forum model in DynamoDB, we would run the following migration:

require 'forum' # Depending on where you defined the class above.

migration = Aws::Record::TableMigration.new(Forum)

migration.create!(
  provisioned_throughput: {
    read_capacity_units: 10,
    write_capacity_units: 4
  }
)

migration.wait_until_available # Blocks until table creation is complete.

Operations with DynamoDB Items

With a model and table defined, we can perform operations that relate to items in our table. Let’s create a post:

require 'forum'
require 'securerandom'

uuid = SecureRandom.uuid

post = Forum.new
post.forum_uuid = uuid
post.post_id = 1
post.author_username = "User1"
post.post_title = "Hello!"
post.post_body = "Hello Aws::Record"
post.created_at = Time.now
post.post_metadata = {
  this_is_a: "Post",
  types_supported_include: ["String", "Integer", "DateTime"],
  how_many_times_ive_done_this: 1
}

post.save # Writes to the database.

This example shows us some of the types that are supported and serialized for you. Using the key we’ve defined, we can also find this object in our table:

my_post = Forum.find(forum_uuid: uuid, post_id: 1)
my_post.post_title # => "Hello!"
my_post.created_at # => #<DateTime: 2016-02-09T14:39:07-08:00 ((2457428j,81547s,0n),-28800s,2299161j)>

You can use the same approach to save changes. A minimal sketch, reusing my_post from above:
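
my_post.post_title = "Hello again!"
my_post.save # Persists the modified attribute.

Or you can delete the item from the table: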

my_post.delete! # => true

At this point, we know how to use Aws::Record to perform key-value store operations powered by DynamoDB and have an introduction to the types available for use in our tables.

Querying, Scanning, and Collections

Because you are likely to run Query and Scan operations in addition to key-value operations, aws-record also supports integrating them with your model class.

When you include the Aws::Record module, your model class is decorated with #query and #scan methods, which correspond to the AWS SDK for Ruby client operations. The response is wrapped in a collection enumerable for you. Consider the following basic scan operation:

Forum.scan # => #<Aws::Record::ItemCollection:0x007ffc293ec790 @search_method=:scan, @search_params={:table_name=>"Forum"}, @model=Forum, @client=#<Aws::DynamoDB::Client>>

No client call has been made yet: ItemCollection instances are lazy, and only make client calls when needed. Because they provide an enumerable interface, you can use any of Ruby's enumerable methods on your collection, and your result page is saved:

resp = Forum.scan
resp.take(1) # Makes a call to the underlying client. Returns a 'Forum' object.
resp.take(1) # Same result, but does not repeat the client call.

Because Aws::Record::ItemCollection is built on version 2 of the AWS SDK for Ruby, pagination support is built in. If your operation requires multiple DynamoDB client calls due to response truncation, ItemCollection handles the additional calls required during your enumeration:

def author_posts
  Forum.scan.inject({}) do |acc, post|
    author = post.author_username
    if acc[author]
      acc[author] += 1
    else
      acc[author] = 1
    end
    acc
  end
end

The same applies for queries. Your query result will also be provided as an enumerable ItemCollection:

def posts_by_forum(uuid)
  Forum.query(
    key_condition_expression: "#A = :a",
    expression_attribute_names: {
      "#A" => "forum_uuid"
    },
    expression_attribute_values: {
      ":a" => uuid
    }
  )
end

Given this functionality, you have the flexibility to mix and match Ruby’s enumerable functionality with DynamoDB filter expressions, for example, to curate your results. These two functions return the same set of responses:

def posts_by_author_in_forum(uuid, author)
  posts_by_forum(uuid).select do |post|
    post.author_username == author
  end
end

def posts_by_author_in_forum_with_filter(uuid, author)
  Forum.query(
    key_condition_expression: "#A = :a",
    filter_expression: "#B = :b",
    expression_attribute_names: {
      "#A" => "forum_uuid",
      "#B" => "author_username"
    },
    expression_attribute_values: {
      ":a" => uuid,
      ":b" => author
    }
  )
end

Support for Secondary Indexes

Aws::Record also supports both local and global secondary indexes. Consider this modified version of our Forum table:

require 'aws-record'

class IndexedForum
  include Aws::Record

  string_attr   :forum_uuid, hash_key: true
  integer_attr  :post_id,    range_key: true
  string_attr   :author_username
  string_attr   :post_title
  string_attr   :post_body
  datetime_attr :created_at
  map_attr      :post_metadata

  global_secondary_index(:author,
    hash_key: :author_username,
    projection: {
      projection_type: "INCLUDE",
      non_key_attributes: ["post_title"]
    }
  )

  local_secondary_index(:by_date,
    range_key: :created_at,
    projection: {
      projection_type: "ALL"
    }
  )
end

You can see the table’s attributes are the same, but we’ve included a couple of potentially useful indexes.

  • :author: This uses the author name as a partition key, which provides a way to search across forums by author username without having to scan and filter. Take note of the projection: your global secondary index results will only include the :forum_uuid, :post_id, :author_username, and :post_title attributes. Other attributes will be missing from this projection, and you would have to hydrate your item by calling #reload! on the item instance, as sketched after this list.
  • :by_date: This provides a way to sort and search within a forum by post creation date.
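
As a sketch of that hydration step, assuming posts written by "User1" as in the earlier examples:

post = IndexedForum.query(
  index_name: "author",
  key_condition_expression: "#A = :a",
  expression_attribute_names: { "#A" => "author_username" },
  expression_attribute_values: { ":a" => "User1" }
).first

post.post_body # => nil, not part of the :author projection
post.reload!   # Re-fetches the full item from the base table.
post.post_body # => the stored post body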

To create this table with secondary indexes, you create a migration like we did before:

require 'indexed_forum'

migration = Aws::Record::TableMigration.new(IndexedForum)

migration.create!(
  provisioned_throughput: {
    read_capacity_units: 10,
    write_capacity_units: 4
  },
  global_secondary_index_throughput: {
    author: {
      read_capacity_units: 5,
      write_capacity_units: 3
    }
  }
)

migration.wait_until_available

You can use either of these indexes with the query interface:

require 'indexed_forum'

def search_by_author(author)
  IndexedForum.query(
    index_name: "author",
    key_condition_expression: "#A = :a",
    expression_attribute_names: {
      "#A" => "author_username"
    },
    expression_attribute_values: {
      ":a" => author
    }
  )
end

Secondary indexes can be a powerful performance tool, and aws-record can simplify the process of managing them.

Get Involved!

Please download the gem, give it a try, and let us know what you think. This project is a work in progress, so we welcome feature requests, bug reports, and information about the kinds of problems you’d like to solve by using this gem. And, as with other SDKs and tools we produce, we’d also be happy to look at contributions.

You can find the project on GitHub at https://github.com/awslabs/aws-sdk-ruby-record.

Please reach out and let us know what you think!

Amazon DynamoDB Document API in Ruby (Part 3 – Update Expressions)

by Trevor Rowe | in Ruby

As we showed in previous posts, it’s easy to put JSON items into Amazon DynamoDB, retrieve specific attributes with projection expressions, and fetch only data that meet some criteria with condition expressions. Now, let’s take a look at how we can conditionally modify existing items with Update Expressions. (Note: this code uses the same ProductCatalog table we used in Parts 1 and 2).

In the following examples, we use the following helper method to perform conditional updates. It performs the UpdateItem operation with return_values set to return the old item. We also use the GetItem operation so the method can return both the old and new items for us to compare. (If the update condition in the request is not met, then the method sets the returned old item to nil.)

def do_update_item(key_id, update_exp, condition_exp, exp_attribute_values)
  begin
    old_result = @dynamodb.update_item(
      :update_expression => update_exp,
      :condition_expression => condition_exp,
      :expression_attribute_values => exp_attribute_values,
      :table_name => "ProductCatalog",
      :key => { :Id => key_id },
      :return_values => "ALL_OLD",
    ).data.attributes
  rescue Aws::DynamoDB::Errors::ConditionalCheckFailedException
    old_result = nil
    puts "Condition not met"
  end

  new_result = @dynamodb.get_item(
    :table_name => "ProductCatalog", :key => { :Id => key_id },
    :consistent_read => true
  ).data.item  

  return old_result, new_result
end

Using Conditional Update Expressions

Updates in DynamoDB are atomic. This allows applications to concurrently update items without worrying about conflicts occurring. For example, the following code demonstrates maintaining a MAX value in DynamoDB with a conditional update using SET. Note that, because DynamoDB is schema-less, we don’t need to define the HighestRating attribute beforehand. Instead, we create it on the first call.

# storing a "max" value with conditional SET
# SET attribute if doesn't exist, otherwise SET if stored highest rating < this rating
def update_highest_rating(rating)
  do_update_item(303,
    "SET HighestRating = :val",
    "attribute_not_exists(HighestRating) OR HighestRating < :val",
    {
      ":val" => rating
    }
  )
end

# multiple threads trying to SET highest value (ranging from 0 to 10)
threads = []
(0..10).to_a.shuffle.each { |i|
  # some number of "Condition not met" depending on shuffled order
  puts i
  threads[i] = Thread.new {
    update_highest_rating(i)
  }
}
threads.each {|t| t.join}

# fetch the item and examine the HighestRating stored
puts "Max = #{@dynamodb.get_item(
  :table_name => "ProductCatalog", :key => { :Id => 303 }
).data.item["HighestRating"].to_i}"   # Max = 10

We can also use update expressions to atomically maintain a count and add to a set:

# ADD to initialize/increment a counter and add to a set
threads = []
20.times do |i|
  threads[i] = Thread.new {
    do_update_item(303,
      "ADD TimesViewed :val, Tags :was_here",
      nil, # no condition expression
      {
        # Each of the 20 threads increments by 1
        ":val" => 1,

        # Each thread adds to the tag set
        # Note: type must match stored attribute's type
        ":was_here" => Set.new(["#Thread#{i}WasHere"])
      }
    )
  }
end
threads.each {|t| t.join}

# fetch the item and examine the TimesViewed attribute
item = @dynamodb.get_item(
  :table_name => "ProductCatalog", :key => { :Id => 303 }
).data.item

puts "TimesViewed = #{item["TimesViewed"].to_i}"
# TimesViewed = 20

puts "Tags = #{item["Tags"].inspect}"
# Tags = #<Set: {"#Mars", "#MarsCuriosity", "#StillRoving", ..each thread was here...}>

Similarly, we can decrement the count and remove from the set to undo our previous operations.

# Undo the views and set adding that we just performed
threads = []
20.times do |i|
  threads[i] = Thread.new {
    do_update_item(303,
      "ADD TimesViewed :val DELETE Tags :was_here",
      nil,  # no condition expression
      {
        # Each of the 20 threads decrements by 1
        ":val" => -1,

        # Each thread removes from the tag set
        # Note: type must match stored attribute's type
        ":was_here" => Set.new(["#Thread#{i}WasHere"])
      }
    )
  }
end
threads.each {|t| t.join}

# fetch the item and examine the TimesViewed attribute
item = @dynamodb.get_item(
  :table_name => "ProductCatalog", :key => { :Id => 303 }
).data.item

puts "TimesViewed = #{item["TimesViewed"].to_i}"
# TimesViewed = 0

puts "Tags = #{item["Tags"].inspect}"
# Tags = #<Set: {"#Mars", "#MarsCuriosity", "#StillRoving"}>

We can also use the REMOVE keyword to delete attributes, such as the HighestRating and TimesViewed attributes we added in the previous code.

# removing attributes from items
old_and_new = do_update_item(303,
  "REMOVE HighestRating, TimesViewed",
  nil,  # no condition expression
  nil,  # no attribute expression values
)

puts "OLD HighestRating is nil ? #{old_and_new[0]["HighestRating"] == nil}"
#=> false

puts "OLD TimesViewed is nil ? #{old_and_new[0]["TimesViewed"] == nil}"
#=> false

puts "NEW HighestRating is nil ? #{old_and_new[1]["HighestRating"] == nil}"
#=> true

puts "NEW TimesViewed is nil ? #{old_and_new[1]["TimesViewed"] == nil}"
#=> true

Conclusion

We hope this series was helpful in demonstrating expressions and how they allow you to interact with DynamoDB more flexibly than before. We’re always interested in hearing what developers would like to see in the future, so let us know what you think in the comments or through our forums!

Amazon DynamoDB Document API in Ruby (Part 1 – Projection Expressions)

by Trevor Rowe | in Ruby

Amazon DynamoDB launched JSON Document Support along with several improvements to the DynamoDB API. This post is part of a series where we’ll explore these features in more depth with the AWS SDK for Ruby V2. In particular, this post focuses on putting items into DynamoDB using the Ruby SDK and controlling the data we get back with projection expressions. At the end of the post, we also provide some helpful information for getting started with DynamoDB Local.

Putting JSON data into DynamoDB

DynamoDB now supports the following new data types: Maps, Lists, Booleans, and Nulls. Suppose we have a DynamoDB table for products with a hash key on an "Id" attribute. It’s easy to store such data into DynamoDB with native Ruby types:

# put a JSON item
item = {
  Id: 205, # hash key
  Title: "20-Bicycle 205",
  Description: "205 description",
  BicycleType: "Hybrid",
  Brand: "Brand-Company C",
  Price: 500,
  Gender: "B",
  Color: Set.new(["Red", "Black"]),
  ProductCategory: "Bike",
  InStock: true,
  QuantityOnHand: nil,
  NumberSold: BigDecimal.new("1E4"),
  RelatedItems: [
    341, 
    472, 
    649
  ],
  Pictures: { # JSON Map of views to url String
    FrontView: "http://example.com/products/205_front.jpg", 
    RearView: "http://example.com/products/205_rear.jpg",
    SideView: "http://example.com/products/205_left_side.jpg",
  },
  ProductReviews: { # JSON Map of stars to List of review Strings
    FiveStar: [
      "Excellent! Can't recommend it highly enough!  Buy it!",
      "Do yourself a favor and buy this."
    ],
    OneStar: [
      "Terrible product!  Do not buy this."
    ]
  }
}
dynamodb.put_item(:table_name => "ProductCatalog", :item => item)

Getting data from DynamoDB using projection expressions

Since DynamoDB now supports more interesting data types, we’ve also added projection expressions and expression attribute names to make it easier to retrieve only the attributes we want:

# get only the attributes we want with projection expressions
item = dynamodb.get_item(
  :table_name => "ProductCatalog",

  # Get the item with Id == 205
  :key => {
    :Id => 205
  },

  # for less typing, use expression attribute names to substitute
  # "ProductReviews" with "#pr" and "RelatedItems" with "#ri"
  :expression_attribute_names => {
    "#pr" => "ProductReviews",
    "#ri" => "RelatedItems",
  },

  # get Price, Color, FiveStar reviews, 0th and 2nd related items
  :projection_expression => "Price, Color, #pr.FiveStar, #ri[0], #ri[2], 
    #pr.NoStar, #ri[4]" # try projecting non-existent attributes too
).data.item

puts item["Price"].to_i
# 500

puts item["Color"].inspect
# #<Set: {"Black", "Red"}>

puts item["ProductReviews"]["FiveStar"][0]
# Excellent! Can't recommend it highly enough!  Buy it!

puts item["ProductReviews"]["FiveStar"][1]
# Do yourself a favor and buy this.

puts item["ProductReviews"]["OneStar"].inspect
# nil (because we only projected FiveStar reviews)

puts item["ProductReviews"]["NoStar"].inspect
# nil (because no NoStar reviews)

puts item["RelatedItems"]
# 0.341E3   (0th element)
# 0.649E3   (2nd element)

puts item["RelatedItems"].size
# 2 (non-existent 4th element not present)

Next Steps

As you can see, it’s easy to put and get items in DynamoDB with the AWS SDK for Ruby. In upcoming blog posts, we’ll take a closer look at expressions for filtering and updating data.

Feel free to get started on DynamoDB Local with the following code (note that it uses the credentials file approach for specifying AWS credentials):

#! /usr/bin/ruby

require "set"
require "bigdecimal"
require "aws-sdk-core"

# Configure SDK

# use credentials file at .aws/credentials
Aws.config[:credentials] = Aws::SharedCredentials.new
Aws.config[:region] = "us-west-2"

# point to DynamoDB Local, comment out this line to use real DynamoDB
Aws.config[:dynamodb] = { endpoint: "http://localhost:8000" }

dynamodb = Aws::DynamoDB::Client.new

## Create the table if it doesn't exist
begin
  dynamodb.describe_table(:table_name => "ProductCatalog")
rescue Aws::DynamoDB::Errors::ResourceNotFoundException
  dynamodb.create_table(
    :table_name => "ProductCatalog",
    :attribute_definitions => [
      {
        :attribute_name => :Id,
        :attribute_type => :N
      }
    ],
    :key_schema => [
      {
        :attribute_name => :Id,
        :key_type => :HASH
      }
    ],
    :provisioned_throughput => {
      :read_capacity_units => 1,
      :write_capacity_units => 1,
    }
  )

  # wait for table to be created
  puts "waiting for table to be created..."
  dynamodb.wait_until(:table_exists, table_name: "ProductCatalog")
  puts "table created!"
end

DynamoDB JSON and Array Marshaling for PHP

by Jeremy Lindblom | in PHP

Back in October of 2014, Amazon DynamoDB added support for new data types, including the map (M) and list (L) types. These new types, along with some API updates, make it possible to store more complex, multilevel data, and use DynamoDB for document storage.

The DynamoDB Marshaler

To make these new types even easier for our PHP SDK users, we added a new class, called the DynamoDB Marshaler, in Version 2.7.7 of the AWS SDK for PHP. The Marshaler object has methods for marshaling JSON documents and PHP arrays to the DynamoDB item format and unmarshaling them back.

Marshaling a JSON Document

Let’s say you have JSON document describing a contact in the following format:

{
  "id": "5432c69300594",
  "name": {
    "first": "Jeremy",
    "middle": "C",
    "last": "Lindblom"
  },
  "age": 30,
  "phone_numbers": [
    {
      "type": "mobile",
      "number": "5555555555",
      "preferred": true
    },
    {
      "type": "home",
      "number": "5555555556",
      "preferred": false
    }
  ]
}

You can use the DynamoDB Marshaler to convert this JSON document into the format required by DynamoDB.

use Aws\DynamoDb\DynamoDbClient;
use Aws\DynamoDb\Marshaler;

$client = DynamoDbClient::factory(/* your config */);
$marshaler = new Marshaler();
$json = file_get_contents('/path/to/your/document.json');

$client->putItem([
    'TableName' => 'YourTable',
    'Item'      => $marshaler->marshalJson($json)
]);

The output of marshalJson() is an associative array that includes all the type information required for the DynamoDB 'Item' parameter.

[
    'id' => ['S' => '5432c69300594'],
    'name' => ['M' => [
        'first' => ['S' => 'Jeremy'],
        'middle' => ['S' => 'C'],
        'last' => ['S' => 'Lindblom'],
    ]],
    'age' => ['N' => '30'],
    'phone_numbers' => ['L' => [
        ['M' => [
            'type' => ['S' => 'mobile'],
            'number' => ['S' => '5555555555'],
            'preferred' => ['BOOL' => true]
        ]],
        ['M' => [
            'type' => ['S' => 'home'],
            'number' => ['S' => '5555555556'],
            'preferred' => ['BOOL' => false]
        ]],
    ]],
];

To retrieve an item and get the JSON document back, you need to use the unmarshalJson() method.

$result = $client->getItem([
    'TableName' => 'YourTable',
    'Key'       => ['id' => ['S' => '5432c69300594']]
]);
$json = $marshaler->unmarshalJson($result['Item']);

Marshaling a Native PHP Array

The Marshaler also provides the marshalItem() and unmarshalItem() methods that do the same type of thing, but for arrays. This is essentially an upgraded version of the existing DynamoDbClient::formatAttributes() method.

$data = [
    'id' => '5432c69300594',
    'name' => [
        'first'  => 'Jeremy',
        'middle' => 'C',
        'last'   => 'Lindblom',
    ],
    'age' => 30,
    'phone_numbers' => [
        [
            'type'      => 'mobile',
            'number'    => '5555555555',
            'preferred' => true
        ],
        [
            'type'      => 'home',
            'number'    => '5555555556',
            'preferred' => false
        ],
    ],
];

// Marshaling the data and putting an item.
$client->putItem([
    'TableName' => 'YourTable',
    'Item'      => $marshaler->marshalItem($data)
]);

// Getting an item and unmarshaling the data.
$result = $client->getItem([
    'TableName' => 'YourTable',
    'Key'       => ['id' => ['S' => '5432c69300594']]
]);
$data = $marshaler->unmarshalItem($result['Item']);

Be aware that marshalItem() does not support binary (B) and set (SS, NS, and BS) types. This is because they are ambiguous with the string (S) and list (L) types and have no equivalent type in JSON. We are working on some ideas that will provide more help with these types in Version 3 of the SDK.
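
If you do need a set in the meantime, one workaround is to merge an explicitly typed attribute into the marshaled item yourself, since the 'Item' parameter is just an associative array. A sketch, with a hypothetical tags attribute:

$item = $marshaler->marshalItem($data);
$item['tags'] = ['SS' => ['philately', 'numismatics']]; // hypothetical string set attribute

$client->putItem([
    'TableName' => 'YourTable',
    'Item'      => $item,
]);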

Deprecations in the SDK

The new data types are a great addition to the Amazon DynamoDB service, but one consequence of adding support for these types is that we had to deprecate several classes and methods in the Aws\DynamoDb namespace of the PHP SDK, including the DynamoDbClient::formatAttributes() method mentioned earlier.

These classes and methods made assumptions about how certain native PHP types convert to DynamoDB types. The addition of the new types to DynamoDB invalidated those assumptions, and we could not update the code in a backward-compatible way to support them. The deprecated code still works fine, just not with the new types. It will be removed in Version 3 of the SDK, where the DynamoDB Marshaler object is the replacement for its functionality.

Feedback

We hope that this addition to the SDK makes working with DynamoDB really easy. If you have any feedback about the Marshaler or any ideas on how we can improve it, please let us know on GitHub. Better yet, send us a pull request. :-)

DynamoDB Series – Expressions

by Norm Johanson | in .NET

For the final installment of our Amazon DynamoDB series, we are going to look at the new expression support. There are two ways DynamoDB uses expressions. First, you can use update expressions to modify specific attributes in an item. Second, you can use condition expressions on puts, updates, or deletes to prevent the operation from succeeding if the item in DynamoDB doesn’t satisfy the expression.

Update Expressions

Update expressions are great for atomic updates to attributes in an item in DynamoDB. For example, let’s say we add a player item to a DynamoDB table that records the number of games won or lost and the last time a game was played.

PutItemRequest putRequest = new PutItemRequest
{
    TableName = tableName,
    Item = new Dictionary<string, AttributeValue>
    {
        {"id", new AttributeValue{S = "1"}},
        {"name", new AttributeValue{S = "Norm"}},
        {"wins", new AttributeValue{N = "0"}},
        {"loses", new AttributeValue{N = "0"}}
    }
};

ddbClient.PutItem(putRequest);

When a player wins the game, we need to update the wins attribute, set the time the last game was played, and record who the opponent was. To do that, we could get the item, look up how many wins the player currently has, and then update the wins with the current wins + 1. The tricky part is what happens if the item is updated between the get and the update. We can handle that by putting an ExpectedAttributeValue condition on the update, which will cause the update to fail if the item changed, and then we could retry the whole process.
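
That read-then-guarded-write approach might look roughly like the following sketch (names reused from the PutItem example above; the Expected entry makes the update fail if another writer changed the item first):

var getResponse = ddbClient.GetItem(tableName, new Dictionary<string, AttributeValue>
{
    {"id", new AttributeValue{S = "1"}}
});
int currentWins = int.Parse(getResponse.Item["wins"].N);

UpdateItemRequest guardedUpdate = new UpdateItemRequest
{
    TableName = tableName,
    Key = new Dictionary<string, AttributeValue>
    {
        {"id", new AttributeValue{S = "1"}}
    },
    AttributeUpdates = new Dictionary<string, AttributeValueUpdate>
    {
        {"wins", new AttributeValueUpdate
            {
                Action = AttributeAction.PUT,
                Value = new AttributeValue{N = (currentWins + 1).ToString()}
            }}
    },
    Expected = new Dictionary<string, ExpectedAttributeValue>
    {
        {"wins", new ExpectedAttributeValue
            {
                Value = new AttributeValue{N = currentWins.ToString()}
            }}
    }
};

// Throws ConditionalCheckFailedException if the item changed in between,
// in which case we would retry the whole read-modify-write cycle.
ddbClient.UpdateItem(guardedUpdate);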

Now, using expressions, we can increment the wins attribute without having to first read the value. Let’s look at the update call to see how that works.

UpdateItemRequest updateRequest = new UpdateItemRequest
{
    TableName = tableName,
    Key = new Dictionary<string, AttributeValue>
    {
        {"id", new AttributeValue{S = "1"}}
    },
    UpdateExpression = "ADD #a :increment SET #b = :date, #c = :opponent",
    ExpressionAttributeNames = new Dictionary<string, string>
    {
        {"#a", "wins"},
        {"#b", "last-played"},
        {"#c", "last-opponent"}
    },
    ExpressionAttributeValues = new Dictionary<string, AttributeValue>
    {
        {":increment", new AttributeValue{N = "1"}},
        {":date", new AttributeValue{S = DateTime.UtcNow.ToString("O")}},
        {":opponent", new AttributeValue{S = "Celeste"}}
    }
};

ddbClient.UpdateItem(updateRequest);

The TableName and Key properties are used to identify the item we want to update. The UpdateExpression property is the interesting property where we can see the expression that is run on the item. Let’s break this statement down by each token.

The ADD token is the command token, and for a numeric attribute it adds the specified value to the attribute. Next is the #a token, which is a variable. The ‘#’ means this variable will be replaced with an attribute name. :increment is another variable that is the value to be added to the attribute #a. All tokens that start with ‘:’ are variables that will have a value supplied in the update request.

SET is another command token. It means all the attributes following it will have their values set. The #b variable will get its value from the :date variable, and #c will get its value from the :opponent variable.

It is also possible to remove an attribute using the REMOVE command token.
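
For example, here is a sketch that drops the last-opponent attribute from the same item:

UpdateItemRequest removeRequest = new UpdateItemRequest
{
    TableName = tableName,
    Key = new Dictionary<string, AttributeValue>
    {
        {"id", new AttributeValue{S = "1"}}
    },
    UpdateExpression = "REMOVE #c",
    ExpressionAttributeNames = new Dictionary<string, string>
    {
        {"#c", "last-opponent"}
    }
};

ddbClient.UpdateItem(removeRequest);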

The ExpressionAttributeNames property is used to map the attribute variables in the expression to the actual attribute names we want to use. The ExpressionAttributeValues property is used to map the value variables to the values we want to use in the expression.

Once we invoke the update, DynamoDB guarantees that all the attributes in the expression will be updated at the same time, without the worry of some other thread coming in and updating the item in the middle of the process. This also saves us from using up any of our read capacity to do the update.

Check out the DynamoDB Developer Guide for more information on how to use update expressions.

Conditional Expressions

For Puts, Updates, and Deletes, a conditional expression can be set. If the expression evaluates to false, a ConditionalCheckFailedException is thrown. On the low-level service client, this is done using the ConditionExpression property. Conditional expressions can also be used with the Document Model API. To see how this works, let’s first create a game document in our game table.

DateTime lastUpdated = DateTime.Now;
Table gameTable = Table.LoadTable(ddbClient, tableName, DynamoDBEntryConversion.V2);

Document game = new Document();
game["id"] = gameId;
game["players"] = new List<string>{"Norm", "Celeste"};
game["last-updated"] = lastUpdated;
gameTable.PutItem(game);

For the game’s logic, every time the game document is updated the last-updated attribute is checked to make sure it hasn’t changed since the document was retrieved and then updated to a new date. So first let’s get the document and update the winner.

Document game = gameTable.GetItem(gameId);

game["winner"] = "Norm";
game["last-updated"] = DateTime.Now;

To declare the conditional expression, we need to create an Expression object.

var expr = new Expression();
expr.ExpressionStatement = "attribute_not_exists(#timestamp) or #timestamp = :timestamp";
expr.ExpressionAttributeNames["#timestamp"] = "last-updated";
expr.ExpressionAttributeValues[":timestamp"] = lastUpdated;

This expression evaluates to true if the last-updated attribute does not exist or is equal to the last retrieved timestamp. Then, to use the expression, assign it to the UpdateItemOperationConfig and pass it to the UpdateItem operation.

UpdateItemOperationConfig updateConfig = new UpdateItemOperationConfig
{
    ConditionalExpression = expr
};

try
{
    gameTable.UpdateItem(game, updateConfig);
}
catch(ConditionalCheckFailedException e)
{
    // Retry logic
}

To handle the expression evaluating to false, we need to catch the ConditionalCheckFailedException and call our retry logic. To avoid having to catch exceptions in our code, we can use the new "Try" methods added to the SDK, which return true or false depending on whether the write was successful. So the above code could be rewritten like this:

if(!gameTable.TryUpdateItem(game, updateConfig))
{
    // Retry logic
}

This same pattern can be used for Puts and Deletes. For more information about using conditional expressions, check out the Amazon DynamoDB Developer Guide.

Conclusion

We hope you have enjoyed our series on Amazon DynamoDB this week. Hopefully, you have learned some new tricks that you can use in your application. Let us know what you think either in the comments below or through our forums.

DynamoDB Series – Object Persistence Model

by Pavel Safronov | in .NET

This week, we are running a series of five daily blog posts that will explain new DynamoDB changes and how they relate to the AWS SDK for .NET. This is the fourth blog post, and today we will be discussing the Object Persistence Model.

Object Persistence Model

The Object Persistence Model API provides a simple way to work with Plain Old CLR Objects (POCO), as the following examples illustrate.

First, let’s look at the POCO class definition. (Notice that the class is marked up with several DynamoDB attributes, such as DynamoDBTable and DynamoDBHashKey. These are included for clarity, even though they are now optional, and they will be removed in the next sample.)

[DynamoDBTable("Products")]
public class Product
{
    [DynamoDBHashKey]
    public int Id { get; set; }
    [DynamoDBRangeKey]
    public string Name { get; set; }

    public List<string> Aliases { get; set; }
    public bool IsPublic { get; set; }
}

Next, we can create, store, load, and query DynamoDB, all while using our POCO.

var product = new Product
{
    Id = 1,
    Name = "CloudSpotter",
    Aliases = new List<string> { "Prod", "1.0" },
    IsPublic = true,
};
Context.Save(product);
var retrieved = Context.Load<Product>(1, "CloudSpotter");
var products = Context.Query<Product>(1, QueryOperator.BeginsWith, "Cloud");

The addition of the DynamoDB data type M (a string-key map of arbitrary data) allows the Object Persistence Model API to store complex data types as attributes of a single DynamoDB item. (We covered the new DynamoDB types earlier this week. It might be a good idea for you to review this again.) To illustrate this, let’s consider the following example where our Product class may reference another class.

Here are the new class definitions we will be working with.

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<string> Aliases { get; set; }
    public bool IsPublic { get; set; }
    public Dictionary<string, string> Map { get; set; }
    public Metadata Meta { get; set; }
}
public class Metadata
{
    public double InternalVersion { get; set; }
    public HashSet<string> Developers { get; set; }
}

Notice that we are going to use Dictionary objects, which will also be stored as M data types. (The only limitations are that the key must be of type string, and the value must be a supported primitive type or a complex structure.)

Now we can instantiate and work with our objects as we normally would.

Product product = new Product
{
    Id = 1,
    Name = "CloudSpotter",
    Aliases = new List<string> { "Prod", "1.0" },
    IsPublic = true,
    Meta = new Metadata
    {
        InternalVersion = 1.2,
        Developers = new HashSet<string> { "Alan", "Franco" }
    },
    Map = new Dictionary<string, string>
    {
        { "a", "1" },
        { "b", "2" }
    }
};
Context.Save(product);
var retrieved = Context.Load<Product>(1, "CloudSpotter");
var products = Context.Query<Product>(1, QueryOperator.BeginsWith, "Cloud");

As you can see, the new DynamoDB data types really expand the range of data that you can maintain and work with. You do have to be careful, though, that the objects you create do not contain circular references, because the API throws an exception for such objects.

DynamoDB Series – Conversion Schemas

by Pavel Safronov | in .NET

This week, we are running a series of five daily blog posts that will explain new DynamoDB changes and how they relate to the AWS SDK for .NET. This is the third blog post, and today we will be discussing conversion schemas.

Conversion Schemas

Document doc = new Document();
doc["Id"] = 1;
doc["Product"] = "DataWriter";
doc["Aliases"] = new List<string> { "Prod", "1.0" };
doc["IsPublic"] = true;
table.UpdateItem(doc);

As you have seen earlier this week and in this example, it is very easy to work with a Document object and use .NET primitives. But how is this data actually stored in Amazon DynamoDB? The latest version of DynamoDB has added, among other things, native support for booleans (BOOL type) and lists of arbitrary elements (L type). (The various new DynamoDB types are covered in the first post in this series.) So in the above sample, are we taking advantage of these new types? To address this question, and to provide a simple mechanism to control how your data is stored, we have introduced the concept of conversion schemas.

Why conversion schemas?

The new DynamoDB features are a powerful addition to the .NET SDK, but adding them to the Document Model API presented a challenge. The API was already capable of writing booleans and lists of items to DynamoDB, so how do the new BOOL and L types fit in? Should all booleans now be stored as BOOL (current implementation stores booleans as N types, either 1 or 0) and should lists be stored as L instead of SS/NS/BS? Changing how data is stored with the new SDK would break existing applications (older code would not be aware of the new types and query/scan conditions depend on the current types), so we have provided conversion schemas so that you can control exactly how your data is stored in DynamoDB. Schema V1 will maintain the current functionality, while Schema V2 will allow you to use the new types.

V1

The default conversion approach that the Document Model uses is as follows:

  • Number types (byte, int, float, decimal, etc.) are converted to N
  • String and char are converted to S
  • Bool is converted to N (0=false, 1=true)
  • DateTime and Guid are converted to S
  • MemoryStream and byte[] are converted to B
  • List, HashSet, and array of numeric types are converted to NS
  • List, HashSet, and array of string-based types are converted to SS
  • List, HashSet, and array of binary-based types are converted to BS

This conversion approach is known as Conversion Schema V1. It is the default conversion that the Document Model API uses and is identical in functionality to the conversion used by the SDK prior to the 2.3.2 release. As you can see, this schema does not take full advantage of the new DynamoDB types BOOL or L: the attribute Aliases will be stored as a string set (SS), while the boolean attribute IsPublic will be stored as a numeric (N).

But what if you now wanted to store a boolean as BOOL instead of N, and a List<string> as L instead of SS? The simple way to do this is to use Conversion Schema V2.

V2

Conversion Schema V2 differs from V1 in that boolean values are stored as BOOL types, Lists are stored as L, and HashSets are stored as sets (NS, SS, or BS, depending on the data). So if you use V2 schema to store the Document in our example, it will function identically from the perspective of the application, but the data stored in DynamoDB will be different. The V2 schema differs from V1 in the following ways:

  • Boolean values will be stored as BOOL instead of N.
  • Lists and arrays of numerics, string-based types, and binary-based types are converted to L type.
  • HashSets of numerics, string-based types, and binary-based types are converted to NS, SS, or BS, as appropriate.
  • Other types are not impacted.

Note that Conversion Schema V2 differs in how it stores List<T> vs. HashSet<T>: List<T> is stored as a DynamoDB List (L type), while HashSet<T> is stored as a DynamoDB Set (NS, SS, or BS type). So if we wanted to use schema V2 but keep the Aliases attribute as a set, we could update the code in our example to use HashSet<string> instead of List<string>.
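
A minimal sketch of that choice, reusing the document from the example above:

// Under the V2 conversion schema:
doc["Aliases"] = new List<string> { "Prod", "1.0" };    // stored as a DynamoDB List (L)
doc["Aliases"] = new HashSet<string> { "Prod", "1.0" }; // stored as a DynamoDB Set (SS)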

Using Conversion Schemas with the Document Model

Conversion schemas are set for a particular Table object. (This means that the same Document stored using different Table objects may result in different data being written to DynamoDB.) The following sample shows how to load two Table objects, one configured with schema V1 and the other with schema V2, using two different LoadTable approaches.

Table tableV1 = Table.LoadTable(client, "SampleTable", DynamoDBEntryConversion.V1);
Table tableV2;
Table.TryLoadTable(client, "SampleTable", DynamoDBEntryConversion.V2, out tableV2);

You may also load a Table object without specifying a conversion, in which case the Table will use either the default V1 conversion or the conversion that you specify in your app.config file, as shown in the following sample.

<configuration>
  <aws>
    <dynamoDB conversionSchema="V2" />
  </aws>
</configuration>

Avoiding Conversion

Conversion schemas are used to convert between .NET types and DynamoDB types. However, if you use classes that extend DynamoDBEntry (such as Primitive, DynamoDBBool, or DynamoDBNull), conversion will not be performed. So in cases where you want your data to be stored in a particular format irrespective of the conversion schema, you can use these types, as shown below.

var list = new DynamoDBList();
list.Add(1);
list.Add("Sam");
list.Add(new HashSet<string> { "Design", "Logo" });

doc["Bool"] = DynamoDBBool.True;
doc["Null"] = DynamoDBNull.Null;
doc["List"] = list;

Using Conversion Schemas with the Object Persistence Model

The changes to the Object Persistence Model API are very similar to the Document Model changes:

  • With Conversion Schema V1, booleans are stored as N, Lists and HashSets are stored as sets (NS, SS, or BS)
  • With Conversion Schema V2, booleans are stored as BOOL, Lists are stored as L, HashSets are stored as sets (NS, SS, or BS)

Similarly to the Document Model, a conversion schema is associated with a DynamoDBContext and can be explicitly specified in code or the app.config, and will default to V1 if not set. The following example shows how to configure a context with V2 conversion schema.

var config = new DynamoDBContextConfig
{
    Conversion = DynamoDBEntryConversion.V2
};
var contextV2 = new DynamoDBContext(client, config);

Tomorrow, we will take a deeper look into the Object Persistence Model API and how the new DynamoDB types allow you to work with complex classes.

DynamoDB Series – Document Model

by Norm Johanson | in .NET

This week we are running a series of five daily blog posts that will explain new DynamoDB changes and how they relate to the .NET SDK. This is blog post number 2, and today we will be looking at the Document Model API.

Document Model

Yesterday, we learned about Amazon DynamoDB’s new data types such as lists and maps. Today, we are going to talk about how you can use the new data types in the Document Model API.

As a quick refresher on the Document Model API, it is an abstraction over the low-level service client and is found in the Amazon.DynamoDBv2.DocumentModel namespace. Here is an example for creating an item using the Document Model API.

var table = Table.LoadTable(ddbClient, tableName);
Document company = new Document();
company["id"] = "1";
company["name"] = "Dunder Mifflin";
company["industry"] = "paper";

table.PutItem(company);

The Document Model keeps things simple and easy. It takes care of all the data conversions and creating the underlying low-level request objects. Now if you want to take advantage of the new map data type, you can simply create a separate document object and assign it to the parent document. So in our example above, let’s give our paper company an address.

Document address = new Document();
address["street"] = "1725 Slough Avenue";
address["city"] = "Scranton";
address["state"] = "Pennsylvania";

Document company = new Document();
company["id"] = "1";
company["name"] = "Dunder Mifflin";
company["industry"] = "paper";
company["address"] = address;

table.PutItem(company);

To get the data back out, use the GetItem method from the table object passing in the key information for the item.

Document dunder = table.GetItem("1");

Now we can take advantage of the recursive capabilities of the list and map data types and add a list of employees to our branch office.

var address = dunder["address"].AsDocument();
address["employee"] = new List<string> { "Michael Scott", "Dwight Schrute", "Jim Halpert", "Pam Beesly" };
table.UpdateItem(dunder);

As I said, the Document Model translates its calls into the low-level service client. Here is how the Document Model translates the save of the Dunder Mifflin company into a call to the low-level service client.

var request = new PutItemRequest
{
    TableName = tableName,
    Item = new Dictionary<string, AttributeValue>
    {
        {"id", new AttributeValue{S = "1"}},
        {"name", new AttributeValue{S = "Dunder Mifflin"}},
        {"industry", new AttributeValue{S = "paper"}},
        {"address", new AttributeValue
            {M = new Dictionary<string, AttributeValue>
            {
                {"street", new AttributeValue{S = "1725 Slough Avenue"}},
                {"city", new AttributeValue{S = "Scranton"}},
                {"state", new AttributeValue{S = "Pennsylvania"}},
                {"employee", new AttributeValue
                {L = new List<AttributeValue>
                {
                    new AttributeValue{S = "Michael Scott"},
                    new AttributeValue{S = "Dwight Schrute"},
                    new AttributeValue{S = "Jim Halpert"},
                    new AttributeValue{S = "Pam Beesly"}
                }}
            }}
        }
    }}
};


ddbClient.PutItem(request);

Tomorrow, we are going to go deeper into how the Document Model handles converting types passed into a document into the underlying data types that DynamoDB supports.

DynamoDB Series Kickoff

by Pavel Safronov | in .NET

Last week, Amazon DynamoDB added support for JSON document data structures. With this update, DynamoDB now supports nested data in the form of lists (L type) and maps (M type). Also part of this update was native support for booleans (BOOL type) and nulls (NULL type).

This week, we will be running a series of daily blog posts that will explain the new changes and how they relate to the AWS SDK for .NET, and we will see how you can take advantage of these new types to work with complex objects in all three .NET SDK DynamoDB APIs. In this, the first blog post of the series, we will see how the low-level API has changed. In the following days, we will cover the Document Model, Conversion Schemas, Object Persistence Model, and finally Expressions.

New types

Until now, DynamoDB had only six data types:

  • Scalars N, S, and B that represent number, string, and binary data.
  • Sets NS, SS, and BS that represent number set, string set, and binary set.
    Sets have the limitation that the data they store has to be homogeneous (e.g., SS could only contain S elements) and unique (no two elements could be the same).

This release expands the possible data types with four new additions:

  • BOOL represents boolean data.
  • NULL represents null values.
  • L type represents a list of elements.
  • M type represents a string-to-element map.

The key point about L and M types is that they can contain any DynamoDB type. This allows you to create, for example, lists of maps of lists, which in turn can contain a mix of numbers, strings, bools, and nulls, or any other conceivable combination of attributes.

Low-level

The low-level API changes are straightforward: new DynamoDB types are now supported in all data calls. Here’s a sample that shows how both old and new types can be used in a PutItem call.

// Put item
client.PutItem("SampleTable", new Dictionary<string, AttributeValue>
{
    { "Id", new AttributeValue { N = "1" } },
    { "Product", new AttributeValue { S = "DataWriter" } },
    { "Aliases", new AttributeValue {
        SS = new List<string> { "Prod", "1.0" } } },
    { "IsPublic", new AttributeValue { BOOL = false } },
    { "Metadata", new AttributeValue {
        M = new Dictionary<string, AttributeValue>
        {
            { "InternalVersion", new AttributeValue { N = "1.2" } },
            { "Developers", new AttributeValue {
                SS = new List<string> { "Alan", "Franko" } } 
            },
            { "SampleInput", new AttributeValue {
                L = new List<AttributeValue>
                {
                    new AttributeValue { BOOL = true },
                    new AttributeValue { N =  "42" },
                    new AttributeValue { NULL = true },
                    new AttributeValue {
                        SS = new List<string> { "apple", "orange" } }
                } }
            }
        } }
    }
});

As you can see, the new M and L AttributeValue types may contain AttributeValues, allowing complex, nested data to be stored in a single DynamoDB record. In the above example, the item we just stored into DynamoDB will have an attribute of type M named "Metadata". This attribute will in turn contain three other attributes: N (number), SS (string set), and L (list). The list contains four more attributes, which in turn can be other M and L types, though in our example they are not.
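
A short sketch of reading those nested values back, assuming the item above:

// Get the item and inspect the new types.
var response = client.GetItem("SampleTable", new Dictionary<string, AttributeValue>
{
    { "Id", new AttributeValue { N = "1" } }
});

bool isPublic = response.Item["IsPublic"].BOOL;  // native boolean
var metadata = response.Item["Metadata"].M;      // map of AttributeValues
var sampleInput = metadata["SampleInput"].L;     // list of AttributeValues
bool thirdIsNull = sampleInput[2].NULL;          // the NULL element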

Tomorrow, we will take a look at how the new additions can be used with the Document Model API.