Using SimpleDB and Rails in No Time with ActiveResource

Martin Rehfeld shows you how to seamlessly leverage Amazon SimpleDB for your Ruby on Rails application using the Rails ActiveResource framework.

Details

Submitted By: klickmich3
AWS Products Used: Amazon SimpleDB
Language(s): Ruby
Created On: February 19, 2008 4:38 PM GMT
Last Updated: September 21, 2008 10:04 PM GMT

by Martin Rehfeld, senior Rails engineer and web services advisor at GL Networks, Germany

Introduction

Amazon SimpleDB is a web service for storing, maintaining, and querying structured data sets in real time. All data is stored in Amazon's web service cloud, making SimpleDB very reliable, scalable, and flexible.

After reading this article, Rails developers will be able to quickly integrate SimpleDB as a storage backend for their projects.

For more information about Amazon SimpleDB, visit the Amazon SimpleDB page on the Amazon Web Services (AWS) site.

Why Rails?

The Rails web application framework, apart from generally being a wonderful tool, offers out-of-the-box support for web service-based data stores with its ActiveResource sub-framework. Only a very thin adapter layer is necessary to bridge the ActiveResource API to SimpleDB. Rails gives you the unique opportunity to utilize SimpleDB just as any other RESTful resource provided by a Rails application.

This article assumes a basic understanding of SimpleDB and Rails and is based on Rails 2.0.2 (the latest shipping version). If you want to dive into Rails, see the Resources section at the end of this article.

Behind The Scenes

For this tutorial, we are going to use a Rails Plugin called AWS SDB Proxy acting as an adapter layer.

AWS SDB Proxy is an HTTP server built with WEBrick (a pure Ruby web server implementation that comes with Ruby's standard library). The proxy will listen for web service calls initiated by ActiveResource models and forward the requests to SimpleDB using the aws-sdb gem by Tim Dysinger.

URL mapping

ActiveResource uses the standard Rails RESTful routes to access web services. The following table illustrates the mapping of Rails' HTTP actions and URIs to SimpleDB operations performed by AWS SDB Proxy:

HTTP/REST                                   SimpleDB
GET /domain/resource?attribute=value[&...]  QUERY by exact attribute values
GET /domain/resource/query?query_string     QUERY by SimpleDB query string
GET /domain/resource/itemID                 GET ATTRIBUTES
POST /domain/resource/itemID                PUT ATTRIBUTES (create item)
PUT /domain/resource/itemID                 PUT ATTRIBUTES (replace)
DELETE /domain/resource/itemID              DELETE ATTRIBUTES (delete item)
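
The mapping in the table can be read as a small dispatch function. The following is only a sketch of that idea, not the plugin's actual code: the method name sdb_operation and the returned symbols are hypothetical and chosen for illustration.

```ruby
# Hypothetical sketch of the URL mapping table above; AWS SDB Proxy's real
# implementation may differ. Returns a symbol naming the SimpleDB operation.
def sdb_operation(http_method, path)
  uri_path, _query = path.split('?', 2)
  _domain, _resource, tail = uri_path.sub(%r{\A/}, '').split('/', 3)

  case http_method.to_s.upcase
  when 'GET'
    if tail.nil?
      :query_by_attributes     # GET /domain/resource?attribute=value
    elsif tail == 'query'
      :query_by_query_string   # GET /domain/resource/query?query_string
    else
      :get_attributes          # GET /domain/resource/itemID
    end
  when 'POST'   then :put_attributes_create
  when 'PUT'    then :put_attributes_replace
  when 'DELETE' then :delete_attributes
  else raise ArgumentError, "unsupported HTTP method: #{http_method}"
  end
end

sdb_operation('GET', '/my_new_domain/posts/query?whatever')  # => :query_by_query_string
```

Note that in this sketch the path segment "query" is reserved, so an item whose id is literally "query" could not be fetched by id.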

Special Attributes

AWS SDB Proxy handles a couple of special attributes transparently. Here is the complete list:

  • id: Every record automatically gets assigned a unique id using a hash function (see below)
  • _resource: This attribute will always hold the name of the Rails model the item belongs to. This way AWS SDB Proxy can store multiple Rails models within one SimpleDB domain
  • created_at: The time the item was initially created (ISO 8601 format)
  • updated_at: The time the item was last modified (ISO 8601 format)
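
To make the list concrete, here is a minimal sketch of how an item's attributes might look once the special attributes are merged in. The helper name stamp_item and its keyword arguments are assumptions made for illustration; the plugin's internals are not shown in this article.

```ruby
require 'time'

# Hypothetical helper: merges the special attributes into an item's own
# attributes. Passing created_at preserves the original creation time on
# updates; otherwise both timestamps are set to "now".
def stamp_item(attributes, resource:, id:, now: Time.now.utc, created_at: nil)
  attributes.merge(
    'id'         => id.to_s,
    '_resource'  => resource,
    'created_at' => created_at || now.iso8601,
    'updated_at' => now.iso8601
  )
end

item = stamp_item({ 'title' => 'My first SimpleDB post' },
                  resource: 'Post',
                  id: 'abc123',
                  now: Time.utc(2008, 1, 20, 0, 42, 43))
item['created_at']  # => "2008-01-20T00:42:43Z"
```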

ID Hashing

Record ids are generated by applying a SHA512 hash function to the request body combined with a timestamp and a configurable salt (set in config/aws_sdb_proxy.yml). With 512 bits of output, key collisions are extremely unlikely.
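
As a rough illustration of this scheme, the sketch below hashes the three ingredients with Ruby's standard Digest library. The method name generate_record_id, the order in which the parts are concatenated, and the hexadecimal output format are all assumptions; the plugin may combine or render them differently.

```ruby
require 'digest'

# Hypothetical sketch: derives a record id from the request body, a
# timestamp, and a salt. Concatenation order and hex output are assumed.
def generate_record_id(request_body, salt, timestamp = Time.now)
  Digest::SHA512.hexdigest("#{request_body}#{timestamp.to_f}#{salt}")
end

id = generate_record_id('<post><title>Hello</title></post>', 'my-secret-salt')
id.length  # => 128 (512 bits as hexadecimal digits)
```

Because the timestamp is part of the input, two requests with identical bodies still receive different ids as long as they arrive at different times.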

Pros & Cons

Over at RubyForge, a couple of work-in-progress projects aim at providing an ActiveRecord adapter for SimpleDB. In theory, that would enable SimpleDB to become a drop-in replacement for a regular SQL RDBMS in any Rails project. Then again, a lot of jumping through hoops would be needed to make that happen. SimpleDB simply is not an RDBMS and is targeted at completely different usage patterns.

ActiveResource, on the other hand, was made exactly with data integration of remote web services in mind. Despite its somewhat limited feature set compared to ActiveRecord, I like the idea of pursuing this straightforward approach.

Getting Up and Running

To set up Amazon SimpleDB for Rails, follow these steps (I assume you have already created a Rails project and have your command line pointed to its directory):

  1. Sign up for Amazon SimpleDB services on your AWS account
  2. Install the aws-sdb Ruby Gem providing a basic Ruby API for SimpleDB:
    gem install aws-sdb
  3. Install the AWS SDB Proxy Plugin for Rails:
    script/plugin install http://rug-b.rubyforge.org/svn/aws_sdb_proxy
  4. Enter your Amazon Web Service credentials in the config/aws_sdb_proxy.yml file (optionally configure server ports and an individual salt used to generate primary keys with a hash function). Do this at least for the development environment.
  5. Either use an existing SimpleDB domain in your account (you can list your domains with rake aws_sdb:list_domains), or create a new one with rake aws_sdb:create_domain DOMAIN=my_new_domain
  6. Start the AWS SDB Proxy server with rake aws_sdb:start_proxy_in_foreground, which provides debug output on stdout (once you are confident with the configuration, you can use rake aws_sdb:start_proxy to start the server as a background daemon)
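
For step 4, it can help to check the configuration before starting the proxy. The sketch below is an assumption-laden illustration: the credential key names aws_access_key_id and aws_secret_access_key are the ones the plugin's server code is quoted as reading in the reader comments on this article, while the per-environment layout and the port and salt key names are hypothetical. Consult the config/aws_sdb_proxy.yml file generated by the plugin for the real structure.

```ruby
require 'yaml'

# Assumed layout of config/aws_sdb_proxy.yml; "port" and "salt" are
# hypothetical key names used only for this illustration.
EXAMPLE_CONFIG = <<~YAML
  development:
    aws_access_key_id: YOUR_ACCESS_KEY_ID
    aws_secret_access_key: YOUR_SECRET_ACCESS_KEY
    port: 8888
    salt: change-me
YAML

# Hypothetical helper: returns the settings for one environment and fails
# early if the credentials are missing.
def load_proxy_config(yaml_text, environment)
  config = YAML.safe_load(yaml_text)[environment]
  raise KeyError, "no settings for environment #{environment.inspect}" if config.nil?

  %w[aws_access_key_id aws_secret_access_key].each do |key|
    raise ArgumentError, "missing #{key}" if config[key].to_s.empty?
  end
  config
end

config = load_proxy_config(EXAMPLE_CONFIG, 'development')
config['port']  # => 8888
```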

Using ActiveResource

To make a Rails model access SimpleDB, it must inherit from ActiveResource::Base. For the following examples we will use a Post model that could represent blog posts and thus create the following models/post.rb file:

class Post < ActiveResource::Base
  self.site   = "http://localhost:8888" # Proxy host + port
  self.prefix = "/my_new_domain/"       # SDB domain
end

This assumes that AWS SDB Proxy runs on localhost at port 8888 and that you use a SimpleDB domain named my_new_domain (adjust both according to the configuration you entered in config/aws_sdb_proxy.yml).

Let's fire up script/console and give it a spin.

Creating Records

>> p = Post.create(:title => 'My first SimpleDB post')
=> #<Post:0x198ceec @prefix_options={}, @attributes={"updated_at"=>
   Sun Jan 20 00:42:43 UTC 2008, "title"=>"My first SimpleDB post",
   "id"=>1081408...01005954, "created_at"=>Sun Jan 20 00:42:43 UTC 2008}>

As you can see, a new Post object gets created as usual. The AWS SDB Proxy auto-assigns an id and the additional attributes created_at and updated_at mimicking Rails' standard behaviour.

Note that we could have assigned any attributes we wanted to that Post; SimpleDB does not enforce a schema, so the proxy will happily accept any attributes we throw at it.

Note: Remember that all attributes will be coerced into strings for storage in SimpleDB. No matter what your original data type was, you will always get back a string representation of it when fetching records from SimpleDB.
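
The consequence of string coercion is easiest to see in a small simulation. The following sketch does not talk to SimpleDB; coerce_for_storage and pad_number are hypothetical helpers that only mimic the round trip and show one common way to keep numbers sortable when they are compared as strings.

```ruby
# Simulates the storage round trip: every value is kept as a string.
def coerce_for_storage(attributes)
  attributes.each_with_object({}) do |(name, value), stored|
    stored[name.to_s] = value.to_s
  end
end

# Hypothetical helper: zero-pads a non-negative integer so that string
# comparison orders values the same way numeric comparison would.
def pad_number(number, width = 10)
  format("%0#{width}d", number)
end

stored = coerce_for_storage(:title => 'Hello', :comments_count => 42, :published => true)

comments_count = Integer(stored['comments_count'])  # convert back explicitly
published      = stored['published'] == 'true'      # "true" is just a string

'9' < '10'                      # => false, strings compare character by character
pad_number(9) < pad_number(10)  # => true
```

In a real model you would perform such conversions wherever you read the attribute, since the fetched record only carries strings.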

Updating Records

Let's assign another attribute to the Post and save it to trigger an update operation:

>> p.body = 'Content is king'
=> "Content is king"
    
>> p.save
=> true

If you started AWS SDB Proxy in the foreground, you can watch it forward the save operation to SimpleDB.

Finding Records

ActiveResource offers a couple of ways to query for records:

# Find by record id
>> Post.find(p.id)
=> #<Post:0x18efef8 @prefix_options={}, @attributes={"updated_at"=>
   Sun Jan 20 00:45:28 UTC 2008, "title"=>"My first SimpleDB post",
   "body"=>"Content is king", "id"=>1081408...01005954, "created_at"=>
   Sun Jan 20 00:42:43 UTC 2008}>

# Find by attribute value(s)
>> Post.find(:first, :params => { :title => 'My first SimpleDB post' })
=> #<Post:0x18efef8 @prefix_options={}, @attributes={"updated_at"=>
   Sun Jan 20 00:45:28 UTC 2008, "title"=>"My first SimpleDB post",
   "body"=>"Content is king", "id"=>1081408...01005954, "created_at"=>
   Sun Jan 20 00:42:43 UTC 2008}>

The first form queries for a single Post with a given id, whereas the second form queries for records with exact matches on every given attribute (:first tells the find method to return only the first of those).

SimpleDB offers more sophisticated query operations than ActiveResource, including lexicographical comparisons, intersection and union. You can pass in native SimpleDB query syntax using this form of find:

# Find by SimpleDB Query
>> Post.find(:all, :from => :query, :params => "['title' starts-with 'My']")
=> #<Post:0x18efef8 @prefix_options={}, @attributes={"updated_at"=>
   Sun Jan 20 00:45:28 UTC 2008, "title"=>"My first SimpleDB post",
   "body"=>"Content is king", "id"=>1081408...01005954, "created_at"=>
   Sun Jan 20 00:42:43 UTC 2008}>
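
Writing query strings by hand gets error-prone once values contain quotes. The helpers below are a hypothetical sketch (they are not part of ActiveResource, the proxy, or the aws-sdb gem) that builds bracketed predicates in the SimpleDB query syntax used above, escaping single quotes and backslashes with a backslash.

```ruby
# Hypothetical helpers for assembling SimpleDB query strings.

# Wraps a name or value in single quotes, escaping ' and \ with a backslash.
def sdb_quote(value)
  "'" + value.to_s.gsub(/[\\']/) { |ch| "\\#{ch}" } + "'"
end

# Builds one predicate, e.g. ['title' starts-with 'My'].
def sdb_predicate(attribute, operator, value)
  "[#{sdb_quote(attribute)} #{operator} #{sdb_quote(value)}]"
end

# Combines predicates so that an item must match all of them.
def sdb_intersection(*predicates)
  predicates.join(' intersection ')
end

query = sdb_intersection(
  sdb_predicate('title', 'starts-with', 'My'),
  sdb_predicate('body', '=', 'Content is king')
)
# query is now:
# ['title' starts-with 'My'] intersection ['body' = 'Content is king']
```

The resulting string could then be passed as :params in the find call shown above.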

Deleting Records

To complete the life-cycle of our Post, we can finally delete it from SimpleDB:

>> p.destroy
=> true

Resources

Martin Rehfeld is passionate about Ruby on Rails. He has published several Rails plugins and regularly gives talks at Rails related events. If you like Martin's work, consider recommending him on Working With Rails.

Comments

good service, bad libraries and incomplete documentation
the gem has tons of problems. furthermore with the described method, complex queries are not supported (such as OR queries - yes, you can call that complex with simpledb). all in all i would like to see more examples and an updated gem. i think amazon should have enough resources to do that.
hansdump on June 4, 2010 11:04 PM GMT
Good plugin with little bugs
I've faced a few issues with this plugin. As the earlier comment mentions, this plugin uses an older constructor. Another bug: when I try to use a class name in CamelCase, the plugin spits out this error:
NameError: uninitialized constant TestClass
  from /Library/Ruby/Gems/1.8/gems/activesupport-2.2.2/lib/active_support/dependencies.rb:445:in `load_missing_constant'
  from /Library/Ruby/Gems/1.8/gems/activesupport-2.2.2/lib/active_support/dependencies.rb:77:in `const_missing'
  from /Library/Ruby/Gems/1.8/gems/activesupport-2.2.2/lib/active_support/dependencies.rb:89:in `const_missing'
  from (irb):1
where
class TestClass < ActiveResource::Base
  self.site   = "http://localhost:8888" # Proxy host + port
  self.prefix = "/xyz/"                 # SDB domain
end
Mac ( Maheshwaran S) on February 22, 2009 5:17 AM GMT
The plugin that doesn't connect.
It seems that the plugin does not connect because it is using an old constructor for the aws_sdb library. If you go into vendor/plugins/aws_sdb_proxy/lib/aws_sdb_proxy/server.rb and replace
SDB_SERVICE = AwsSdb::Service.new(Logger.new(nil),CONFIG['aws_access_key_id'],CONFIG['aws_secret_access_key'])
with
SDB_SERVICE = AwsSdb::Service.new({:access_key_id=>CONFIG['aws_access_key_id'],:secret_access_key=>CONFIG['aws_secret_access_key']})
it seems to fix some problems.
bigplatinum on October 24, 2008 10:48 PM GMT
Hi
Hi, This is great. However, I get the following error when I try "rake aws_sdb:list_domains" ...
(in /Users/myname/NetBeansProjects/project)
** Invoke aws_sdb:list_domains (first_time)
** Invoke environment (first_time)
** Execute environment
** Execute aws_sdb:list_domains
rake aborted!
wrong number of arguments (3 for 1)
/Users/myname/NetBeansProjects/project/vendor/plugins/aws_sdb_proxy/tasks/../lib/aws_sdb_proxy/server.rb:27:in `initialize'
/Users/myname/NetBeansProjects/project/vendor/plugins/aws_sdb_proxy/tasks/../lib/aws_sdb_proxy/server.rb:27:in `new'
/Users/myname/NetBeansProjects/project/vendor/plugins/aws_sdb_proxy/tasks/../lib/aws_sdb_proxy/server.rb:27
/Library/Ruby/Site/1.8/rubygems/custom_require.rb:27:in `gem_original_require'
/Library/Ruby/Site/1.8/rubygems/custom_require.rb:27:in `require'
/Library/Ruby/Gems/1.8/gems/activesupport-2.0.2/lib/active_support/dependencies.rb:496:in `require'
/Library/Ruby/Gems/1.8/gems/activesupport-2.0.2/lib/active_support/dependencies.rb:342:in `new_constants_in'
/Library/Ruby/Gems/1.8/gems/activesupport-2.0.2/lib/active_support/dependencies.rb:496:in `require'
/Users/dharmesh/NetBeansProjects/FooPetsTranslations/vendor/plugins/aws_sdb_proxy/tasks/aws_sdb_proxy_tasks.rake:28
/Library/Ruby/Gems/1.8/gems/rake-0.8.1/lib/rake.rb:546:in `call'
/Library/Ruby/Gems/1.8/gems/rake-0.8.1/lib/rake.rb:546:in `execute'
/Library/Ruby/Gems/1.8/gems/rake-0.8.1/lib/rake.rb:541:in `each'
/Library/Ruby/Gems/1.8/gems/rake-0.8.1/lib/rake.rb:541:in `execute'
/Library/Ruby/Gems/1.8/gems/rake-0.8.1/lib/rake.rb:508:in `invoke_with_call_chain'
/Library/Ruby/Gems/1.8/gems/rake-0.8.1/lib/rake.rb:501:in `synchronize'
/Library/Ruby/Gems/1.8/gems/rake-0.8.1/lib/rake.rb:501:in `invoke_with_call_chain'
/Library/Ruby/Gems/1.8/gems/rake-0.8.1/lib/rake.rb:494:in `invoke'
/Library/Ruby/Gems/1.8/gems/rake-0.8.1/lib/rake.rb:1931:in `invoke_task'
/Library/Ruby/Gems/1.8/gems/rake-0.8.1/lib/rake.rb:1909:in `top_level'
/Library/Ruby/Gems/1.8/gems/rake-0.8.1/lib/rake.rb:1909:in `each'
/Library/Ruby/Gems/1.8/gems/rake-0.8.1/lib/rake.rb:1909:in `top_level'
/Library/Ruby/Gems/1.8/gems/rake-0.8.1/lib/rake.rb:1948:in `standard_exception_handling'
/Library/Ruby/Gems/1.8/gems/rake-0.8.1/lib/rake.rb:1903:in `top_level'
/Library/Ruby/Gems/1.8/gems/rake-0.8.1/lib/rake.rb:1881:in `run'
/Library/Ruby/Gems/1.8/gems/rake-0.8.1/lib/rake.rb:1948:in `standard_exception_handling'
/Library/Ruby/Gems/1.8/gems/rake-0.8.1/lib/rake.rb:1878:in `run'
/Library/Ruby/Gems/1.8/gems/rake-0.8.1/bin/rake:31
/usr/bin/rake:19:in `load'
/usr/bin/rake:19
Please advise. Thanks in advance!
dharmesh desai on September 26, 2008 10:08 PM GMT
What about Collections?
Interesting article - though the example provided (posts on a blog) begs the question "How do you handle collections?". Suppose each post is categorized. And you want to maintain attributes about your categories? Do you then store the 'category_id' in the post record? And if so, is there an easier way to fetch all of the category data than looping over an array of category ids? I really like the concept behind SimpleDB - but I'm not sure of the best approach for implementing even a modestly complex data structure with normalized data elements. Am I completely missing something obvious?
Richard Luck on September 23, 2008 9:38 PM GMT
Love the idea ... how does it perform?
So, using the proxy clearly is valuable in that it enables the use of REST ... for a client Ruby app and many other apps that can do REST as well. Bravo. In terms of #s of AMIs needed in a scaleable architecture, I'll need to know how many requests per second such a proxy can handle in a normal EC2 AMI? As apps scale up, we'll need 1 proxy for every N app servers, readers will like to be able to estimate what that number N is. Good clear article ... well written, easy to follow. FHW
Fred Wild on June 25, 2008 7:14 PM GMT