by Martin Rehfeld, senior Rails engineer and web services advisor at GL Networks, Germany
Introduction
Amazon SimpleDB is a web service for storing, maintaining, and querying structured data sets in real time. All data is stored in Amazon's web service cloud, making SimpleDB very reliable, scalable, and flexible.
After reading this article, Rails developers will be able to quickly integrate SimpleDB as a storage backend for their projects.
For more information about Amazon SimpleDB, visit the Amazon SimpleDB page on the Amazon Web Services (AWS) site.
Why Rails?
The Rails web application framework, apart from generally being a wonderful tool, offers out-of-the-box support for web service-based data stores with its ActiveResource sub-framework. Only a very thin adapter layer is necessary to bridge the ActiveResource API to SimpleDB. Rails gives you the unique opportunity to utilize SimpleDB just as any other RESTful resource provided by a Rails application.
This article assumes a basic understanding of SimpleDB and Rails and is based on Rails 2.0.2 (the latest shipping version). If you want to dive into Rails, see the Resources section at the end of this article.
Behind The Scenes
For this tutorial, we are going to use a Rails plugin called AWS SDB Proxy, which acts as the adapter layer.
AWS SDB Proxy is an HTTP server built with WEBrick (a pure Ruby web server implementation that comes with Ruby's standard library). The proxy will listen for web service calls initiated by ActiveResource models and forward the requests to SimpleDB using the aws-sdb gem by Tim Dysinger.
URL mapping
ActiveResource uses the standard Rails RESTful routes to access web services. The following table illustrates the mapping of Rails' HTTP actions and URIs to SimpleDB operations performed by AWS SDB Proxy:
| HTTP/REST | SimpleDB |
|---|---|
| GET /domain/resource?attribute=value[&...] | QUERY by exact attribute values |
| GET /domain/resource/query?query_string | QUERY by SimpleDB query string |
| GET /domain/resource/itemID | GET ATTRIBUTES |
| POST /domain/resource/itemID | PUT ATTRIBUTES (create item) |
| PUT /domain/resource/itemID | PUT ATTRIBUTES (replace) |
| DELETE /domain/resource/itemID | DELETE ATTRIBUTES (delete item) |
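The mapping in the table above can be sketched as a small dispatch function. This is only an illustration of the routing idea, not the plugin's actual code: the method name sdb_operation and the returned symbols are hypothetical, and the sketch ignores details such as format suffixes and query-string parsing.

```ruby
# Illustrative sketch only: mirrors the table above, not the plugin's real code.
# sdb_operation and the returned symbols are hypothetical names.
def sdb_operation(http_method, path)
  # "/domain/resource/itemID" => ["domain", "resource", "itemID"]
  domain, resource, item = path.split('/').reject { |segment| segment.empty? }

  operation =
    case http_method.to_s.upcase
    when 'GET'
      if item.nil?
        :query_by_attribute_values   # GET /domain/resource?attribute=value
      elsif item == 'query'
        :query_by_query_string       # GET /domain/resource/query?query_string
      else
        :get_attributes              # GET /domain/resource/itemID
      end
    when 'POST'   then :put_attributes_create
    when 'PUT'    then :put_attributes_replace
    when 'DELETE' then :delete_attributes
    else raise ArgumentError, "unsupported HTTP method: #{http_method}"
    end

  # "query" is a pseudo-item that selects the query operation, not a record id
  item = nil if operation == :query_by_query_string

  { :operation => operation, :domain => domain,
    :resource => resource, :item_id => item }
end

result = sdb_operation('GET', '/my_new_domain/posts/123')
result[:operation] # => :get_attributes
```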
Special Attributes
AWS SDB Proxy handles a couple of special attributes transparently. Here is the complete list:
- id: Every record is automatically assigned a unique id using a hash function (see below)
- _resource: This attribute always holds the name of the Rails model the item belongs to. This way AWS SDB Proxy can store multiple Rails models within one SimpleDB domain
- created_at: The time the item was initially created (ISO 8601 format)
- updated_at: The time the item was last modified (ISO 8601 format)
ID Hashing
Record ids are generated by applying a SHA-512 hash function to the request body combined with a timestamp and a configurable salt (set in config/aws_sdb_proxy.yml). With a hash this large (512 bits), key collisions are extremely unlikely.
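As a rough illustration of this kind of id generation, the following sketch hashes a request body together with a timestamp and a salt using Ruby's standard Digest library. The helper name generate_record_id, the way the three inputs are concatenated, and the hex output format are assumptions made for this example; the plugin may combine them differently.

```ruby
require 'digest/sha2'

# Hypothetical helper: how the plugin actually combines body, timestamp,
# and salt is an assumption made for illustration.
def generate_record_id(request_body, salt, timestamp = Time.now)
  # Including the timestamp means identical bodies submitted at different
  # times still receive different ids.
  Digest::SHA512.hexdigest("#{request_body}#{timestamp.to_f}#{salt}")
end

id = generate_record_id('<post><title>Hello</title></post>', 'my-secret-salt')
id.length # => 128 (512 bits written as hexadecimal digits)
```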
Pros & Cons
Over at RubyForge, a couple of work-in-progress projects aim at providing an ActiveRecord adapter for SimpleDB. In theory, that would enable SimpleDB to become a drop-in replacement for a regular SQL RDBMS in any Rails project. Then again, a lot of jumping through hoops would be needed to make that happen. SimpleDB simply is not an RDBMS and is targeted at completely different usage patterns.
ActiveResource, on the other hand, was made exactly with data integration of remote web services in mind. Despite its somewhat limited feature set compared to ActiveRecord, I like the idea of pursuing this straightforward approach.
Getting Up and Running
To set up Amazon SimpleDB for Rails, follow these steps (I assume you have already created a Rails project and have your command line pointed at its directory):
- Sign up for Amazon SimpleDB services on your AWS account
- Install the aws-sdb Ruby Gem providing a basic Ruby API for SimpleDB:
  gem install aws-sdb
- Install the AWS SDB Proxy Plugin for Rails:
  script/plugin install http://rug-b.rubyforge.org/svn/aws_sdb_proxy
- Enter your Amazon Web Service credentials in the config/aws_sdb_proxy.yml file (optionally configure server ports and an individual salt used to generate primary keys with a hash function). Do this at least for the development environment.
- Either use an existing SimpleDB domain in your account (you can list your domains with rake aws_sdb:list_domains), or create a new one with:
  rake aws_sdb:create_domain DOMAIN=my_new_domain
- Start the AWS SDB Proxy server:
  rake aws_sdb:start_proxy_in_foreground
  This provides debug output on stdout (once you are confident with the configuration, you can use rake aws_sdb:start_proxy to start the server as a background daemon)
Using ActiveResource
To make a Rails model access SimpleDB, it must inherit from ActiveResource::Base. For the following examples we will use a Post model that could represent blog posts and thus create the following models/post.rb file:
class Post < ActiveResource::Base
  self.site = "http://localhost:8888" # Proxy host + port
  self.prefix = "/my_new_domain/" # SDB domain
end
This assumes that you run AWS SDB Proxy on localhost at port 8888 and use a SimpleDB domain named my_new_domain (adjust this according to the configuration you entered in config/aws_sdb_proxy.yml).
Let's fire up script/console and give it a spin.
Creating Records
>> p = Post.create(:title => 'My first SimpleDB post')
=> #<Post:0x198ceec @prefix_options={}, @attributes={"updated_at"=>
Sun Jan 20 00:42:43 UTC 2008, "title"=>"My first SimpleDB post",
"id"=>1081408...01005954, "created_at"=>Sun Jan 20 00:42:43 UTC 2008}>
As you can see, a new Post object gets created as usual. AWS SDB Proxy auto-assigns an id and the additional attributes created_at and updated_at, mimicking Rails' standard behaviour.
Note that we could have assigned any attributes we wanted to that Post; SimpleDB does not enforce a schema, so the proxy will happily accept any attributes we throw at it.
Note: Remember that all attributes will be coerced into strings for storage in SimpleDB. No matter what your original data type was, you will always get back a string representation of it when fetching records from SimpleDB.
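In practice this means converting values back yourself after fetching them. The following minimal sketch simulates the round trip with a plain hash instead of a live SimpleDB call; the attribute names views, published, and posted_at are made up for illustration.

```ruby
require 'time'

# Simulated round trip: every value becomes a string on its way into storage.
# The attribute names are hypothetical examples.
stored = {
  'views'     => 42.to_s,
  'published' => true.to_s,
  'posted_at' => Time.utc(2008, 1, 20, 0, 42, 43).iso8601
}

# Converting the strings back into typed Ruby values after fetching
views     = Integer(stored['views'])          # Integer() raises on malformed input
published = stored['published'] == 'true'
posted_at = Time.iso8601(stored['posted_at'])
```

Using Integer() rather than to_i is a deliberate choice here: it fails loudly on unexpected input instead of silently returning 0.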
Updating Records
Let's assign another attribute to the Post and save it to trigger an update operation:
>> p.body = 'Content is king'
=> "Content is king"
>> p.save
=> true
If you started AWS SDB Proxy in the foreground, you can watch it forward the save operation to SimpleDB.
Finding Records
ActiveResource offers a couple of ways to query for records:
# Find by record id
>> Post.find(p.id)
=> #<Post:0x18efef8 @prefix_options={}, @attributes={"updated_at"=>
Sun Jan 20 00:45:28 UTC 2008, "title"=>"My first SimpleDB post",
"body"=>"Content is king", "id"=>1081408...01005954, "created_at"=>
Sun Jan 20 00:42:43 UTC 2008}>
# Find by attribute value(s)
>> Post.find(:first, :params => { :title => 'My first SimpleDB post' })
=> #<Post:0x18efef8 @prefix_options={}, @attributes={"updated_at"=>
Sun Jan 20 00:45:28 UTC 2008, "title"=>"My first SimpleDB post",
"body"=>"Content is king", "id"=>1081408...01005954, "created_at"=>
Sun Jan 20 00:42:43 UTC 2008}>
The first form queries for a single Post with a given id, whereas the second form queries for records with exact matches on every given attribute (:first tells the find method to return only the first of those).
SimpleDB offers more sophisticated query operations than ActiveResource, including lexicographical comparisons, intersection and union. You can pass in native SimpleDB query syntax using this form of find:
# Find by SimpleDB Query
>> Post.find(:all, :from => :query, :params => "['title' starts-with 'My']")
=> #<Post:0x18efef8 @prefix_options={}, @attributes={"updated_at"=>
Sun Jan 20 00:45:28 UTC 2008, "title"=>"My first SimpleDB post",
"body"=>"Content is king", "id"=>1081408...01005954, "created_at"=>
Sun Jan 20 00:42:43 UTC 2008}>
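Writing query strings by hand gets error-prone once values contain quotes or several predicates are combined. The sketch below shows one way to assemble them. The helper names sdb_quote, sdb_predicate, and sdb_pad are hypothetical, and the backslash escaping of quotes is an assumption about the SimpleDB query syntax that you should check against the SimpleDB documentation. Because SimpleDB compares values lexicographically, numbers are zero-padded to a fixed width so that string order matches numeric order.

```ruby
# Hypothetical helpers for composing SimpleDB query strings.
# Assumption: single quotes and backslashes inside values are backslash-escaped.
def sdb_quote(value)
  "'" + value.to_s.gsub(/['\\]/) { |char| "\\" + char } + "'"
end

# Builds one bracketed predicate, e.g. ['title' starts-with 'My']
def sdb_predicate(attribute, operator, value)
  "[#{sdb_quote(attribute)} #{operator} #{sdb_quote(value)}]"
end

# Zero-pads a non-negative integer so string comparison matches numeric order
def sdb_pad(number, width = 10)
  number.to_s.rjust(width, '0')
end

query = [
  sdb_predicate('title', 'starts-with', 'My'),
  sdb_predicate('views', '>', sdb_pad(42))
].join(' intersection ')
# query is now:
#   ['title' starts-with 'My'] intersection ['views' > '0000000042']

# The string could then be passed to the query-based find shown above:
#   Post.find(:all, :from => :query, :params => query)
```

Note that padding only helps if the values were also stored in padded form; the views attribute is a made-up example.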
Deleting Records
To complete the life-cycle of our Post, we can finally delete it from SimpleDB:
>> p.destroy
=> true
Resources
- Ruby on Rails
- Amazon SimpleDB (Amazon SDB)
- Aws-sdb Gem at RubyForge
- AWS SDB Proxy Plugin for Ruby on Rails
- Inside GL Networks, the author's tech blog: resources on AWS SDB Proxy
Martin Rehfeld is passionate about Ruby on Rails. He has published several Rails plugins and regularly gives talks at Rails-related events. If you like Martin's work, consider recommending him on Working With Rails.