About This Sample
A Rails application running on a 'Server' EC2 instance takes an image that the user submits through the application's form and uploads it to Amazon S3. From the Rails controller, using the ActiveMessaging Rails plug-in, the application then puts a job message with details of the uploaded object into the 'todo' queue. One or more EC2 'Worker' instances poll the 'todo' queue for jobs, read the message, download and watermark the image, then upload the new image to Amazon S3. The details of this new image are added to the original job message, the new message is put into the 'done' SQS queue, and the original message in the 'todo' queue is deleted.
The Rails application retrieves messages at regular intervals from the 'done' queue with the ActiveMessaging 'poller' daemon. An ActiveRecord model encapsulates a message, and the ActiveMessaging plug-in saves all the SQS messages sent and received. The Rails application controller reads the sent and received messages from the database rather than directly from the 'done' queue and passes them to the view that displays jobs. This prevents a situation where one or many users refresh their view to see their submitted and completed jobs and each refresh or request makes new calls to Amazon SQS. With N users each making N refreshes, Amazon SQS usage would grow quadratically (N × N requests).
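The effect of reading from the database instead of the queue can be illustrated with a minimal sketch. The `CountingQueue` and `MessageStore` classes are hypothetical in-memory stand-ins for the 'done' queue and the ActiveRecord-backed table; they are not part of the sample application or of ActiveMessaging.

```ruby
# Hypothetical stand-in for the 'done' queue that counts how often it is called.
class CountingQueue
  attr_reader :receive_calls

  def initialize(messages)
    @messages = messages.dup
    @receive_calls = 0
  end

  # Returns and removes all pending messages, like one polling pass.
  def receive_all
    @receive_calls += 1
    pending = @messages
    @messages = []
    pending
  end
end

# Hypothetical stand-in for the database table the poller writes to.
class MessageStore
  def initialize
    @rows = []
  end

  def save(messages)
    @rows.concat(messages)
  end

  def all
    @rows.dup
  end
end

done_queue = CountingQueue.new(["job-1 done", "job-2 done"])
store = MessageStore.new

# The poller runs once per interval, regardless of how many users are watching.
store.save(done_queue.receive_all)

# 10 users each refresh the jobs view 10 times; every refresh reads the store only.
views = Array.new(10 * 10) { store.all }

puts done_queue.receive_calls # queue calls made by the poller
puts views.length             # page views served from the store
```

In this sketch the number of queue calls depends only on how often the poller runs, not on how many page views are served.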
Figure 1: Media Processing Pipeline with Amazon Web Services
It's worth noting that the format of the Amazon SQS messages in
this application is shared with that of the 'boto' library from Mitch
Garnaat. The Ruby YAML library is very helpful in working with these
RFC-822 compliant messages.
See Mitch Garnaat's Monster Muck Mashup - Mass Video Conversion Using AWS
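As a rough illustration of why the YAML library helps here: simple header-style "Name: value" lines also form a valid YAML mapping, so they can be loaded into a Hash. The field names below (`Bucket`, `InputKey`, `OriginalFileName`, `OutputKey`) are assumptions made for this sketch and may not match the fields the sample or boto actually uses.

```ruby
require "yaml"

# Field names are hypothetical; the sample's real message fields may differ.
raw_message = <<~MSG
  Bucket: my-bucket
  InputKey: uploads/photo.jpg
  OriginalFileName: photo.jpg
MSG

# Header-style "Name: value" lines parse as a YAML mapping (a Ruby Hash).
job = YAML.safe_load(raw_message)
puts job["Bucket"]

# A worker could add details of the new image and serialize the result again.
job["OutputKey"] = "watermarked/photo.jpg"
done_message = job.map { |name, value| "#{name}: #{value}\n" }.join
puts done_message
```

Values that contain colons or other YAML-significant characters would need quoting or a dedicated parser, so this shortcut only suits simple fields.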
Muck, Heavy Lifting
In this application the 'muck', or 'heavy lifting', is demonstrated by watermarking images. See the 'watermarker.rb' file in the root of the .ZIP file containing the sample application. Of course, your application might do any other kind of work. If there were intermediate steps between non-watermarked and watermarked, more Amazon SQS queues might be used. In this sample application there is no intermediate state, so the job goes directly from the 'todo' queue to the 'done' queue when the work is performed.
The amount of watermarking that can be done asynchronously, without affecting the performance of the Rails application at all, is increased simply by starting more 'Worker' EC2 instances. Amazon SQS ensures that if there is a job, one of the 'Workers' will pick it up and process it.
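The worker cycle described in this section can be sketched as follows. This is not the sample's watermarker.rb: plain Ruby arrays stand in for the 'todo' and 'done' SQS queues, the `watermark` method is a placeholder for the real download, watermark, and upload work, and the `InputKey` and `OutputKey` field names are assumptions.

```ruby
# In-memory arrays stand in for the 'todo' and 'done' SQS queues.
todo_queue = [{ "InputKey" => "uploads/a.jpg" }, { "InputKey" => "uploads/b.jpg" }]
done_queue = []

# Placeholder for the real work (download, watermark, upload).
def watermark(input_key)
  input_key.sub("uploads/", "watermarked/")
end

# Processes at most one job; returns false when the 'todo' queue is empty.
def process_next(todo_queue, done_queue)
  job = todo_queue.first
  return false if job.nil?

  output_key = watermark(job["InputKey"])
  done_queue << job.merge("OutputKey" => output_key) # new message goes to 'done'
  todo_queue.shift                                   # only then delete the original
  true
end

# Keep working until there are no jobs left.
loop { break unless process_next(todo_queue, done_queue) }
puts done_queue.inspect
```

Deleting the original message only after the new one has been queued means a worker that fails midway leaves the job in 'todo' for another worker to pick up.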
This code sample uses an SQLite3 database with the Rails application, which is neither scalable nor persistent. Using Amazon SimpleDB, the Rails application could be scaled by starting many instances of the Rails application, all of which would use Amazon SimpleDB for persisting the messages.
This sample application code is accompanied by a public AMI. It
employs a 'pull'-like mechanism for deployment instead of 'pushing'
(what is done with Capistrano).
See PJ Cabrera's Using Parameterized Launches to Customize Your AMIs
This application further demonstrates using public key encryption to protect the AWS keys that are passed to the Amazon EC2 instances at launch time. The corresponding private key is bundled into the AMI. When an instance of the AMI is launched, the private key is deleted before rc.local adds the public key of the keypair used to launch the instance to the SSH authorized_keys file.
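A minimal sketch of the encrypt-then-decrypt round trip, using Ruby's standard OpenSSL and Base64 libraries. It generates a throwaway key pair purely for illustration (the sample uses its own key pair), and the credential strings are placeholders, not real AWS keys.

```ruby
require "openssl"
require "base64"

# A throwaway key pair generated only for this example.
key_pair = OpenSSL::PKey::RSA.new(2048)
public_key = key_pair.public_key

# Placeholder credentials, not real AWS keys.
secret_text = "EXAMPLEACCESSKEYID00" + "exampleSecretAccessKey000000000000000000"

# Launch side: encrypt with the public key, then base 64 encode for the user data.
cipher_text = Base64.encode64(public_key.public_encrypt(secret_text))

# Instance side: decode, then decrypt with the matching private key.
plain_text = key_pair.private_decrypt(Base64.decode64(cipher_text))
puts plain_text == secret_text
```

Only the holder of the private key can recover the plain text, which is why the user data can carry the cipher text safely while the private key stays inside the AMI.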
The ec2-run-instances command reads the user data it associates with the instance from the configuration file given with the -f switch instead of from stdin. Using the configuration file works better from a shell because of the length and contents of the encrypted, base 64 encoded cipher text of the AWS keys.
Once the AWS keys are decrypted, the keys are put into the appropriate configuration files (broker.yml, amazon_s3.yml) and either the Rails 'script/server' and ActiveMessaging 'poller' are launched (if the 'server' keyword is present in the instance user data), or only the 'watermarker.rb' script is run (if the 'worker' keyword is present in the instance user data).
See the 'launch.rb' file in the root of the accompanying code sample.
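The role selection described above might look roughly like the sketch below. It is not the contents of launch.rb: the `commands_for` method, the rule that the keyword is the first word of the first line of the user data, and the exact command strings are all assumptions made for illustration.

```ruby
# Returns the commands an instance would start for the given user data.
# The matching rule and command strings are assumptions, not the sample's launch.rb.
def commands_for(user_data)
  role = user_data.lines.first.to_s.split.first
  case role
  when "server" then ["script/server", "script/poller run"]
  when "worker" then ["ruby watermarker.rb"]
  else raise ArgumentError, "unknown role: #{role.inspect}"
  end
end

puts commands_for("server\nmy-bucket\n").inspect
puts commands_for("worker\nmy-bucket\n").inspect
```

Because the same AMI reads its role at boot, one image can serve as either a 'Server' or a 'Worker' depending only on the user data it is launched with.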
You are signed up and active for Amazon S3, SQS, EC2
You can run the EC2 Command Line Tools (try running 'ec2-describe-instances')
You have followed the Amazon EC2 getting started guide (you will have a key called gsg-keypair, use this or your preferred keypair where you see <mykeypair> in the instructions below)
You have an empty bucket in Amazon S3 (create a new bucket if you need to)
Otherwise this code sample is meant to be run inside the cloud with its accompanying public AMI. You do not need to download the code. Continue with the section immediately below titled "Running the Sample".
Running the Sample
Create a file called server.cfg. Edit and save the file with the below two lines in it (replace <mybucket> with the name of an empty bucket you own):
Download the application's public key that will be used to encrypt your AWS access and secret access keys (or copy the URL and download it from your browser)
Encrypt a copy of your AWS keys with the aws-pipeline application's public key, base 64 encode it and append it to your server.cfg
Create a copy of server.cfg called worker.cfg
Edit worker.cfg and change the word 'server' on the first line to 'worker'. Save the file.
Create a security group for 'Servers' and allow traffic to port 80
ec2-add-group aws-pipeline -d "AWS Pipeline Instances"
ec2-authorize aws-pipeline -P tcp -p 80
Launch two 'Workers'
ec2-run-instances ami-a128cdc8 -g aws-pipeline -k <mykeypair> -f worker.cfg -n 2
Launch a single 'Server'
ec2-run-instances ami-a128cdc8 -g aws-pipeline -k <mykeypair> -f server.cfg
INSTANCE i-5edf2f37 ami-a128cdc8 pending mykeypair 0 m1.small 2008-01-11T19:28:45+0000
Wait 30 seconds and get the details of the fully booted-up instance using the InstanceID from the ec2-run-instances output (e.g. 'i-5edf2f37').
RESERVATION r-d40ce7bd 319268305561 default
INSTANCE i-5edf2f37 ami-a128cdc8 ec2-72-44-56-6.z-1.compute-1.amazonaws.com domU-12-31-38-00-39-F2.compute-1.internal running mykeypair 0 m1.small 2008-01-11T19:28:45+0000
Copy the public DNS name (ends with '.amazonaws.com') for the instance (e.g. 'ec2-72-44-56-6.z-1.compute-1.amazonaws.com'). Open it in your preferred browser.
Upload a JPEG to watermark
Refresh for Completed Job
Create a configuration file for 'Servers'.
curl -O https://s3.amazonaws.com/aws-pipeline/aws-pipeline_public.pem
(Substitute your AWS keys where you see <awsaccesskey> and <awssecretaccesskey>)
(The aws-pipeline EC2 AMI contains a corresponding private key to decrypt your AWS keys)
(You are trusting the owner of the AMI)
(Only you can SSH into the instances of this AMI that you will launch)
echo "<awsaccesskey><awssecretaccesskey>" | \
openssl rsautl -encrypt -inkey aws-pipeline_public.pem -pubin | \
openssl base64 >> server.cfg
Create a configuration file for 'Workers'.
cp server.cfg worker.cfg
Launch and connect to Amazon EC2 instances.
Running the Sample Code Locally
You will need the below Ruby gems. (Note that the RMagick gem requires ImageMagick be installed in your development environment.)
- rails (2.0.2)
- right_aws (1.7.1)
- aws-s3 (edge)
- daemons (edge)
- RMagick (edge)
Download the code sample to a directory of your choice.
Enter your AWS Access Key and Secret Access Key into the development section of both config/broker.yml and config/amazon_s3.yml.
Run the following in the directory to which you downloaded the .ZIP file.
script/poller run &
Open this link in your browser: http://localhost:3000/
+ April 30 2008
- updated for SQS 2.0.
- upgraded to Rails 2.0.2.
- watermarker.rb uses right_aws instead of sqs gem for SQS 2.0 support.
- updated to latest versions of attachment_fu and activemessaging plugins.
Please use this forum thread for submitting reviews, bugs, or discussion of this sample app: