By Chris Moyer, AWS Community Developer
When I first dove into cloud computing, I searched all over the Internet for a feasible solution to my problems using existing software, all to no avail. Django seemed to be the closest match, but it didn't support using Amazon SimpleDB as a storage engine. I looked at making my own back end for Django but again was halted because of the unique relation-less nature of Amazon SimpleDB. It also seemed like Django, Pylons, and every other Python-based Web application framework out there was simply too complicated for my simple requirements.
So, I decided to develop a new framework that covered all of my needs: Meet Marajo.
Introducing Marajo
Marajo started off as a clone of the Google App Engine (GAE) framework to be run in Amazon Elastic Compute Cloud (Amazon EC2). I then realized that there was a lot of functionality I could implement to make things easier. Eventually, Marajo evolved into a quick framework for spinning up new Web services. Marajo today is similar to the GAE framework but uses power that can only be achieved with Amazon EC2 (and it's much easier to use).
At its core, Marajo uses the familiar model-view-controller (MVC) pattern. All three of these sections run on the same server, which makes the system slower and less efficient than it could be but also makes it quick to develop on. It uses Jinja, a template language whose syntax closely follows the template style used by Django and GAE, which makes transitioning from Django or GAE to Marajo quick and easy.
Figure 1. Marajo overview
As you can see from Figure 1, Marajo handles the Application, Presentation, and Database layers for you. This allows you to focus on code and develop your interface quickly without worrying about Amazon Web Services (AWS)-specific details.
Initializing Your Environment
When you first start, you'll want to launch a new Ubuntu image from http://uec-images.ubuntu.com/releases/karmic/release/. Because I'm launching in us-east-1a, I've chosen to use ami-bb709dd2. I assume that you've already followed the steps to set up a key-pair and your boto configuration. To start your Amazon Machine Image (AMI), use the launch_instance command that boto provides:
$ launch_instance -a ami-bb709dd2
This command then asks you a series of questions about where exactly you want to put the instance, what security group it should go in, and the key-pair you want to use. You also need to make sure that the security group you choose is open on port 80 to the world. Next, log in to the newly created instance using Secure Shell (SSH) and your key-pair:
$ ssh -i /path/to/your/key.pem ubuntu@ec2-host-name
Installing Your Software
Start by installing boto from Subversion. First, check out a copy of boto locally into your local directory:
$ apt-get install python-setuptools
$ cd /usr/local/
$ svn co http://boto.googlecode.com/svn/trunk boto
$ cd boto
$ python setup.py develop
Then, set it up as a pyAMI instance by creating a special /etc/rc.local file and some special scripts in your /root directory:
#!/bin/sh -e
# File: /etc/rc.local
# execute firstboot.sh only once
# Note that /mnt only stays with us
# until we're re-bundled, so this is a
# safe place to store this flag
if [ ! -e /mnt/firstboot_done ]; then
if [ -e /root/firstboot.sh ]; then
/root/firstboot.sh
fi
touch /mnt/firstboot_done
fi
# We run startup regardless of whether we've been
# booted before or not, this lets us
# schedule things to be only run on re-boot
if [ -e /root/startup.sh ]; then
/root/startup.sh
fi
exit 0
Doing so ensures that the firstboot.sh script is run the first time this instance is run. You'll use this script to make sure the ssh key is regenerated every time a new instance is launched. Because you're not making this image public, it isn't a big deal; but do it anyway, because you should always be concerned about security.
#!/bin/bash
# File: /root/firstboot.sh
# Regenerate the ssh host key
rm -f /etc/ssh/ssh_host_*_key*
ssh-keygen -f /etc/ssh/ssh_host_rsa_key -t rsa -N '' | logger -s -t "ec2"
ssh-keygen -f /etc/ssh/ssh_host_dsa_key -t dsa -N '' | logger -s -t "ec2"
# This allows user to get host keys securely through console log
echo | logger -s -t "ec2"
echo | logger -s -t "ec2"
echo "#############################################################" \
| logger -s -t "ec2"
echo "-----BEGIN SSH HOST KEY FINGERPRINTS-----" | logger -s -t "ec2"
ssh-keygen -l -f /etc/ssh/ssh_host_rsa_key.pub | logger -s -t "ec2"
ssh-keygen -l -f /etc/ssh/ssh_host_dsa_key.pub | logger -s -t "ec2"
echo "-----END SSH HOST KEY FINGERPRINTS-----" | logger -s -t "ec2"
echo "#############################################################" \
| logger -s -t "ec2"
update-motd
depmod -a
/usr/bin/python /usr/local/boto/boto/pyami/bootstrap.py
exit 0
Finally, set up a script to automatically update your system on reboot. You do this in /root/startup.sh:
#!/bin/bash
# File: /root/startup.sh
# Things to run just after the boot process is finished
# On reboot or first boot
#
# Update local packages
apt-get -y update
apt-get -y upgrade
# Update boto, marajo, and botoweb
cd /usr/local/boto;svn up
cd /usr/local/marajo;svn up
cd /usr/local/botoweb;hg pull -u
/usr/bin/python /usr/local/boto/boto/pyami/startup.py
exit 0
Marajo mostly installs itself, but unfortunately, Jinja2 has some features in development that you need. So, install Jinja before installing Marajo:
$ easy_install -U jinja2==dev
Next, download Marajo from Subversion and install it:
$ svn co http://marajo.googlecode.com/svn/trunk marajo
$ cd marajo
$ python setup.py install
Installing Apache
When you're logged in, install Apache 2 and configure it to work with your system. As with most systems, you'll want to use apt-get and install it normally:
$ apt-get install apache2
You also want to configure and install mod_proxy and mod_proxy_balancer:
$ apt-get install mod_proxy mod_proxy_balancer
$ a2enmod proxy_balancer
Next, configure Apache to use your blog application as its proxy, and point the directory to your Web root. Modify the default vhost file located at /etc/apache2/sites-available/default to the following:
NameVirtualHost *:80
<VirtualHost *:80>
ProxyRequests Off
<Proxy *>
AddDefaultCharset off
Order deny,allow
Allow from all
</Proxy>
<Proxy balancer://blog>
BalancerMember http://127.0.0.1:8080
BalancerMember http://127.0.0.1:8081
BalancerMember http://127.0.0.1:8082
</Proxy>
<Directory "/usr/local/blog/www">
Options Indexes MultiViews FollowSymLinks
AllowOverride None
Order allow,deny
Allow from all
</Directory>
DocumentRoot /usr/local/blog/www
ProxyPass /api/ balancer://blog/
ProxyPassReverse /api/ balancer://blog/
ErrorLog /var/log/apache2/error.log
</VirtualHost>
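To make the balancer block above more concrete, here is a minimal Python sketch of the idea: requests under /api/ are handed to the three local back ends in turn, and everything else is served from the document root. This is only an illustration of simple rotation. The scheduling that mod_proxy_balancer actually performs is configurable and more sophisticated, and the names BACKENDS, pick_backend, and rewrite are hypothetical.

```python
from itertools import cycle

# The three local Marajo processes listed as BalancerMember entries.
BACKENDS = [
    "http://127.0.0.1:8080",
    "http://127.0.0.1:8081",
    "http://127.0.0.1:8082",
]

# Endless iterator that yields the back ends one after another.
_rotation = cycle(BACKENDS)


def pick_backend():
    """Return the next back end in simple rotation (illustrative only)."""
    return next(_rotation)


def rewrite(path, prefix="/api/"):
    """Map a public /api/ path to a back-end URL, roughly as ProxyPass does.

    Paths outside the prefix return None, meaning they would be served
    as static files from the DocumentRoot instead.
    """
    if not path.startswith(prefix):
        return None
    return pick_backend() + "/" + path[len(prefix):]
```

With this sketch, three consecutive calls to rewrite("/api/posts") land on ports 8080, 8081, and 8082 in that order, and the fourth call wraps around to 8080.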
Initializing Your Application
Marajo is set up to use four separate directories for each part of an application. Start by creating a directory called /usr/local/blog. In that directory, you create the four directories that house each of the major parts of your application:
handlers. This directory houses the individual Web Server Gateway Interface (WSGI) handlers. These handlers deal with any special logic that needs to be overridden. For the most part, every handler will extend marajo.appengine.handlers.db.DBHandler.
resources. This directory houses all of your persistent data storage. For your blog application, it stores your Post and Comment objects. Marajo already handles the User object for you, which is located at marajo.appengine.api.users.User. Also, a user handler is available to you at marajo.appengine.handlers.user_handler.UserHandler that ensures that you only let users modify their own object, not their authorization group.
static. This directory holds all of your static HTML, JavaScript, Cascading Style Sheet (CSS), and other media.
templates. This directory holds all of your Jinja2 templates for how to view the data. The template mapper also allows you to create sub-directories that specify the content type requested. Figure 2 indicates how the mapper handles these situations.
Figure 2. Template Mapper
Creating Your Resources
As with any GUI-over-database application, you start by defining your data structure. Figure 3 shows the artificial schema you'll be creating. Because Amazon SimpleDB is schema-less, you use the boto.sdb.db module to restrict and persist your objects.
Figure 3. DB schema
Now, let's take a look at how you go about creating your Post and Comment objects:
# File: resources/post.py
from boto.sdb.db.model import Model
from boto.sdb.db.property import *
from marajo.appengine.api.users import User

class Post(Model):
    """A Blog Post"""
    title = StringProperty(verbose_name="Title")
    tags = ListProperty(str, verbose_name="Tags")
    content = BlobProperty(verbose_name="Content")
    created_at = DateTimeProperty(auto_now_add=True)
    modified_at = DateTimeProperty(auto_now=True)
    created_by = ReferenceProperty(User, collection_name='created_posts')
    modified_by = ReferenceProperty(User, collection_name='modified_posts')
The first step is to extend boto's Model class. This base class automatically handles much of the conversion for you, so you don't have to worry about converting property values into lexicographically sortable strings for use within Amazon SimpleDB. Next, use the property classes to set up some properties for your object. You use several different properties to store each of your different property types:
StringProperty. Stores a string up to 1024 characters
ListProperty. Stores multiple strings (again, up to 1024 characters)
BlobProperty. Uses Amazon Simple Storage Service (Amazon S3) to store the actual contents of the property, so it's limited to 5 GB
DateTimeProperty. Stores a Python datetime object; set auto_now_add to automatically set it to utcnow() on creation and auto_now to make it automatically set anytime it's updated
ReferenceProperty. Stores a reference to another object; in this case, you're referencing the User object, and the collection_name attribute adds a reverse reference link automatically on the User object
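To see why the lexicographic conversion mentioned above matters, remember that Amazon SimpleDB stores and compares every value as a string. The following sketch shows one common way to make integers sort correctly as strings: shift them so they're never negative, then zero-pad them to a fixed width. This is a simplified illustration of the general technique, not boto's own code, and the names OFFSET, WIDTH, encode_int, and decode_int are hypothetical.

```python
OFFSET = 2 ** 31  # shift so that negative values become non-negative
WIDTH = 10        # enough digits for any shifted 32-bit value


def encode_int(value):
    """Encode a signed 32-bit integer as a fixed-width, sortable string."""
    if not -OFFSET <= value < OFFSET:
        raise ValueError("value out of 32-bit range")
    return str(value + OFFSET).zfill(WIDTH)


def decode_int(text):
    """Reverse encode_int and return the original integer."""
    return int(text) - OFFSET


# Plain string comparison gets numbers wrong: "10" sorts before "9".
naive = sorted(["9", "10"])

# The encoded strings sort in the same order as the numbers themselves.
numbers = [20, -5, 3]
in_order = [decode_int(s) for s in sorted(encode_int(n) for n in numbers)]
```

Because every encoded value has the same width, comparing two strings character by character gives the same result as comparing the two numbers.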
Next, create a comment object, which allows you to add a comment to your post:
# File: resources/comment.py
from boto.sdb.db.model import Model
from boto.sdb.db.property import *
from resources.post import Post

class Comment(Model):
    """A simple Comment object"""
    post = ReferenceProperty(Post, collection_name='comments')
    # This is just a string since we
    # don't require them to log in to post
    posted_by = StringProperty()
    content = BlobProperty()
    created_at = DateTimeProperty(auto_now_add=True)
Again, you're using the same type of process. Now, you have the two objects required for your application. Marajo already handles the User object for you, so you don't need to make that.
Creating Your Handlers
When working with Marajo, you rarely need to create your own handlers. A simple create, read, update, delete (CRUD) interface is automatically implemented for you, so it's easy just to configure and set everything up. You'll create one base handler to show the home template:
# File: handlers/__init__.py
from marajo.appengine.handlers import RequestHandler

class MainPage(RequestHandler):
    """Simply shows the main page"""
    def get(self):
        self.display("index.tmpl")
In addition, the basic DBHandler doesn't allow non-users to create anything, so extend it and override the post method:
# File: handlers/comments.py
from marajo.appengine.handlers.db import DBHandler
from marajo.exceptions import Unauthorized

class CommentHandler(DBHandler):
    """Override the POST to allow any user to do a create"""
    def post(self):
        """Save or update object to the DB"""
        obj = self.read()
        if obj:
            # Updating an existing comment still requires a logged-in user
            if not self.user:
                raise Unauthorized()
            obj = self.update(obj, self.request.POST)
        else:
            # Anyone may create a new comment
            obj = self.create(self.request.POST)
        return self.redirect("posts/%s" % obj.post.id)
Configuring Your Application
Configuration for a basic CRUD application is quite simple. The entire app.yaml file, which resides in your root folder, is written using YAML. The first section just defines a few simple variables that are accessible throughout your templates and handlers:
application: blog
auth_db: marajo_users
session_db: marajo_sessions
version: 1
Next, configure the handlers sub-section of the configuration. This sub-section maps the URL patterns to their respective handlers. The mapper allows you to specify either a handler, static_dir, or static_file directive; everything else is passed in to the handler as an argument. This is how things like the DBHandler can act to serve up any object depending on what you specify as the db_class argument.
handlers:
- url: /
  handler: handlers.MainPage
- url: /javascript
  static_dir: static/javascript
- url: /images
  static_dir: static/images
- url: /style
  static_dir: static/style
- url: /posts(.*)
  handler: marajo.appengine.handlers.db.DBHandler
  edit_template: viewPost.tmpl
  db_class: resources.post.Post
- url: /comments(.*)
  handler: handlers.CommentHandler
  db_class: resources.comment.Comment
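The mapping described above can be pictured with a short Python sketch. It is a hypothetical, simplified model of the idea, not Marajo's actual mapper: each URL pattern is treated as a regular expression, the first match wins, and every key other than url and handler is collected as an argument for the handler. The names ROUTES and resolve, and the choice to anchor each pattern at the end of the path, are assumptions made for illustration.

```python
import re

# Hypothetical routing table mirroring a few of the YAML entries above.
ROUTES = [
    {"url": "/", "handler": "handlers.MainPage"},
    {"url": "/posts(.*)",
     "handler": "marajo.appengine.handlers.db.DBHandler",
     "db_class": "resources.post.Post"},
    {"url": "/comments(.*)",
     "handler": "handlers.CommentHandler",
     "db_class": "resources.comment.Comment"},
]


def resolve(path):
    """Return (handler name, extra arguments, captured groups) for a path.

    Returns None when no pattern matches.
    """
    for route in ROUTES:
        match = re.match(route["url"] + "$", path)
        if match:
            # Everything except url and handler is passed to the handler.
            extra = {key: value for key, value in route.items()
                     if key not in ("url", "handler")}
            return route["handler"], extra, match.groups()
    return None
```

In this sketch, resolve("/posts/abc") selects the generic DBHandler and hands it db_class as an extra argument, which is how one handler class can serve different object types.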
Creating Your Templates
You only need a few basic templates to get started. Start with your menu template, which you'll add to the top of each page. This code uses the typical Jinja syntax to insert variables at specific places. Note that you have access to the current user object if the user is logged in; you can use this object to determine whether you need to provide the user with a login or logout option.
<!-- menu.tmpl -->
<div id="topmenu">
<ul class="left">
<li><a href="/">Home</a></li>
<li><a href="/posts">All Posts</a></li>
</ul>
<ul class="right">
{%if user%}
<li><a href="{{logout_url}}">Logout {{user.username}}</a></li>
{%else%}
<li><a href="{{login_url}}">Login</a></li>
{%endif%}
</ul>
<br style="clear: both;"/>
</div>
<!-- /menu.tmpl -->
It's generally a good idea to add the HTML comments to each template so that when you view the generated source, you can figure out where a specific element is coming from. Templates can be quite tricky to debug if you don't have some sort of reference to where each element was inserted.
Next, look at your index page. This one is relatively simple, but feel free to expand on it as needed.
{% extends 'base.tmpl' %}
{% block content %}
<h1>Hello World!</h1>
{% endblock %}
This template shows the basic concept of making a new regular page for a handler to display. Notice that you always start by extending base.tmpl so that you don't need to re-type all the HTML and menu code that's duplicated on every page. This base template defines blocks, which you can then override. Here, you override the content block to add your Hello World comment.
Running Your Application
At this point, you have a fully functioning application. You can simply navigate to your application root directory and run marajo_server.py. You can then navigate to http://your-server:8080 and view your entire application. You should see a login button in the upper right of the window. Marajo uses sessions, so the login happens via sessions, not basic HTTP authentication. You'll have to log in before you'll be able to post.
Of course, this is just a basic setup for a blog. Now, let's look into customizing your templates.
Creating Custom Templates
You're now ready to start building some custom templates for your application. There are three basic templates that you can override for the database handler.
The List Template
The list template is the default template that you see when you go to the URL without any arguments. This template is passed in a few arguments, the most notable of which is objects, which is an iterable object that allows you to query the objects that should be listed. Take a look at the list template for your post handler:
{% extends "base.tmpl" %}
{% block head %}
<link rel="stylesheet"
href="{{static_file('/style/blog.css')}}"
type='text/css'/>
{% endblock %}
{% block content %}
<!-- listPosts.tmpl -->
<div id="posts" class="box">
{% for obj in objects %}
<div class="post">
<h3>
<a href="{{action_href}}/{{obj.id}}"> {{obj.title}}</a>
<small>{{obj.created_at}}</small>
</h3>
<hr/>
<div class="attr content">{{obj.content}}</div>
<ul>
{% for tag in obj.tags %}
<li>{{tag}}</li>
{% endfor %}
</ul>
<br class="clear"/>
<br class="clear"/>
<br class="clear"/>
</div>
{% endfor %}
</div>
{% if user %}
<br class="clear"/>
<div class="box">
<form method="POST" class="post">
<h4>Title</h4>
<input type="text"
name="title"
style="width: 300px"/>
<hr/>
<h4>Content</h4>
<textarea name="content"
rows="20"
style="width: 600px;">
</textarea>
<br class="clear"/>
<hr/>
<h4>Tags</h4>
<textarea name="tags"
rows="10"
style="width: 600px;">
</textarea>
<br class="clear"/>
<input type="submit" value="Create Post"/>
</form>
</div>
{% endif %}
<!-- /listPosts.tmpl -->
{% endblock %}
Here, you introduce a new block called head, which allows you to insert tags into the HTML <head> tag. You've used this to link to a static file using the static_file function passed into your template. Using this function instead of just passing in the link directly allows you to serve these static files out of CloudFront or Amazon S3 if you configure that into your app.yaml file.
The next interesting block of code starts with the {% if user %} tag that follows the list of posts. This section only appears if the user is logged in, providing him or her with a mechanism to add an additional post to the blog. Because tags are multi-value, Marajo allows you to separate the strings by newlines, so you just use a <textarea> tag. You also don't have to add an action URL to your form, because you want to post directly back to the page you're already on.
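As a rough picture of what happens to the newline-separated tags, the following Python sketch turns the raw text of a <textarea> into a list of strings. It is an illustration only; how Marajo itself parses the field isn't shown in this article. The function name parse_tags is hypothetical, and skipping blank lines and duplicates is an assumption.

```python
def parse_tags(raw):
    """Split a newline-separated textarea value into a clean list of tags."""
    tags = []
    # splitlines() copes with both \n and the \r\n that browsers submit.
    for line in raw.splitlines():
        tag = line.strip()
        # Assumption: ignore blank lines and repeated tags.
        if tag and tag not in tags:
            tags.append(tag)
    return tags
```

For example, a submission containing "aws", a blank line, and "simpledb" on separate lines would become a two-item list that fits the ListProperty defined on the Post object.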
You next have to modify your posts handler configuration in app.yaml to add the list_template definition. Your new handler section should look like this:
- url: /posts(.*)
  handler: marajo.appengine.handlers.db.DBHandler
  db_class: resources.post.Post
  list_template: listPosts.tmpl
The Edit Template
The edit template is called every time the user goes to an object-specific URL. You can also think of this template as a view template: If the user is logged in, you can provide him or her with the form so the user can edit the post; if the user isn't logged in, you can simply display the post and allow him or her to add a comment.
{% extends "base.tmpl" %}
{% block head %}
<link rel="stylesheet"
href="{{static_file('/style/blog.css')}}"
type='text/css' />
{% endblock %}
{% block content %}
<!-- viewPost.tmpl -->
{% if user %}
{% include "editPost.tmpl" %}
{% else %}
{% include "displayPost.tmpl" %}
{% endif %}
<div id="comments" class="box">
{% for comment in obj.comments %}
<div class="comment">
<h4>{{comment.posted_by}}</h4>
<hr/>
<div class="attr content">
{{comment.content}}
</div>
<br class="clear"/>
</div>
{% endfor %}
</div>
{% if not user %}
<br class="clear"/>
<br class="clear"/>
<div class="box">
<form method="POST" action="/comments">
<input type="hidden"
name="post"
value="{{obj.id}}"/>
<label for="posted_by">Your Name: </label>
<input type="text"
style="width: 210px;"
name="posted_by"/>
<br/><br/>
<textarea name="content"
cols="40"
rows="20">
</textarea>
<br/>
<center>
<input type="submit"
value="Add Comment"/>
</center>
</form>
</div>
{% endif %}
<!-- /viewPost.tmpl -->
{% endblock %}
At the start of the content block, notice that you introduce a conditional statement, {% if user %}, which is true only when the user is logged in. You could also validate that the user is of a specific authorization group by using {% if user and user.has_auth_group("auth-group-name") %}, but for now, assume that you're the only user. The next directive indicates that you'll include a separate template called editPost.tmpl from the same directory you're currently in. This entire section essentially means that if the user is logged in, you'll show him or her one template; otherwise, you'll show the user another.
The next bit of code simply shows how you can iterate over the comments for your given post using the {% for comment in obj.comments %} block. It's important to note that as soon as you call this block of code, it triggers the query against Amazon SimpleDB, so expect that to take a little longer to render. You can also add filters or limits to this query by appending them, just as you would in handler code. Because you don't want to see any spam, change that query (this assumes your Comment object also has a numeric spam_chance property, which the Comment class defined earlier doesn't include):
{% for comment in obj.comments.filter("spam_chance <", 50) %}
Now, you'll only show comments with a spam chance of less than 50%. Because filter returns the query itself, you can also chain filter, order, and fetch together into one line. Also, add a limit to only show the last 10 comments ordered by date created in descending order:
{% for comment in obj.comments.filter("spam_chance <", 50).order("-created_at").fetch(10) %}
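The reason these calls can be chained is that each one returns the query object itself and nothing is actually fetched until the query is iterated. The following in-memory Python sketch illustrates that pattern. It is a stand-in written for this explanation, not boto's implementation: the class name ListQuery, the OPS table, and the use of plain dictionaries for comments are all assumptions.

```python
import operator

# Comparison operators the sketch understands in a filter expression.
OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "=": operator.eq}


class ListQuery:
    """In-memory stand-in for a chainable query (not boto's code)."""

    def __init__(self, items):
        self.items = list(items)
        self.filters = []
        self.sort_key = None
        self.limit = None

    def filter(self, expression, value):
        name, op = expression.split()
        self.filters.append((name, OPS[op], value))
        return self  # returning self is what makes chaining possible

    def order(self, key):
        self.sort_key = key
        return self

    def fetch(self, limit):
        self.limit = limit
        return self

    def __iter__(self):
        # The work happens here, when a for loop starts iterating.
        rows = [row for row in self.items
                if all(op(row[name], value)
                       for name, op, value in self.filters)]
        if self.sort_key:
            descending = self.sort_key.startswith("-")
            field = self.sort_key.lstrip("-")
            rows.sort(key=lambda row: row[field], reverse=descending)
        return iter(rows[:self.limit])
```

Iterating over ListQuery(comments).filter("spam_chance <", 50).order("-created_at").fetch(10) in this sketch mirrors the template expression above: only low-spam comments, newest first, at most ten of them.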
The next chunk of code uses logic to show a comment box only if the user isn't logged in. You're assuming that if you're logged in to the site, you're not adding comments to your own posts, so this helps clean up some things that you don't want to see. You're adding another box with a custom form in it that performs a POST operation on /comments. You set the post option to the current post's ID so the comment is automatically attached to this post, and you let the user fill in his or her name and a brief comment. When the user clicks Submit, he or she hits the comment handler that you previously set up, which redirects the user to this post, showing the comment the user just added.
The displayPost Template
Next, take a look at the displayPost template, which shows when the user isn't logged in:
<!-- displayPost.tmpl -->
<div id="posts" class="box">
<div class="post">
<h3>
{{obj.title}}
<small>{{obj.created_at}}</small>
</h3>
<hr/>
<div class="attr content">{{obj.content}}</div>
<ul>
{% for tag in obj.tags %}
<li>{{tag}}</li>
{% endfor %}
</ul>
<br class="clear"/>
</div>
</div>
<!-- /displayPost.tmpl -->
This template is similar to the comments section of the editPost template, where you simply show the details of the post in an HTML format. Now, take a look at the editPost template that will be shown only when the user is logged in. The only major changes you make here are to replace the simple display of the fields with their proper input types:
<!-- editPost.tmpl -->
<div id="posts" class="box">
<form method="POST" class="post">
<input type="text"
name="title"
value="{{obj.title}}"
style="width: 300px"/>
<small>{{obj.created_at}}</small>
<hr/>
<textarea name="content"
rows="20"
class="attr content"> {{obj.content}}</textarea>
<br class="clear"/>
<hr/>
<h4>Tags</h4>
<textarea name="tags"
rows="10" style="width: 600px;">
{%- for tag in obj.tags -%}
{{tag}}
{% endfor -%}
</textarea>
<br class="clear"/>
<input type="submit" value="Update Post"/>
</form>
</div>
<!-- /editPost.tmpl -->
Here, you use your form with a POST again, but you don't set the action URL, because you simply want to post to the current page. You've also set all the form's default values to the post's current values, and now you have a fully functional editing template for your user. Note that in the textarea for the tags, you use - inside the {% %} tags. These dashes are whitespace-control markers: a dash removes the white space before or after the tag it's attached to, so you can keep the template readable without the tags showing up spaced oddly. You don't strip out the white space before the {% endfor %} block, however, because you do need one newline after each tag.
Finally, you modify your post handler once again to add this template:
- url: /posts(.*)
  handler: marajo.appengine.handlers.db.DBHandler
  db_class: resources.post.Post
  list_template: listPosts.tmpl
  edit_template: viewPost.tmpl
Setting Up Your Application to Start at Boot
The last thing to do is modify your startup.sh script to automatically run your server whenever it's started. For purposes of this example, launch three copies of your application server. Simply add the following three lines right before the exit 0 line:
cd /usr/local/blog; marajo_server.py -p 8080
cd /usr/local/blog; marajo_server.py -p 8081
cd /usr/local/blog; marajo_server.py -p 8082
That's it! Your application is now set up to run on three ports, and your spam filter will start automatically in the background.
Bundle Your Image
Now, bundle the image using the simple bundle_image command that boto provides. This command is relatively verbose, so make sure to follow through the prompts. You'll need your public and private keys provided to you by Amazon when you first set up your account. You'll also need an Amazon S3 bucket to store the image on and a prefix to identify your image. Use of the bundle_image script is quite simple. Just make sure you run it from your local computer, not the instance you've just launched.
% bundle_image --help
Usage: bundle_image [options] instance-id [instance-id-2]
Options:
--version show program's version number and exit
-h, --help show this help message and exit
-b BUCKET, --bucket=BUCKET
Destination Bucket
-p PREFIX, --prefix=PREFIX
AMI Prefix
-k KEY_FILE, --key=KEY_FILE
Private Key File
-c CERT_FILE, --cert=CERT_FILE
Public Certificate File
-s SIZE, --size=SIZE AMI Size
-i SSH_KEY, --ssh-key=SSH_KEY
SSH Keyfile
-u UNAME, --user-name=UNAME
SSH Username
-n NAME, --name=NAME Name of Image
For an Ubuntu-based image, use the following command:
% bundle_image -b\
... -p <my_custom_identifier> \
... -k /path/to/my/key.pem \
... -c /path/to/my/cert.pem \
... -s 10240 \
... -i /path/to/my/ssh-key.pem \
... -u ubuntu \
... -n blog
This process may take up to an hour to finish, so now is a good time to take a break from all this coding. When the processing is finished, you'll be provided with your new AMI ID, which you'll use to launch all your new instances. Make sure you launch at least one copy of the instance before you terminate your development instance, so you're sure everything worked.
Creating the Proxy
The last step is to create and set up your proxy system. You're using Elastic Load Balancer for this, so let's use the elbadmin tool, also provided by boto.
% elbadmin -l 80,80,http create blog
% elbadmin enable us-east-1a
% elbadmin add blog <instance-id>
You'll probably want to launch a few of these instances and add each one to the load balancer. Also, make sure you set up a CNAME to the address provided to you by the elbadmin script so that you can point your users to something more readable. Your final deployment configuration should look something like Figure 4.
Figure 4. Marajo deployment
Summary
You should now be able to go directly to your proxy URL, click All Posts, and see something similar to Figure 5.
Figure 5. The All Posts page
Using Marajo is a great way to quickly bring up a new Web application without having to worry about doing a lot of coding. It provides you with sane defaults and the ability to override just about everything to create a custom system. You can find all of the code examples used in this article in the example directory for Marajo.