Monster Muck Mashup - Mass Video Conversion Using AWS

Expert AWS developer Mitch Garnaat takes us through his Monster Muck Mashup application, which supercharges the process of converting video for his iPod. Mitch uses Amazon S3 for rock-solid video file storage, Amazon EC2 to rip through the video conversion, and Amazon SQS for messaging during the conversion carnage.


Submitted By: Craig@AWS
AWS Products Used: Amazon SQS, Amazon EC2, Amazon S3
Language(s): Python
Created On: March 28, 2007


by Mitch Garnaat

Over the past year or so, Amazon has been expanding its line of infrastructure web services. Amazon CEO Jeff Bezos likes to call this collection of services "muck," meaning that these kinds of services are difficult to build in a scalable manner. That's exactly what this article will focus on: combining three of these scalable services from AWS in an architecture that allows us to build robust, reliable, and scalable compute services. While the basic architecture described here could be applied to many different application areas, this article will focus on building a service to solve a specific problem that I run into regularly: video format conversion.

Like most people today, I have made the transition from film cameras to digital cameras. One of the cool things that many digital cameras today can do is shoot video in addition to still photos. So now I have hundreds of these videos, all in AVI format, sitting around on my hard drive. What I really want to be able to do is load these videos onto my video iPod so I can enjoy them and share them easily. The problem is that the iPod doesn't play AVI format videos. It wants its videos in MPEG4 format.

In this article, we're going to build a video conversion service using the AWS building blocks. This service accepts AVI format video files as an input and produces MPEG4 files as an output. Not only will this service be able to convert all of my videos, it could easily be scaled to handle mass video conversion for thousands of users or be used as a component in a larger media management application.

Before We Begin

The architecture that I describe in this article could be applied to any programming language, but, for my example, I'm going to use Python. There are two main reasons for this:

  1. Python has been my favorite programming language since the days of Python 0.9 and it's a great choice for mashing up various services to create new ones.
  2. I've already developed a Python language library called boto that provides interfaces into all three of the Amazon Web Services we are going to use in this article.

We're going to be taking advantage of boto on both the server side and the client side of this project, but if you are interested in building similar services in other programming languages, check out the AWS forums and Resource Center. There are lots of libraries available for many different languages.

Let's get started!

The Big Three

The three services we will focus on in this article are:

  • Amazon Elastic Compute Cloud (EC2) for scalable compute resources
  • Amazon Simple Storage Service (Amazon S3) for unlimited, reliable storage
  • Amazon Simple Queue Service (SQS) for reliable messaging and loose coupling

We could build a conversion service like the one I describe above without using any of these Amazon Web Services. Our little server sitting on the web may work for us, but what happens if our friends and family decide they want to use it? Or, heaven forbid, what happens if someone blogs about it, the right people find out, and our little service gets Digg'ed or Slashdot'ed? Where are we going to store all of that uploaded video? How are we going to handle the compute-intensive video conversion? How are we even going to provide the bandwidth required to serve the requests? We're not. We're building our service on these building blocks from AWS so we can end up with a service that is easy to construct, inexpensive to operate, and able to scale to meet virtually any demand.

Putting the Pieces Together

The diagram below shows the basic architecture of the service we are building.

Amazon S3 is the perfect place to store the video files to be converted as well as any output files generated by our conversion service. Amazon S3 is fast and reliable, and with it we will never have to worry about our service running out of disk space.

For the conversion instructions, we want a place where different clients can post work requests and know that those requests will be delivered to our service. Our service wants to be able to read one set of instructions at a time, in roughly the order in which they were stored. This ensures that work is done in a timely and fair manner. Fortunately for us, that's exactly what SQS provides. Think of it as e-mail (or more generally, messaging) for services. And again, we won't have to worry about scalability, availability, or reliability.
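
To make this concrete, here is a minimal sketch of that write/read/delete message cycle using boto's SQS interface. The queue name and message body are just illustrations, and exact module paths and method signatures may vary between boto versions.

# A minimal sketch of the SQS pattern described above (illustrative
# names; module paths may vary between boto versions).
from boto.sqs.connection import SQSConnection
from boto.sqs.message import Message

conn = SQSConnection()                  # credentials come from the environment
queue = conn.create_queue('vc-input')   # returns the existing queue if present

# A client posts a work request...
m = Message()
m.set_body('Bucket: myvideos\nInputKey: abc123\n')
queue.write(m)

# ...and the service later reads it, hiding it from other readers
# while it works, then deletes it once the work is done.
m = queue.read(visibility_timeout=30)
if m is not None:
    print m.get_body()
    queue.delete_message(m)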

Finally, we need a way to actually perform the video conversion. This is where EC2 comes in. EC2 provides elastic computing resources. With a single API call to EC2 I can create a brand new server to do my bidding. In fact, I can create dozens of them. And when the work is done, I can make them go away just as quickly and easily. No more trips to Fry's! Well, okay, maybe we will still find a reason to go to Fry's but we definitely won't have to go to buy servers for our conversion service. EC2 will take care of that for us.
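
As a sketch, that single API call looks something like this in boto (the AMI id below is a placeholder, and module paths may vary between boto versions):

# Creating, listing, and terminating servers with boto's EC2 interface.
from boto.ec2.connection import EC2Connection

conn = EC2Connection()  # credentials come from the environment
reservation = conn.run_instances('ami-12345678', min_count=1, max_count=10)
for instance in reservation.instances:
    print instance.id

# ...and when the work is done, make them go away just as easily.
conn.terminate_instances([i.id for i in reservation.instances])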

Building Our EC2 Image

One of the first things we need to do is to create a new EC2 image (called an AMI) that contains all of the software needed to create our video conversion service. Because this process is quite detailed and time-consuming (and because this article is already pretty darn long) we are going to cheat. I have already done all of the configuration necessary to build our conversion service and turned it into a publicly accessible AMI that anyone can access. So, we are going to skip over most of the nitty-gritty details of installing software, etc. If you really want to go through that, there are more detailed notes with the public AMI. (See the related documents below for a link to the public AMI.)

Building Our Conversion Service

Based on the architecture shown in the diagram above, the basic steps required in our conversion service are:

  1. Read a message from our input queue
  2. Based on the data in the message, retrieve the input file from Amazon S3 and store it locally in our EC2 instance
  3. Perform our video conversion processing, producing one or more output files
  4. Store the generated output files in Amazon S3
  5. Write a message to our output queue describing the work we just performed
  6. Delete the input message from the input queue

The boto library provides a framework for this type of service in a class called, appropriately enough, Service. The Service class takes care of all of the details of reading messages, retrieving and storing files in Amazon S3, writing messages, etc. It also handles many of the common types of errors that come up when dealing with distributed services. For details on the Service class, you can view the source code here. For this article, though, we are just going to leverage that class and focus our efforts on what we need to do to get our video conversion service up and running.
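
To give a feel for what the framework is doing on our behalf, here is a simplified, hypothetical sketch of that read/process/write loop; the real Service class adds error handling, retries, and the status fields described later in this article.

# Simplified, hypothetical sketch of the six steps above; the real
# boto Service class adds error handling, retries, and status fields.
import email
import os

def service_loop(input_queue, status_queue, bucket, working_dir, process_file):
    while True:
        m = input_queue.read(30)                      # 1. read an input message
        if m is None:
            break                                     # queue is empty; shut down
        fields = email.message_from_string(m.get_body())
        in_file = os.path.join(working_dir, fields['InputKey'])
        bucket.get_key(fields['InputKey']).get_contents_to_filename(in_file)  # 2.
        outputs = process_file(in_file, fields)       # 3. do the actual work
        for path, mime_type in outputs:               # 4. store outputs in S3
            out_key = bucket.new_key(os.path.basename(path))
            out_key.set_contents_from_filename(path)
        status_queue.write(m)                         # 5. write a status message
        input_queue.delete_message(m)                 # 6. delete the input message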

To keep things simple, I've also already created a subclass of the Service class in boto to perform the video conversion. It's called ConvertVideo and the source code is shown below. This class is also part of boto and can be found here.

from boto.services.service import Service
import os

class ConvertVideo(Service):

    # Maximum time (in seconds) we expect a single conversion to take.
    # Used as the visibility timeout when reading input messages from SQS.
    ProcessingTime = 30

    # The ffmpeg command line used for the conversion.  The two %s
    # placeholders are filled in with the input and output file paths.
    Command = """ffmpeg -y -i %s -f mov -r 29.97 -b 1200kb -mbd 2 -flags \
              +4mv+trell -aic 2 -cmp 2 -subcmp 2 -ar 48000 -ab 192 -s 320x240 \
              -vcodec mpeg4 -acodec aac %s"""

    def process_file(self, in_file_name, msg):
        # Build the ffmpeg command line, run it, then tell the Service
        # framework which output files we produced.
        out_file_name = os.path.join(self.working_dir, 'out.mov')
        command = self.Command % (in_file_name, out_file_name)
        os.system(command)
        return [(out_file_name, 'video/quicktime')]

Our ConvertVideo class subclasses the boto Service class. That means we can leverage all of the Service class code to handle messaging, etc. The only thing we need to do is define how the video conversion process works. We do that by overriding the process_file method of the Service class. This is the method that gets called within the Service framework when there is an input file that needs to be processed. The process_file method takes two arguments:

  • in_file_name - the fully qualified path to the input file to be processed. In our case, this will be an AVI format video file.
  • msg - the message read from the input queue representing the work to be done. Right now, we can ignore this message because we are always performing the same conversion.

There are also a couple of class variables defined:

  • ProcessingTime - defines the maximum amount of time we think it will take to process a file. This time matters because when we read an input message from SQS, we need to tell SQS how long to keep that message invisible to other readers of the queue. SQS calls this the visibility timeout. If the timeout is too short, other services reading from the same queue might read the same message we are reading and perform the conversion again. Because the services are idempotent, this won't cause any harm, but it is a waste of computing resources. If the timeout is too long and the service that read the message fails to process it successfully, the message remains invisible longer than necessary. Eventually the message will become visible in the queue again and will be read by another service, but to provide reasonable response times, we don't want the timeout to be longer than necessary.
  • Command - this is our command line for the call to ffmpeg to perform the conversion. The input file name and output file name have been parameterized so we can supply them at runtime.

The process_file method constructs the correct command line to run and then executes it using Python's os.system call. The boto Service class expects process_file to return a list of tuples, one per output file generated by the service. The first element of each tuple is the fully qualified path to the output file and the second element is the MIME type of the output file. Since our simple service produces only a single output file (the MPEG4 file), we return a list containing a single tuple.

Get the Message?

We have described how we are using SQS to help us scale our services and we have talked about the boto Service class handling the reading and writing of messages for us. But what do those messages actually look like? Here's an example input message for our ConvertVideo service.

Bucket: garnaat_fileit
InputKey: f84e4a20b571abc69baf2277d193e596
Date: Tue, 20 Feb 2007 17:21:21 GMT
OriginalFileName: MVI_3113.AVI
Size: 1126472

The basic message structure is very simple and should look familiar: it follows the same RFC-822 format used in mail messages and HTTP headers, so (as the short example after the list shows) it can be parsed with standard tools. The required fields are described below:

  • Bucket - the Amazon S3 bucket that contains input files and will be used to contain output files
  • InputKey - the key of the input file in Amazon S3. The combination of the bucket and key provides a fully qualified reference to the input document. The boto service framework uses the MD5 hash of the file as its key in Amazon S3. This is one of the ways we can guarantee that services are idempotent.
  • Date - the date and time that the input file was originally stored in Amazon S3
  • OriginalFileName - the original name of the input file
  • Size - the size in bytes of the input file
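
Because these messages follow RFC-822, we can parse them with nothing more than Python's standard library. Here is a small example using the message shown above:

# Parsing a service message with the standard library's email parser.
import email

body = """Bucket: garnaat_fileit
InputKey: f84e4a20b571abc69baf2277d193e596
Date: Tue, 20 Feb 2007 17:21:21 GMT
OriginalFileName: MVI_3113.AVI
Size: 1126472
"""

msg = email.message_from_string(body)
print msg['Bucket']            # garnaat_fileit
print msg['OriginalFileName']  # MVI_3113.AVI
print int(msg['Size'])         # 1126472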

Once our ConvertVideo service has completed processing a particular input message, it writes a message to the status queue describing the work that it just completed. That output message is shown below.

Bucket: garnaat_fileit
InputKey: f84e4a20b571abc69baf2277d193e596
Date: Tue, 20 Feb 2007 17:21:21 GMT
OriginalFileName: MVI_3113.AVI
Size: 1126472
OutputKey: e69e376be5af6f88f81d3e31adf27988;type=video/quicktime
Server: ConvertVideo
Host: domU-12-31-34-00-02-82
Service-Read: Wed, 21 Feb 2007 01:28:14 GMT
Service-Write: Wed, 21 Feb 2007 01:28:27 GMT

As you can see, this status message contains all of the fields from the original input message plus some additional fields added by the service, described below.

  • OutputKey - the Amazon S3 key and MIME type of the outputs of the service. This field can contain multiple entries, separated by commas.
  • Server - the name of the service that processed this message.
  • Host - the DNS name of the EC2 instance that performed the actual conversion
  • Service-Read - the date and time that the service read the input message
  • Service-Write - the date and time that the service wrote the status message

In a production environment, these output messages would be read from the output queue and persisted in log files or a database.
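
As a hypothetical sketch, such a consumer could drain the status queue into a CSV file like this (the queue and file names are just examples):

# Hypothetical status-queue consumer: drain messages into a CSV log.
import csv
import email
from boto.sqs.connection import SQSConnection

queue = SQSConnection().get_queue('vc-output')
writer = csv.writer(open('conversion_log.csv', 'w'))
writer.writerow(['OriginalFileName', 'Size', 'Service-Read', 'Service-Write'])

m = queue.read()
while m is not None:
    fields = email.message_from_string(m.get_body())
    writer.writerow([fields['OriginalFileName'], fields['Size'],
                     fields['Service-Read'], fields['Service-Write']])
    queue.delete_message(m)
    m = queue.read()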

Enough Muck! Let's Convert Some Video!

First of all, if you are still with me: congratulations! We are almost ready to put our service into action. As I mentioned earlier, I've already bundled the conversion service into a publicly available AMI (see the links at the end of this article). In addition to installing the necessary software to perform the video conversion, I also modified the rc.local file in that image so that it automatically starts our conversion service as soon as the instance boots up.

But wait! How does our instance know what service to start up? Or where to read messages from? Or whose AWS credentials to use? Well, that's where instance user data comes in. A relatively new feature added to EC2 allows us to pass arbitrary data to an instance when we launch it. This provides a great way to create very general-purpose images with little or no hardcoded data. The boto Service class takes advantage of the instance user data feature in EC2 to allow a variety of parameters to be passed to the service at instance creation time.
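
Here is a sketch of both sides of that handshake: the launcher passes a string of user data to run_instances, and code running on the new instance reads it back from EC2's standard instance metadata endpoint. The parameter format shown is just an illustration; the boto Service class defines its own.

# Launch side: pass parameters to the instance as user data.
from boto.ec2.connection import EC2Connection

params = 'module=boto.services.convertvideo\nclass=ConvertVideo\ninput_queue=vc-input'
EC2Connection().run_instances('ami-2eba5f47', user_data=params)

# Boot side: read the user data back from the metadata service.
import urllib
user_data = urllib.urlopen('http://169.254.169.254/latest/user-data').read()
print user_data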

Making it Work

Okay, now we are finally ready to use our super-scalable video conversion service. The first thing we need to do is submit some video files to be converted. We could provide a simple web upload page to submit the files, but for now we want to get a bunch of files up there as efficiently as possible, so we will use a command-line utility provided in the boto library. Let's assume that we have a bunch of AVI format video files sitting in a directory called movies in our home directory. Here's how we would submit those files to the video conversion service.

$ cd ~/boto
$ boto/services/submit_files.py -b myvideos -q vc-input ~/movies
...
50 files successfully submitted.
$

This command does a lot of work behind the scenes, so let's step through it. The -b option specifies the Amazon S3 bucket in which to store the input video files. The -q option specifies the SQS queue that will be used as the input queue for our service. The final argument is a fully qualified path either to a single file to submit to our service or to a directory; if we pass a directory, the submit_files command will submit every file in that directory.

For each file processed by the submit_files command, the file is stored in the specified Amazon S3 bucket using the file's MD5 hash as its key. In addition, for each file stored, a message is written to the specified queue. This message represents the work that needs to be performed on the file; in our case, that means the video conversion.
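
A hypothetical, stripped-down version of what submit_files.py does for each file might look like the following; the real utility adds option parsing, directory walking, and error handling.

# Hypothetical sketch of submitting one file: store it in S3 under its
# MD5 hash, then queue an RFC-822 style work message describing it.
import hashlib
import os
import time
from boto.s3.connection import S3Connection
from boto.sqs.connection import SQSConnection
from boto.sqs.message import Message

def submit_file(path, bucket_name, queue_name):
    md5 = hashlib.md5(open(path, 'rb').read()).hexdigest()

    bucket = S3Connection().create_bucket(bucket_name)
    bucket.new_key(md5).set_contents_from_filename(path)

    m = Message()
    m.set_body('Bucket: %s\nInputKey: %s\nDate: %s\n'
               'OriginalFileName: %s\nSize: %d\n'
               % (bucket_name, md5,
                  time.strftime('%a, %d %b %Y %H:%M:%S GMT', time.gmtime()),
                  os.path.basename(path), os.path.getsize(path)))
    SQSConnection().create_queue(queue_name).write(m)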

Now that we have stored our original video files in Amazon S3 and created messages in our video conversion service's input queue, we can fire up our conversion service. Again, we will leverage a command-line utility in boto to make this as easy as possible. Remember, earlier in this article we created and registered our video conversion service's AMI in EC2, so all we need to do now is create one or more instances of our service to process the messages in the input queue. To start with, let's create a single service instance to process our files.

$ cd ~/boto
$ boto/services/start_service.py -r -m boto.services.convertvideo \
-c ConvertVideo -a ami-2eba5f47 -i vc-input -o vc-output \
-e mitch@garnaat.com
$

Again, there's a lot happening here behind the scenes. First let's go through the arguments passed to the command.

  • -r: this option means that we are starting a remote service. The same script is used to start the service software on the EC2 instance itself, so this option tells the command whether it should be firing up an EC2 instance (as in our case) or starting the service software on that instance.
  • -m: this option specifies the Python module that contains our server class. The ConvertVideo class we created earlier resides in the module boto.services.convertvideo. You can put your services in any module you like, as long as you have configured your EC2 server instance so it can access the module.
  • -c: this option specifies the name of our Python server class. We called our class ConvertVideo.
  • -a: this option specifies the EC2 AMI id for our service. This is the value returned when you register the image with EC2.
  • -i: this option specifies the name of the SQS queue that will be read for input messages for our service. This should match the name given when we called submit_files.py.
  • -o: this option specifies the name of the SQS queue that will be used to store status messages for our service.
  • -e: this option is used to provide an e-mail address that will be notified when a service is started or stopped. This just provides an easy way to tell when your service has been instantiated and is ready to start processing input messages and also when it has completed all processing and is shutting down.

So, the net effect of this command is to:
  1. Process all of the command line arguments and construct a string that will be passed to the instances as UserData. This UserData contains everything the instance will need when it starts up.
  2. Start up a new instance of the AMI specified on the command line
  3. Once the new instance starts up, it will read the UserData passed to it
  4. Based on the UserData, it will load the appropriate Python class representing our service and create a new instance of that class. If the -e option was used on the command line the service will send an e-mail indicating that the service has started.
  5. The new instance of our service class will begin reading messages from the input queue specified in the UserData and will process the messages until the queue is empty
  6. The service will then terminate the EC2 instance in which it is running. Before doing so, if the -e option was used when starting the service, it will send an e-mail indicating that the service is shutting down.

Once the service has completed its work, we can grab the results. Here again we will leverage a command-line utility provided in boto to simplify the task.

$ python boto/services/get_results.py -q vc-output ~/movies
retrieving file: MVI_3110.mov
...
50 results successfully retrieved.
Minimum Processing Time: 2
Maximum Processing Time: 58
Average Processing Time: 17.820000
Elapsed Time: 896
Throughput: 3.348214 transactions / minute

This shows the kind of throughput we can expect from a single instance of our conversion service: 50 files in 896 seconds, or about 3.35 conversions per minute. But what about that scalability we talked about earlier? Let's make things a little more interesting. In our next test we will queue up 500 videos for conversion, and since we have queued up 10 times more work, let's create 10 times more servers and see how things go.

$ python boto/services/start_service.py -m boto.services.convertvideo \
-c ConvertVideo -r -a ami-2eba5f47 -e mitch@garnaat.com \
-i vidconv-input -o test-status -n 10
Server: boto.services.convertvideo.ConvertVideo - ami-2eba5f47 (Started)
Reservation r-b4bf5bdd contains the following instances:
i-10c32479
i-13c3247a
i-12c3247b
i-15c3247c
i-14c3247d
i-17c3247e
i-16c3247f
i-e9c32480
i-e8c32481
i-ebc32482

Now we will have 10 video conversion servers all reading messages from the same queue and processing the same set of work. In theory, that means the elapsed time to complete all of this processing should be about the same as it took a single server to process 50 files. Let's check on the results.

$ python boto/services/get_results.py -q test-status ~/movies
retrieving file: MVI_3110.mov
...
500 results successfully retrieved.
Minimum Processing Time: 2
Maximum Processing Time: 60
Average Processing Time: 17.794000
Elapsed Time: 928
Throughput: 32.327586 transactions / minute

Sure enough, the average processing time and elapsed time are almost exactly the same, but our overall throughput is roughly 10 times higher than in our previous example, which is exactly the sort of behavior we would expect and hope for.

Check Please!

We've created a framework for providing scalable services and shown some examples of how that framework can easily be ramped up to handle increasing demands. We've also shown that our approach scales in a very linear and predictable manner, exactly what we want to see. One important question remains, however: "How much does it cost?" We can answer that question pretty easily because the get_results.py command, in addition to retrieving and summarizing the results found in a status queue, also creates a CSV file called log.csv in the directory specified on the command line. By bringing that file into a spreadsheet program like Excel (or by loading it into a database) we can get all kinds of stats about our services. Let's use that information to total up our bill for converting the 500 videos.

Storage              2.5 GBytes                     $0.38/month
Transfer             2.5 GBytes                     $0.50
Messages             1,000                          $0.10
Compute Resources    8 instances for ~20 minutes    $0.80
Total                                               $1.78

A total of about $1.78 for converting 500 videos works out to a per-video cost of less than $0.004 ($1.78 / 500 ≈ $0.0036). Pretty impressive. And, unlike traditional computing infrastructure, which is a fixed cost no matter what your actual demand looks like, this infrastructure cost tracks your demand exactly.

Wrapping It Up

We've covered a lot of ground in this article. We've discussed the different Amazon Web Services involved, described a high-level architecture for combining those services into a scalable services framework, and shown the performance and cost metrics of a video conversion service built with that architecture.

But that's really only the beginning. There's a lot more we could do to make this services framework even more useful, such as:

  • Provide a browser interface for submitting videos for conversion. This would have to accept the POST'ed file submissions (and parameters) and then transfer the file to Amazon S3 and queue up a message to describe the work to be performed.
  • Extend the video conversion service itself to handle a wider range of input formats and conversions. The ffmpeg program is very powerful and we should take better advantage of it.
  • Load the status messages into a database so we can query about previous jobs and better track our service usage.
  • Come up with a strategy for dynamically managing the EC2 instances rather than starting them up manually.
  • Develop service support code in different languages. This article focused on my favorite language, Python, but since the main interface between consumers and producers in our architecture is RFC-822-style message headers, we could easily write services in any language and have them interoperate.
  • Lots, lots, more...

So get out there and produce your own scalable, reliable web services! AWS makes it easy.

Additional Resources

Mitch Garnaat is an independent software consultant living in Upstate New York. He has been designing and developing software for 20 years. For the past year, his focus has been on leveraging Amazon Web Services. He is the author of the open source boto library which provides a Python interface for an expanding set of Amazon Web Services and has been developing AWS-based applications for a variety of customers.