Scalable Media Hosting with Amazon S3

Why slow your web server down by hosting media files? Craig Noeldner and AWS Evangelist Mike Culver show how to configure your domain provider to use Amazon S3 for simple, scalable media hosting.

Details

Submitted By: Craig@AWS
AWS Products Used: Amazon S3
Created On: November 28, 2007 12:25 AM GMT
Last Updated: September 21, 2008 10:10 PM GMT

By Craig Noeldner and Mike Culver, Amazon Web Services

Scenario: Imagine you have a small web site with big potential. You’re currently using a reasonably-priced web hosting provider that provides a good value for the amount of traffic you normally receive. Perhaps you’ve gone one step further and are hosting your site on a dedicated server. However, your site has caught the attention of the blogosphere and you’re about to get much more traffic than you can handle in your current web hosting setup.

What are you going to do?

Knowing how to scale your web site can mean the difference between watching your idea take off or take a dive. A common technique for scaling a web site is to use a different server to host media files like images, videos, and audio files. This distributes the traffic and bandwidth load between hosts and allows the primary web server to focus on delivering web pages and server-side processing, rather than serving up 5MB audio files (or even 100MB videos).

If you don’t want to set up, configure, and maintain a few extra servers just for hosting your media files, then use Amazon S3. Amazon S3 is storage for the Internet and gives any developer access to the same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of web sites.

This tutorial walks through the steps necessary for hosting media files for your web site using Amazon S3. We’ll use a domain we’ve already registered, webscalecomputing.info, to set up a new sub-domain, media.webscalecomputing.info, that will host the images, videos, and audio files in Amazon S3.

While we won’t go into any programming details for using Amazon S3, you’ll need to have a basic understanding of web networking and DNS to read this article. (Or, you’ll need enough background to translate the concepts to your own hosting provider.)

More on Amazon S3

Amazon S3 provides a simple web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web. Generally, software developers use Amazon S3 in their applications that need the same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of web sites.

You can always improve your web site performance by moving your media files from your main web server. This could be as simple as creating a sub-domain that points to a host that serves your media files. Of course, you still have to worry about the typical heavy-lifting for any type of hosting, such as:

  • How much traffic will this setup accommodate? What happens if I get more traffic than it can handle?
  • What happens if the host goes down?
  • How do I back up the files so they’re not lost?
  • How much am I paying for idle capacity?

Amazon S3 provides answers to those questions, without the need for worrying about the pesky details of, well, implementing them.

The web services interface is simple enough that you can retrieve data using a URL, so it’s well-suited for basic web hosting tasks, like serving up media files.

The pricing for Amazon S3 is on a pay-as-you-go basis, so there is no minimum fee. This means you don’t have to invest in a large amount of hosting infrastructure or services in order to ensure that your web site handles the occasional traffic spike.

Use the AWS Simple Monthly Calculator to estimate your monthly bill.

http://calculator.s3.amazonaws.com/calc5.html
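For a rough feel of how a pay-as-you-go bill adds up, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the three rate fields and the numbers in example_rates are placeholders made up for this sketch, not Amazon's prices, so substitute the current figures from the Amazon S3 pricing page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Unit prices in US dollars; the caller supplies the real values."""
    storage_per_gb_month: float
    transfer_out_per_gb: float
    per_10k_get_requests: float


def estimate_monthly_bill(storage_gb, transfer_out_gb, get_requests, rates):
    """Return a rough monthly charge: storage + outbound transfer + requests."""
    storage = storage_gb * rates.storage_per_gb_month
    transfer = transfer_out_gb * rates.transfer_out_per_gb
    requests = (get_requests / 10_000) * rates.per_10k_get_requests
    return storage + transfer + requests


# Placeholder rates for illustration only; they are NOT Amazon's prices.
example_rates = Rates(storage_per_gb_month=0.10,
                      transfer_out_per_gb=0.20,
                      per_10k_get_requests=0.01)

print(estimate_monthly_bill(10, 100, 1_000_000, example_rates))
```

Because nothing is charged when nothing is stored or served, the estimate drops to zero for an idle site, which is the point of the pay-as-you-go model.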

Amazon S3 in Action

Blue Origin is one small company with a big idea that successfully scaled its web site using Amazon S3. On January 2, 2007, the company posted information and videos on its web site about a test launch for a new vertical take-off, vertical-landing vehicle. Within the next day, the news was covered by both Slashdot and Boing Boing, sending a tremendous amount of traffic to its web site. With its media files stored in Amazon S3, it was able to instantly scale and handle 3.5 million requests and 758 GB of bandwidth in a single day.

Had the company hosted the web site completely on one of its internal servers, the traffic on January 4 would have overwhelmed its system capacity. If it had used a basic hosting package from a popular provider, it would have overwhelmed that service or, even worse, exceeded the maximum allowed bandwidth for the month and incurred massive overage fees.

Blue Origin’s total charge for Amazon S3 in January? Just over $300.

SmugMug, www.smugmug.com, is another company that’s using Amazon S3 for hosting its media files. After 12 months, they’ve saved almost $1M.

Now, let’s go through the steps of hosting your media files on Amazon S3, like Blue Origin.

Signing up for Amazon S3

If you haven’t already, sign up for Amazon S3 at http://aws.amazon.com/s3. After signing up for Amazon S3, you’ll have two access identifiers needed for uploading your media files:

  • Access Key ID
  • Secret Access Key

The Access Key ID is a public identifier, like a user name, that specifies a particular Amazon S3 account. The Secret Access Key is the private identifier, like a password, that ensures you’re the one making a request.

Important: Your Secret Access Key is a secret, and should be known only by you and AWS. You should never e-mail your Secret Access Key to anyone. It is important to keep your Secret Access Key confidential to protect your account.
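One common way to keep the Secret Access Key out of source code and e-mail is to read both identifiers from environment variables at run time. The sketch below assumes the variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, which many AWS tools use by convention; the function name is made up for this illustration.

```python
import os


def load_aws_credentials(environ=os.environ):
    """Read the two access identifiers from environment variables.

    Raises KeyError naming the first variable that is missing or empty.
    """
    values = []
    for name in ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"):
        value = environ.get(name, "")
        if not value:
            raise KeyError(name)
        values.append(value)
    return tuple(values)
```

A script would call load_aws_credentials() once at start-up and pass the pair to whatever upload tool or library it uses, so the secret never appears in the script itself.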

Uploading Your Media Files

Without going into too many details, Amazon S3 uses concepts of a bucket and object to store data. Buckets help organize a collection of objects, like how a folder might contain a list of files.

There are many tools available for working with Amazon S3 without having to write a software application. For this tutorial, we’ll use a plug-in for the Firefox browser, called S3Fox (https://addons.mozilla.org/en-US/firefox/addon/3247). You can also use one of the many code samples and tools available through the Amazon S3 Resource Center (http://aws.amazon.com/resources) or use a product built on Amazon S3 in the Solutions Catalog (http://solutions.amazonwebservices.com).

First, create a bucket in your Amazon S3 account whose name matches the domain you’ll use to host your media files. For our web site, we’ll create a bucket called “media.webscalecomputing.info”.

Important: Use lower-case letters only to name buckets that will be used in DNS redirects. DNS host names are not case-sensitive and are normally handled in lower case, so a bucket name that contains upper-case letters may not match the host name in an incoming request.
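A quick check before creating the bucket can catch names that will not work as host names. The following Python sketch is a simplified test, not the full set of Amazon S3 naming rules: it accepts names of 3 to 63 characters made of dot-separated labels that contain only lower-case letters, digits, and hyphens. The function name is hypothetical.

```python
import re

# One DNS label: lower-case letters, digits, and hyphens,
# beginning and ending with a letter or digit.
_LABEL = re.compile(r"^[a-z0-9]([a-z0-9-]*[a-z0-9])?$")


def is_dns_friendly_bucket_name(name):
    """Simplified check that a bucket name can double as a host name."""
    if not 3 <= len(name) <= 63:
        return False
    return all(_LABEL.match(label) for label in name.split("."))
```

With this check, "media.webscalecomputing.info" passes, while a name such as "Media.WebScaleComputing.info" is rejected because of its upper-case letters.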

Why use this specific bucket name? Amazon S3 has a virtual hosting feature that uses the host name of an inbound request to choose the bucket, so a request for media.webscalecomputing.info is served from the bucket of the same name. We’ll talk more about this feature in the next section when we configure our domain.

Next, add your media files to the new bucket in Amazon S3. Using the Firefox plug-in, it’s as simple as selecting the files on your local system, then clicking the transfer button.

Amazon S3 has a rich set of access privileges for both buckets and objects, so make sure that permissions are set on both the bucket and your objects to allow everyone access. The Firefox plug-in we’re using sets this for us using a dialog box.

All the media files are now accessible through a URL that points to Amazon S3. The basic URL syntax for S3 is http://<bucket_name>.s3.amazonaws.com/<object_name>, so the files we uploaded have the following URLs:

The simplest way to use Amazon S3 for media hosting is to simply update our web pages to point to these files. For example:

 <img src="http://media.webscalecomputing.info.s3.amazonaws.com/jeff-at-web20.jpg"/>
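If the pages are generated by a script, the URL syntax given above can be wrapped in a small helper. This is a minimal Python sketch; the function name is made up for illustration, and it assumes object names should be percent-encoded so that characters such as spaces are safe in a URL.

```python
from urllib.parse import quote


def s3_object_url(bucket, key):
    """Build http://<bucket_name>.s3.amazonaws.com/<object_name>."""
    # quote() percent-encodes characters such as spaces; "/" is left alone
    # so object names that look like paths stay readable.
    return "http://{}.s3.amazonaws.com/{}".format(bucket, quote(key))


print(s3_object_url("media.webscalecomputing.info", "jeff-at-web20.jpg"))
# prints http://media.webscalecomputing.info.s3.amazonaws.com/jeff-at-web20.jpg
```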

However, when people download our files, we want them to look like they’re coming from our domain, and not s3.amazonaws.com. If someone chooses to download our audio file, we want users to think it’s coming from our site. We’ll now set up our domain hosting so that the files are available through a URL under http://media.webscalecomputing.info/.

Setting up Your Domain

Since we already host our web site on www.webscalecomputing.info, we now want to create a sub-domain that points to the files located in Amazon S3. This is done by creating a CNAME record with our hosting provider.

Most popular web hosting companies will let you create a new CNAME record for your domain. For our hosting company, creating a new CNAME record consisted of logging into our account, then navigating through a few DNS configuration pages until we ended up at one that allows us to create a CNAME record.

To create the CNAME record, we specify an alias, “media”, and the domain it points to, “media.webscalecomputing.info.s3.amazonaws.com”.
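The relationship between the alias, the bucket name, and the CNAME target is easy to get wrong, so here is a minimal Python sketch that derives one from the other. It formats the result as a BIND-style zone-file line, which is an assumption about presentation only: many hosting providers use a web form with separate alias and target fields and add the trailing dot themselves. The function name is hypothetical.

```python
def cname_record(alias, domain):
    """Return a zone-file style CNAME line that points alias.domain at Amazon S3."""
    # The bucket must be named after the full host name, e.g.
    # "media" + "webscalecomputing.info" -> "media.webscalecomputing.info".
    bucket = "{}.{}".format(alias, domain)
    target = bucket + ".s3.amazonaws.com"
    # The trailing dot marks the target as a fully qualified domain name.
    return "{} IN CNAME {}.".format(alias, target)


print(cname_record("media", "webscalecomputing.info"))
# prints media IN CNAME media.webscalecomputing.info.s3.amazonaws.com.
```

Note that the target includes the bucket name in front of s3.amazonaws.com, matching the value used in this tutorial.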

With the CNAME record in place, the media files are now available through the following URLs:

Our web page can now reference the media files.

 <img src="http://media.webscalecomputing.info/jeff-at-web20.jpg"/>
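If existing pages already reference the s3.amazonaws.com form of the URLs, they can be switched to the new sub-domain with a simple text substitution. The Python sketch below relies on the fact that the bucket is named after the sub-domain; the function name is made up for illustration, and it rewrites only http:// URLs of the exact form used in this tutorial.

```python
def use_custom_media_host(html, bucket):
    """Rewrite S3-style URLs for the bucket to the matching custom host name."""
    s3_prefix = "http://{}.s3.amazonaws.com/".format(bucket)
    custom_prefix = "http://{}/".format(bucket)
    return html.replace(s3_prefix, custom_prefix)


page = '<img src="http://media.webscalecomputing.info.s3.amazonaws.com/jeff-at-web20.jpg"/>'
print(use_custom_media_host(page, "media.webscalecomputing.info"))
# prints <img src="http://media.webscalecomputing.info/jeff-at-web20.jpg"/>
```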

That’s it!

Automatically Copying Files to Amazon S3

There are also more ways you can use Amazon S3, including automatically copying files to Amazon S3. The Resource Center in the AWS Developer Connection web site has technical documentation, code samples, and other resources you can use to learn more about Amazon S3 and build your own applications to use the service. As always, the exact tutorial to read depends on the language you’re using, but here are a few possibilities.

Languages with tutorials in the Resource Center:

  • Java
  • C#
  • PHP
  • Ruby
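Whatever language you choose, an automatic copy usually starts by walking a local folder and deciding an object name and content type for each file. The Python sketch below covers only that planning step with the standard library; it does not talk to Amazon S3, and the function name and the fallback content type are assumptions made for this illustration. The actual upload would be handled by one of the libraries or tools from the Resource Center.

```python
import mimetypes
import os


def plan_upload(local_dir):
    """Return (local_path, object_key, content_type) for each file under local_dir."""
    plan = []
    for root, _dirs, files in os.walk(local_dir):
        for filename in sorted(files):
            local_path = os.path.join(root, filename)
            # Object keys use "/" regardless of the local path separator.
            key = os.path.relpath(local_path, local_dir).replace(os.sep, "/")
            # Fall back to a generic binary type when the extension is unknown.
            content_type = (mimetypes.guess_type(filename)[0]
                            or "application/octet-stream")
            plan.append((local_path, key, content_type))
    return plan
```

Setting the content type at upload time matters for media hosting, because browsers rely on it to decide whether to display an image or play an audio file.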

Learning More About Amazon S3

Why not host the entire web site in Amazon S3 and just use a domain provider to set up the appropriate CNAME records? Although it’s certainly possible, you may want to have a web server running to perform server-side processing on a script or to access a database. Amazon S3 is a storage solution, so it does not perform any server-side processing (but check out Amazon EC2 for information on scalable, virtual computing).

Of course, you don’t need a web site to use Amazon S3. Like Jeremy Zawodny, we use Amazon S3 to back up our home computers. (Craig pays just over $1 a month to back up his important files.)

Here are a few links for learning more about Amazon S3:

Comments

AWESOME!
Excellent article... I was up and running with S3 in less than 15 minutes!!!
Almatas on June 4, 2008 6:16 AM GMT
Easier than I thought
The main pages for S3 gave me the impression that I'd have to do some coding. But using the FFOX plugin and following your instructions made it easy. I'm using S3 to host some audios for an upcoming product launch and this seems to do the job. Thanks!
P. Koning on April 28, 2008 1:17 PM GMT
Great Aricle
I have had my S3 site up and running for a few months... couldn't figure out how to hide the amazon s3 link... stumbled upon this writing and found out about the Cname... worked perfectly. Also found the S3FF extension... love it !!
ikmyer on April 15, 2008 9:55 PM GMT
It taught me everything I needed to know
I had no idea how to use S3...didn't even really know what it was, but heard it was cool. I stumbled along, signed up, still didn't know what I was doing until I found this article. I had my images up and running on my test page in minutes and my entire site very shortly afterwards. Amazing product and fantastic article. BTW, S3 Organizer for FireFox is amazing. I'll be making a donation to them.
thehomesteadgroup on January 31, 2008 3:21 AM GMT
Just what I needed
This Tutorial outlined a solution to a problem I was having. I needed to host a variety of media files for my clients, but was constantly having to fight for space on my normal web host. I followed the tutorial and now i can host files on demand without deleting older files. This is faster than my web host to boot! Time will tell if the pricing is too good to be true!
silvertreeaustin on January 23, 2008 1:33 AM GMT
CNAME ?
According to what you're saying where you explain the reason for choosing your bucket name, it should be OK to setup your alias to point to S3.amazonaws.com Right?
tglps on January 12, 2008 9:53 PM GMT
Great, but doesn't work with EU buckets
This article was exactly the information I was looking for, but now I find it doesn't work for the new EU buckets because EU buckets can't have a . in their name for some reason. That's a great shame I feel. So to use virtual hosting your buckets have to be hosted in the US. I also found an article that said the new 'best-practice' policy for bucket names for both EU and US should not have dots . in them. So, I'm concerned that virtual hosting may not be supported for ever... Also I understood that using S3 for hosting media files may result in some failures as GET is not guaranteed to work?
pjcab on January 9, 2008 1:05 PM GMT
Great Job on this article!
I was able to follow it very easily!
akpassion on January 8, 2008 8:55 AM GMT
Great Head Start But...
I can't get my CNAME to work. I enter media on one side of the DNS editor on my dedicated hosting on godaddy, and media.mydomainname.com.s3.amazonaws.com on the other and media.mydomainname.com has still yet to be found after 24 hours? Maybe I'm too impatient. As soon as I can get it to work though, this was a great head start into leveraging the technology for my purposes.
kadardev on December 6, 2007 8:16 PM GMT
Glad to see this hear
I'm really glad to see this was added to the official list of articles at AWS. I read this a while back and it's truly the way to go when thinking about hosting your own media files. It just makes sense.
C. Grant on December 4, 2007 7:29 PM GMT
Great article
I was planning to use Amazon to host my media files, but this article made the process super smooth. I had the set up done in a few minutes. Now my files are backed up and accessible all in the same place.
Ana Lee on December 4, 2007 6:02 PM GMT
©2014, Amazon Web Services, Inc. or its affiliates. All rights reserved.