Category: Amazon EC2


On Condor and Grids

by Jeff Barr | on | in Amazon EC2, Conferences & User Groups |

There is lots of buzz about Hadoop and Amazon EC2, and of course there should be, given all the great projects such as the one at the New York Times, where they converted old articles into PDF files in short order at a very reasonable cost.

There's a second environment you should know about, although the buzz level is a bit lower. (That might change.) Condor is a scheduling application that is commonly used in HPC and grid applications. It can also be used to manage Hadoop grids, and it manages jobs in much the same manner as mainframes: you submit a job to Condor, along with metadata that describes the job's characteristics, and Condor then finds suitable resources to allocate for the job. Note that Condor and Hadoop attack these problems independently, with the result that they overlap in some ways while doing unrelated things in others.
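To make "metadata that describes the job's characteristics" concrete, here is a minimal, hypothetical Condor submit description file; the executable, file names, and resource requirements are placeholders, not drawn from any of the projects mentioned here:

```
# A minimal, hypothetical Condor submit description file. Condor
# matches the Requirements expression against machine ClassAds to
# find a suitable execution host for the job.
Universe     = vanilla
Executable   = analyze_articles
Arguments    = input.dat
Requirements = (Memory >= 1024) && (Arch == "X86_64")
Log          = job.log
Output       = job.out
Error        = job.err
Queue
```

Running condor_submit on this file hands the job to the scheduler, which queues it until a machine matching the Requirements expression is available.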

This week I attended Condor Week at the University of Wisconsin in Madison. Condor Week is an annual event that gives Condor collaborators and users the chance to exchange ideas and experiences, to learn about the latest research, to experience live demos, and to influence the project's short- and long-term research and development directions.

If you are interested in large-scale grid computing, this approach is worth a serious look. There are two active projects that implement Condor on Amazon EC2, and of course that's why this blog entry is being posted.

Cycle Computing offers Amazon EC2 plus Condor as an integrated platform, in addition to supporting other underlying computing resources. Their software automates Condor grid management, including monitoring, configuration, version control, usage tracking, and more. At the conference, Jason Stowe from Cycle Computing made a very strong case for using Amazon EC2 instead of a traditional grid environment. Jason's presentation is available for download at http://www.cs.wisc.edu/condor/CondorWeek2008/condor_presentations/stowe_cycle.pdf.

Red Hat's approach integrates EC2 directly into the Condor code base. The result is that an Amazon EC2 instance is the Condor job, and in that manner they are able to manage the entire life cycle of an EC2 instance. In some cases the entire Condor pool is running on EC2, and in other cases EC2 augments an existing pool. All of this work was done in collaboration between the University of Wisconsin (Jaeyoung Yoon, Fang Cao, and Jaime Frey) and Matt Farrellee from Red Hat. They plan to integrate Amazon S3 as a storage medium in the near future.

One thing seems certain: on-demand virtualization brightens the lights in Grid Computing City, because organizations that could not afford a grid suddenly find themselves with both affordable infrastructure and powerful tools to manage it.

— Mike

Animoto – Scaling Through Viral Growth

by Jeff Barr | on | in Amazon EC2 |

Animoto is a very neat Amazon-powered application. Built on top of Amazon EC2, S3, and SQS, the site allows you to upload a series of images. It then generates a unique, attractive, and entertaining music video using your own music or something selected from the royalty-free library on the site. Last week I spoke to a group of Computer Science and IT students at Utah Valley State College. Before leaving Seattle I spent some time downloading images from their athletics site. I then combined this with some Southern Surf Syndicate music from The Penetrators and ended up with this really nice video:

There’s a lot going on in the background. After the images and the music have been uploaded, proprietary algorithms analyze them and then render the final video. This can take an appreciable amount of time and requires a considerable amount of computing power.
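Animoto hasn't published its pipeline, but the EC2 + S3 + SQS combination suggests the familiar queue-of-render-jobs pattern. Here is a generic, hypothetical sketch of that pattern using boto3, the Python SDK; the queue URL, bucket, and file names are invented:

```python
# Not Animoto's code: a generic sketch of the S3 + SQS + EC2 worker
# pattern. Uploads land in S3, a message describing the render job
# goes onto a queue, and EC2 worker instances pull jobs from it.
import json
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/render-jobs"

# Front end: enqueue a render job once the uploads are complete.
sqs.send_message(
    QueueUrl=QUEUE_URL,
    MessageBody=json.dumps({"bucket": "uploads",
                            "images": ["a.jpg", "b.jpg"],
                            "track": "song.mp3"}),
)

# Worker (running on an EC2 instance): pull a job, render, delete it.
msgs = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=1)
for m in msgs.get("Messages", []):
    job = json.loads(m["Body"])
    # ...fetch the inputs from S3, run the analysis and render here...
    sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=m["ReceiptHandle"])
```

The nice property of this pattern, and presumably part of why Animoto could scale so fast, is that adding render capacity is just launching more workers against the same queue.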

Animoto co-founder and CEO Brad Jefferson stopped by Amazon HQ for a quick visit on Thursday. Earlier in the week we had seen their EC2 usage grow substantially and I was interested in learning more. Brad explained that they had introduced the Animoto Videos Facebook application about a month earlier and that it had done pretty well, with about 25,000 users signing up over the course of the month, with steady, linear growth.

The reaction from the Facebook community was positive, so the folks at Animoto decided to step it up a notch. They noticed that a significant portion of users who installed the app never made their first Animoto video, yet the application (as they themselves admit) relies heavily on the 'wow' factor of seeing your first Animoto video and wanting to share it with your friends. On Monday the team made a subtle but important change to their application: they auto-created a user's first Animoto video.

That did the trick!

They had 25,000 members on Monday, 50,000 on Tuesday, and 250,000 on Thursday. Their EC2 usage grew as well. For the last month or so they had been using between 50 and 100 instances. On Tuesday their usage peaked at around 400, Wednesday it was 900, and then 3400 instances as of Friday morning. Here’s a chart:

[Chart: Animoto EC2 instance usage]

We are really happy to see Animoto succeed and to be able to help them to scale up their user base and their application so quickly. I’m fairly certain that it would be difficult for them to get their hands on nearly 3500 compute nodes so quickly in any other way.

— Jeff;

Scalr – Scalable Web Sites with EC2

by Jeff Barr | on | in Amazon EC2 |

Dave Naffis of Intridea wrote to tell me that they have released Scalr in open source form. Scalr is a fully redundant, self-curing, self-hosting EC2 environment.

Using Scalr you can create a server farm using prebuilt AMIs for load balancing (either Pound or nginx), web servers, and databases. There's also a generic AMI that you can customize and use to host your actual application.

Scalr monitors the health of the entire server farm, ensuring that instances stay running and that load averages stay below a configurable threshold. If an instance crashes, another one of the proper type will be launched and added to the load balancer.
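To make the idea concrete, here is a minimal watchdog sketch written with boto3. This is not Scalr's implementation; the AMI ID is a placeholder and the load-balancer re-registration step is left as a comment:

```python
# A minimal watchdog sketch (not Scalr's code) of the
# recover-and-replace behavior described above.
import time
import boto3

ec2 = boto3.client("ec2")
FARM_AMI = "ami-12345678"  # hypothetical image for this farm role

def running_ids():
    """Return the set of instance IDs currently in the 'running' state."""
    result = ec2.describe_instances(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )
    return {i["InstanceId"]
            for r in result["Reservations"] for i in r["Instances"]}

expected = running_ids()
while True:
    alive = running_ids()
    for dead in expected - alive:
        print(f"{dead} is gone; launching a replacement")
        new = ec2.run_instances(ImageId=FARM_AMI, MinCount=1, MaxCount=1)
        expected.add(new["Instances"][0]["InstanceId"])
        expected.discard(dead)
        # ...re-register the replacement with the load balancer here...
    time.sleep(60)
```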

Download the code, take a look at the diagrams, or (always the last resort) read the installation instructions.

— Jeff;

New EC2 Features: Static IP Addresses, Availability Zones, and User Selectable Kernels

by Jeff Barr | on | in Amazon EC2 |

We just added three important new features to Amazon EC2: Elastic IP Addresses, Availability Zones, and User Selectable Kernels. The documentation, the WSDL, the AMI tools, and the command line tools have been revised to match and there’s a release note as well.

Read on to learn all about them…

The Elastic IP Addresses feature gives you more control of the IP addresses associated with your EC2 instances. Using this new feature, you use the AllocateAddress function to associate an IP address with your AWS account. Once allocated, the address remains attached to your account until released via the ReleaseAddress function. Separately, you can then point the address at any of your running EC2 instances using the AssociateAddress function. The association remains in place as long as the instance is running, or until you remove it with the DisassociateAddress function. Finally, the DescribeAddresses function will provide you with information about the IP addresses attached to your account and how they are mapped to your instances. Accounts can allocate up to 5 IP addresses to start; you can ask for more if you really need them. Addresses which you have allocated but not associated with an instance will cost you $0.01 per hour.
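For readers who prefer code, here is the full life cycle as a sketch using boto3, the Python SDK (a much later tool than this post, but its snake_case methods map one-to-one onto the functions named above). The instance ID is a placeholder, and note that VPC addresses use AllocationId/AssociationId rather than the raw public IP:

```python
# A minimal sketch of the Elastic IP life cycle with boto3.
import boto3

ec2 = boto3.client("ec2")

# AllocateAddress: attach a new public IP address to the account.
address = ec2.allocate_address()
public_ip = address["PublicIp"]

# AssociateAddress: point the address at a running instance
# (the instance ID here is hypothetical).
ec2.associate_address(PublicIp=public_ip, InstanceId="i-0123456789abcdef0")

# DescribeAddresses: list the account's addresses and their mappings.
for addr in ec2.describe_addresses()["Addresses"]:
    print(addr["PublicIp"], addr.get("InstanceId", "(unassociated)"))

# DisassociateAddress / ReleaseAddress: detach the address from the
# instance, then return it to the pool.
ec2.disassociate_address(PublicIp=public_ip)
ec2.release_address(PublicIp=public_ip)
```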

 

Availability Zones give you additional control of where your EC2 instances are run. We use a two-level model which consists of geographic regions broken down into logical zones. Each zone is designed in such a way that it is insulated from failures which might affect other zones within the region. By running your application across multiple zones within a region you can protect yourself from zone-level failures.

The new DescribeAvailabilityZones function returns a list of availability zones along with the status of each zone. The existing RunInstances function has been enhanced to accept an optional placement parameter. Passing the name of an availability zone will force EC2 to run the new instances in the named zone. If no parameter is supplied, EC2 will assign the instances to any available zone.
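A short boto3 sketch of both calls (again, a later SDK than this post, but the mapping to DescribeAvailabilityZones and RunInstances is direct; the AMI ID is a placeholder):

```python
# List the zones in the current region, then pin a launch to one of
# them instead of letting EC2 choose.
import boto3

ec2 = boto3.client("ec2")

# DescribeAvailabilityZones: each zone's name and current status.
zones = ec2.describe_availability_zones()["AvailabilityZones"]
for z in zones:
    print(z["ZoneName"], z["State"])

# RunInstances with a placement parameter: force the new instance
# into the first zone in the list.
ec2.run_instances(
    ImageId="ami-12345678",  # hypothetical AMI
    MinCount=1,
    MaxCount=1,
    InstanceType="m1.small",
    Placement={"AvailabilityZone": zones[0]["ZoneName"]},
)
```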

 

Finally, the User Selectable Kernels feature allows users to run a kernel other than the default EC2 kernel. Anyone can run a non-default kernel, but the ability to create new kernels is currently restricted to Amazon and select vendors. This feature introduces a new term, the AKI or Amazon Kernel Image. The AKI can be specified at instance launch time using another new parameter to RunInstances, or it can be attached to an AMI (Amazon Machine Image) as part of the image bundling process.
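In a boto3 sketch, the new launch-time parameter corresponds to KernelId (both IDs below are hypothetical placeholders):

```python
# Launching an instance with a specific kernel image via RunInstances.
import boto3

ec2 = boto3.client("ec2")
ec2.run_instances(
    ImageId="ami-12345678",   # hypothetical AMI
    KernelId="aki-12345678",  # hypothetical 2.6.18 AKI
    MinCount=1,
    MaxCount=1,
)
```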

We are also rolling out 32- and 64-bit versions of Linux kernel 2.6.18, all packaged up as AKIs and ready to run. And there's a new 32-bit Fedora Core 6 AMI and both 32- and 64-bit versions of Fedora Core 8.

 

The developers at RightScale are already supporting these new features in the free version of their RightScale platform. They've also assembled three very informative blog posts.

The first post covers DNS and Elastic IPs and how they come into play when upgrading a server. One sentence from this post really captures the essence of cloud computing as applied to the upgrade process:

The power of the cloud is that we don't need to touch our existing web server and risk causing damage during the upgrade process. Instead we launch a second web server and install the new release on it.

The second post reviews the process of setting up a fault-tolerant site using Availability Zones. This post describes two different ways to create a redundant architecture, with the ability to load balance traffic across zones or to fail over to a second zone when the first one fails. When that happens, redundancy can then be re-established by bringing another set of instances to life in yet another zone. As they note:

If you have never tried to set something like this up yourself, starting from renting colo space and purchasing bandwidth to buying and installing servers, you really can't appreciate the amount of capital expense, time, headache, and ongoing expense saved by EC2's features! And best of all, using RightScale it's just a couple of clicks away :-).

Finally, the third post announces the fact that they now support the new Elastic IP and Availability Zone features. You'll need to read the entire post, but they are pretty excited by the opportunities that this new set of features opens up:

What's really exciting is that the combination of Elastic IPs and Availability Zones brings cloud computing to a different level. In the above example, when the app servers get relaunched in a new zone, EC2 allows the elastic IPs that were associated with the app servers to be reassigned from the old servers in the failed zone to the new ones. So now traffic doesn't just get routed to new instances, it actually gets routed to a different datacenter. From the outside this may seem straightforward, but in reality the degree of engineering that is necessary to support this type of technical feature is quite staggering.

We’re looking forward to hearing from more developers and system architects as they engineer these new features into their systems. As always, drop me a note at awseditor@amazon.com if you have done something that you’d like us to cover in this blog.

— Jeff;

Increasing Your Amazon EC2 Instance Limit

by Jeff Barr | on | in Amazon EC2 |

We have simplified the process of requesting additional EC2 instances. You no longer need to call me at home or send a box of dog biscuits to Rufus.

You can now make a request by simply filling out the Request to Increase the Amazon EC2 Instance Limit form. We'll need to know a little bit about you, your application, and the number of instances that you need, and we'll take care of the rest.

As always, if you are doing something cool with EC2, we really want to hear about it! Write a blog post that we can link to, or simply send us an email at awseditor@amazon.com.

— Jeff;

What’s the Difference Between Amazon FPS and Amazon DevPay?

by Jeff Barr | on | in Amazon DevPay, Amazon EC2, Amazon FPS |

We've heard from a few folks that it's not clear what the difference is between some of the Amazon Web Services offerings. This is a very short post to try to clarify two services, plus a product feature. Like most short descriptions, I am short-changing the rich feature set of each offering. Visit aws.amazon.com for more information on each.

Using Amazon Flexible Payments Service (Amazon FPS), developers can accept payments on websites. It has several innovative features, including support for micropayments.

Amazon DevPay instruments two Amazon Web Services to enable a new sort of Software as a Service. Amazon DevPay supports applications built on Amazon S3 or Amazon EC2 by allowing you to resell applications built on top of one of these services. You determine the retail price, which is a mark-up above Amazon's base price. Customers pay for your application by paying Amazon. We deduct the base price plus a small commission, then deposit the rest into your Amazon account.
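As a toy illustration of that money flow (the retail price and commission rate below are hypothetical, not Amazon's actual fee schedule):

```python
# Hypothetical DevPay money flow for one instance-hour. The commission
# rate is invented for illustration; see the DevPay docs for real fees.
base_price = 0.10       # Amazon's base price per instance-hour
retail_price = 0.15     # the price you, the developer, set
commission_rate = 0.03  # hypothetical commission on the markup

markup = retail_price - base_price
commission = markup * commission_rate
developer_share = markup - commission

print(f"Customer pays Amazon:      ${retail_price:.2f}/hour")
print(f"Amazon keeps base + fee:   ${base_price + commission:.4f}/hour")
print(f"Deposited to your account: ${developer_share:.4f}/hour")
```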

Amazon EC2 Public AMIs (Amazon Machine Images) are not a service as such. Rather, these virtual server representations are a feature of Amazon EC2, designed with Amazon DevPay in mind. They are usually configured with your value-add software that you want to monetize using a monthly fee and/or a markup above the base fee that Amazon charges. One of the best-known examples of a public AMI is Red Hat Enterprise Linux, which is available for a monthly fee plus an hourly fee. It's fully supported by Red Hat, which makes the virtual version of their software viable for the many companies that are Red Hat customers.

— Mike

EC2 Firefox Extension is now Open Source

by Jeff Barr | on | in Amazon EC2 |

The very cool EC2 Firefox Extension is now an open source project on SourceForge!

The extension makes it really easy to launch and manage Amazon EC2 instances. After creating your keypairs and security groups, you can simply right-click on any of the listed AMIs (Amazon Machine Images) and choose to launch one or more instances.

All of your running instances are listed at the bottom where they can be identified, controlled, monitored, shut down, and so forth. You can easily capture the public DNS name of any running instance, and then paste it into your favorite SSH client (e.g. PuTTY, my personal favorite) to create a secure connection to your new EC2 instance.
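The extension does all of this through the UI; as a hedged sketch, the programmatic equivalent with boto3 (the Python SDK, not the extension's own JavaScript) looks roughly like this:

```python
# List the public DNS names of running instances, suitable for pasting
# into an SSH client. A sketch, not the extension's code.
import boto3

ec2 = boto3.client("ec2")
result = ec2.describe_instances(
    Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
)
for reservation in result["Reservations"]:
    for instance in reservation["Instances"]:
        if instance.get("PublicDnsName"):
            # e.g. connect with: ssh -i mykey.pem root@<PublicDnsName>
            print(instance["InstanceId"], instance["PublicDnsName"])
```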

 

There's also a very cool menu entry labeled "Launch more of these" for instant scalability (assuming, of course, that you've built your application in a scalable fashion, a subject for another post).

As-is, the extension is pretty cool, but as always seems to be the case with something cool, everyone who uses it has ideas for even more cool features. Some people seem to want a slightly less technical view, and others want to go in the opposite direction. A lot of people would like to have more control over the list of displayed AMIs.

We’ve released the extension in source form and are now eagerly anticipating the results. The extension is written in JavaScript and you’ll need to know a little bit about CSS and DHTML to be productive.

Let us know what you come up with!

— Jeff;

Amazon EC2 Gets More Muscle

by Jeff Barr | on | in Amazon EC2 |

The Amazon EC2 team just added Large and Extra Large instance types to EC2. The former "one size fits all" instance type is now known as a Small instance.

Large instances are 4 times larger in each dimension (CPU power, RAM, and disk storage) than the Small instances and cost $0.40 per hour. Extra Large instances are 8 times larger in each dimension and cost $0.80 per hour.

Both of the new instance types support 64-bit computing. While the Large instance type offers 7.5 GB of RAM, the Extra Large instance type offers 15 GB (compared to the Small instance type and its 1.7 GB). To help developers compare the instance types, we are measuring CPU capacity in a new unit called the EC2 Compute Unit. The EC2 home page has more information about this.

When I first heard the news, I fell off my seat after reading the specs, especially '64-bit' and '15 GB RAM'. This addresses one of the most common requests that we have heard from our developers.

With these new instance types, developers will now be able to run ravenous applications like large databases and/or compute-intensive tasks like simulations. Most importantly, they will be able to mix and match based on their infrastructure needs. Some ideas that I can think of are listed below (with a rough cost sketch after the list):

  • Small-scale user: 1 Small instance running the entire month (website hosting)
  • Medium-scale user: 4 Small instances, 2 Large instances (social networking app)
  • Compute-intensive, on-demand parallel user: 400 instances for 72 hours (Hadoop cluster)
  • High-performance user: 20 Extra Large instances for 14 days (biotech drug synthesis or render farms)
  • Database or file share hosting user: 8 Large instances running the entire month (memcached-based applications)
  • Mixed large-scale user: 16 Small instances, 4 Large instances, 2 Extra Large instances, running the entire month (large web-scale application)
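Here is the promised rough cost sketch for a couple of those scenarios, using the hourly prices quoted in this post and ignoring bandwidth and storage:

```python
# Back-of-the-envelope monthly costs from this post's prices:
# Small $0.10/hour, Large $0.40/hour, Extra Large $0.80/hour.
HOURLY = {"small": 0.10, "large": 0.40, "xlarge": 0.80}
MONTH_HOURS = 720  # a 30-day month

def cost(counts, hours=MONTH_HOURS):
    """counts maps instance type -> number of instances."""
    return sum(HOURLY[t] * n * hours for t, n in counts.items())

# Mixed large-scale user: 16 Small + 4 Large + 2 Extra Large, all month.
print(f"${cost({'small': 16, 'large': 4, 'xlarge': 2}):,.2f}")  # $3,456.00
# Compute-intensive user: 400 instances for 72 hours (assuming Small).
print(f"${cost({'small': 400}, hours=72):,.2f}")                # $2,880.00
```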

Imagine the new possibilities!

If you have more ideas for how you would use these new instances, I would love to know.

I have also updated our AWS Simple Monthly Calculator with the new instance types, so you can get an estimate of your monthly bill based on your usage.

We are working hard to improve our products based on the feedback that you provide us. Keep the excellent feedback coming in!

— Jinesh

"HelloWorld" Facebook Application AMI

by Jeff Barr | on | in Amazon EC2 |

I was quite curious to see how Facebook applications that use Amazon S3 and Amazon EC2 can be built, so I spent some time creating a "HelloWorld" Facebook application that lists your Amazon S3 objects given a bucket name (basically integrating the Amazon S3 libraries with the Facebook libraries). I bundled up my code and configuration and created a public AMI so that Facebook developers can simply re-use my configuration and host their apps on Amazon EC2.

AMI ID: ami-74f3161d
AMI Manifest: aws-facebook-app/image.manifest.xml

This Amazon Machine Image is pre-configured and ready to go for hosting your Facebook application. There are some simple steps listed on the Public AMIs page in our Resource Center that will help you get started.

So now you have a solid, scalable infrastructure to back you up and an innovative Facebook platform to play with; all you need is the killer idea!

If you think this is helpful, let me know in the comments. We could extend this AMI and build an auto-scaling module around the app, so that Facebook applications can simply auto-scale out of the box and you never have to worry about servers when you get famous overnight.

Thoughts?

–Jinesh

Start filling up your Shopping Cart with AMIs

by Jeff Barr | on | in Amazon EC2 |

Amazon EC2 has allowed developers to create and bundle their software into Amazon Machine Images (pre-packaged, pre-configured filesystems). Developers were then able to share their AMIs with friends and family (no kidding) and even with the general public.

Now, with our brand-new Paid AMI support, they can set their own price and earn a perpetual commission. This adds a whole new business model to Amazon EC2.

For example, a Ruby on Rails developer can now configure the entire stack (Nginx, Apache, Mongrel, MySQL, and all the open source goodies that "simply work"), set its price at, say, $0.15/hour, $0.12/GB-up, and $0.21/GB-down, and fire away. While Amazon EC2 gets the same traditional $0.10/hour, $0.10/GB-up, and $0.18/GB-down, the developer (the AMI creator) gets the difference (in this case $0.05/hour, $0.02/GB-up, and $0.03/GB-down) credited back to his account from whoever instantiates that image.
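A quick back-of-the-envelope check of those numbers, using made-up monthly usage (720 instance-hours, 50 GB in, 200 GB out):

```python
# Paid AMI margins from the hypothetical Rails example above.
retail = {"hour": 0.15, "gb_up": 0.12, "gb_down": 0.21}
base   = {"hour": 0.10, "gb_up": 0.10, "gb_down": 0.18}
usage  = {"hour": 720,  "gb_up": 50,   "gb_down": 200}  # invented usage

customer_pays  = sum(retail[k] * usage[k] for k in usage)
amazon_keeps   = sum(base[k] * usage[k] for k in usage)
developer_gets = customer_pays - amazon_keeps

print(f"Customer pays:   ${customer_pays:7.2f}")   # $ 156.00
print(f"Amazon's share:  ${amazon_keeps:7.2f}")    # $ 113.00
print(f"Developer earns: ${developer_gets:7.2f}")  # $  43.00
```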

The AMI creator can set any price for the AMI, based on the software that is loaded or as compensation for the work and time put into making it, and the AMI consumer simply purchases the AMI just as he would purchase more tangible items from Amazon.com.

Now let's think a little harder about what we can do with it. Imagine all the possibilities. You are more than welcome to brainstorm (in the comments section of this post). A few that come to my mind are:

  • Monetizing 'Software as a Service': John Doe has a web app (say, CRM or blogging software) that consumers used to have to install, configure, and optimize themselves. Now it comes "factory-installed": John can create and deploy a Paid AMI, run some numbers and set the price, and consumers simply pay for the "service" over and above the hosting charges.
  • Monetizing your configuration setup: John Doe provides migration, automation, and upgrade utilities on top of an already-configured Apache-Tomcat stack.
  • Monetizing your optimization setup: John Doe has developed a smart hack that bumps up performance for Rails/Tomcat/WebSphere in a particular configuration, or has configured and optimized instances that work as a MySQL cluster.

More ideas are always welcome!

–Jinesh