Auto-Scaling Web Sites Using Amazon EC2 and Scalr
Scalr provides the framework for the deployment of auto-scaling and auto-healing web sites using Amazon EC2. This tutorial walks through installing and setting up Scalr for your own site.
Submitted By: John Fronckowiak
AWS Products Used: Amazon EC2
Created On: June 26, 2008
Abstract
Amazon Elastic Compute Cloud (Beta) (Amazon EC2) and Amazon Simple Storage Service (Amazon S3) provide highly scalable and quickly configurable computing power and storage, on a pay-as-you-use basis. Administration of these resources can be burdensome to contemporary network or computing managers who find themselves in a rapidly changing and mobile workplace.
This article discusses the implementation of Scalr, which is an open source, fully redundant, self-curing, and self-scaling hosting environment that uses Amazon EC2. Scalr allows network administrators to create virtual server farms, using prebuilt components. Scalr uses four Amazon Machine Images (AMIs): a load balancer, a database server, an application server, and a generic base image. Administrators can preconfigure one machine and, when the load warrants, bring online additional machines with the same image, to handle the increased requests.
About the Amazon Services
Amazon provides developers direct access to the same tools and technologies that it uses to build its own web presence. Two of the most powerful and exciting web services provided are Amazon EC2 and Amazon S3.
Amazon EC2 provides developers with access to nearly limitless virtual-computing power. Developers can select and launch preconfigured virtual machines, to which they will have direct access and which they can use and customize for the specific needs of their applications. Amazon EC2 provides a variety of virtual servers, in most Linux flavors, that are preconfigured for common applications. The service also provides the ability to host Microsoft Windows servers. Amazon EC2 services are fee-based; price is dependent on the size of the virtual server and data transferred in and out of the server.
Amazon S3 provides developers with simple, Internet-based storage that can be used with Amazon EC2 servers. Customized Amazon EC2 server images, called Amazon Machine Images (AMIs), are stored in Amazon S3. Developers who use Amazon EC2 must also have an Amazon S3 account. Like Amazon EC2, Amazon S3 is fee-based, with price dependent on the amount of storage used and data transferred in and out.
The remainder of this article assumes that you have signed up for Amazon EC2 and S3 services and obtained the private and public keys needed to use these services. If you haven't done this yet, please read the Amazon EC2 Getting Started Guide, which provides a step-by-step guide to accessing Amazon EC2 and S3 services.
The Need for Web Scalability
Scalability is a web server's ability to maintain a site's availability, reliability, and performance as the amount of simultaneous web traffic, or load, hitting the web server increases.
The major issues that affect web site scalability include the following:
- Performance
- Load management
Performance is a measurement of how efficiently a site responds to a web browser request. Application performance can be designed from the start, tweaked as the application is deployed, and monitored while the application is running. Performance is affected by a number of factors:
- Application design
- Database connectivity
- Hardware resources
- Network capacity and bandwidth
Web applications that will be used by many clients must be designed with performance as a paramount concern. Once the application is deployed, system administrators can tweak performance at the operating system, database, and even the application level. The number of concurrent users that your application will support has the most direct impact on performance. Performance needs to be measured based on the speed of a single transaction and the latency (typically measured by overall load averages) incurred by increasing the number of concurrent users.
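One way to put numbers on both measurements is to time a single request and then time a batch of simultaneous requests. The following is a minimal sketch, assuming curl and awk are installed; the function names time_request, summarize_timings, and measure_concurrent are hypothetical helpers written for this illustration and are not part of Scalr.

```shell
#!/bin/sh
# Time one request and print the total seconds taken.
# Uses curl's -w write-out variable time_total; the response body is discarded.
time_request() {
    curl -o /dev/null -s -w '%{time_total}\n' "$1"
}

# Read one timing per line on stdin; print the count, mean, and maximum.
summarize_timings() {
    awk '{ sum += $1; if ($1 > max) max = $1 }
         END { if (NR > 0) printf "n=%d mean=%.3f max=%.3f\n", NR, sum / NR, max }'
}

# Send N requests to a URL at the same time and summarize their timings.
measure_concurrent() {
    url=$1
    n=$2
    i=0
    {
        while [ "$i" -lt "$n" ]; do
            time_request "$url" &
            i=$((i + 1))
        done
        wait    # let every background request finish before the pipe closes
    } | summarize_timings
}

# Example usage (the URL is a placeholder for your own site):
#   time_request http://localhost/
#   measure_concurrent http://localhost/ 20
```

Comparing the single-request time with the mean and maximum at increasing values of N gives a rough picture of how latency grows with concurrent users.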
Scalr Overview
By using Scalr, you can create a server farm that uses prebuilt AMIs for load balancing, web servers, and databases. You also can customize a generic AMI, which you can use to host your actual application.
Scalr monitors the health of the entire server farm, ensuring that instances stay running and that load averages stay below a configurable threshold. If an instance crashes, another one of the proper type will be launched and added to the load balancer.
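The threshold check described above can be sketched in a few lines of shell. This is an illustration of the idea only, not Scalr's actual implementation: the function names, the action labels, and the threshold value of 5.0 are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when the load average exceeds the threshold.
# awk performs the floating-point comparison that plain sh cannot do.
load_exceeds_threshold() {
    awk -v load="$1" -v max="$2" 'BEGIN { exit !(load + 0 > max + 0) }'
}

# Print the action a monitor might take for a given load and threshold.
scaling_action() {
    if load_exceeds_threshold "$1" "$2"; then
        echo "launch-instance"
    else
        echo "hold"
    fi
}

# Example: compare this machine's 1-minute load average (Linux only)
# against an assumed threshold of 5.0.
if [ -r /proc/loadavg ]; then
    current_load=$(cut -d ' ' -f 1 /proc/loadavg)
    echo "load=$current_load action=$(scaling_action "$current_load" 5.0)"
fi
```

A real monitor would average the load across the instances of one role and would also replace instances that stop responding, which this sketch does not attempt.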
The Four AMIs
Scalr provides four AMIs that you can use to build your server farm:
- ami-50d53039 — A base application server.
- ami-5ed53037 — The load balancer. Scalr implements two approaches to load balancing: Pound and nginx. Both provide open source solutions to web application scaling.
- ami-53d5303a — A MySQL server image.
- ami-5dd53034 — A base image that can be customized to your specific application needs.
All the AMIs are based on Ubuntu 7.
Installation
The first thing you'll need is a machine on which you can run Apache, PHP, and MySQL. Amazon EC2 is a great place to find a machine resource; however, you can run the Scalr management application on any UNIX-based machine that offers the specified support. I began with a base Ubuntu 8.04 image. Specifically, I launched an instance of ami-6a57b203. If you haven't used Amazon EC2 services before, please read the Amazon EC2 Getting Started Guide, which provides a step-by-step guide to getting access to Amazon EC2 and S3 services.
Once I had the base Ubuntu image up and running, I needed to make sure that all the necessary services were running and available. I began with installing MySQL. After I logged into my image at the command prompt, I issued the following command:
apt-get install mysql-server mysql-client
Next, to ensure that I had the most recent versions of Apache and PHP running, I issued the following command:
apt-get install apache2 php5-cli libapache2-mod-php5
Then, I restarted the Apache web server:
/etc/init.d/apache2 restart
Finally, I added PHP-MySQL integration. I also installed the PHP Mhash and MCrypt modules that are used by the Scalr application, and then restarted the Apache web server:
apt-get install php5-mysql
apt-get install php5-mcrypt
apt-get install php5-mhash
/etc/init.d/apache2 restart
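To confirm that the PHP modules are actually loaded, you can compare the output of php -m (which lists compiled-in and loaded modules) against the modules you expect. The following is a small sketch; missing_modules is a hypothetical helper, and the module names passed to it are assumed to match the names that php -m prints on your system.

```shell
#!/bin/sh
# Read a module list (one name per line) on stdin and print each
# required module, given as an argument, that is not in the list.
# Names are compared case-insensitively.
missing_modules() {
    modules=$(tr '[:upper:]' '[:lower:]')
    for required in "$@"; do
        printf '%s\n' "$modules" | grep -qx "$required" || echo "$required"
    done
}

# Example usage on the Scalr host:
#   php -m | missing_modules mysql mcrypt mhash
# No output means every listed module was found.
```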
After you ensure that the necessary services are available, you need to create a database for the Scalr application, and then download the scalrdump.sql script from the Scalr web site, and add that file to your Scalr database. I accomplished all this with the following commands:
mysqladmin -p create scalr
wget https://scalr.googlecode.com/files/scalrdump.sql
mysql -p scalr < scalrdump.sql
Next, you need to download and unzip the Scalr application files:
wget https://scalr.googlecode.com/files/scalr-0.5.zip
apt-get install unzip
unzip scalr-0.5.zip
Now you're ready to begin modifying the necessary application files. Edit the etc/config.ini file to include your database credentials. The config.ini file should look like the following:
[db]
driver=mysql ; WE MUST USE MySQL extension because mysqli not support nconnect needed for pcntl fork
host = "localhost"
name = "scalr"
user = "YOUR_MYSQL_USERNAME"
pass = "YOUR_MYSQL_PASSWORD"
[debug]
profiling = 1
app = 1
level = 0
db = 1
You need to modify the permissions on a few of the files so that they are writable:
chmod -R 0777 cache/smarty cache/smarty_bin cache/smarty_bin/en_US logs etc/clients_keys cron/cron.pid
Next, you need to add your Amazon Web Services (AWS) private key and certificate files to the etc directory. To my etc directory, I added the following files:
cert-KBXUS23JQWFAUVIDNDPI62GIBKJ3BW2K.pem
pk-KBXUS23JQWFAUVIDNDPI62GIBKJ3BW2K.pem
Before you can launch the Scalr management application, you need to make sure that BIND is installed. You can do so by using the following command:
apt-get install bind9
Before you can continue the configuration process in the Scalr web application, you must make sure that /var/www points to the directory in which you downloaded the Scalr application. You will then be ready to open the Scalr web application.
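Before opening the browser, it can help to confirm that etc/config.ini reads back the way you expect. The following is a minimal sketch and not part of Scalr: ini_get is a hypothetical helper that prints the value of one key from one section. It assumes simple key = value lines, treats everything after a semicolon as a comment, and therefore will not work for values that themselves contain a semicolon.

```shell
#!/bin/sh
# Print the value of KEY in [SECTION] of an INI file.
# Usage: ini_get FILE SECTION KEY
ini_get() {
    awk -v section="$2" -v key="$3" '
        # A section header switches the current section on or off.
        /^\[/ { in_section = ($0 == ("[" section "]")); next }
        in_section {
            line = $0
            sub(/[ \t]*;.*$/, "", line)        # drop a trailing comment
            n = index(line, "=")
            if (n == 0) next                   # not a key = value line
            k = substr(line, 1, n - 1)
            v = substr(line, n + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)     # trim whitespace
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            gsub(/^"|"$/, "", v)               # strip surrounding quotes
            if (k == key) { print v; exit }
        }' "$1"
}

# Example: run from the Scalr application directory.
if [ -r etc/config.ini ]; then
    echo "database name: $(ini_get etc/config.ini db name)"
    echo "database host: $(ini_get etc/config.ini db host)"
fi
```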
Open your web browser, and point it to your AMI domain.
Log in, using the username "admin" and the password "admin". You will now be ready to configure the final settings. Click the Settings menu item, and then select Nameservers, Add New:
Simply add the name of your Nameserver host:
This name is the domain name of your current server. Click Save to add the new Nameserver.
Finally, you need to configure the core settings. Click the Settings menu item, and then select Core Settings. The Settings, General page is displayed, as shown in Figures 4 and 5.
You can change the administrative username and password:
Scroll down the page a bit further to provide your AWS Account ID and Key name and to change the Event handler URL to the IP address of your current system. To find your account ID, open the Your AWS Profile page inside https://aws-portal.amazon.com, find the Account Number field, and remove all dashes from the number in that field:
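Removing the dashes by hand is easy to get wrong, so a short helper can do it and reject anything that is not purely numeric afterward. This is only a sketch: normalize_account_id is a hypothetical name, and the account number used below is made up.

```shell
#!/bin/sh
# Strip dashes from an AWS account number and print the result.
# Fails with a message on stderr if anything other than digits remains.
normalize_account_id() {
    id=$(printf '%s' "$1" | tr -d '-')
    case $id in
        ''|*[!0-9]*)
            echo "not a numeric account ID: $1" >&2
            return 1
            ;;
    esac
    printf '%s\n' "$id"
}

# Example with a made-up account number.
normalize_account_id "1234-5678-9012"   # prints 123456789012
```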
Where to Next?
If you've set up Scalr on an EC2 instance like we did in this tutorial, now may be a good time to use the AMI tools and create your own AMI based on the newly configured instance.
With Scalr installed and configured, you can begin to deploy your own scalable web application. By default, a Scalr deployment will include separate AMIs for your database, application server, and load balancer. You can use the Scalr interface to launch these AMIs. You then can use any tools that you typically use to interact with EC2 instances. Once the instances are launched, Scalr automatically ensures that your application scales as dictated by the overall load.
What types of applications make sense for a Scalr deployment? Anything that you expect to support a large number of users. The system was initially designed for MediaPlug, a white label audio, video, and image transcoding service that needed to scale based on customer demand.
Conclusion
When supporting many concurrent users in your web application, scalability is of utmost importance. Traditionally, deploying scalable web solutions required a significant capital investment. Scalr provides a simple, web-based interface for managing Amazon EC2-based server farms. You can launch a group of servers that will work in concert to support your application. Additional servers are deployed as the concurrent user load increases, and new instances are launched automatically if a server crashes. The open source solution provided by Scalr and Amazon EC2 gives businesses the opportunity to build highly scalable web solutions while minimizing capital investment.
Learning More About AWS
This article highlights a few aspects of working with AWS. Here are a few more resources available to developers to help you learn more.
- Scalr Open Source Repository
- The Elastic Compute Cloud Web Site
- The Elastic Compute Cloud API
- The Simple Storage Service Web Site
- The Simple Storage Service REST API
- Introduction to AWS for PHP Developers
- PHP-AWS (a PHP library for using AWS)
About the Author
John Fronckowiak (john@idcc.net) is the President of IDC Consulting, Inc., providing consulting and technical writing. John is also a Clinical Assistant Professor in Information Systems at the Adult Learning Program of Medaille College (https://www.medaille.edu/alp). He is the author of several books about programming, database design and development, and networking.