After the fotopedia prototype was built, the team decided to launch on AWS’s cloud computing platform. Olivier Gutknecht, Co-Founder and Director, explains, “We first chose Amazon S3 early in the development phase for our media storage needs. We evaluated other solutions and quickly concluded that Amazon S3 was fully aligned with our needs and avoided new development work while reducing administrative and maintenance time to a minimum. Switching our first implementation to Amazon S3 was done in less than a week. And when Amazon EC2 was introduced, we quickly switched from our previous dedicated hosting solution to the Amazon Cloud.”
In June 2009, TechCrunch broke the fotopedia story a few days before the team's planned public launch, but fortunately the fotopedia team was able to react. It took fifteen minutes to launch additional capacity to handle the inbound traffic. Such a quick reaction would have been impossible in a classic hosting environment. “What we really like about Amazon Web Services is how it helped us build our infrastructure the right way, with flexibility, scalability and automation in mind,” states Gutknecht.
Gutknecht continues, “Building on AWS has allowed us to focus on our core features. It often allows us to introduce new features for our users sooner than anticipated, and to simplify our architecture. For example, we knew we had to switch our media hosting to a CDN at some point, and in the meantime we were using an in-house caching layer. When our widgets were featured on the 2008 LeWeb conference web site, we quickly enabled Amazon CloudFront for distribution of our images – literally days after CloudFront launched. It was easy to implement and improved the performance for our customers.”
Today, fotopedia hosts 500,000 photos and has 19,000 subscribed users. fotopedia uses Amazon EC2, Elastic IPs, and Elastic Load Balancing for hosting and Amazon S3 and CloudFront for media hosting and for internal log and data storage. Batch processing is done in Amazon Elastic MapReduce with custom Pig scripts and Hadoop Java jobs. Gutknecht explains their architecture, “Doing complex processing on massive datasets would not have been feasible short term without extensive capital expenditures. Amazon Elastic MapReduce and Amazon EC2 instances solved this issue and enabled quick prototyping. We regularly analyze a full Wikipedia dump to extract, abstract, and compute a graph of related articles to build our photo encyclopedia. Switching to Amazon Elastic MapReduce simplified management of the Hadoop Jobs, and allowed us to concentrate on the core problems.”
From a cost perspective, fotopedia has seen measurable results from using AWS rather than a traditional hosting environment. For instance, when the team purchased Reserved Instances for their production servers, their AWS bill dropped 30%. Also, handling media storage and CDN distribution with Amazon S3 and CloudFront was half the cost of their custom solution, based on a very conservative cost analysis.
In closing, Gutknecht offers sage advice to other developers and entrepreneurs: “Using AWS as your infrastructure means using a more flexible, scalable solution, but also a slightly more complex one. Instead of fighting the novelty, it is essential to use it at its best. For instance, we ensure that almost everything in our infrastructure and application can be automated. Provisioning a new server and having it up and running with our full application loaded does not need manual intervention. To fully automate our EC2 servers, we use Opscode Chef. Chef is a young but extremely powerful system integration framework. When we provision a new EC2 instance, we set up the instance with a simple boot script template. Chef takes over and installs all necessary software, configures monitoring probes, and checks out our software. The rest of our Chef-managed infrastructure then discovers this new backend and automatically reconfigures our HTTP front servers and load balancers.”
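The "simple boot script template" idea can be sketched as follows. This is only an illustration, not fotopedia's actual script: the role names, the file path, and the `render_boot_script` helper are hypothetical, and the sketch assumes the machine image already has `chef-client` installed and configured to reach a Chef server. The script writes a JSON file containing a run list and then hands control to Chef.

```python
import json
from string import Template

# Hypothetical boot script template; the path and role names are illustrative only.
BOOT_SCRIPT = Template("""#!/bin/sh
# First-boot script passed to the instance as user data (illustrative).
mkdir -p /etc/chef
cat > /etc/chef/first-boot.json <<'EOF'
$attributes
EOF
# Hand control to Chef, which applies the run list in the JSON file above.
chef-client -j /etc/chef/first-boot.json
""")


def render_boot_script(roles):
    """Return a shell script that bootstraps a node with the given Chef roles."""
    attributes = {"run_list": [f"role[{name}]" for name in roles]}
    return BOOT_SCRIPT.substitute(attributes=json.dumps(attributes, indent=2))


if __name__ == "__main__":
    # Render a script for a backend server with two assumed roles.
    print(render_boot_script(["base", "rails_backend"]))
```

The rendered text would be supplied as user data when the instance is launched. Keeping the template this small leaves all real configuration in Chef, which matches the division of labor described in the quote.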
“Running a complex application like fotopedia is not just a matter of running some Rails code and a MySQL database, but coordinating a long list of software services: some written by us, some installed as packages from the operating systems, some built and installed from source code, some provided by Amazon. What is great about using AWS as our infrastructure is that it helps you think about your application globally.”
Discover fotopedia at fotopedia.com.