AWS Case Study: Ensembl

Glenn Proctor of the European Bioinformatics Institute explains how Ensembl, a joint project between EMBL-EBI and the Wellcome Trust Sanger Institute, is using AWS for its website:

Hi Glenn, briefly tell us about your business.
I work for the European Bioinformatics Institute on the Ensembl project. Ensembl is a joint project between EMBL-EBI and the Wellcome Trust Sanger Institute to develop a software system that produces and maintains annotated genetic data. We make all this information freely available online. I lead the team that is exploring ways in which cloud computing can improve access to Ensembl’s resources.

How have you incorporated Amazon Web Services as part of your architecture? What services are you using and how?
We have used AWS to build two fully functional web-based mirrors of the Ensembl website and associated resources in the cloud—one in the eastern United States and one in Asia. For each mirror, the architecture is identical and uses several Amazon Elastic Compute Cloud (Amazon EC2) technologies. The website sits behind Amazon Elastic Load Balancing (Amazon ELB) and has two load-balanced Apache Web Server instances, although this number can be increased using Auto Scaling, if necessary. The web server nodes talk to a MySQL database running on a separate AWS instance, backed by a couple of Amazon Elastic Block Store (Amazon EBS) volumes. We have another MySQL instance that backs our Biomart tool. A separate instance is used for collecting log data from the other nodes and as an endpoint for our VPN. We use Amazon Simple Storage Service (Amazon S3) for backups and snapshotting and also to distribute the Ensembl data as part of the Amazon Public Data Sets initiative. Search functionality is handled by another instance that runs our Apache Lucene–based search server.
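
As a rough illustration of the topology described above, the following sketch models one mirror as plain Python data. It makes no AWS calls. The class names, role labels, region labels, and the choice of exactly two EBS volumes (the interview says only "a couple") are assumptions made for illustration, not details of Ensembl's actual deployment.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Instance:
    """One compute instance in a mirror; `role` is a hypothetical label."""
    role: str
    ebs_volumes: int = 0


@dataclass
class Mirror:
    """One regional mirror sitting behind a load balancer."""
    region: str                      # free-form label, not an AWS region code
    instances: List[Instance] = field(default_factory=list)

    def count(self, role: str) -> int:
        """Return how many instances carry the given role."""
        return sum(1 for inst in self.instances if inst.role == role)


def build_mirror(region: str, web_servers: int = 2) -> Mirror:
    """Assemble the instance list for one mirror.

    The default of two web servers follows the interview; a larger value
    stands in for what Auto Scaling would add under load.
    """
    if web_servers < 1:
        raise ValueError("a mirror needs at least one web server")
    instances = [Instance("web") for _ in range(web_servers)]
    instances += [
        Instance("mysql-ensembl", ebs_volumes=2),  # assumed: "a couple" = 2
        Instance("mysql-biomart"),
        Instance("logging-vpn"),
        Instance("search-lucene"),
    ]
    return Mirror(region=region, instances=instances)


# Both mirrors use the same architecture, as stated in the interview.
mirrors = [build_mirror("us-east"), build_mirror("asia")]
```

Building both mirrors from the same function reflects the point that the two regions share one architecture and differ only in location.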

What programming languages and/or tools did you use to build this solution?
Ensembl is primarily written in Perl using a MySQL back end. There is a bit of Java code. We use Apache as our primary web server, augmented by Nginx for static content. The website makes extensive use of JavaScript and CSS. We use Apache Lucene for search and memcached for in-memory caching.

Why did you decide to use AWS?
Our initial involvement was via the Public Data Sets initiative. We then wanted a way to provide a mirror of Ensembl that was geographically closer and therefore quicker to access for our users in North America. A collaborator was already using AWS and recommended it, which is why we used it. We were impressed with the breadth of services offered (which has continued to increase), the ease of administration, and ease of access: We use SSH to access local machines every day, so using it to access AWS instances was second nature.

How has AWS helped your business?
Being able to spin up one or a dozen compute instances of any size, with any amount of attached storage, in a few minutes has completely changed the way we think about how we provide our services.

Can you share any metrics on your usage of AWS to date?
In terms of capability, AWS enabled us to quickly build two fully functional mirrors of our system, one in the United States and one in Asia. This has improved redundancy and, crucially, response time for users in those regions. For example, the response times of our most popular web pages for users in the eastern United States used to be around 5 or 6 seconds on average. With our AWS-based mirror, these same users are now enjoying load times for these same pages of less than a second.

Have you learned any valuable lessons during this development process that you’d like to pass on to other developers?
Take as much advantage as possible of the built-in services offered by AWS, such as Amazon Relational Database Service (Amazon RDS), rather than building everything independently. Design for failure; take advantage of snapshotting, multiple availability zones, and multiple regions. Amazon provides a lot of training opportunities, both online and in person: Make the most of those.

Do you have any future plans to incorporate other AWS solutions?
We are impressed with the continuing high rate of introduction of new AWS technologies and hope to adopt those technologies as appropriate. We are also planning to make some of our analysis software available as AMIs so that people can run it themselves without the overhead of installation and configuration. Our positive experience with AWS has been used as a model for adoption of similar technologies throughout the rest of our organization.

To learn more, visit http://uswest.ensembl.org/

Added October 3, 2011

©2013, Amazon Web Services, Inc. or its affiliates. All rights reserved.