Decreasing costs and rapid technological innovation have resulted in a tremendous increase in the volume of biological data being generated. At the same time, life sciences research is becoming increasingly collaborative and complex. Amazon Web Services (AWS) provides a complete set of easy-to-use, flexible tools to help your lab address these challenges, including solutions for high-performance computing, data sharing between sites, archiving, and storage.
"We completed the equivalent of 39 years of computational chemistry in just under 9 hours."
Steve Litster, Ph.D., Global Head of Scientific Computing, Novartis
"The AWS Cloud enables swift collaboration even with hundreds of terabytes of data."
Dr. Narayanan Veeraraghavan, Lead Programmer Scientist at Human Genome Sequencing Center, Baylor College of Medicine
"Our customers need access to irregular but intense computational capacity. For some projects, they may need to scale to 100 servers for a few days, but once analysis is done, they scale back down to virtually none."
Sebastian Wernicke, Head of Business Development, Seven Bridges Genomics
AWS provides all of the basic building blocks necessary to create your own cloud supercomputer. You can create an account and launch your first machine in minutes. You won't need specialized computer hardware or, more importantly, extensive facilities to host servers.
Web File Storage – Amazon Simple Storage Service (S3) acts like a virtual drive that is practically unlimited in size. Each file you store can be as large as 5 TB, and you can share your files with anyone you choose who has access to the web.
Virtual Hard Drive – Amazon Elastic Block Store (EBS) acts like a hard drive attached to your virtual computer. It is suited to boot volumes and fast input/output, providing up to 48,000 IOPS (input/output operations per second) per virtual machine.
Archiving – Amazon Glacier is designed for archiving your data with a durability of 99.999999999%, meaning your data will be there when you need it. Your data can be stored for as little as $0.01 per GB per month.
Amazon Elastic Compute Cloud (EC2) acts like a virtual server that you provision and control. EC2 is located in the cloud, so you don't need your own infrastructure and can obtain access to new servers in minutes, rather than days, weeks, or months. EC2 servers can be turned on and off almost instantly. You can start up one, or even thousands, of servers simultaneously, run your experiment, turn them off, and pay only for the time your servers were running.
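To get a feel for what these pricing points mean for a lab budget, the arithmetic can be sketched in a few lines of Python. This is a rough illustration, not a pricing tool: only the $0.01 per GB per month archive figure comes from the text above. The function names, the decimal-terabyte conversion, and the $0.50 per server-hour rate are hypothetical placeholders, and the archive estimate covers storage only.

```python
# Rough cost sketch under stated assumptions.
# Only the $0.01 per GB-month archive rate is taken from the text above;
# the hourly server rate used in the example is a hypothetical placeholder.

GLACIER_USD_PER_GB_MONTH = 0.01  # "as little as" figure quoted for archiving
GB_PER_TB = 1000                 # assumption: decimal terabytes


def archive_cost_usd(terabytes: float, months: int = 1) -> float:
    """Storage-only cost of keeping `terabytes` archived for `months`."""
    return terabytes * GB_PER_TB * GLACIER_USD_PER_GB_MONTH * months


def burst_compute_cost_usd(servers: int, hours: float,
                           usd_per_server_hour: float) -> float:
    """Cost of running `servers` machines for `hours`, then turning them off."""
    return servers * hours * usd_per_server_hour


if __name__ == "__main__":
    # 100 TB of sequencing data archived for one year.
    print(f"Archive: ${archive_cost_usd(100, months=12):,.2f}")
    # 100 servers for three days at a hypothetical $0.50 per server-hour.
    print(f"Compute: ${burst_compute_cost_usd(100, 72, 0.50):,.2f}")
```

Because you pay only while servers are running, the compute estimate scales with hours used rather than with the size of the largest cluster you ever start.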
The AWS environment is designed to give customers the ability to follow a broad range of international security and data protection standards. If your lab uses protected health information, as defined under the U.S. Health Insurance Portability and Accountability Act (HIPAA), as part of its research program, AWS can sign Business Associate Agreements (BAAs) and provide the technical infrastructure to help you meet your statutory privacy requirements.
When your lab is planning to analyze particularly complex data sets, such as genomic or proteomic data, you can take advantage of the full breadth of EC2 servers, including those specifically designed for high-performance computing (HPC). You can choose the features that best fit your application, including servers optimized for biological simulations or for graphical representation of biomolecules.
The work done in your lab may already qualify as "big data". If you are already dealing with a large and growing volume of biological data, your projects might benefit from tools like Amazon Elastic MapReduce (EMR), a Hadoop framework that runs on Amazon EC2 and can dramatically speed up data processing for large data sets. Database tools are available too, such as Amazon Relational Database Service (RDS), which automates many database administration tasks, freeing up more time to focus on your research.
The security features in the AWS Cloud are built to help you meet your regulatory and institutional requirements. AWS servers are in buildings monitored by security guards and electronic surveillance 24/7. Your data is secured by a firewall by default. Data can be encrypted while in transit or at rest, with encryption keys you can store yourself or with AWS.
Internet Connection - Your lab's standard web connection can be used to access all of the AWS services.
AWS Direct Connect - Establishes a direct, private connection between your premises and AWS, billed by the hour, which can provide transfer speeds of up to 10 Gbps.
Popular data transfer tools - AWS Marketplace features dozens of common software packages for transferring data to the cloud.
AWS Import/Export - Allows you to mail large datasets to AWS on hard drives or other portable storage devices.
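Which of the options above fits best depends largely on how long a dataset would take to move over the wire. A minimal sketch of that estimate follows. The 10 Gbps figure is the Direct Connect speed quoted above; the function name, the 100 Mbps lab connection, and the `efficiency` factor are illustrative assumptions, and real throughput is usually below the nominal link speed.

```python
# Back-of-the-envelope transfer-time estimate.
# Assumptions: decimal units (1 TB = 1e12 bytes, 1 Gbps = 1e9 bits/s)
# and a constant sustained rate equal to link speed times `efficiency`.

def transfer_hours(terabytes: float, link_gbps: float,
                   efficiency: float = 1.0) -> float:
    """Hours needed to move `terabytes` over a link of `link_gbps`."""
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be > 0 and efficiency in (0, 1]")
    bits = terabytes * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    dataset_tb = 100  # hypothetical dataset size
    for label, gbps in [("10 Gbps link", 10.0),
                        ("100 Mbps lab connection (hypothetical)", 0.1)]:
        hours = transfer_hours(dataset_tb, gbps)
        print(f"{label}: {hours:,.1f} hours ({hours / 24:,.1f} days)")
```

When the estimate runs to weeks or months on the connection you have, mailing storage devices becomes the more practical route.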