AWS Case Study: Integrated Proteomics
About Integrated Proteomics
Integrated Proteomics Applications (IPA) provides data analysis solutions for academic institutes, research organizations, and pharmaceutical companies studying the protein interactions of biological systems. The company’s software product, Integrated Proteomics Pipeline (IP2), features a streamlined interface for identifying, quantifying, and analyzing protein data.
IPA’s employees are headquartered in San Diego, California. Together, the company’s three founders have more than fifty years of experience in the proteomics field and many scientific publications to their credit.
IPA realized that the substantial costs associated with developing and maintaining an in-house infrastructure are too great for many institutions involved with proteomics. In response, the company developed a cloud service using Amazon Web Services (AWS). IPA’s cloud service, which can be fully integrated with the IP2 software, gives researchers the high-throughput compute power and robust storage capacities needed to conduct mass spectrometry-based proteomics analysis.
Why Amazon Web Services
Because IPA’s founders had previously used Amazon Elastic Compute Cloud (Amazon EC2) to build bioinformatics tools for a large, non-profit research organization, the company did not hesitate to establish its own infrastructure-as-a-service (IaaS) on AWS. CEO and Co-founder Robin Park notes, “AWS is at the forefront of cloud services, so we trust that their service is mature and stable.”
Park goes on to say, “The scalability of the AWS Cloud allows analysis results to return quickly, no matter how large the data size or how computationally intensive.” Currently, IPA’s data processing runs on customized Amazon Machine Images (AMIs) on Amazon EC2 High-CPU Extra Large instances running Linux. The results are rapidly cross-referenced with protein databases housed in Amazon Simple Storage Service (Amazon S3). Although the analysis normally takes several days on a desktop computer, the IP2 software and the IPA cloud service together complete the process in just a few hours.
The diagram below outlines IPA’s architecture:
In addition to improved processing speed, the new proteomics cloud service is available at the economical price point IPA originally envisioned for its customers. The company conducted an experiment comparing the cost of typical protein research performed on a local cluster of sixty-four servers with the cost of the same research in the cloud service. The experiment found that over the course of three years, the AWS-based service can save customers more than $100,000, not including the additional expense of the IT staff required to maintain physical machines. Dr. John Yates, Co-founder and Scientific Advisor, corroborates the experiment’s results with his own experience: “Having used large computing clusters for years, the cost effectiveness of AWS cloud computing while maintaining high performance is extraordinary.”
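For readers who want to run a similar comparison with their own figures, the shape of the calculation can be sketched as follows. This is a minimal illustration, not IPA's actual cost model: the class names, the cost categories, and every number except the sixty-four servers and the three-year horizon are hypothetical placeholders chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass
class LocalCluster:
    """Hypothetical cost model for an in-house cluster (not IPA's data)."""
    servers: int
    purchase_cost_per_server: float   # one-time hardware cost, placeholder
    yearly_upkeep_per_server: float   # power, cooling, hosting; placeholder

    def total_cost(self, years: float) -> float:
        upfront = self.servers * self.purchase_cost_per_server
        running = self.servers * self.yearly_upkeep_per_server * years
        return upfront + running


@dataclass
class CloudUsage:
    """Hypothetical pay-per-use model: cost accrues only while jobs run."""
    hourly_rate: float                # price per instance-hour, placeholder
    instance_hours_per_year: float    # total billed hours, placeholder

    def total_cost(self, years: float) -> float:
        return self.hourly_rate * self.instance_hours_per_year * years


def savings(local: LocalCluster, cloud: CloudUsage, years: float) -> float:
    """Positive result means the cloud option is cheaper over the period."""
    return local.total_cost(years) - cloud.total_cost(years)


# Placeholder inputs only; substitute real quotes and measured usage.
YEARS = 3
local = LocalCluster(servers=64,
                     purchase_cost_per_server=1000.0,
                     yearly_upkeep_per_server=100.0)
cloud = CloudUsage(hourly_rate=1.0, instance_hours_per_year=10000.0)

print(f"Local cluster over {YEARS} years: {local.total_cost(YEARS):,.0f}")
print(f"Cloud usage over {YEARS} years:   {cloud.total_cost(YEARS):,.0f}")
print(f"Difference:                     {savings(local, cloud, YEARS):,.0f}")
```

As in the experiment described above, this sketch leaves out IT staffing costs. Adding them to the local-cluster side would only widen the gap in favor of the cloud option.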
In light of this success, IPA aims to grow its current customer base in the United States while expanding its cloud service into Asia. The company also plans to incorporate AWS into new technology solutions for additional bioinformatics fields.
Dr. Tao Xu, Co-founder and President, concludes, “Cloud computing is the future. IPA is proud to use AWS to help us lead the field in proteomics data analysis and provide the most comprehensive cloud computing-based proteomics solutions for biomedical research and drug discovery.”
To find out more about how AWS can help you store and process big data, visit our Big Data details page: http://aws.amazon.com/big-data/.