Ion Flux, Inc. provides services that analyze DNA sequence data for researchers and health professionals in genomic medicine. The information resulting from Ion Flux’s analysis is used to address a wide variety of health problems impacted by human genetics. The company, which is a recent startup, has six team members located in Los Angeles, California, and Beijing, China.
Dr. Allen Day, Ion Flux’s founder and CEO, began the company’s operations using Amazon Web Services (AWS) instead of a traditional storage and computing environment. Dr. Day explains, “Given our aggressive development timelines and the very volatile need for large amounts of computing resources, we recognized that AWS was a great fit because we would not have to expend capital building physical computing infrastructure, and the operational costs of this infrastructure could also be eliminated. It was an easy decision to use AWS for all of our computing needs.”
Amazon Elastic MapReduce (Amazon EMR) is Ion Flux’s most important service within AWS. The company uses Amazon EMR to process large amounts of data in as many as ten to thirty node clusters launched in parallel for four- to five-hour intervals. Ion Flux uses Cascading, the third-party Java-based application from Concurrent, Inc., to manage jobs in the Hadoop framework. As the company grows, Amazon EMR will allow Ion Flux to increase its processing resources beyond current levels instantly.
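To give a sense of scale for these figures, the short sketch below estimates the node-hours consumed by a single cluster run. It assumes the figures describe clusters of ten to thirty nodes running for four to five hours each, which is one reading of the description; the function and constant names are hypothetical and are not part of Ion Flux's pipeline or any AWS API.

```python
# Assumed reading of the case study: each cluster has 10 to 30 nodes
# and runs for 4 to 5 hours. These ranges are an interpretation.
NODES_RANGE = (10, 30)
HOURS_RANGE = (4, 5)


def node_hours(nodes: int, hours: float) -> float:
    """Return node-hours for one cluster run (hypothetical helper)."""
    if nodes <= 0 or hours <= 0:
        raise ValueError("nodes and hours must be positive")
    return nodes * hours


def node_hours_range() -> tuple:
    """Return (low, high) node-hours for a single cluster run."""
    low = node_hours(NODES_RANGE[0], HOURS_RANGE[0])
    high = node_hours(NODES_RANGE[1], HOURS_RANGE[1])
    return low, high


if __name__ == "__main__":
    low, high = node_hours_range()
    print(f"One cluster run: {low} to {high} node-hours")
```

Under this reading, a single run consumes between 40 and 150 node-hours, and launching several clusters in parallel multiplies that figure by the number of clusters.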
In addition to Amazon EMR, Ion Flux relies on Amazon Simple Storage Service (Amazon S3) both to store incoming and outgoing client files and to host the sequencing pipeline associated with the company’s processing in Amazon EMR.
Amazon Elastic Compute Cloud (Amazon EC2) runs Ion Flux’s various Web services, collaboration software, and engineering support systems. Within Amazon EC2, the company frequently uses Amazon Machine Images (AMIs) for computing within a virtual machine. The AMIs allow Ion Flux to simulate its clients’ infrastructures.
Ion Flux also uses Amazon Relational Database Service (Amazon RDS) and Amazon Simple Queue Service (Amazon SQS) in association with its Web logs and laboratory information management system.
In the near future, Ion Flux plans to increase its Amazon EMR usage and find additional uses for Amazon EC2 and Amazon RDS as the company continues to develop its various analysis services.
Dr. Day believes that AWS helped the company begin operating two to three months earlier than it could have with a traditional infrastructure. The company also estimates that it saved $100,000 by avoiding initial hardware costs, and it continues to save approximately $2,000 a month in the operations and maintenance costs that a physical system would have incurred.
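As a rough illustration of how these two estimates combine, the sketch below totals the avoided hardware cost and the ongoing monthly savings over a given number of months. Only the $100,000 and $2,000 figures come from the company's estimates; the function name, the assumption that the monthly figure stays constant, and the time horizons shown are illustrative.

```python
# Figures quoted in the case study (company estimates).
INITIAL_HARDWARE_SAVINGS = 100_000  # USD, one-time hardware cost avoided
MONTHLY_OPS_SAVINGS = 2_000         # USD per month, operations and maintenance


def cumulative_savings(months: int) -> int:
    """Return estimated total savings in USD after `months` of operation.

    Hypothetical helper: assumes the monthly figure stays constant.
    """
    if months < 0:
        raise ValueError("months must be non-negative")
    return INITIAL_HARDWARE_SAVINGS + MONTHLY_OPS_SAVINGS * months


if __name__ == "__main__":
    # Illustrative horizons only; the case study does not state a period.
    for months in (0, 12, 24):
        print(f"{months:>2} months: ${cumulative_savings(months):,}")
```

On these assumptions, the estimated savings after one year would be $124,000, and after two years $148,000.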
However, beyond the monetary advantage of AWS, Dr. Day says, “I think the more compelling benefit is how much agility we gain. We were able to launch with a lower barrier to entry – by investing a smaller amount of initial capital, we get more iterations on developing our product.”
As a new company, Ion Flux is pleased to focus on expanding its impact within the genomic medicine industry while relying on AWS to meet its cloud computing and storage needs. Dr. Day says, “The time and money required to set up and operate physical HPC infrastructure are justifiable only if the infrastructure is highly utilized. Processing genome data requires a large amount of computing resources, but as an early-stage company, our demand is intermittent. Amazon EMR has allowed us to develop a medical informatics service that is both highly scalable and highly cost-effective.”
To find out more about how AWS can help you store and process big data, visit our Big Data details page: http://aws.amazon.com/big-data/.