Opportunity | Accelerating the Processing Performance of Cryo-EM Workflows to Generate Insights Faster
“By working on AWS, we’re able to spend more time focusing on how we can innovate. We can be creative and take advantage of the cloud to accelerate our science.”
Senior Director of Software Engineering, Vertex Pharmaceuticals
Solution | Reducing Data Storage Costs and Accelerating Processing Using AWS ParallelCluster
By migrating to AWS, Vertex moved its workloads closer to where the data arrives: Amazon Simple Storage Service (Amazon S3), an object storage service that offers industry-leading scalability, data availability, security, and performance. Vertex also uses Amazon FSx for Lustre, fully managed shared storage built on one of the world’s most popular high-performance file systems, to give scientists exactly the amount of storage that they need during active analysis.
After processing, Vertex sends the data back to Amazon S3. The company sorts data efficiently using Amazon S3 Lifecycle policies, sets of rules that define actions that Amazon S3 applies to a group of objects. “Using Amazon S3 Lifecycle policies, we can put data into different tiers to lower the cost of storage,” says Iturralde. The company can also scale its storage seamlessly, limiting data center overhead.
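As an illustration of the tiering that a lifecycle policy describes, the following is a minimal sketch rather than Vertex's actual configuration. The prefix `processed/`, the bucket name, and the 30-day and 90-day thresholds are all assumptions chosen for the example; the dictionary follows the shape that the boto3 `put_bucket_lifecycle_configuration` call accepts.

```python
def build_lifecycle_configuration(prefix, ia_after_days=30, archive_after_days=90):
    """Build an S3 Lifecycle configuration that moves objects to cheaper tiers.

    All thresholds and the prefix are hypothetical example values.
    """
    if archive_after_days <= ia_after_days:
        raise ValueError("archive_after_days must be greater than ia_after_days")
    return {
        "Rules": [
            {
                "ID": f"tier-{prefix.strip('/') or 'all'}",
                "Filter": {"Prefix": prefix},  # only objects under this prefix
                "Status": "Enabled",
                "Transitions": [
                    # Infrequently accessed data moves to a lower-cost tier first...
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    # ...and later to archival storage.
                    {"Days": archive_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


config = build_lifecycle_configuration("processed/")

# Applying the policy requires boto3 and AWS credentials (bucket name is hypothetical):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-cryoem-bucket", LifecycleConfiguration=config)
```

Once a rule like this is attached to a bucket, Amazon S3 applies the transitions automatically, so no scheduled job is needed to move aging data.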
To manage compute for data processing, Vertex uses AWS ParallelCluster, an open-source cluster management tool that makes it straightforward to deploy and manage elastic HPC clusters on AWS. ParallelCluster spins HPC nodes up and down based on the demands of the analysis software. “When they’re done, we can go back to paying almost zero,” says Iturralde. “We don’t have to worry that the pace of science is going to overwhelm our resources or divert our attention toward maintaining the infrastructure.”
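The idea of scaling nodes with demand, and back to zero when the queue is empty, can be sketched with a simplified model. This is not the algorithm AWS ParallelCluster uses internally; the function name, the `jobs_per_node` parameter, and the node limits are assumptions made for illustration.

```python
import math


def desired_node_count(pending_jobs, jobs_per_node, min_nodes=0, max_nodes=10):
    """Return how many compute nodes a simple elastic policy would run.

    The count grows with the job queue, is capped at max_nodes, and falls
    back to min_nodes (zero by default) when nothing is waiting.
    """
    if jobs_per_node < 1:
        raise ValueError("jobs_per_node must be at least 1")
    if pending_jobs < 0:
        raise ValueError("pending_jobs cannot be negative")
    needed = math.ceil(pending_jobs / jobs_per_node)
    return max(min_nodes, min(needed, max_nodes))


# Hypothetical queue depths over the course of an analysis run.
for queue_depth in (0, 9, 100):
    print(queue_depth, "pending jobs ->", desired_node_count(queue_depth, jobs_per_node=4), "nodes")
```

With a minimum of zero nodes, an idle cluster runs no compute instances, which is the behavior behind "paying almost zero" between analyses.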
By matching its compute costs to workload demands, Vertex has reduced costs by 50 percent and achieved twice the performance of its previous architecture. Vertex has also removed the bottlenecks that its cryo-EM team faced in the on-premises environment, where it often shared resources with other groups. “Previously, it took several weeks to analyze cryo-EM data, even when no one else was using resources,” says Posson. “Now, we can reliably deliver data in under 1 week using AWS.”
Vertex added native single sign-on support using Amazon Cognito, which businesses can use to add sign-up, sign-in, and access control to web and mobile apps quickly and easily. “Using Amazon Cognito gives us that additional comfort that only the appropriate employees have access to the software,” says Iturralde. Alongside this, Vertex uses Application Load Balancer—which load balances HTTP and HTTPS traffic with advanced request routing targeted at the delivery of modern applications—to secure its networking.
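One common way to combine the two services is to have the load balancer authenticate each request against a Cognito user pool before forwarding it. The source does not say that Vertex wires them together this way, so the following is only a sketch under that assumption, and every ARN, client ID, and domain in it is a placeholder. The action dictionaries follow the shape accepted by the Elastic Load Balancing v2 API for listener default actions.

```python
def build_listener_actions(user_pool_arn, client_id, pool_domain, target_group_arn):
    """Build ALB listener actions: authenticate with Cognito, then forward.

    All argument values used below are hypothetical placeholders.
    """
    return [
        {
            "Type": "authenticate-cognito",
            "Order": 1,  # runs first: unauthenticated users are sent to sign in
            "AuthenticateCognitoConfig": {
                "UserPoolArn": user_pool_arn,
                "UserPoolClientId": client_id,
                "UserPoolDomain": pool_domain,
                "OnUnauthenticatedRequest": "authenticate",
            },
        },
        {
            "Type": "forward",
            "Order": 2,  # runs only after authentication succeeds
            "TargetGroupArn": target_group_arn,
        },
    ]


actions = build_listener_actions(
    user_pool_arn="arn:aws:cognito-idp:us-east-1:111122223333:userpool/us-east-1_EXAMPLE",
    client_id="example-client-id",
    pool_domain="example-domain",
    target_group_arn="arn:aws:elasticloadbalancing:us-east-1:111122223333:targetgroup/example/0123456789abcdef",
)

# These actions would be passed as DefaultActions when creating an HTTPS listener,
# for example with boto3.client("elbv2").create_listener(...). Authentication
# actions are supported on HTTPS listeners only.
```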
Outcome | Accelerating Data Processing to Speed Up Research Using Amazon EC2
On AWS, Vertex has made its processes efficient, scalable, and cost effective while reducing manual maintenance. Building on AWS also means that the company has access to the latest compute and GPU resources without the months-long lead time associated with procuring data center hardware. For example, Vertex is running Amazon EC2 G5 instances, which deliver a powerful combination of CPU, host memory, and GPU capacity. By performing cryo-EM processes in the cloud, scientists can do near-real-time analysis. Vertex uses expensive microscope time more efficiently and facilitates scientific breakthroughs.
About Vertex Pharmaceuticals
Vertex is a pharmaceutical company headquartered in Boston that studies complex molecules and researches treatments for serious diseases using the latest microscopy technologies around the world.
AWS Services Used
Amazon Simple Storage Service (Amazon S3) is an object storage service offering industry-leading scalability, data availability, security, and performance.
Amazon Elastic Compute Cloud (Amazon EC2) provides secure and resizable compute capacity for virtually any workload.
AWS ParallelCluster is an open-source cluster management tool that makes it easy for you to deploy and manage high performance computing (HPC) clusters on AWS.
Amazon FSx for Lustre provides fully managed shared storage with the scalability and performance of the popular Lustre file system.