Thanks to AWS and Spot Instances, we were able to build a flexible and scalable high-performance-computing cluster in the cloud, and reach an ideal cost efficiency.
Yang Liu, Chief IT Architect

XtalPi, Inc. (XtalPi) is a pharmaceutical technology company that provides next-generation solutions for drug solid-state research during drug development. Founded by a group of quantum physicists and biopharmaceutical professionals at MIT, XtalPi provides cloud-based computational technologies for drug solid-state screening and design. These technologies can significantly boost drug-development efficiency for biopharmaceutical companies and have a substantial impact on drug-development risk management, patent strategy, and lifecycle management. Using naturally parallel algorithms, the company's cloud high-performance-computing platform conducts extensive computation with groundbreaking speed and accuracy on AWS Spot Instances. XtalPi’s current clients are top global pharmaceutical companies.

XtalPi’s drug solid-state screening and design platform uses high-performance-computing resources and a unique computational model to help pharmaceutical companies and scientists quickly obtain a clear landscape of a drug’s solid states. This technology provides a highly cost-efficient way to conduct drug solid-state development. It holds the promise of greatly shortening the drug research and development cycle and significantly improving the efficiency and success rate of drug development, while ensuring better drug quality. The company, though, faced challenges as it prepared to formally launch the platform. “Our prediction platform was developed and tested at a supercomputing center. But as we prepared for commercial use, we ran into challenges,” says Yang Liu, chief technology officer at XtalPi.

Supercomputing centers have a few disadvantages: first, it is difficult to customize and deploy services for external parties; second, it takes a long time to apply for computing resources; and third, supercomputing centers usually have a large number of jobs in the queue, which can lead to long wait times and compromise the user experience. Additionally, XtalPi’s algorithms are highly parallel, which means computing time is inversely proportional to the amount of computing resources deployed. The workload is on demand and elastic: it does not need a large number of compute nodes at all times, but a prediction that must deliver results within a few days needs thousands of cores simultaneously. “It’s unrealistic for us to purchase and maintain computation clusters at such a large scale, or to gain approval for using so many resources at a supercomputing center. Cloud computing is the only option that meets both our computing resource needs and our budget requirements,” says Liu.
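The inverse relationship between runtime and core count can be made concrete with a back-of-the-envelope calculation. The Python sketch below assumes ideal, perfectly linear scaling, and the workload size and deadline are hypothetical figures chosen for illustration, not numbers reported by XtalPi.

```python
import math


def cores_needed(total_core_hours: float, deadline_hours: float) -> int:
    """Cores required to finish by the deadline, assuming ideal linear scaling."""
    if total_core_hours < 0 or deadline_hours <= 0:
        raise ValueError("workload must be non-negative and deadline positive")
    return math.ceil(total_core_hours / deadline_hours)


def wall_clock_hours(total_core_hours: float, cores: int) -> float:
    """Runtime is inversely proportional to the number of cores deployed."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return total_core_hours / cores


# Hypothetical prediction job: 200,000 core-hours of work, due in 3 days.
WORKLOAD_CORE_HOURS = 200_000
DEADLINE_HOURS = 3 * 24

if __name__ == "__main__":
    n = cores_needed(WORKLOAD_CORE_HOURS, DEADLINE_HOURS)
    hours = wall_clock_hours(WORKLOAD_CORE_HOURS, n)
    print(f"cores needed: {n}")
    print(f"runtime on {n} cores: {hours:.1f} h")
```

Under these assumptions, a deadline of a few days already calls for thousands of cores at once, while the same cluster would sit idle between predictions. That is the usage pattern that favors elastic capacity over a fixed cluster.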

In choosing a cloud platform, XtalPi’s first consideration was performance: drug-crystal-structure prediction is a compute-intensive application that requires a massive amount of computation and poses a performance challenge for single nodes. Much of the calculation requires a host with more than 24 cores.

After carefully researching the cloud-services market, XtalPi found that only the C3 and C4 instances in Amazon Elastic Compute Cloud (Amazon EC2) could meet its performance demands, with the added benefit of the low cost of Amazon EC2 Spot Instances. XtalPi also recognized the superior customer service, stability, and security of AWS. As a result, the company started to move its drug-crystal-structure-prediction platform onto the AWS Cloud in November 2015.

The XtalPi IT team started by conducting performance tests on Amazon EC2 c4.8xlarge instances (utilizing Intel Xeon E5-2680 v2 CPUs) to ensure its computational chemistry software could run efficiently. “In December 2015, we started to run tests on larger scales,” says Liu. “During that process, we found that the GridFS of MongoDB, the storage system we were using before, was creating a bottleneck, limiting our throughput and making it impossible to scale dynamically. By switching to Amazon Simple Storage Service (Amazon S3) for data storage, we quickly solved the problem.” Today, the XtalPi drug-crystal-structure-prediction platform runs solely on AWS Cloud-based infrastructure, using services including Amazon EC2, Amazon DynamoDB, Amazon S3, and Amazon Virtual Private Cloud (Amazon VPC).
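As a rough illustration of why object storage removes this kind of bottleneck, the sketch below has each compute task write its intermediate result as an independent Amazon S3 object, so that no single database server sits in the write path. This is not XtalPi’s code: the key layout, the function names, and the bucket name are hypothetical. The only AWS call used is `put_object(Bucket=..., Key=..., Body=...)`, which is part of the boto3 S3 client.

```python
import json


def result_key(job_id: str, task_index: int) -> str:
    """Build an object key. This naming scheme is a hypothetical convention."""
    return f"predictions/{job_id}/task-{task_index:06d}.json"


def store_result(s3_client, bucket: str, job_id: str,
                 task_index: int, result: dict) -> str:
    """Write one task's intermediate result as its own S3 object.

    `s3_client` is expected to behave like boto3.client("s3");
    only its put_object method is used here.
    """
    key = result_key(job_id, task_index)
    body = json.dumps(result, sort_keys=True).encode("utf-8")
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return key
```

With boto3 installed and AWS credentials configured, a worker could call `store_result(boto3.client("s3"), "example-bucket", "job-42", 7, {"energy": -1.5})`. Passing the client in as a parameter also makes the function easy to exercise with a stand-in object during testing.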

Diagram 1 shows the architecture of XtalPi’s prediction platform. In this architecture, Amazon EC2 instances are the main source of computational capacity for the predictions; Amazon S3 provides intermediate data storage that scales to handle large amounts of data; Amazon DynamoDB stores business data; and Amazon VPC defines isolated sections in the cluster, which helps keep the system secure. “We mainly rely on elastic computing resources, such as the Amazon EC2 Spot Instances,” says Liu. “We use Amazon VPC to provision isolated sections in the cluster, and to make sure that our core management platform remains impervious to other clusters.”

[Diagram 1: Architecture of XtalPi’s prediction platform]

For XtalPi, the benefits of moving to AWS are threefold: reduced costs, reliable security, and infinite resources with flexible scalability.

Drug-crystal-structure prediction is a compute-intensive application that could require tens of thousands of cores during peak computing times, and building private computing clusters of such capacity would cost millions of dollars. By using elastic compute resources and Amazon EC2 Spot Instances, however, XtalPi was able to lower its compute costs by 50 to 60 percent. In addition, AWS provides a variety of SDKs that allow XtalPi to quickly and easily manage and control its clusters, accelerate the platform’s development, and use services such as Amazon S3 to lower operation and maintenance costs. “We are saving roughly 800,000 RMB to 1,000,000 RMB in operational costs using AWS,” says Liu.
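To show how a saving in the 50 to 60 percent range can arise, here is a small worked calculation in Python. All prices and the workload size are hypothetical placeholders, not actual AWS prices or XtalPi figures. The sketch also simplifies by treating each of an instance’s 36 vCPUs as one core and by allowing fractional instance-hours.

```python
def compute_cost(core_hours: float, cores_per_instance: int,
                 price_per_instance_hour: float) -> float:
    """Cost of a workload billed per instance-hour (fractional hours allowed)."""
    if cores_per_instance <= 0:
        raise ValueError("cores_per_instance must be positive")
    instance_hours = core_hours / cores_per_instance
    return instance_hours * price_per_instance_hour


def savings_fraction(on_demand_cost: float, spot_cost: float) -> float:
    """Fraction of the on-demand cost saved by paying the Spot price."""
    if on_demand_cost <= 0:
        raise ValueError("on_demand_cost must be positive")
    return 1.0 - spot_cost / on_demand_cost


# Hypothetical inputs, for illustration only.
CORE_HOURS = 360_000          # total work in core-hours
CORES_PER_INSTANCE = 36       # vCPUs per instance, treated as cores
ON_DEMAND_PRICE = 1.60        # assumed price per instance-hour
SPOT_PRICE = 0.72             # assumed average Spot price per instance-hour

if __name__ == "__main__":
    on_demand = compute_cost(CORE_HOURS, CORES_PER_INSTANCE, ON_DEMAND_PRICE)
    spot = compute_cost(CORE_HOURS, CORES_PER_INSTANCE, SPOT_PRICE)
    saved = savings_fraction(on_demand, spot)
    print(f"on-demand: {on_demand:.2f}, spot: {spot:.2f}, saved: {saved:.0%}")
```

Because Spot prices vary over time, the realized saving depends on the prices in effect when the jobs actually run.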

XtalPi provides crucial services in the various stages of drug development, and places great importance on its platform’s security and reliability. AWS has proven itself to be a trusted partner in securing XtalPi’s platform, and offers an extensive array of services for optimal data security. “Thanks to AWS and Spot Instances, we were able to build a flexible and scalable high-performance-computing cluster in the cloud, and reach an ideal cost efficiency,” says Liu.

To learn more about how AWS can help with your high-performance-computing needs, visit our high-performance-computing details page.