Numerate Discovers Drug Candidates Five Times Faster by Running on AWS


Simplifying a Complex Drug-Discovery Process

Identifying new drug candidates is an increasingly complex process, requiring life sciences companies to synthesize and test thousands of molecular structures per drug discovery program, at a cost approaching $20 million. Ultimately, only one out of four of these programs ever succeeds in identifying a drug candidate. Numerate, a discovery-stage pharmaceutical company, is helping ease this identification process through machine learning (ML). The company uses ML technologies to more quickly and cost-effectively identify novel molecules that are most likely to progress through the research pipeline and become good candidates for new drug development. “We use an algorithm-centric model to help pharmaceutical companies take massive amounts of data to accelerate the identification and optimization of clinical drug candidates,” says Brandon Allgood, CTO and co-founder of Numerate.

Numerate has extensive compute requirements for its drug discovery platform, which evaluates billions of molecules with dozens of predictive models that replace laboratory tests. The platform must quickly scale to tens of thousands of cores to support the company’s ML workloads. “Our customers need to screen billions of molecules virtually using our machine learning models, and then prioritize 100 or fewer of those molecules for synthesis and testing,” Allgood says. “That’s the level of processing we help drug discovery programs get to, and it requires massive compute scalability to power it all.”

  • About Numerate
    • Numerate, headquartered in San Francisco, is a computational drug design company that applies artificial intelligence (AI) and machine learning (ML) technologies at cloud scale to transform small-molecule-drug discovery. Numerate was acquired by Valo Health.

  • Benefits
    • Screens 10 billion molecules through its ML environment
    • Discovers drug candidates 5x faster than the industry average
    • Reduces compute cluster costs by seven-eighths

Using AWS to Drive a Machine Learning Platform

To support its requirements, Numerate runs its ML platform on Amazon Web Services (AWS). “We started using AWS on a smaller scale for compute, but over time we moved our entire infrastructure to AWS because we saw the scalability and flexibility benefits of the cloud,” says Allgood.

Numerate initially used Amazon Elastic Compute Cloud (Amazon EC2) instances, powered by Intel processors, to drive its ML modeling solution. The company also uses Amazon Simple Storage Service (Amazon S3) to store system data. “We still rely on Amazon EC2 and Amazon S3 as the core of everything we do but have also expanded our use of AWS services along the way,” says Allgood. For example, Numerate embraces infrastructure as code (IaC), using AWS CloudFormation to reliably and repeatedly provision its infrastructure, which frees the team to innovate faster.
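To illustrate what infrastructure as code means in practice, the sketch below builds a minimal AWS CloudFormation template in Python and serializes it to JSON. It is not Numerate's template: the logical ID `SystemDataBucket`, the description text, and the choice of a single versioned Amazon S3 bucket are assumptions made for illustration only. The point is that the infrastructure is described as data, so the same definition can be provisioned repeatedly with the same result.

```python
import json


def build_template(bucket_logical_id: str = "SystemDataBucket") -> dict:
    """Return a minimal CloudFormation template as a plain dictionary.

    The logical ID and the single-bucket layout are hypothetical; they are
    not taken from Numerate's actual infrastructure.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative template: one S3 bucket for system data",
        "Resources": {
            bucket_logical_id: {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    # Keep earlier object versions when data is overwritten.
                    "VersioningConfiguration": {"Status": "Enabled"},
                },
            }
        },
    }


def render(template: dict) -> str:
    """Serialize with sorted keys so identical input yields identical text."""
    return json.dumps(template, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(render(build_template()))
```

The rendered JSON could be kept in version control and submitted to CloudFormation as a stack definition; how Numerate actually structures and deploys its templates is not described in this case study.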

Cutting the Cost of Compute Clusters by Seven-Eighths

Over time, Numerate has increasingly relied on Amazon EC2 Spot Instances to reduce the costs of drug discovery testing. EC2 Spot Instances are spare EC2 capacity available at a discount of up to 90 percent compared to On-Demand prices. “Using Amazon EC2 Spot Instances has completely changed things for us. We can run our platform in a much more cost-effective way as a result,” says Allgood.

By taking advantage of Amazon EC2 Spot Instances, Numerate has significantly reduced its costs for processing data. “We can run compute clusters for one-eighth the cost we could before because of Amazon EC2 Spot Instances,” says Allgood. “We not only have more scalability, but also more cost-effective compute capacity.”
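The arithmetic behind that figure is simple: running at one-eighth of the previous cost is the same as saving seven-eighths, or 87.5 percent. The short sketch below works this out with exact fractions. The function name `savings_fraction` is introduced here for illustration, and the only input is the one-eighth ratio quoted above; no actual prices are assumed.

```python
from fractions import Fraction


def savings_fraction(cost_ratio: Fraction) -> Fraction:
    """Return the share of cost saved, given new cost divided by old cost."""
    if not 0 <= cost_ratio <= 1:
        raise ValueError("cost_ratio must be between 0 and 1")
    return 1 - cost_ratio


if __name__ == "__main__":
    ratio = Fraction(1, 8)  # "one-eighth the cost we could before"
    savings = savings_fraction(ratio)
    print(savings)                  # 7/8
    print(f"{float(savings):.1%}")  # 87.5%
```

An 87.5 percent reduction sits within the discount of up to 90 percent that Spot Instances offer relative to On-Demand prices.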

Numerate is also benefiting from the elasticity of AWS. “If we need to scale our platform to 50,000 cores for a few weeks and then turn everything off and not use it for a few months, we can easily do that on AWS,” Allgood says.

Screening 10 Billion Molecules on AWS

Numerate has relied on the scalability of AWS to simulate and screen 10 billion molecules through its ML environment. “There has been a transformation in this industry,” says Allgood. “There used to be two million molecular compounds available for purchase, and now it’s in the billions because of the advancement of automatic synthesis laboratories. I expect that to be 100 billion in the near future. The ability to search and apply models to these libraries is a challenge for companies like ours, but AWS gives us the ability to meet that challenge through massive compute scalability and processing power. AWS helps us unlock targets by enabling us to use and act on data faster and more effectively.”

Discovering Candidate Drugs Five Times Faster

Using AWS to power its ML platform, Numerate has accelerated time-to-discovery for new drug candidates. The company recently used its AWS-based platform to rapidly discover and optimize ryanodine receptor 2 (RYR2) modulators, which are being advanced as new drugs to treat life-threatening cardiovascular diseases. “We used our AWS-based machine learning platform to discover and optimize candidate drugs five times faster than the industry average,” says Allgood. “Ryanodine 2 is a difficult protein to target, but AWS made that process easier for us.” Allgood says traditional methods could not even have attacked the problem, as the complexity of the biology makes the testing laborious and slow, independent of the industry’s low 0.1% screening hit rate for much simpler biology. “In our case, using AWS, we can effectively decouple the trial-and-error process from the laboratory,” he says.

Numerate started with 200 known compounds from the public domain, built a model, and evaluated 10 million purchasable compounds. From this, the company’s researchers achieved a 30 percent hit rate, including dozens of new patentable compounds. This was followed by three cycles of new design, where Numerate continuously built models of RYR2 activity, other toxicity-inducing target activity, and biological compound fate. The company then used these models to select which compounds to make and test in the lab. “During these cycles, we searched hundreds of millions of novel compounds. This work led to a set of compounds that are ready for efficacy models in the relevant animals, and only required one year and 69 compounds to be made and tested,” says Allgood.
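To put the 30 percent hit rate in context, the sketch below compares it with the 0.1 percent industry screening hit rate cited earlier. Treat the comparison as rough: Allgood notes that the 0.1 percent figure applies to much simpler biology, and the batch size of 100 tested compounds is a hypothetical value chosen only to make the numbers concrete. The function names are likewise illustrative.

```python
from fractions import Fraction


def enrichment(model_hit_rate: Fraction, baseline_hit_rate: Fraction) -> Fraction:
    """Return how many times higher the model-guided hit rate is."""
    if baseline_hit_rate <= 0:
        raise ValueError("baseline_hit_rate must be positive")
    return model_hit_rate / baseline_hit_rate


def expected_hits(compounds_tested: int, hit_rate: Fraction) -> Fraction:
    """Return the expected number of hits for a batch of tested compounds."""
    return compounds_tested * hit_rate


MODEL_HIT_RATE = Fraction(30, 100)     # 30 percent, from the RYR2 program
BASELINE_HIT_RATE = Fraction(1, 1000)  # 0.1 percent industry screening rate
BATCH_SIZE = 100                       # hypothetical batch size

if __name__ == "__main__":
    print(enrichment(MODEL_HIT_RATE, BASELINE_HIT_RATE))            # 300
    print(expected_hits(BATCH_SIZE, MODEL_HIT_RATE))                # 30
    print(float(expected_hits(BATCH_SIZE, BASELINE_HIT_RATE)))      # 0.1
```

Under these assumptions, a model-guided batch of 100 compounds would be expected to yield about 30 hits, where a conventional screen at the baseline rate would be expected to yield fewer than one.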

Numerate is now better equipped to identify potentially life-transforming drugs. Says Allgood, “Overall, AWS gives us more flexibility and capacity to power our ML platform, so we can help design some of the world’s most needed disease therapeutics.”
