A-Alpha Bio Boosts the Performance of Protein-Protein Interaction Prediction Using NVIDIA BioNeMo on AWS
Learn how biotechnology company A-Alpha Bio scaled protein-binding predictions by 10 times while increasing model speed by 12 times using BioNeMo on AWS.
12x
faster inference calls
10x
more protein-binding predictions evaluated
108 million
inference calls made
1–2 fewer
experimental cycles, which lowers costs and accelerates protein design
Overview
Biotechnology startup A-Alpha Bio specializes in harnessing synthetic biology and machine learning (ML) to measure, predict, and engineer protein-protein interactions (PPIs). The company used BioNeMo Framework—a generative artificial intelligence (AI) solution for drug discovery from NVIDIA, an AWS Partner—on Amazon Web Services (AWS) compute infrastructure.
Using BioNeMo on AWS, A-Alpha Bio achieved a 12-times increase in the throughput of PPI binding predictions and processed over 100 million inference calls in 2 months, 10 times what it had initially projected. A-Alpha Bio conducted training on AWS, supported by a straightforward and efficient setup process. Evaluating more candidates computationally dramatically improved candidate quality, so promising drug candidates progress through the wet lab more efficiently, with fewer design-build-test cycles required. Using BioNeMo on specialized NVIDIA-based AWS GPU instances, A-Alpha Bio substantially improved the likelihood of discovering viable therapeutics.
Opportunity | Using NVIDIA BioNeMo on AWS Batch to Accelerate ML for A-Alpha Bio
Based in Seattle, A-Alpha Bio specializes in generating large-scale quantitative data on PPIs to accelerate drug discovery and engineer highly effective protein therapeutics. The company employs two key proprietary solutions. The first is AlphaSeq, an experimental platform for rapidly and quantitatively measuring millions of protein-protein binding affinities simultaneously. The second is AlphaBind, an ML platform that is trained on the world’s largest PPI datasets to predict binding affinity from protein sequence.
In early 2024, A-Alpha Bio undertook a large-scale experiment on AWS, using ESM-2, an open-source protein language model, to generate embeddings for fine-tuning models on its proprietary protein-protein binding data. “The stronger the binder, the higher its effectiveness as a therapeutic,” says Adrian Lange, director of ML research at A-Alpha Bio. This computationally intensive process involved several campaigns, each processing over 9 million interactions through ESM-2 and producing 650,000 embeddings per run.
Working alongside AWS, the company identified an opportunity to enhance this experiment by using BioNeMo. It began deploying its BioNeMo containers to AWS Batch, a fully managed batch computing service, in February 2024. “Implementing the BioNeMo ESM-2nv model took around a week, and it was simple to incorporate BioNeMo containers into our existing AWS workflows and infrastructure,” says Aditya Agarwal, senior ML scientist at A-Alpha Bio. “This made it possible for us to comprehensively explore protein mutations and sample a vastly expanded mutational landscape when compared with the GitHub vanilla model, substantially increasing our potential to discover superior drug candidates.”
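One common way to run a large inference campaign on AWS Batch is to split the inputs into shards and submit them as a single array job. The sketch below illustrates that pattern only; it is not A-Alpha Bio's actual pipeline. The queue name, job definition name, environment variable names, and shard size are hypothetical, and the 9 million figure is borrowed from the campaign size mentioned earlier. The returned dictionary is shaped for the boto3 Batch client's `submit_job` call, which is left commented out so the example runs without AWS credentials.

```python
import math

# AWS Batch array jobs accept between 2 and 10,000 child jobs.
MIN_ARRAY_SIZE = 2
MAX_ARRAY_SIZE = 10_000


def build_array_job_request(total_items, items_per_shard,
                            job_name, job_queue, job_definition):
    """Return keyword arguments for the boto3 Batch client's submit_job call."""
    if total_items <= 0 or items_per_shard <= 0:
        raise ValueError("total_items and items_per_shard must be positive")
    shards = math.ceil(total_items / items_per_shard)
    if not MIN_ARRAY_SIZE <= shards <= MAX_ARRAY_SIZE:
        raise ValueError(f"array size {shards} is outside the 2-10,000 range")
    return {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        "arrayProperties": {"size": shards},
        "containerOverrides": {
            "environment": [
                # Hypothetical variable names read by the container.
                {"name": "ITEMS_PER_SHARD", "value": str(items_per_shard)},
                {"name": "TOTAL_ITEMS", "value": str(total_items)},
            ]
        },
    }


def shard_bounds(array_index, items_per_shard, total_items):
    """Half-open [start, end) range of inputs for one child job.

    Inside the container, array_index would come from the
    AWS_BATCH_JOB_ARRAY_INDEX environment variable that AWS Batch sets.
    """
    start = array_index * items_per_shard
    end = min(start + items_per_shard, total_items)
    return start, end


# Hypothetical names; 9 million interactions split into 10,000-item shards.
request = build_array_job_request(
    total_items=9_000_000,
    items_per_shard=10_000,
    job_name="esm2nv-inference",        # hypothetical
    job_queue="gpu-inference-queue",    # hypothetical
    job_definition="bionemo-esm2nv:1",  # hypothetical
)

# To submit for real (requires boto3 and AWS credentials):
# import boto3
# boto3.client("batch").submit_job(**request)
```

Each child job processes only its own slice of the input, so a failed shard can be retried without rerunning the whole campaign.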
Solution | Increasing Inference Speed by 12 Times Using BioNeMo on Amazon EC2 P5 Instances
A-Alpha Bio’s pioneering work in therapeutic development centers on discovering and optimizing protein sequences that meet specific design criteria for monoclonal antibodies (mAbs). The process typically involves multiple iterations of a design-build-test cycle, and each iteration requires extensive wet-lab experimentation, which is time and resource intensive. This work matters because mAbs dominate therapeutics: three mAbs ranked among the top 10 best-selling drugs in 2023, and the Food and Drug Administration has approved more mAbs than other classes of drugs. However, general protein ML models, even with broad training data, often fall short when applied specifically to antibodies, highlighting the need for specialized approaches in this critical area of drug development.
By deploying BioNeMo on AWS, the company trained models with efficiency and predictive accuracy that it had not seen before. A-Alpha Bio used ESM-2nv in BioNeMo Framework to train a model for antibody design and identified mAbs with improved binding affinity. “By expanding our pool of optimized binder mutants, we reduce the number of iterations that are required to find excellent drug candidates, dramatically cutting down lab time, resources, and costs,” says Lange.
The company’s journey on AWS began with Amazon Elastic Compute Cloud (Amazon EC2) instances, which provide secure and resizable compute capacity for virtually any workload, and then expanded to include AWS Batch for running its containerized batch workloads. To further boost model speeds, A-Alpha Bio adopted Amazon EC2 P5 Instances, which are powered by NVIDIA H100 Tensor Core GPUs and deliver excellent performance in Amazon EC2 for deep learning and high performance computing applications. “The AWS team has been very helpful and collaborative, guiding us in securing cost-effective capacity on AWS,” says Lange.
To secure access to P5 Instances at reservation pricing, A-Alpha Bio uses Amazon EC2 Capacity Blocks for ML, which organizations use to reserve GPU instances in Amazon EC2 to run ML workloads. Using Capacity Blocks, customers can reserve instances for 1–14 days in 1-day increments, gaining flexibility in securing GPU capacity. Using BioNeMo on P5 Instances, the company boosted performance and achieved impressive results: inference ran 12 times faster and experimental costs decreased compared with running the open-source model on prior-generation GPUs. “We’re getting the promised performance from P5 Instances,” says Lange. “With this setup, not only do we save time, but we also cut costs.”
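The reservation window described above can be sketched as a small helper that builds a search query for Capacity Block offerings. This is an illustration under stated assumptions, not A-Alpha Bio's tooling: the helper name, the example dates, and the instance count are hypothetical, and the 1–14 day limit is taken from this article rather than from current service limits. The dictionary keys match the parameters of the boto3 EC2 client's `describe_capacity_block_offerings` call, which is left commented out so the example runs without AWS credentials.

```python
from datetime import datetime, timedelta, timezone

# Reservation bounds as stated in the article (1-14 days, whole days only).
MIN_DAYS = 1
MAX_DAYS = 14


def capacity_block_query(days, instance_count, earliest_start, latest_end,
                         instance_type="p5.48xlarge"):
    """Build search parameters for Capacity Block offerings."""
    if not isinstance(days, int) or not MIN_DAYS <= days <= MAX_DAYS:
        raise ValueError("days must be a whole number from 1 to 14")
    if latest_end - earliest_start < timedelta(days=days):
        raise ValueError("search window is shorter than the reservation")
    return {
        "InstanceType": instance_type,
        "InstanceCount": instance_count,
        "CapacityDurationHours": days * 24,  # the API takes hours
        "StartDateRange": earliest_start,
        "EndDateRange": latest_end,
    }


# Hypothetical example: a 3-day block on one instance within a 10-day window.
start = datetime(2024, 3, 1, tzinfo=timezone.utc)
query = capacity_block_query(
    days=3,
    instance_count=1,
    earliest_start=start,
    latest_end=start + timedelta(days=10),
)

# To search for real offerings (requires boto3 and AWS credentials):
# import boto3
# offerings = boto3.client("ec2").describe_capacity_block_offerings(**query)
```

Validating the duration before calling the service keeps an out-of-range request from failing partway through a scheduling script.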
In early 2024, A-Alpha Bio significantly expanded its operations using BioNeMo on AWS, running six different campaigns with 18 million inference calls each, totaling 108 million inference calls. “Our initial plan was to process 10 million calls, but with the enhanced speed and efficiency, we were able to increase our workload by 10 times,” says Lange. “The dramatic improvement in performance prompted us to broaden the scope of our experiment considerably and scale rapidly.”
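The figures in this paragraph can be checked with a few lines of arithmetic. The campaign count, calls per campaign, and planned volume come from the text; the 60-day duration is an assumption standing in for the "2 months" mentioned in the overview, so the daily average is only a rough estimate.

```python
campaigns = 6
calls_per_campaign = 18_000_000
planned_calls = 10_000_000

total_calls = campaigns * calls_per_campaign
scale_up = total_calls / planned_calls

# Assumption: "2 months" is approximated as 60 days.
ASSUMED_DAYS = 60
avg_calls_per_day = total_calls / ASSUMED_DAYS

print(f"Total inference calls: {total_calls:,}")       # 108,000,000
print(f"Scale-up versus plan:  {scale_up:.1f}x")        # 10.8x
print(f"Average calls per day: {avg_calls_per_day:,.0f}")  # 1,800,000
```

The result, 10.8 times the planned volume, is what the article rounds to "10 times."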
Evaluating 10 times as many potential PPIs, A-Alpha Bio effectively increased its chances of discovering superior drug candidates by the same factor. Today, by using generative AI and ML to select optimal starting points, the company can remove one to two cycles of wet-lab trial and error, which is more expensive and time consuming than computational testing. Not only does this help researchers get answers faster, but it also empowers them to explore more complex protein modifications. “We can design and test protein sequences with more mutations and combinations than before, which statistically means that we can now uncover protein sequences that are more optimized than what we could achieve previously,” says Agarwal.
Outcome | Exploring Proteins for Better Therapeutics Using BioNeMo on AWS
A-Alpha Bio is now poised to adopt BioNeMo as its standard solution for future campaigns. It plans to explore additional foundation models in BioNeMo, such as ESM-3 and OpenFold for protein folding. Importantly, the company aims to use its proprietary PPI data to further train and fine-tune ML models.
“Improving models using BioNeMo on AWS is helping us accelerate our drug discovery,” says Lange. “We can explore the landscape of protein mutations more exhaustively than ever before, which increases our chances of discovering superior drug candidates.”
About A-Alpha Bio
A-Alpha Bio is a biotechnology company focused on improving human health by harnessing synthetic biology and machine learning to measure, predict, and engineer protein-protein interactions.
AWS Services Used
AWS Batch
AWS Batch is a fully managed batch computing service that plans, schedules, and runs your containerized batch ML, simulation, and analytics workloads across the full range of AWS compute offerings.
Amazon EC2
Amazon Elastic Compute Cloud (Amazon EC2) offers the broadest and deepest compute platform, with over 750 instances and choice of the latest processor, storage, networking, operating system, and purchase model to help you best match the needs of your workload.
Amazon EC2 P5 Instances
Amazon EC2 P5 instances, powered by the latest NVIDIA H100 Tensor Core GPUs, deliver the highest performance in Amazon EC2 for deep learning and high performance computing applications.
Amazon EC2 Capacity Blocks for ML
With Amazon EC2 Capacity Blocks for ML, you can easily reserve Amazon EC2 P5 instances and Amazon EC2 P4d instances for a future start date.