AWS for Industries

Highlights from the AWS Life Sciences Executive Symposium 2023: Accelerating Pharma Drug Discovery with ML and Generative AI

On May 15, we hosted the AWS Life Sciences Executive Symposium in Boston, where I led our track on ‘accelerating pharma drug discovery with Machine Learning (ML)’. Over 300 life sciences executives from across 100 organizations attended this half-day, in-person event to explore how they can drive innovation through robust data foundations and machine learning on the cloud.

The opening sessions of the symposium showcased how AWS is enabling life sciences organizations to leverage generative AI, foundation models (FMs), and large language models (LLMs) for drug discovery, across use cases like identifying potential adverse drug reactions and searching clinical trial datasets. We announced easy-to-run protein algorithms for a range of open-source molecular models and demonstrated how AWS can significantly reduce the cost and time needed to generate usable protein structures. Attendees were excited about the ability to access Ready2Run workflows within the Amazon Omics service for single API or console deployment of AlphaFold and ESMFold algorithms with transparent, run-based pricing and scalable infrastructure. We also previewed AWS’ Drug Discovery Workbench, based on the AWS Batch Architecture for Protein Folding and Design, which supports multiple algorithms within a shared user interface, including DiffDock, RFDesign, RFDiffusion, ProteinMPNN, AlphaFold, OpenFold, OmegaFold, and ESMFold.

We also demonstrated why AWS is the easiest place to build with FMs—from finding the right models, to secure customization, to integrating into your existing applications. Our approach to generative AI is the same as for ML: delivering new innovations to make it easy, cost effective, and practical for life sciences organizations to leverage generative AI. In fact, AWS customers are already using generative AI for a range of use cases, from speeding up repetitive tasks to uncovering new insights about proteins and disease pathways. At the symposium, we gave the attendees a sneak peek into some of our pre-trained models, including an Amazon HealthLake chatbot, and a clinical trial chatbot that leverages Amazon Kendra and LLMs. Customers can get started on these pre-built models today.

During the session, we covered how Amazon SageMaker JumpStart, a machine learning (ML) hub with foundation models, built-in algorithms, and pre-built ML solutions, allows users to get started in just a few clicks. Builders across industries are using pre-trained models on JumpStart to perform a range of tasks like article summarization and image generation. These models can be fully customized for any use case with your data, and can be easily deployed into production with the user interface or SDK. And, since all data is encrypted and does not leave your Virtual Private Cloud (VPC), you can trust that your data will remain private and confidential.

At the same time, we understand that choice matters when building with generative AI. That is why we introduced Amazon Bedrock, the easiest way to build and scale generative AI applications with foundation models (FMs). Amazon Bedrock makes FMs from AI21 Labs, Anthropic, Stability AI, and Amazon accessible via an API, democratizing access for all builders, whether you are building your own model or licensing third-party ones. It also supports secure customization so that your data stays private and secure. By simply pointing Amazon Bedrock to a few labeled examples in Amazon S3, you can have the service fine-tune the model for a task without annotating large volumes of data. And in line with our overarching theme of creating an end-to-end data strategy, we are making sure customers can readily integrate and deploy FMs into their applications and workloads running on AWS, using familiar controls and integrations with AWS’s depth and breadth of capabilities like Amazon SageMaker and Amazon S3. Amazon Bedrock addresses the simple fact that one solution, or one model, is unlikely to solve every business problem you face.

There is no doubt that LLMs and generative AI can help shorten the time to discover new therapies. But while these technologies have generated a lot of hope for the future of research, their adoption in real-world R&D settings is still limited. This is why the ‘hype vs. hope’ panel discussion with Reverie Labs and Recursion Bio was timely. It gave attendees a clear executive-level understanding of the drivers and considerations for successfully applying generative AI in research, and of the areas with the most promise. We are still in the early days of generative AI and at the beginning of the knowledge integration revolution. FMs and LLMs show great potential, and their novel applications and industry adoption will continue to expand. And, consistent with our Day 1 culture at Amazon, we will have a lot more coming.

In the session on ‘upcoming ML innovations for pharma drug discovery’, we gave the audience a view into the advancements that are likely to gain momentum in the near future. These included the use of FMs for disease gene prediction, novel drug synthesis with Meta AI ESM, and drug discovery with diffusion models. We also covered federated learning, which provides privacy-preserving access to datasets within or across organizations for jointly training a shared ML model. In addition, we demystified the use of deep phenotyping and deep learning in image analysis for higher accuracy, efficiency, and scalability in digital pathology.
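To make the federated learning idea concrete, here is a minimal sketch of federated averaging on a toy one-feature linear model. It is an illustration only, not the approach presented at the symposium: the two "sites", their data, the learning rate, and all function names are assumptions made for this example. The point it shows is that each organization trains on its own records and shares only model parameters, which a coordinator then averages.

```python
"""Toy federated averaging for a linear model y = w*x + b (illustrative only)."""
from typing import List, Tuple

Params = Tuple[float, float]          # (weight, bias)
Dataset = List[Tuple[float, float]]   # (x, y) pairs held privately by one site


def local_update(params: Params, data: Dataset, lr: float = 0.05, epochs: int = 5) -> Params:
    """Run a few epochs of full-batch gradient descent on one site's private data."""
    w, b = params
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def federated_average(updates: List[Params], sizes: List[int]) -> Params:
    """Combine site models, weighting each by its number of local examples."""
    total = sum(sizes)
    w = sum(p[0] * n for p, n in zip(updates, sizes)) / total
    b = sum(p[1] * n for p, n in zip(updates, sizes)) / total
    return w, b


def train(sites: List[Dataset], rounds: int = 200) -> Params:
    """Only model parameters leave each site; the raw (x, y) records never do."""
    global_params: Params = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_params, data) for data in sites]
        global_params = federated_average(updates, [len(d) for d in sites])
    return global_params


# Two hypothetical organizations whose private data both follow y = 2x + 1.
site_a: Dataset = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
site_b: Dataset = [(3.0, 7.0), (4.0, 9.0)]
w, b = train([site_a, site_b])
```

In this toy setting the shared model approaches w = 2 and b = 1 even though neither site ever sees the other's data. Real deployments typically add safeguards such as secure aggregation, which this sketch omits.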

The conversation around our purpose-built AI/ML services carried forward into the sessions featuring our esteemed guest speakers. Leaders from several forward-thinking companies joined us on stage to share inspiring real-world examples on how they are addressing everyday challenges and bringing novel medicines to life faster with machine learning on AWS.

Bristol Myers Squibb shared how they are maximizing productivity and scale with high-performance computing and cryo-EM using AWS. By building a highly optimized architecture, they are successfully navigating the challenges commonly faced when running these computationally intensive workflows on-prem: huge capital requirements, GPU age-out, constrained data center availability, and lack of elasticity. BMS’ fit-for-purpose cryo-EM workflows transfer the movies from microscopes to AWS for processing and analysis, optimizing cost, scale, and throughput. This is helping the company make strides towards its goal of reducing drug development time from 10 years to 6 years.

Regeneron presented how they are using AI-driven protein structure prediction to improve their targeted drug discovery process. Identifying the structure of a protein is an important part of many drug discovery workflows, but it is expensive and often time-consuming. To address this challenge with ML, Regeneron is making state-of-the-art tools available to its scientists, so they can test and compare multiple approaches, and predict the shape of a protein in minutes, down to atomic accuracy. AWS is helping the company scale these workloads by bringing the right compute resources to its data, providing elasticity to manage costs, and removing the undifferentiated heavy-lifting of managing physical infrastructure.

Pfizer shared their insights on utilizing knowledge graphs (KGs) to integrate information within the organization for semantic learning. Their session shed light on enhancing the usability of knowledge graphs through data standardization, and examined how layering AI/ML on top of KGs can be used to explore what-if scenarios and their potential impact. Through a range of knowledge graph use cases, like finding 6-year-old lab data to support pH studies of a legacy product, or studying the lot genealogy for products used in a clinical study, they explained how KGs helped prevent repetition of experiments, saving millions in material and labor costs. A particularly exciting application was the use of Amazon Neptune, a fully managed database service, to understand the pathways of chemical reactions via Graph Explorer.
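As a rough illustration of the lot genealogy use case, the sketch below walks a tiny in-memory graph to find every upstream lot behind a finished product lot. It is a simplified stand-in, not Pfizer's data model or the Amazon Neptune API: the lot identifiers, the `DERIVED_FROM` mapping, and the `trace_genealogy` function are all hypothetical names invented for this example.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical edges: each lot maps to the input lots it was made from.
DERIVED_FROM: Dict[str, List[str]] = {
    "DP-300": ["DS-200", "EXC-210"],   # drug product lot
    "DS-200": ["RM-100", "RM-101"],    # drug substance lot
    "EXC-210": ["RM-102"],             # excipient blend lot
}


def trace_genealogy(lot: str, edges: Dict[str, List[str]]) -> List[str]:
    """Return every upstream lot reachable from `lot`, in breadth-first order."""
    seen: Set[str] = {lot}
    order: List[str] = []
    queue = deque([lot])
    while queue:
        current = queue.popleft()
        for parent in edges.get(current, []):
            if parent not in seen:     # guard against shared ancestors and cycles
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


ancestors = trace_genealogy("DP-300", DERIVED_FROM)
```

In a managed graph database the same question would be expressed as a graph query rather than a hand-written traversal, but the underlying idea is the same: follow "derived from" edges until no new lots appear.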

And finally, Generate Biomedicines described how they are eliminating the need for traditional trial-and-error drug discovery methods by generating novel protein therapeutics using computational tools powered by ML. They showed how their generative AI platform, Chroma, creates new protein molecules based on geometric and functional programming instructions, learning patterns from the Protein Data Bank. Chroma is able to generate extremely large proteins and protein complexes (like the ones we see in nature) in a few minutes on a single GPU, which can then be further studied for their effectiveness as drugs. The session exemplified how an integrated wet-dry “lab of the future” can turn drug discovery into a truly computational/technological pursuit.

Throughout the day, the sessions highlighted how AWS is helping companies of all sizes, at every stage of their drug discovery journey, with the most comprehensive set of AI/ML services, infrastructure, and implementation resources. But it was important for us to ensure that everything presented during the sessions was simplified and made actionable for our customers. To that end, we hosted a dedicated Ask-the-expert networking expo after the sessions, where attendees met with AWS experts to view in-depth demos of our offerings and ask any clarifying questions.

One of the biggest challenges that life sciences organizations need to overcome to effectively pursue machine learning is harnessing their data. This is why we also hosted a parallel track at the symposium on ‘unlocking access to and insights from data’, where we presented a range of purpose-built services and solutions for efficient storage, analysis, cataloging, and sharing of multi-modal data. To read the highlights from the data track, click here.

Life sciences researchers were at the forefront of cloud adoption when we launched AWS 17 years ago. Today, 9 out of the top 10 pharma companies use AWS for data analytics and machine learning. Witnessing the industry leaders unite at our Symposium was inspiring—it was a true depiction of what we call ‘the art of the possible’.

To learn more about our comprehensive set of life sciences offerings, visit the AWS for Life Sciences website. If you’d like to schedule a one-on-one with an AWS specialist to discuss your specific business needs, please reach out to us here.

Lisa McFerrin

Lisa McFerrin is the WW Lead for HCLS Strategy & Solutions for Research, Discovery, and Translational Medicine at AWS. Lisa has a background in math and computer science and a PhD in Bioinformatics, with over 15 years of experience in software and methods that bridge biomedical data to advance the understanding of cancer biology and improve patient care. She is dedicated to lowering barriers in data analysis to facilitate collaborative and reproducible research.

Ujjwal Ratan

Ujjwal Ratan is a Principal Machine Learning Specialist in the Global Healthcare and Lifesciences team at Amazon Web Services. He works on the application of machine learning and deep learning to real-world industry problems like medical imaging, unstructured clinical text, genomics, precision medicine, clinical trials, and quality of care improvement. He has expertise in scaling machine learning/deep learning algorithms on the AWS cloud for accelerated training and inference. In his free time, he enjoys listening to (and playing) music and taking unplanned road trips with his family.