Using AI to transform drug discovery
Lab rats, fluorescent lights, white coats. This is the scene that pharmaceutical research often conjures up. Yet in reality, today’s advances are increasingly powered by something far more forward-looking: artificial intelligence.
For Celgene, a global biopharmaceutical company based in Summit, New Jersey, the secret sauce to modern drug discovery consists of two subsets of AI: machine learning and deep learning. Machine learning involves computing techniques that analyze vast amounts of data to find insights that would be too abstract or too time-consuming for humans to uncover. Deep learning (a subset of machine learning) takes that even further, using code that attempts to mimic the brain’s ability to recognize patterns in unstructured data. Machine learning can scour historical clinical records to predict patient outcomes. Deep learning can find new patterns of behavior in medical images to help doctors improve decision making.
“Have you ever gone to a doctor and been prescribed a drug that doesn’t work? You try another one—that doesn’t work. Then the next one does work,” says Lance Smith, Celgene’s Associate Director for Global Planning and Technology. “Celgene aims to have you benefit from the drugs that work straightaway. We don’t want you to waste precious time and hard-earned money on prescriptions that will not benefit you. We also want doctors to use their time towards achieving the best patient outcomes.”
In the world of pharmaceuticals, AI is fast becoming an essential tool for companies like Celgene that are looking to innovate and stay competitive. The average commercial drug costs more than $2.5 billion to bring to market and takes 10-plus years to develop. Weeding out non-starters and getting to market quickly is critical not only to the companies developing the drugs, but also to the patients who need them. AI gives pharma firms a shortcut on both fronts: It dramatically boosts researchers’ abilities to find those “Eureka!” moments that are so important to drug discovery.
A New Approach to Drug Development
Diseases like cancer are exceptionally complex, but treatments to date have largely followed traditional vectors. And that’s where machine learning and deep learning are making significant inroads, allowing scientists to think differently about potential therapies and the studies they design in order to analyze them.
“A Celgene researcher may have a concept in mind,” says Smith. “We may have two pieces of data, but we don’t understand the relationship between them. We’re using AI to see if we can figure it out, and possibly find relationships that we hadn’t thought of before.”
In the past, researchers relied on imperfect image-processing algorithms to analyze cancer cells, and then corrected the results by hand. With tens of thousands of cells, this required a huge expenditure of time and effort. But with deep learning, images can be processed almost instantaneously with much better results.
“The challenge is that cancer cells tend to cluster and mix with normal cells in a tumor, and it can be difficult to identify and distinguish normal from tumor cells on a large scale using classic image analysis approaches,” says Pascual Starink, Director of IT at Celgene. “The approach of deep learning is to adopt and mimic what people do. If a researcher looks at a microscope image with labeled cells, they can easily and clearly identify individual cells. What we try to do is teach a neural network to adopt those recognition and decision-making abilities.”
These analyses—which are generally run using the Amazon SageMaker machine-learning platform and the Apache MXNet deep-learning framework—are especially critical for toxicology predictions: virtually analyzing the biological impact of a potential drug without putting live patients (or even lab rats) at risk.
“Our data scientists are not classically trained programmers, but they all program as a tool to make their job happen. SageMaker makes developing AI algorithms easier without having to learn new complex programming languages; it allows them to stay focused on what they do best,” says Starink.
A Prescription for Speed
The other piece of the success puzzle for Celgene is speed. Pharmaceutical research relies heavily on exceedingly complex algorithms that predict how certain compounds will interact with the human body.
To this end, Celgene uses high-performance Amazon EC2 P3 instances powered by NVIDIA Tesla V100 Tensor Core GPUs (graphics processing units) to handle that complexity. These NVIDIA GPUs have thousands of cores that accelerate the training of machine-learning models (which can, for instance, test the effectiveness of a drug faster and more accurately). The results have been game-changing: A model that once took two months to train can now be trained in four hours.
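To put those training times in perspective, here is a rough back-of-the-envelope calculation in Python. It is only a sketch: the article does not say how long "two months" is in hours, so the 60-day figure below is an assumption, and the `speedup` helper is a hypothetical name used for illustration.

```python
def speedup(before_hours: float, after_hours: float) -> float:
    """Return how many times shorter the new training time is."""
    if after_hours <= 0:
        raise ValueError("after_hours must be positive")
    return before_hours / after_hours


# Assumption: "two months" is treated as 60 days of continuous training.
ASSUMED_DAYS_IN_TWO_MONTHS = 60

before = ASSUMED_DAYS_IN_TWO_MONTHS * 24  # 1,440 hours
after = 4                                 # hours, as quoted in the article

factor = speedup(before, after)
print(f"Roughly {factor:.0f}x faster")
```

Under that assumption the quoted improvement works out to roughly a 360-fold reduction in training time. A different reading of "two months" would shift the number somewhat, but not the order of magnitude.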
Innovative machine-learning tools paired with GPU-based compute power are helping companies like Celgene get more creative with their drug development, allowing them to experiment with new therapies that might have been considered too experimental in the past. Instead of worrying about wasting time and chasing false leads, Celgene scientists are coming up with innovations that wouldn’t have been possible before—and in record time.
“We want to make better cures, and we want them faster,” says Smith. “We can design a molecule based on an idea someone had in a dream last night. We couldn’t do any of that in the past, because it would take two months. Now, it takes the same amount of time as a coffee break.”