Virtual Screening of Novel Active Drug Compounds on AWS with Orion®

This article was contributed by Dr. Matt Geballe, Vice President of Product, OpenEye Scientific, and Dr. Nihit Pokhrel, Partner Solutions Architect for HPC at AWS.

Bringing a drug molecule to market costs about $985 million and takes 12-15 years [1, 2].

One of the reasons for such a lengthy process is the need to identify molecules that satisfy many chemical properties, like solubility, toxicity, bioavailability, selectivity, and potency. As a result of this complexity, a drug molecule in Phase 1 of a clinical trial only had a 7.9% likelihood of approval from 2011 to 2020 [3]. This massive failure rate is unique to the pharmaceutical industry; no other industry works with such a low rate of success.

Computer-aided drug discovery (CADD) has been a key player in lowering the cost and shortening the timeline of drug development. CADD uses high-performance computing (HPC) resources to virtually screen databases containing billions of molecules. It can speed up the search for potential drug molecules and filter out molecules and compounds that are unsuitable. CADD also allows researchers to investigate the structure-activity relationship between molecules much faster than experimental screening methods.

To run such large computations within a reasonable timeframe, we need powerful computing resources that scale. Cloud-based HPC allows us to run a large number of drug screening simulations in a matter of hours or days. These simulations would take months if performed on traditional on-premises HPC platforms [4]. CADD on Amazon Web Services (AWS) infrastructure, with its wide variety of computing resources, can scale to hundreds of thousands of processors without any investment in costly and resource-intensive on-premises servers.

Computational scientists are often limited by available resources. These limits make it challenging to perform the necessary scientific calculations, which have varying computing requirements and time scales. The calculations range from large-scale modeling, such as molecular dynamics simulations, all the way down to single-molecule computations involving quantum mechanics. Researchers therefore need access to an on-demand computing cluster, as well as a better platform for data sharing and collaboration.

With these challenges in mind, OpenEye Scientific developed Orion®, a cloud-based molecular design platform for CADD. Orion provides computational chemists with virtually unlimited HPC resources, along with data visualization, collaboration, and workflow management tools that help them perform calculations more efficiently.

Orion is the only cloud-native molecular design platform for pharmaceutical and biotechnology companies, and is powered by Amazon Web Services. The platform can scale up to hundreds of thousands of CPUs and GPUs as needed. This increases both the quality and diversity of generated hits for drug discovery projects. It also accelerates drug development timelines significantly, with the time saved depending on the drug being developed. In other words, Orion combined with AWS HPC infrastructure allows users to solve complex problems without worrying about cost or capacity.

Figure 1: An example workflow running in Orion

At the core of Orion is the Orion scheduler. The scheduler manages all ongoing compute tasks and uses Amazon Elastic Compute Cloud (Amazon EC2) instances for compute resources. When jobs are submitted on Orion, either via the UI or by API, the Orion scheduler monitors the computational capacity needed for each workload, estimating the requirements from the rate of work completed. The scheduler then uses AWS Auto Scaling to automatically recruit and return resources as needed, based on the compute requirements of the workflow and the scale of the calculation (see Figure 1). Users can also significantly reduce the cost of their workload by running their chemistry simulations on Spot Instances, which lets them benefit from unused EC2 capacity in AWS. The resulting data is stored on Amazon S3 and Amazon Aurora, and Orion users can analyze their stored results further as needed (see Figure 2). Orion is highly available because the infrastructure is deployed across two Availability Zones. Orion relieves customers of the need to manage the specifics of availability and diverse compute resources themselves.

Figure 2: High-level architecture of Orion
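
The idea of sizing a compute fleet from the rate of work completed can be sketched in a few lines of Python. This is an illustrative sketch only, not Orion's actual scheduling logic: the function names, the constant per-worker throughput, the target completion time, and the worker ceiling are all assumptions made for this example.

```python
import math


def workers_needed(remaining_items: int,
                   items_per_worker_hour: float,
                   target_hours: float,
                   max_workers: int) -> int:
    """Estimate the worker count needed to finish within target_hours.

    All parameters are hypothetical. A real scheduler would measure the
    throughput from completed work rather than take it as a constant.
    """
    if remaining_items <= 0:
        return 0
    if items_per_worker_hour <= 0 or target_hours <= 0:
        raise ValueError("throughput and target_hours must be positive")
    required = math.ceil(remaining_items / (items_per_worker_hour * target_hours))
    # Never request more than the configured ceiling.
    return min(required, max_workers)


def scaling_delta(current_workers: int, desired_workers: int) -> int:
    """Positive result: add workers. Negative result: release workers."""
    return desired_workers - current_workers


if __name__ == "__main__":
    # Hypothetical job: 1,000,000 molecules left, 500 per worker-hour,
    # and a wish to finish in 4 hours with at most 10,000 workers.
    desired = workers_needed(1_000_000, 500.0, 4.0, 10_000)
    print(desired, scaling_delta(120, desired))
```

Re-running such an estimate as the measured throughput changes is what would let a fleet grow while work is plentiful and shrink as a job winds down.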

Users gain the following benefits from running their chemistry workloads on Orion:

Elastic and Flexible Compute Resources: Computations in Orion can take advantage of the wide variety of hardware available on AWS, including both CPUs and GPUs. Within the same calculation, users can also switch between On-Demand and Spot Instances, balancing cost savings against time-to-result. Users can define the hardware requirements for different parts of the same workload. The scheduler automatically places this work onto diverse hardware and scales up resources when applicable. When worker instances are no longer needed, the scheduler turns them off.

Cost Awareness: The Orion scheduler is aware of the cost of all the compute resources it controls. It automatically chooses the lowest-cost hardware that satisfies the requirements of the calculation. This includes factoring in Spot versus On-Demand Instances, hardware density, and the scale of the workload. The scheduler allows a user to monitor the cost of a job in near-real-time while the job runs. It also supports notifications and cancellation thresholds to prevent accidental cost overruns.

Automated Management of Complex Scenarios: Orion supports many complex workloads. This includes support for cycles within a workflow and automated throttling down when failures are detected in calculation code. It also includes fair assignment of resources, so that smaller, shorter interactive calculations can run alongside multiple massive jobs, such as a Gigadock™ calculation, which may use hundreds of thousands of CPUs in aggregate.
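
The cost-awareness benefit above can be illustrated with a small Python sketch. It is not Orion's scheduler: the `InstanceOption` type, the option names, and the prices are all hypothetical, and ranking by price per vCPU is only one plausible way to account for hardware density.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class InstanceOption:
    """A hypothetical hardware choice; names and prices are made up."""
    name: str
    vcpus: int
    memory_gib: float
    gpus: int
    hourly_price_usd: float
    is_spot: bool


def cheapest_option(options: Sequence[InstanceOption],
                    min_vcpus: int,
                    min_memory_gib: float,
                    min_gpus: int = 0,
                    allow_spot: bool = True) -> Optional[InstanceOption]:
    """Return the eligible option with the lowest price per vCPU, or None."""
    eligible = [
        o for o in options
        if o.vcpus >= min_vcpus
        and o.memory_gib >= min_memory_gib
        and o.gpus >= min_gpus
        and (allow_spot or not o.is_spot)
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda o: o.hourly_price_usd / o.vcpus)


def budget_action(spent_usd: float, notify_at_usd: float,
                  cancel_at_usd: float) -> str:
    """Map the spend so far to 'continue', 'notify', or 'cancel'."""
    if spent_usd >= cancel_at_usd:
        return "cancel"
    if spent_usd >= notify_at_usd:
        return "notify"
    return "continue"


# Hypothetical catalogue used only for illustration.
EXAMPLE_OPTIONS = [
    InstanceOption("cpu-small-on-demand", 4, 16.0, 0, 0.20, False),
    InstanceOption("cpu-small-spot", 4, 16.0, 0, 0.06, True),
    InstanceOption("cpu-large-spot", 32, 128.0, 0, 0.40, True),
    InstanceOption("gpu-on-demand", 8, 64.0, 1, 1.00, False),
]

if __name__ == "__main__":
    choice = cheapest_option(EXAMPLE_OPTIONS, min_vcpus=4, min_memory_gib=8.0)
    print(choice.name if choice else "no eligible hardware")
    print(budget_action(spent_usd=85.0, notify_at_usd=80.0, cancel_at_usd=100.0))
```

Setting `allow_spot=False` models a step that cannot tolerate interruption, which is one way to express the trade-off between cost savings and time-to-result.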

One application that can benefit from Orion and AWS HPC infrastructure is molecular docking, a structure-based drug design (SBDD) approach. SBDD methods involve using the 3D structure of the target protein for screening potential hits. Molecular docking studies the drug-target interaction at an atomic level. This is done by first predicting and then scoring the binding interactions between the two. This approach helps drug development scientists screen large libraries of molecules and compounds much more efficiently than other screening methods, which leads to more precise drug development and lead optimization.
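
The final step of such a screen, keeping only the best-scoring molecules from a large library, can be sketched as follows. This is a minimal illustration and not the Gigadock implementation: the molecule identifiers and scores are invented, and the sketch assumes that a lower (more negative) score means better predicted binding, a convention that varies between docking tools.

```python
import heapq
from typing import Iterable, List, Tuple

# (molecule identifier, docking score); both are hypothetical values.
ScoredMolecule = Tuple[str, float]


def top_hits(scored: Iterable[ScoredMolecule], n: int) -> List[ScoredMolecule]:
    """Return the n best-scoring molecules, best first.

    Assumes lower scores are better. heapq.nsmallest keeps only n
    candidates in memory, so `scored` can be a generator over a library
    too large to hold in memory at once.
    """
    return heapq.nsmallest(n, scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    library = [("mol-a", -7.2), ("mol-b", -9.1), ("mol-c", -4.0), ("mol-d", -8.5)]
    for name, score in top_hits(library, 2):
        print(name, score)
```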

OpenEye Scientific’s Gigadock™ functionality on the Orion platform takes molecular docking to the cloud. Through the power of AWS HPC infrastructure, users can take advantage of tens of thousands of CPUs to search prepared vendor libraries. They can screen purchasable chemical compounds against the 3D structure of the target protein to find potential hits. With the speed of AWS, these large-scale virtual screening searches can be accomplished in just a few hours, saving significant time and cost.

Beacon Discovery, a Eurofins drug discovery company, recently learned how OpenEye’s Gigadock functionality could help them validate docking approaches. They were also able to validate shape and chemical features to identify two novel chemical entities and more than 30 potent hits for known G-Protein Coupled Receptor (GPCR) targets.


Orion built on AWS can expedite drug discovery. Researchers can focus on the science, while Orion provides the necessary infrastructure to run the CADD workload. The platform levels the playing field for small and medium-sized drug development firms. It does so by providing resources previously available only to the largest of pharmaceutical and biotechnology companies.

If you would like to evaluate Orion for yourself and take advantage of the platform’s Gigadock™ functionality for faster hit discovery, request an evaluation by filling out the form. Someone will be in touch with you to discuss how Orion can help solve your drug discovery challenges. You can learn more about OpenEye Scientific in our AWS case study.

The content and opinions in this blog are those of the third-party author and AWS is not responsible for the content or accuracy of this blog.


  1. “Estimated Research and Development Investment Needed to Bring a New Medicine to Market, 2009-2018” Wouters, et al. Journal of the American Medical Association, March 2020.
  2. “FDA Drug Approval Process”, April 2020.
  3. “How long does it take to get a drug approved?” BIO, J.P. Carroll, February 2021.
  4. “Accelerating Drug Discovery with Supercomputers”, Pharmaceutical Executive, June 2020.


Nihit Pokhrel

Nihit Pokhrel is a Partner Solutions Architect at Amazon Web Services, working with HPC and Quantum Computing partners to help them build well-architected solutions. Her background is in computational chemistry, with a focus on molecular dynamics. Nihit specializes in HPC for the Life Sciences industry.