Pfizer boosts bioreactor efficiency with AWS industrial edge services

Pfizer, one of the world’s premier biopharmaceutical companies, uses machine learning (ML) and artificial intelligence (AI) for near-real-time monitoring of mammalian cell culture bioreactors to boost batch yield and reduce the risk of contamination. Using Amazon Web Services (AWS), the company developed Manufacturing Intelligence Edge (MI Edge), a platform that uses AI and ML for continuous monitoring of bioreactors at its global manufacturing sites. MI Edge increased the frequency of measurements from one sample per day to near-real-time monitoring every few seconds. This improved frequency helps operators to adjust parameters as needed throughout the batch, resulting in greater yield, shorter cycle times, and reduced risk of contamination, delivering more medicine for patients, faster.

This post covers the use case, services, architecture, and sample code implemented for the initial proof of concept (PoC). The steps included are preparing the model, creating an inferencing function, creating a bridge component, and deploying components to the edge.

Background

Bioreactors grow desirable cell cultures in an ideal environment to optimize production. Operators monitor growth by physically sampling the medium and analyzing the sample. The results of the analysis tell the operator whether to adjust inputs, like nutrient feed rates, to maintain optimal conditions for the cells. Though essential, sampling can introduce undesirable cells into the bioreactor, contaminating the batch and causing losses. To minimize this risk, sampling is done only once every 24 hours. The 24-hour interval reduces opportunities to contaminate a batch, but it also reduces the operator’s visibility into cell culture growth. “Decreasing the time between measurements is key to reducing risk and critical for closed-loop process controls,” says Shawn Mullins, senior director of Digital Manufacturing 4.0 at Pfizer. “This, of course, needs to be done in a compliant and secure manner, adhering to all relevant FDA guidelines.”

Further, Pfizer’s previous platform “lacked scalability, ease of replication, and adaptability to deploy predictive analytics and deep learning frameworks,” says Reza Kamyar, director of global technology and engineering at Pfizer. “With AWS, we can deploy advanced AI/ML solutions at an unparalleled pace to make near-real-time, data-driven decisions. We can also make advanced process control and autonomous production in commercial manufacturing a reality.”

Model architecture

To give operators a tool to monitor in between physical samples and adjust with greater precision, the global technology and engineering (GTE) team at Pfizer used a physics-informed neural network (PINN). This ML model overcomes low data availability of biological processes to predict the growth within a bioreactor from sensor data and process parameters. The PINN model can describe the growth within a bioreactor with high accuracy and helps operators to adjust inputs with greater frequency.
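The idea behind a physics-informed loss can be illustrated with a small sketch. The post does not give Pfizer's model equations, so the logistic growth law, the parameter names (mu, k), and the weighting below are assumptions chosen only for illustration. The loss combines a data term on the few physical samples with a physics term that penalizes predictions violating the assumed growth law between samples.

```python
import math

def logistic_rhs(x, mu, k):
    """Right-hand side of the (assumed) logistic growth ODE dX/dt = mu*X*(1 - X/K)."""
    return mu * x * (1.0 - x / k)

def pinn_loss(pred, times, observations, mu, k, physics_weight=1.0):
    """Return (total, data, physics) loss for predictions on an even time grid.

    pred         -- predicted cell density at each grid time
    times        -- evenly spaced time grid
    observations -- {grid index: measured value} for the sparse physical samples
    """
    # Data term: mean squared error at the sampled points only.
    data = sum((pred[i] - y) ** 2 for i, y in observations.items()) / len(observations)

    # Physics term: ODE residual, with dX/dt estimated by central differences.
    dt = times[1] - times[0]
    residuals = [
        (pred[i + 1] - pred[i - 1]) / (2.0 * dt) - logistic_rhs(pred[i], mu, k)
        for i in range(1, len(pred) - 1)
    ]
    physics = sum(r * r for r in residuals) / len(residuals)
    return data + physics_weight * physics, data, physics

def logistic_solution(t, x0, mu, k):
    """Closed-form logistic curve, used here as a stand-in for a perfect model."""
    return k / (1.0 + (k - x0) / x0 * math.exp(-mu * t))

times = [i * 0.01 for i in range(1001)]
truth = [logistic_solution(t, 0.1, 1.0, 1.0) for t in times]
samples = {0: truth[0], 500: truth[500], 1000: truth[1000]}  # three sparse samples

good_total, _, good_physics = pinn_loss(truth, times, samples, mu=1.0, k=1.0)

# A flat line matches the first sample but ignores the growth law entirely.
flat = [truth[0]] * len(times)
flat_total, _, flat_physics = pinn_loss(flat, times, samples, mu=1.0, k=1.0)
```

In this toy setup, the physics term is what separates a plausible trajectory from one that merely passes near a few measurements, which is why the approach suits processes with sparse sampling.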

MI Edge makeup

Pfizer’s digital MI teams and AWS worked together to build MI Edge, a hybrid, low-latency container platform to run workloads at Pfizer’s manufacturing plants. Workloads like applications, dashboards, and the PINN model are built in AWS and deployed to hardware running at manufacturing sites. Model versions, deployment pipelines, security policies, user access controls, and application logs are all managed in AWS.

MI Edge uses a range of AWS industrial edge services, including AWS IoT Greengrass, an open-source edge runtime and cloud service for building, deploying, and managing device software. AWS IoT Greengrass is flexible and can run on Linux, Windows, or Raspberry Pi OS, with ARM and x86 processors. The runtime facilitates local activation of Docker containers, native operating system processes, custom runtimes within a prebuilt AWS IoT Greengrass component, or functions on AWS Lambda—a serverless, event-driven compute service that lets you run code for virtually any type of application or backend service.

MI Edge manages the AWS IoT Greengrass instances and deploys workloads like models, applications, and dashboards securely and with ease. The platform also uses AWS IoT Core, a service that helps connect billions of IoT devices and route trillions of messages to AWS services without managing infrastructure.

Further, MI Edge uses AWS IoT SiteWise, a service designed to collect, organize, and analyze data from industrial equipment at scale, to process the bioreactor data. AWS IoT SiteWise is the solution’s primary data model-view-controller (MVC) infrastructure. AWS IoT SiteWise provides the ability to:

design data schemas and connect schemas to data streams,
provide OPC UA integration to stream data into schema,
create dashboards to visualize data at the edge and in the cloud, and
configure alarms and actions to respond to received data.

Finally, MI Edge uses Amazon SageMaker, a service designed to build, train, and deploy ML models for any use case with fully managed infrastructure, tools, and workflows. Specifically, MI Edge uses Amazon SageMaker Neo, which helps developers optimize ML models for inference on Amazon SageMaker in the cloud and supported devices at the edge, and Amazon SageMaker Edge Manager, an inference engine for edge devices, to simplify execution and maintenance of the models at the edge.

ML model preparation

To prepare the PINN model to run optimally on AWS IoT Greengrass, Pfizer compiled the model into machine code with Amazon SageMaker Neo. Neo can compile for a specific combination of hardware, software, GPU, and CPU, producing an ML model that runs with optimal performance on a given edge device without loss of accuracy.

The following is an example of a Python code snippet in an Amazon SageMaker Jupyter notebook that takes a constructed model and compiles it specifically for the NVIDIA Jetson Xavier NX platform. When run on the NX platform, the compiled model uses the CUDA cores of the NX GPU:
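A minimal sketch of such a compilation request is shown here, built for the SageMaker CreateCompilationJob API. It is not Pfizer's notebook code. The framework (PyTorch), input shape, job name, role ARN, S3 locations, and the TensorRT and CUDA versions are placeholder assumptions; the versions in particular depend on the JetPack release installed on the device.

```python
import json

def build_neo_compilation_request(job_name, role_arn, model_s3_uri, output_s3_uri,
                                  input_shape, framework="PYTORCH",
                                  target_device="jetson_xavier",
                                  compiler_options=None):
    """Build keyword arguments for SageMaker's CreateCompilationJob API."""
    output_config = {
        "S3OutputLocation": output_s3_uri,
        "TargetDevice": target_device,
    }
    if compiler_options:
        # Jetson targets take GPU, TensorRT, and CUDA details as a JSON string.
        output_config["CompilerOptions"] = json.dumps(compiler_options)
    return {
        "CompilationJobName": job_name,
        "RoleArn": role_arn,
        "InputConfig": {
            "S3Uri": model_s3_uri,
            "DataInputConfig": json.dumps(input_shape),
            "Framework": framework,
        },
        "OutputConfig": output_config,
        "StoppingCondition": {"MaxRuntimeInSeconds": 900},
    }

def submit_compilation_job(request):
    """Submit the job. Requires boto3 and AWS credentials, so it is not called here."""
    import boto3
    return boto3.client("sagemaker").create_compilation_job(**request)

# All values below are hypothetical placeholders.
request = build_neo_compilation_request(
    job_name="pinn-neo-xavier-example",
    role_arn="arn:aws:iam::123456789012:role/ExampleNeoRole",
    model_s3_uri="s3://example-bucket/models/pinn/model.tar.gz",
    output_s3_uri="s3://example-bucket/models/pinn/compiled/",
    input_shape={"input0": [1, 8]},
    compiler_options={"gpu-code": "sm_72", "trt-ver": "7.1.3", "cuda-ver": "10.2"},
)
print(json.dumps(request, indent=2))
```

Keeping the request builder separate from the submit call makes the configuration easy to review and test before any AWS resources are touched.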

Once the ML model is compiled, it is packaged into an AWS IoT Greengrass Component that is ready for deployment:
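As an illustration of what that packaging involves, the sketch below assembles a hand-written AWS IoT Greengrass V2 component recipe for a model artifact. The component name, version, publisher, and S3 URI are hypothetical, and the sketch assumes the compiled model has been repackaged as a ZIP archive. The post does not say how Pfizer generated its recipe, so this is only one possible route.

```python
import json

def model_component_recipe(name, version, artifact_s3_uri):
    """Return a Greengrass V2 recipe (as a dict) that ships a model artifact."""
    return {
        "RecipeFormatVersion": "2020-01-25",
        "ComponentName": name,
        "ComponentVersion": version,
        "ComponentDescription": "Compiled model packaged for AWS IoT Greengrass.",
        "ComponentPublisher": "Example",  # hypothetical publisher name
        "Manifests": [
            {
                "Platform": {"os": "linux"},
                # Greengrass downloads the artifact and unpacks it on the device.
                "Artifacts": [{"URI": artifact_s3_uri, "Unarchive": "ZIP"}],
            }
        ],
    }

def register_component(recipe):
    """Register the recipe. Requires boto3 and AWS credentials, so it is not called here."""
    import boto3
    client = boto3.client("greengrassv2")
    return client.create_component_version(inlineRecipe=json.dumps(recipe).encode("utf-8"))

recipe = model_component_recipe(
    name="com.example.PinnModel",
    version="1.0.0",
    artifact_s3_uri="s3://example-bucket/components/pinn-model-1.0.0.zip",
)
print(json.dumps(recipe, indent=2))
```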

Inference function preparation

The inference function is packaged code that does the following:

retrieves and prepares the input from a data source,
calls the model’s predict() method and passes on the input,
receives the result of the model’s predict() invocation, and
acts based on the model’s prediction.

The following is a snippet of the inferencing function’s main loop, which is triggered whenever input arrives for inferencing. The main loop invokes the compiled ML model’s predict() method through the gRPC interface of Amazon SageMaker Edge Manager. The inference result is then published over MQTT (Message Queuing Telemetry Transport) to AWS IoT Core for review.
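The four steps of the inference function can be sketched as a loop with injected dependencies. This is an illustrative outline, not Pfizer's code: predict stands in for the gRPC call to the Edge Manager agent, publish stands in for an MQTT client, and the topic name, threshold, and alert flag are hypothetical.

```python
import json

def run_inference_loop(inputs, predict, publish, topic, threshold):
    """Process each input: prepare it, predict, act on the result, publish it.

    inputs    -- iterable of raw feature sequences (for example, sensor readings)
    predict   -- callable taking a list of floats and returning a float
    publish   -- callable taking (topic, payload_string)
    threshold -- prediction level above which an alert flag is set
    """
    results = []
    for features in inputs:
        prepared = [float(v) for v in features]      # 1. prepare the input
        prediction = predict(prepared)               # 2-3. invoke the model, receive result
        payload = {                                  # 4. act on the prediction
            "prediction": prediction,
            "alert": prediction > threshold,
        }
        publish(topic, json.dumps(payload))
        results.append(payload)
    return results

# Stand-ins so the sketch runs without a device, model, or broker.
published = []

def fake_predict(values):
    return sum(values) / len(values)

def fake_publish(topic, payload):
    published.append((topic, payload))

results = run_inference_loop(
    inputs=[[1, 2, 3], [7, 8, 9]],
    predict=fake_predict,
    publish=fake_publish,
    topic="example/bioreactor1/inference",
    threshold=5.0,
)
```

Injecting the model call and the publisher keeps the loop testable on a laptop; on the device, the same loop would be wired to the real gRPC stub and MQTT client.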

Bridge component build

One of the requirements of the PoC was getting data from a RESTful web API into AWS IoT SiteWise. The default connector for AWS IoT SiteWise is OPC UA, and the customer data source did not provide OPC UA connectivity. The GTE team created a bridge component to make REST API calls and convert the responses to OPC UA data for consumption by AWS IoT SiteWise Edge, a service designed to collect, process, and monitor industrial equipment data on premises.

The following is an example of the main loop in the bridge component.
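A minimal sketch of such a bridge loop follows. It is an assumption-laden outline rather than the GTE team's implementation: fetch stands in for the REST call, write_node stands in for a write to an OPC UA server variable, and the field names and node identifiers are hypothetical.

```python
import time

def bridge_once(fetch, write_node, tag_map):
    """Copy one REST reading into OPC UA nodes; return how many values were written.

    fetch      -- callable returning a dict parsed from the REST API's JSON response
    write_node -- callable taking (node_id, value) that updates an OPC UA variable
    tag_map    -- {JSON field name: OPC UA node id}
    """
    reading = fetch()
    written = 0
    for field, node_id in tag_map.items():
        if field in reading:                  # skip fields missing from this response
            write_node(node_id, float(reading[field]))
            written += 1
    return written

def run_bridge(fetch, write_node, tag_map, interval_s=5.0, max_cycles=None,
               sleep=time.sleep):
    """Poll the REST source at a fixed interval; max_cycles=None runs indefinitely."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        bridge_once(fetch, write_node, tag_map)
        cycles += 1
        sleep(interval_s)
    return cycles

# Stand-ins so the sketch runs without a REST endpoint or OPC UA server.
writes = []

def fake_fetch():
    return {"ph": 7.1, "temperature_c": 36.8, "status": "ok"}

tag_map = {
    "ph": "ns=2;s=Bioreactor1.pH",
    "temperature_c": "ns=2;s=Bioreactor1.Temperature",
}

cycles = run_bridge(fake_fetch, lambda node, value: writes.append((node, value)),
                    tag_map, interval_s=0.0, max_cycles=2, sleep=lambda _: None)
```

Only fields listed in the tag map are forwarded, so unrelated parts of the REST response (such as the status string above) never reach the OPC UA side.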

Both the inference function and the bridge are packaged into an AWS IoT Greengrass component, like the PINN model. Below, you will see the inference function and model components ready for deployment to an AWS IoT Greengrass device.

Component deployment

AWS IoT Greengrass deploys components to target devices with a deployment task. Tasks can target a single AWS IoT Greengrass device or a group of them.

To deploy a component, define the components and the configurations to apply to the AWS IoT Greengrass device.
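A deployment definition of this kind can be sketched as a request for the AWS IoT Greengrass V2 CreateDeployment API. The component names, versions, configuration values, and target ARN below are hypothetical placeholders, not the ones Pfizer used.

```python
import json

def build_deployment_request(target_arn, deployment_name, components):
    """Build keyword arguments for the Greengrass V2 CreateDeployment API.

    components -- {component name: (version, config dict or None)}
    """
    spec = {}
    for name, (version, config) in components.items():
        entry = {"componentVersion": version}
        if config:
            # Configuration updates are passed as a JSON string to merge on the device.
            entry["configurationUpdate"] = {"merge": json.dumps(config)}
        spec[name] = entry
    return {
        "targetArn": target_arn,
        "deploymentName": deployment_name,
        "components": spec,
    }

def submit_deployment(request):
    """Create the deployment. Requires boto3 and AWS credentials, so it is not called here."""
    import boto3
    return boto3.client("greengrassv2").create_deployment(**request)

request = build_deployment_request(
    target_arn="arn:aws:iot:us-east-1:123456789012:thinggroup/ExampleBioreactorGroup",
    deployment_name="example-pinn-deployment",
    components={
        "com.example.PinnModel": ("1.0.0", None),
        "com.example.PinnInference": ("1.0.0", {"topic": "example/bioreactor1/inference"}),
        "com.example.RestOpcUaBridge": ("1.0.0", {"pollIntervalSeconds": 5}),
    },
)
print(json.dumps(request, indent=2))
```

Targeting a thing group rather than a single device means any core device later added to the group receives the same components.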

A deployment is created, and the custom components (the compiled PINN model, the inference function, and the bridge component) are deployed to the AWS IoT Greengrass device. The device is now ready to run inferencing tasks natively on its hardware.

Conclusion

In this blog post, we have described how Pfizer’s Digital MI and GTE teams are using AWS industrial edge and AI/ML services to boost the efficiency of production bioreactors. Through the MI Edge platform, Pfizer can monitor its bioreactors in near real time and adjust inputs as needed. This ability will result in increased yield, shorter cycle times, and reduced risk of contamination.

“The MI Edge capability has helped bring our vision for data and advanced analytics to life,” says Mike Tomasco, vice president of Digital Manufacturing at Pfizer. “We are well past the dreaded pilot purgatory phase with MI Edge, and we are driving toward full-scale global rollouts of these capabilities.”

Pfizer continues to explore ways to improve its manufacturing processes using AWS industrial edge services: expanding the use of ML models to optimize active pharmaceutical ingredient (API) manufacturing processes, using AI to predict and prevent equipment failures, and optimizing energy consumption at manufacturing sites.

Pfizer is confident that by continuing to use AWS industrial edge services, it will be able to further improve its manufacturing processes and deliver more medicine to patients faster.

AWS for Industries