AWS for Industries
MSD Builds HawkAVI Platform on AWS to Reduce Product Reject Rates
Blog is guest authored by Nitin Kaul and Guglielmo Iozzia of Merck & Co., Inc.
Improving the productivity of manufacturing operations is vital for pharmaceutical companies so that patients can get the critical medicines they need faster. While patient safety has always been the highest priority, Merck & Co., Inc. (MSD), a leading global biopharmaceutical company, wanted to explore ways to reduce overall reject rates and false reject rates in its automatic visual inspection processes for drug products. This was important because a majority of the rejected medicines were actually good products, valuable to both patients and the company. These products were rejected because of process input variations that are difficult to identify in an image-based inspection system.
Using Amazon Web Services (AWS), the company built its HawkAVI platform to address the issue. The platform combines descriptive analytics, artificial intelligence (AI), and machine learning (ML) to generate insights on rejected drug product images. By processing this data, it helps identify the root causes of false rejects and recommends corrective actions to the manufacturing team.
The insights and near real-time data analytics gained from HawkAVI have reduced overall false reject rates by 50% in various product lines. The platform has improved product availability, reduced the number of investigations, and enabled on-time and right-first-time change controls while providing significant time and cost savings to MSD.
Seeking Innovation in Drug Product Inspection Process Using AWS Platform
With a history of 130+ years and operations spread across 60+ countries, MSD’s science is helping tackle some of the world’s greatest health threats. The company’s standard automated visual inspection technology (followed broadly across the industry) comprises 30–50 cameras per manufacturing line, clustered into groups programmed to identify specific defects and operating on a stringent rules-based system. These visual inspection machines identify defects and reject products for a variety of reasons, including surface defects, scratches, cracks, and the presence of foreign particles.
As part of its broader initiative to strengthen its manufacturing processes and boost production yields, MSD wanted to reduce false reject rates. It also wanted to give subject matter experts (SMEs) an easier way to analyze rejected products in depth by unifying the data residing in multiple siloed systems and developing insights from untapped image data.
MSD’s HawkAVI platform, built on AWS, unifies these data silos. Doing so makes it possible to use advanced analytics and machine learning to automatically analyze terabytes of image data per machine and identify historical trends. Teams can then quickly identify and correct issues, saving time and cost by supporting investigations into rejects and speeding up root cause analysis.
“There is no need to fish for data or spend weeks in collating it. Previously, SMEs would have to get this data from multiple systems, figure out a way to contextualize it, and then analyze the data to identify issues before addressing them. We’ve eliminated those non-value-add activities with this platform,” says Nitin Kaul, Architect and Technical Lead for HawkAVI Platform at MSD. “At a leadership level, it has also helped the site and global senior leadership team to get a holistic view into the reject rates and trends across products, sites and manufacturing lines, which was not possible previously.”
Outside of providing a longitudinal view into the trends over time and identifying defects, the platform also helps to identify images, across multiple cameras, associated with a unique defect for which a specific product was rejected. This further helps in tuning the machines and adjusting the rules that oversee the visual inspection machines. The platform can learn and provide insights to further optimize future processes. In recognition of its impact, The Manufacturing Leadership Council awarded the platform the 2021 Manufacturing Leadership Award in the AI/ML category.
Reducing Product Reject Rates Using AWS Analytics and Machine Learning Services
The HawkAVI platform supports three product teams involved in distinct workstreams to deliver value.
The first workstream brings together structured process data from upstream manufacturing systems and visual inspection machines. It helps to uncover trends for product rejects at site, machine, line, and camera level granularity, within and across manufacturing batches, to illustrate holistic batch performance characteristics.
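The roll-up described above can be illustrated with a minimal sketch. The record layout, field names, and sample values here are hypothetical and are not taken from the HawkAVI data model. The sketch assumes each row carries the reject count attributed to one camera for one batch, plus the number of units the machine inspected in that batch, and that each rejected unit is attributed to a single camera.

```python
from collections import defaultdict

# Hypothetical rows: one per camera per batch. "units" is the number of units
# the machine inspected in that batch and is repeated on every camera row.
ROWS = [
    {"site": "S1", "line": "L1", "machine": "M1", "camera": "C01", "batch": "B100", "units": 1000, "rejected": 20},
    {"site": "S1", "line": "L1", "machine": "M1", "camera": "C02", "batch": "B100", "units": 1000, "rejected": 10},
    {"site": "S1", "line": "L1", "machine": "M1", "camera": "C01", "batch": "B101", "units": 2000, "rejected": 30},
    {"site": "S1", "line": "L2", "machine": "M2", "camera": "C01", "batch": "B200", "units": 500, "rejected": 5},
]


def reject_rates(rows, keys):
    """Return {group: rejected / inspected} for groups defined by `keys`.

    Inspected units are counted once per (site, line, machine, batch) inside
    each group, because every camera on a machine sees the same units.
    """
    rejected = defaultdict(int)
    inspected = defaultdict(dict)  # group -> {(site, line, machine, batch): units}
    for row in rows:
        group = tuple(row[k] for k in keys)
        rejected[group] += row["rejected"]
        run = (row["site"], row["line"], row["machine"], row["batch"])
        inspected[group][run] = row["units"]
    return {g: rejected[g] / sum(inspected[g].values()) for g in rejected}


if __name__ == "__main__":
    # The same function serves every granularity by changing the key tuple.
    for level in [("site",), ("site", "line", "machine"), ("machine", "camera"), ("batch",)]:
        print(level, reject_rates(ROWS, level))
```

Keeping the grouping keys as a parameter means the site, line, machine, camera, and batch views all come from one aggregation routine rather than separate queries.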
The framework used for this workstream is AWS Glue, a serverless data integration service that makes it straightforward to discover, prepare, migrate, and integrate data from multiple sources for analytics, ML, and application development. The data gathered and curated using AWS Glue is then stored in Amazon Simple Storage Service (Amazon S3), an object storage service that is scalable and secure with cost-effective storage classes.
The platform also uses a custom AWS IoT Greengrass connector to integrate with the OSI PI manufacturing process data historian, which feeds data through Amazon Kinesis. Amazon Kinesis enables data to be processed and analyzed as it arrives. AWS Lambda (Lambda), which runs code without provisioning or managing infrastructure, tracks and transforms the data in near real-time. When it detects that a manufacturing batch is complete, it launches AWS Glue-based pipelines to perform the computations.
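As a rough sketch of the batch-completion trigger, a Kinesis-invoked Lambda handler might look like the following. The job name, the payload fields (`batch_id`, `status`), and the `"COMPLETE"` convention are assumptions made for illustration; the article does not describe how completion is actually signaled. The Glue client is injectable so the logic can be exercised without AWS access.

```python
import base64
import json

GLUE_JOB_NAME = "hawkavi-batch-metrics"  # hypothetical Glue job name


def decode_kinesis_records(event):
    """Yield JSON payloads from a Kinesis-triggered Lambda event.

    Kinesis delivers each record's data base64-encoded under
    Records[i]["kinesis"]["data"].
    """
    for record in event.get("Records", []):
        raw = base64.b64decode(record["kinesis"]["data"])
        yield json.loads(raw)


def handler(event, context, glue_client=None):
    """Start one Glue job run for every batch-completion message in the event."""
    if glue_client is None:
        import boto3  # imported lazily so the sketch runs locally with a stub

        glue_client = boto3.client("glue")

    started = []
    for payload in decode_kinesis_records(event):
        # Assumed convention: the historian feed marks a finished batch
        # with status == "COMPLETE".
        if payload.get("status") != "COMPLETE":
            continue
        response = glue_client.start_job_run(
            JobName=GLUE_JOB_NAME,
            Arguments={"--batch_id": payload["batch_id"]},
        )
        started.append(response["JobRunId"])
    return {"started_job_runs": started}
```

In production a handler like this would also need to guard against duplicate completion messages, since Kinesis can redeliver records.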
Amazon Redshift, a fast and widely used cloud data warehouse, provides a harmonized product view with visualization dashboards for reporting reject trends from a process data standpoint. These visualizations help the company to iteratively improve the product rejection process. “Being able to analyze and understand the batch characteristics, performance metrics, product genealogy and granular reject information for ongoing manufacturing batches and use the insights generated to plan and optimize subsequent batches has been a big success already,” says Kaul.
The second workstream is dedicated to working with visual inspection machine vendors to enable image capture capability and develop a centralized framework for image capture, ingestion, and processing in the cloud. MSD uses AWS DataSync, a service that facilitates and accelerates secure data migrations. Image data from inspection machines is collected in site image servers and synchronized to S3 buckets in the cloud using AWS DataSync edge agents, leveraging edge site-specific configurations.
“The movement of volume data from edge sites into the cloud was initially a challenge for us,” says Guglielmo Iozzia, Data Science lead for the HawkAVI platform. “Luckily, using AWS DataSync, we can throttle ingestion bandwidth per site requirements, leverage scalable data filtering, use built-in data integrity validation and benefit from end-to-end compression and encryption—enabling us to focus on reusability, scalability, and overall performance of platform.”
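The per-site settings mentioned in the quote map onto options that AWS DataSync exposes when a task is created. The sketch below only builds the keyword arguments for such a task; the site dictionary, its field names, the ARNs, and the file patterns are placeholders invented for illustration, not MSD's configuration.

```python
# Hypothetical per-site settings; the ARNs are placeholders, not real resources.
EXAMPLE_SITE = {
    "name": "site-a",
    "source_location_arn": "arn:aws:datasync:us-east-1:111122223333:location/loc-source-placeholder",
    "destination_location_arn": "arn:aws:datasync:us-east-1:111122223333:location/loc-dest-placeholder",
    "max_mbps": 80,  # bandwidth cap agreed with the site, in megabits per second
    "include_patterns": ["*.png", "*.bmp"],
}


def datasync_task_kwargs(site):
    """Build keyword arguments for a DataSync CreateTask call for one site."""
    return {
        "Name": f"hawkavi-images-{site['name']}",
        "SourceLocationArn": site["source_location_arn"],
        "DestinationLocationArn": site["destination_location_arn"],
        "Options": {
            # DataSync expresses the bandwidth limit in bytes per second.
            "BytesPerSecond": site["max_mbps"] * 1_000_000 // 8,
            # Verify the integrity of the files that were transferred.
            "VerifyMode": "ONLY_FILES_TRANSFERRED",
        },
        # Include filters take a single string of patterns separated by "|".
        "Includes": [
            {"FilterType": "SIMPLE_PATTERN", "Value": "|".join(site["include_patterns"])}
        ],
    }


if __name__ == "__main__":
    print(datasync_task_kwargs(EXAMPLE_SITE))
```

With credentials in place, the resulting dictionary could be passed to `boto3.client("datasync").create_task(**kwargs)`. Keeping the site-specific values in data rather than code is one way to reuse the same task definition across sites.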
The third workstream develops AI and ML tools and services to gain insights from the terabytes of defect image data. Once image data lands in Amazon S3 buckets, it triggers a Lambda-based pre-processing pipeline that leverages Amazon Simple Queue Service (Amazon SQS). This workflow handles image conversion, dynamic metadata extraction, static metadata association, and image data entry in Amazon DynamoDB (DynamoDB) tables. DynamoDB, a fast and flexible NoSQL database service, holds the core inspection data tables and also stores all static metadata associated with sites, lines, machines, and camera groups.
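A minimal sketch of the metadata steps in that pre-processing workflow follows, assuming S3 event notifications are delivered through SQS to Lambda. The object key layout, the table name, the static metadata values, and the item attribute names are all hypothetical. Image conversion is omitted.

```python
import json
from urllib.parse import unquote_plus

TABLE_NAME = "hawkavi-inspection-images"  # hypothetical table name

# Hypothetical static metadata keyed by (site, line, machine, camera_group).
STATIC_METADATA = {
    ("site-a", "line-1", "avi-01", "cg-03"): {"defect_focus": "particles", "camera_count": 4},
}


def parse_key(key):
    """Extract dynamic metadata from an assumed key layout:
    <site>/<line>/<machine>/<camera_group>/<batch>/<file>.
    Raises ValueError if the key does not have exactly six segments.
    """
    site, line, machine, camera_group, batch, filename = key.split("/")
    return {"site": site, "line": line, "machine": machine,
            "camera_group": camera_group, "batch": batch, "filename": filename}


def build_items(sqs_event):
    """Turn S3 notifications wrapped in SQS messages into DynamoDB items."""
    items = []
    for message in sqs_event.get("Records", []):
        s3_event = json.loads(message["body"])
        for rec in s3_event.get("Records", []):
            bucket = rec["s3"]["bucket"]["name"]
            key = unquote_plus(rec["s3"]["object"]["key"])  # keys arrive URL-encoded
            meta = parse_key(key)
            lookup = (meta["site"], meta["line"], meta["machine"], meta["camera_group"])
            items.append({
                "image_id": f"{bucket}/{key}",
                "size_bytes": rec["s3"]["object"]["size"],
                **meta,
                **STATIC_METADATA.get(lookup, {}),
            })
    return items


def handler(event, context, table=None):
    """Lambda entry point: write one item per new image."""
    if table is None:
        import boto3  # imported lazily so the sketch runs locally with a stub

        table = boto3.resource("dynamodb").Table(TABLE_NAME)
    items = build_items(event)
    for item in items:
        table.put_item(Item=item)
    return {"written": len(items)}
```

Separating `build_items` from the write step keeps the parsing logic testable without a DynamoDB table.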
Once processed, the images flow through a Lambda-based inference pipeline that leverages Amazon SageMaker (SageMaker) to automate the extraction of insights using machine learning. As a result, the platform can search for defect images across terabytes of image data, categorize and label images, and surface trends based on model inference. SageMaker also helps the teams build, train, and deploy machine learning models with fully managed infrastructure, tools, and workflows.
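One way a Lambda function can call a SageMaker real-time endpoint is through the `sagemaker-runtime` client's `invoke_endpoint` call, sketched below. The endpoint name, the label set, and the response format (a JSON list of class probabilities) are assumptions; the article does not describe MSD's models or their outputs.

```python
import json

ENDPOINT_NAME = "hawkavi-defect-classifier"  # hypothetical endpoint name
LABELS = ["particle", "crack", "scratch", "no_defect"]  # hypothetical classes


def classify_image(image_bytes, runtime_client=None):
    """Send one image to a SageMaker endpoint and return the top label.

    Assumes the endpoint accepts raw image bytes and responds with a JSON
    list of probabilities, one per entry in LABELS.
    """
    if runtime_client is None:
        import boto3  # imported lazily so the sketch runs locally with a stub

        runtime_client = boto3.client("sagemaker-runtime")

    response = runtime_client.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/x-image",
        Body=image_bytes,
    )
    scores = json.loads(response["Body"].read())
    best = max(range(len(scores)), key=scores.__getitem__)
    return {"label": LABELS[best], "confidence": scores[best]}
```

The returned label and confidence could then be stored alongside the image record so that downstream search and trend views can filter on them.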
Amazon OpenSearch Service indexes are kept in sync with the DynamoDB tables to support interactive querying from consumption apps. The platform also uses other SageMaker capabilities to enable MLOps and model lifecycle management.
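The article does not say how the indexes are synchronized. One common pattern, assumed here purely for illustration, is to consume DynamoDB Streams records and translate them into an OpenSearch bulk request body. The index name and key attribute below are hypothetical, and the stream is assumed to be configured to include new item images.

```python
import json

INDEX_NAME = "inspection-images"  # hypothetical index name


def from_dynamodb(attr):
    """Convert one DynamoDB-typed attribute, such as {"S": "x"}, to a plain value."""
    (type_tag, value), = attr.items()
    if type_tag == "S":
        return value
    if type_tag == "N":
        try:
            return int(value)
        except ValueError:
            return float(value)
    if type_tag == "BOOL":
        return value
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [from_dynamodb(v) for v in value]
    if type_tag == "M":
        return {k: from_dynamodb(v) for k, v in value.items()}
    raise ValueError(f"unsupported DynamoDB type: {type_tag}")


def to_bulk_body(stream_event, key_name="image_id"):
    """Build a newline-delimited bulk body from a DynamoDB Streams event.

    INSERT and MODIFY become index actions; REMOVE becomes a delete action.
    """
    lines = []
    for record in stream_event.get("Records", []):
        doc_id = from_dynamodb(record["dynamodb"]["Keys"][key_name])
        if record["eventName"] == "REMOVE":
            lines.append(json.dumps({"delete": {"_index": INDEX_NAME, "_id": doc_id}}))
            continue
        doc = {k: from_dynamodb(v) for k, v in record["dynamodb"]["NewImage"].items()}
        lines.append(json.dumps({"index": {"_index": INDEX_NAME, "_id": doc_id}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""
```

The resulting string would be sent to the domain's `_bulk` endpoint. Using the table key as the document `_id` makes repeated deliveries of the same stream record overwrite rather than duplicate documents.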
By capturing the image data and model inference data as soon as images land in the cloud, MSD’s teams can perform root cause analytics in near real-time, rather than spending weeks just to access the data. The automated nature of the HawkAVI platform means that there is a tremendous reduction in cycle time—resulting in cost savings. “In various cases, we have been able to reduce product reject rates by around 50%, thanks to the new capabilities of the HawkAVI platform,” says Kaul.
The platform also makes it possible for teams to collaborate on the analysis and review of inspection data. “With model inference results and other insights accessible in context, the SMEs can leverage consumption apps, like Image Query. They can search for different defect category images across multiple batches, identify image sets, download images, and metadata in order to support inspection machine tuning to optimize existing inspection processes. This further reduces rejection rates, which leads to more lifesaving products available,” adds Iozzia.
Improving Analytics Using AWS
With the HawkAVI platform in place, MSD is striving towards a broader goal of reducing the reject rates. The company is also continuing to onboard different sites and product lines to HawkAVI and would like to further deploy deep learning ML models in the future. “Any new facility coming up has funding secured to automatically onboard on the HawkAVI platform. We are also exploring whether we can deploy deep learning models at the edge within the AVI machines, and use ML models in GxP scenarios,” says Iozzia.
The platform showcases MSD’s firm commitment towards organization-wide cloud modernization initiatives. “We are invested in developing all new products and modern applications in the cloud,” says Kaul. “And AWS is our cloud provider of choice in our mission of using the power of leading-edge science to save and improve lives around the world,” he concludes.
Nitin Kaul works as an Associate Director of IT Architecture at MSD and is responsible for strategy, architecture and development of AI/ML products and services within the manufacturing domain. At MSD, Nitin has led the technical vision, architecture and development of strategic initiatives like ML/AI-based drug inspection, Global Predictive Maintenance Platform solutions, Drug Yield improvement analytics, Drug Cold Chain analytics and traceability, IoT Architecture and Scale-Out computing. Nitin also provides advisory and technical oversight for major initiatives in the manufacturing/supply chain domain with a focus on Industry 4.0, IoT Frameworks, Microservices Architecture and Containerization, DevOps, MLOps, Technology Innovation and Big Data/Data Lake Analytics. Nitin holds a B.S. in Electronics and System Engineering and an M.S. in Computer Science.

Guglielmo Iozzia joined MSD in early 2019 as part of its Manufacturing Division, and in 2022 he moved to the Applied Mathematics and Modelling Data Science group in the same area. He is a Biomedical Engineer and a lifelong learner with extensive technical and leadership industry experience in Software Engineering and ML/AI applied to different contexts (most recently Biotech Manufacturing, Healthcare and DevOps) at MSD, Optum, IBM and FAO of the UN. He is also an international speaker: his latest talk was in November 2022 at the BioTechX congress in Basel, Switzerland, where he presented part of the work done with Deep Generative Models within the HawkAVI program. He is the author of the technical book Hands-on Deep Learning with Apache Spark.