AWS for Industries

Improving AstraZeneca Japan’s Enterprise Search Capabilities and Regulatory Compliance using Amazon Kendra

Ethical marketing compliance is critically important for pharmaceutical companies operating in Japan to prevent the communication of incorrect or outdated information. In a country known for its strict regulations and emphasis on consumer protection, these companies must adhere to a set of ethical guidelines to maintain trust and safeguard the well-being of patients. AstraZeneca Japan (AZKK) sales representatives face the challenge of searching for the most recent and validated marketing content to engage with their customers in a compliant way. To resolve this challenge, AZKK worked with Amazon Web Services (AWS) on a Proof of Concept (POC) to improve their enterprise search experience and find updated, relevant, and validated content using Amazon Kendra.

Amazon Kendra is an intelligent search service powered by machine learning (ML). It re-imagines enterprise search for your websites and applications so your employees and customers can easily find the content they’re looking for, even when it’s scattered across multiple content repositories within your organization. The service supports semantic search for both semi-structured and unstructured content in multiple languages, including Japanese, and recently became generally available in the AWS Tokyo Region. Japanese semantic search was one of the requirements that led AZKK to select Amazon Kendra for the POC, along with additional product features such as the ability to index and search Microsoft Excel documents. Amazon Kendra also offers the ability to create custom connectors, which can be event driven. This gives AZKK the ability to update Kendra’s index in near real-time, so that up-to-date content is readily available for users to discover.
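As a minimal illustration of how such a semantic search could be issued programmatically, the sketch below calls the Amazon Kendra Query API (via boto3) with a natural-language Japanese query. The index ID and query text are placeholders, not AZKK’s actual configuration.

```python
import boto3

# Placeholder index ID; replace with your own Amazon Kendra index.
KENDRA_INDEX_ID = "00000000-0000-0000-0000-000000000000"

kendra = boto3.client("kendra", region_name="ap-northeast-1")

# Semantic search with a natural-language Japanese query.
response = kendra.query(
    IndexId=KENDRA_INDEX_ID,
    QueryText="最新の製品資料はどこにありますか",  # "Where are the latest product materials?"
    PageSize=5,
)

for item in response["ResultItems"]:
    title = item.get("DocumentTitle", {}).get("Text", "")
    excerpt = item.get("DocumentExcerpt", {}).get("Text", "")
    print(f"{title}\n  {excerpt}\n")
```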

AZKK created an enterprise search solution using Amazon Kendra, Amazon API Gateway, Amazon Simple Notification Service (Amazon SNS), Amazon Simple Queue Service (Amazon SQS), AWS Lambda, Amazon EventBridge EventBus, Amazon EventBridge Pipes, AWS Key Management Service (AWS KMS) customer-managed keys, and Amazon Simple Storage Service (Amazon S3). The solution architecture is described in the following diagram.

All data is encrypted at rest using an AWS KMS customer-managed key.

  1. Whenever a document is created, updated, or deleted in the Box content management solution, the Box webhook service sends a JSON message to API Gateway.
  2. API Gateway forwards the message to an Amazon SQS FIFO (First-In-First-Out) queue using a direct service integration, which removes duplicate messages and preserves the ordering of events for each document.
  3. A Lambda function triggered by messages on the SQS queue processes each message. If the document’s file type is supported by Amazon Kendra, then the file is downloaded from Box into an S3 bucket to be indexed by Amazon Kendra. An indexing request, which also contains an Access Control List (ACL) based on the ACLs in Box, is sent to Amazon Kendra (see the first sketch after this list).
  4. A JSON object containing the document ID, Amazon Kendra Index ID, and Amazon Kendra Data Source ID is forwarded to an SQS FIFO queue to monitor the status of the Amazon Kendra indexing request.
  5. For Box event types of FILE.TRASHED, FILE.DELETED, or FILE.LOCKED (made private in Box), the Lambda function first replaces the document content and document title to prevent the content from being searchable, and then makes a request to Amazon Kendra to remove the document from the index.
  6. Another Lambda function is triggered by the Check Document Index Status SQS queue to poll Amazon Kendra for the result of the indexing request. Once a result indicating success or failure is received, a message is sent to a custom EventBridge bus to alert downstream processes, for logging, or to trigger further analytics if desired (see the second sketch after this list).
  7. An EventBridge rule triggers a Lambda function on either a SUCCEED or FAIL result from checking the index ingestion status to delete the temporary copy of the document from the S3 bucket, so that two copies of the document are not retained.
  8. If there was an issue in Amazon Kendra indexing resulting in a FAIL status, then an EventBridge rule triggers a notification to the team for investigation. For exception handling, Dead Letter Queues (DLQs) are attached to the SQS queues, with the ability to redrive the initial webhook messages if needed and to notify the team for investigation.
  9. If Amazon Kendra has an issue and an index status cannot be retrieved, the message drops to a DLQ, and an EventBridge Pipe modifies the message to indicate an error and passes it to the EventBridge bus.
  10. If there is an issue in processing the received webhook notification, then the message is passed to a DLQ that triggers an Amazon CloudWatch alarm.
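The following is a minimal sketch of what the webhook-processing Lambda function in steps 3 and 5 could look like. The Box download helper (`download_box_file`), the webhook payload fields, and the ACL mapping are simplified placeholders rather than AZKK’s production logic, and the content/title replacement described in step 5 is omitted for brevity; the Amazon Kendra calls use the standard boto3 `batch_put_document` and `batch_delete_document` APIs.

```python
import json
import os

import boto3

kendra = boto3.client("kendra")

INDEX_ID = os.environ["KENDRA_INDEX_ID"]
BUCKET = os.environ["STAGING_BUCKET"]

# Box event types that should remove the document from the index.
REMOVE_EVENTS = {"FILE.TRASHED", "FILE.DELETED", "FILE.LOCKED"}


def download_box_file(file_id: str, bucket: str, key: str) -> None:
    """Placeholder: fetch the file from Box and stage it in S3."""
    raise NotImplementedError


def handler(event, context):
    for record in event["Records"]:
        message = json.loads(record["body"])
        file_id = message["source"]["id"]
        trigger = message["trigger"]  # e.g. FILE.UPLOADED, FILE.TRASHED

        if trigger in REMOVE_EVENTS:
            # Remove the document so it is no longer searchable
            # (the content/title overwrite from step 5 is omitted here).
            kendra.batch_delete_document(
                IndexId=INDEX_ID,
                DocumentIdList=[file_id],
            )
            continue

        # Stage the file in S3 and submit it for indexing with an ACL
        # derived from the document's permissions in Box (mapping not shown).
        key = f"box/{file_id}"
        download_box_file(file_id, BUCKET, key)
        kendra.batch_put_document(
            IndexId=INDEX_ID,
            Documents=[{
                "Id": file_id,
                "Title": message["source"]["name"],
                "S3Path": {"Bucket": BUCKET, "Key": key},
                "AccessControlList": [
                    {"Name": "sales-reps", "Type": "GROUP", "Access": "ALLOW"},
                ],
            }],
        )
```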
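Similarly, the status-checking function in step 6 could be sketched as below, assuming the queue message carries the document and index IDs written in step 4. It polls `batch_get_document_status` and publishes a SUCCEED or FAIL event to the custom EventBridge bus; the event source and field names are illustrative assumptions.

```python
import json
import os

import boto3

kendra = boto3.client("kendra")
events = boto3.client("events")

EVENT_BUS = os.environ["EVENT_BUS_NAME"]

# Map terminal Kendra document statuses to the result published on the bus.
TERMINAL = {
    "INDEXED": "SUCCEED",
    "UPDATED": "SUCCEED",
    "FAILED": "FAIL",
    "UPDATE_FAILED": "FAIL",
}


def handler(event, context):
    for record in event["Records"]:
        message = json.loads(record["body"])
        status = kendra.batch_get_document_status(
            IndexId=message["indexId"],
            DocumentInfoList=[{"DocumentId": message["documentId"]}],
        )["DocumentStatusList"][0]

        result = TERMINAL.get(status["DocumentStatus"])
        if result is None:
            # Still PROCESSING: raise so the message returns to the queue
            # and is retried, or eventually lands in the DLQ (step 9).
            raise RuntimeError("Document still indexing, retry later")

        # Publish the outcome to the custom EventBridge bus so that rules
        # can clean up the S3 copy (step 7) or alert the team (step 8).
        events.put_events(Entries=[{
            "EventBusName": EVENT_BUS,
            "Source": "custom.kendra.index-monitor",  # hypothetical source
            "DetailType": "DocumentIndexStatus",
            "Detail": json.dumps({
                "documentId": message["documentId"],
                "result": result,
                "failureReason": status.get("FailureReason"),
            }),
        }])
```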

The results of the POC showed a two-fold increase in search relevancy, as well as significant improvements in search speed (average rating of 4.4/5 versus 2.4/5) and accessibility (average rating of 4.2/5 versus 2.2/5) as compared to the existing enterprise search solution. 93% of the users recommended that the solution be rolled out to production. Many users commented that they liked the fast and accurate search.

The success of the POC resulted in the solution being rolled out to production at the end of June 2023 to great reception.

“Driving employee productivity with AI is a key foundation to the Digital workforce. This project has delivered 12,000 hours per year of productivity, and it exemplifies the impact that can be delivered by in-house AI engineering talent and close partnership with the AWS team.” – Harsh Gandhi, Head of Information and Digital, IT, AstraZeneca Japan.

AZKK is looking to enhance the solution by adding more data sources and integrating it with Amazon Comprehend to support better content classification. There are plans to evaluate Amazon Lex for chatbot integration together with Amazon Polly to support speech interaction. This can give AZKK’s field force hands-free interaction with Amazon Kendra while on the move.

Amazon Kendra helps organizations like AstraZeneca Japan improve the search experience and unlock employee productivity. With the proliferation of generative AI, it is now possible to generate responses to more complex queries. With the ability to create accurate responses in a timely manner, AZKK is evolving its use case to conversational AI powered by large language models (LLMs) in Retrieval Augmented Generation (RAG) based applications. A RAG architecture reduces generative AI hallucinations by grounding responses in company data rather than relying solely on the LLM. With Amazon Kendra’s Retrieve API (designed specifically for RAG), a comprehensive ecosystem of data source connectors, support for common file formats, and security, you can quickly start building generative AI solutions for enterprise use cases with Amazon Kendra as the retrieval mechanism. AZKK can augment their existing Amazon Kendra based solution with native AWS services, such as Amazon Bedrock and Amazon SageMaker JumpStart, to access a variety of LLMs best suited for their use case.
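As a rough sketch of how a RAG flow could sit on top of the existing index, the example below retrieves passages with the Amazon Kendra Retrieve API and passes them to an LLM through the Amazon Bedrock Converse API. The index ID, model ID, and prompt wording are illustrative assumptions, not AZKK’s production configuration.

```python
import boto3

kendra = boto3.client("kendra", region_name="ap-northeast-1")
bedrock = boto3.client("bedrock-runtime", region_name="ap-northeast-1")

KENDRA_INDEX_ID = "00000000-0000-0000-0000-000000000000"  # placeholder
MODEL_ID = "anthropic.claude-3-haiku-20240307-v1:0"       # example model ID


def answer_question(question: str) -> str:
    # Retrieve semantically relevant passages from the Kendra index.
    passages = kendra.retrieve(
        IndexId=KENDRA_INDEX_ID,
        QueryText=question,
        PageSize=5,
    )["ResultItems"]

    context = "\n\n".join(
        f"{p['DocumentTitle']}:\n{p['Content']}" for p in passages
    )

    # Ground the LLM's answer in the retrieved passages only.
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    response = bedrock.converse(
        modelId=MODEL_ID,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    return response["output"]["message"]["content"][0]["text"]
```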

Thiam Hwa Lim

Thiam Hwa Lim is a Sr. Solutions Architect at AWS supporting customers in the Healthcare & Life Sciences industry. He has 20 years of experience working with customers in the industry and is passionate about using technology to improve people’s lives.

Firaz Akmal

Firaz Akmal is a Sr. Product Manager for Amazon Kendra at AWS. He is a customer advocate, helping customers understand their search and generative AI use-cases with Kendra on AWS. Outside of work Firaz enjoys spending time in the mountains of the PNW or experiencing the world through his daughter’s perspective.

Jarich Vansteenberge

Jarich Vansteenberge is the Director for the Data, AI and Innovation CoE at AstraZeneca K.K. He has 9+ years of experience delivering practical implementations of AI and innovation projects in the Healthcare and Pharmaceutical industries in Japan.

Jim O’Neil

Jim O’Neil is a Senior Cloud Architect on the Data, AI and Innovation CoE team at AstraZeneca K.K. He has over 30 years of experience in designing and building scalable, cost-effective, and sustainable architectures for multiple industries. Lover of all things serverless.