AWS for Industries

Generative AI-Powered Clinical Intelligence: Safely Driving Better Outcomes

Healthcare organizations face immense challenges in gaining insights from the vast amounts of unstructured patient data they collect, since about 80% of medical data remains unstructured and unused after it is created (Kong, 2019, NIH). Clinical notes written by doctors and nurses are a prime example; they contain a wealth of information about patient conditions, treatments, and outcomes, but this data is in free-form text that is difficult to analyze.

AI21 Labs’ Contextual Answers technology, available on Amazon SageMaker JumpStart, offers a secure solution to unlock insights from clinical notes while protecting sensitive patient information. Contextual Answers uses a retrieval-augmented generation approach while adhering to strict guardrails on information usage, which reduces the risk that the generated response will be confidently wrong (called hallucination). This blog post shows how you can use AI21 Labs’ Contextual Answers to securely answer questions about clinical notes.

Questions asked of clinical notes are answered with relevant excerpts and summaries, providing healthcare professionals with quick access to critical information. Because Contextual Answers limits responses to those backed by information in the supplied text, healthcare organizations can safely and responsibly benefit from AI, accelerating research and enhancing patient care through evidence derived from clinical notes. Ensuring that large language models (LLMs) reply with answers derived solely from the provided text is critical to many organizations. AI21’s Task Specific Models (of which Contextual Answers is one) address this need with models that are specifically trained to answer questions based on the data provided. The technology surfaces insights without compromising security or privacy, highlighting the unique value AI can offer the healthcare industry.

Problem Statement

Patient medical records contain a wealth of insights, but extracting meaningful information from unstructured physicians’ notes can be challenging. Natural language understanding (NLU) and conversational AI solutions can leverage LLMs to unlock key details about a patient’s social determinants of health (SDoH) that cannot be found in the structured electronic medical records.

However, LLMs can produce incorrect responses presented with unwavering confidence or even invent details that seem real. This issue is particularly concerning in the healthcare domain, where an incorrect understanding could negatively impact patient care. For example, misunderstanding a physician’s documentation around a patient’s ability to afford necessary medication could result in improper and potentially life-threatening treatment plans.

Healthcare organizations need responsible NLU solutions that provide rich insights from unstructured data while rigorously mitigating the risk of hallucinations. AI21’s Contextual Answers allows customers to answer questions about their document(s) with the confidence that the answers come directly from the document itself. Critically, Contextual Answers responds with “None” when the answer is not in the document by leveraging, among other things, an architecture called retrieval augmented generation (RAG).

Retrieval Augmented Generation

When a question is posed to an LLM, the model relies on its pretraining to retrieve an answer. This method excels when a model has been exposed to the right kind of data, namely data that relates to the incoming question. However, when LLMs are asked about information not present in the training data, for example a question that references the clinical notes for a specific patient, the LLM performs poorly on two fronts: (1) the model simply does not know the answer, and (2) the model may not recognize that it does not know the answer and may invent one instead.

RAG is the process of optimizing the output of an LLM so that it references an authoritative knowledge base outside of its training data sources before generating a response (AWS, 2023). In this case, we are using a doctor’s note as the knowledge base.

For healthcare applications, Contextual Answers could be used to search medical literature or a hospital’s database of patient records and doctor’s notes. The RAG model can pinpoint the most relevant information to a query and generate a concise summary answering the question. This allows healthcare professionals to efficiently search across large collections of disparate text data to gain insights without the need to read every record manually.
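The retrieve-then-answer flow described above can be sketched in a few lines. This is a toy illustration only: the source does not describe how Contextual Answers selects relevant text, so a simple word-overlap score stands in for the retrieval step, and the function names and note fragments are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, passages, top_k=1):
    """Return the top_k passages that share the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        passages,
        key=lambda passage: len(question_words & tokenize(passage)),
        reverse=True,
    )
    return ranked[:top_k]


def build_grounded_prompt(question, passages, top_k=1):
    """Assemble a prompt that restricts the model to the retrieved context."""
    context = "\n".join(retrieve(question, passages, top_k))
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, reply None.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Hypothetical note fragments, invented for this illustration only.
note_passages = [
    "The patient is covered by Humana insurance.",
    "The patient reports walking for thirty minutes daily.",
    "A medication review was completed during the visit.",
]

prompt = build_grounded_prompt("What insurance does the patient have?", note_passages)
print(prompt)
```

A production system would typically replace the word-overlap score with a search index or embedding similarity; the point of the sketch is only that the model sees the retrieved text, not its pretraining alone.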

Some potential benefits for healthcare organizations include accelerating research and discovery, improving clinical decision support, and enhancing patient profiling from medical history. Contextual Answers on SageMaker JumpStart makes it easier to leverage the power of RAG without needing to build and train models from scratch.


[Figure: AWS Cloud GenAI architecture]


First, we need to subscribe to the Contextual Answers model in AWS Marketplace to make it available in Amazon SageMaker JumpStart. Then, we will install the AI21 Software Development Kit (SDK), the SageMaker SDK, and the boto3 Python library. Note that if you are using SageMaker notebooks or SageMaker Studio, the latter two libraries are already installed by default. For deployment, ensure your Identity and Access Management (IAM) role has access to one of the four compatible SageMaker inference instances supported by AI21 Contextual Answers (listed in step 5 below). The code described in this blog post can be found in the AWS-Samples GitHub repository.

To create an AI21 Labs’ Contextual Answers inference endpoint:

1. Open a Jupyter notebook. This can be in SageMaker Studio, SageMaker Notebook Instances, or a local notebook

2. Install the Python packages: sagemaker, boto3, and ai21[AWS]>=2.0.0

3. Load the Python packages

4. Get the proprietary AI21 model package

5. Deploy the model on an ml.g4dn.12xlarge instance. You can deploy the AI21 Contextual Answers model on any of the following instances. A larger instance gives faster processing and lower latency, but at a higher cost:

a. ml.g4dn.12xlarge – least expensive instance

b. ml.p4d.24xlarge – recommended instance

c. ml.g5.48xlarge – less expensive, faster, recommended for relatively short inputs/outputs

d. ml.g5.12xlarge – even less expensive and faster (up to 10K characters)
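As a rough sketch of steps 3 through 5, the deployment could look like the following. It assumes version 2 of the SageMaker Python SDK (its `ModelPackage` class and `get_execution_role` helper) and takes the model package ARN as a parameter, since the ARN depends on your Marketplace subscription and region. The function names and the validation helper are hypothetical and are not taken from the original notebook.

```python
# Instance types listed above as compatible with AI21 Contextual Answers.
SUPPORTED_INSTANCE_TYPES = (
    "ml.g4dn.12xlarge",
    "ml.p4d.24xlarge",
    "ml.g5.48xlarge",
    "ml.g5.12xlarge",
)


def validate_instance_type(instance_type):
    """Return instance_type unchanged, or raise if it is not in the list above."""
    if instance_type not in SUPPORTED_INSTANCE_TYPES:
        raise ValueError(
            f"{instance_type} is not one of {', '.join(SUPPORTED_INSTANCE_TYPES)}"
        )
    return instance_type


def deploy_contextual_answers(
    model_package_arn, endpoint_name, instance_type="ml.g4dn.12xlarge"
):
    """Deploy the subscribed model package to a real-time endpoint."""
    # Check the instance type first so a typo fails before any AWS call is made.
    instance_type = validate_instance_type(instance_type)

    # Imported here so the helper above works without the AWS SDKs installed.
    import sagemaker
    from sagemaker import ModelPackage, get_execution_role

    model = ModelPackage(
        role=get_execution_role(),
        model_package_arn=model_package_arn,
        sagemaker_session=sagemaker.Session(),
    )
    model.deploy(
        initial_instance_count=1,
        instance_type=instance_type,
        endpoint_name=endpoint_name,
    )
    return endpoint_name
```

Calling `deploy_contextual_answers(arn, "contextual-answers")` from a SageMaker notebook would start the endpoint; the endpoint name here is only an example. Deployment incurs charges for as long as the endpoint is running.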

To test the inference endpoint:

6. Supply mock clinical notes for a hypothetical medical visit by Timothy142 which describes his medical history and references his SDoH:

[Figure: Mock Clinical Notes for Testing]

7. Then, provide some questions that can be answered from the mock clinical notes:

a. How old is the patient?

b. What insurance does the patient have?

c. What medical procedures were performed?

8. The model correctly replies with the following answers:

[Figure: AI21’s Contextual Answers Response to Questions in the Document]

9. Next, let’s look at how Contextual Answers handles questions where the answers are not in the document:

a. What are insurances similar to Humana?

b. Does the patient’s family have a history of heart disease?

c. Is Timothy competent to choose his own medication?

10. Contextual Answers dutifully responds that the answer is not in the document.
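The question-and-answer loop in steps 6 through 10 can be sketched as below. The `answer_questions` helper and the fallback string are hypothetical. The `make_endpoint_ask` wrapper assumes that the ai21 SDK (version 2.0.0 or later) provides `AI21SageMakerClient` with an `answer.create(context=..., question=...)` method whose response carries an `answer` attribute that is `None` when the answer is not in the document; check the SDK documentation for your installed version before relying on it.

```python
NOT_IN_DOCUMENT = "Answer not in document"


def answer_questions(ask, context, questions):
    """Map each question to the model's answer, or to a fallback when it is None."""
    results = {}
    for question in questions:
        answer = ask(context=context, question=question)
        results[question] = answer if answer is not None else NOT_IN_DOCUMENT
    return results


def make_endpoint_ask(endpoint_name):
    """Build an `ask` callable backed by the deployed endpoint.

    Assumption: the ai21>=2.0.0 SDK exposes AI21SageMakerClient with
    answer.create(context=..., question=...) returning an object with `.answer`.
    """
    # Imported here so answer_questions can be tried without the SDK installed.
    from ai21 import AI21SageMakerClient

    client = AI21SageMakerClient(endpoint_name=endpoint_name)

    def ask(context, question):
        return client.answer.create(context=context, question=question).answer

    return ask
```

With a live endpoint, `answer_questions(make_endpoint_ask("contextual-answers"), notes, questions)` would return one entry per question. Passing `ask` as a parameter also lets you exercise the fallback logic with a stub function before any endpoint exists.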

To clean up the resources:

11. Lastly, we will clean up the created resources, specifically the SageMaker Inference Endpoint.
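One way to perform this cleanup is with the boto3 SageMaker client, sketched below. The post only names the inference endpoint; this sketch also removes the endpoint configuration and the model behind it, which is an assumption about what you want deleted. The function name is hypothetical.

```python
def delete_endpoint_resources(sagemaker_client, endpoint_name):
    """Delete an endpoint, its endpoint configuration, and the models behind it."""
    # Look up the names of the dependent resources before deleting anything.
    endpoint = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = endpoint["EndpointConfigName"]
    config = sagemaker_client.describe_endpoint_config(
        EndpointConfigName=config_name
    )
    model_names = [variant["ModelName"] for variant in config["ProductionVariants"]]

    # Delete the endpoint first: it is the resource that accrues instance charges.
    sagemaker_client.delete_endpoint(EndpointName=endpoint_name)
    sagemaker_client.delete_endpoint_config(EndpointConfigName=config_name)
    for model_name in model_names:
        sagemaker_client.delete_model(ModelName=model_name)

    return {
        "endpoint": endpoint_name,
        "endpoint_config": config_name,
        "models": model_names,
    }
```

In a notebook this would be called as `delete_endpoint_resources(boto3.client("sagemaker"), "contextual-answers")`, where the endpoint name is whatever you chose at deployment.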

Next Steps

You can empower your healthcare organization with more advanced natural language processing capabilities by building on top of your Contextual Answers SageMaker Inference Endpoint. You can create several advanced architectures that leverage Contextual Answers.

For example, you can create a virtual assistant chatbot with Amazon Lex, Amazon’s conversational chatbot service, that clinicians and staff can use to surface insights from clinical notes. Additionally, you can leverage Amazon Kendra’s enterprise search capabilities to index clinical notes using AWS Glue, AWS’ automated data integration service. This creates a searchable document repository that your Lex chatbot can tap into to find relevant doctor’s notes. Kendra’s enterprise search capabilities combined with the Contextual Answers Endpoint provide a scalable solution for safely and responsibly enhancing patient care through evidence derived from clinical notes.
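A minimal sketch of the Kendra half of such an architecture follows. It assumes an existing Kendra index that already contains the clinical notes and uses the boto3 Kendra client's `retrieve` call to fetch passages. The `ask` callable stands for whatever function sends a context and question to your Contextual Answers endpoint; both it and `answer_from_index` are hypothetical names, and the Lex integration is not shown.

```python
def answer_from_index(kendra_client, ask, index_id, question, max_passages=3):
    """Retrieve passages from a Kendra index and answer the question from them."""
    response = kendra_client.retrieve(
        IndexId=index_id,
        QueryText=question,
        PageSize=max_passages,
    )
    passages = [item["Content"] for item in response.get("ResultItems", [])]
    if not passages:
        # Nothing relevant was found, so there is no context to answer from.
        return None

    # Join the retrieved passages into one context for the answering model.
    context = "\n\n".join(passages)
    return ask(context=context, question=question)
```

Returning `None` when no passages are found mirrors how Contextual Answers itself signals that a document does not contain the answer, so callers can treat both cases the same way.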


AI21’s Contextual Answers technology available on Amazon SageMaker JumpStart offers immense value to healthcare organizations seeking to gain insights from unstructured doctor’s notes and patient records. Through the use of powerful generative AI models, Contextual Answers can rapidly search large volumes of free-form text and generate summarized responses to natural language questions. This helps healthcare professionals efficiently discover and synthesize key information from patient histories and clinical literature.

Critically, Contextual Answers adheres to strict safety guardrails to mitigate risks like false information and hallucination. Its responsible approach to AI prioritizes reliability and transparency, making it well-suited for sensitive healthcare applications. Securely analyzing clinical notes with generative AI has the potential to enable more personalized treatment plans and improve quality of care.

As this blog post explored, retrieval augmented generation has clear benefits for accelerating healthcare research, improving clinical decisions, and incorporating social determinants of health. With its availability on SageMaker JumpStart, Contextual Answers makes these advantages more readily accessible. Healthcare organizations seeking to safely and responsibly leverage generative AI now have a simple way to adopt this technology. You can get started using AI21 Labs’ models on AWS by clicking the link here.

Chris Haddad

Chris Haddad is a Senior AI/ML Solutions Architect in the Global Healthcare and Life Sciences team at Amazon Web Services. He is a results-driven and passionate machine learning specialist with over eight years of experience in the healthcare and life science industries. He leverages his expertise to help customers solve complex problems and achieve their business goals through the innovative use of artificial intelligence and machine learning.

Joshua Broyde

Joshua Broyde, PhD, is a Principal Solution Architect at AI21 Labs. He works with customers and AI21 partners across the GenAI value chain, including enabling GenAI at an enterprise level, leveraging complex workflows for regulated and specialized environments, and using LLMs at scale. Joshua also specializes in the healthcare, life sciences, and pharma industries.