AWS Partner Network (APN) Blog

Improving Hospital-Patient Engagement and Increasing Hospital Ancillary Revenue Using AI

By Sukhmani Gill, Director, AI/ML – SourceFuse
By Vaidant Singh, Director, Strategic Accounts and Marketing – SourceFuse
By Nieves Garcia, Strategic Initiatives BD – AWS
By Ramanuj Vidyanta, WW AI/ML Specialist, AISPL – AWS


The goal: quicker business insights. However, for effective data management and analytics, especially when it comes to healthcare and patient data, the key lies in the extraction of the right data.

The world’s largest organizations are focused on building cloud solutions for data management, and the foundation of those solutions is data extraction. Data is typically extracted from existing stored files and from the more than 2.5 quintillion bytes of data generated every day.

Over the past few years, most organizations have experienced a steep rise in the volume of documents they have to deal with. This has happened for various reasons, especially regulatory compliance requirements such as HIPAA, HITRUST, GDPR, and more.

According to Gartner, by the end of 2024, 75% of enterprises will shift from piloting to fully deploying artificial intelligence (AI), driving a 5X increase in streaming data and analytics infrastructures.

According to HIPAA guidelines, healthcare organizations in the U.S. have to maintain a minimum of six years of patient data on an electronic health records (EHR) system. Some of these records are scans of paper health records that predate widespread digital record keeping.

These health records come in different formats, such as DICOM, PDF, and XML. Compounding the sheer volume of documents, some records are structured while others are unstructured and/or handwritten.

Why Analytics and Machine Learning Matter in Healthcare

When it comes to healthcare and patient data, managing and storing data involves some of the greatest challenges of any industry.

Healthcare organizations must handle disparate input sources, real-time monitoring, and HIPAA data security and regulatory compliance in order to maintain sensitive and personally identifiable health records. All of this comes before they can even start to think about statistical data analytics.

Healthcare providers maintain 50+ key performance indicators (KPIs) when it comes to patient journey and patient satisfaction visualization.

With the rise of AI, healthcare organizations are using Natural Language Processing (NLP) to help them extract key information, insights, and relationships from their data. NLP is a branch of AI that helps computers understand and interpret human language.

SourceFuse Technologies is an AWS Advanced Consulting Partner with the AWS Healthcare Competency and a niche focus on the healthcare and life sciences industries.

Recently, SourceFuse developed a hospital ancillary revenue insights solution for one of India’s leading hospital chains. The solution helps them extract, analyze, and generate business insights and KPIs using patient prescription information from thousands of e-prescriptions.

In this post, we’ll discuss how the solution increased productivity, removed manual data analysis, and provided hospital KPIs such as patient drug cost per stay, patient diagnostics test cost per stay, and patient expense. Insights from these KPIs enabled the customer to increase the patient satisfaction indicator.

The Solution

The SourceFuse team of AWS-certified engineers used Amazon Web Services (AWS) AI and machine learning (ML) services to deliver this NLP-driven solution. It’s designed for improving hospital-patient engagement and increasing hospital ancillary revenue insights, augmenting current business flows (see Figure 1).

The solution was built within three weeks with the necessary integration and customization. It has no recurring license fee, and the customer works with pay-per-use AWS services with no upfront commitment.

Deploying the solution on the customer’s AWS account ensures that security and compliance policies are met internally, and does not add any onboarding overhead.

Powered by the advanced ML models behind Amazon Comprehend Medical, the solution doesn’t require large amounts of training data since there are no ML models to build or train.


Figure 1 – High-level hospital ancillary revenue insights solution workflow.

As shown in the workflow above, the hospital ancillary revenue insights solution can receive prescription input from two sources:

  • Optical Character Recognition (OCR): Directly reads the doctor’s e-prescriptions. See Figure 2 for sample e-prescription.
  • Direct input: The hospital can share the required prescription raw data as direct input to SourceFuse’s solution.

After receiving the prescription data, the solution is able to:

  • Classify prescription data into categories: 1-Medicines; 2-Laboratory tests (radiology and pathology); and so on.
  • Detect and isolate each drug prescription’s dosage and frequency instructions, using the AI algorithm developed.
  • Map the prescription instructions to the hospital’s pricing database to generate patient expenses. This gives the hospital patient expense insights and allows it to personalize the patient’s ongoing journey with the hospital, resulting in a higher patient satisfaction indicator.
  • Handle Personally Identifiable Information (PII): The solution matches the patient information via the unique patient ID generated by the hospital.


Figure 2 – Sample e-prescription.

How it Works

The hospital ancillary revenue insights solution uses a microservice approach to share prescription documents and fetch processed results. Amazon API Gateway creates and deploys the REST API, while Amazon Simple Queue Service (SQS) manages all incoming requests.

The solution associates the latest prescription with a visit. This was implemented using the First In First Out (FIFO) queue. The solution also uses the dead-letter queue (DLQ) functionality within SQS to handle any processing errors.
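As a rough illustration of the queueing described above, the sketch below builds the attributes for a FIFO queue with a dead-letter queue and the arguments for one prescription message. The ARN, the queue URL, the payload fields, the `maxReceiveCount` of 3, and the choice of the visit ID as the message group are all assumptions made for this example, not details from the SourceFuse implementation.

```python
import json


def fifo_queue_attributes(dlq_arn: str, max_receive_count: int = 3) -> dict:
    """Attributes for sqs.create_queue: a FIFO queue that redrives to a DLQ.

    The DLQ of a FIFO queue must itself be a FIFO queue.
    """
    return {
        "FifoQueue": "true",
        # Deduplicate on a hash of the body so no explicit dedup ID is needed.
        "ContentBasedDeduplication": "true",
        "RedrivePolicy": json.dumps(
            {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": max_receive_count}
        ),
    }


def prescription_message(queue_url: str, visit_id: str, prescription: dict) -> dict:
    """Keyword arguments for sqs.send_message.

    Using the visit ID as the message group keeps prescriptions for the
    same visit in order (an assumption about how ordering is scoped).
    """
    return {
        "QueueUrl": queue_url,
        "MessageBody": json.dumps(prescription, sort_keys=True),
        "MessageGroupId": visit_id,
    }


def send_prescription(queue_url: str, visit_id: str, prescription: dict) -> dict:
    """Send one prescription; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above run without AWS access

    sqs = boto3.client("sqs")
    return sqs.send_message(**prescription_message(queue_url, visit_id, prescription))
```

With this layout, a message that fails processing more than `maxReceiveCount` times would be moved to the dead-letter queue instead of blocking later messages in its group.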

The solution can process both machine-readable and non-machine-readable input data by routing each to a different processing mechanism. Once a request is received and queued, the solution can either:

  • Extract the prescription text from the prescription PDF using the Detect Document Text API of Amazon Textract, if the input data is not machine-readable; or
  • Directly process the machine-readable data received from the customer’s telemedicine application.

The input data is then processed through Amazon Comprehend Medical for relevant entity detection, such as the prescribed drug’s name, dosage, and frequency. The various entities are mapped to a pricing management list using fuzzy logic, and a final cost associated with medicines, investigations, and procedures is calculated.
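The mapping step can be pictured with a small sketch. The post does not say which fuzzy technique was used, so this example substitutes approximate string matching from Python’s standard `difflib` module. The price list, the 0.6 cutoff, and the function names are hypothetical.

```python
from difflib import get_close_matches

# Hypothetical price list: item name -> unit price (currency is arbitrary).
PRICE_LIST = {
    "paracetamol 500 mg tablet": 2.0,
    "amoxicillin 250 mg capsule": 6.5,
    "complete blood count": 350.0,
}


def match_price(name, price_list=PRICE_LIST, cutoff=0.6):
    """Return (matched_name, unit_price), or None if nothing is close enough."""
    query = name.lower().strip()
    hits = get_close_matches(query, list(price_list), n=1, cutoff=cutoff)
    if not hits:
        return None
    return hits[0], price_list[hits[0]]


def estimate_cost(items, price_list=PRICE_LIST):
    """Sum unit_price * quantity over (name, quantity) pairs.

    Names that cannot be matched are returned separately so they can be
    reviewed by a person rather than silently priced at zero.
    """
    total, unmatched = 0.0, []
    for name, quantity in items:
        match = match_price(name, price_list)
        if match is None:
            unmatched.append(name)
        else:
            total += match[1] * quantity
    return total, unmatched
```

Returning unmatched names instead of guessing is a design choice for this sketch: a wrong price match is harder to spot later than a missing one.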

All data processing is run on AWS Lambda and orchestrated by AWS Step Functions, creating a scalable and serverless implementation.


Figure 3 – Technology workflow behind the solution’s analytics engine.

Tech Stack

Amazon Textract

The OCR-to-NLP pipeline starts with the Amazon Textract OCR wrapper, which reads in PDFs from Amazon Simple Storage Service (Amazon S3).

The Amazon Textract OCR wrapper is written in Python and runs as serverless Lambda functions that output entities from the patient medical records in a highly customizable manner. This allows the solution to extract key-value pairs from text structured as cell blocks, pages, lines, and more. It also enables text extraction based on defined thresholds, because Amazon Textract provides a confidence value along with every entity detected.
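A minimal sketch of threshold-based extraction, assuming the synchronous `detect_document_text` call in boto3 and a document stored in Amazon S3. The function names and the threshold of 80 are illustrative and are not taken from the SourceFuse wrapper.

```python
def extract_lines(response: dict, min_confidence: float = 80.0) -> list:
    """Keep the text of LINE blocks whose confidence (0-100) meets the threshold."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
        and block.get("Confidence", 0.0) >= min_confidence
    ]


def read_prescription(bucket: str, key: str, min_confidence: float = 80.0) -> list:
    """Run Textract on an S3 object; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so extract_lines can be used offline

    textract = boto3.client("textract")
    response = textract.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    return extract_lines(response, min_confidence)
```

The synchronous call is suited to single-page documents. Multi-page PDFs would likely need the asynchronous text detection operations instead.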

Amazon Comprehend Medical

Amazon Comprehend Medical is an NLP engine that can be used to identify medical terminology in text. It performs various ML-based tasks, including medical named entity recognition, relationship extraction between entities and their attributes, protected health information (PHI) detection, and linking to standard ontologies such as RxNorm and ICD-10-CM, which is vital when working with complex medical terminology.

Amazon Comprehend Medical also provides a probability score for every medical entity recognized in the given text.
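The following sketch shows one way the response of the `detect_entities_v2` operation might be reduced to drug name, dosage, and frequency, using the score to discard low-confidence entities. The 0.8 threshold, the function names, and the output field names are assumptions for illustration.

```python
def medications(response: dict, min_score: float = 0.8) -> list:
    """Pull name, dosage, and frequency out of MEDICATION entities.

    Entities scoring below min_score (scores range from 0 to 1) are skipped.
    """
    results = []
    for entity in response.get("Entities", []):
        if entity.get("Category") != "MEDICATION":
            continue
        if entity.get("Score", 0.0) < min_score:
            continue
        # Attributes such as DOSAGE and FREQUENCY are attached to the entity.
        attributes = {a["Type"]: a["Text"] for a in entity.get("Attributes", [])}
        results.append(
            {
                "name": entity["Text"],
                "dosage": attributes.get("DOSAGE"),
                "frequency": attributes.get("FREQUENCY"),
                "score": entity["Score"],
            }
        )
    return results


def detect_medications(text: str, min_score: float = 0.8) -> list:
    """Call Amazon Comprehend Medical; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so medications() can be used offline

    client = boto3.client("comprehendmedical")
    return medications(client.detect_entities_v2(Text=text), min_score)
```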

Amazon API Gateway

This service handles all of the tasks involved in accepting and processing hundreds of concurrent API calls to be sent to SQS queues. This includes traffic management, authorization, access control, monitoring, and API management.

AWS Lambda

For this solution, SourceFuse needed a serverless approach, so it made sense to deploy the architecture on AWS Lambda. The Lambda functions take medical data from the SQS queue, process it using Amazon Comprehend Medical, perform the dictionary mapping, and store the processed results in an output database.

All of this happens with a trigger passed by the SQS queue, meaning the resources start up only when required, producing a cost-effective approach.
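A sketch of what an SQS-triggered handler might look like. The `process_prescription` function is a placeholder standing in for entity detection, price mapping, and storage, and the payload fields (`text`, `patient_id`) are hypothetical.

```python
import json


def process_prescription(payload: dict) -> dict:
    """Placeholder for entity detection, price mapping, and storage."""
    if "text" not in payload:
        raise ValueError("payload has no prescription text")
    return {"patient_id": payload.get("patient_id"), "characters": len(payload["text"])}


def handler(event, context):
    """Process each SQS record and report the ones that failed.

    Returning batchItemFailures tells Lambda which messages to retry,
    so one bad message does not force the whole batch to be reprocessed.
    """
    failures = []
    for record in event.get("Records", []):
        try:
            process_prescription(json.loads(record["body"]))
        except Exception:
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

The partial batch response only takes effect when `ReportBatchItemFailures` is enabled on the event source mapping. Messages that keep failing would eventually reach the dead-letter queue. With a FIFO queue, ordering considerations may call for stopping at the first failure; this sketch keeps going for simplicity.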

Amazon SQS

Amazon SQS reduces the complexity and overhead associated with managing and operating message-oriented middleware and allows developers to focus on differentiating work. SQS holds the medical data sent by the producers (Amazon API Gateway) until it is processed by the consumers (AWS Lambda), which helps the architecture to work asynchronously.

Rather than making one application reliant on another, SQS lets the different services interact while working independently. It helped SourceFuse in processing and monitoring the medical data, and provided more security for this sensitive data. In addition, the growing number of users benefits from the reduction in latency provided by SQS.

MongoDB on AWS

MongoDB was the preferred data storage service for the SourceFuse solution. The final processed output and statuses of all processes were stored on MongoDB instances.

AWS Step Functions

AWS Step Functions is a serverless function orchestrator that simplifies sequencing Lambda functions. Through its visual interface, you can create and run a series of checkpointed and event-driven workflows that maintain the application state.

This service helps orchestrate the calls to various AWS services, such as Amazon Textract or Amazon Comprehend Medical, since the output of each step acts as an input to the next.
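To make the chaining concrete, here is a sketch of a state machine definition in Amazon States Language, built as a Python dictionary. The state names, the `$.machine_readable` input field, and the three Lambda ARNs are hypothetical. The actual workflow is the one shown in Figure 3.

```python
import json


def state_machine_definition(textract_arn: str, comprehend_arn: str, pricing_arn: str) -> dict:
    """Build a definition that skips OCR for machine-readable input."""
    return {
        "Comment": "Illustrative prescription-processing workflow",
        "StartAt": "CheckInputType",
        "States": {
            "CheckInputType": {
                "Type": "Choice",
                "Choices": [
                    {
                        "Variable": "$.machine_readable",
                        "BooleanEquals": True,
                        "Next": "DetectEntities",
                    }
                ],
                "Default": "ExtractText",
            },
            # Each Task receives the previous state's output as its input.
            "ExtractText": {"Type": "Task", "Resource": textract_arn, "Next": "DetectEntities"},
            "DetectEntities": {"Type": "Task", "Resource": comprehend_arn, "Next": "MapPrices"},
            "MapPrices": {"Type": "Task", "Resource": pricing_arn, "End": True},
        },
    }


def definition_json(textract_arn: str, comprehend_arn: str, pricing_arn: str) -> str:
    """Serialize the definition to the JSON string that Step Functions expects."""
    return json.dumps(
        state_machine_definition(textract_arn, comprehend_arn, pricing_arn), indent=2
    )
```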

Benefits to Hospitals and Healthcare Insurance Organizations

With the AI-enabled hospital ancillary revenue insights solution built using AWS services, SourceFuse allows hospital and healthcare insurance organizations to accelerate their business analytics by:

  • Increasing efficiencies: Previously, someone manually had to generate business insights and KPIs associated with the patient revenue journey; on average, they could analyze 5-6 prescriptions an hour. With the automation of the entire repetitive manual process, however, SourceFuse expects analysis time to reduce by half, leading to a 100% increase in efficiency.
  • Preventing human error.
  • Identifying and targeting prospective patients.
  • Creating targeted campaigns for specific health conditions.
  • Reviewing pricing data to recommend the ideal healthcare provider.
  • Quickly identifying potential fraud or at-risk claims.

The solution currently has an accuracy rate of 91%. SourceFuse plans to continue optimizing the NLP engine and AI algorithm to improve the accuracy to 96% in identifying medical terminology and prescription instructions.

SourceFuse believes that further enhancements to this solution, like automatically linking prescriptions with the pharmacy network, will enable single-touch patient approval that places the order and delivers it to the patient’s doorstep.

SourceFuse has delivered more than 600 AWS implementations that boost efficiency, ensure compliance, deliver actionable insights, and lower the total cost of ownership (TCO).


SourceFuse – AWS Partner Spotlight

SourceFuse is an AWS Advanced Consulting Partner with a niche focus on healthcare and life sciences industries. They recently developed an easy-to-use and secure telemedicine application called SF Medic that can be adopted by hospitals, clinics, and even single-physician practices.
