AWS Machine Learning Blog

Making sense of your health data with Amazon HealthLake

We’re excited to announce Amazon HealthLake, a new HIPAA-eligible service for healthcare providers, health insurance companies, and pharmaceutical companies to securely store, transform, query, analyze, and share health data in the cloud, at petabyte scale. HealthLake uses machine learning (ML) models trained to automatically understand and extract meaningful medical information, such as medications, procedures, and diagnoses, from raw, disparate data. This automates a process that is traditionally manual, error-prone, and costly. HealthLake tags and indexes all the data and structures it in Fast Healthcare Interoperability Resources (FHIR) to provide a complete view of each patient and a consistent way to query and share the data. It integrates with services like Amazon QuickSight and Amazon SageMaker to visualize and understand relationships in the data, identify trends, and make predictions. Because HealthLake automatically structures all of a healthcare organization’s data into the FHIR industry format, the information can be easily and securely shared between health systems and with third-party applications, enabling providers to collaborate more effectively and allowing patients unfettered access to their medical information.

Every healthcare provider, payer, and life sciences company is trying to solve the problem of organizing and structuring their data in order to make better patient support decisions, design better clinical trials, operate more efficiently, understand population health trends, and share data securely. It all starts with making sense of health data.

Let’s look at one specific example: imagine you’re managing a diabetic patient whose glucose level, 2 months on, is still not responding to the treatment that you prescribed. With HealthLake, you can easily create a cohort of diabetic patients and their demographics, treatments, blood glucose readings, tests, and clinical observations and export this data. You can then create an interactive dashboard with QuickSight and compare that patient to a population with similar treatment options to see what helped improve their health outcomes. You can use SageMaker to train and tune ML models that help you identify which subset of these diabetic patients is at increased risk of complications like high blood pressure, so you can intervene early and introduce a second line of medications in addition to preventive measures, like special diets.
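To make the cohort idea concrete, here is a minimal Python sketch that selects diabetic patients and their blood glucose readings from a list of FHIR-shaped records. It runs on in-memory dictionaries and doesn’t call HealthLake. The helper names (has_code, build_cohort), the sample records, and the choice of SNOMED CT and LOINC codes are assumptions made for illustration, not part of the service.

```python
from collections import defaultdict

# Illustrative codes (assumptions for this sketch): a SNOMED CT code for
# type 2 diabetes and a LOINC code for blood glucose.
DIABETES_CODE = ("http://snomed.info/sct", "44054006")
GLUCOSE_CODE = ("http://loinc.org", "2339-0")


def has_code(resource, system_code):
    """Return True if the resource's code.coding list contains the given code."""
    system, code = system_code
    codings = resource.get("code", {}).get("coding", [])
    return any(c.get("system") == system and c.get("code") == code for c in codings)


def patient_id(resource):
    """Extract 'p1' from a subject reference such as 'Patient/p1'."""
    return resource.get("subject", {}).get("reference", "").split("/")[-1]


def build_cohort(resources):
    """Map each diabetic patient to their glucose readings, oldest first."""
    diabetic = {
        patient_id(r)
        for r in resources
        if r.get("resourceType") == "Condition" and has_code(r, DIABETES_CODE)
    }
    readings = defaultdict(list)
    for r in resources:
        if r.get("resourceType") == "Observation" and has_code(r, GLUCOSE_CODE):
            pid = patient_id(r)
            if pid in diabetic:
                readings[pid].append(
                    (r["effectiveDateTime"], r["valueQuantity"]["value"])
                )
    # ISO 8601 date strings in the same format sort chronologically as text.
    return {pid: sorted(readings[pid]) for pid in diabetic}


def _coded(resource_type, patient, system_code, **extra):
    """Build a tiny FHIR-shaped record (hypothetical sample data only)."""
    system, code = system_code
    record = {
        "resourceType": resource_type,
        "subject": {"reference": f"Patient/{patient}"},
        "code": {"coding": [{"system": system, "code": code}]},
    }
    record.update(extra)
    return record


resources = [
    _coded("Condition", "p1", DIABETES_CODE),
    _coded("Condition", "p2", ("http://snomed.info/sct", "38341003")),
    _coded("Observation", "p1", GLUCOSE_CODE, effectiveDateTime="2020-11-01",
           valueQuantity={"value": 180, "unit": "mg/dL"}),
    _coded("Observation", "p1", GLUCOSE_CODE, effectiveDateTime="2020-09-01",
           valueQuantity={"value": 210, "unit": "mg/dL"}),
    _coded("Observation", "p2", GLUCOSE_CODE, effectiveDateTime="2020-10-01",
           valueQuantity={"value": 95, "unit": "mg/dL"}),
]

cohort = build_cohort(resources)
print(cohort)
```

In this sample, only patient p1 has the diabetes condition, so p2’s glucose reading is left out of the cohort. A table like this is the kind of export you might then hand to a dashboard or a model training job.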

Health data is complex

Healthcare organizations are doing some amazing things with ML today, but health data remains complex and difficult to work with: it is siloed and spread out across multiple systems in incompatible formats. Over the past decade, we’ve witnessed a digital transformation in healthcare, with organizations capturing huge volumes of patient data every day, from family history and clinical observations to diagnoses and medications. The vast majority of this data is contained in unstructured medical records such as clinical notes, laboratory reports (PDFs), insurance claims (forms), recorded conversations (audio), X-rays (images), and more.

Before healthcare data can be used for effective care, it needs to be securely ingested, stored, and aggregated. Relevant attributes need to be extracted, tagged, indexed, and structured before you can start analyzing the data. The cost and operational complexity of doing all this work well are prohibitive for most healthcare organizations, and the work takes weeks, or even months. The FHIR standard is a start toward the goal of standardizing data structure and exchange for healthcare, but the data still needs to be transformed to enable advanced analytics via queries, visualizations, and ML tools and techniques. This means analysis effectively remains out of reach for almost all providers.

Create a complete view of a patient’s medical history, in minutes

With HealthLake, we’re removing the heavy lifting that our healthcare and life sciences customers face to tag, index, structure, and organize this data, providing a complete view of each patient’s medical history in minutes, instead of weeks or months. HealthLake makes it easy to copy your on-premises FHIR data to AWS. It transforms raw, disparate data with integrated medical natural language processing (NLP), which uses specialized ML models that have been trained to automatically understand and extract meaningful medical information, such as medications, procedures, and diagnoses. HealthLake tags each patient’s record, indexes every data element using standardized labels, structures each data element in interoperable standards, and organizes the data in a timeline view for each patient. Presenting each patient’s data in chronological order of medical events lets you look at trends like disease progression over time, giving you new tools to improve care and intervene earlier.
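As a rough illustration of what a per-patient timeline means, the following Python sketch orders a mix of FHIR-shaped records by the date field each resource type carries. The field names follow FHIR R4 (onsetDateTime, effectiveDateTime, authoredOn, performedDateTime); the function names, the mapping table, and the sample records are hypothetical, and this is not how HealthLake itself builds its timeline view.

```python
from datetime import datetime

# Which FHIR R4 field holds the clinically relevant date for each resource
# type. This mapping is a simplifying assumption; real resources may use
# other date fields (for example, periods instead of a single timestamp).
DATE_FIELDS = {
    "Condition": "onsetDateTime",
    "Observation": "effectiveDateTime",
    "MedicationRequest": "authoredOn",
    "Procedure": "performedDateTime",
}


def event_time(resource):
    """Return the event time as a datetime, or None if no date is available."""
    field = DATE_FIELDS.get(resource.get("resourceType"))
    value = resource.get(field) if field else None
    return datetime.fromisoformat(value) if value else None


def patient_timeline(resources, patient_ref):
    """List (time, resource type, id) for one patient, oldest event first."""
    events = []
    for r in resources:
        when = event_time(r)
        if when is not None and r.get("subject", {}).get("reference") == patient_ref:
            events.append((when, r["resourceType"], r.get("id")))
    # Sort on the timestamp only, so ties never compare the other fields.
    return sorted(events, key=lambda event: event[0])


# Hypothetical sample records, deliberately out of order.
records = [
    {"resourceType": "MedicationRequest", "id": "med-1",
     "subject": {"reference": "Patient/p1"}, "authoredOn": "2020-03-15"},
    {"resourceType": "Observation", "id": "obs-1",
     "subject": {"reference": "Patient/p1"},
     "effectiveDateTime": "2020-05-20T09:30:00"},
    {"resourceType": "Condition", "id": "cond-1",
     "subject": {"reference": "Patient/p1"}, "onsetDateTime": "2020-03-01"},
    {"resourceType": "Observation", "id": "obs-2",
     "subject": {"reference": "Patient/p2"}, "effectiveDateTime": "2020-01-01"},
]

timeline = patient_timeline(records, "Patient/p1")
for when, kind, rid in timeline:
    print(when.date(), kind, rid)
```

The sketch assumes timestamps without time zone offsets. Mixing offset-aware and naive values would need normalizing before sorting.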

Your data in HealthLake is secure, compliant, and auditable. Data versioning is enabled to protect data against accidental deletion, and per the FHIR specification, if you delete a piece of data, it’s only hidden from analysis and results; it is versioned rather than removed from the service. Your data is encrypted using customer managed keys (CMKs) in a single-tenant architecture to provide an additional level of protection when data is accessed or searched, so that the same key isn’t shared by multiple customers. You retain ownership and control of your data, along with the ability to encrypt it, protect it, move it, and delete it in alignment with your organization’s security policies.
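The versioned-delete behavior described above can be pictured with a small toy model in Python. It is only a sketch of the general idea (a delete adds a new version that hides the record from reads while earlier versions stay in the history); the VersionedStore class and its methods are hypothetical and say nothing about how HealthLake is implemented.

```python
class VersionedStore:
    """Toy in-memory store where deletes are recorded as new versions."""

    def __init__(self):
        # resource id -> list of versions; None marks a deletion.
        self._history = {}

    def put(self, rid, resource):
        """Store a new version and return its 1-based version number."""
        versions = self._history.setdefault(rid, [])
        versions.append(dict(resource))
        return len(versions)

    def delete(self, rid):
        """Hide the resource from reads by appending a deletion marker."""
        if rid not in self._history:
            raise KeyError(rid)
        self._history[rid].append(None)

    def read(self, rid):
        """Return the current version, or None if missing or deleted."""
        versions = self._history.get(rid, [])
        return versions[-1] if versions else None

    def history(self, rid):
        """Return every version ever stored, including deletion markers."""
        return list(self._history.get(rid, []))


store = VersionedStore()
store.put("obs-1", {"value": 210})
store.put("obs-1", {"value": 180})
store.delete("obs-1")

print(store.read("obs-1"))     # the deleted record no longer shows up in reads
print(store.history("obs-1"))  # earlier versions are still retained
```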

Identify trends and make predictions to manage your entire population

Today, the most widely used clinical models to predict disease risk lack personalization and often use a very limited number of commonly collected data points, which is problematic because the resulting models may produce imprecise predictions. However, if you look at an individual’s medical record, there may be hundreds of thousands of data points, and the majority of them are untapped, stored in doctors’ notes. With your health data structured and organized chronologically by medical events, you can easily query it, perform analytics, and build ML models to observe health trends across an entire population.

You can use other AWS services that work seamlessly with HealthLake, such as QuickSight or SageMaker. For example, you can create an interactive dashboard with QuickSight to observe population health trends, and zoom in on a smaller group of patients with a similar state to compare their treatments and health outcomes. You can also build, train, and deploy your own ML models with SageMaker to track the progression of at-risk patients over the course of many years against a similar cohort of patients. This enables you to identify early warning signs that need to be addressed proactively and would be missed without the complete clinical picture provided by HealthLake.

Bringing it all together

Now, your health data is tagged, indexed, structured, and organized in chronological order of medical events, so it can be easily searched and analyzed. You can securely share patient data across health systems and multiple applications in a consistent, compatible FHIR format. You now have the ability to make point-of-care or population health decisions that are driven by evidence from the overall data.

AWS customers are excited about the innovation that HealthLake offers and the opportunity to make sense of their health data to deliver personalized treatments, understand population health trends, and identify patients for clinical trial enrollment. This offers an unprecedented opportunity to close gaps in care and provide the high-quality, personalized care every patient deserves.

Cerner Corporation, a global healthcare technology company, is focused on using data to help solve issues at the speed of innovation—evolving healthcare to enhance clinical and operational outcomes, help resolve clinician burnout, and improve health equity.

“At Cerner, we are committed to transforming the future of healthcare through cloud delivery, machine learning, and AI. Working alongside AWS, we are in a position to accelerate innovation in healthcare. That starts with data. We are excited about the launch of HealthLake and its potential to quickly ingest patient data from diverse sources, unlock new insights through advanced analytics, and serve many of our initiatives across population health.”

—Ryan Hamilton, SVP & Chief Architect, Cerner

Konica Minolta Precision Medicine (KMPM) is a life science company dedicated to the advancement of precision medicine to more accurately predict, detect, treat, and ultimately cure disease.

“We are building a multi-modal platform at KMPM to handle a significant amount of health data inclusive of pathology, imaging, and genetic information. HealthLake will allow us to unlock the real power of this multi-modal approach to find novel associations and signals in our data. It will provide our expert team of data scientists and developers the ability to integrate, label, and structure this data faster and discover insights that our clinicians and pharmaceutical partners require to truly drive precision medicine.”

—Kiyotaka Fujii, President, Global Healthcare, Konica Minolta, & Chairman, Ambry Genetics

Orion Health is a global, award-winning provider of health information technology, advancing population health and precision medicine solutions for the delivery of care across the entire health ecosystem.

“At Orion Health, we believe that there is significant untapped potential to transform the healthcare sector by improving how technology is used and providing insights into the data being generated. Data is frequently messy and incomplete, which is costly and time consuming to clean up. We are excited to work alongside AWS to use HealthLake to help deliver new ways for patients to interact with the healthcare system, supporting initiatives such as the 21st Century Cures Act, designed to make healthcare more accessible and affordable, and Digital Front Door, which aims to improve health outcomes by helping patients receive the perfect care for them from the comfort of their home.”

—Anne O’Hanlon, Product Director, Orion Health

Conclusion

What was once just a pile of disparate and unstructured data looking like a patchwork quilt—an incomplete health history stitched together with limited data—is now structured to be easily read and searched. For every healthcare provider, health insurer, and life sciences company, there is now a purpose-built service enabled by ML they can use to aggregate and organize previously unusable health data, so that it can be analyzed in a secure and compliant single-tenant location in the cloud. HealthLake represents a significant leap forward for these organizations to learn from all their data to proactively manage their patients and population, improve the quality of patient care, optimize hospital efficiency, and reduce cost.

 


About the Authors

Dr. Taha Kass-Hout is director of machine learning and chief medical officer at Amazon Web Services (AWS), where he leads initiatives such as Amazon HealthLake and Amazon Comprehend Medical. A physician and bioinformatician, Taha has previously pioneered the use of emerging technologies and cloud at both the CDC (in electronic disease surveillance) and the FDA, where he was the Agency’s first Chief Health Informatics Officer and established both the OpenFDA and PrecisionFDA data sharing initiatives.

 

Dr. Matt Wood is Vice President of Product Management and leads our vertical AI efforts on the ML team, including Personalize, Forecast, and Lookout for Metrics, along with our thought leadership projects such as DeepRacer. In his spare time, Matt also serves as the chief science geek for the scalable COVID testing initiative at Amazon, providing guidance on scientific and technical development, including test design, lab sciences, regulatory oversight, and the evaluation and implementation of emerging testing technologies.