Guidance for Ingesting PDF and Image Files to AWS HealthLake
Overview
How it works
The architecture diagram for this solution shows its key components and how they interact, giving a step-by-step overview of the architecture's structure and functionality.
Well-Architected Pillars
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
AWS CloudFormation automates the creation of AWS resources. AWS Lambda enables automatic responses to events. Amazon CloudWatch monitors resources and events on AWS. HealthLake uses CloudWatch and AWS CloudTrail to monitor performance. Together, these services support development and help you run workloads effectively while giving you insight into your operations.
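As an illustration of how Lambda can respond to events in this kind of pipeline, the following is a minimal sketch of a handler that reads an Amazon S3 upload notification and picks out the PDF and image files to process. The function names and the list of file suffixes are assumptions made for this example and are not taken from the guidance itself; the event fields follow the standard S3 notification shape.

```python
from urllib.parse import unquote_plus

# Assumption: the file types this example treats as ingestible documents.
SUPPORTED_SUFFIXES = (".pdf", ".png", ".jpg", ".jpeg", ".tiff")


def extract_uploads(event):
    """Return (bucket, key) pairs for supported files in an S3 event."""
    uploads = []
    for record in event.get("Records", []):
        s3_info = record.get("s3", {})
        bucket = s3_info.get("bucket", {}).get("name")
        raw_key = s3_info.get("object", {}).get("key")
        if not bucket or not raw_key:
            continue  # skip records that are not S3 object notifications
        # Object keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(raw_key)
        if key.lower().endswith(SUPPORTED_SUFFIXES):
            uploads.append((bucket, key))
    return uploads


def handler(event, context):
    """Hypothetical Lambda entry point: report which uploads were accepted."""
    uploads = extract_uploads(event)
    return {"accepted": [f"s3://{bucket}/{key}" for bucket, key in uploads]}
```

In a real deployment the handler would pass each accepted object on to text extraction and then to HealthLake; here it only returns the accepted locations so the parsing logic can be checked in isolation.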
Security
HealthLake, Amazon S3, Lambda, and CloudFormation are Health Insurance Portability and Accountability Act (HIPAA)-eligible services. These services meet rigorous security and access control standards to help ensure that patients' sensitive health data is protected and meets regulatory compliance requirements. S3 buckets have encryption configured by default, and objects are automatically encrypted using server-side encryption with Amazon S3-managed keys (SSE-S3). Lambda encrypts uploaded files and environment variables at rest. CloudFormation stores data encrypted at rest and uses encrypted channels for service communications, in line with the AWS shared responsibility model.
Additionally, AWS Identity and Access Management (IAM) policies have been scoped down to the minimum permissions required for the service to function properly. Customer data is encrypted both in transit and at rest, with AWS Key Management Service (AWS KMS) managing the keys used for encryption at rest. Per the Fast Healthcare Interoperability Resources (FHIR) specification, if a customer deletes a piece of data, it is only hidden from analysis and results; it is not deleted from the service and is only versioned.
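To make the idea of minimum permissions concrete, the following is a minimal sketch of an IAM policy document for an ingestion function, built as a Python dictionary. The bucket name, the data store ARN, the statement IDs, and the exact set of actions are assumptions for illustration; the policy shipped with the solution may differ.

```python
import json


def build_ingest_policy(bucket_name, datastore_arn):
    """Build a least-privilege policy sketch for a document ingestion role.

    Assumption: the role only needs to read uploaded objects, run Amazon
    Textract text detection, and create FHIR resources in one data store.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadUploadedDocuments",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                # Limit reads to objects in the one intake bucket.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
            {
                "Sid": "ExtractText",
                "Effect": "Allow",
                "Action": ["textract:DetectDocumentText"],
                # This action is not scoped to a specific resource.
                "Resource": "*",
            },
            {
                "Sid": "WriteFhirResources",
                "Effect": "Allow",
                "Action": ["healthlake:CreateResource"],
                # Limit writes to the one HealthLake data store.
                "Resource": datastore_arn,
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical names, shown only to illustrate the output shape.
    policy = build_ingest_policy(
        "example-intake-bucket",
        "arn:aws:healthlake:us-east-1:111122223333:datastore/fhir/EXAMPLE",
    )
    print(json.dumps(policy, indent=2))
```

Each statement grants a single action on the narrowest resource that action allows, which is the pattern the paragraph above describes.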
Reliability
Lambda maintains compute capacity across multiple Availability Zones (AZs) in each AWS Region to help protect your code against individual machine or data center facility failures. Amazon S3 stores data redundantly across a minimum of 3 AZs by default, providing built-in resilience against widespread disaster. Amazon S3 is also designed to sustain data in the event of AZ failure and provides a highly durable storage infrastructure designed for mission-critical and primary data storage.
Performance Efficiency
HealthLake organizes and indexes patient information and stores it in the FHIR industry-standard format to provide a complete view of each patient's medical history. In addition, HealthLake transforms unstructured data using specialized machine learning (ML) models, such as natural language processing (NLP) models, to automatically extract meaningful medical information from the data. You can use FHIR REST API operations to manage and search resources in your HealthLake data store.
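The FHIR REST operations mentioned above can be sketched as follows. This example builds a FHIR R4 DocumentReference resource that carries extracted document text, and a search URL for retrieving such resources. The helper names, the patient ID, the base URL, and the choice of search parameter are assumptions for illustration; requests to a real HealthLake data store must also be signed with AWS Signature Version 4, which is not shown here.

```python
import base64
from urllib.parse import urlencode


def build_document_reference(text, patient_id):
    """Wrap extracted text in a FHIR R4 DocumentReference resource."""
    # FHIR attachments carry inline data as base64.
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return {
        "resourceType": "DocumentReference",
        "status": "current",
        "subject": {"reference": f"Patient/{patient_id}"},
        "content": [
            {"attachment": {"contentType": "text/plain", "data": encoded}}
        ],
    }


def build_search_url(base_url, resource_type, **params):
    """Build a FHIR search URL such as <base>/DocumentReference?subject=..."""
    url = f"{base_url.rstrip('/')}/{resource_type}"
    # Sort parameters so the resulting URL is deterministic.
    query = urlencode(sorted(params.items()))
    return f"{url}?{query}" if query else url
```

A resource built this way would be sent with a POST to the resource-type URL of the data store endpoint, and the search URL would be used with a GET. Sending the text as a DocumentReference is one plausible way to hand unstructured content to HealthLake; it is not confirmed here as the approach this guidance uses.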
Cost Optimization
Lambda is a serverless, event-driven compute service. Serverless architectures remove the need for you to run and maintain physical servers for traditional compute activities, helping you lower transactional costs that may otherwise be spent on maintaining infrastructure.
Moving to Amazon S3 reduces costs by eliminating over-provisioning, minimizing the chance of getting locked into hardware refresh cycles, and providing virtually unlimited scale.
Sustainability
Amazon S3, Lambda, Amazon Textract, and HealthLake are all AWS managed services, shared across a broad customer base to help optimize resource usage. Managed services reduce the amount of infrastructure needed to support cloud workloads, helping you minimize your environmental impact.