This Guidance demonstrates how to ingest PDF and image files into AWS HealthLake so you can generate business insights. In this Guidance, AWS CloudFormation deploys AWS resources, including an AWS Lambda function, an Amazon Simple Storage Service (Amazon S3) bucket, and a HealthLake data store. This Guidance helps healthcare workers turn medical or claims data from PDF and image files into a more usable format so they can share patient medical information more securely, use data to inform clinical decision-making, and improve overall efficiency in the hospital.

Please note: [Disclaimer]

Architecture Diagram

[Architecture diagram description]

Download the architecture diagram PDF 

Well-Architected Pillars

The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

  • CloudFormation automates the creation of AWS resources. Lambda enables automatic responses to events. Amazon CloudWatch monitors resources and events on AWS. HealthLake uses CloudWatch and AWS CloudTrail to monitor performance. Together, these services support development and help you run workloads effectively so you can gain insight into your operations.

    Read the Operational Excellence whitepaper 
  • HealthLake, Amazon S3, Lambda, and CloudFormation are Health Insurance Portability and Accountability Act (HIPAA)-eligible services. These services meet rigorous security and access control standards to help ensure that patients’ sensitive health data is protected and meets regulatory compliance requirements. S3 buckets have encryption configured by default, and objects are automatically encrypted by using server-side encryption with Amazon S3-managed keys (SSE-S3). Lambda encrypts uploaded files and environment variables at rest. CloudFormation stores data encrypted at rest and uses encrypted channels for service communications in compliance with the AWS shared responsibility model.

    Additionally, AWS Identity and Access Management (IAM) policies have been scoped down to the minimum permissions required for the service to function properly. AWS Key Management Service (AWS KMS) encrypts customer data, both in transit and at rest. Per the Fast Healthcare Interoperability Resources (FHIR) specification, if a customer deletes a piece of data, it is only hidden from analysis and results; it is not deleted from the service and is only versioned.

    Read the Security whitepaper 
  • Lambda maintains compute capacity across multiple Availability Zones (AZs) in each AWS Region to help protect your code against individual machine or data center facility failures. Amazon S3 stores data redundantly across a minimum of 3 AZs by default, providing built-in resilience against widespread disaster. Amazon S3 is also designed to sustain data in the event of AZ failure and provides a highly durable storage infrastructure designed for mission-critical and primary data storage.

    Read the Reliability whitepaper 
  • HealthLake organizes and indexes patient information and stores it in the FHIR industry standard format to provide a complete view of each patient’s medical history. In addition, HealthLake transforms unstructured data using specialized machine learning (ML) models, such as natural language processing (NLP), to automatically extract meaningful medical information from the data. You can use FHIR REST API operations to manage and search resources in your HealthLake data store.

    Read the Performance Efficiency whitepaper 
  • Lambda is a serverless, event-driven compute service. Serverless architectures remove the need for you to run and maintain physical servers for traditional compute activities, helping you lower transactional costs that may otherwise be spent on maintaining infrastructure.

    Moving to Amazon S3 reduces costs by eliminating over-provisioning, minimizing the chance of getting locked into hardware refresh cycles, and providing virtually unlimited scale.

    Read the Cost Optimization whitepaper 
  • Amazon S3, Lambda, Amazon Textract, and HealthLake are all AWS managed services, shared across a broad customer base to help optimize resource usage. Managed services reduce the amount of infrastructure needed to support cloud workloads, helping you minimize your environmental impact.

    Read the Sustainability whitepaper 
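
To illustrate the FHIR REST API search operations mentioned in the list above, here is a minimal sketch that builds a FHIR search URL of the form [base]/[type]?parameter=value. It is an illustration only, not part of the Guidance's sample code. The endpoint layout, the EXAMPLE_DATASTORE_ID placeholder, and the helper name fhir_search_url are assumptions; check the HealthLake documentation for your data store's actual endpoint and for the search parameters it supports. Requests to HealthLake must also be signed with AWS Signature Version 4, which this sketch leaves out.

```python
from typing import Mapping
from urllib.parse import quote, urlencode


def fhir_search_url(base_url: str, resource_type: str, params: Mapping[str, str]) -> str:
    """Build a FHIR search URL: [base]/[type]?name=value&... (hypothetical helper)."""
    base = base_url.rstrip("/")
    url = f"{base}/{quote(resource_type, safe='')}"
    # Sort parameters so the resulting URL is deterministic.
    query = urlencode(sorted(params.items()))
    return f"{url}?{query}" if query else url


# Assumed endpoint shape with a placeholder data store ID; substitute your own.
BASE = "https://healthlake.us-east-1.amazonaws.com/datastore/EXAMPLE_DATASTORE_ID/r4/"

print(fhir_search_url(BASE, "DocumentReference", {"subject": "Patient/123", "_count": "10"}))
```

The resulting URL would be sent as a signed HTTP GET request. Keeping URL construction separate from signing and transport makes the query logic easy to test without AWS credentials.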

Implementation Resources

A detailed guide is provided for you to experiment with and use within your AWS account. It walks through each stage of working with the Guidance, including deployment, usage, and cleanup.

The sample code is a starting point. It is industry validated, prescriptive but not definitive, and offers a peek under the hood to help you begin.
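
As a rough picture of the flow this Guidance describes (a file lands in Amazon S3, a Lambda function reacts, and the extracted content is stored in HealthLake as FHIR), the following sketch shows two pieces that need no AWS access: reading bucket and key names from an S3 event notification, and wrapping extracted text in a minimal FHIR R4 DocumentReference resource. It is not the Guidance's sample code. The function names are hypothetical, and text extraction is passed in as a callable (extract_text) rather than implemented, so the actual extraction and HealthLake calls are left as assumptions.

```python
import base64
from typing import Callable, Dict, List, Tuple
from urllib.parse import unquote_plus


def s3_objects_from_event(event: Dict) -> List[Tuple[str, str]]:
    """Return (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        key = s3.get("object", {}).get("key")
        if bucket and key:
            # Object keys arrive URL-encoded in S3 event notifications.
            pairs.append((bucket, unquote_plus(key)))
    return pairs


def document_reference(text: str, source_uri: str) -> Dict:
    """Wrap extracted text in a minimal FHIR R4 DocumentReference (illustrative)."""
    return {
        "resourceType": "DocumentReference",
        "status": "current",
        "description": f"Text extracted from {source_uri}",
        "content": [{
            "attachment": {
                "contentType": "text/plain",
                # FHIR attachments carry inline data as base64.
                "data": base64.b64encode(text.encode("utf-8")).decode("ascii"),
            }
        }],
    }


def build_resources(event: Dict, extract_text: Callable[[str, str], str]) -> List[Dict]:
    """Create one DocumentReference per uploaded object in the event.

    extract_text is a stand-in for whatever performs the PDF or image
    text extraction; it receives (bucket, key) and returns plain text.
    """
    return [
        document_reference(extract_text(bucket, key), f"s3://{bucket}/{key}")
        for bucket, key in s3_objects_from_event(event)
    ]
```

In a deployed function, a Lambda handler would call build_resources with a real extraction function and then write each resource to the HealthLake data store. Passing the extractor in as a parameter keeps this sketch testable without AWS credentials.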

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.

References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.
