Building a HIPAA-ready generative AI architecture for healthcare on AWS

Generative AI applications in healthcare use foundation models (large language models pre-trained on broad datasets) to analyze clinical data, generate responses, and automate workflows. These applications are typically built as AI agents that retrieve patient records, apply clinical knowledge, and return grounded responses to clinical staff. But the same capabilities that make these agents useful also introduce risks specific to healthcare: exposure of Electronic Protected Health Information (ePHI) in model outputs, clinical hallucination, and regulatory non-compliance. For background on the distinction between PHI and ePHI, see What Is the difference between PHI and ePHI? – The HIPAA Guide. Addressing these risks requires a HIPAA-eligible architecture that protects ePHI at every layer of the stack.

Under the Health Insurance Portability and Accountability Act of 1996 (HIPAA), covered entities (organizations that create, receive, maintain, or transmit Protected Health Information (PHI), such as hospitals, health plans, and healthcare providers) must implement technical safeguards that protect ePHI during transmission and at rest, maintain audit controls that record and examine activity in systems containing ePHI, and retain documentation for a minimum of 6 years.

Moving a clinical generative AI application from proof of concept to production requires more than a single service or a single compliance control. It requires a layered architecture where compliance is built into the platform configuration, the AI inference layer, the data foundation, the agent deployment model, and the governance stack. A failure or misconfiguration in a single layer must not expose ePHI or compromise the integrity of clinical outputs.

Common healthcare generative AI use cases and challenges

The following are common scenarios where healthcare organizations adopt these capabilities.

Provider network and scheduling teams use the generative AI agent to search payor provider directories, match patients to in-network providers based on specialty, location, and availability, and coordinate appointment scheduling across systems.

Benefits verification and prior authorization teams use the generative AI agent to query payor systems, verify patient eligibility, and generate prior authorization requests by cross-referencing clinical documentation with payor-specific coverage policies, reducing manual review cycles and accelerating time to approval.

Clinical operations teams use the generative AI agent to query patient records stored in FHIR (Fast Healthcare Interoperability Resources) format, the industry standard for structured clinical data exchange, pulling active conditions, current medications, recent lab observations, and care gaps to generate a pre-visit summary that physicians review before seeing the patient.

Clinical documentation specialists generate preliminary clinical notes from patient-clinician conversations, with generated text cited back to the source transcript.

Revenue cycle management (RCM) teams submit encounter notes and the generative AI agent retrieves relevant medical billing codes (ICD-10-CM) from the knowledge base, cross-references the patient’s FHIR diagnosis history, and proposes codes with grounding scores indicating how well each suggestion is supported by the source documentation.

Population health and analytics teams run cohort queries across the FHIR data store, identify care gaps across patient populations, and generate compliance reports from audit logs.

These use cases share common challenges across the full stack:

ePHI exposure in model inputs and outputs: Foundation models can echo back patient identifiers in their responses, exposing sensitive data to unauthorized viewers.
Clinical hallucination: Models can fabricate clinical information such as medication dosages or lab results with no basis in actual records.
Fragmented clinical data across systems: Poor data quality, weak data integrity, and inconsistent representations make it difficult to provide clean, standards-based context to the model.
Session isolation gaps: Context from one patient interaction can bleed into another without proper boundaries.
Audit trail gaps: Organizations are unable to demonstrate which patient records were accessed, by whom, and when.
Multi-year retention requirements: HIPAA requires documentation to be retained for at least 6 years, far beyond standard log retention windows.

In this post, we describe a comprehensive, HIPAA-ready generative AI architecture for healthcare on Amazon Web Services (AWS) using a defense-in-depth approach. By layering compliance controls at multiple distinct levels, this architecture creates a system where no single point of failure compromises patient data protection, and each component that touches ePHI is independently auditable.

This post focuses on the technical controls and AWS service configurations that support HIPAA-covered entities’ ability to meet or exceed their HIPAA compliance requirements. It doesn’t address administrative safeguards, physical safeguards, organizational requirements, or policies and procedures that are also required under the HIPAA Security Rule. Compliance is a shared responsibility between AWS and the customer, and use of HIPAA-eligible services alone doesn’t constitute HIPAA compliance. Organizations are responsible for evaluating their own environment and regulatory obligations.

Prerequisites

To implement the architecture described in this post, you need:

An AWS account with a Business Associate Agreement (BAA) executed through AWS Artifact
Access to the AWS services referenced in this post (verify that account-level policies or service control policies don’t restrict them)
Sufficient AWS Identity and Access Management (IAM) permissions to provision and configure the resources described in each layer
Familiarity with the HIPAA Security Rule and its technical safeguard requirements
Understanding of FHIR R4 as a clinical data exchange standard

Architecture overview

The proposed architecture addresses each of the previously described challenges through multiple compliance layers that build on each other.

Edge protection (Layer 1) blocks malicious traffic at the global perimeter before it reaches the application. The HIPAA-eligible platform foundation (Layer 2) establishes BAA coverage, private network routing to keep ePHI off the public internet, network isolation to separate resources that process PHI from internet-facing components, and customer-managed encryption. PHI protection (Layer 3) adds redaction and content filtering at the AI inference layer. A FHIR data foundation (Layer 4) replaces fragmented clinical data with standards-based, terminology-normalized patient records.

A grounded Retrieval Augmented Generation (RAG) pipeline (Layer 5) constrains model responses to verified source documents and is designed to detect and block clinical statements that aren’t grounded in the retrieved context. RAG is a technique that retrieves relevant documents from a knowledge base and provides them as context to the foundation model before generating a response. Agentic deployment (Layer 6) introduces infrastructure-level patient session isolation with authenticated identity management. Finally, a governance and audit trail stack (Layer 7) delivers immutable log storage, searchable audit records, continuous compliance monitoring with operational dashboards and alarms, and continuous automated threat detection.

Each layer can be adopted independently, and together they form a defense-in-depth architecture built on HIPAA-eligible AWS services where no single point of failure compromises patient data protection or the integrity of clinical outputs.

Figure 1: Architecture diagram

Layer 1: Edge protection

Before traffic reaches the application, it traverses the public internet where the application infrastructure is exposed to threats such as distributed denial of service (DDoS) attacks, SQL injection, cross-site scripting (XSS), and prompt injection attempts. The edge layer is the first line of defense, intercepting traffic before it reaches the application infrastructure.

We recommend that DNS resolution, content delivery, and web application firewall inspection occur at the edge, before traffic enters the AWS Regional infrastructure. Amazon Route 53 provides DNS with health checks and DNS-level routing policies that support failover configurations. AWS Shield Advanced wraps both Route 53 and Amazon CloudFront with always-on DDoS detection. For healthcare organizations, DDoS resilience is a patient safety concern: an outage of a clinical decision support system during peak hours directly impacts care delivery.

CloudFront provides content delivery with HTTPS enforcement and serves as the attachment point for AWS WAF at the edge, configured with managed rule groups to block injection attempts such as SQL injection, XSS, and prompt injection patterns.
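The edge rules described above can be sketched as a web ACL definition. This is a minimal sketch, assuming the dictionary would be passed to the AWS WAFV2 CreateWebACL operation (for example through boto3 in us-east-1, where CloudFront-scoped web ACLs are created). The ACL name is hypothetical, and prompt injection patterns would need an additional custom rule that isn't shown here.

```python
from typing import Any, Dict, List

# AWS managed rule groups covering common exploits (including XSS), SQL injection,
# and known bad inputs. Prompt injection would need a custom rule (not shown).
MANAGED_RULE_GROUPS: List[str] = [
    "AWSManagedRulesCommonRuleSet",
    "AWSManagedRulesSQLiRuleSet",
    "AWSManagedRulesKnownBadInputsRuleSet",
]


def visibility(metric_name: str) -> Dict[str, Any]:
    """Emit CloudWatch metrics and sampled requests for a rule or ACL."""
    return {
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": metric_name,
    }


def managed_rule(name: str, priority: int) -> Dict[str, Any]:
    """Build one rule entry that references an AWS managed rule group."""
    return {
        "Name": name,
        "Priority": priority,
        "Statement": {
            "ManagedRuleGroupStatement": {"VendorName": "AWS", "Name": name}
        },
        # Rule group references use OverrideAction instead of Action.
        "OverrideAction": {"None": {}},
        "VisibilityConfig": visibility(name),
    }


def build_edge_web_acl(acl_name: str) -> Dict[str, Any]:
    """Assemble the request for a CloudFront-scoped web ACL."""
    return {
        "Name": acl_name,
        "Scope": "CLOUDFRONT",
        "DefaultAction": {"Allow": {}},
        "Rules": [managed_rule(n, i) for i, n in enumerate(MANAGED_RULE_GROUPS)],
        "VisibilityConfig": visibility(acl_name),
    }


web_acl_request = build_edge_web_acl("clinical-agent-edge")  # hypothetical name
```

Keeping the rule list as data makes it easier to review which managed rule groups are active and to add organization-specific rules later.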

Layer 2: HIPAA-eligible platform configuration

Before a cloud service can process PHI, the organization must execute a BAA with the cloud provider. But a BAA alone isn’t sufficient. The HIPAA Security Rule requires transmission security (45 CFR 164.312(e)(1)) and encryption of electronic PHI at rest (45 CFR 164.312(a)(2)(iv)).

Executing a BAA through AWS Artifact covers HIPAA-eligible services in the account under a single agreement. AWS PrivateLink Interface endpoints route traffic between an Amazon VPC and AWS services through the private network, keeping PHI off the public internet. We recommend separating public subnets (for the Application Load Balancer and NAT Gateways) from private subnets where compute resources that process ePHI run. Compute resources in private subnets, whether AWS Lambda, Amazon Elastic Container Service (Amazon ECS) with AWS Fargate, or Amazon Elastic Kubernetes Service (Amazon EKS), resolve service endpoints to private IP addresses through DNS, transparently routing traffic through the private network.

AWS Key Management Service (AWS KMS) customer-managed keys add a second authorization gate beyond IAM policies: even if an IAM policy is misconfigured, data remains protected because the principal also needs explicit kms:Decrypt permission on the specific key. We recommend scoping IAM execution roles to specific foundation model Amazon Resource Names (ARNs) rather than wildcards. We recommend storing Electronic Health Record (EHR) credentials and API keys in AWS Secrets Manager with automatic rotation rather than in environment variables.
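As an illustration of scoping an execution role, the following minimal sketch builds an IAM policy document that allows invocation of one foundation model and decryption with one customer-managed key. The model ID, account ID, and key ID are placeholders, and the helper names are hypothetical. A real role would likely need further statements for the other services it calls.

```python
from typing import Any, Dict


def build_invoke_policy(region: str, model_id: str, kms_key_arn: str) -> Dict[str, Any]:
    """Build an IAM policy scoped to one foundation model ARN and one KMS key."""
    # Foundation model ARNs have no account ID segment.
    model_arn = f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "InvokeSpecificModel",
                "Effect": "Allow",
                "Action": ["bedrock:InvokeModel"],
                "Resource": [model_arn],
            },
            {
                "Sid": "DecryptWithCustomerManagedKey",
                "Effect": "Allow",
                "Action": ["kms:Decrypt"],
                "Resource": [kms_key_arn],
            },
        ],
    }


def has_wildcard_resource(policy: Dict[str, Any]) -> bool:
    """Return True if any statement names a resource containing a wildcard."""
    return any(
        "*" in resource
        for statement in policy["Statement"]
        for resource in statement["Resource"]
    )


policy = build_invoke_policy(
    region="us-east-1",
    model_id="example-model-id",  # placeholder, not a real model ID
    kms_key_arn="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",  # placeholder
)
```

A check such as has_wildcard_resource could run in a deployment pipeline to flag policies that drift back to wildcard resources.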

Layer 3: ePHI protection with Amazon Bedrock Guardrails

The HIPAA Privacy Rule requires de-identification of PHI before it can be used or disclosed without patient authorization. The HIPAA Safe Harbor method (45 CFR 164.514(b)(2)) specifies 18 categories of identifiers that must be removed. When a foundation model echoes back patient names, Social Security numbers, or addresses in its response, that ePHI is exposed to anyone who can read the API output.

Amazon Bedrock model invocation logging can capture full prompt and response content. If a user submits ePHI in a prompt, such as a patient name, diagnosis, or medical record number, that data is persisted in the logging destination. Organizations must treat invocation logs as ePHI-containing data and apply the same HIPAA safeguards (encryption, access controls, retention policies) as other ePHI stores.

Amazon Bedrock Guardrails provides customizable safeguards that can be applied across foundation models available in Amazon Bedrock, independently of the model invocation. Configuring guardrails to detect and redact ePHI before it reaches the foundation model reduces this exposure by preventing sensitive identifiers from being persisted in both model outputs and invocation logs. Three policy types work together to protect clinical outputs:

Sensitive information filters: Detect and anonymize PHI entity types from both model inputs and outputs. Bedrock Guardrails supports configurable entity types and custom regex patterns to cover organization-specific identifiers beyond the built-in types.
Content filters: Block harmful content categories including hate speech, sexual content, violence, misconduct, and prompt attacks at configurable strength levels. The prompt attack category is particularly important in a healthcare context, because it detects attempts to override the system prompt or manipulate the model into providing responses outside its intended clinical scope.
Contextual grounding checks: Evaluate model responses against source documents to detect and block hallucinated clinical facts that aren’t grounded in the retrieved context. Responses that fall below the grounding threshold are blocked rather than returned to the application, treating ungrounded clinical statements as a compliance violation rather than a quality issue.

Amazon Bedrock Guardrails interventions are logged in AWS CloudTrail with the requesting identity, timestamp, and guardrail identifier, providing an auditable record of ePHI protection enforcement.
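The three policy types above can be sketched as a single guardrail definition. This is a minimal sketch, assuming the dictionary would be passed to the Amazon Bedrock CreateGuardrail operation. The guardrail name, the medical record number pattern, the blocked messages, the selected entity types, and the 0.75 thresholds are illustrative assumptions, not recommended values.

```python
import re
from typing import Any, Dict

# Built-in entity types to anonymize. This subset is illustrative and does not
# cover all 18 Safe Harbor identifier categories.
PII_ENTITY_TYPES = ["NAME", "ADDRESS", "EMAIL", "PHONE", "US_SOCIAL_SECURITY_NUMBER"]


def build_clinical_guardrail_config(name: str, mrn_pattern: str) -> Dict[str, Any]:
    """Build a guardrail definition combining the three policy types."""
    re.compile(mrn_pattern)  # fail early if the custom pattern is not a valid regex
    return {
        "name": name,
        "blockedInputMessaging": "This request can't be processed.",
        "blockedOutputsMessaging": "The response was blocked by a safety control.",
        # Sensitive information filters: built-in entities plus a custom regex.
        "sensitiveInformationPolicyConfig": {
            "piiEntitiesConfig": [
                {"type": t, "action": "ANONYMIZE"} for t in PII_ENTITY_TYPES
            ],
            "regexesConfig": [
                {
                    "name": "medical-record-number",
                    "description": "Organization-specific MRN format (hypothetical)",
                    "pattern": mrn_pattern,
                    "action": "ANONYMIZE",
                }
            ],
        },
        # Content filters: prompt attacks are evaluated on input only.
        "contentPolicyConfig": {
            "filtersConfig": [
                {"type": "PROMPT_ATTACK", "inputStrength": "HIGH", "outputStrength": "NONE"},
                {"type": "MISCONDUCT", "inputStrength": "HIGH", "outputStrength": "HIGH"},
            ]
        },
        # Contextual grounding checks: scores below the threshold are blocked.
        "contextualGroundingPolicyConfig": {
            "filtersConfig": [
                {"type": "GROUNDING", "threshold": 0.75},
                {"type": "RELEVANCE", "threshold": 0.75},
            ]
        },
    }


guardrail_config = build_clinical_guardrail_config("clinical-guardrail", r"\bMRN-\d{8}\b")
```

Thresholds and filter strengths are worth tuning against representative clinical prompts before enforcing them in production.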

Layer 4: FHIR data foundation with AWS HealthLake

The HIPAA Security Rule (45 CFR 164.312(c)(1)) requires policies and procedures to protect ePHI from improper alteration or destruction. Healthcare data is complex because the same clinical concept can be represented differently across systems. For example, diabetes, Type 2 DM, and E11.9 can all refer to the same condition.

AWS HealthLake is a HIPAA-eligible service that stores and analyzes healthcare data in the FHIR R4 format, automatically standardizing medical terminologies including SNOMED CT, ICD-10-CM, RxNorm, and LOINC. A generative AI application can retrieve structured patient records (conditions, medications, observations, clinical history) through the FHIR REST API before invoking the foundation model, providing verified clinical context rather than relying on general training data. Scoping each FHIR query to the clinical data needed for a specific interaction supports the data minimization principle, and FHIR data access operations are logged in CloudTrail so that access can be reviewed.
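To make the retrieval step concrete, the following minimal sketch builds narrowly scoped FHIR R4 search URLs for a pre-visit summary. The data store ID and patient ID are placeholders, and the helper names are hypothetical. The search parameters are standard FHIR R4 parameters, but support varies by server, so each one should be verified against the data store. Requests would also need to be signed and authorized, which isn't shown here.

```python
from typing import Dict
from urllib.parse import urlencode


def fhir_search_url(base_url: str, resource_type: str, params: Dict[str, object]) -> str:
    """Build a FHIR search URL for one resource type."""
    return f"{base_url.rstrip('/')}/{resource_type}?{urlencode(params)}"


def pre_visit_queries(base_url: str, patient_id: str) -> Dict[str, str]:
    """Return one narrowly scoped query per category of clinical context."""
    return {
        "conditions": fhir_search_url(
            base_url, "Condition", {"patient": patient_id, "clinical-status": "active"}
        ),
        "medications": fhir_search_url(
            base_url, "MedicationRequest", {"patient": patient_id, "status": "active"}
        ),
        "labs": fhir_search_url(
            base_url,
            "Observation",
            {"patient": patient_id, "category": "laboratory", "_count": 10},
        ),
    }


# Placeholder endpoint: substitute the Region and data store ID for your account.
BASE_URL = "https://healthlake.us-east-1.amazonaws.com/datastore/EXAMPLE_DATASTORE_ID/r4/"
queries = pre_visit_queries(BASE_URL, "example-patient-id")
```

Each query filters by patient and by status or category, so the agent retrieves only the resources relevant to the current interaction.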

Layer 5: Grounded RAG pipeline

A generative AI agent that generates clinical statements from general knowledge alone poses a direct risk to patient safety. The HIPAA Security Rule (45 CFR 164.308(a)(1)) requires risk analysis and risk management measures. Grounding model responses in verified source documents, and preventing ungrounded responses from reaching end users, is the risk management measure that this layer implements.

Amazon Bedrock Knowledge Bases provides a fully managed RAG pipeline that ingests clinical documents, generates vector embeddings, and retrieves semantically relevant document chunks at query time. Amazon Bedrock Knowledge Bases supports multiple vector store options, including Amazon OpenSearch Serverless, Amazon Aurora, Amazon Neptune Analytics, and Amazon Simple Storage Service (Amazon S3) Vectors. The RetrieveAndGenerate API returns a response with a grounding score measuring how well the response is supported by source documents.

When the grounding score falls below the configured threshold, the Amazon Bedrock Guardrails contextual grounding checks (described in Layer 3) block the response before it is returned to the application. While Layer 4 provides the clinical data foundation with structured FHIR records, Layer 5 adds precision and verification: the RAG pipeline retrieves only the document chunks relevant to the specific query, and the grounding check verifies the model’s response is supported by those documents.
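The interaction between the RAG pipeline and the grounding check can be sketched as a request builder and a response gate. This is a minimal sketch, assuming the request shape of the RetrieveAndGenerate operation with a guardrail attached and assuming the response carries guardrailAction, citations, and output fields. All identifiers are placeholders, extract_safe_answer is a hypothetical helper, and the field names should be verified against the current API reference.

```python
from typing import Any, Dict, Optional


def build_rag_request(
    question: str,
    knowledge_base_id: str,
    model_arn: str,
    guardrail_id: str,
    guardrail_version: str,
) -> Dict[str, Any]:
    """Build a RetrieveAndGenerate request that applies a guardrail to generation."""
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
                "generationConfiguration": {
                    "guardrailConfiguration": {
                        "guardrailId": guardrail_id,
                        "guardrailVersion": guardrail_version,
                    }
                },
            },
        },
    }


def extract_safe_answer(response: Dict[str, Any]) -> Optional[str]:
    """Return the answer only if no guardrail intervened and sources were cited."""
    if response.get("guardrailAction") == "INTERVENED":
        return None  # blocked, for example by the contextual grounding check
    if not response.get("citations"):
        return None  # treat an answer without cited sources as unusable
    return response["output"]["text"]


request = build_rag_request(
    question="Which ICD-10-CM code applies to type 2 diabetes without complications?",
    knowledge_base_id="EXAMPLEKBID",  # placeholder
    model_arn="arn:aws:bedrock:us-east-1::foundation-model/example-model-id",  # placeholder
    guardrail_id="exampleguardrailid",  # placeholder
    guardrail_version="1",
)
```

Returning None rather than a partial answer keeps the calling application from displaying text that the guardrail rejected.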

Layer 6: Agentic deployment with Amazon Bedrock AgentCore

The HIPAA Security Rule requires access controls (45 CFR 164.312(a)(1)), person or entity authentication (45 CFR 164.312(d)), and audit controls (45 CFR 164.312(b)). Without session boundaries, authentication, and structured trace logging, a clinical generative AI agent might not meet these requirements.

Amazon Bedrock AgentCore provides purpose-built infrastructure for operating AI agents at production scale:

AgentCore Runtime assigns a unique session identifier to each patient interaction, isolating execution context at the infrastructure level. Session data is encrypted with the customer-managed KMS key.
AgentCore Identity provides authentication compatible with Amazon Cognito, Microsoft, Google, Salesforce, and OAuth 2.0 providers, enabling role-based access control where, for example, a physician receives different data access than a billing specialist.
AgentCore Policy enforces least-privilege access at the agent-to-tool boundary using Cedar, an open source authorization policy language. Cedar policies use a default-deny posture with forbid-overrides-permit evaluation, and organizations can define fine-grained rules based on user identity and tool input parameters. Because policies are evaluated deterministically outside the agent’s code, they can’t be bypassed through prompt manipulation.
AgentCore Observability provides OpenTelemetry-compatible trace logs of agent reasoning steps, producing audit-ready logs for HIPAA compliance reporting.
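The default-deny, forbid-overrides-permit semantics mentioned above can be illustrated in a few lines of Python. This is a conceptual sketch only: it isn't Cedar syntax or an AgentCore API, and the roles, tool names, and request fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Request = Dict[str, str]


@dataclass(frozen=True)
class Policy:
    effect: str  # "permit" or "forbid"
    matches: Callable[[Request], bool]


def is_authorized(request: Request, policies: List[Policy]) -> bool:
    """Default deny: allow only if a permit matches and no forbid matches."""
    effects = {p.effect for p in policies if p.matches(request)}
    if "forbid" in effects:
        return False  # a matching forbid overrides any matching permit
    return "permit" in effects  # no matching permit means deny


POLICIES = [
    # Physicians may call the patient record tool.
    Policy("permit", lambda r: r["role"] == "physician" and r["tool"] == "get_patient_record"),
    # Billing specialists may call the billing code tool.
    Policy("permit", lambda r: r["role"] == "billing_specialist" and r["tool"] == "get_billing_codes"),
    # No one may request a patient other than the one bound to the session.
    Policy("forbid", lambda r: r["requested_patient_id"] != r["session_patient_id"]),
]

physician_request = {
    "role": "physician",
    "tool": "get_patient_record",
    "requested_patient_id": "patient-a",
    "session_patient_id": "patient-a",
}
allowed = is_authorized(physician_request, POLICIES)
```

Because the decision depends only on the request attributes and the policy set, the same request always produces the same result, regardless of what the model was prompted to do.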

When AgentCore Runtime processes a patient interaction, it assigns a unique sessionId that appears in CloudTrail events, Observability traces, Guardrail evaluations, and HealthLake access records, enabling complete audit trail reconstruction for a given patient interaction.

Layer 7: Governance and audit trails

The HIPAA Security Rule requires audit controls (45 CFR 164.312(b)), information system activity review (45 CFR 164.308(a)(1)(ii)(D)), and documentation retention for a minimum of 6 years (45 CFR 164.316(b)(2)(i)), though state regulations might require longer retention periods. Without a governance stack, the compliance controls in the previous layers can’t be monitored continuously or queried for audit investigations.

Immutable audit storage: Amazon S3 with Object Lock in COMPLIANCE mode is designed to prevent audit log deletion or modification for the configured retention period, satisfying the 6-year minimum. CloudTrail log file integrity validation (SHA-256 with RSA signing) provides a mechanism to verify logs haven’t been modified since delivery.
Resource-level data event logging: CloudTrail advanced event selectors capture data events for compute functions, Amazon Bedrock agents, knowledge bases, and guardrails, providing the granularity to demonstrate ePHI access is logged at the individual record level.
Searchable audit logs: Amazon Athena with AWS Glue partition projection enables serverless SQL querying of CloudTrail logs across configurable time ranges, supporting both routine compliance monitoring and ad hoc investigations.
Operational dashboards and alarms: Amazon CloudWatch dashboards provide generative AI-specific metrics such as invocation count, Guardrail interventions, error rates, and session count. CloudWatch alarms with Amazon Simple Notification Service (Amazon SNS) notify the operations team when compliance thresholds are breached.
Threat detection and compliance posture: Amazon GuardDuty monitors VPC Flow Logs and CloudTrail events for anomalous access patterns. AWS Security Hub aggregates findings from GuardDuty, AWS Config, and other services into a unified HIPAA compliance view.
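As an illustration of the immutable audit storage control, the following minimal sketch builds a default retention rule for Amazon S3 Object Lock in COMPLIANCE mode. It assumes the dictionary would be passed as the ObjectLockConfiguration argument of the S3 PutObjectLockConfiguration operation on a bucket that has Object Lock enabled. The helper name and the minimum-retention check are illustrative.

```python
from typing import Any, Dict

HIPAA_MIN_RETENTION_YEARS = 6  # documentation retention minimum cited above


def build_object_lock_config(retention_years: int = HIPAA_MIN_RETENTION_YEARS) -> Dict[str, Any]:
    """Build a default-retention rule in COMPLIANCE mode for an audit log bucket."""
    if retention_years < HIPAA_MIN_RETENTION_YEARS:
        raise ValueError(
            f"Retention must be at least {HIPAA_MIN_RETENTION_YEARS} years, "
            f"got {retention_years}"
        )
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {
            "DefaultRetention": {
                # COMPLIANCE mode retention can't be shortened or removed by any user.
                "Mode": "COMPLIANCE",
                "Years": retention_years,
            }
        },
    }


object_lock_config = build_object_lock_config()
```

Validating the retention period in code guards against a configuration change that silently drops below the 6-year minimum. State requirements that call for longer periods can be met by passing a larger value.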

Key architectural patterns

The preceding layered architecture is built on five patterns that are directly applicable to production healthcare generative AI systems.

Incremental compliance adoption: Compliance controls might not need to be adopted simultaneously. By designing each layer to be independently adoptable, you can prioritize the highest-risk gaps first and validate each control before moving to the next.

Defense in depth for ePHI protection: ePHI protection is applied at multiple independent layers: edge filtering (Layer 1), private networking (Layer 2), PHI redaction (Layer 3), HIPAA-eligible data storage (Layer 4), session isolation (Layer 6), and threat detection (Layer 7). No single layer is sufficient on its own. A failure in a single layer is less likely to expose PHI when the others remain active.

Grounding as a clinical safety control: Grounding checks in blocking mode (Layer 5) treat ungrounded responses the same way a compliance control treats unauthorized PHI access: as something to be blocked, not only logged.

Session identifier as an audit correlation key: The sessionId from Layer 6 links the components of the audit trail for a patient interaction: CloudTrail events, AgentCore Observability traces, Amazon Bedrock Guardrail evaluations, and HealthLake access records. This enables end-to-end audit trail reconstruction for HIPAA compliance reporting.
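The correlation pattern can be sketched as a simple join on the session identifier. This is a minimal sketch, assuming audit records from each source have already been exported and normalized into dictionaries with sessionId, timestamp (ISO 8601, UTC), source, and event fields. That record shape and the sample values are hypothetical and don't reflect the native format of any of these services.

```python
from typing import Dict, List

AuditRecord = Dict[str, str]


def build_session_timeline(records: List[AuditRecord], session_id: str) -> List[AuditRecord]:
    """Return all records for one session, ordered by time."""
    matching = [r for r in records if r.get("sessionId") == session_id]
    # ISO 8601 UTC timestamps in a uniform format sort correctly as strings.
    return sorted(matching, key=lambda r: r["timestamp"])


# Hypothetical, normalized records from four audit sources.
records = [
    {"source": "guardrails", "sessionId": "s-001", "timestamp": "2025-01-15T10:00:03Z", "event": "PII_ANONYMIZED"},
    {"source": "cloudtrail", "sessionId": "s-001", "timestamp": "2025-01-15T10:00:01Z", "event": "InvokeAgent"},
    {"source": "healthlake", "sessionId": "s-002", "timestamp": "2025-01-15T10:00:02Z", "event": "SearchCondition"},
    {"source": "healthlake", "sessionId": "s-001", "timestamp": "2025-01-15T10:00:02Z", "event": "SearchCondition"},
    {"source": "observability", "sessionId": "s-001", "timestamp": "2025-01-15T10:00:04Z", "event": "ResponseReturned"},
]

timeline = build_session_timeline(records, "s-001")
```

The resulting timeline shows, for one patient interaction, which records were accessed and which controls fired, in order.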

Consistent customer-managed encryption: AWS KMS customer-managed keys applied consistently across the services that process ePHI (Amazon Bedrock, AWS HealthLake, Amazon S3, Amazon Bedrock AgentCore, CloudTrail) support the HIPAA encryption requirement with auditable key policies and CloudTrail records of encryption operations. We recommend defining your key strategy based on your security and scope requirements.

Compliance controls mapped to HIPAA Security Rule

The following table maps each compliance control to the specific HIPAA Security Rule requirement it addresses.

Compliance control	HIPAA Security Rule requirement	Layer
Edge protection (Route 53, CloudFront, AWS WAF, Shield)	45 CFR 164.312(e)(1) – Transmission security	1
BAA execution for Amazon Bedrock	45 CFR 164.308(b)(1) – Business Associate contracts	2
AWS PrivateLink for private network routing	45 CFR 164.312(e)(1) – Transmission security	2
Customer-managed KMS encryption	45 CFR 164.312(a)(2)(iv) – Encryption and decryption	2
Amazon Bedrock Guardrails PHI redaction	45 CFR 164.312(e)(2)(ii) – Encryption of PHI in transit	3
AWS HealthLake FHIR R4 datastore	45 CFR 164.312(c)(1) – Integrity controls	4
Grounding checks in blocking mode	45 CFR 164.308(a)(1) – Risk management	5
AgentCore Runtime session isolation	45 CFR 164.312(a)(1) – Access control	6
AgentCore Identity with Amazon Cognito	45 CFR 164.312(d) – Person or entity authentication	6
CloudTrail data event logging with immutable Amazon S3 storage	45 CFR 164.312(b) – Audit controls	7
CloudWatch compliance alarms	45 CFR 164.308(a)(1)(ii)(D) – Information system activity review	7
GuardDuty threat detection + Security Hub compliance posture	45 CFR 164.308(a)(1)(ii)(D) – Information system activity review	7

Conclusion

Securing healthcare generative AI applications requires compliance controls at each level of the architecture. The layered approach described in this post, from edge protection through platform configuration, PHI redaction, FHIR data foundation, grounded RAG, agentic session isolation, and governance, creates a defense-in-depth architecture where components that touch ePHI are independently auditable.

The key takeaway is that compliance and innovation aren’t mutually exclusive. By building compliance into the architecture from the start rather than retrofitting it later, you can adopt generative AI with confidence that patient data is protected, clinical outputs are grounded in verified records, and audit trails are available when regulators or internal compliance teams need them.

The Healthcare Industry Lens of the AWS Well-Architected Framework reinforces these principles: encrypt sensitive data with customer-managed keys, log data access and modification events, and implement least privilege at each layer. The layered architecture described in this post implements these principles using AWS services that are designated as HIPAA-eligible and covered under the AWS BAA.

Compliance is an ongoing process. As generative AI capabilities evolve and regulatory guidance for AI in healthcare continues to develop, the layered architecture provides a foundation that can adapt: new compliance controls can be added at the appropriate layer without disrupting the others. We recommend continuously monitoring your environment, regularly reviewing access policies and Guardrail configurations, and treating compliance as a fundamental architectural principle rather than a one-time exercise. The services and patterns described in this post are available today and designed to work together.

Next steps

To get started, execute a BAA through AWS Artifact to establish HIPAA coverage for eligible services in your account. The AWS Solutions Library and AWS Samples GitHub repository provide reference implementations for several components described in this architecture, including VPC configurations with PrivateLink endpoints, Amazon Bedrock Guardrails configurations, and HealthLake FHIR data store deployments.
