Anterior reduces clinical review time by 75% with Amazon Bedrock and Llama

Anterior, a clinician-led AI company building automation for healthcare payers (insurance companies), set out to solve one of healthcare’s hardest data problems: identifying and structuring clinical documents that often arrive as hundreds of pages of unstructured records. After implementing Llama models from Meta on Amazon Bedrock to power document identification within customers’ Amazon Web Services (AWS) environments, Anterior achieved production-grade performance while meeting strict healthcare data governance requirements. Using this approach, Anterior delivered complete document extraction, improved metadata accuracy, and enabled downstream automation that reduces manual clinical review by 75 percent.
Addressing healthcare’s document identification challenge
Healthcare administrative costs in the United States exceed $950 billion annually in a $5 trillion industry. Much of this burden comes from clinical review workflows inside health plans, where physicians and nurses manually review large packets of medical records to approve treatments, verify coverage, and manage patient care. Anterior is a clinician-led AI company focused on automating these workflows for healthcare payers, organizations that sit at the intersection of providers and patients.
At the center of these workflows is a task that sounds deceptively simple: before AI can reason about a clinical case, it must understand what it's looking at. Document identification is the prerequisite for all downstream automation. Anterior must segment each incoming clinical packet into its constituent documents, identify where each begins and ends, and extract structured metadata including document type, title, author, and creation date. Only then can clinical automation proceed, whether that’s routing an MRI report to the right step in a prior authorization review, surfacing recent imaging for a clinician, or verifying that documentation supports a recommended course of care. However, clinical packets can be hundreds of pages long and arrive as faxes, scanned PDFs, and merged multi-document files. They may combine imaging, tables, forms, and even handwritten notes in ways traditional AI and ML approaches have long struggled to handle reliably at production scale.
The stakes of getting this wrong are high. “Even small errors in document identification can cascade downstream, because you’re basing clinical decisions on incomplete or incorrect information,” said Khadija Mahmoud, MD, clinician scientist at Anterior. A misidentified document boundary could mean surfacing clinical information from the wrong part of a patient record, while a dropped page could create a compliance gap. Any model capable of handling production-grade document identification also has to meet strict healthcare data governance requirements. Many of Anterior’s largest customers require that all AI processing, including LLM inference on Protected Health Information (PHI), occur entirely within their AWS environment, making external APIs or third-party infrastructure unacceptable.
Building a scalable pipeline for clinical automation
Anterior implemented a document identification workflow powered by Meta Llama models running on Amazon Bedrock. This architecture processes complex clinical document packets end to end within a customer's AWS environment, so patient data never leaves that boundary. The workflow operates as a two-stage pipeline. In the first stage, large clinical PDFs are processed using optical character recognition (OCR) and layout-aware parsing. Each page is converted into structured text extracts while preserving page references and unique identifiers. In the second stage, a language model analyzes these parsed extracts to determine document boundaries, classify document type, and extract metadata such as title, author, creation date, and a clinical description. This stage is where Llama models on Amazon Bedrock do the work.
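The second stage amounts to a single structured-extraction inference call per packet. A minimal sketch of what such a call might look like against the Amazon Bedrock Converse API is below; the model ID, prompt wording, and response schema are illustrative assumptions, not Anterior's actual implementation.

```python
import json

# Hypothetical Llama model ID on Bedrock; Anterior's actual configuration is not public.
MODEL_ID = "meta.llama3-70b-instruct-v1:0"

def build_request(page_extracts):
    """Build a Bedrock Converse request asking a Llama model to segment
    parsed page extracts into documents with structured metadata."""
    pages = "\n\n".join(
        f"[PAGE {p['page_id']}]\n{p['text']}" for p in page_extracts
    )
    prompt = (
        "Segment the following clinical packet into its constituent documents. "
        "Return JSON: a list of objects with keys 'pages' (list of page ids), "
        "'doc_type', 'title', 'author', 'creation_date', and 'description'.\n\n"
        + pages
    )
    return {
        "modelId": MODEL_ID,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": 2048, "temperature": 0.0},
    }

def parse_response(response):
    """Extract the JSON document list from a Converse API response body."""
    text = response["output"]["message"]["content"][0]["text"]
    return json.loads(text)

# Inside the customer's AWS environment, the request would be sent with:
#   client = boto3.client("bedrock-runtime")
#   response = client.converse(**build_request(page_extracts))
```

Because the inference endpoint lives inside the customer's AWS account, the PHI in `page_extracts` never crosses the account boundary.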
Anterior evaluated Llama Maverick 17B and Llama Scout 17B against a frontier-scale proprietary multimodal model using identical prompts, datasets, and evaluation criteria. The evaluation ran entirely within AWS infrastructure and measured production readiness across accuracy, completeness, consistency, and latency. Datasets were generated through Anterior's synthetic data pipeline and curated by clinician scientists to reflect real-world complexity: ambiguous formatting, multi-document packets, and edge cases. Llama was a strong candidate for several reasons: it supports multimodal inputs (which aligns with the inherently multimodal nature of clinical data), enables efficient inference for high-throughput workloads, and offers a large context window that comfortably handles lengthy clinical packets. It is also among the most tunable open-weight models available, allowing Anterior to tailor model behavior through prompting and system-level constraints and explore smaller, specialized models tuned to specific clinical tasks rather than relying solely on frontier-scale models.
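An evaluation of this shape can be sketched as a loop that holds prompts and datasets fixed while varying only the model under test. The metric below (author-identification accuracy against clinician-curated ground truth) mirrors one of the criteria described; the function names and case structure are illustrative assumptions.

```python
def evaluate(models, cases, run_inference):
    """Compare models on identical cases. `run_inference(model, case)` returns
    predicted metadata; each case carries clinician-curated ground truth.
    Returns per-model author-identification accuracy."""
    results = {}
    for model in models:
        correct = 0
        for case in cases:
            pred = run_inference(model, case)
            if pred.get("author") == case["truth"]["author"]:
                correct += 1
        results[model] = correct / len(cases)
    return results
```

Holding the prompt, dataset, and scoring function constant across models is what makes the comparison apples-to-apples.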
Running Llama on Amazon Bedrock allowed the company’s team of clinicians and engineers to focus on solving the clinical problem rather than managing infrastructure. Bedrock provides a unified interface for evaluating and deploying foundation models while integrating directly with AWS environments. "Many major health plans we work with ask the same question: 'Can we run AI on PHI inside our AWS environment?' Bedrock-hosted Llama models let us say yes without compromising performance," said Anuj Iravane, applied AI lead at Anterior. Bedrock also preserves flexibility: Anterior can evaluate additional models or deploy custom fine-tuned versions as clinical requirements evolve without rebuilding its architecture.
Accelerating clinical decisions and operational efficiency
Across a dataset of clinician-curated synthetic clinical cases, both Llama Maverick 17B and Llama Scout 17B delivered production-grade performance for clinical document identification. The models matched a frontier-scale model with hundreds of billions of parameters while running more efficiently, activating only 17B parameters within their larger model architectures. They achieved complete page coverage, meaning every page in a clinical packet was assigned exactly once with no dropped or duplicated content. The results were particularly strong in metadata extraction. Llama models matched or exceeded the frontier baseline when identifying key information such as document authorship and descriptions. Author identification accuracy reached as high as 97 percent, compared with 93.5 percent for the frontier model, while description faithfulness reached 98.4 percent. “We were impressed,” said Iravane. “Llama models on Bedrock matched our frontier baseline at a fraction of the cost—and in metadata extraction, they actually outperformed it. You don’t need the biggest model to solve healthcare’s hardest problems.”
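The "complete page coverage" criterion, every page assigned exactly once with nothing dropped or duplicated, is the kind of property that can be checked mechanically. A minimal sketch, assuming documents carry a `pages` list of page identifiers:

```python
from collections import Counter

def check_page_coverage(packet_pages, documents):
    """Return True if every page in the packet is assigned to exactly one
    identified document: no dropped pages, no duplicated pages."""
    assigned = Counter(p for doc in documents for p in doc["pages"])
    dropped = [p for p in packet_pages if assigned[p] == 0]
    duplicated = [p for p, n in assigned.items() if n > 1]
    return not dropped and not duplicated
```

A dropped page is a potential compliance gap and a duplicated page can surface the same clinical information under two documents, which is why this check gates downstream automation.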
Latency across models was comparable, but the efficiency advantages of smaller Llama models running on Bedrock compound at scale. As document volumes grow, Anterior can process more cases per unit of compute at a lower cost per document without sacrificing accuracy. The downstream impact on healthcare workflows is significant. In prior authorization review, the Anterior platform reduces manual clinical review time by 75 percent while maintaining 99.24 percent clinical accuracy. A KLAS Research case study found the system reduced patient wait times for cancer care approvals from days or weeks to just 155 seconds. For a regional healthcare organization serving about one million covered lives, these improvements translate to approximately $30 million in annual operational savings. Faster document understanding ultimately means faster clinical decisions and quicker access to care for patients.
Anterior moved from initial integration to deployment in six weeks. Llama models are now part of the company’s production document identification workflow serving multiple enterprise customers. The results also validated a broader architectural approach: that smaller open-weight models hosted on Amazon Bedrock can compete with frontier-scale general-purpose models across healthcare workflows. “Much of US healthcare lives on AWS,” said Iravane. “Proving that Llama models on Bedrock can match frontier performance means our customers can deploy faster, control costs better, and maintain the security posture they require.”