Amazon Comprehend Medical features


Amazon Comprehend Medical is a HIPAA-eligible natural language processing (NLP) service that uses machine learning to extract health data from medical text. No machine learning experience is required.

Much health data today is stored as free-form medical text such as doctors’ notes, clinical trial reports, and patient health records. Manually extracting the data is time-consuming, while automated rule-based extraction misses part of the picture because it fails to take context into account. As a result, the data remains unusable for the large-scale analytics needed to advance the healthcare and life sciences industry, improve patient outcomes, and create efficiencies.

With a simple API call to Amazon Comprehend Medical, you can quickly and accurately extract information such as medical conditions, medications, dosages, tests, treatments and procedures, and protected health information while retaining the context of the information. Amazon Comprehend Medical can identify the relationships among the extracted information to help you build applications for use cases like population health analytics, clinical trial management, pharmacovigilance, and summarization. You can also use Amazon Comprehend Medical to link the extracted information to medical ontologies such as ICD-10-CM, RxNorm, or SNOMED CT to help you build applications for use cases like revenue cycle management (medical coding), claim validation and processing, and electronic health record creation.

Amazon Comprehend Medical is fully managed, so there are no servers to provision, and no machine learning models to build, train, or deploy. You pay only for what you use, and there are no minimum fees and no upfront commitments.

The Medical Named Entity and Relationship Extraction (NERe) API returns medical information such as medications, medical conditions, tests, treatments, and procedures (TTP), anatomy, and protected health information (PHI). It also identifies relationships between the extracted sub-types associated with medications and TTP. Contextual information is returned as entity “traits” (for example, negation, or whether a diagnosis is a sign or symptom). The table below shows the extracted information with relevant sub-types and entity traits.

To extract only PHI, you can use the Protected Health Information Data Identification (PHId) API.

Example: The sample text below is taken from an admission note. The API identifies the medical information in it and returns a confidence score for each extracted entity.

Sample Text: Mr. Smith is a 63-year-old gentleman with coronary artery disease and hypertension. CURRENT MEDICATIONS: taking a dose of LIPITOR 20 mg once daily.
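
As a rough sketch of how the sample text above might be sent to the service from Python, the snippet below uses the boto3 `comprehendmedical` client and its `detect_entities_v2` call. The helper names (`detect_entities`, `summarize_entities`), the `us-east-1` region, the 0.5 score threshold, and the hand-written `ILLUSTRATIVE_RESPONSE` are assumptions made for illustration. The scores in it are invented, and real service output will differ.

```python
from typing import Any, Dict, List

SAMPLE_TEXT = (
    "Mr. Smith is a 63-year-old gentleman with coronary artery disease and "
    "hypertension. CURRENT MEDICATIONS: taking a dose of LIPITOR 20 mg once daily."
)


def detect_entities(text: str, region_name: str = "us-east-1") -> Dict[str, Any]:
    """Call DetectEntitiesV2. Requires AWS credentials and network access."""
    import boto3  # imported lazily so the offline helper below has no dependency

    client = boto3.client("comprehendmedical", region_name=region_name)
    return client.detect_entities_v2(Text=text)


def summarize_entities(
    response: Dict[str, Any], min_score: float = 0.5
) -> List[Dict[str, Any]]:
    """Flatten a DetectEntitiesV2-style response into one row per entity."""
    rows = []
    for entity in response.get("Entities", []):
        # Skip entities below the (assumed) confidence threshold.
        if entity.get("Score", 0.0) < min_score:
            continue
        rows.append(
            {
                "text": entity["Text"],
                "category": entity["Category"],
                "type": entity["Type"],
                "score": entity["Score"],
                # Traits carry context such as NEGATION or DIAGNOSIS.
                "traits": [t["Name"] for t in entity.get("Traits", [])],
                # Attributes are related sub-types, e.g. a dosage for a medication.
                "attributes": [
                    (a["Type"], a["Text"]) for a in entity.get("Attributes", [])
                ],
            }
        )
    return rows


# Hand-written illustration of the response shape, NOT real service output.
ILLUSTRATIVE_RESPONSE = {
    "Entities": [
        {
            "Text": "hypertension",
            "Category": "MEDICAL_CONDITION",
            "Type": "DX_NAME",
            "Score": 0.98,
            "Traits": [{"Name": "DIAGNOSIS", "Score": 0.95}],
        },
        {
            "Text": "LIPITOR",
            "Category": "MEDICATION",
            "Type": "BRAND_NAME",
            "Score": 0.99,
            "Traits": [],
            "Attributes": [
                {"Type": "DOSAGE", "Text": "20 mg", "Score": 0.97},
                {"Type": "FREQUENCY", "Text": "once daily", "Score": 0.96},
            ],
        },
    ]
}

if __name__ == "__main__":
    # Swap in detect_entities(SAMPLE_TEXT) to query the live service.
    for row in summarize_entities(ILLUSTRATIVE_RESPONSE):
        print(row)
```

Keeping the network call separate from the summarizing helper makes it possible to test the parsing logic against saved responses without AWS access.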

The Medical Ontology Linking APIs identify medical information and link it to codes and concepts in standard medical ontologies. The InferICD10CM API links medical conditions to ICD-10-CM codes (e.g. “headache” is linked to the “R51” code), while the InferRxNorm API links medications to RxNorm codes (e.g. “Acetaminophen / Codeine” is linked to the “C2341132” CUI). Use InferSNOMEDCT to detect medical entities and link them to concepts from the 2021-03 version of the Systematized Nomenclature of Medicine, Clinical Terms (SNOMED CT). SNOMED CT provides a comprehensive vocabulary of medical concepts, including medical conditions and anatomy, as well as medical tests, treatments, and procedures. The Medical Ontology Linking APIs also detect contextual information as entity traits (e.g. negation). You can use ontology linking batch analysis to analyze either a collection of documents or a single document.
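
The ontology linking calls described above could be used from Python roughly as follows. This is a minimal sketch assuming the boto3 `comprehendmedical` client, whose `infer_icd10_cm`, `infer_rx_norm`, and `infer_snomedct` methods return ranked concept candidates per entity. The helper names (`link_to_ontology`, `top_concepts`), the region, and the hand-written `ILLUSTRATIVE_ICD_RESPONSE` with its invented scores are assumptions for illustration only.

```python
from typing import Any, Dict, List, Tuple

# Maps each boto3 method name to the response key holding its ranked concepts.
CONCEPT_KEYS = {
    "infer_icd10_cm": "ICD10CMConcepts",
    "infer_rx_norm": "RxNormConcepts",
    "infer_snomedct": "SNOMEDCTConcepts",
}


def link_to_ontology(
    text: str, operation: str, region_name: str = "us-east-1"
) -> Dict[str, Any]:
    """Call one ontology linking API. Requires AWS credentials and network access."""
    if operation not in CONCEPT_KEYS:
        raise ValueError(f"unsupported operation: {operation}")
    import boto3  # imported lazily so top_concepts() works offline

    client = boto3.client("comprehendmedical", region_name=region_name)
    return getattr(client, operation)(Text=text)


def top_concepts(
    response: Dict[str, Any], concept_key: str
) -> List[Tuple[str, str, str]]:
    """Return (entity text, best code, description) for each linked entity."""
    results = []
    for entity in response.get("Entities", []):
        concepts = entity.get(concept_key, [])
        if not concepts:
            continue  # entity was detected but not linked to any concept
        # Pick the candidate with the highest confidence score.
        best = max(concepts, key=lambda c: c["Score"])
        results.append((entity["Text"], best["Code"], best["Description"]))
    return results


# Hand-written illustration of the response shape, NOT real service output.
ILLUSTRATIVE_ICD_RESPONSE = {
    "Entities": [
        {
            "Text": "headache",
            "Category": "MEDICAL_CONDITION",
            "Score": 0.97,
            "ICD10CMConcepts": [
                {
                    "Code": "G44.209",
                    "Description": "Tension-type headache, unspecified, not intractable",
                    "Score": 0.31,
                },
                {"Code": "R51", "Description": "Headache", "Score": 0.72},
            ],
        }
    ]
}

if __name__ == "__main__":
    # Swap in link_to_ontology("...", "infer_icd10_cm") to query the live service.
    print(top_concepts(ILLUSTRATIVE_ICD_RESPONSE, CONCEPT_KEYS["infer_icd10_cm"]))
```

Taking only the highest-scoring candidate is a simplification. A medical coding workflow would more plausibly surface several candidates for human review.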

Healthcare and life science customers often have data containing protected health information (PHI) that needs to be moved or tagged before further processing, such as de-identifying PHI to comply with HIPAA before conducting research. Amazon Comprehend Medical can detect PHI, using the Safe Harbor guidelines to search for it.
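
One way the tagging step described above might look in Python is sketched below, assuming the boto3 `comprehendmedical` client and its `detect_phi` call. The helper names (`detect_phi`, `redact_phi`), the region, the 0.5 score threshold, and the hand-written `ILLUSTRATIVE_PHI_RESPONSE` are assumptions for illustration. The sketch also assumes that `text[BeginOffset:EndOffset]` yields the entity text. Replacing spans this way is a simple illustration and does not by itself establish HIPAA compliance.

```python
from typing import Any, Dict

PHI_SAMPLE = (
    "Mr. Smith is a 63-year-old gentleman with coronary artery disease "
    "and hypertension."
)


def detect_phi(text: str, region_name: str = "us-east-1") -> Dict[str, Any]:
    """Call DetectPHI. Requires AWS credentials and network access."""
    import boto3  # imported lazily so redact_phi() works offline

    client = boto3.client("comprehendmedical", region_name=region_name)
    return client.detect_phi(Text=text)


def redact_phi(text: str, response: Dict[str, Any], min_score: float = 0.5) -> str:
    """Replace each detected PHI span with a [TYPE] tag."""
    entities = [
        e for e in response.get("Entities", []) if e.get("Score", 0.0) >= min_score
    ]
    # Work right to left so earlier offsets stay valid after each replacement.
    for entity in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        begin, end = entity["BeginOffset"], entity["EndOffset"]
        text = text[:begin] + f"[{entity['Type']}]" + text[end:]
    return text


# Hand-written illustration of the response shape, NOT real service output.
ILLUSTRATIVE_PHI_RESPONSE = {
    "Entities": [
        {
            "Text": "Smith",
            "Category": "PROTECTED_HEALTH_INFORMATION",
            "Type": "NAME",
            "BeginOffset": 4,
            "EndOffset": 9,
            "Score": 0.99,
        },
        {
            "Text": "63",
            "Category": "PROTECTED_HEALTH_INFORMATION",
            "Type": "AGE",
            "BeginOffset": 15,
            "EndOffset": 17,
            "Score": 0.98,
        },
    ]
}

if __name__ == "__main__":
    # Swap in detect_phi(PHI_SAMPLE) to query the live service.
    print(redact_phi(PHI_SAMPLE, ILLUSTRATIVE_PHI_RESPONSE))
```

The score threshold trades missed PHI against over-redaction, so a de-identification pipeline would likely tune it and review low-confidence spans manually.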