AWS Solutions Library

Enhanced Document Understanding on AWS

Deploy an event-driven AWS Solution that automates document ingestion, analysis, detection, and redaction

Overview

Organizations across industries are increasingly required to process large volumes of semi-structured and unstructured documents with greater accuracy and speed. Enhanced Document Understanding on AWS delivers an easy-to-use web application that ingests and analyzes documents, extracts content, identifies and redacts sensitive customer information, and creates search indexes from the analyzed data.

You upload documents through the web interface for processing. You can optionally enable Amazon Kendra support for machine learning-based enterprise search.

Benefits

Extend the modular architecture

Based on the features required for your use case, you can configure the root template to deploy some or all of the nested templates.

Customize the workflow orchestration

Choose from out-of-the-box workflow configuration definitions.

Use artificial intelligence (AI) and machine learning (ML) automation

Get insights from AWS managed AI services, even if you have little or no knowledge or training in deploying ML models.

Text extraction

Use Amazon Textract to extract text and structural information from files, then use Amazon Comprehend and Amazon Comprehend Medical for deeper analysis.

Technical details

You can automatically deploy this architecture using the implementation guide and the accompanying AWS CloudFormation template.

Enhanced Document Understanding on AWS | Architecture Diagram

Step 1
The user navigates in a browser to an Amazon CloudFront URL.

Step 2
The user interface (UI) prompts the user for authentication, which the AWS Solution validates using Amazon Cognito.

Step 3
The UI interacts with the REST endpoint deployed on Amazon API Gateway.

Step 4
The user creates a case that the AWS Solution stores in the Case management store Amazon DynamoDB table.

Step 5
The user requests a signed Amazon Simple Storage Service (Amazon S3) URL to upload documents to an S3 bucket.

Step 6
Amazon S3 generates an s3:PutObject event on the default Amazon EventBridge event bus.
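
As a rough illustration of what a consumer of this event might do, the sketch below pulls the bucket name and object key out of the event. It assumes the event follows the general shape Amazon S3 uses for EventBridge notifications (source `aws.s3`, detail-type `Object Created`, bucket and object under `detail`). The `UploadedObject` type and `parse_upload_event` function are hypothetical names introduced for this example and are not part of the solution.

```python
from typing import NamedTuple, Optional


class UploadedObject(NamedTuple):
    """Hypothetical container for the location of an uploaded document."""
    bucket: str
    key: str


def parse_upload_event(event: dict) -> Optional[UploadedObject]:
    """Return the bucket and key for an S3 object-created event, else None.

    Assumption: the event carries source "aws.s3", detail-type
    "Object Created", and the bucket name and object key under "detail".
    """
    if event.get("source") != "aws.s3":
        return None
    if event.get("detail-type") != "Object Created":
        return None
    detail = event.get("detail", {})
    bucket = detail.get("bucket", {}).get("name")
    key = detail.get("object", {}).get("key")
    if not bucket or not key:
        return None
    return UploadedObject(bucket, key)
```

A function like this returns `None` for unrelated events, so the caller can ignore anything that is not a document upload.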

Step 7
The s3:PutObject event invokes the workflow orchestrator AWS Lambda function. This function uses the configuration stored in the Configuration for orchestrating workflows DynamoDB table to determine the workflows to be called.

Step 8
The workflow orchestrator Lambda function creates an event and sends it to the custom event bus.

Step 9
The custom event bus invokes one of the three AWS Step Functions state machine workflows based on the event definition.
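
The routing in this step can be pictured as matching each event against a set of rules. The sketch below is a simplified, local imitation of event-pattern matching, not the EventBridge implementation: it supports only exact-value lists and nested fields. The rule names, the `workflow-orchestrator` source, and the `trigger_*` detail types are hypothetical placeholders; the source does not name the three workflows or their event definitions.

```python
from typing import Dict, List


def matches(pattern: dict, event: dict) -> bool:
    """Simplified pattern matching: every field in the pattern must match.

    A list in the pattern means "the event value must equal one of these".
    A nested dict means "apply the same check to that field of the event".
    """
    for field, expected in pattern.items():
        if field not in event:
            return False
        actual = event[field]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Hypothetical rules: one per state machine workflow. All names are placeholders.
RULES: Dict[str, dict] = {
    "text-extraction": {
        "source": ["workflow-orchestrator"],
        "detail-type": ["trigger_text_extraction"],
    },
    "entity-detection": {
        "source": ["workflow-orchestrator"],
        "detail-type": ["trigger_entity_detection"],
    },
    "redaction": {
        "source": ["workflow-orchestrator"],
        "detail-type": ["trigger_redaction"],
    },
}


def route(event: dict, rules: Dict[str, dict] = RULES) -> List[str]:
    """Return the names of every rule whose pattern matches the event."""
    return [name for name, pattern in rules.items() if matches(pattern, event)]
```

With rules like these, an event whose detail type is `trigger_redaction` would reach only the redaction workflow, and an event from any other source would reach none.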

Step 10
The workflow completes and publishes an event to the custom EventBridge event bus.

Step 11
The custom EventBridge event bus invokes the workflow orchestrator Lambda function. This function uses the configuration stored in the Configuration for orchestrating workflows DynamoDB table to determine whether the sequence is complete or requires another workflow.

Step 12 (Optional)
The workflow orchestrator Lambda function writes metadata from the processed information to an Amazon Kendra index. This index provides the ability to perform ML-powered search.
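
Steps 7 through 11 describe a loop: the orchestrator reads a configured sequence of workflows, starts the first one, and after each completion event decides whether another workflow is needed. A minimal sketch of that decision follows, assuming the configuration is an ordered list of workflow names. The in-memory `WORKFLOW_CONFIG` dictionary stands in for the DynamoDB table, and the configuration and workflow names are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for the "Configuration for orchestrating workflows"
# table: each configuration name maps to an ordered list of workflows.
WORKFLOW_CONFIG: Dict[str, List[str]] = {
    "default": ["text-extraction", "entity-detection", "redaction"],
    "extract-only": ["text-extraction"],
}


def next_workflow(
    config_name: str,
    completed: Optional[str],
    config: Dict[str, List[str]] = WORKFLOW_CONFIG,
) -> Optional[str]:
    """Return the next workflow to run, or None when the sequence is done.

    Pass completed=None for a newly uploaded document (Step 7). Pass the
    name of the workflow that just finished for a completion event (Step 11).
    Raises KeyError for an unknown configuration and ValueError when the
    completed workflow is not part of the configured sequence.
    """
    sequence = config[config_name]
    if completed is None:
        return sequence[0] if sequence else None
    position = sequence.index(completed)
    if position + 1 < len(sequence):
        return sequence[position + 1]
    return None
```

A `None` result corresponds to a finished sequence, which is the point where the optional Step 12 write to the Amazon Kendra index would take place.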
