Automatically extract text using optical character recognition (OCR) to reinvent your business processes
This Guidance helps you automate document processing with AWS Artificial Intelligence and Machine Learning (AI/ML) services. With Intelligent Document Processing (IDP), you can speed up business processes, improve decision quality, and reduce overall costs. IDP automation allows you to focus on the decisions that need your expertise.
Architecture Diagram

Step 1
Documents are uploaded to Amazon Simple Storage Service (Amazon S3), which invokes an AWS Lambda function, starting an Amazon Textract asynchronous detection job.
Step 2
Amazon Textract writes its output to an Amazon S3 bucket location, then sends a completion notification to Amazon Simple Notification Service (Amazon SNS).
Step 3
The Amazon SNS topic sends the completion message to the Amazon Simple Queue Service (Amazon SQS) queue.
Step 4
A Lambda function is invoked by the SQS queue to process and read the Amazon Textract output.
Step 5
The Lambda function calls an Amazon Comprehend custom classifier asynchronous operation to classify the documents. Amazon Comprehend writes its output to an S3 bucket.
Step 6
A Lambda function is invoked by the S3 bucket. It sorts the input documents on the basis of classes determined by Amazon Comprehend and places the documents into another S3 bucket location.
Step 7
The S3 bucket with the classified document invokes a Lambda function that can (a) call Amazon Textract synchronous or asynchronous APIs for data extraction, (b) call the Amazon Comprehend pre-defined or custom named entity recognizer to detect personally identifiable information (PII) or protected health information (PHI), and (c) enrich the document further with medical insights from Amazon Comprehend Medical.
Step 8
The data from the AI service calls and any enriched documents are saved in an S3 bucket location.
Step 9
A Lambda function is invoked from the S3 bucket. The function runs review and validation on the data using predefined rules. It also checks accuracy scores and sends the information for human review if threshold scores are not met.
Step 10
A human reviewer uses Amazon Augmented AI (Amazon A2I) to complete the review and update the appropriate information in the S3 location, which initiates another validation by the Lambda function.
Step 11
A Lambda function stores the extracted and verified data in an Amazon DynamoDB table.
Step 12
Lambda sends a notification stating either that all rules were verified or that some information needs further human review.
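As a rough illustration of Steps 1 and 9, the sketch below starts an asynchronous Amazon Textract job for each uploaded object and later flags low-confidence lines for human review. It is not the Guidance's sample code. The function names, the `textract-output` prefix, and the 90 percent threshold are assumptions made for this example; `textract` is expected to be a client such as `boto3.client("textract")`, passed in so the logic can be exercised without AWS credentials.

```python
import urllib.parse


def start_text_detection(textract, s3_event, sns_topic_arn, role_arn, output_bucket):
    """Step 1 (sketch): start one asynchronous Textract job per uploaded object."""
    job_ids = []
    for record in s3_event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications are URL-encoded.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        response = textract.start_document_text_detection(
            DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}},
            # Textract publishes the completion message to this topic (Step 2).
            NotificationChannel={"SNSTopicArn": sns_topic_arn, "RoleArn": role_arn},
            # Hypothetical prefix; any prefix in the output bucket would do.
            OutputConfig={"S3Bucket": output_bucket, "S3Prefix": "textract-output"},
        )
        job_ids.append(response["JobId"])
    return job_ids


def lines_needing_review(blocks, threshold=90.0):
    """Step 9 (sketch): return the text of LINE blocks below the confidence threshold.

    Textract reports Confidence on a 0-100 scale; the threshold is an assumed value.
    """
    return [
        block.get("Text", "")
        for block in blocks
        if block.get("BlockType") == "LINE"
        and block.get("Confidence", 0.0) < threshold
    ]
```

In a Lambda handler, the first function would typically be called with the incoming event, and a non-empty result from the second would route the document to human review.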
Sample Code

Start building with this sample code. Learn how to automate business processes that currently rely on manual input and intervention across various file types and formats.
Well-Architected Pillars

The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
The Intelligent Document Processing architecture can be deployed entirely as infrastructure as code: the serverless infrastructure can be deployed with the AWS Cloud Development Kit (AWS CDK) and orchestrated with a low-code visual workflow service such as AWS Step Functions. You can bring this automation into your own development pipeline to enable fast iteration and consistent deployments. Observability is achieved with Amazon CloudWatch Logs from AWS AI services such as Amazon Textract and Amazon Comprehend.
Security
The AI services protect data at rest and in transit. Services like Amazon Textract, Amazon Comprehend, and Amazon Comprehend Medical support encryption at rest with Amazon S3 buckets and AWS Key Management Service (AWS KMS). The Amazon Textract synchronous API and Amazon Comprehend Medical support in-memory data processing. In-transit encryption is supported for all of the AI services required for IDP. You can also use VPC endpoints to meet your security requirements. IDP solutions can be orchestrated with a serverless backend that uses AWS Identity and Access Management (IAM)-based authentication for secure validation. Intelligent Document Processing can categorize documents accurately by using the Amazon Comprehend PII and Amazon Comprehend Medical PHI identification and redaction options, which helps you handle sensitive information. IDP can also separate access control by user role. For example, you can give the owner full access to all documents, but allow an operator to access only de-identified documents.
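To make the de-identification idea concrete, here is a minimal sketch that replaces each PII span reported by Amazon Comprehend with its entity type. The function names and the 0.5 score cutoff are assumptions for illustration. `comprehend` is expected to be a client such as `boto3.client("comprehend")`, and the sketch assumes English text and non-overlapping entity spans.

```python
def redact(text, entities, min_score=0.5):
    """Replace each entity span with a [TYPE] placeholder.

    `entities` follows the shape of the DetectPiiEntities response:
    dicts with Type, Score, BeginOffset, and EndOffset.
    """
    redacted = text
    # Work from the end of the text so earlier offsets stay valid.
    for entity in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        if entity["Score"] >= min_score:  # assumed cutoff, tune per use case
            redacted = (
                redacted[: entity["BeginOffset"]]
                + "[" + entity["Type"] + "]"
                + redacted[entity["EndOffset"]:]
            )
    return redacted


def redact_pii(comprehend, text, language_code="en"):
    """Detect PII with Amazon Comprehend and return a de-identified copy."""
    response = comprehend.detect_pii_entities(Text=text, LanguageCode=language_code)
    return redact(text, response["Entities"])
```

An operator role could then be granted read access only to the location where the de-identified copies are stored, while the owner role keeps access to the originals.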
Reliability
The Intelligent Document Processing architecture uses managed regional AI services. AWS takes care of reliability and availability in your selected AWS Region. Managed AI services are inherently resilient to failure and highly available. If you decide to use Amazon S3 as your scalable data store, consider Amazon S3 Cross-Region Replication to further increase reliability and take advantage of disaster recovery options.
Performance Efficiency
The serverless and event-driven nature of the architecture makes it efficient; resources are not wasted when documents aren’t being processed. Solutions can be scaled within a Region to accommodate large-scale document processing. The solution can be scaled out by increasing the call rates for the AI services and AWS Lambda. You can design a serverless, decoupled architecture with Amazon SNS and Amazon SQS to process multiple documents concurrently. The human review workflow can also be scaled if needed. You can configure document processing to operate in real time, with response times in seconds, or in asynchronous mode, depending on your requirements.
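The decoupling through Amazon SNS and Amazon SQS can be sketched as a Lambda-side helper that unwraps a batch of queue records. This is an illustrative sketch, not the Guidance's code: the function name is hypothetical, and it assumes raw message delivery is disabled on the SNS subscription, so each SQS body is an SNS envelope whose `Message` field holds the Amazon Textract completion message as a JSON string.

```python
import json


def completed_job_ids(sqs_event):
    """Return the JobId of every Textract job in the batch that succeeded."""
    job_ids = []
    for record in sqs_event.get("Records", []):
        envelope = json.loads(record["body"])      # SNS envelope delivered via SQS
        message = json.loads(envelope["Message"])  # Textract completion message
        if message.get("Status") == "SUCCEEDED":
            job_ids.append(message["JobId"])
    return job_ids
```

Because each queue message carries one job, several Lambda invocations can drain the queue in parallel, and the queue absorbs bursts when many documents finish at once.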
Cost Optimization
Intelligent Document Processing minimizes cost by using a serverless, event-driven architecture, so you pay only for the time and resources used to process documents. Amazon Comprehend lets you train your own model in addition to using pre-defined entity extraction. For urgent document processing in real time, you can use Amazon Comprehend resource endpoints for your custom model. If your use case can tolerate asynchronous or batch processing, however, we recommend asynchronous jobs for Amazon Comprehend custom models to reduce cost.
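The two options for a custom classifier can be contrasted in a short sketch. The function names are hypothetical, `comprehend` is expected to be a client such as `boto3.client("comprehend")`, and the real-time helper assumes a multi-class classifier, whose response lists candidates under `Classes`.

```python
def classify_realtime(comprehend, text, endpoint_arn):
    """Classify one document through a provisioned real-time endpoint.

    Returns the (name, score) of the highest-scoring class.
    """
    response = comprehend.classify_document(Text=text, EndpointArn=endpoint_arn)
    best = max(response["Classes"], key=lambda c: c["Score"])
    return best["Name"], best["Score"]


def start_batch_classification(comprehend, classifier_arn, input_s3_uri,
                               output_s3_uri, role_arn):
    """Start an asynchronous classification job over documents stored in S3.

    No endpoint has to stay provisioned; results are written to output_s3_uri.
    """
    response = comprehend.start_document_classification_job(
        DocumentClassifierArn=classifier_arn,
        InputDataConfig={"S3Uri": input_s3_uri, "InputFormat": "ONE_DOC_PER_FILE"},
        OutputDataConfig={"S3Uri": output_s3_uri},
        DataAccessRoleArn=role_arn,
    )
    return response["JobId"]
```

The real-time path answers within one call but requires an endpoint that is billed while it is provisioned; the batch path trades latency for not keeping an endpoint running.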
Sustainability
By extensively using managed services and dynamic scaling, we minimize the environmental impact of the backend services.
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.