Guidance for Low Code Intelligent Document Processing on AWS
Overview
How it works
The architecture diagram for this Guidance shows the key components and how they interact, walking through the structure and function of the architecture step by step.
Well-Architected Pillars
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
This Guidance comes with a Git repository that contains all the artifacts required to deploy the architecture.
Security
Amazon S3 encrypts your data by default using Amazon S3-managed encryption keys. You may also use AWS Key Management Service (AWS KMS), a managed service that allows you to use your own cryptographic keys to protect your data. Data shared between services in your account never leaves your account. You can use the Amazon Comprehend console or APIs to detect personally identifiable information (PII) in English text documents. With PII detection, you have the choice of locating the PII entities or redacting the PII entities in the text.
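To make the locate-versus-redact choice concrete, here is a minimal Python sketch of redaction applied to located entities. It assumes a list of entities shaped like the `Entities` field returned by the Amazon Comprehend `DetectPiiEntities` API (`Type`, `Score`, `BeginOffset`, `EndOffset`). The function name `redact_pii`, the `min_score` threshold, and the sample text are illustrative assumptions and are not part of this Guidance's sample code.

```python
def redact_pii(text, entities, min_score=0.5):
    """Replace each located PII span with a [TYPE] placeholder.

    `entities` is assumed to follow the shape of the `Entities` list in a
    DetectPiiEntities response. `min_score` is a hypothetical threshold.
    """
    redacted = text
    # Work from the end of the text so earlier offsets remain valid.
    for entity in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        if entity.get("Score", 1.0) < min_score:
            continue  # skip low-confidence detections
        placeholder = f"[{entity['Type']}]"
        redacted = (
            redacted[: entity["BeginOffset"]]
            + placeholder
            + redacted[entity["EndOffset"]:]
        )
    return redacted


# In a real deployment the entities could come from a call such as:
#   boto3.client("comprehend").detect_pii_entities(Text=text, LanguageCode="en")
# Here they are hard-coded so the sketch runs without AWS credentials.
sample_text = "Contact Jane Doe at jane@example.com."
sample_entities = [
    {"Type": "NAME", "Score": 0.99, "BeginOffset": 8, "EndOffset": 16},
    {"Type": "EMAIL", "Score": 0.99, "BeginOffset": 20, "EndOffset": 36},
]
print(redact_pii(sample_text, sample_entities))
# -> Contact [NAME] at [EMAIL].
```

Locating entities only requires reading the offsets. Redaction, as sketched here, rewrites the text using those same offsets.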
Reliability
The Guidance recommends, and includes, a separate AWS CDK component for each Lambda function so that each can be used as a microservice. The serverless, event-driven architecture, together with retry and exponential backoff features, makes this architecture scalable. The Lambda functions included in the sample code have logging enabled, with the log level set to "DEBUG" by default. You can view these logs in Amazon CloudWatch, where you can also monitor and set alarms for specific log events.
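The retry-with-exponential-backoff pattern mentioned above can be sketched in a few lines of Python. This is a generic illustration, not the mechanism this Guidance actually uses (the source does not say where retries are configured). The function name `call_with_backoff` and all parameter defaults are assumptions chosen for the example.

```python
import random
import time


def call_with_backoff(operation, max_attempts=5, base_delay=0.5,
                      max_delay=8.0, sleep=time.sleep):
    """Call `operation`, retrying on any exception with exponential backoff.

    The delay ceiling doubles after each failed attempt (capped at
    `max_delay`), and the actual wait is drawn uniformly from
    [0, ceiling] ("full jitter"). All defaults are illustrative.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, ceiling))
```

The `sleep` parameter is injectable so that the wait can be replaced in tests. In production code you would normally catch only the exception types that are known to be transient rather than every `Exception`.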
Performance Efficiency
The Guidance deploys a serverless event-driven architecture that scales according to traffic patterns.
Cost Optimization
This Guidance and the associated workshop use AWS Cloud9 to create instances to install Docker and deploy the AWS CDK stacks. We recommend using the cost-saving setting that prompts the environment to auto-hibernate after thirty minutes of inactivity. The Step Functions workflow is initiated only when a document is uploaded to a particular Amazon S3 location. The workshop contains an estimate of the total cost of execution and has a clean-up section to destroy the deployed stack.
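As a rough illustration of triggering work only for uploads to a particular location, the Python sketch below filters an Amazon S3 event notification down to the objects under one key prefix. The prefix `uploads/` and the function name `documents_to_process` are hypothetical. The Guidance's own trigger wiring lives in its AWS CDK code and may differ.

```python
from urllib.parse import unquote_plus

WATCHED_PREFIX = "uploads/"  # hypothetical prefix, not taken from the Guidance


def documents_to_process(event, prefix=WATCHED_PREFIX):
    """Return bucket/key pairs from an S3 event that fall under `prefix`.

    Object keys in S3 event notifications are URL-encoded, so they are
    decoded before the prefix check.
    """
    documents = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.startswith(prefix):
            documents.append({"bucket": bucket, "key": key})
    return documents
```

A Lambda handler could pass each returned item as the input to a Step Functions execution, so that no workflow runs (and no cost accrues) for objects outside the watched location.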
Sustainability
This Guidance allows you to maximize your utilization and right-size your implementation by using Step Functions, which runs only while your documents are being processed. You therefore use resources only when needed, which reduces the energy consumption of the underlying infrastructure. By using managed services like Amazon Textract and Amazon Comprehend, you can operate at scale and share the underlying resources, which further maximizes resource usage.
Implementation Resources
Disclaimer