Skip to main content

Guidance for SAP Intelligent Document Processing and Insights using Generative AI on AWS

Overview

This Guidance demonstrates how to automate unstructured document processing and subsequent auditing and analysis using AWS AI/ML and generative AI Services. It includes Amazon Textract to extract, classify, and process documents, AWS SDK for SAP ABAP for SAP Clean Core extensions, and Amazon Bedrock to build smart audit chatbot assistants to build audit summaries and to improve business user productivity. This Guidance is designed to be extensible, allowing you to seamlessly incorporate additional components or integrate with other AWS services.

How it works

SAP in-stack extensions

This architecture diagram shows how to build SAP in-stack extensions and intelligently process your documents to streamline your SAP business process using AI/ML and generative AI services.

Architecture diagram showing the integration of SAP source systems with AWS generative AI services for intelligent document processing. It illustrates document sources, SAP S/4HANA or ECC backend, AWS services including S3, Textract, Translate, SNS, and Amazon Bedrock, with a workflow for extracting, translating, and processing documents using generative AI insights.

SAP side-by-side extensions

This architecture diagram shows how to build external SAP extensions to intelligently process your documents.

Architecture diagram depicting SAP Intelligent Document Processing using Generative AI services on AWS. The diagram illustrates the integration between SAP Business Technology Platform, SAP Source Systems, and AWS services such as Amazon Bedrock, Textract, Translate, SNS, and S3 for processing documents, generating insights, and orchestrating workflows with generative AI capabilities.

Deploy with confidence

Ready to deploy? Review the sample code on GitHub for detailed deployment instructions to deploy as-is or customize to fit your needs. 

Go to sample code

Well-Architected Pillars

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

Amazon CloudWatch and AWS CloudTrail provide monitoring and logging capabilities so you can detect and respond to performance issues, security threats, and configuration changes. For example, CloudWatch monitors application performance and resource utilization, while CloudTrail tracks API calls and resource changes, providing a complete audit trail.

Read the Operational Excellence whitepaper 

AWS provides a range of security services to help you secure your AWS resources and data. AWS Identity and Access Management (IAM) offers least privilege access and key management capabilities so only authorized users and applications can interact with your AWS resources. Amazon S3 bucket policies and access control lists (ACLs) further control access to Amazon S3 buckets and objects, enforce data encryption, and block public access. Additionally, AWS Config provides a resource inventory, configuration history, and compliance reports to help track and manage security best practices.

When integrating SAP data with Amazon Bedrock, it is important to implement appropriate security measures. This includes using IAM roles and permissions to restrict access to AWS resources based on the principle of least privilege. Additionally, Amazon S3 bucket policies and ACLs should be used to control access to any sensitive SAP data stored in Amazon S3. Finally, ongoing monitoring and management of resource configurations using AWS Config can help ensure the security posture is maintained over time.

Read the Security whitepaper 

Amazon TextractAmazon Bedrock, and Amazon S3 have built-in redundancy and fault tolerance across multiple Availability Zones (AZs), where high availability is achieved without the need for manual configuration or additional infrastructure. 

Read the Reliability whitepaper 

Amazon Textract offers document extraction capabilities using pre-built machine learning models to extract information from invoices and other documents. Amazon Bedrock further allows the use of large-scale machine learning models to enhance the performance and accuracy of applications. Amazon S3, a highly scalable and durable object storage service, facilitates efficient storage and retrieval of data, enabling applications to quickly and reliably access the information they require. Additionally, Amazon Translate, a neural language translation service, offers the capability to reliably translate documents with a high degree of accuracy. Notably, these AWS services are all serverless in nature, automatically scaling to accommodate fluctuations in workload demand.

Read the Performance Efficiency whitepaper 

The AWS services featured in this Guidance, including Amazon Textract, Amazon Translate, Amazon Bedrock, and Amazon S3, are designed to deliver scalable and efficient approaches for your workloads while minimizing operational costs. Specifically, Amazon Textract enables cost-effective intelligent document processing by allowing extraction of information from documents using pre-built LLM models, while Amazon S3 offers highly scalable and durable object storage with pay-as-you-go pricing. Amazon Translate facilitates multilingual applications at scale, and Amazon Bedrock enables the construction of cost-effective RAG workflows. You can further optimize costs by using the serverless nature of these services, using the lifecycle policies in Amazon S3, and monitoring usage with AWS Cost Explorer and CloudWatch.

Read the Cost Optimization whitepaper 

The AWS services featured in this Guidance all contribute to sustainability by optimizing resource usage and reducing environmental impact. For instance, Amazon Bedrock, a fully managed service that offers a range of foundation models, helps you to develop advanced AI models with reduced computational requirements, leading to improved energy efficiency. Amazon S3 promotes efficient data storage and management, reducing the need for physical data centers and associated energy usage. Additionally, Amazon Athena facilitates data analysis without the need for resource-intensive data warehousing for efficient use of your cloud-based resources. Furthermore, the cloud-native, serverless architecture of these services eliminates the need for resource provisioning and reduces energy consumption. Lastly, the use of Amazon S3 for durable and highly available data storage helps minimize e-waste by reducing the need for redundant storage.

Read the Sustainability whitepaper 

Disclaimer

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.