AWS Partner Network (APN) Blog
Intelligent Purchase Order Processing for SAP Powered by TCS Doqulogic on AWS
By Bharati Banpel, Product Owner – TCS
By Tejas Mishra, Solutions Architect – TCS
By David Dobrovich, Global GTM Lead, SAP – TCS
By Gurudath Pai, Sr. Partner Solutions Architect SAP – AWS
SAP customers spend hours on manual, paper-bound document processing workflows. Manually entering documents such as purchase orders into enterprise resource planning (ERP) systems like SAP is tedious and error-prone.
Many industries are adopting a “digital exchange” to receive orders in bulk as documents attached to emails or uploaded to customer portals. SAP customers are looking for automated alternatives for customer order processing that are cost-efficient and enhance the business processes hosted in their ERP solutions.
Doqulogic is built by Tata Consultancy Services (TCS) to address the challenges faced by companies that spend large amounts of time and resources manually processing documents for business processes like sales and customer order processing.
TCS Doqulogic is an intelligent document recognition, classification, and extraction solution complemented by an augmented intelligence capability that has been powered exclusively by Amazon Web Services (AWS). It provides customers with an innovative, flexible, and cost-efficient alternative when compared to expensive licensed solutions available in the market.
In this post, we will take a deeper look at how Doqulogic eliminates processing overhead and reduces the time taken to process documents. We’ll discuss how the solution stitches together multiple AWS services to simplify processes like purchase order processing by integrating with email and with backend ERP solutions like SAP S/4HANA.
TCS is an AWS Premier Tier Services Partner and Managed Services Provider (MSP) with the AWS SAP Consulting Competency. An IT services and business solutions organization, TCS has been partnering with many of the world’s largest businesses in their transformation journeys for the last 50 years.
In businesses today, data is sourced from multiple channels at a large scale, leading to complex manual segregation and tedious data entry into systems, including ERP solutions like SAP S/4HANA.
Data maintenance becomes a cumbersome and redundant activity that consumes effort. It also leads to inconsistent document standards and makes physical documents harder to manage.
Taking a use case from SAP’s point of view, many customers receive a large number of documents like purchase orders as email attachments. In many cases, these orders are unstructured documents that are read manually, and the data is then keyed into SAP to generate sales orders, which requires considerable manual effort.
Customers are looking for solutions to minimize manual interventions and adopt an automated approach for the processing of documents. Business scenarios such as vendor validation, duplicate purchase order checks, and successful generation of sales orders are key to running an efficient business process in an SAP landscape.
TCS Doqulogic for automated document processing is split into three phases:
Ingestion of Documents from Multiple Platforms
This is done via web portal integration. Doqulogic can also ingest documents through an intermediate database or document repositories from the customer’s native environments or websites and begin processing.
Doqulogic can also be driven by email integrations, wherein it consumes data directly from emails and email attachments and begins processing them. The solution can ingest documents either in real time or in batches.
Processing and Storage of Documents
Once documents are in Doqulogic, a page separator splits each document into individual pages. Each page then undergoes image preprocessing (cropping, brightness, contrast, sharpening) to enhance its quality and ensure high extraction accuracy.
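As a rough illustration of the preprocessing step, the sketch below applies cropping and a linear brightness and contrast adjustment to a grayscale page represented as nested lists of 8-bit values. The function names, the formula, and the pivot value of 128 are assumptions made for this example; the post does not describe the in-house algorithms Doqulogic actually uses.

```python
def adjust_pixel(value, brightness=0, contrast=1.0):
    """Apply a linear brightness/contrast change to one 8-bit grayscale value."""
    # Scale the distance from mid-gray (128), then shift by the brightness offset.
    adjusted = contrast * (value - 128) + 128 + brightness
    # Clamp to the valid 8-bit range.
    return max(0, min(255, round(adjusted)))


def adjust_page(page, brightness=0, contrast=1.0):
    """Return a new page (list of rows) with every pixel adjusted."""
    return [[adjust_pixel(p, brightness, contrast) for p in row] for row in page]


def crop(page, top, left, bottom, right):
    """Keep rows top..bottom-1 and columns left..right-1."""
    return [row[left:right] for row in page[top:bottom]]
```

In practice an imaging library would do this work on real scans; the nested-list form is used here only to keep the arithmetic visible.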
The images are then passed through a classification layer. This is a readily available framework that has to be trained for each customer and document type. Once the pages are classified, a custom data extraction layer extracts data according to the labels assigned during classification. This enables a higher degree of extraction accuracy.
There is an intermediate “human-in-the-loop” layer, which ensures that classification and extraction results are manually reviewed until the solution matures with respect to the customer environment. This is a necessary step to minimize downstream errors.
Manual review is only required for documents whose classification or extraction results fall short of a high degree of confidence. The need for intervention decreases as the solution is trained further over time.
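The routing decision described above can be pictured as a simple confidence check. The following is a minimal sketch, assuming each result carries a confidence score between 0.0 and 1.0; the threshold value, the `ExtractedField` type, and the function names are hypothetical and are not taken from Doqulogic.

```python
from dataclasses import dataclass

# Hypothetical cut-off; the post does not state the value Doqulogic uses.
REVIEW_THRESHOLD = 0.90


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # model confidence in the range 0.0 to 1.0


def low_confidence_fields(fields, threshold=REVIEW_THRESHOLD):
    """Return the names of fields a reviewer should look at."""
    return [f.name for f in fields if f.confidence < threshold]


def needs_review(classification_confidence, fields, threshold=REVIEW_THRESHOLD):
    """Send a document to human review if classification or any field is uncertain."""
    if classification_confidence < threshold:
        return True
    return bool(low_confidence_fields(fields, threshold))
```

Documents for which `needs_review` returns `False` could proceed straight to the output phase, while the rest are queued for correction in the review UI.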
Retraining through production data is also enabled, where the end user can choose when to retrain the model and with which production data to retrain it.
Documents are then stored and indexed after processing for use in business processes like search and retrieval. Doqulogic can also generate templatized versions of documents and distribute them via multiple channels.
Output to Target Systems
Once the data is extracted and corrected, it is stored in databases that can be referenced for downstream processing into target systems like SAP S/4HANA. This is enabled through ready-to-use APIs that can be configured with a few tweaks to suit the customer’s ingestion protocol.
The figure below shows these phases, their sub-phases, and the functions that together form the TCS Doqulogic solution.
Figure 1 – TCS Doqulogic components and functions.
Solution Architecture with SAP
The architecture for Doqulogic is modular and can be adapted to customer needs. Below is an example of a customized architecture that stitches together multiple AWS services along with SAP.
Figure 2 – TCS Doqulogic architecture, working with SAP Fiori.
Here is how AWS services are used in the solution:
- Amazon Textract: Offers powerful optical character recognition (OCR) capabilities based on machine learning (ML). It provides features such as text extraction, structured data extraction, and handwritten data extraction from scanned documents with high precision.
- AWS Lambda: A serverless compute service that runs code in response to events generated by other AWS services such as Amazon S3 and Amazon API Gateway.
- Amazon WorkMail: Provides a secure, controlled, and managed business email service that allows users to smoothly access their email, contacts, and calendars using client applications such as Outlook, iOS, and Android email applications.
- Amazon Simple Storage Service: Amazon S3 delivers a fast, secure, and scalable cloud-based storage service. It stores data as objects and supports easy data upload, access, and retrieval procedures.
- Amazon Simple Email Service: Amazon SES offers a flexible and scalable email service for sending messages from applications.
- Amazon Relational Database Service: Amazon RDS makes it easy to set up, maintain, and scale relational databases in the cloud. It is used here to store the end results of the processing run.
- AWS Amplify: Used here to build a pixel-perfect custom graphical user interface (GUI) where data can be rendered.
Salient Features of Doqulogic
Below are a few key features of the Doqulogic and SAP integration that differentiate the solution:
OCR and Data Extraction
Doqulogic integrates with the vendor’s email through Amazon WorkMail and Amazon SES to ingest purchase orders into its ecosystem. To create an input channel for purchase orders with minimal changes on the client’s end, a separate business email address is maintained using WorkMail.
Purchase order attachments from emails are stored in an Amazon S3 bucket. This triggers an AWS Lambda function in the process flow, which carries out intelligent document extraction using Amazon Textract.
The purchase orders which are temporarily stored in the S3 bucket are then processed by Textract, and the response is stored in another S3 bucket in JSON and CSV formats for further processing.
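A Lambda function for this step might look roughly like the sketch below. It assumes the function is triggered by an S3 event notification and calls Textract’s synchronous `analyze_document` API; the output bucket name, the environment variable, and the helper names are hypothetical, and the post does not say which Textract API or feature types Doqulogic uses.

```python
import json
import os
from urllib.parse import unquote_plus

# Hypothetical bucket name, read from an environment variable if present.
OUTPUT_BUCKET = os.environ.get("OUTPUT_BUCKET", "example-textract-output")


def s3_objects_from_event(event):
    """Yield (bucket, key) pairs from an S3 event notification."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications.
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def lines_from_textract(response):
    """Collect the text of LINE blocks from a Textract response dict."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def handler(event, context):
    # Imported lazily so the helpers above can be used without the AWS SDK.
    import boto3

    textract = boto3.client("textract")
    s3 = boto3.client("s3")
    for bucket, key in s3_objects_from_event(event):
        response = textract.analyze_document(
            Document={"S3Object": {"Bucket": bucket, "Name": key}},
            FeatureTypes=["FORMS", "TABLES"],
        )
        # Store only the extracted blocks as JSON for downstream processing.
        s3.put_object(
            Bucket=OUTPUT_BUCKET,
            Key=f"{key}.json",
            Body=json.dumps({"Blocks": response["Blocks"]}).encode("utf-8"),
        )
```

Writing the result to a different bucket than the one that triggers the function avoids the function re-invoking itself on its own output.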
Integration with SAP S/4HANA
The solution integrates AWS with SAP S/4HANA by using an SAP OData service as the endpoint for communicating with the client’s SAP system. To verify the extracted data, a JSON payload containing it is sent to the SAP system.
If the vendor information exists in the SAP database, the system will generate a sales order and send it as a response to Doqulogic. The response is then parsed to update Amazon RDS, and the data is rendered onto the customized Doqulogic user interface (UI).
If the vendor details do not exist in the SAP database, it sends an error message in the response with a proper description. This error message is appropriately shown in the UI against that particular purchase order.
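The request and the two response paths can be sketched as follows. The payload field names (`VendorId`, `PurchaseOrderNumber`, `Items`) and the response shapes are assumptions for illustration, loosely following the OData V2 JSON convention of a `d` wrapper on success and an `error.message.value` text on failure; the actual service and entity used by Doqulogic are not described in this post.

```python
import json


def build_sales_order_payload(po):
    """Map extracted purchase order data to a hypothetical OData request body."""
    return {
        "VendorId": po["vendor_id"],
        "PurchaseOrderNumber": po["po_number"],
        "Items": [
            {"Material": item["material"], "Quantity": str(item["quantity"])}
            for item in po["items"]
        ],
    }


def interpret_response(status_code, body):
    """Turn an OData-style JSON response into a record for the database and UI."""
    data = json.loads(body)
    if 200 <= status_code < 300:
        # Assumed success shape: {"d": {"SalesOrder": "<number>"}}
        return {"status": "CREATED", "sales_order": data["d"]["SalesOrder"]}
    message = data.get("error", {}).get("message", {})
    # OData V2 wraps the text as {"value": ...}; V4 uses a plain string.
    text = message.get("value") if isinstance(message, dict) else message
    return {"status": "ERROR", "error": text or "Unknown error"}
```

The dictionary returned by `interpret_response` is the kind of record that could then be written to Amazon RDS and shown against the purchase order in the UI.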
Figure 3 – Flow chart of purchase order automation in SAP with TCS Doqulogic.
If a purchase order is received that has already been processed in the SAP system, and its sales order number already exists in the database, the purchase order is deemed a duplicate. This avoids unnecessary reprocessing of duplicate purchase orders.
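A duplicate check of this kind amounts to a lookup keyed on the purchase order number. The sketch below uses an in-memory SQLite database as a stand-in for Amazon RDS so that it runs anywhere; the table and column names are hypothetical.

```python
import sqlite3


def init_db(conn):
    """Create the lookup table; the primary key also blocks duplicate inserts."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_orders ("
        "po_number TEXT PRIMARY KEY, sales_order TEXT NOT NULL)"
    )


def find_existing_sales_order(conn, po_number):
    """Return the sales order already generated for this purchase order, or None."""
    row = conn.execute(
        "SELECT sales_order FROM processed_orders WHERE po_number = ?",
        (po_number,),
    ).fetchone()
    return row[0] if row else None


def record_sales_order(conn, po_number, sales_order):
    """Remember that a purchase order produced a sales order."""
    conn.execute(
        "INSERT INTO processed_orders (po_number, sales_order) VALUES (?, ?)",
        (po_number, sales_order),
    )
    conn.commit()
```

If `find_existing_sales_order` returns a value, the incoming purchase order can be flagged as a duplicate and skipped before any request is sent to SAP.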
Business Email Integration
To notify the client that a sales order was generated successfully, the CSV file stored in S3 is forwarded as an attachment to the client’s email address, along with a well-structured subject and description in the mail body. Amazon SES is used to send these notification emails from a Lambda function.
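One way to send such a notification is to build a MIME message with the CSV attached and pass it to the SES `send_raw_email` API. This is a minimal sketch under that assumption; the subject wording, file naming, and function names are hypothetical.

```python
from email.message import EmailMessage


def build_notification(sender, recipient, po_number, sales_order, csv_bytes):
    """Build an email with the result CSV attached."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Sales order {sales_order} created for purchase order {po_number}"
    msg.set_content(
        f"Purchase order {po_number} was processed successfully.\n"
        f"Sales order number: {sales_order}\n"
        "The extracted data is attached as a CSV file.\n"
    )
    msg.add_attachment(
        csv_bytes, maintype="text", subtype="csv", filename=f"{po_number}.csv"
    )
    return msg


def send_notification(msg):
    """Send the message through Amazon SES (requires AWS credentials)."""
    # Imported lazily so the message can be built without the AWS SDK.
    import boto3

    ses = boto3.client("ses")
    return ses.send_raw_email(RawMessage={"Data": msg.as_bytes()})
```

Inside a Lambda function, `csv_bytes` would typically be read from the S3 object that holds the Textract output before calling `build_notification`.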
Additional Use Cases Fulfilled by TCS Doqulogic
- Know your customer (KYC) data extraction and validation: Data extraction from identification documents to validate them against user application details and automate the verification workflow, thereby reducing manual labor and human errors.
- Healthcare bills and prescription processing: Analyzing expenses and extracting patient information as a part of healthcare administration automation. This use case also caters to medical insurance processing and pharmaceutical business processing.
- Banking insight automation: Processing bank statements, totaling transactions to validate the balance, and generating a report based on rules provided by the client. Cash flows and transaction-based anomalies are also analyzed in real time for specific patterns through integrated ML models. This helps streamline account validation and the detection of suspicious activity.
- Excel automation and format conversions: Fetching data from unstructured Excel spreadsheets provided by the client, processing it, unifying multiple templates, and writing the required fields in a CSV file.
- Semantic comparison: Comparing semantics of documents with reference to their original documents and keeping track of all the changes made. Also, highlighting the modified/deleted text in the preview section.
- Document classification: Using the solution’s supervised ML model to classify and label documents removes the manual labor of selecting the correct documents to process. This is an image-based classification model (less dependent on the text inside the document) that can be enabled quickly for training and retraining. Once classified, documents can undergo the appropriate extraction.
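For the banking use case in the list above, the balance validation step comes down to simple arithmetic: the opening balance plus all transactions should equal the stated closing balance. The sketch below illustrates that check with hypothetical names and signed amounts passed as strings; it is not taken from Doqulogic.

```python
from decimal import Decimal


def validate_balance(opening, transactions, stated_closing):
    """Return (is_valid, computed_closing) for a statement.

    Amounts are passed as strings and handled as Decimal to avoid
    floating-point rounding errors. Debits are negative, credits positive.
    """
    computed = Decimal(opening)
    for amount in transactions:
        computed += Decimal(amount)
    return computed == Decimal(stated_closing), computed
```

For example, an opening balance of 1000.00 with transactions of -250.50 and 100.25 yields a computed closing balance of 849.75, which is then compared with the figure printed on the statement.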
Manual document processing is time-consuming and prone to errors. Traditional processing and data entry fail to keep up with today’s digital business processes.
TCS Doqulogic stitches together AWS-native services to eliminate the slow manual process and provide customers with a solution that offers:
- Intelligent classification of documents through supervised machine learning models.
- Effective image pre-processing through in-house algorithms for enhanced readability.
- A customizable, highly interactive GUI for end-to-end document extraction, business rule validations, human-in-the-loop data corrections, and document comparison including actionable items.
- Easy integration with third-party backend systems, including SAP S/4HANA.
- Output formats tailored to customer requirements.
TCS Doqulogic is offered under an as-a-service commercial model, with the flexibility to choose from a range of AWS services and the option to customize the solution to suit customers’ business needs. TCS continues to enhance the solution with a focus on making Doqulogic template-less, agnostic, and serverless.
TCS has a proven track record of delivering industry-leading solutions for SAP customers on AWS. Contact the TCS team to learn more about TCS Doqulogic and its SAP services offerings on AWS.
TCS – AWS Partner Spotlight
TCS is an AWS Premier Tier Services Partner and MSP that has been partnering with many of the world’s largest businesses in their transformation journeys for the last 50 years.