AWS Partner Network (APN) Blog

Wipro’s Email Automation Framework Helps Customers with Content Extraction and Classification

By Bhajandeep Singh, AWS AI/ML Lead – Wipro AI Solution Practice
By Utpal Dutta, Sr. AWS AI/ML Architect – Wipro AI Solution Practice
By Bindhu Chinnadurai, Sr. Partner Solutions Architect – AWS

Many organizations derive business understanding and new insights through content analytics, which can also be referred to as content intelligence.

Many have tapped the benefits of natural language processing (NLP) to derive context from content and provide recommendations or rankings that augment manual processing efficiency. As the results provided by machine learning (ML) algorithms increase in accuracy, customers become more confident in implementing automation workflows.

Wipro’s email automation framework leverages machine learning services from Amazon Web Services (AWS) that enable organizations to extract data from emails and provide automated instructions which enhance accuracy and improve staff productivity.

This solution extracts data from the mailbox (email body and attachments), validates it, and prioritizes emails based on urgency. It performs sentiment classification and provides automated instructions for tasks assigned to operators. It can be configured for manual intervention in the decision-making steps, and the framework provides the capability to audit and visually analyze performance.

In this post, we discuss how Wipro leveraged the artificial intelligence (AI) and machine learning capabilities of AWS to build a solution that helped resolve a customer’s challenges.

Wipro is an AWS Premier Tier Services Partner and Managed Service Provider (MSP). Its AI/ML solutions drive enhanced operational efficiency, productivity, and superior customer experience for many enterprise clients.

Customer Challenges

A leading U.S. radio broadcaster with 800+ radio stations providing internet radio and podcasting services was looking to optimize order processing and promotions along with advertising requests.

Requests for promotions and “advertainments” have historically been entertained through all channels, but email is the company’s primary request channel.

Because this is a top revenue-generating area, the customer wanted an alternative to manual processing of email content, which led to:

  • Missing urgent and revised emails.
  • Missing important information in emails.
  • Reduced staff productivity, as employees were involved in validation, relevancy filtering, and resolving inconsistencies between systems. As a result, they were constantly juggling priorities.
  • Failure to indicate warnings and errors in the process.
  • Failure to capture operating effectiveness.

Solution Overview

Wipro helped the customer overcome these challenges by proposing an on-demand solution on AWS that automates the email pipeline, including extraction, validation, named-entity recognition (NER), classification, sentiment analysis, and automatic task assignment.

The feedback loop helps the ML algorithms improve continuously, so classification accuracy gets progressively better. The solution also uses dashboards to analyze performance and provide auditing capabilities. As a result, the customer saw a reduction in publishing time and effort, which had been a challenge because prioritizing work was a manual activity.

The diagram below explains how the email processing happens with the ML pipeline.


Figure 1 – Email processing ML pipeline.

Once a customer email lands in the defined inbox, the primary email reader service extracts the email body content and attachments. A work item is then created from the extracted information and routed to centralized storage.

For all attachments and email body content, entity extraction is carried out using NLP-based AI services. The extracted output is passed through a validation engine to produce the final extracted fields. These validated fields, along with a confidence score, are displayed in a user interface (UI) for user validation. After validation, the output is submitted to the downstream order processing system.
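To make the validation step concrete, here is a minimal sketch of a rule-based validation pass over extracted fields. The field names (`order_number`, `air_date`), their formats, and the 0.80 review threshold are assumptions made for illustration; the post does not specify the actual rules.

```python
import re
from dataclasses import dataclass

# Hypothetical field formats; real rules would come from the business.
FIELD_PATTERNS = {
    "order_number": re.compile(r"ORD-\d{6}"),
    "air_date": re.compile(r"\d{4}-\d{2}-\d{2}"),
}
REVIEW_THRESHOLD = 0.80  # assumed cut-off below which a user must confirm


@dataclass
class ValidatedField:
    name: str
    value: str
    confidence: float
    format_ok: bool
    needs_review: bool


def validate_fields(extracted: dict) -> list:
    """Validate fields given as {name: (value, confidence in [0, 1])}."""
    results = []
    for name, (value, confidence) in extracted.items():
        value = value.strip()
        pattern = FIELD_PATTERNS.get(name)
        # Fields without a known pattern pass the format check by default.
        format_ok = pattern is None or pattern.fullmatch(value) is not None
        needs_review = (not format_ok) or confidence < REVIEW_THRESHOLD
        results.append(
            ValidatedField(name, value, confidence, format_ok, needs_review)
        )
    return results


if __name__ == "__main__":
    sample = {
        "order_number": ("ORD-123456", 0.97),
        "air_date": ("next Friday", 0.91),
    }
    for validated in validate_fields(sample):
        print(validated)
```

Flagging a field for review rather than rejecting it keeps the user in the loop, which matches the UI validation step described above.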

Downstream Order Processing Flow

After extraction, the downstream processing commences:

  • The extracted data, along with a confidence score, is validated using a custom-built UI.
  • The UI gives users a consolidated view of the complete information and lets them compare the extracted fields against the actual document. Metadata that the custom application extracts to enrich the content is also presented for review before the work item moves to the next step.
  • Business users can submit the data as-is or edit it before submission.
  • Post-submission, the system communicates with the defined action streams to carry out the necessary data exchange for downstream processing of the work item.
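Because business users can edit fields before submission, the difference between what was extracted and what was submitted is a natural source of feedback for the ML models. The sketch below shows one way such corrections could be recorded. The `Correction` record and the `collect_corrections` function are hypothetical, and the post does not describe how the feedback loop is actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correction:
    """Hypothetical record of one field a user changed before submitting."""
    work_item_id: str
    field_name: str
    extracted_value: str
    submitted_value: str


def collect_corrections(work_item_id: str, extracted: dict, submitted: dict) -> list:
    """Return a Correction for every field whose value the user changed."""
    corrections = []
    for name in sorted(set(extracted) | set(submitted)):
        before = extracted.get(name, "")
        after = submitted.get(name, "")
        if before != after:
            corrections.append(Correction(work_item_id, name, before, after))
    return corrections


if __name__ == "__main__":
    extracted = {"order_number": "ORD-123456", "air_date": "2021-03-05"}
    submitted = {"order_number": "ORD-123456", "air_date": "2021-03-12"}
    for correction in collect_corrections("wi-1", extracted, submitted):
        print(correction)
```

A work item submitted as-is produces an empty list, which can itself be a useful signal that the extraction was accepted without changes.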


Figure 2 – Extraction of content from attachments.

Technical Architecture

Wipro’s solution is developed using multiple AWS services and open-source technologies that store, process, transform, correlate, and analyze the email content to auto-populate data with instructions and visually monitor metrics.

The diagram below explains the services and components used in the processing engine.


Figure 3 – Technical architecture.

As shown above, all of the custom components and AI services are secured within a virtual private cloud (VPC), using specific endpoints and security groups to avoid unnecessary internet traffic.

Extraction is the primary component of the solution, and the efficiency and accuracy of the downstream processing system depend largely on the accuracy of the extraction service. Amazon Textract performs the named entity extraction for attachments in the emails, which make up the largest portion of the email content.
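As a rough illustration of the Textract step, the sketch below reads the text lines and confidence scores out of a Textract response. The helper names are hypothetical, and the call shown is the synchronous `detect_document_text` API on a document stored in Amazon S3; the post does not say which Textract API the solution uses, and multi-page documents would typically go through the asynchronous APIs instead.

```python
def lines_from_textract(response: dict, min_confidence: float = 0.0) -> list:
    """Return (text, confidence) pairs for LINE blocks in a Textract response."""
    lines = []
    for block in response.get("Blocks", []):
        if block.get("BlockType") != "LINE":
            continue  # skip PAGE and WORD blocks
        confidence = block.get("Confidence", 0.0)  # Textract reports 0-100
        if confidence >= min_confidence:
            lines.append((block.get("Text", ""), confidence))
    return lines


def extract_from_s3(bucket: str, key: str) -> list:
    """Run Textract on a document in S3; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the parser above works without it

    client = boto3.client("textract")
    response = client.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    return lines_from_textract(response)


if __name__ == "__main__":
    # A hand-written response fragment, so the parser runs without AWS access.
    fake_response = {
        "Blocks": [
            {"BlockType": "PAGE"},
            {"BlockType": "LINE", "Text": "Order ORD-123456", "Confidence": 99.1},
            {"BlockType": "WORD", "Text": "Order", "Confidence": 99.5},
        ]
    }
    print(lines_from_textract(fake_response, min_confidence=90))
```

Keeping the response parsing separate from the AWS call makes the parsing logic easy to test against saved responses.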

Below are the steps involved in the ML pipeline and the AWS services leveraged:

  • Emails from customers land in the defined email inbox.
  • Emails from each location’s (station) inbox are forwarded to a centralized Outlook inbox. A Python-based custom reader service checks for new emails on a schedule using a cron job.
  • An automated AI-based email reader service reads the emails and checks for duplicates. When the same email lands in multiple inboxes, the process selects the most relevant copy. Custom code for services such as the reader service and extraction service is deployed and managed using Amazon Elastic Container Service (Amazon ECS).
  • Amazon Simple Queue Service (SQS) is leveraged for asynchronous processing of the emails as emails are read and processed on the defined schedule.
  • Custom code sends the emails along with attachments to Amazon Simple Storage Service (Amazon S3) which triggers event-based processing.
  • Emails are classified into respective classes and appropriate actions are defined based on the classification. A custom ML model developed using Amazon SageMaker classifies and tags emails with different categories. The email body is used as model input and the output is a pre-defined tag/class.
  • Depending upon the email class, attachments are extracted and workloads are created by combining the email metadata, email body, and attachment data.
  • When the email attachment contains images, a custom convolutional neural network (CNN)-based model trained using Amazon SageMaker keeps the relevant images and discards non-relevant ones such as logos and signatures. This image classification model labels each email or attachment image as class 1 (needs processing) or class 2 (filtered out from further processing).
  • Depending on the nature of the content, the appropriate AI-based extraction method is applied.
  • For email attachments (PDF/Excel/Word), Amazon Textract extracts content from the attachments from the email workloads stored in Amazon S3.
  • Similarly, for content with no attachments a custom named entity recognition model takes care of the named entity extraction process. A custom NLP model for classification tags the emails with preconfigured tags (urgent, production, review) which are used in business prioritization.
  • The output is passed through an additional layer comprising a custom regex model and parsers for further validation. Email body content along with metadata is stored in a relational database management system (RDBMS).
  • Amazon API Gateway is the entry point to the backend system and responsible for request routing.
  • Backend/downstream applications use this extracted and validated information for different stages of the order processing workflow, such as new order creation or order updates.
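The event-based step in the list above, where objects written to Amazon S3 trigger processing and the extraction method depends on the content, can be sketched as follows. This is an assumption-laden illustration: the function names, the file-extension sets, and the three route labels are hypothetical, and only the S3 event notification layout (`Records`, `s3.bucket.name`, `s3.object.key`) comes from AWS.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote_plus

# Hypothetical groupings of file types; the real routing rules are not published.
DOCUMENT_TYPES = {".pdf", ".xlsx", ".xls", ".docx", ".doc"}
IMAGE_TYPES = {".png", ".jpg", ".jpeg", ".tiff"}


def choose_extractor(key: str) -> str:
    """Pick an extraction route from the object key's file extension."""
    suffix = PurePosixPath(key).suffix.lower()
    if suffix in DOCUMENT_TYPES:
        return "textract"      # document attachments
    if suffix in IMAGE_TYPES:
        return "image_filter"  # CNN relevance check first
    return "custom_ner"        # email body content with no attachment


def route_s3_event(event: dict) -> list:
    """Return (bucket, key, route) for each record in an S3 event notification."""
    routes = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        routes.append((bucket, key, choose_extractor(key)))
    return routes


if __name__ == "__main__":
    sample_event = {
        "Records": [
            {
                "s3": {
                    "bucket": {"name": "email-workloads"},
                    "object": {"key": "wi-1/Order+Form.PDF"},
                }
            }
        ]
    }
    print(route_s3_event(sample_event))
```

In practice each route would enqueue or invoke the corresponding service; here the function only returns the decision so the logic stays easy to inspect.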

Customer Benefits

The customer in this case benefits from the introduction of intelligent automation into a largely manual process, which leads to shorter turnaround time. Through validation, unnecessary and duplicate information is discarded, and discrepancies and delays in the process are minimized. Automated tagging helps users prioritize work items, which speeds up processing.

End-to-end automation of the process prevents revenue leakage and minimizes work items processed without an order mapping, which was frequent before automation. The solution provides a consolidated and real-time view for users at each level, as well as optimization of human resources.

The solution also improves identification of critical work items that require immediate attention. Dashboards at different levels show the overall health of the process, including the ML models involved. They include performance metrics for the ML models and details of the datasets used to train them. Users can manually trigger re-training from the dashboard.


Wipro’s email automation framework leverages AWS machine learning services and has enabled one of the largest media customers in the U.S. to extract data from emails and provide automated instructions, thereby enhancing accuracy and improving staff productivity.

The solution has benefited the customer by reducing the turnaround time and manual intervention while increasing accuracy in most cases.

Processes that were previously tedious and time-consuming have been transformed into an automated and interactive process. This brings more intelligence into the current process and adds compliance, visibility, and auditability.

