Speed up Amazon Textract optical character recognition (OCR) with an intelligent processing pipeline.

Many companies extract data from scanned documents such as PDFs, tables, and forms through manual data entry, which is slow, expensive, and prone to errors. In some cases, simple OCR software is used to extract data, but this method still requires manual configuration that must be updated each time the form changes.
Amazon Textract is a fully managed machine learning service that uses OCR to automatically identify, understand, and extract printed text, handwriting, forms, tables, and other data from scanned documents, without the need for manual effort or custom code.
TensorIoT has created JumpStart for Amazon Textract to further enhance the benefits of Amazon Textract by reducing setup time and adding functionality that improves the data extraction experience and streamlines the user interface. Through the use of customer data templates (forms), data can be extracted from a variety of sources and sorted for parsing, making data review and analysis easier and quicker.

AWS Partner Network | Competency


United States, United Kingdom, Germany, France, Spain, Italy, Australia, India


Automate manual processes

Automated filtering and processing of raw data exports to identify specific text and values.

Speed time to insights

Instantly extract and process documents without manual effort or custom code.

Innovative user interface

Out-of-the-box user interface for viewing the data exported from Amazon Textract.

Save time and money

Realize benefits in days (or weeks), not months, using pay-per-use services to lower costs.

  • How it works
  • Enterprises that process a large number of forms, documents, and PDFs need to be able to extract data from those documents to discover insights, analyze trends, and put their data to work. However, document processing can be time consuming and labor intensive, inhibiting effective data analysis. Using Amazon Textract can help solve these issues, providing a quick, efficient, and low-cost way to improve document processing. JumpStart for Amazon Textract is a deployable pipeline that allows you to quickly analyze PDF documents with Amazon Textract and view the results via a web-based user interface.

    Working with your company, TensorIoT's consultants will identify which pieces of information are required and what types of obstacles may need to be overcome (such as leaving out letterheads or translating written dollar amounts to numerical values) in your JumpStart for Amazon Textract solution. After establishing the form structure and the information targeted for extraction, TensorIoT will develop your template with custom changes to optimize your data processing.
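
To make one of the obstacles above concrete, here is a minimal Python sketch of translating a written dollar amount into a numerical value. It is an illustration only, not TensorIoT's implementation: the function name `written_amount_to_number` and the supported vocabulary (English number words up to millions, with cents written as a fraction such as `56/100`) are assumptions made for this example.

```python
import re
from decimal import Decimal

# Hypothetical helper for illustration; not part of Amazon Textract
# or JumpStart for Amazon Textract.
_UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
_TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
_SCALES = {"thousand": 1_000, "million": 1_000_000}
_FILLER = {"and", "dollar", "dollars", "only"}


def written_amount_to_number(text: str) -> Decimal:
    """Convert a written amount such as
    'One thousand two hundred thirty-four dollars and 56/100'
    into a Decimal."""
    cleaned = text.lower().replace("-", " ")

    # Cents are assumed to appear as a fraction over 100, as on checks.
    cents = 0
    fraction = re.search(r"(\d{1,2})\s*/\s*100", cleaned)
    if fraction:
        cents = int(fraction.group(1))
        cleaned = cleaned[:fraction.start()]

    total = 0    # completed groups (thousands, millions)
    current = 0  # group currently being built (0 to 999)
    for word in re.findall(r"[a-z]+", cleaned):
        if word in _UNITS:
            current += _UNITS[word]
        elif word in _TENS:
            current += _TENS[word]
        elif word == "hundred":
            current = max(current, 1) * 100
        elif word in _SCALES:
            total += max(current, 1) * _SCALES[word]
            current = 0
        elif word in _FILLER:
            continue
        else:
            raise ValueError(f"Unrecognized word in amount: {word!r}")

    return Decimal(total + current) + Decimal(cents) / 100


if __name__ == "__main__":
    print(written_amount_to_number(
        "One thousand two hundred thirty-four dollars and 56/100"
    ))  # 1234.56
```

Returning a `Decimal` rather than a float keeps currency values exact. Unrecognized words raise an error so that a misread amount is flagged for review instead of being silently converted.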

    Finally, when the pipeline has been completed and approved, TensorIoT will assist in packaging the entire pipeline with either AWS Cloud Development Kit (AWS CDK) or AWS CloudFormation and deploying it in the client's AWS account. TensorIoT's extensive experience working with AWS is an asset when working with customers on data solutions.

  • Key activities
  • 1) Identify customer requirements

    TensorIoT will host an ideation session with the customer to determine project requirements.

    2) Develop form template(s)

    TensorIoT develops the form template(s) to extract specific information from Amazon Textract's raw output.

    3) Test and verify form template(s)

    TensorIoT will run sample forms through the platform to test and verify form template function.

    4) Deploy the solution

    TensorIoT will deploy the new form templates to the desired account.

    5) Support

    TensorIoT provides ongoing support for troubleshooting and enhanced functionality, with support contracts available.
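
As a rough illustration of activity 2, the sketch below shows how a form template might pull specific fields out of Amazon Textract's raw output. It assumes a response in the shape returned by the Textract `AnalyzeDocument` operation with the `FORMS` feature (a list of `Blocks`, where `KEY_VALUE_SET` blocks link keys to values and to their `WORD` children). The function names, the template format, and the sample data are hypothetical and are not taken from JumpStart for Amazon Textract.

```python
def _block_text(block, blocks_by_id):
    """Join the text of a block's WORD children."""
    words = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = blocks_by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def extract_key_values(response):
    """Return {key text: value text} for every form field in the response."""
    blocks_by_id = {b["Id"]: b for b in response["Blocks"]}
    pairs = {}
    for block in response["Blocks"]:
        is_key = (block["BlockType"] == "KEY_VALUE_SET"
                  and "KEY" in block.get("EntityTypes", []))
        if not is_key:
            continue
        key_text = _block_text(block, blocks_by_id)
        value_parts = []
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                for value_id in rel["Ids"]:
                    value_parts.append(
                        _block_text(blocks_by_id[value_id], blocks_by_id))
        pairs[key_text] = " ".join(part for part in value_parts if part)
    return pairs


def apply_template(pairs, template):
    """Map output field names to form labels (hypothetical template format).

    Labels are matched case-insensitively, ignoring a trailing colon.
    Fields whose label is not found are returned as None.
    """
    normalized = {k.strip().rstrip(":").lower(): v for k, v in pairs.items()}
    return {field: normalized.get(label.lower())
            for field, label in template.items()}


# Hand-written sample in the Textract response shape, for illustration only.
SAMPLE_RESPONSE = {"Blocks": [
    {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
     "Relationships": [{"Type": "VALUE", "Ids": ["v1"]},
                       {"Type": "CHILD", "Ids": ["w1", "w2"]}]},
    {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
     "Relationships": [{"Type": "CHILD", "Ids": ["w3"]}]},
    {"Id": "w1", "BlockType": "WORD", "Text": "Invoice"},
    {"Id": "w2", "BlockType": "WORD", "Text": "Number:"},
    {"Id": "w3", "BlockType": "WORD", "Text": "INV-1001"},
]}

if __name__ == "__main__":
    template = {"invoice_number": "Invoice Number", "total": "Total"}
    print(apply_template(extract_key_values(SAMPLE_RESPONSE), template))
```

In this sketch a missing label yields `None` rather than an error, so that the test and verify step (activity 3) can report which template fields failed to match on a sample form.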

  • Customer contribution
  • Leadership and team collaboration

    Appointment of a project sponsor; participation of technical and business leadership.

    Provide forms

    The customer provides the structured form to be ingested, along with sample forms to aid in development and testing.

    Working knowledge of data ingestion

    Understand how the customer ingests data from various sources and where the data is accessed, stored, and used.

    Systems and security access

    Access to systems and data sources as required for the project. The engagement can be done fully remotely as needed.

  • About this consultant
  • TensorIoT is an AWS Partner that has achieved the AWS IoT, AWS Machine Learning, AWS Industrial Software, and AWS Retail Competencies as well as multiple AWS Service Delivery designations. Founded by a former AWS employee, TensorIoT has delivered successful IoT and machine learning projects across the world, with offices in the United States (California, Nevada, Texas, Virginia, and Washington), the United Kingdom, and India. With deep experience delivering complete end-to-end solutions and products, from edge devices to end-user environments in IoT and from data engineering to automated pipelines in machine learning, TensorIoT's team of certified AWS architects can quickly assist customers in realizing their technology and business goals.

