Amazon Textract is a machine learning (ML) service that automatically extracts text, handwriting, and data from scanned documents. It goes beyond simple optical character recognition (OCR) to identify, understand, and extract data from forms and tables. With Amazon Textract, you pay only for what you use. There are no minimum fees and no upfront commitments. Amazon Textract charges only for the pages you process, whether you extract text, text with tables, form data, or query results, or process invoices and identity documents. See the FAQ for additional details about pages and acceptable use of Amazon Textract.

Amazon Textract has five different APIs: Detect Document Text API, Analyze Document API, Analyze Expense API, Analyze ID API, and Analyze Lending API.

Detect Document Text API uses OCR technology to extract text and handwriting from a document.

Analyze Document API has four features: Forms, Tables, Queries, and Signatures. You can call any combination of these features together.

  • Analyze Document API for Forms extracts data such as key-value pairs (“First Name” and associated value, such as “Jane Smith”). It also uses OCR technology to extract all the text and handwriting from a document.
  • Analyze Document API for Tables extracts tabular or table data organized in columns and rows. It also uses OCR technology to extract all the text and handwriting from a document.
  • Analyze Document API for Queries provides you the flexibility to specify the information you need from a document (e.g., “What is the customer name?”) and receive that data (e.g., “Jane Doe”) as part of the response. You do not need to worry about the structure of the data in the document or variations in how the data is laid out across different formats and versions of the document. It also uses OCR technology to extract all the text and handwriting from a document.
  • Analyze Document API for Signatures provides the ability to detect handwritten signatures, electronic signatures, and initials on any document or image. It also uses OCR technology to extract all the text and handwriting from a document.
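As a rough illustration of calling a combination of features, the sketch below assembles the arguments for one Analyze Document request. The helper name `build_analyze_document_request` is hypothetical. The `FeatureTypes` values and the `QueriesConfig` shape follow the AnalyzeDocument request format as the author of this example understands it, so check them against the current API reference before relying on them.

```python
from typing import Optional

# Feature names passed in the FeatureTypes parameter (one per feature above).
VALID_FEATURES = {"FORMS", "TABLES", "QUERIES", "SIGNATURES"}


def build_analyze_document_request(
    image_bytes: bytes,
    features: list[str],
    queries: Optional[list[str]] = None,
) -> dict:
    """Hypothetical helper: build keyword arguments for an AnalyzeDocument call."""
    unknown = set(features) - VALID_FEATURES
    if unknown:
        raise ValueError(f"unsupported feature types: {sorted(unknown)}")

    request = {
        "Document": {"Bytes": image_bytes},
        "FeatureTypes": list(features),
    }
    # Queries need the questions themselves in addition to the feature flag.
    if "QUERIES" in features:
        if not queries:
            raise ValueError("QUERIES requires at least one question")
        request["QueriesConfig"] = {"Queries": [{"Text": q} for q in queries]}
    return request
```

With the AWS SDK for Python installed and credentials configured, the result could be passed along as `boto3.client("textract").analyze_document(**request)`.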
 
Analyze Expense API extracts data from invoices and receipts. For example, different documents may label the same field as invoice ID, invoice No., or invoice #, each with an associated value of 12345. Amazon Textract recognizes these various terms as the invoice ID and the corresponding value as 12345, which enables a standard taxonomy of common fields.
 
Analyze ID API uses machine learning to understand the context of identity documents such as U.S. passports, driver’s licenses, and other IDs. You can automatically extract specific information such as date of expiry and date of birth, as well as intelligently identify and extract implied information such as name and address. Each ID image is considered a page.
 
Analyze Lending API is a specialized mortgage document processing API that automates the classification and extraction of information from a range of mortgage-related application documents. Analyze Lending’s machine learning models have been pre-trained across the diversity of document types that are seen in a typical mortgage application package. Analyze Lending will classify, split and extract results with accuracy and provide a summary of your results including whether or not a signature was detected on the page.
 

Request a custom quote

For high volume use cases, connect with our sales team to request a custom pricing proposal.

Free Tier

As part of the AWS Free Tier, you can get started with Amazon Textract for free. The Free Tier lasts for three months, and new AWS customers can analyze up to:

Detect Document Text API: 1,000 pages per month
Analyze Document API:

  • 1,000 pages per month when using Signatures only
  • 100 pages per month when using the Forms or Tables feature
  • Additional 100 pages per month when using the Queries feature

Analyze Expense API: 100 pages per month
Analyze ID API: 100 pages per month

Analyze Lending API: 2,000 pages per month
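To estimate how many pages remain billable in a Free Tier month, a minimal sketch follows. It assumes the monthly allowance is simply subtracted from that month's page count, which the text above does not spell out. The dictionary keys are labels chosen for this example, and the Analyze Document allowances are left out because they depend on the feature combination.

```python
# Monthly Free Tier allowances for the single-purpose APIs listed above.
# Keys are illustrative labels, not identifiers required by any SDK.
FREE_TIER_PAGES = {
    "DetectDocumentText": 1_000,
    "AnalyzeExpense": 100,
    "AnalyzeID": 100,
    "AnalyzeLending": 2_000,
}


def billable_pages(api: str, pages_this_month: int) -> int:
    """Pages left to bill, assuming the allowance is deducted first."""
    allowance = FREE_TIER_PAGES.get(api, 0)
    return max(pages_this_month - allowance, 0)
```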

Amazon Textract API pricing

*Analyze Document API output comes with OCR included irrespective of feature type selected
*Analyze Expense and Analyze ID APIs have OCR included in the output

Pricing examples outside the free tier

Pricing example 1 - Detect Document Text API

Let’s say you want to extract the text from 100,000 pages of research reports using the Detect Document Text API. The pricing per page in US West (Oregon) region for the first one million pages is $0.0015, for a cost of $150.

Total pages processed = 100,000

Price per page = $0.0015

Total charge per month = $0.0015 * 100,000 = $150

Pricing example 2 - Detect Document Text API

Let’s say you want to extract the text from two million pages of research reports using the Detect Document Text API. The pricing per page in the US West (Oregon) region for the first one million pages is $0.0015, and pages after one million are $0.0006, so the total cost of processing two million pages would be $2,100.

Total pages processed = 2,000,000

Price per page = $0.0015 for first 1 million and $0.0006 for pages after 1 million

Total charge per month = $0.0015 * 1,000,000 + $0.0006 * 1,000,000 = $1,500 + $600 = $2,100
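The two Detect Document Text examples can be reproduced with a small two-tier calculation. This is only a sketch using the US West (Oregon) rates quoted above. The function name is made up for illustration, and `Decimal` is used so the cents come out exact.

```python
from decimal import Decimal

TIER_LIMIT = 1_000_000               # pages billed at the first-tier rate
FIRST_TIER_RATE = Decimal("0.0015")  # per page, first one million pages
NEXT_TIER_RATE = Decimal("0.0006")   # per page, pages after one million


def detect_text_charge(pages: int) -> Decimal:
    """Monthly charge for Detect Document Text at the rates quoted above."""
    first_tier_pages = min(pages, TIER_LIMIT)
    next_tier_pages = max(pages - TIER_LIMIT, 0)
    return first_tier_pages * FIRST_TIER_RATE + next_tier_pages * NEXT_TIER_RATE
```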

Pricing example 3 - Analyze Document API – Forms and Tables

Let’s say you want to extract the text and structured data from 5,000 pages of tax forms using the Analyze Document API. The pricing per page in the US West (Oregon) region for the first one million pages is $0.015 with tables and $0.05 with forms, for a total of $325.

Total pages processed = 5,000 pages

Price for page with table = $0.015

Price for page with form (key-value pair) = $0.05

Total charge = $0.015 * 5,000 + $0.05 * 5,000 = $75 + $250 = $325

Pricing example 4 - Analyze Document API – Forms and Tables

Let’s say you want to extract text, forms, and tables from two million pages of tax forms using the Analyze Document API. The pricing per page in the US West (Oregon) region for pages with tables is $0.015 for the first one million pages and $0.01 per page after one million. For pages with forms, it is $0.05 for the first one million pages and $0.04 per page after one million. The total cost would be $115,000.

Total pages processed = 2,000,000 pages

Price for page with table = $0.015 for the first 1 million and $0.01 for the next 1 million

Price for page with form (key-value pair) = $0.05 for the first 1 million and $0.04 for the next 1 million

Total charge = $0.015 * 1,000,000 + $0.01 * 1,000,000 + $0.05 * 1,000,000 + $0.04 * 1,000,000 = $15,000 + $10,000 + $50,000 + $40,000 = $115,000
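In examples 3 and 4, the Forms and Tables charges are added together, each with its own two-tier rate. The sketch below mirrors that arithmetic. The function and dictionary names are illustrative, and it covers only Forms and Tables, since combinations that include Queries are priced with a single combined rate in the later examples.

```python
from decimal import Decimal

TIER_LIMIT = 1_000_000

# (rate for the first one million pages, rate after one million), per page.
FEATURE_RATES = {
    "TABLES": (Decimal("0.015"), Decimal("0.01")),
    "FORMS": (Decimal("0.05"), Decimal("0.04")),
}


def forms_tables_charge(pages: int, features: list[str]) -> Decimal:
    """Sum the per-feature tiered charges, as in examples 3 and 4."""
    first_tier_pages = min(pages, TIER_LIMIT)
    next_tier_pages = max(pages - TIER_LIMIT, 0)
    total = Decimal("0")
    for feature in features:
        first_rate, next_rate = FEATURE_RATES[feature]
        total += first_tier_pages * first_rate + next_tier_pages * next_rate
    return total
```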

Pricing example 5 - Analyze Document API - Queries

Let’s say you want to extract the text from 5,000 pages of mortgage forms using the Analyze Document API. You also want to extract 10 specific data points from each page via Queries. The pricing per page with Queries in the US West (Oregon) region for the first one million pages is $0.015, for a total of $75.

Total pages processed = 5,000 pages

Price per page with Queries = $0.015

Total charge = $0.015 * 5,000 = $75

Pricing example 6 – Analyze Document API – Forms and Tables and Queries

Let’s say you want to extract text, forms, and tables from two million pages of pay stubs using the Analyze Document API. You also want to extract 10 specific data points from each page via Queries. The pricing per page in the US West (Oregon) region for one million pages with Tables, Forms and Queries is $0.070, and $0.055 per page after one million pages. The total cost would be $125,000.

Total pages processed = 2,000,000 pages 

Price for page with Tables, Forms and Queries = $0.070 for the first one million and $0.055 for the next one million 

Total charge = $0.070 * 1,000,000 + $0.055 * 1,000,000 = $70,000 + $55,000 = $125,000

Pricing example 7 - Analyze Document API - Tables and Queries

Let’s say you want to extract the text and tables data from 5,000 pages of tax forms using the Analyze Document API. You also want to extract 10 specific data points from each page via Queries. The pricing per page in the US West (Oregon) region for one million pages with Tables and Queries is $0.020, and $0.015 per page after one million pages. The total cost would be $100.

Total pages processed = 5,000 pages

Price for page with table and Queries = $0.020

Total charge = $0.020 * 5,000 = $100
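Examples 5 through 7 each use one combined per-page rate for the feature combination rather than a sum of individual rates. One way to model that is a lookup keyed by the set of features, sketched below with only the rates quoted in those examples. The names are illustrative, and the Queries-only entry has no second-tier rate because example 5 does not give one.

```python
from decimal import Decimal

TIER_LIMIT = 1_000_000

# Combined rates quoted in examples 5-7: (first one million pages, after).
# None marks a rate the examples do not state.
COMBINATION_RATES = {
    frozenset({"QUERIES"}): (Decimal("0.015"), None),
    frozenset({"TABLES", "FORMS", "QUERIES"}): (Decimal("0.070"), Decimal("0.055")),
    frozenset({"TABLES", "QUERIES"}): (Decimal("0.020"), Decimal("0.015")),
}


def combination_charge(pages: int, features: set[str]) -> Decimal:
    """Charge for a feature combination using its combined per-page rate."""
    first_rate, next_rate = COMBINATION_RATES[frozenset(features)]
    first_tier_pages = min(pages, TIER_LIMIT)
    next_tier_pages = max(pages - TIER_LIMIT, 0)
    if next_tier_pages and next_rate is None:
        raise ValueError("no second-tier rate is quoted for this combination")
    charge = first_tier_pages * first_rate
    if next_tier_pages:
        charge += next_tier_pages * next_rate
    return charge
```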

Pricing example 8 – Analyze Document API – Signatures

Let’s say you want to detect signatures and extract the raw text from 100,000 pages of mortgage documents using the Analyze Document API with the Signatures feature type. The pricing per page in the US West (Oregon) region for the first one million pages is $0.0035, for a cost of $350.

Total pages processed = 100,000

Price per page = $0.0035

Total charge per month = $0.0035 * 100,000 = $350

Pricing example 9 – Analyze Document API – Signatures

Let’s say you want to detect signatures and extract the raw text from five million pages of mortgage documents using the Analyze Document API with the Signatures feature type. The pricing per page in the US West (Oregon) region is $0.0035 for the first one million pages and $0.0014 for pages after one million, so the total cost would be $9,100.

Total pages processed = 5,000,000

Charge for the first 1 million pages = $0.0035 * 1,000,000 = $3,500

Charge for the next 4 million pages = $0.0014 * 4,000,000 = $5,600

Total charge per month = $3,500 + $5,600 = $9,100
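To double-check the Signatures arithmetic tier by tier, a short sketch is given below. It uses the $0.0035 and $0.0014 per-page rates from the calculation lines above, and the function name is hypothetical.

```python
from decimal import Decimal

TIER_LIMIT = 1_000_000
FIRST_TIER_RATE = Decimal("0.0035")  # per page, first one million pages
NEXT_TIER_RATE = Decimal("0.0014")   # per page, pages after one million


def signature_breakdown(pages: int) -> list[tuple[int, Decimal, Decimal]]:
    """Return one (pages, rate, charge) entry per pricing tier that applies."""
    first_tier_pages = min(pages, TIER_LIMIT)
    lines = [(first_tier_pages, FIRST_TIER_RATE, first_tier_pages * FIRST_TIER_RATE)]
    next_tier_pages = pages - first_tier_pages
    if next_tier_pages > 0:
        lines.append((next_tier_pages, NEXT_TIER_RATE, next_tier_pages * NEXT_TIER_RATE))
    return lines


# Total for five million pages: sum of the per-tier charges.
total = sum(charge for _, _, charge in signature_breakdown(5_000_000))
```

Summing the charge column gives the monthly total for the page count.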

Pricing example 10 – Analyze Expense API

Let’s assume you want to extract data from 100,000 invoices using the Analyze Expense API. The pricing per page in the US West (Oregon) region for the first one million pages is $0.01. The total cost would be $1,000. See the calculation below:

Total pages processed = 100,000 

Price per page = $0.01 

Total charge per month = $0.01 * 100,000 = $1,000

Pricing example 11 – Analyze Expense API

Let’s assume you want to extract data from 1,500,000 invoices using the Analyze Expense API. The pricing per page in the US West (Oregon) region for one million pages is $0.01 per page and $0.008 per page after one million. The total cost would be $14,000. See the calculation below: 

Total pages processed = 1,500,000 

Price per page = $0.01 for the first 1 million and $0.008 for the next 500,000 

Total charge per month = $0.01 * 1,000,000 + $0.008 * 500,000 = $14,000

Pricing example 12 – Analyze ID API

Let’s say you want to extract information from 100,000 identity documents using the Analyze ID API. The pricing per page in the US West (Oregon) Region is $0.025 for the first 100,000 pages. The total cost would be $2,500.

Total pages processed = 100,000 

Price per page = $0.025 

Total charge per month = $0.025 * 100,000 = $2,500

Pricing example 13 – Analyze ID API

Let’s say you want to extract information from 600,000 identity documents using the Analyze ID API. The pricing per page in the US West (Oregon) Region is $0.025 for the first 100,000 pages and $0.01 per page after 100,000. The total cost would be $7,500.

Total pages processed = 600,000

Price per page = $0.025 for the first 100,000 and $0.01 for the next 500,000

Total charge per month = $0.025 * 100,000 + $0.01 * 500,000 = $7,500
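Analyze ID changes rate at 100,000 pages, while Analyze Expense changes at one million, so a calculation that takes the tier boundaries as data can cover both. The sketch below is one possible generalization, not an official calculator. The schedule format and names are assumptions made for this example.

```python
from decimal import Decimal
from typing import Optional

# Each schedule lists (upper page bound, rate per page); None means no upper bound.
Schedule = list[tuple[Optional[int], Decimal]]

ANALYZE_ID: Schedule = [(100_000, Decimal("0.025")), (None, Decimal("0.01"))]
ANALYZE_EXPENSE: Schedule = [(1_000_000, Decimal("0.01")), (None, Decimal("0.008"))]


def tiered_charge(pages: int, schedule: Schedule) -> Decimal:
    """Walk the tiers in order, charging each slice of pages at its own rate."""
    total = Decimal("0")
    lower = 0
    for upper, rate in schedule:
        if upper is None or pages < upper:
            # The remaining pages all fall in this tier.
            return total + (pages - lower) * rate
        total += (upper - lower) * rate
        lower = upper
    raise ValueError("schedule must end with an open-ended tier")
```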

Pricing example 14 – Analyze Lending API

Let’s say you want to extract information from 200,000 pages of mortgage lending documents using the Analyze Lending API. The pricing per page in the US West (Oregon) Region is $0.07 per page for up to one million pages. Of the 200,000 pages you processed, Analyze Lending provided classification and extraction for document types it supported, resulting in 100,000 pages of classification and data extraction. The total cost would be $7,000 for the 100,000 pages.

Total pages processed = 200,000

Total pages supported with classification and extraction = 100,000 

Price per page = $0.07

Total charge per month = $0.07 * 100,000 = $7,000

Pricing example 15 – Analyze Lending API

Let’s say you want to extract information from 2,000,000 pages of mortgage lending documents using the Analyze Lending API. The pricing per page in the US West (Oregon) Region is $0.07 per page for up to one million pages and $0.055 per page after 1,000,000. Of the 2,000,000 pages you processed, Analyze Lending provided classification and extraction for document types it supported, resulting in 1,200,000 pages of classification and data extraction. The total cost would be $81,000 for the 1,200,000 pages.

Total pages processed = 2,000,000

Total pages supported with classification and extraction = 1,200,000

Price per page = $0.07 for the first 1M and $0.055 for the next 200,000

Total charge per month = $0.07 * 1,000,000 + $0.055 * 200,000 = $81,000
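In both Analyze Lending examples the charge is computed on the pages that received classification and extraction, not on every page submitted. A minimal sketch of that rule, using the rates quoted above and a hypothetical function name:

```python
from decimal import Decimal

TIER_LIMIT = 1_000_000
FIRST_TIER_RATE = Decimal("0.07")   # per page, first one million pages
NEXT_TIER_RATE = Decimal("0.055")   # per page, pages after one million


def lending_charge(pages_processed: int, pages_supported: int) -> Decimal:
    """Charge only the pages that were classified and extracted."""
    if pages_supported > pages_processed:
        raise ValueError("supported pages cannot exceed pages processed")
    first_tier_pages = min(pages_supported, TIER_LIMIT)
    next_tier_pages = max(pages_supported - TIER_LIMIT, 0)
    return first_tier_pages * FIRST_TIER_RATE + next_tier_pages * NEXT_TIER_RATE
```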
