
Overview
Upstage Document OCR (Optical Character Recognition) is designed to efficiently detect and recognize text from a wide range of document images, ensuring high accuracy and versatility across various languages and image qualities.
Highlights
### Key Features
- **Word-Level Coordinate/Transcription Results:** Provides a word-level bounding box and transcription for easy text processing.
- **Robustness on Rotated Documents:** Detects and corrects text orientation in rotated documents.
- **Multilingual Text Detection:** Recognizes text in multiple languages (English, Chinese, Japanese, and Korean).
- **Confidence Scores:** Outputs word-level confidence scores so the reliability of extracted text can be assessed and low-confidence words flagged for verification.

### Key Applications
- **Automated Data Entry:** Converts printed or handwritten documents into digital text, streamlining data entry and reducing manual effort.
- **Archival and Digitization:** Digitizes documents, books, and records, preserving information and making it searchable and accessible.
- **Multilingual Document Processing:** Handles documents in English and CJK (Chinese, Japanese, Korean), enabling effective international document processing.

### Key Tasks
- Text Extraction
- Document Digitization
- Multilingual Document Handling
- Automated Data Entry
- Information Retrieval
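The word-level confidence scores listed under Key Features lend themselves to a simple review step. The sketch below splits recognized words into accepted and needs-review groups. The response schema is not documented on this page, so the `text` and `confidence` keys, the 0-to-1 score range, and the `split_by_confidence` helper are all hypothetical and should be adapted to the actual output.

```python
from typing import Dict, List, Tuple

Word = Dict[str, object]


def split_by_confidence(
    words: List[Word], threshold: float = 0.9
) -> Tuple[List[Word], List[Word]]:
    """Split words into (accepted, needs_review) by confidence score.

    Assumes each word is a dict with a "confidence" key holding a float
    in [0, 1]. This shape is a hypothetical stand-in, not the documented
    output schema of the model.
    """
    accepted: List[Word] = []
    needs_review: List[Word] = []
    for word in words:
        if float(word["confidence"]) >= threshold:
            accepted.append(word)
        else:
            needs_review.append(word)
    return accepted, needs_review


# Hypothetical sample data for illustration only.
sample_words = [
    {"text": "Invoice", "confidence": 0.99},
    {"text": "N0.", "confidence": 0.62},
    {"text": "12345", "confidence": 0.95},
]
accepted_words, review_words = split_by_confidence(sample_words, threshold=0.9)
```

The threshold is a tuning choice: a higher value sends more words to manual verification, a lower value accepts more text automatically.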
Pricing
| Dimension | Description | Cost/host/hour |
|---|---|---|
| ml.m5.12xlarge Inference (Batch) (Recommended) | Model inference on the ml.m5.12xlarge instance type, batch mode | $0.00 |
| ml.g5.xlarge Inference (Real-Time) (Recommended) | Model inference on the ml.g5.xlarge instance type, real-time mode | $1.50 |
| ml.p3.2xlarge Inference (Real-Time) | Model inference on the ml.p3.2xlarge instance type, real-time mode | $1.50 |
| ml.g6.2xlarge Inference (Real-Time) | Model inference on the ml.g6.2xlarge instance type, real-time mode | $1.50 |
| ml.g5.2xlarge Inference (Real-Time) | Model inference on the ml.g5.2xlarge instance type, real-time mode | $1.50 |
| ml.g4dn.xlarge Inference (Real-Time) | Model inference on the ml.g4dn.xlarge instance type, real-time mode | $1.50 |
| ml.g6.xlarge Inference (Real-Time) | Model inference on the ml.g6.xlarge instance type, real-time mode | $1.50 |
Vendor refund policy
We do not support any refunds currently.
Delivery details
Amazon SageMaker model
An Amazon SageMaker model package is a pre-trained machine learning model ready to use without additional training. Use the model package to create a model on Amazon SageMaker for real-time inference or batch processing. Amazon SageMaker is a fully managed platform for building, training, and deploying machine learning models at scale.
Version release notes
The patchify feature has been added.
Patchify processes large images (2560 pixels or more) by dividing them into smaller patches for inference. This enhances the accuracy of inference for large images.
Additional details
Inputs
- Summary: Provide an image file in binary format in the request body.
- Input MIME type: multipart/form-data
Input data descriptions
The following table describes supported input data fields for real-time inference and batch transform.
| Field name | Description | Constraints | Required |
|---|---|---|---|
| use_patchify | Patchify processes large images (2560 pixels or more) by dividing them into smaller patches for inference. This enhances the accuracy of inference for large images. | Default value: false; Type: Categorical; Allowed values: true, false | No |
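A real-time request could be assembled roughly as follows. This is a sketch under stated assumptions, not the vendor's reference client: the form field name for the image (`document` here) is not given on this page and is hypothetical, as are the helper names `build_multipart` and `invoke_ocr`. The `use_patchify` field and the `multipart/form-data` content type come from the input description above, and `invoke_endpoint` is the standard boto3 SageMaker Runtime call.

```python
import os
import uuid
from typing import Dict, Tuple


def build_multipart(
    image_bytes: bytes,
    filename: str,
    fields: Dict[str, str],
    file_field: str = "document",  # hypothetical field name; check the vendor docs
) -> Tuple[bytes, str]:
    """Encode text fields and one file as multipart/form-data.

    Returns (body, content_type); content_type carries the boundary.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    parts.append(file_header + image_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def invoke_ocr(endpoint_name: str, image_path: str, use_patchify: bool = False) -> bytes:
    """Send an image to a deployed endpoint and return the raw response bytes."""
    import boto3  # imported here so build_multipart works without the AWS SDK

    with open(image_path, "rb") as f:
        image_bytes = f.read()
    body, content_type = build_multipart(
        image_bytes,
        os.path.basename(image_path),
        {"use_patchify": "true" if use_patchify else "false"},
    )
    client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        EndpointName=endpoint_name, ContentType=content_type, Body=body
    )
    return response["Body"].read()
```

For example, `invoke_ocr("my-ocr-endpoint", "scan.png", use_patchify=True)` would send `scan.png` with patchify enabled, where the endpoint name is a placeholder for whatever name was chosen at deployment. The response is returned as raw bytes because its format is not described on this page.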
Support
Vendor support
Contact us for model inquiries.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.
