Overview
Upstage Information Extract is a schema-driven document intelligence API that transforms unstructured content into structured data. It applies semantic understanding across any document type or format to reliably capture the information you define.
User-Defined Schema Extraction: Define exactly the fields you need, from simple key-values to complex nested structures. Unlike extraction limited to a fixed set of predefined keys, this flexible engine adapts to any document structure.
Semantic Understanding: Rather than relying on position- or template-based rules, it extracts data based on a semantic understanding of both the document context and the user-defined schema.
Designed for Reliable Automation: We prioritize reliability and traceability, allowing users to verify reference locations and confidence scores for seamless human-in-the-loop workflows.
Highlights
- Schema-Driven Extraction: Define what to extract using flexible schemas. Works across any document type or layout, capturing structured data exactly as specified.
- Semantic, Inference-Based Extraction: Analyzes document meaning to extract data beyond explicit text. Captures fields even when values are implied, unlabeled, or expressed in varied ways.
- Key Tasks: Key-Value Extraction, Table to Structured JSON, Nested & Hierarchical Data Extraction, Multi-Document Consistency, Field-Level Location Tracking, Confidence Scoring, Schema Generation
Pricing
| Dimension | Description | Cost/host/hour |
|---|---|---|
| ml.g7e.4xlarge Inference (Real-Time) (recommended) | Model inference on the ml.g7e.4xlarge instance type, real-time mode | $20.00 |
| ml.m5.12xlarge Inference (Batch) (recommended) | Model inference on the ml.m5.12xlarge instance type, batch mode | $20.00 |
| ml.g7e.2xlarge Inference (Real-Time) | Model inference on the ml.g7e.2xlarge instance type, real-time mode | $20.00 |
| ml.g7e.8xlarge Inference (Real-Time) | Model inference on the ml.g7e.8xlarge instance type, real-time mode | $20.00 |
Vendor refund policy
Contact us for refund inquiries. https://www.upstage.ai/contact-us?utm_source=marketplace
Delivery details
Amazon SageMaker model
An Amazon SageMaker model package is a pre-trained machine learning model ready to use without additional training. Use the model package to create a model on Amazon SageMaker for real-time inference or batch processing. Amazon SageMaker is a fully managed platform for building, training, and deploying machine learning models at scale.
Version release notes
Minor stability and dependency updates; real-time inference behavior is unchanged.
Additional details
Inputs
- Summary
Provide input data in a JSON request body. The api field (SageMaker-specific) selects one of three modes, each with its own request body specification:
- information-extraction
- schema-generation
- document-classification
Apart from the api field, each mode's request body matches the vendor's specification for that mode.
- Input MIME type
- application/json
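As a rough illustration of the input format described above, the sketch below adds the api field to a mode-specific body and sends it as application/json to a real-time endpoint. The helper names (`build_payload`, `invoke`), the endpoint name, and the `placeholder_field` key are hypothetical; the real body fields come from the vendor's specification for each mode, and the sketch assumes the endpoint returns JSON, which this listing does not state. The `invoke_endpoint` call is the standard boto3 SageMaker Runtime API.

```python
import json

# The three modes this listing names for the SageMaker-specific "api" field.
SUPPORTED_APIS = {
    "information-extraction",
    "schema-generation",
    "document-classification",
}


def build_payload(api: str, body: dict) -> bytes:
    """Add the "api" selector to a mode-specific body and serialize it as JSON."""
    if api not in SUPPORTED_APIS:
        raise ValueError(f"unsupported api: {api!r}")
    if "api" in body:
        raise ValueError('body must not already contain an "api" key')
    return json.dumps({"api": api, **body}).encode("utf-8")


def invoke(endpoint_name: str, payload: bytes) -> dict:
    """Send the payload to a deployed real-time endpoint.

    Requires boto3 and AWS credentials. Assumes a JSON response body,
    which is an assumption, not something the listing confirms.
    """
    import boto3  # imported lazily so build_payload works without boto3

    client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",  # the input MIME type stated in the listing
        Body=payload,
    )
    return json.loads(response["Body"].read())


if __name__ == "__main__":
    # "placeholder_field" is hypothetical; substitute the fields from the
    # vendor's request body specification for the chosen mode.
    payload = build_payload("schema-generation", {"placeholder_field": "..."})
    print(payload.decode("utf-8"))
    # result = invoke("my-endpoint-name", payload)  # hypothetical endpoint name
```

Keeping payload construction separate from the network call lets the request shape be checked locally before any endpoint is deployed.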
Support
Vendor support
Contact us for model, usage and enterprise integration inquiries.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.

