Overview
This 30B parameter vision-language model represents the optimal balance of accuracy, cost, and performance for production OCR and structured extraction pipelines.
The model achieves 90% accuracy on OCRBench, the highest in its class, delivering enterprise-grade reliability for mission-critical document processing.
It excels at complex structured extraction from forms, financial documents, medical records, legal contracts, and technical diagrams, with a 20.3% character error rate (CER) on the FUNSD benchmark, equivalent to 79.7% field-level accuracy.
The Mixture-of-Experts architecture activates only 3B parameters per inference, delivering exceptional accuracy with superior computational efficiency.
The 32K context window processes lengthy documents and multi-page batches seamlessly.
Enhanced with advanced training techniques, it demonstrates superior reasoning for ambiguous layouts, degraded document quality, and complex multi-table structures, delivering production-ready accuracy for high-volume workflows that require the highest reliability at scale.
Production Excellence
- Most cost-efficient option for enterprise OCR at scale
- Optimal for high-volume automated document processing
- Superior structured extraction for financial, medical, and legal documents
- Ideal for production pipelines processing 10K+ documents daily
- Handles degraded scans and varying document quality
- Seamless integration with enterprise document management systems
Highlights
- Industry-Leading Performance:
  - Achieves 90% accuracy on OCRBench
  - 20.3% character error rate on FUNSD (79.7% field-level accuracy)
  - Processes 25+ languages with consistent accuracy
  - Superior performance on charts, diagrams, tables, and complex layouts
  - Exceptional reliability for production-grade document processing
- Technical Specifications:
  - 30B total parameters with 3B active per inference (MoE architecture)
  - Maximum context length: 32K tokens
  - Image resolution: up to 8 MP / 4K (3840 × 2160)
  - Advanced training for enhanced reasoning and accuracy
  - 4× inference speedup through optimized deployment architecture
- Structured Extraction Excellence:
  - Superior JSON generation from complex document layouts
  - Excellent chart and data visualization comprehension (91-93%)
  - Advanced table extraction with structure preservation
  - Robust handling of nested tables and hierarchical data
  - Reliable key-value extraction from challenging layouts
Details
Pricing
Free trial
| Dimension | Description | Cost/host/hour |
|---|---|---|
| ml.g5.12xlarge Inference (Batch), recommended | Model inference on the ml.g5.12xlarge instance type, batch mode | $9.98 |
| ml.g5.12xlarge Inference (Real-Time), recommended | Model inference on the ml.g5.12xlarge instance type, real-time mode | $9.98 |
Vendor refund policy
No refunds are possible.
Legal
Vendor terms and conditions
Content disclaimer
Delivery details
Amazon SageMaker model
An Amazon SageMaker model package is a pre-trained machine learning model ready to use without additional training. Use the model package to create a model on Amazon SageMaker for real-time inference or batch processing. Amazon SageMaker is a fully managed platform for building, training, and deploying machine learning models at scale.
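As a sketch of how such a model package can be deployed programmatically, the boto3 snippet below creates a model, endpoint config, and real-time endpoint. The model package ARN, model name, and IAM role ARN are placeholders you obtain from your own subscription and account; this is not vendor-supplied code.

```python
def model_package_container(model_package_arn: str) -> dict:
    """Container spec pointing at a subscribed model package ARN (placeholder ARN)."""
    return {"ModelPackageName": model_package_arn}


def deploy(model_name: str, model_package_arn: str, role_arn: str,
           instance_type: str = "ml.g5.12xlarge") -> None:
    """Create a model and a real-time endpoint from the package.

    Requires boto3 and AWS credentials; imported lazily so the helper
    above stays dependency-free.
    """
    import boto3
    sm = boto3.client("sagemaker")
    sm.create_model(
        ModelName=model_name,
        ExecutionRoleArn=role_arn,
        PrimaryContainer=model_package_container(model_package_arn),
        EnableNetworkIsolation=True,  # typically required for Marketplace packages
    )
    sm.create_endpoint_config(
        EndpointConfigName=f"{model_name}-config",
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
        }],
    )
    sm.create_endpoint(
        EndpointName=model_name,
        EndpointConfigName=f"{model_name}-config",
    )
```

The default instance type matches the priced dimension above; for batch workloads you would instead create a transform job against the same model.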
Version release notes
This vision-language model balances accuracy, cost, and performance for production OCR and structured extraction pipelines. It achieves 90% accuracy on OCRBench, the highest in its class, delivering enterprise-grade reliability for mission-critical document processing, and excels at complex structured extraction from forms, financial documents, medical records, legal contracts, and technical diagrams.
Additional details
Inputs
- Summary
1. Chat Completion
Example payload:

```json
{
  "model": "/opt/ml/model",
  "messages": [
    {"role": "system", "content": "You are a helpful medical assistant."},
    {"role": "user", "content": "What should I do if I have a fever and body aches?"}
  ],
  "max_tokens": 1024,
  "temperature": 0.6
}
```

For additional parameters, see the ChatCompletionRequest schema in the OpenAI Chat API reference.
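To send such a payload to a deployed endpoint, a minimal Python sketch is shown below. The endpoint name and region are assumptions for illustration; the payload builder mirrors the example above.

```python
import json


def build_chat_payload(user_text, system_text="You are a helpful assistant.",
                       max_tokens=1024, temperature=0.6):
    """Assemble the OpenAI-style chat payload the container expects.

    "model" must always be "/opt/ml/model", SageMaker's fixed model path.
    """
    return {
        "model": "/opt/ml/model",
        "messages": [
            {"role": "system", "content": system_text},
            {"role": "user", "content": user_text},
        ],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def invoke(endpoint_name, payload, region="us-east-1"):
    """Invoke a real-time endpoint (requires boto3 and AWS credentials).

    endpoint_name is whatever you chose when creating the endpoint.
    """
    import boto3  # lazy import keeps the payload helper dependency-free
    runtime = boto3.client("sagemaker-runtime", region_name=region)
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    return json.loads(response["Body"].read())
```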
2. Text Completion
Single prompt example:

```json
{
  "model": "/opt/ml/model",
  "prompt": "How can I maintain good kidney health?",
  "max_tokens": 512,
  "temperature": 0.6
}
```

Multiple prompts example:

```json
{
  "model": "/opt/ml/model",
  "prompt": [
    "How can I maintain good kidney health?",
    "What are the best practices for kidney care?"
  ],
  "max_tokens": 512,
  "temperature": 0.6
}
```
3. Image + Text Inference
The model supports both online (direct URL) and offline (base64-encoded) image inputs.
Online image example:

```json
{
  "model": "/opt/ml/model",
  "messages": [
    {"role": "system", "content": "You are a helpful medical assistant."},
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What does this medical image show?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
      ]
    }
  ],
  "max_tokens": 2048,
  "temperature": 0.1
}
```

Offline image example (base64):

```json
{
  "model": "/opt/ml/model",
  "messages": [
    {"role": "system", "content": "You are a helpful medical assistant."},
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What does this medical image show?"},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}}
      ]
    }
  ],
  "max_tokens": 2048,
  "temperature": 0.1
}
```
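For the offline case, a small helper can turn raw image bytes into the data-URL form and build the mixed text+image content array. This is a sketch following the payload shapes above; function names are illustrative, not part of the vendor API.

```python
import base64


def image_to_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URL for offline (base64) image input."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"


def image_content(text: str, image_url: str) -> list:
    """Build the mixed text+image "content" array used in the examples above.

    image_url may be an https:// URL (online) or a data: URL (offline).
    """
    return [
        {"type": "text", "text": text},
        {"type": "image_url", "image_url": {"url": image_url}},
    ]
```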
4. Structured Output (JSON Schema)
Force the model to output valid JSON matching a specific schema using response_format.
Example with schema:

```json
{
  "model": "/opt/ml/model",
  "messages": [
    {"role": "system", "content": "Extract patient information as JSON."},
    {"role": "user", "content": "Patient John Doe, age 45, has hypertension."}
  ],
  "temperature": 0.0,
  "max_tokens": 512,
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "patient_info",
      "strict": true,
      "schema": {
        "type": "object",
        "required": ["name", "age", "conditions"],
        "properties": {
          "name": {"type": "string"},
          "age": {"type": "integer"},
          "conditions": {"type": "array", "items": {"type": "string"}}
        }
      }
    }
  }
}
```
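On the client side, the response_format envelope can be built from a plain schema dict, and the model's output sanity-checked for the required keys. This is an illustrative sketch, not vendor code; for production use a full JSON Schema validator rather than the lightweight check shown here.

```python
import json

# Schema from the extraction example above.
PATIENT_SCHEMA = {
    "type": "object",
    "required": ["name", "age", "conditions"],
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "conditions": {"type": "array", "items": {"type": "string"}},
    },
}


def response_format(name: str, schema: dict) -> dict:
    """Wrap a JSON Schema in the response_format envelope shown above."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": schema},
    }


def check_required(raw: str, schema: dict) -> dict:
    """Parse model output and verify the schema's required keys are present."""
    data = json.loads(raw)
    missing = [k for k in schema.get("required", []) if k not in data]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    return data
```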
Important Notes:
- Streaming Responses: add "stream": true to your request payload to enable streaming
- Model Path Requirement: always set "model": "/opt/ml/model" (SageMaker's fixed model location)
- Input MIME type: application/json
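The streaming note above can be sketched in Python as follows: flip the flag on an existing payload and iterate over the response event stream. Endpoint name is a placeholder; this assumes boto3's streaming invocation API and is not vendor-supplied code.

```python
import json


def enable_streaming(payload: dict) -> dict:
    """Return a copy of the payload with "stream": true set, leaving the original intact."""
    return {**payload, "stream": True}


def stream_endpoint(endpoint_name: str, payload: dict):
    """Yield raw byte chunks from a streaming invocation (requires boto3/AWS creds)."""
    import boto3
    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint_with_response_stream(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(enable_streaming(payload)),
    )
    for event in response["Body"]:
        yield event["PayloadPart"]["Bytes"]
```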
Support
Vendor support
For any assistance, please reach out to support@johnsnowlabs.com.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.