Overview
Nanonets is an intelligent document processing API that transforms unstructured PDFs, images, and scanned documents into clean, structured data. Our vision-language models extract text, tables, and fields with high accuracy - no templates or manual configuration required.

Key Capabilities:
- PDF to markdown conversion optimized for LLM context windows
- Complex table extraction including merged cells and nested headers
- Multi-language OCR with 100+ languages supported
- LaTeX equation recognition for technical documents
- Signature, watermark, and checkbox detection
- Field-level confidence scores for human-in-the-loop validation
- Output formats: Markdown, JSON, HTML

Built for AI Applications: Nanonets delivers LLM-ready output ideal for Retrieval-Augmented Generation (RAG) pipelines, knowledge bases, and document search. Unlike traditional OCR that outputs plain text, we preserve semantic structure - headings, lists, tables, and visual hierarchy - enabling better chunking, more accurate retrieval, and reduced hallucinations.

Enterprise Ready:
- SOC 2 Type II certified, GDPR and HIPAA compliant
- Native AWS integration: S3, Lambda, Bedrock, EventBridge
- Scales to millions of pages with 99.9% uptime
- Trusted by finance, healthcare, insurance, and legal enterprises
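
To illustrate why preserved headings help with chunking, here is a minimal sketch that splits a markdown string into one chunk per heading. It does not call the Nanonets API; the function name `chunk_markdown_by_heading` and the sample input are hypothetical, and the splitter is deliberately naive (it would also split on `#` lines inside fenced code).

```python
import re

# Hypothetical helper, not part of any Nanonets SDK: split markdown
# into chunks, one per heading-delimited section.
HEADING = re.compile(r"^#{1,6}\s+\S")


def chunk_markdown_by_heading(markdown: str) -> list[dict]:
    chunks = []
    heading, lines = None, []

    def flush():
        # Keep a section only if it has a heading or some non-blank text.
        text = "\n".join(lines).strip()
        if heading is not None or text:
            chunks.append({"heading": heading, "text": text})

    for line in markdown.splitlines():
        if HEADING.match(line):
            flush()
            heading, lines = line.lstrip("#").strip(), []
        else:
            lines.append(line)
    flush()
    return chunks


# Made-up sample standing in for markdown produced from a document.
sample = "# Invoice\nTotal: 10\n\n## Items\n| a | b |\n|---|---|\n| 1 | 2 |\n"
chunks = chunk_markdown_by_heading(sample)
```

Each chunk keeps its heading alongside its text, so a retrieval index can store the heading as metadata and a table is never cut in half by a fixed-size splitter.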
Highlights
- Intelligent document processing API that extracts structured data from PDFs, images, and scanned documents without templates. Industry-leading accuracy on complex tables, multi-column layouts, and forms. Supports 100+ languages, LaTeX equations, signatures, watermarks, and checkboxes. Field-level confidence scores enable human-in-the-loop validation for high-stakes workflows.
- Purpose-built for LLM and RAG applications. Converts documents to clean, structured markdown that preserves semantic hierarchy - headings, lists, tables, and visual structure. Optimized for better chunking, more accurate retrieval, and reduced hallucinations in AI pipelines. Output formats include markdown, JSON, and HTML.
- Enterprise-ready with SOC 2 Type II, GDPR, and HIPAA compliance. Native AWS integration with S3 for document ingestion, Lambda for event-driven processing, and Bedrock for LLM workflows. Scales to millions of pages with consistent latency and 99.9% uptime. Trusted by enterprises in financial services, healthcare, insurance, and legal industries.
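
As a rough illustration of how field-level confidence scores can drive human-in-the-loop validation, the sketch below routes fields to auto-acceptance or manual review. The field layout (`name`, `value`, `confidence`), the function `route_fields`, and the 0.9 threshold are all assumptions made for this example, not the documented Nanonets response schema.

```python
# Hypothetical field records; the real API response shape may differ.
extracted_fields = [
    {"name": "invoice_number", "value": "INV-1042", "confidence": 0.98},
    {"name": "total_amount", "value": "1,250.00", "confidence": 0.71},
    {"name": "due_date", "value": "2024-07-01", "confidence": 0.93},
]


def route_fields(fields, threshold=0.9):
    """Split fields into (accepted, needs_review) by confidence."""
    accepted, needs_review = [], []
    for field in fields:
        if field["confidence"] >= threshold:
            accepted.append(field)
        else:
            needs_review.append(field)
    return accepted, needs_review


accepted, needs_review = route_fields(extracted_fields)
```

With the sample data, only `total_amount` falls below the threshold and would be sent to a reviewer. In practice the threshold would be tuned per field and per workflow.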
Details
Pricing
| Dimension | Description | Cost/request |
|---|---|---|
| Pages Processed | Number of document pages processed through the Nanonets document AI API | $0.01 |
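
For budgeting, the per-page rate in the table above can be turned into a quick estimate. This is a minimal sketch that assumes the listed $0.01 is charged for every processed page with no volume tiers or minimums; the function name `estimate_cost` and the page count are illustrative.

```python
from decimal import Decimal

# Rate taken from the pricing table; flat per-page billing is an assumption.
PRICE_PER_PAGE = Decimal("0.01")


def estimate_cost(pages: int) -> Decimal:
    """Return the estimated charge in USD for a number of processed pages."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return PRICE_PER_PAGE * pages


# Example: 25,000 pages in a month.
monthly_cost = estimate_cost(25_000)
print(f"${monthly_cost}")
```

`Decimal` is used instead of floats so that currency arithmetic stays exact.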
Vendor refund policy
Nanonets offers refunds for unused prepaid credits within 30 days of purchase. Usage-based charges for successfully processed pages are non-refundable. If you experience technical issues or service failures that prevent successful document processing, please contact us for a credit or refund review. To request a refund, email support@nanonets.com with your AWS account ID, transaction details, and reason for the request. Refund requests are typically reviewed within 5 business days.
Delivery details
Software as a Service (SaaS)
SaaS delivers cloud-based software applications directly to customers over the internet. You can access these applications through a subscription model. You will pay recurring monthly usage fees through your AWS bill, while AWS handles deployment and infrastructure management, ensuring scalability, reliability, and seamless integration with other AWS services.
Support
Vendor support
Support email: support@nanonets.com
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.