Overview
Articul8 Table Understanding Agent is a lightweight, production-ready GenAI agent that transforms dense, unstructured documents into clean, machine-readable table data in seconds. Designed for enterprises that depend on PDFs, reports, scanned images, and operational documents, the agent uses multimodal GenAI to interpret tables, not just extract them, returning structured outputs in a standardized list-of-lists format ready for analytics pipelines, downstream systems, or other AI agents. Powered by advanced layout-parsing and GenAI table-reasoning models, the agent accurately reconstructs rows, columns, merged cells, and hierarchies, even in noisy scans, multi-table pages, or irregular formats. It handles processing entirely in memory and securely discards files after each request, supporting strict enterprise privacy, governance, and regulatory requirements. Built on scalable, AWS-native infrastructure with multi-tenancy, usage-based billing, and low operational overhead, Articul8 Table Understanding Agent allows organizations to automate reporting workflows, accelerate research and audit processes, enrich compliance reviews, and unlock structured insights from their document landscape, without building or maintaining a custom document-processing stack.
Highlights
- Multimodal GenAI Table Understanding: Accurately detects, interprets, and reconstructs complex tables from PDFs, reports, and scanned images, producing clean, structured data ready for downstream workflows.
- Enterprise-grade Accuracy & Governance: Rebuilds rows, columns, merged cells, and hierarchies, even in noisy or irregular documents, while processing entirely in memory and discarding files after each request to meet strict privacy and compliance requirements.
- AWS-native Scalability with Zero Maintenance: Delivered as a lightweight, production-ready agent with multi-tenancy and usage-based billing, enabling organizations to automate research, reporting, and compliance pipelines without owning a document-processing stack.
Details
Pricing
| Dimension | Cost/request |
|---|---|
| Number of tables extracted per API request | $0.40 |
Vendor refund policy
Articul8 bills based on the number of successfully extracted tables, not the number of API calls.
Requests that return zero tables are not billed.
Failed, incomplete, or misclassified extractions are excluded from billing, and refunds or credits may be issued if a table was incorrectly counted.
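As a rough illustration of the billing rule above, the sketch below estimates a charge from per-request table counts. It assumes the listed $0.40 applies to each successfully extracted table, which is how the refund policy reads; `estimate_cost` and `PRICE_PER_TABLE` are hypothetical names, not part of the service.

```python
from decimal import Decimal

# Assumption: the listed $0.40 is charged per successfully extracted table.
PRICE_PER_TABLE = Decimal("0.40")

def estimate_cost(tables_per_request):
    """Sum the extracted tables across requests and multiply by the listed price."""
    billable = sum(n for n in tables_per_request if n > 0)
    return billable * PRICE_PER_TABLE

# Three requests returning 3, 0, and 5 tables: 8 billable tables in total.
print(estimate_cost([3, 0, 5]))  # prints 3.20
```

The request that returned zero tables contributes nothing to the total, matching the policy that such requests are not billed.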
Delivery details
API-Based Agents & Tools
API-Based Agents and Tools integrate through standard web protocols. Your applications can make API calls to access agent capabilities and receive responses.
Additional details
Usage instructions
API
Articul8 Table Understanding Agent
The Articul8 Table Understanding Agent is a GenAI-powered document-processing service that extracts structured tables from PDF files and images. Built as a lightweight, secure microservice, it converts unstructured documents into clean, machine-readable table formats for analytics, automation, and multi-agent workflows.
Using document layout analysis and multimodal models, the agent performs real-time table detection and extraction, returning each table as a canonical list-of-lists structure. Results are streamed as they are generated, enabling low-latency processing even for large or multi-table documents.
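A list-of-lists table can be turned into keyed records with a few lines of Python. The sketch below assumes the first row of each table holds the column headers, which the listing does not guarantee; `table_to_records` is a hypothetical helper, and the sample values are taken from the `table_extracted` example on this page.

```python
def table_to_records(table):
    """Treat the first row as the header and return one dict per remaining row.

    Assumption: row 0 contains column names. Tables without a header row
    would need different handling.
    """
    if not table:
        return []
    header, *rows = table
    return [dict(zip(header, row)) for row in rows]

sample = [["GRADE", "STANDARD"], ["50", "AMS 3302"]]
print(table_to_records(sample))  # prints [{'GRADE': '50', 'STANDARD': 'AMS 3302'}]
```

Keeping the raw list-of-lists and deriving records on demand preserves the original row order and any duplicate header names that a dict would otherwise collapse.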
Key Benefits
- Automated table extraction from PDFs and images
- Standardized list-of-lists JSON output for direct pipeline ingestion
- In-memory processing with no persistent storage
- Synchronous REST API with predictable latency
- Seamless integration with downstream agents and automation tools
Quick Start
Step 1: Authenticate
All requests must include:
Authorization: Bearer <your_token>
Step 2: Send a POST Request
Upload your file as multipart form-data. Supported formats: .pdf, .png, .jpg, .jpeg.
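Before uploading, a client can check the file against the supported formats listed above. The following is a minimal sketch; `validate_upload` is a hypothetical helper, and `max_bytes` is a caller-chosen limit because the listing does not state a maximum file size.

```python
from pathlib import Path

# Extensions taken from the supported formats listed above.
SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg"}

def validate_upload(filename, size_bytes, max_bytes):
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {suffix or '(none)'}")
    if size_bytes <= 0:
        problems.append("file is empty")
    elif size_bytes > max_bytes:
        problems.append(f"file exceeds {max_bytes} bytes")
    return problems

# max_bytes here is an arbitrary illustrative limit, not a documented one.
print(validate_upload("scan.tiff", 1024, max_bytes=10_000_000))
```

Checking the extension only catches obvious mismatches; the service may still reject a file whose contents do not match its name.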
Processing begins immediately and runs synchronously.
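The request described in Steps 1 and 2 could be assembled with the Python standard library roughly as follows. This is a sketch under stated assumptions: the endpoint URL is a placeholder, the form field name `file` is a guess, and `build_extraction_request` is a hypothetical helper. Only the bearer token header and the multipart form-data upload come from this page.

```python
import mimetypes
import urllib.request
import uuid

# Placeholder: substitute the endpoint URL from your subscription details.
ENDPOINT_URL = "https://example.invalid/extract"

def build_extraction_request(filename, file_bytes, token,
                             url=ENDPOINT_URL, field_name="file"):
    """Build (but do not send) a multipart/form-data POST carrying one file.

    Assumption: the service reads the upload from a form field named "file".
    """
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'.encode(),
        f"Content-Type: {content_type}\r\n\r\n".encode(),
        file_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

request = build_extraction_request("report.pdf", b"%PDF-1.4 placeholder bytes", "<your_token>")
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen(request, timeout=...)` would send it. Because processing runs synchronously, a generous timeout is advisable for large documents.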
Step 3: Receive Streamed or Synchronous Output
Returns:
- Number of tables
- Each table as list-of-lists JSON
- Table metadata (page, region, row/column counts)
- Base64-encoded ZIP of CSVs (one per table)
- Progress events during extraction
If no tables are found:
{ "success": true, "data": { "table_count": 0, "tables": [] } }
Extraction Endpoint
Required Headers
Example Request
Streaming (SSE) Output
The agent uses Server-Sent Events, streaming each extraction step:
- progress_update — emitted as processing advances
- table_extracted — one event per table, including content and metadata
- complete — final event containing all tables, processing time, status, and Base64 ZIP
- error — sent if extraction fails; no further events follow
Large documents may take longer depending on page count, table density, and scan quality.
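The event sequence above can be consumed with a small line-based parser. The sketch below assumes standard SSE framing (an `event:` line, one or more `data:` lines, then a blank line) and JSON payloads, as the examples on this page suggest; `parse_sse` is a hypothetical helper, and the sample stream is hand-written rather than captured from the service.

```python
import json

def parse_sse(lines):
    """Yield (event_name, payload) pairs from an iterable of decoded SSE lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip())
        elif line == "" and data:
            # A blank line ends one event; multi-line data is joined with newlines.
            yield event, json.loads("\n".join(data))
            event, data = None, []
    if data:  # tolerate a stream that ends without a trailing blank line
        yield event, json.loads("\n".join(data))

stream = [
    "event: progress_update",
    'data: {"current_step": "image_processing_started", "type": "progress_update"}',
    "",
    "event: table_extracted",
    'data: {"table": [["GRADE", "STANDARD"], ["50", "AMS 3302"]], '
    '"table_rows": 2, "table_cols": 2, "table_index": 1}',
    "",
]

for name, payload in parse_sse(stream):
    if name == "table_extracted":
        print(payload["table_index"], payload["table"])
    elif name in ("complete", "error"):
        break  # both are terminal events according to the list above
```

Because `parse_sse` is a generator, each table can be handled as soon as its event arrives instead of waiting for the final `complete` event.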
Examples
progress_update
event: progress_update
data: { "current_step": "image_processing_started", "type": "progress_update" }
table_extracted
event: table_extracted
data: { "table": [["GRADE","STANDARD"],["50","AMS 3302"]], "table_rows": 2, "table_cols": 2, "table_index": 1 }
complete
Includes keys like:
"csv_zip_base64": "<Base64 ZIP>", "csv_zip_filename": "document_tables.zip"
error
{ "type": "error", "message": "Unsupported file format" }
CSV ZIP Archive
Each extraction produces:
- One CSV per table
- Filenames: {base_filename}_table_{i}.csv
- Final archive: {base_filename}_tables.zip
Returned as Base64 in the final event:
"csv_zip_base64": "UEsDBBQAAAAI...", "csv_zip_filename": "sample_tables.zip"
Clients must decode, save, and unzip.
Best Practices
- Validate file size and type before upload
- Ensure adequate client timeout for large documents
- Process streamed tables incrementally
- Decode CSV ZIP only after receiving the final event
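For the last practice, one way to unpack the archive from the final event is sketched below. It reads the `csv_zip_base64` key shown on this page, assumes the CSVs are UTF-8 encoded, and works entirely in memory rather than saving to disk; `read_csv_zip` is a hypothetical helper, and the demo builds a stand-in archive so the sketch runs without calling the service.

```python
import base64
import csv
import io
import zipfile

def read_csv_zip(complete_event):
    """Decode the Base64 archive from the final event and return {csv_name: rows}."""
    raw = base64.b64decode(complete_event["csv_zip_base64"])
    tables = {}
    with zipfile.ZipFile(io.BytesIO(raw)) as archive:
        for name in archive.namelist():
            text = archive.read(name).decode("utf-8")  # assumption: UTF-8 CSVs
            tables[name] = list(csv.reader(io.StringIO(text, newline="")))
    return tables

# Stand-in archive that mimics the documented naming scheme.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("sample_table_1.csv", "GRADE,STANDARD\r\n50,AMS 3302\r\n")
event = {
    "csv_zip_base64": base64.b64encode(buffer.getvalue()).decode("ascii"),
    "csv_zip_filename": "sample_tables.zip",
}
print(read_csv_zip(event))
```

To keep a copy on disk instead, write the decoded bytes to the name given in `csv_zip_filename` before opening the archive.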
Support
Vendor support
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.