
    TextIn xParse v2

    Sold by: TextIn 
    Deployed on AWS
    TextIn xParse is a next-generation, LLM-powered document intelligence product that turns complex unstructured documents into structured, queryable data assets for relational and vector databases. It streamlines ETL pipelines and supports high-quality RAG Q&A, serving 1000+ leading enterprises worldwide across diverse document processing needs. This version is the passport parsing version.

    Overview

    Rebuilt from the ground up with LLMs to go beyond traditional OCR, TextIn xParse processes all types of complex documents, breaking arbitrary layouts into semantically complete paragraphs and restoring reading order so that the output suits large models. Its industry-leading table recognition resolves merged cells, multi-page tables, and borderless tables, and it integrates with image processing to handle watermarked and curved documents. As an intelligent ETL solution, it provides zero-shot key information extraction, cross-document retrieval, and intelligent document classification, addressing large-model pain points such as unstable output and length truncation. TextIn xParse generates high-quality chunks labeled with semantic relationships, coordinates, and chapter information to improve RAG Q&A accuracy and search efficiency, and it supports one-click import into mainstream RAG frameworks including RagFlow, Dify, and Coze. It provides a solid document infrastructure for enterprise scenarios such as knowledge Q&A, agent enablement, data entry, and data cleaning, automating unstructured data processing, reducing manual workload, and maximizing data asset value. Trusted by leading global enterprises, it delivers efficient, accurate document processing for mission-critical business scenarios.

    Highlights

    • LLM-Powered Document Parsing: Goes beyond OCR to convert complex unstructured documents into accurate structured output suited to large model applications
    • High-Quality RAG Empowerment: Generates semantically optimized chunks that improve Q&A accuracy and search efficiency, with one-click import into mainstream RAG frameworks
    • Intelligent ETL Pipeline: Provides zero-shot extraction, cross-document retrieval, and intelligent classification, maximizing the value of enterprise unstructured data assets

    Details

    Sold by: TextIn
    Delivery option: 64-bit (x86) Amazon Machine Image (AMI)
    Operating system: Ubuntu 20.04


    Features and programs

    Financing for AWS Marketplace purchases

    AWS Marketplace now accepts line of credit payments through the PNC Vendor Finance program. This program is available to select AWS customers in the US, excluding NV, NC, ND, TN, & VT.

    Pricing

    TextIn xParse v2

    Pricing is based on actual usage, with charges varying according to how much you consume. Subscriptions have no end date and may be canceled any time.
    Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

    Usage costs (1)

    Dimension                  Cost/hour
    t3a.large (recommended)    $10.00

    Vendor refund policy

    Non-refundable.


    Legal

    Vendor terms and conditions

    Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information


    Delivery details

    64-bit (x86) Amazon Machine Image (AMI)

    Amazon Machine Image (AMI)

    An AMI is a virtual image that provides the information required to launch an instance. Amazon EC2 (Elastic Compute Cloud) instances are virtual servers on which you can run your applications and workloads, offering varying combinations of CPU, memory, storage, and networking resources. You can launch as many instances from as many different AMIs as you need.

    Version release notes

    TextIn Document Parser AMI - Initial release. Supports PDF, DOCX, TXT parsing and outputs clean, structured JSON data for AI systems.

    Additional details

    Usage instructions

    TextIn Document Parser AMI Usage Instructions

    Overview
    This AMI provides a pre-installed, native document parsing service for Ubuntu 20.04 LTS. It converts PDF, DOCX, and TXT files into structured, AI-ready JSON data optimized for LLMs, Agents, and RAG systems.

    Base OS: Ubuntu 20.04 LTS
    Default username: ubuntu
    SSH port: 22
    Service port: 30006

    1. Connect to Your EC2 Instance
    From the Amazon EC2 console, obtain the public IP or DNS of your instance. Connect using SSH with your AWS key pair:
        ssh -i "your-key-pair.pem" ubuntu@<instance-public-ip>
    Type yes to confirm the host key on first connection.

    2. Verify the Service Status
    The TextIn service starts automatically on boot. Check service status:
        sudo systemctl status textin-parser.service
    If inactive, start and enable it:
        sudo systemctl start textin-parser.service
        sudo systemctl enable textin-parser.service
    Verify health:
        curl http://localhost:30006/health
    A healthy response returns:
        {"status":"healthy"}

    3. Use the Document Parsing API
    Submit documents for parsing:
        curl -X POST -F "file=@/path/to/your/document.pdf" http://<instance-public-ip>:30006/parse
    Supported formats: PDF, DOCX, TXT
    Output: Clean structured JSON

    4. License Activation (Optional)
    For enterprise features:
    Run the license setup script:
        ./1-install_licserver.sh
    Send the generated machine fingerprint (seed.txt) to simon_liu@intsig.net to obtain a license.
    Place your license file in /home/ubuntu/licFile/.
    Apply the license:
        ./2-apply_license.sh

    5. Service Management
    Stop service:
        sudo systemctl stop textin-parser.service
    View real-time logs:
        sudo journalctl -u textin-parser.service -f

    6. Security Best Practices
    Open only ports 22 (SSH) and 30006 (API) in your security group.
    Restrict SSH access to trusted IP ranges.
    Do not expose the service to 0.0.0.0/0 in production.

    7. Troubleshooting
    Service not running: Check logs with journalctl.
    API unreachable: Verify port 30006 is open in the security group.
    License issues: Confirm seed and license file match.
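
    The health check in step 2 and the upload in step 3 can be combined into a small helper script. The following is a minimal sketch, not vendor-supplied tooling: the function names and the TEXTIN_HOST and TEXTIN_PORT variables are hypothetical, while the /health and /parse endpoints, port 30006, and the healthy response body come from the instructions above. It assumes curl is installed.

```shell
#!/bin/sh
# Hypothetical helpers; only the endpoints and port come from the usage instructions.
TEXTIN_HOST="${TEXTIN_HOST:-localhost}"
TEXTIN_PORT="${TEXTIN_PORT:-30006}"

# Build a service URL for the given path, e.g. /health or /parse.
textin_url() {
  printf 'http://%s:%s%s\n' "$TEXTIN_HOST" "$TEXTIN_PORT" "$1"
}

# Succeed if a /health response body reports "status":"healthy".
# Whitespace is stripped first so formatting differences do not matter.
is_healthy() {
  compact=$(printf '%s' "$1" | tr -d '[:space:]')
  case "$compact" in
    *'"status":"healthy"'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Poll /health up to N times (default 10), sleeping 3 seconds after each failed attempt.
wait_for_health() {
  attempts="${1:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if body=$(curl -fsS "$(textin_url /health)" 2>/dev/null) && is_healthy "$body"; then
      return 0
    fi
    sleep 3
    i=$((i + 1))
  done
  return 1
}

# Upload one file to /parse and save the response next to it as <file>.json.
parse_document() {
  curl -fsS -X POST -F "file=@$1" "$(textin_url /parse)" -o "$1.json"
}
```

    After sourcing the script, a typical session might look like this, where the second line polls up to 20 times before uploading:
        TEXTIN_HOST=<instance-public-ip>
        wait_for_health 20 && parse_document /path/to/your/document.pdf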

    Last updated: March 2026
    Support: For license and service support, contact simon_liu@intsig.net.
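
    The security group guidance in step 6 can be applied with the AWS CLI. The sketch below is an illustration under stated assumptions: textin_sg_rules is a hypothetical helper, and the security group ID and CIDR range are placeholders you supply. It only prints the aws ec2 authorize-security-group-ingress commands so they can be reviewed before running, and it refuses 0.0.0.0/0 outright, which is stricter than the guidance above.

```shell
#!/bin/sh
# Hypothetical helper: print AWS CLI commands that open ports 22 (SSH) and
# 30006 (API) to a single trusted CIDR range. Nothing is executed here.
textin_sg_rules() {
  sg_id="$1"
  cidr="$2"
  if [ -z "$sg_id" ] || [ -z "$cidr" ]; then
    echo "usage: textin_sg_rules <security-group-id> <trusted-cidr>" >&2
    return 2
  fi
  # Assumption of this sketch: never open the ports to the whole internet.
  if [ "$cidr" = "0.0.0.0/0" ]; then
    echo "refusing to open ports to 0.0.0.0/0" >&2
    return 1
  fi
  for port in 22 30006; do
    echo "aws ec2 authorize-security-group-ingress --group-id $sg_id --protocol tcp --port $port --cidr $cidr"
  done
}
```

    For example, the following prints two commands, one per port, which can then be run once they look correct:
        textin_sg_rules <security-group-id> <trusted-cidr>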

    Support

    Vendor support

    Contact email: sheng_song@intsig.net
    URL: https://www.textin.ai/contact
    Support hours: 8 hours a day, 5 working days a week
    Buyers receive professional, comprehensive technical support and after-sales service for TextIn products.

    AWS infrastructure support

    AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.
