    RAGFlow - Hardened Self-Hosted RAG Engine with Deep Document AI

    Sold by: Lynxroute 
    Deployed on AWS
    Free Trial
    This product has charges associated with it for hardening, security configuration, and support. RAGFlow is a self-hosted Retrieval-Augmented Generation engine that turns enterprise documents (PDFs, scans, tables, slides) into citation-grounded answers via LLM agents - with deep document parsing, hybrid vector + full-text search, and a multi-tenant web UI. Unlike bare RAGFlow AMIs that ship with default passwords, no TLS, and ports wide open, this Lynxroute build is ready out of the box: per-instance rotated credentials for MySQL, Elasticsearch, MinIO and Redis at first boot, self-registration with auto-close after first signup, Nginx TLS reverse proxy, on a CIS Level 1 hardened Ubuntu 24.04 LTS base. Apache 2.0 license - fully auditable, no vendor lock-in.

    Overview

    This is a repackaged software product wherein additional charges apply for hardening, security configuration, and support.

    WHAT IS RAGFLOW

    RAGFlow is an open-source Retrieval-Augmented Generation engine for enterprise document intelligence. Its deep document understanding layer (deepdoc) parses PDFs, Word files, slides, spreadsheets, scanned books and images with layout recognition, table extraction and chunk-level citation tracking, so every answer the LLM produces is traceable back to a paragraph, page, table cell or figure in the original source. RAGFlow ingests files into knowledge bases, builds hybrid vector plus full-text indices in Elasticsearch, stores blobs in MinIO, and exposes a multi-tenant web UI, a REST API, and an MCP server for chat, agents, and programmatic retrieval. It supports many LLM providers through a single dropdown - OpenAI, Anthropic Claude, AWS Bedrock, Azure OpenAI, Google Gemini, Cohere, Mistral, DeepSeek, Ollama, vLLM, and others. Agent workflows let the model browse, run SQL, call tools, and chain retrieval over multiple knowledge bases. Apache 2.0 license, no vendor lock-in.

    WHAT THIS AMI ADDS

    Security hardening:

    • Self-registration with auto-close - the first user who signs up in the web UI becomes the workspace owner, then the registration endpoint is closed automatically by a background service
    • No baked-in admin email, no default credentials shipped on disk - the customer owns the identity layer
    • MySQL, Elasticsearch, MinIO and Redis passwords rotated at first boot (>=32 chars each), never baked into the AMI
    • Upstream Go admin server (license-tracker module) explicitly NOT enabled - the AMI runs the pure Apache 2.0 Python server only
    • Containers bound to the docker bridge - no backing store (MySQL, Elasticsearch, MinIO, Redis) is reachable from outside the host
    • Nginx reverse proxy with TLS, HTTP-to-HTTPS redirect, WebSocket support for streaming chat, security headers
    • UFW firewall pre-configured - only TCP 22, 80, 443 are exposed
    • fail2ban, AppArmor
    • CVE scan - every image is scanned for vulnerabilities before release
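
    The per-instance password rotation described above can be illustrated with a minimal sketch. This is not the AMI's actual first-boot script, which the listing does not show; the function names and the alphanumeric alphabet are assumptions made for illustration. Only the service names and the 32-character minimum come from the listing.

```python
import secrets
import string

# Assumption: an alphanumeric alphabet. The real first-boot script may differ.
ALPHABET = string.ascii_letters + string.digits
MIN_LENGTH = 32  # the listing states >=32 chars per password


def generate_password(length: int = MIN_LENGTH) -> str:
    """Return a random password from a cryptographically secure source."""
    if length < MIN_LENGTH:
        raise ValueError(f"password must be at least {MIN_LENGTH} characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def rotate_all(services=("mysql", "elasticsearch", "minio", "redis")) -> dict[str, str]:
    """Hypothetical helper: one independent password per backing store."""
    return {name: generate_password() for name in services}


if __name__ == "__main__":
    # Print only the lengths so no secret ends up in a terminal log.
    for name, password in rotate_all().items():
        print(f"{name}: {len(password)} chars")
```

    Drawing each password independently from the secrets module means no two instances, and no two services on one instance, share a credential.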

    Out of the box, with no external services:

    • Elasticsearch 8.11.3 vector + full-text store with xpack.security enabled
    • Embedding model bundled inside the upstream ragflow image - no external embedding API key required
    • MinIO blob store for document originals
    • MySQL 8 for users, tenants, knowledge bases, datasets
    • Valkey 8 (BSD-3, OSS Redis fork) for cache + task queue

    OS hardening (CIS Level 1):

    • CIS Ubuntu 24.04 LTS Level 1 benchmark applied via ansible-lockdown
    • auditd, SSH hardening, kernel hardening, IMDSv2 enforced

    Compliance artifacts:

    • SBOM - CycloneDX 1.6 at /etc/lynxroute/sbom.json
    • CIS Conformance Report at /etc/lynxroute/cis-report.html
    • CIS Tailored Profile at /usr/share/doc/lynxroute/CIS_TAILORED_PROFILE.md
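
    As a rough illustration of how the SBOM could be inspected on a running instance, the sketch below lists component names and versions from a CycloneDX JSON document. The file path is the one given above; the helper name list_components is hypothetical, and the script assumes the file follows the standard CycloneDX JSON layout with a top-level "components" array.

```python
import json
from pathlib import Path

SBOM_PATH = Path("/etc/lynxroute/sbom.json")  # path taken from the listing


def list_components(bom: dict) -> list[tuple[str, str]]:
    """Return sorted (name, version) pairs from a CycloneDX JSON document.

    Missing fields are shown as "?" because "version" is optional in
    recent CycloneDX versions.
    """
    return sorted(
        (component.get("name", "?"), component.get("version", "?"))
        for component in bom.get("components", [])
    )


if __name__ == "__main__":
    if SBOM_PATH.exists():
        bom = json.loads(SBOM_PATH.read_text())
        print(bom.get("bomFormat"), bom.get("specVersion"))
        for name, version in list_components(bom):
            print(f"{name} {version}")
    else:
        print(f"{SBOM_PATH} not found - run this on the instance")
```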

    Highlights

    • RAGFlow security baked in: random per-instance passwords for all backing stores, self-registration auto-closed after first signup, Nginx TLS reverse proxy - unlike bare RAGFlow AMIs that ship with default passwords, the admin port wide open, and no TLS.
    • CIS Level 1 hardened Ubuntu 24.04 LTS: auditd, fail2ban, AppArmor, SSH key-only, IMDSv2 enforced. CVE-scanned before every release. SBOM (CycloneDX) and CIS Conformance Report included.
    • Deep document RAG works out of the box: Elasticsearch 8.11 vector + full-text store, MinIO blob storage, embedding model bundled in the ragflow image, deepdoc parser for PDFs, scans, tables, slides. Sign up, upload documents, chat with citations. Add provider API keys (OpenAI, Anthropic, Bedrock, Ollama) later in the web UI. Apache-2.0 license - fully auditable, no vendor lock-in.

    Details

    Delivery method

    Delivery option
    64-bit (x86) Amazon Machine Image (AMI)

    Latest version

    Operating system
    Ubuntu 24.04

    Pricing

    Free trial

    Try this product free for 5 days according to the free trial terms set by the vendor. Usage-based pricing is in effect for usage beyond the free trial terms. Your free trial gets automatically converted to a paid subscription when the trial ends, but may be canceled any time before that.

    Pricing is based on actual usage, with charges varying according to how much you consume. Subscriptions have no end date and may be canceled any time.
    Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

    Usage costs (2)

    Dimension                   Cost/hour
    m6i.xlarge (Recommended)    $0.05
    g4dn.xlarge                 $0.05
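
    To make the hourly rate concrete, here is a small sketch that estimates the software charge only. The $0.05 rate and the 5-day trial come from this page; the 730-hour month and the way trial hours are subtracted are simplifying assumptions, and the vendor's free trial terms govern the real billing. AWS infrastructure charges (EC2, EBS, data transfer) are not included.

```python
SOFTWARE_RATE_USD_PER_HOUR = 0.05  # from the usage costs table
FREE_TRIAL_HOURS = 5 * 24          # 5-day free trial stated on this page
HOURS_PER_MONTH = 730              # assumption: average month length


def software_charge(hours_running: float, trial_hours_left: float = 0.0) -> float:
    """Estimate the software charge in USD, excluding AWS infrastructure.

    Assumption: trial hours simply offset the first billable hours.
    """
    billable_hours = max(0.0, hours_running - trial_hours_left)
    return round(billable_hours * SOFTWARE_RATE_USD_PER_HOUR, 2)


if __name__ == "__main__":
    print("Full month, no trial:", software_charge(HOURS_PER_MONTH))
    print("First month with trial:", software_charge(HOURS_PER_MONTH, FREE_TRIAL_HOURS))
```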

    Vendor refund policy

    We do not offer refunds for this product. AWS infrastructure charges (EC2, EBS, data transfer) are billed separately by AWS and are not refundable by us.

    Legal

    Vendor terms and conditions

    Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information

    Delivery details

    64-bit (x86) Amazon Machine Image (AMI)

    Amazon Machine Image (AMI)

    An AMI is a virtual image that provides the information required to launch an instance. Amazon EC2 (Elastic Compute Cloud) instances are virtual servers on which you can run your applications and workloads, offering varying combinations of CPU, memory, storage, and networking resources. You can launch as many instances from as many different AMIs as you need.

    Version release notes

    Version 0.25.4 - Initial release (May 2026)

    • RAGFlow 0.25.4 upstream Docker image (infiniflow/ragflow:v0.25.4) on Ubuntu 24.04 LTS
    • Docker Compose stack: ragflow_server + ragflow_mysql (8.0.39) + ragflow_es (8.11.3) + ragflow_minio + ragflow_redis (valkey/valkey:8)
    • CIS Level 1 hardening applied (ansible-lockdown/UBUNTU24-CIS)
    • CVE-scanned before every release
    • Self-registration enabled at first boot; auto-closed by a systemd watcher after the first user signs up
    • MySQL, Elasticsearch, MinIO and Redis passwords (>=32 chars each) rotated per instance at first boot
    • Upstream Go admin server (--enable-adminserver) explicitly NOT enabled - the AMI runs the Apache-2.0 Python server only
    • Backing stores (MySQL, Elasticsearch, MinIO, Redis) bound to the docker bridge only - not reachable from outside the host
    • ragflow container bound to 127.0.0.1:8088; host Nginx terminates TLS on 443
    • No provider API keys pre-configured - operator configures OpenAI, Anthropic, Bedrock, Ollama, etc. in the web UI
    • Persistent storage at /opt/ragflow/{mysql,es,minio,redis} - any of these can be moved to a dedicated EBS volume
    • UFW firewall pre-configured (TCP 22, 80, 443 only)
    • fail2ban, auditd, AppArmor pre-configured
    • SBOM (CycloneDX 1.6) at /etc/lynxroute/sbom.json
    • CIS Conformance Report (OpenSCAP) at /etc/lynxroute/cis-report.html
    • IMDSv2 enforced

    Additional details

    Usage instructions

    1. Launch instance (m6i.xlarge minimum - RAGFlow needs >=16 GB RAM; m6i.2xlarge for production document throughput)
    2. Configure the Security Group - allow TCP 443 (and TCP 22 for the SSH step below) from YOUR_IP/32 only, until you have registered the first user
    3. SSH: ssh -i key.pem ubuntu@<PUBLIC_IP>
    4. Read connection details: sudo cat /root/ragflow-credentials.txt
    5. Open https://<PUBLIC_IP>/ in your browser - accept the self-signed certificate warning
    6. Click "Sign up" - register with YOUR real email; the first registered user becomes the workspace owner
    7. Within ~30 seconds the registration endpoint auto-closes (check with: systemctl status ragflow-register-watch)
    8. Log in, go to "Model Providers" and configure your preferred LLM (OpenAI, Anthropic, AWS Bedrock, Ollama, etc.)
    9. Create a knowledge base, upload documents, and start chatting with citations

    No admin credentials ship in the AMI - the customer owns the workspace owner identity by registering it. Backing store passwords (MySQL, Elasticsearch, MinIO, Redis) are rotated at first boot and recorded in /root/ragflow-credentials.txt for operator reference; those services are reachable only inside the docker network. Replace the self-signed TLS certificate with a CA-signed certificate for production use (sudo certbot --nginx -d YOUR_DOMAIN).
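
    The Security Group restriction in the instructions above (TCP 443 from a single /32) can be expressed as a rule dictionary in the shape the EC2 API expects. This is a sketch, not vendor tooling: the helper name https_ingress_rule is hypothetical, 203.0.113.10 is a documentation-range placeholder address, and the group ID in the closing comment is a placeholder you would replace with your own.

```python
import ipaddress


def https_ingress_rule(operator_ip: str, port: int = 443) -> dict:
    """Build one EC2 IpPermissions entry allowing a single IPv4 address.

    Raises ValueError if operator_ip is not a valid IPv4 address.
    """
    address = ipaddress.ip_address(operator_ip)
    if address.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [
            {
                "CidrIp": f"{address}/32",
                "Description": "operator access until first signup",
            }
        ],
    }


if __name__ == "__main__":
    rule = https_ingress_rule("203.0.113.10")  # placeholder address
    print(rule)
    # With boto3 installed and AWS credentials configured, the rule could be
    # applied like this (the GroupId value is a placeholder):
    #   import boto3
    #   boto3.client("ec2").authorize_security_group_ingress(
    #       GroupId="sg-EXAMPLE", IpPermissions=[rule])
```

    Validating the address first avoids accidentally opening the port to a wider range through a typo; once the first user has registered, the rule can be widened as needed.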

    Support

    Vendor support

    Visit us online: https://lynxroute.com 

    For RAGFlow documentation: https://ragflow.io/docs/dev/
    For RAGFlow upstream issues:

    AWS infrastructure support

    AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.

    Customer reviews

    Ratings and reviews

    0 ratings, 0 reviews. No customer reviews yet.