    vLLM + Open WebUI - Hardened Self-Hosted OpenAI-Compatible LLM Server

    Sold by: Lynxroute 
    Deployed on AWS
    Free Trial
    This product has charges associated with it for hardening, security configuration, and support. vLLM is a high-throughput, OpenAI-compatible inference server for open-source LLMs (Python + PyTorch, CPU build), bundled with Open WebUI as a browser chat front end. Unlike bare vLLM AMIs that ship without TLS, without Bearer-token auth, without a web UI for non-developers, and with the API server listening on 0.0.0.0:8000, this Lynxroute build is ready out of the box: a random 32-byte API key is generated at first boot, vLLM and Open WebUI are bound to loopback behind Nginx TLS, a tiny default model (facebook/opt-125m) is preloaded so the API and chat work immediately, the UFW firewall is pre-configured, and the base is a CIS Level 1 hardened Ubuntu 24.04 LTS. Apache-2.0 (vLLM) and MIT (Open WebUI) - fully auditable, no vendor lock-in.

    Overview

    This is a repackaged software product wherein additional charges apply for hardening, security configuration, and support.

    WHAT IS VLLM

    vLLM is an open-source, high-throughput inference and serving engine for large language models, built in Python on top of PyTorch. It implements PagedAttention, continuous batching, and tensor parallelism to serve any HuggingFace-compatible transformer model (Llama, Mistral, Qwen, Phi, Gemma, OPT, GPT-J, MPT, Falcon, and 100+ more) through a fully OpenAI-compatible REST API. Any client built for OpenAI (openai-python, openai-node, LangChain, LlamaIndex, AnythingLLM, the OpenAI ChatGPT SDK) connects unchanged - just point the base URL at this instance and pass the local Bearer token. This AMI ships the CPU build of vLLM 0.21.0, bundled with Open WebUI 0.9.5 as a browser chat front end pre-wired to the local vLLM. Nothing is persisted outside the instance: models are cached in /var/lib/vllm/hf-cache, and chats and accounts are stored in /var/lib/open-webui. Apache-2.0 (vLLM) and MIT (Open WebUI), no vendor lock-in.
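
    To illustrate "point the base URL at this instance and pass the local Bearer token", here is a minimal sketch that builds a text-completion request using only the Python standard library. The function name build_completion_request and the example address are hypothetical; the /v1/completions path and the request fields follow the OpenAI completions format, and the default model name is the one this listing says is preloaded.

```python
import json
import urllib.request


def build_completion_request(base_url: str, api_key: str, prompt: str,
                             model: str = "facebook/opt-125m",
                             max_tokens: int = 32) -> urllib.request.Request:
    """Build (but do not send) a POST to <base_url>/completions.

    base_url is assumed to end in /v1, e.g. https://<PUBLIC_IP>/v1.
    """
    body = json.dumps(
        {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/completions",
        data=body,
        headers={
            # The Bearer token is the API key generated at first boot.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

    Sending the request with urllib.request.urlopen additionally needs a TLS context that accepts the instance's certificate, since the listing states that the default certificate is self-signed.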

    WHAT THIS AMI ADDS

    Security hardening:

    • Random 32-byte API key generated at first boot, written to /root/vllm-credentials.txt - never baked into the AMI; the same key is injected into Open WebUI so the chat UI authenticates to vLLM transparently
    • vLLM API server bound to 127.0.0.1:8000 only - reachable only through Nginx with TLS, with --api-key Bearer auth enforced on every /v1/* request
    • Open WebUI bound to 127.0.0.1:8080 only - reachable only through Nginx with TLS
    • First registered user in Open WebUI becomes the workspace administrator; no admin baked in
    • Nginx reverse proxy with TLS, HTTP-to-HTTPS redirect, WebSocket upgrade for streaming chat, security headers (X-Content-Type-Options, X-Frame-Options, Referrer-Policy)
    • Loading splash page served while the model warms up on first request
    • Anonymous telemetry disabled (VLLM_NO_USAGE_STATS, DO_NOT_TRACK, ANONYMIZED_TELEMETRY)
    • UFW firewall pre-configured - only TCP 22, 80, 443 are exposed; 8000 and 8080 explicitly denied
    • fail2ban, AppArmor
    • CVE scan - every image is scanned for vulnerabilities before release
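
    One way to spot-check the exposure described above is a TCP connect probe run from a machine outside the instance. This is a rough sketch, not a vendor-supplied tool: the function names are hypothetical, and a closed result for 8000 and 8080 only shows that a connection could not be opened from where the probe ran.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe_ports(host: str, ports=(22, 80, 443, 8000, 8080)) -> dict:
    """Map each port to whether it accepted a TCP connection."""
    return {port: is_port_open(host, port) for port in ports}
```

    Whether 22, 80, and 443 appear open also depends on the Security Group rules attached to the instance, not only on UFW.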

    Out of the box, with no external services:

    • Default tiny model facebook/opt-125m (~250 MB) loaded at first boot - chat and API work immediately, no HuggingFace token required
    • Swap to any HuggingFace-compatible model by editing /etc/vllm/server.env and restarting the service - no rebuild needed
    • No third-party LLM API keys baked in or required - everything runs on-instance

    OS hardening (CIS Level 1):

    • CIS Ubuntu 24.04 LTS Level 1 benchmark applied via ansible-lockdown
    • auditd, SSH hardening, kernel hardening, IMDSv2 enforced

    Compliance artifacts:

    • SBOM - CycloneDX 1.6 at /etc/lynxroute/sbom.json
    • CIS Conformance Report at /etc/lynxroute/cis-report.html
    • CIS Tailored Profile at /usr/share/doc/lynxroute/CIS_TAILORED_PROFILE.md
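
    As a small sketch of how the SBOM could be inspected on the instance, the following reads a CycloneDX JSON document and maps component names to versions. It assumes the file uses the standard CycloneDX top-level "components" array; the function names are hypothetical, and which components the vendor's SBOM actually lists is not stated in this listing.

```python
import json
from pathlib import Path


def load_sbom(path: str = "/etc/lynxroute/sbom.json") -> dict:
    """Parse the CycloneDX JSON file at the path given in the listing."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def component_versions(sbom: dict) -> dict:
    """Map component name -> version for every named component."""
    return {
        component["name"]: component.get("version", "unknown")
        for component in sbom.get("components", [])
        if "name" in component
    }
```

    For example, component_versions(load_sbom()) could be compared between two AMI releases to see which packages changed.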

    Highlights

    • vLLM security baked in: random 32-byte API key generated at first boot, vLLM and Open WebUI bound to 127.0.0.1 behind Nginx TLS, Bearer-token auth enforced on every /v1 request - unlike bare vLLM AMIs that expose port 8000 on 0.0.0.0 with no auth and no TLS.
    • CIS Level 1 hardened Ubuntu 24.04 LTS: auditd, fail2ban, AppArmor, SSH key-only, IMDSv2 enforced. CVE-scanned before every release. SBOM (CycloneDX) and CIS Conformance Report included.
    • Two ways to use the same engine: browser chat via Open WebUI for analysts and non-developers, OpenAI-compatible REST API at /v1 for openai-python, LangChain, LlamaIndex, AnythingLLM, and any other OpenAI client. Apache-2.0 (vLLM) and MIT (Open WebUI) - fully auditable, no vendor lock-in.

    Details

    Delivery method

    Delivery option
    64-bit (x86) Amazon Machine Image (AMI)

    Latest version

    Operating system
    Ubuntu 24.04

    Deployed on AWS

    Pricing

    Free trial

    Try this product free for 5 days according to the free trial terms set by the vendor. Usage-based pricing is in effect for usage beyond the free trial terms. Your free trial gets automatically converted to a paid subscription when the trial ends, but may be canceled any time before that.

    Pricing is based on actual usage, with charges varying according to how much you consume. Subscriptions have no end date and may be canceled any time.
    Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

    Usage costs (4)

    Dimension                   Cost/hour
    m6i.xlarge (Recommended)    $0.05
    t3.large                    $0.03
    m6i.large                   $0.03
    m6i.2xlarge                 $0.07
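
    As a rough illustration of the usage-based pricing, the sketch below multiplies the hourly software fee by billable hours. The rates are the ones listed on this page; the function name, the 730-hour month, and the way free-trial hours are subtracted are assumptions for illustration, and AWS infrastructure charges are not included.

```python
# Hourly software fee in USD per instance type, as listed on this page.
HOURLY_SOFTWARE_FEE_USD = {
    "m6i.xlarge": 0.05,
    "t3.large": 0.03,
    "m6i.large": 0.03,
    "m6i.2xlarge": 0.07,
}


def software_fee(instance_type: str, hours: float,
                 free_trial_hours: float = 0.0) -> float:
    """Estimate the software fee in USD, excluding EC2/EBS/data-transfer costs."""
    billable_hours = max(hours - free_trial_hours, 0.0)
    return round(HOURLY_SOFTWARE_FEE_USD[instance_type] * billable_hours, 2)
```

    With these assumptions, software_fee("m6i.xlarge", 730) estimates one month of continuous use on the recommended instance type.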

    Vendor refund policy

    We do not offer refunds for this product. AWS infrastructure charges (EC2, EBS, data transfer) are billed separately by AWS and are not refundable by us.

    Legal

    Vendor terms and conditions

    Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information

    Delivery details

    64-bit (x86) Amazon Machine Image (AMI)

    Amazon Machine Image (AMI)

    An AMI is a virtual image that provides the information required to launch an instance. Amazon EC2 (Elastic Compute Cloud) instances are virtual servers on which you can run your applications and workloads, offering varying combinations of CPU, memory, storage, and networking resources. You can launch as many instances from as many different AMIs as you need.

    Version release notes

    Version 0.21.0 - Initial release (May 2026)

    • vLLM 0.21.0 (CPU build from upstream wheel) + Open WebUI 0.9.5 on Ubuntu 24.04 LTS
    • CIS Level 1 hardening applied (ansible-lockdown/UBUNTU24-CIS)
    • CVE-scanned before every release
    • Random 32-byte API key generated at first boot - written to /root/vllm-credentials.txt and shared between vLLM and Open WebUI
    • vLLM API server bound to 127.0.0.1:8000 with --api-key Bearer auth on every /v1/* request
    • Open WebUI bound to 127.0.0.1:8080, pre-wired to local vLLM via OPENAI_API_BASE_URL
    • First registered user in Open WebUI becomes the workspace administrator
    • Default model facebook/opt-125m preloaded - chat and API work immediately, no HuggingFace token required
    • Nginx reverse proxy with self-signed TLS, HTTP-to-HTTPS redirect, WebSocket upgrade for streaming chat
    • Loading splash page during model warmup
    • Anonymous telemetry disabled (VLLM_NO_USAGE_STATS, DO_NOT_TRACK, ANONYMIZED_TELEMETRY)
    • UFW firewall pre-configured (TCP 22, 80, 443 only; 8000 and 8080 explicitly denied)
    • fail2ban, auditd, AppArmor pre-configured
    • SBOM (CycloneDX 1.6) at /etc/lynxroute/sbom.json
    • CIS Conformance Report (OpenSCAP) at /etc/lynxroute/cis-report.html
    • IMDSv2 enforced

    Additional details

    Usage instructions

    1. Launch an instance (m6i.xlarge recommended; minimum t3.large or m6i.large with 8 GB RAM)
    2. In the instance's Security Group, allow TCP 443 from your IP only until you have registered as the admin
    3. Open https://<PUBLIC_IP>/ in your browser and accept the self-signed certificate warning
    4. Click "Sign up" and register with your real email; the first registered user becomes the workspace administrator
    5. Start chatting; the default model facebook/opt-125m is preloaded
    6. SSH if needed (TCP 22 must be allowed in the Security Group): ssh -i key.pem ubuntu@<PUBLIC_IP>. The credentials are in /root/vllm-credentials.txt, which is under /root, so read it with sudo

    To call the OpenAI-compatible API directly:

        curl https://<PUBLIC_IP>/v1/models -H "Authorization: Bearer <API_KEY>" -k

    The API key is the Bearer token shown in /root/vllm-credentials.txt.
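
    The same /v1/models call can be sketched in Python with the standard library. The function names are hypothetical. Disabling certificate verification mirrors curl's -k flag and is only reasonable while the self-signed certificate is in place; the response is assumed to follow the OpenAI list format with a "data" array of objects carrying an "id".

```python
import json
import ssl
import urllib.request


def build_models_request(host: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for https://<host>/v1/models with the Bearer token."""
    return urllib.request.Request(
        f"https://{host}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def list_model_ids(host: str, api_key: str, verify_tls: bool = False) -> list:
    """Return the model ids served by the instance."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Equivalent of curl -k: accept the self-signed certificate.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    request = build_models_request(host, api_key)
    with urllib.request.urlopen(request, context=context, timeout=30) as response:
        payload = json.load(response)
    return [model["id"] for model in payload.get("data", [])]
```

    Once a CA-signed certificate is installed, calling list_model_ids with verify_tls=True keeps normal certificate checks enabled.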

    To serve a different HuggingFace-compatible model:

        sudo nano /etc/vllm/server.env    # change VLLM_MODEL=<huggingface-id>
        sudo systemctl restart vllm
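
    For scripted model swaps, the manual edit could be replaced by a small helper that rewrites the VLLM_MODEL line. This is a sketch under assumptions: the function name is hypothetical, and the file is assumed to contain plain KEY=value lines (only VLLM_MODEL is named in this listing).

```python
import re


def set_vllm_model(env_text: str, model_id: str) -> str:
    """Return env_text with VLLM_MODEL set to model_id.

    Replaces the first existing VLLM_MODEL= line, or appends one if absent.
    """
    new_line = f"VLLM_MODEL={model_id}"
    pattern = re.compile(r"^VLLM_MODEL=.*$", flags=re.MULTILINE)
    if pattern.search(env_text):
        # A callable replacement avoids backslash interpretation in model_id.
        return pattern.sub(lambda _match: new_line, env_text, count=1)
    separator = "" if (not env_text or env_text.endswith("\n")) else "\n"
    return f"{env_text}{separator}{new_line}\n"
```

    The result would still need to be written back to /etc/vllm/server.env as root, followed by sudo systemctl restart vllm as described above.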

    After registration, restrict Security Group TCP 443 to your team's IP range. Replace the self-signed TLS certificate with a CA-signed certificate for production use.

    Support

    Vendor support

    Visit us online: https://lynxroute.com 

    For vLLM documentation: https://docs.vllm.ai/en/latest/getting_started/quickstart/
    For vLLM upstream issues: https://github.com/vllm-project/vllm/issues
    For Open WebUI documentation: https://docs.openwebui.com/getting-started/
    For Open WebUI upstream issues:

    AWS infrastructure support

    AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.
