Listing Thumbnail

    Open WebUI, vLLM, Ollama: Secure AI Inference Sandbox & GPU Optimized

     Info
    Deployed on AWS
    Deploy a fully secure, enterprise-grade Private AI environment in 5 minutes. Pre-configured with Open WebUI, vLLM, and Ollama on a hardened Ubuntu 22.04 LTS base. Optimized for AWS NVIDIA GPU instances (G6, G5, P4) to deliver high-throughput local inference with 100% data privacy. Perfect for running Llama 3, Qwen, and Mistral models securely within your own VPC without third-party data leaks.

    Overview

    Deploy a production-ready, security-hardened Private AI Sandbox on Amazon EC2 GPU instances. Built on Ubuntu 22.04 LTS with a CIS-oriented baseline, this AMI eliminates hours of manual work configuring NVIDIA drivers, CUDA 12.4, Docker, NVIDIA Container Toolkit, Nginx, and Open WebUI.

    WHAT YOU GET

    CoreNova Enterprise Secure AI Sandbox ships with a decoupled, secure three-layer architecture:

    • 1. Open WebUI (Default User Interface) Browser-based chat, RAG, and administrative controls. Served securely over HTTPS on port 443 via Nginx. (Note: Due to the private environment setup, a self-signed TLS certificate is used on first boot. It is perfectly safe to bypass the browser's privacy warning to proceed; you can replace this with your own public SSL certificate for production).
    • 2. Ollama (DEFAULT Inference Engine) GPU-accelerated GGUF model serving. This acts as your default out-of-the-box engine for chat. It follows a strict Bring Your Own Model (BYOM) approach, allowing you to pull any open-source or custom models directly via the WebUI or CLI without any pre-bloated weights. Ollama listens on localhost only and is never exposed to the public internet.
    • 3. vLLM (OPTIONAL Second Engine) A high-throughput, OpenAI-compatible API listening on localhost port 8000. It is NOT started by default. Enable it manually when you need to deploy HuggingFace transformers or require higher serving throughput. Multi-GPU instances will auto-configure the tensor parallel size.

    FIRST BOOT (Typical 5 to 10 minutes on g4dn.xlarge)

    After launching the instance, please allow 5 to 10 minutes for initialization before attempting to log in. The instance will automatically:

    • Configure the Docker NVIDIA container runtime.
    • Initialize the private inference engine environment.
    • Seed the secure administrator account and start Open WebUI behind the Nginx reverse proxy.

    Access URL: https://YOUR_PUBLIC_IP/ (Direct HTTPS access, no port 3000 exposure)

    First Login Credentials (Public signup is strictly disabled):

    • Email: <admin@local.host>
    • Password: Your unique EC2 Instance ID (e.g., i-0abcdef123456789f)

    > CRITICAL SECURITY NOTE: Please change your administrator password immediately after your first successful login.

    STORAGE (Recommended Expansion)

    To accommodate large LLM weights, attach an additional gp3 EBS volume (100 GB or larger) during launch. On the first boot, our background script will automatically detect, partition, and mount any blank/unformatted secondary volume to /mnt/models. (Note: To protect your data assets, existing formatted volumes containing data will not be modified or overwritten. Ollama models will persist under /mnt/models/ollama).

    SECURITY AND PRIVACY

    • Network Isolation: HTTPS 443 for WebUI access; Ollama and vLLM are bound strictly to localhost to narrow the attack surface.
    • Access Control: Admin seeded securely from your EC2 Instance ID; ENABLE_SIGNUP=false prevents rogue registrations.
    • OS-Level Hardening: Pre-configured with UFW firewall, fail2ban, auditd, and unattended security upgrades.
    • 100% Data Privacy: 100% on-instance inference. Your prompts, embeddings, and uploaded documents stay strictly within your own VPC.
    • Audit Readiness: Includes a pre-configured CloudWatch Agent configuration template for Nginx audit logs (requires attaching an appropriate IAM role to the EC2 instance).

    REQUIREMENTS

    • Amazon EC2 GPU Instance: Compatible with g4dn, g5, g6, p3, p4d, or p5 families.
    • Security Group Rules: Inbound TCP 443 for WebUI users, TCP 22 for administrative SSH.
    • EC2 Key Pair: Required for SSH access (password-based SSH login is strictly disabled).

    OPTIONAL vLLM SWITCHING & VRAM SAFETY

    On single-GPU instances, running Ollama and vLLM simultaneously will cause a VRAM Out-of-Memory (OOM) crash. To safely switch from the default Ollama engine to the high-throughput vLLM engine, you MUST stop Ollama first to free up GPU memory:

    To safely switch to vLLM:

    sudo systemctl stop ollama

    sudo systemctl start corenova-ai-vllm

    To safely switch back to Ollama:

    sudo systemctl stop corenova-ai-vllm

    sudo systemctl start ollama


    Part of the CoreNova Hardened AMI series. Technical Support & Inquiries: support@corenovacloud.com 

    Highlights

    • All-in-One AI Stack: Pre-integrated Open WebUI, vLLM, and Ollama for instant local LLM deployment.
    • Anti-Hijack Security Lock: Public registration is disabled; your unique AWS Instance ID is the default admin key.
    • Auto Elastic Multi-GPU Tuning: Native CUDA optimization with automatic tensor parallelism for multi-card scaling.

    Details

    Delivery method

    Delivery option
    64-bit (x86) Amazon Machine Image (AMI)

    Latest version

    Operating system
    Ubuntu 22.04 LTS

    Deployed on AWS
    New

    Introducing multi-product solutions

    You can now purchase comprehensive solutions tailored to use cases and industries.

    Multi-product solutions

    Features and programs

    Financing for AWS Marketplace purchases

    AWS Marketplace now accepts line of credit payments through the PNC Vendor Finance program. This program is available to select AWS customers in the US, excluding NV, NC, ND, TN, & VT.
    Financing for AWS Marketplace purchases

    Pricing

    Open WebUI, vLLM, Ollama: Secure AI Inference Sandbox & GPU Optimized

     Info
    Pricing is based on actual usage, with charges varying according to how much you consume. Subscriptions have no end date and may be canceled any time.
    Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator  to estimate your infrastructure costs.

    Usage costs (7)

     Info
    Dimension
    Cost/hour
    g6.xlarge
    Recommended
    $0.25
    g5.12xlarge
    $0.25
    g6.48xlarge
    $0.25
    p5.48xlarge
    $0.25
    g5.xlarge
    $0.25
    g6.12xlarge
    $0.25
    g4dn.xlarge
    $0.25

    Vendor refund policy

    30-day refund on AWS Marketplace software fees for this product. Email support@corenovacloud.com  with your AWS account ID and purchase date. Software fees only; EC2 charges are not refundable by the seller. We reply within 5 business days. Free trial: no software charges during the trial.

    How can we make this page better?

    Tell us how we can improve this page, or report an issue with this product.
    Tell us how we can improve this page, or report an issue with this product.

    Legal

    Vendor terms and conditions

    Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA) .

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information

     Info

    Delivery details

    64-bit (x86) Amazon Machine Image (AMI)

    Amazon Machine Image (AMI)

    An AMI is a virtual image that provides the information required to launch an instance. Amazon EC2 (Elastic Compute Cloud) instances are virtual servers on which you can run your applications and workloads, offering varying combinations of CPU, memory, storage, and networking resources. You can launch as many instances from as many different AMIs as you need.

    Version release notes

    Enterprise Secure AI Inference Sandbox Ubuntu 22.04 LTS on Ubuntu 22.04 LTS (x86_64, HVM).

    Version: v20260610

    Stack:

    • SSH hardening: key-only access, root login disabled (user: ubuntu)
    • Firewall baseline (UFW)
    • auditd, rsyslog, chrony enabled
    • NVIDIA CUDA drivers pre-installed (verified with nvidia-smi)
    • NVIDIA AI Enterprise stack: CUDA 12.x, cuDNN, TensorRT
    • Ollama runtime for local LLM inference
    • Automatic security updates via unattended-upgrades

    Compliance: CIS Benchmark guidance with OpenSCAP profile for transparency. Organizations should run their own validation to meet specific regulatory requirements.

    Part of the CoreNova Hardened AMI product family.

    AMI: ami-0a72d9650178b60b5 (us-east-1)

    Additional details

    Usage instructions

    Overview

    Enterprise Secure AI Inference Sandbox Ubuntu 22.04 LTS - security-hardened AI inference AMI for Ubuntu 22.04 LTS (x86_64). Recommended instance type: g5.xlarge (GPU instances with NVIDIA AI stack pre-installed).

    Launch checklist

    Step 1 - Subscribe in AWS Marketplace, then Launch in us-east-1.

    Step 2 - Instance type: g5.xlarge, p3.2xlarge, g4dn.xlarge, or other NVIDIA GPU instance.

    Step 3 - Key pair: select your EC2 SSH key. Password login is disabled.

    Step 4 - Security group: allow inbound TCP 22 from your administrator IP only.

    First connection

    ssh -i your-key.pem ubuntu@YOUR_PUBLIC_IP

    Post-launch verify

    nvidia-smi Expected: shows NVIDIA driver version and GPU(s). systemctl is-active auditd chrony rsyslog sudo systemctl is-active ufw ollama list Expected: shows available AI models (if pre-configured).

    Support

    Email: support@corenovacloud.com  Web: https://www.corenovacloud.com/  Refund: 30-day refund on Marketplace software fees. EC2 charges not refundable by seller.

    Include AWS Region, AMI ID, EC2 Instance ID, instance type, and steps to reproduce.

    Support

    Vendor support

    Email: support@corenovacloud.com  Web: https://www.corenovacloud.com/  Refund: 30-day refund on Marketplace software fees. EC2 charges not refundable by seller.

    Include AWS Region, AMI ID, EC2 Instance ID, instance type, and steps to reproduce.

    AWS infrastructure support

    AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.

    Similar products

    Customer reviews

    Ratings and reviews

     Info
    0 ratings
    5 star
    4 star
    3 star
    2 star
    1 star
    0%
    0%
    0%
    0%
    0%
    0 reviews
    No customer reviews yet
    Be the first to review this product . We've partnered with PeerSpot to gather customer feedback. You can share your experience by writing or recording a review, or scheduling a call with a PeerSpot analyst.