Overview
Deploy a production-ready, security-hardened Private AI Sandbox on Amazon EC2 GPU instances. Built on Ubuntu 22.04 LTS with a CIS-oriented baseline, this AMI eliminates hours of manual work configuring NVIDIA drivers, CUDA 12.4, Docker, NVIDIA Container Toolkit, Nginx, and Open WebUI.
WHAT YOU GET
CoreNova Enterprise Secure AI Sandbox ships with a decoupled, secure three-layer architecture:
1. Open WebUI (default user interface): Browser-based chat, RAG, and administrative controls, served over HTTPS on port 443 via Nginx. Note: because the environment is private, a self-signed TLS certificate is used on first boot, so your browser will show a privacy warning. The connection is still encrypted, but a self-signed certificate does not prove the server's identity; confirm that you are connecting to your own instance's address before proceeding, and replace the certificate with a CA-issued one for production.
2. Ollama (default inference engine): GPU-accelerated GGUF model serving and the out-of-the-box engine for chat. It follows a strict Bring Your Own Model (BYOM) approach: no model weights are bundled in the AMI, and you pull any open-source or custom model directly via the WebUI or CLI. Ollama listens on localhost only and is never exposed to the public internet.
3. vLLM (optional second engine): A high-throughput, OpenAI-compatible API listening on localhost port 8000. It is NOT started by default. Enable it manually when you need to serve Hugging Face Transformers models or require higher serving throughput. On multi-GPU instances, the tensor parallel size is configured automatically.
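The listing does not say how the tensor parallel size is chosen on multi-GPU instances. As a rough illustration only, the sketch below assumes a common rule of thumb: use the largest power of two that does not exceed the GPU count. The helper name `tp_size_for` is hypothetical and is not part of the AMI; `--tensor-parallel-size` is the vLLM option the result would feed.

```shell
# Hypothetical helper (not shipped with the AMI): largest power of two
# that does not exceed the number of GPUs. Returns 1 for a single GPU.
tp_size_for() {
  local gpus=$1 tp=1
  while [ $((tp * 2)) -le "$gpus" ]; do
    tp=$((tp * 2))
  done
  echo "$tp"
}

# On the instance, the GPU count could be read from nvidia-smi:
#   gpus=$(nvidia-smi --list-gpus | wc -l)
#   echo "--tensor-parallel-size $(tp_size_for "$gpus")"
```

A power of two is assumed here because the tensor parallel size generally has to divide the model's attention head count evenly; the rule the AMI actually applies may differ.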
FIRST BOOT (Typical 5 to 10 minutes on g4dn.xlarge)
After launching the instance, please allow 5 to 10 minutes for initialization before attempting to log in. The instance will automatically:
- Configure the Docker NVIDIA container runtime.
- Initialize the private inference engine environment.
- Seed the secure administrator account and start Open WebUI behind the Nginx reverse proxy.
Access URL: https://YOUR_PUBLIC_IP/ (Direct HTTPS access, no port 3000 exposure)
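Rather than refreshing the browser during initialization, you can poll the URL from your workstation. The following is a minimal sketch, not a tool shipped with the AMI: `wait_for_webui` is a hypothetical name, and the default of 40 attempts 15 seconds apart is chosen only to cover the 10-minute window mentioned above. `-k` skips certificate verification because the first-boot certificate is self-signed.

```shell
# Poll the WebUI until it answers with a 2xx or 3xx status over HTTPS.
# Arguments: URL, number of attempts (default 40), delay in seconds (default 15).
wait_for_webui() {
  local url=$1 tries=${2:-40} delay=${3:-15} i=0 code
  while [ "$i" -lt "$tries" ]; do
    # curl prints 000 and exits non-zero when the connection itself fails.
    code=$(curl -k -s -o /dev/null -w '%{http_code}' --max-time 5 "$url") || code=000
    case "$code" in
      2??|3??) echo "ready (HTTP $code)"; return 0 ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not ready after $tries attempts" >&2
  return 1
}

# Usage: wait_for_webui https://YOUR_PUBLIC_IP/
```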
First Login Credentials (Public signup is strictly disabled):
- Email: admin@local.host
- Password: Your unique EC2 Instance ID (e.g., i-0abcdef123456789f)
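If you do not have the EC2 console at hand, the instance ID can also be read from the instance metadata service once you are connected over SSH. The sketch below uses the standard IMDSv2 token flow; the function names are illustrative, and the format check assumes the usual `i-` prefix followed by 8 or 17 hexadecimal digits.

```shell
# Fetch the instance ID via IMDSv2. Run this on the instance itself.
fetch_instance_id() {
  local token
  token=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60") || return 1
  curl -s -H "X-aws-ec2-metadata-token: $token" \
    "http://169.254.169.254/latest/meta-data/instance-id"
}

# Check that a string looks like an EC2 instance ID before using it as a password.
is_instance_id() {
  printf '%s\n' "$1" | grep -Eq '^i-([0-9a-f]{8}|[0-9a-f]{17})$'
}

# Usage: id=$(fetch_instance_id) && is_instance_id "$id" && echo "$id"
```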
> CRITICAL SECURITY NOTE: Please change your administrator password immediately after your first successful login.
STORAGE (Recommended Expansion)
To accommodate large LLM weights, attach an additional gp3 EBS volume (100 GB or larger) during launch. On the first boot, our background script will automatically detect, partition, and mount any blank/unformatted secondary volume to /mnt/models. (Note: To protect your data assets, existing formatted volumes containing data will not be modified or overwritten. Ollama models will persist under /mnt/models/ollama).
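After first boot it is worth confirming that the secondary volume was actually mounted before pulling large models. A minimal check, assuming the standard `mountpoint` and `df` utilities present on Ubuntu; `check_models_volume` is a hypothetical helper name:

```shell
# Report whether the model volume is mounted and how much space it has.
# Argument: mount path (defaults to /mnt/models).
check_models_volume() {
  local mnt=${1:-/mnt/models}
  if mountpoint -q "$mnt"; then
    echo "mounted: $mnt"
    df -h "$mnt"
  else
    echo "not mounted: $mnt" >&2
    return 1
  fi
}

# Usage: check_models_volume
# `lsblk -f` additionally shows each block device and its filesystem type.
```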
SECURITY AND PRIVACY
- Network Isolation: HTTPS 443 for WebUI access; Ollama and vLLM are bound strictly to localhost to narrow the attack surface.
- Access Control: Admin seeded securely from your EC2 Instance ID; ENABLE_SIGNUP=false prevents rogue registrations.
- OS-Level Hardening: Pre-configured with UFW firewall, fail2ban, auditd, and unattended security upgrades.
- Data Privacy: All inference runs on the instance. Your prompts, embeddings, and uploaded documents stay strictly within your own VPC.
- Audit Readiness: Includes a pre-configured CloudWatch Agent configuration template for Nginx audit logs (requires attaching an appropriate IAM role to the EC2 instance).
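The localhost-only binding can be checked from the instance by inspecting listening sockets. The sketch below filters the output of `ss -ltnH` and prints any listener on the given ports that is not bound to a loopback address. Port 8000 for vLLM comes from this listing; 11434 is Ollama's upstream default and is assumed here to be unchanged on this AMI. `check_loopback_only` is a hypothetical helper name.

```shell
# Read `ss -ltnH` output on stdin; arguments are the ports to check.
# Prints offending local addresses and returns non-zero if any are found.
check_loopback_only() {
  awk -v ports="$*" '
    BEGIN { n = split(ports, p, " "); for (i = 1; i <= n; i++) want[p[i]] = 1 }
    {
      addr = $4                       # local address:port column
      port = addr; sub(/.*:/, "", port)
      host = substr(addr, 1, length(addr) - length(port) - 1)
      if ((port in want) && host != "127.0.0.1" && host != "[::1]") {
        print addr; bad = 1
      }
    }
    END { exit (bad ? 1 : 0) }
  '
}

# Usage: ss -ltnH | check_loopback_only 11434 8000 && echo "engines are loopback-only"
```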
REQUIREMENTS
- Amazon EC2 GPU Instance: Compatible with g4dn, g5, g6, p3, p4d, or p5 families.
- Security Group Rules: Inbound TCP 443 for WebUI users, TCP 22 for administrative SSH.
- EC2 Key Pair: Required for SSH access (password-based SSH login is strictly disabled).
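The two inbound rules above can be created with the AWS CLI. This is a sketch under stated assumptions: the security group ID and both CIDR ranges are placeholders you supply, and `open_sandbox_ports` is a hypothetical wrapper, not something the AMI provides.

```shell
# Open 443 to WebUI users and 22 to the administrator range only.
# Arguments: security group ID, WebUI user CIDR, administrator CIDR.
open_sandbox_ports() {
  local sg_id=$1 user_cidr=$2 admin_cidr=$3
  aws ec2 authorize-security-group-ingress \
    --group-id "$sg_id" --protocol tcp --port 443 --cidr "$user_cidr"
  aws ec2 authorize-security-group-ingress \
    --group-id "$sg_id" --protocol tcp --port 22 --cidr "$admin_cidr"
}

# Usage (placeholder values):
#   open_sandbox_ports sg-0123456789abcdef0 10.0.0.0/16 203.0.113.10/32
```

Restricting port 22 to a single /32 address keeps SSH reachable only from the administrator's workstation.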
OPTIONAL vLLM SWITCHING & VRAM SAFETY
On single-GPU instances, running Ollama and vLLM simultaneously will cause a VRAM out-of-memory (OOM) crash. Before starting one engine, you MUST stop the other to free GPU memory.
To safely switch to vLLM:
sudo systemctl stop ollama
sudo systemctl start corenova-ai-vllm
To safely switch back to Ollama:
sudo systemctl stop corenova-ai-vllm
sudo systemctl start ollama
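The stop-then-start order can be wrapped in a small function so that the two commands are never run in the wrong order. This is an illustrative sketch; `switch_engine` is a hypothetical name, and only the two service names come from the commands above.

```shell
# Stop the other engine first, then start the requested one.
# Argument: vllm or ollama.
switch_engine() {
  local target other
  case "$1" in
    vllm)   target=corenova-ai-vllm; other=ollama ;;
    ollama) target=ollama; other=corenova-ai-vllm ;;
    *) echo "usage: switch_engine vllm|ollama" >&2; return 2 ;;
  esac
  sudo systemctl stop "$other" || return 1
  sudo systemctl start "$target" || return 1
  # Prints "active" once the unit is running.
  systemctl is-active "$target"
}

# Usage: switch_engine vllm
```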
Part of the CoreNova Hardened AMI series. Technical Support & Inquiries: support@corenovacloud.com
Highlights
- All-in-One AI Stack: Pre-integrated Open WebUI, vLLM, and Ollama for instant local LLM deployment.
- Anti-Hijack Security Lock: Public registration is disabled; your unique AWS Instance ID is the default admin key.
- Auto Elastic Multi-GPU Tuning: Native CUDA optimization with automatic tensor parallelism for multi-card scaling.
Details
Pricing
| Dimension | Cost/hour |
|---|---|
| g6.xlarge (Recommended) | $0.25 |
| g5.12xlarge | $0.25 |
| g6.48xlarge | $0.25 |
| p5.48xlarge | $0.25 |
| g5.xlarge | $0.25 |
| g6.12xlarge | $0.25 |
| g4dn.xlarge | $0.25 |
Vendor refund policy
30-day refund on AWS Marketplace software fees for this product. Email support@corenovacloud.com with your AWS account ID and purchase date. Software fees only; EC2 charges are not refundable by the seller. We reply within 5 business days. Free trial: no software charges during the trial.
Delivery details
64-bit (x86) Amazon Machine Image (AMI)
Version release notes
Enterprise Secure AI Inference Sandbox on Ubuntu 22.04 LTS (x86_64, HVM).
Version: v20260610
Stack:
- SSH hardening: key-only access, root login disabled (user: ubuntu)
- Firewall baseline (UFW)
- auditd, rsyslog, chrony enabled
- NVIDIA CUDA drivers pre-installed (verified with nvidia-smi)
- NVIDIA AI Enterprise stack: CUDA 12.x, cuDNN, TensorRT
- Ollama runtime for local LLM inference
- Automatic security updates via unattended-upgrades
Compliance: CIS Benchmark guidance with OpenSCAP profile for transparency. Organizations should run their own validation to meet specific regulatory requirements.
Part of the CoreNova Hardened AMI product family.
AMI: ami-0a72d9650178b60b5 (us-east-1)
Additional details
Usage instructions
Overview
Enterprise Secure AI Inference Sandbox: a security-hardened AI inference AMI built on Ubuntu 22.04 LTS (x86_64). Recommended instance type: g5.xlarge (a GPU instance; the NVIDIA AI stack is pre-installed).
Launch checklist
Step 1 - Subscribe in AWS Marketplace, then Launch in us-east-1.
Step 2 - Instance type: g5.xlarge, p3.2xlarge, g4dn.xlarge, or other NVIDIA GPU instance.
Step 3 - Key pair: select your EC2 SSH key. Password login is disabled.
Step 4 - Security group: allow inbound TCP 22 from your administrator IP only, and inbound TCP 443 for WebUI users.
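For scripted launches, the checklist maps onto a single AWS CLI call. The sketch below is an assumption-laden example rather than an official launch procedure: it uses the us-east-1 AMI ID from the release notes, attaches a 100 GB gp3 volume for models, and assumes `/dev/sdf` as the secondary device name. The key pair name and security group ID are placeholders, `launch_sandbox` is a hypothetical wrapper, and the Marketplace subscription from Step 1 must already be in place.

```shell
# Launch one instance with an extra blank gp3 volume for model storage.
# Arguments: EC2 key pair name, security group ID.
launch_sandbox() {
  local key_name=$1 sg_id=$2
  aws ec2 run-instances \
    --region us-east-1 \
    --image-id ami-0a72d9650178b60b5 \
    --instance-type g5.xlarge \
    --key-name "$key_name" \
    --security-group-ids "$sg_id" \
    --block-device-mappings 'DeviceName=/dev/sdf,Ebs={VolumeSize=100,VolumeType=gp3}'
}

# Usage (placeholder values): launch_sandbox my-key sg-0123456789abcdef0
```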
First connection
ssh -i your-key.pem ubuntu@YOUR_PUBLIC_IP
Post-launch verify
nvidia-smi
Expected: shows the NVIDIA driver version and GPU(s).
systemctl is-active auditd chrony rsyslog
sudo systemctl is-active ufw
ollama list
Expected: lists installed models (the list is empty until you pull one, since no weights are bundled).
Support
Email: support@corenovacloud.com Web: https://www.corenovacloud.com/ Refund: 30-day refund on Marketplace software fees. EC2 charges not refundable by seller.
Include AWS Region, AMI ID, EC2 Instance ID, instance type, and steps to reproduce.
Support
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.