Overview
ClearImages AI-Ready Ubuntu 24.04 LTS is a turn-key, GPU-enabled AMI that serves a private, bearer-token-gated LLM endpoint - both Ollama-native and OpenAI-compatible - inside your own VPC, with no Kubernetes or hand-assembled GPU stack. It is built on the CIS Level 1 Ubuntu 24.04 LTS baseline; the sections below describe how to configure and use it.
Configure with cloud-init (8-key schema)
Pass an ai-edge intake JSON via cloud-init user-data to /var/lib/clearimages/firstboot/ai-edge-intake.json. Set any subset; omitted keys take baked defaults.
- model_name - Ollama model tag to serve (default qwen3:1.7b-q4_K_M). First boot runs ollama pull for any non-default tag (needs outbound to ollama.com / huggingface.co); on failure the baked model keeps serving. Examples: qwen3:8b-q4_K_M, llama3.2:3b, phi3.5-mini.
- api_auth_token - bearer token for the endpoint. Omit to auto-generate (stored at /etc/ai-edge/api-token), or resolve from SSM / Secrets Manager (below).
- api_bind_mode - loopback | private | public (default loopback).
- api_port - HTTPS listen port (default 443).
- tls_mode - self-signed | acme | off (default self-signed).
- acme_domain - FQDN for ACME / Let's Encrypt when tls_mode=acme.
- max_concurrent_requests - per-IP concurrency cap (default 2).
- max_prompt_bytes - max request body in bytes (default 131072 / 128 KiB).
Example user-data:
#cloud-config
write_files:
  - path: /var/lib/clearimages/firstboot/ai-edge-intake.json
    content: '{"model_name":"qwen3:8b-q4_K_M","api_bind_mode":"public"}'
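If you generate user-data from a script, the sketch below shows one way to render the same cloud-config while rejecting keys outside the 8-key schema. The helper name build_user_data is hypothetical, and the rule that tls_mode=acme needs acme_domain is an assumption inferred from the key descriptions, not documented first-boot behavior.

```python
import json

# Keys from the 8-key schema described above.
ALLOWED_KEYS = {
    "model_name", "api_auth_token", "api_bind_mode", "api_port",
    "tls_mode", "acme_domain", "max_concurrent_requests", "max_prompt_bytes",
}
INTAKE_PATH = "/var/lib/clearimages/firstboot/ai-edge-intake.json"


def build_user_data(intake: dict) -> str:
    """Render cloud-config user-data that writes the intake JSON (hypothetical helper)."""
    unknown = set(intake) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown intake keys: {sorted(unknown)}")
    # Assumption: ACME cannot work without a domain, so fail early on the client side.
    if intake.get("tls_mode") == "acme" and not intake.get("acme_domain"):
        raise ValueError("tls_mode=acme requires acme_domain")
    payload = json.dumps(intake, separators=(",", ":"))
    # In a single-quoted YAML scalar, a literal single quote is escaped by doubling it.
    quoted = payload.replace("'", "''")
    return (
        "#cloud-config\n"
        "write_files:\n"
        f"  - path: {INTAKE_PATH}\n"
        f"    content: '{quoted}'\n"
    )


if __name__ == "__main__":
    print(build_user_data({"model_name": "qwen3:8b-q4_K_M", "api_bind_mode": "private"}))
```

Omitted keys are simply left out of the JSON, so they keep the baked defaults listed above.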
Use a token without plaintext (SSM / Secrets Manager)
Instead of api_auth_token, set a resolver in /etc/clearimages/firstboot.conf via user-data and attach an IAM instance profile with ssm:GetParameter (or secretsmanager:GetSecretValue):
CLEARIMAGES_RESOLVERS=("env:AI_API_AUTH_TOKEN=ssm:/your/path/api_token")
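For the instance profile, a least-privilege policy scoped to the single token parameter might look like the sketch below. The function name token_read_policy, the region, and the account ID are placeholders for illustration; only the ssm:GetParameter action and the example parameter path come from the text above.

```python
import json


def token_read_policy(region: str, account_id: str, parameter_name: str) -> dict:
    """Build an IAM policy document allowing reads of one SSM parameter (sketch)."""
    if not parameter_name.startswith("/"):
        raise ValueError("expected a fully qualified name such as /your/path/api_token")
    # SSM parameter ARNs append the name directly after 'parameter', so the
    # leading slash of the name acts as the separator.
    arn = f"arn:aws:ssm:{region}:{account_id}:parameter{parameter_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "ssm:GetParameter", "Resource": arn}
        ],
    }


if __name__ == "__main__":
    # Placeholder region and account ID; substitute your own.
    policy = token_read_policy("us-east-1", "123456789012", "/your/path/api_token")
    print(json.dumps(policy, indent=2))
```

If the parameter is a SecureString encrypted with a customer managed KMS key, the role typically also needs kms:Decrypt on that key.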
Access the model (two API schemas, one port)
Caddy reverse-proxies both API surfaces on api_port (default 443); every request requires the header Authorization: Bearer <token>. Self-signed TLS means callers pass curl -k (or trust the cert).
- Ollama-native /api/* : /api/generate, /api/chat, /api/embeddings, /api/tags.
- OpenAI-compatible /v1/* : /v1/chat/completions, /v1/completions, /v1/embeddings, /v1/models - a drop-in base_url for OpenAI SDKs.
Examples:
curl -k -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" -d '{"model":"qwen3:1.7b-q4_K_M","prompt":"2+2=","stream":false}' https://<IP>/api/generate
curl -k -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" -d '{"model":"qwen3:1.7b-q4_K_M","messages":[{"role":"user","content":"hi"}]}' https://<IP>/v1/chat/completions
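The same OpenAI-compatible call can be made from Python with only the standard library. This is a minimal sketch, not an official client: build_chat_request and send are hypothetical helper names, the 128 KiB check mirrors the default max_prompt_bytes, and disabling certificate verification is the equivalent of curl -k, appropriate only for the default self-signed certificate.

```python
import json
import ssl
import urllib.request

MAX_PROMPT_BYTES = 131072  # default body cap from the schema above


def build_chat_request(host: str, token: str, prompt: str,
                       model: str = "qwen3:1.7b-q4_K_M",
                       port: int = 443) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/chat/completions."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    if len(body) > MAX_PROMPT_BYTES:
        raise ValueError(f"request body is {len(body)} bytes; cap is {MAX_PROMPT_BYTES}")
    return urllib.request.Request(
        f"https://{host}:{port}/v1/chat/completions",
        data=body,
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, verify_tls: bool = False,
         timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Equivalent of curl -k: skip verification for the self-signed certificate.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context, timeout=timeout) as response:
        return json.load(response)
```

A call would then look like send(build_chat_request("<IP>", token, "hi")), with the reply text expected under choices[0].message.content if the response follows the usual OpenAI chat completion shape.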
Manage models at runtime (ai-edge CLI)
sudo ai-edge status - driver / GPU, Ollama, Caddy, and model health.
sudo ai-edge swap-model qwen3:8b-q4_K_M - pull and switch the served model live.
sudo ai-edge token show - print the active bearer token; to rotate, update the SSM parameter then restart clearimages-firstboot.service.
What is included: NVIDIA 580 open-kernel driver + CUDA 13.2 (pinned), Ollama 0.12.11 (loopback:11434 under systemd confinement: NoNewPrivileges, ProtectSystem=strict, scoped /dev/nvidia*), Caddy 443 edge proxy (bearer auth, per-IP rate limit, 128 KiB body cap, TLS 1.2+), and Qwen3-1.7B (Q4_K_M, ~1 GiB) baked in for an instant first response.
Hardened and audit-ready: IMDSv2 enforced, encrypted gp3 root, ENA; CIS Level 1 baseline; root SSH and password authentication disabled.
GPU instances: x86_64 g4dn, g5, g6, g6e (NVIDIA T4 / A10G / L4 / L40S); g4dn.xlarge is a good start. A companion ARM listing covers Graviton g5g (NVIDIA T4G). GPU instance types only.
Note: Preview release. vLLM, Open WebUI, customer-supplied model upload, and Inferentia / Trainium variants are not included in this version.
Highlights
- Private LLM Endpoint in Minutes: NVIDIA GPU driver (580-series, CUDA 13.2), the Ollama orchestrator, and a Caddy edge proxy are pre-wired. Qwen3-1.7B is baked into the image, so the first OpenAI-compatible request is served as soon as first boot converges - no external download required.
- Secure by Default: Bearer-token authentication on every request, per-IP rate limiting, a 128 KiB body cap, and TLS 1.2+ at the edge. IMDSv2 enforced, encrypted gp3 root, root SSH and password auth disabled, and CIS Level 1 hardening inherited from the ClearImages baseline.
- Customer-Tunable, No Plaintext Secrets: An 8-key cloud-init schema swaps models (Qwen3-8B, Llama 3.2, Phi-3.5) and tunes the edge, with the API token resolved from SSM Parameter Store or Secrets Manager. The on-box ai-edge CLI handles status, model swap, token rotation, and health checks.
Pricing
| Dimension | Cost/hour |
|---|---|
| g4dn.xlarge (Recommended) | $0.24 |
| g6.48xlarge | $3.84 |
| g5.48xlarge | $3.84 |
| g6.16xlarge | $3.84 |
| g5.12xlarge | $2.88 |
| g6.24xlarge | $3.84 |
| g4dn.8xlarge | $1.92 |
| g5.24xlarge | $3.84 |
| g6.xlarge | $0.24 |
| g6e.8xlarge | $1.92 |
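To turn the hourly rates into a rough monthly figure, a small calculation such as the following can help. It assumes the table lists the hourly software charge only, with EC2 infrastructure billed separately by AWS, and it uses 730 hours as an average month; both are assumptions, and the function name is hypothetical.

```python
# Hourly rates copied from a few rows of the pricing table (USD).
HOURLY_RATE = {
    "g4dn.xlarge": 0.24,
    "g6.xlarge": 0.24,
    "g4dn.8xlarge": 1.92,
    "g5.12xlarge": 2.88,
    "g5.48xlarge": 3.84,
}

HOURS_PER_MONTH = 730.0  # assumption: average month of continuous running


def monthly_software_cost(instance_type: str, hours: float = HOURS_PER_MONTH,
                          count: int = 1) -> float:
    """Estimate the listed charge for `count` instances over `hours` hours."""
    return round(HOURLY_RATE[instance_type] * hours * count, 2)


if __name__ == "__main__":
    for name in ("g4dn.xlarge", "g5.12xlarge"):
        print(name, monthly_software_cost(name))
```

Because billing is hourly and stops when the instance is stopped or terminated, passing the actual running hours gives a closer estimate than the 730-hour default.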
Vendor refund policy
Usage is billed by AWS on a pay-as-you-go basis by the hour. The AI-Ready Ubuntu 24.04 LTS GPU instance can be stopped or terminated at any time to stop incurring additional software charges. Refunds are not available once an instance has been launched. To completely avoid future costs, terminate the instance and cancel your AWS Marketplace subscription. For refund requests, contact clearimages-support@clearscale.com.
Delivery details
64-bit (x86) Amazon Machine Image (AMI)
Amazon Machine Image (AMI)
An AMI is a virtual image that provides the information required to launch an instance. Amazon EC2 (Elastic Compute Cloud) instances are virtual servers on which you can run your applications and workloads, offering varying combinations of CPU, memory, storage, and networking resources. You can launch as many instances from as many different AMIs as you need.
Version release notes
CIS L1 hardened build - latest security patches and improvements
Additional details
Usage instructions
- Launch via AWS Marketplace 1-Click or the EC2 console on a GPU instance (g4dn.xlarge or larger). Attach an IAM instance profile that allows ssm:GetParameter (and secretsmanager:GetSecretValue if you use Secrets Manager) for your token path. Require IMDSv2 (HttpTokens=required).
- Security group: allow inbound 443 from your API callers and 22 for SSH administration. SSM Session Manager does not need an inbound port.
- (Optional) Configure via cloud-init user-data using the 8-key ai-edge schema: model_name, api_auth_token, api_bind_mode, api_port, tls_mode, acme_domain, max_concurrent_requests, max_prompt_bytes. Omitted keys take baked defaults; the bearer token resolves from SSM Parameter Store or Secrets Manager.
- Wait for first-boot convergence (about 1-2 minutes), then check status: ssh ubuntu@<public-ip> sudo ai-edge status
- Call the endpoint (Ollama-native example): curl -k -H "Authorization: Bearer <token>" -H "Content-Type: application/json" -d '{"model":"qwen3:1.7b-q4_K_M","prompt":"2+2=","stream":false}' https://<public-ip>/api/generate. The OpenAI-compatible /v1/chat/completions and /v1/embeddings are available on the same port.
- Swap models on a running instance: sudo ai-edge swap-model qwen3:8b-q4_K_M. Rotate the token: sudo ai-edge token show, update the SSM parameter, then restart clearimages-firstboot.service.
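When launches are automated, the wait for first-boot convergence can be scripted as a polling loop rather than a fixed sleep. The sketch below is one possible shape: wait_until_ready is a hypothetical helper, the 180-second timeout and 5-second interval are assumptions chosen around the 1-2 minute figure above, and the probe is whatever check you supply (for example, an authenticated GET of /v1/models that returns True on success).

```python
import time
from typing import Callable


def wait_until_ready(probe: Callable[[], bool],
                     timeout_s: float = 180.0,
                     interval_s: float = 5.0,
                     sleep: Callable[[float], None] = time.sleep,
                     clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll `probe` until it returns True or the timeout elapses."""
    deadline = clock() + timeout_s
    while True:
        try:
            if probe():
                return True
        except OSError:
            # Connection refused, TLS not up yet, or an HTTP error from urllib:
            # treat as "not ready" and keep polling.
            pass
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The sleep and clock parameters exist only so the loop can be exercised in tests without real delays; in normal use the defaults apply.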
Support
Vendor support
Support for this AMI is available at https://clearscale.com/clearimages/support or by email at clearimages-support@clearscale.com.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.