AI-Ready Ubuntu 24.04 LTS (NVIDIA GPU) | Support by Clearscale

Turn-key GPU Ubuntu 24.04 LTS AMI: NVIDIA drivers (CUDA 13.2), Ollama orchestrator, and a Caddy edge proxy. Launch a private, bearer-token gated, OpenAI-compatible LLM endpoint in your own VPC in minutes. Qwen3-1.7B baked in; swap models via cloud-init.

Overview

ClearImages AI-Ready Ubuntu 24.04 LTS is a turn-key, GPU-enabled AMI that serves a private, bearer-token-gated LLM endpoint - both Ollama-native and OpenAI-compatible - inside your own VPC, with no Kubernetes or hand-assembled GPU stack. It is built on the CIS Level 1 Ubuntu 24.04 LTS baseline; the sections below are how you configure and use it.

Configure with cloud-init (8-key schema)

Pass an ai-edge intake JSON via cloud-init user-data to /var/lib/clearimages/firstboot/ai-edge-intake.json. Set any subset; omitted keys take baked defaults.
- model_name - Ollama model tag to serve (default qwen3:1.7b-q4_K_M). First boot runs ollama pull for any non-default tag (needs outbound to ollama.com / huggingface.co); on failure the baked model keeps serving. Examples: qwen3:8b-q4_K_M, llama3.2:3b, phi3.5-mini.
- api_auth_token - bearer token for the endpoint. Omit to auto-generate (stored at /etc/ai-edge/api-token), or resolve from SSM / Secrets Manager (below).
- api_bind_mode - loopback | private | public (default loopback).
- api_port - HTTPS listen port (default 443).
- tls_mode - self-signed | acme | off (default self-signed).
- acme_domain - FQDN for ACME / Let's Encrypt when tls_mode=acme.
- max_concurrent_requests - per-IP concurrency cap (default 2).
- max_prompt_bytes - max request body in bytes (default 131072 / 128 KiB).

Example user-data:
#cloud-config
write_files:
  - path: /var/lib/clearimages/firstboot/ai-edge-intake.json
    content: '{"model_name":"qwen3:8b-q4_K_M","api_bind_mode":"public"}'
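The user-data above can also be generated rather than hand-written. The following is a minimal sketch, not vendor tooling: build_user_data is a hypothetical helper name, the key names and enumerated values are copied from the schema above, and the check that tls_mode=acme comes with acme_domain is an assumption drawn from the key description.

```python
import json

# Keys and enumerated values copied from the 8-key schema above.
ALLOWED_KEYS = {
    "model_name", "api_auth_token", "api_bind_mode", "api_port",
    "tls_mode", "acme_domain", "max_concurrent_requests", "max_prompt_bytes",
}
ENUMS = {
    "api_bind_mode": {"loopback", "private", "public"},
    "tls_mode": {"self-signed", "acme", "off"},
}
INTAKE_PATH = "/var/lib/clearimages/firstboot/ai-edge-intake.json"


def build_user_data(**settings):
    """Return a #cloud-config document that writes the intake JSON."""
    unknown = set(settings) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown intake keys: {sorted(unknown)}")
    for key, allowed in ENUMS.items():
        if key in settings and settings[key] not in allowed:
            raise ValueError(f"{key} must be one of {sorted(allowed)}")
    # Assumption: acme_domain is required whenever tls_mode is acme.
    if settings.get("tls_mode") == "acme" and not settings.get("acme_domain"):
        raise ValueError("tls_mode=acme needs acme_domain")
    payload = json.dumps(settings, separators=(",", ":"))
    # Double any single quote so the YAML single-quoted scalar stays valid.
    payload = payload.replace("'", "''")
    return (
        "#cloud-config\n"
        "write_files:\n"
        f"  - path: {INTAKE_PATH}\n"
        f"    content: '{payload}'\n"
    )


print(build_user_data(model_name="qwen3:8b-q4_K_M", api_bind_mode="public"))
```

Omitted keys are simply left out of the JSON, so the baked defaults apply.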

Use a token without plaintext (SSM / Secrets Manager)

Instead of api_auth_token, set a resolver in /etc/clearimages/firstboot.conf via user-data and attach an IAM instance profile with ssm:GetParameter (or secretsmanager:GetSecretValue):
CLEARIMAGES_RESOLVERS=("env:AI_API_AUTH_TOKEN=ssm:/your/path/api_token")
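For the SSM case, the instance profile only needs read access to the one parameter named in the resolver. The sketch below builds such a policy document; token_read_policy is a hypothetical helper, and the region and account ID are placeholders. If the parameter is a SecureString encrypted with a customer-managed KMS key, the role would also need kms:Decrypt on that key, which this sketch does not cover.

```python
import json


def token_read_policy(region, account_id, parameter_name):
    """Build an IAM policy allowing ssm:GetParameter on one parameter."""
    # SSM parameter ARNs omit the leading slash of the parameter name.
    resource = (
        f"arn:aws:ssm:{region}:{account_id}:parameter/"
        f"{parameter_name.lstrip('/')}"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "ssm:GetParameter",
                "Resource": resource,
            }
        ],
    }


# Placeholder region and account ID; the path matches the resolver example.
policy = token_read_policy("us-east-1", "123456789012", "/your/path/api_token")
print(json.dumps(policy, indent=2))
```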

Access the model (two API schemas, one port)

Caddy reverse-proxies both API surfaces on api_port (default 443); every request requires the header Authorization: Bearer <token>. Self-signed TLS means callers pass curl -k (or trust the cert).
- Ollama-native /api/* : /api/generate, /api/chat, /api/embeddings, /api/tags.
- OpenAI-compatible /v1/* : /v1/chat/completions, /v1/completions, /v1/embeddings, /v1/models - a drop-in base_url for OpenAI SDKs.

Examples:
curl -k -H "Authorization: Bearer $TOKEN" -d '{"model":"qwen3:1.7b-q4_K_M","prompt":"2+2=","stream":false}' https://<IP>/api/generate
curl -k -H "Authorization: Bearer $TOKEN" -d '{"model":"qwen3:1.7b-q4_K_M","messages":[{"role":"user","content":"hi"}]}' https://<IP>/v1/chat/completions
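The same call can be made from Python with only the standard library. This is a sketch under assumptions: chat_request and send are hypothetical helper names, the host and token are placeholders, and the response is assumed to follow the usual OpenAI chat-completions shape. Setting insecure=True mirrors curl -k for the default self-signed certificate; trusting the certificate is the safer choice where possible.

```python
import json
import ssl
import urllib.request


def chat_request(host, token, prompt, model="qwen3:1.7b-q4_K_M", port=443):
    """Build (but do not send) a POST to /v1/chat/completions."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"https://{host}:{port}/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, insecure=False):
    """Send the request; insecure=True skips certificate checks like curl -k."""
    context = ssl.create_default_context()
    if insecure:
        # check_hostname must be disabled before verify_mode is relaxed.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context, timeout=60) as resp:
        return json.load(resp)


# Usage (placeholder host and token):
# reply = send(chat_request("10.0.0.12", "<token>", "hi"), insecure=True)
# print(reply["choices"][0]["message"]["content"])
```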

Manage models at runtime (ai-edge CLI)
sudo ai-edge status - driver / GPU, Ollama, Caddy, and model health.
sudo ai-edge swap-model qwen3:8b-q4_K_M - pull and switch the served model live.
sudo ai-edge token show - print the active bearer token; to rotate, update the SSM parameter then restart clearimages-firstboot.service.

What is included: NVIDIA 580 open-kernel driver + CUDA 13.2 (pinned), Ollama 0.12.11 (loopback:11434 under systemd confinement: NoNewPrivileges, ProtectSystem=strict, scoped /dev/nvidia*), Caddy 443 edge proxy (bearer auth, per-IP rate limit, 128 KiB body cap, TLS 1.2+), and Qwen3-1.7B (Q4_K_M, ~1 GiB) baked in for an instant first response.

Hardened and audit-ready: IMDSv2 enforced, encrypted gp3 root, ENA; CIS Level 1 baseline; root SSH and password authentication disabled.

GPU instances: x86_64 g4dn, g5, g6, g6e (NVIDIA T4 / A10G / L4 / L40S); g4dn.xlarge is a good start. A companion ARM listing covers Graviton g5g (NVIDIA T4G). GPU instance types only.

Note: Preview release. vLLM, Open WebUI, customer-supplied model upload, and Inferentia / Trainium variants are not included in this version.

Highlights

Private LLM Endpoint in Minutes: NVIDIA GPU driver (580-series, CUDA 13.2), the Ollama orchestrator, and a Caddy edge proxy are pre-wired. Qwen3-1.7B is baked into the image, so the first OpenAI-compatible request is served as soon as first boot converges - no external download required.
Secure by Default: Bearer-token authentication on every request, per-IP rate limiting, a 128 KiB body cap, and TLS 1.2+ at the edge. IMDSv2 enforced, encrypted gp3 root, root SSH and password auth disabled, and CIS Level 1 hardening inherited from the ClearImages baseline.
Customer-Tunable, No Plaintext Secrets: An 8-key cloud-init schema swaps models (Qwen3-8B, Llama 3.2, Phi-3.5) and tunes the edge, with the API token resolved from SSM Parameter Store or Secrets Manager. The on-box ai-edge CLI handles status, model swap, token rotation, and health checks.

Details

Sold by

ClearScale

Pricing

Pricing is based on actual usage, with charges varying according to how much you consume. Subscriptions have no end date and may be canceled any time.

Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

Usage costs (30)

Dimension	Cost/hour (USD)
g4dn.xlarge (recommended)	$0.24
g6.48xlarge	$3.84
g5.48xlarge	$3.84
g6.16xlarge	$3.84
g5.12xlarge	$2.88
g6.24xlarge	$3.84
g4dn.8xlarge	$1.92
g5.24xlarge	$3.84
g6.xlarge	$0.24
g6e.8xlarge	$1.92
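
As a rough illustration of how the hourly software rate adds up, the sketch below multiplies a rate from the table by a number of running hours. The software_charge helper is hypothetical, 730 hours is a common approximation of one month of continuous running, and the result covers the software charge only, not the underlying AWS infrastructure.

```python
from decimal import Decimal

# Software rates (USD per hour) copied from the table above.
SOFTWARE_RATE = {
    "g4dn.xlarge": Decimal("0.24"),
    "g5.12xlarge": Decimal("2.88"),
    "g6.48xlarge": Decimal("3.84"),
}


def software_charge(instance_type, hours):
    """Software charge only; EC2, EBS, and data transfer are billed separately."""
    return SOFTWARE_RATE[instance_type] * hours


# One month of continuous running on the recommended instance type.
print(software_charge("g4dn.xlarge", 730))  # 175.20
```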

Vendor refund policy

Usage is billed by AWS on a pay-as-you-go basis by the hour. The AI-Ready Ubuntu 24.04 LTS GPU instance can be stopped or terminated at any time to stop incurring additional software charges. Refunds are not available once an instance has been launched. To completely avoid future costs, ensure you terminate the instance and cancel your AWS Marketplace subscription. For refund requests, contact clearimages-support@clearscale.com.

Legal

Vendor terms and conditions

Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

Content disclaimer

Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

Usage information

Delivery details

64-bit (x86) Amazon Machine Image (AMI)

Amazon Machine Image (AMI)

An AMI is a virtual image that provides the information required to launch an instance. Amazon EC2 (Elastic Compute Cloud) instances are virtual servers on which you can run your applications and workloads, offering varying combinations of CPU, memory, storage, and networking resources. You can launch as many instances from as many different AMIs as you need.

Version release notes

CIS L1 hardened build - latest security patches and improvements

Additional details

Usage instructions

Launch via AWS Marketplace 1-Click or the EC2 console on a GPU instance (g4dn.xlarge or larger). Attach an IAM instance profile that allows ssm:GetParameter (and secretsmanager:GetSecretValue if you use Secrets Manager) for your token path. Require IMDSv2 (HttpTokens=required).
Security group: allow inbound 443 from your API callers and 22 for SSH / SSM administration.
(Optional) Configure via cloud-init user-data using the 8-key ai-edge schema: model_name, api_auth_token, api_bind_mode, api_port, tls_mode, acme_domain, max_concurrent_requests, max_prompt_bytes. Omitted keys take baked defaults; the bearer token resolves from SSM Parameter Store or Secrets Manager.
Wait for first-boot convergence (about 1-2 minutes), then check status: ssh ubuntu@<public-ip> sudo ai-edge status
Call the endpoint (the Ollama-native /api/generate is shown here): curl -k -H "Authorization: Bearer <token>" -H "Content-Type: application/json" -d '{"model":"qwen3:1.7b-q4_K_M","prompt":"2+2=","stream":false}' https://<public-ip>/api/generate (the OpenAI-compatible /v1/chat/completions and /v1/embeddings are available on the same port).
Swap models on a running instance: sudo ai-edge swap-model qwen3:8b-q4_K_M. Rotate the token: sudo ai-edge token show, update the SSM parameter, then restart clearimages-firstboot.service.
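
The launch step can also be scripted. The sketch below only assembles the keyword arguments for the EC2 RunInstances call using boto3 parameter names; launch_params is a hypothetical helper, and the AMI ID, security group ID, and instance profile name are placeholders to replace with your own values.

```python
def launch_params(ami_id, security_group_id, instance_profile, user_data=""):
    """Keyword arguments for the EC2 RunInstances call (boto3 naming)."""
    params = {
        "ImageId": ami_id,
        "InstanceType": "g4dn.xlarge",
        "MinCount": 1,
        "MaxCount": 1,
        "SecurityGroupIds": [security_group_id],
        "IamInstanceProfile": {"Name": instance_profile},
        # Require IMDSv2, as the instructions above ask.
        "MetadataOptions": {"HttpTokens": "required", "HttpEndpoint": "enabled"},
    }
    if user_data:
        # Optional cloud-init document; boto3 base64-encodes it on send.
        params["UserData"] = user_data
    return params


# Placeholder identifiers, not real resources.
params = launch_params("ami-EXAMPLE", "sg-EXAMPLE", "ai-edge-token-reader")
print(params["MetadataOptions"])

# With boto3 installed and credentials configured, the launch would be:
# import boto3
# boto3.client("ec2").run_instances(**params)
```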

Resources

Vendor resources

Clearscale Managed Services

Support

Vendor support

Support for this AMI is available at https://clearscale.com/clearimages/support or by email at clearimages-support@clearscale.com

AWS infrastructure support

AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.

Customer reviews

Ratings and reviews

1 rating

5 star: 100%

1 AWS review

reviewer2855742

Secure private endpoints have simplified deployment and now provide fast, token-gated LLM access

Reviewed on Jun 11, 2026

Review from a verified AWS customer

What is our primary use case?

This product is a good solution for private LLM endpoints.

How has it helped my organization?

There were improvements to the organization, although they were minimal.

What is most valuable?

Setting up a secure, private LLM backend usually means wrestling with NVIDIA drivers, package dependencies, and reverse proxies. ClearScale's AI-Ready Ubuntu image completely eliminates that overhead.

I had an encrypted, token-gated API endpoint serving local models inside my VPC within minutes of launching the instance. The pre-installed Ollama orchestrator and Caddy reverse proxy work seamlessly together right out of the box. If I need a production-ready, secure foundation for localized inference without the typical deployment headache, this AMI is an absolute lifesaver. It deserves 5 stars.

For how long have I used the solution?

I used the solution for less than one hour.

What do I think about the stability of the solution?

There were no stability issues.

What do I think about the scalability of the solution?

There were no scalability issues.

How was the initial setup?

The initial setup was straightforward.

What about the implementation team?

The implementation was handled internally by my team.

What was our ROI?

There was no specific return on investment mentioned.

What other advice do I have?

I have no additional advice to provide.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
