Overview
GLM-4 9B is a state-of-the-art bilingual large language model by Zhipu AI. Runs locally on your EC2 instance via Ollama with full OpenAI-compatible REST API. Supports 128K context window, function calling, and code generation. Apache 2.0 license allows unlimited commercial use with zero per-token API fees. Recommended: g4dn.xlarge (GPU) for production inference. t3.micro available for free trial (CPU-only, slower). Published by Waltsoft. Model weights download automatically on first boot (~5GB).
Highlights
- 4x faster than original Whisper with CTranslate2
- OpenAI-compatible transcription API
- Voice activity detection and GPU acceleration
Details
Introducing multi-product solutions
You can now purchase comprehensive solutions tailored to use cases and industries.
Features and programs
Financing for AWS Marketplace purchases
Pricing
Free trial
Dimension | Description | Cost/hour |
|---|---|---|
t3.medium Recommended | t3.medium instance | $0.50 |
t3.micro | t3.micro instance | $0.00 |
t3.large | t3.large instance | $0.50 |
m5.large | m5.large instance | $0.50 |
m5.xlarge | m5.xlarge instance | $0.50 |
r5.large | r5.large instance | $0.50 |
Vendor refund policy
No refunds. Cancel anytime. Contact support@waltsoft.net .
How can we make this page better?
Legal
Vendor terms and conditions
Content disclaimer
Delivery details
64-bit (x86) Amazon Machine Image (AMI)
Amazon Machine Image (AMI)
An AMI is a virtual image that provides the information required to launch an instance. Amazon EC2 (Elastic Compute Cloud) instances are virtual servers on which you can run your applications and workloads, offering varying combinations of CPU, memory, storage, and networking resources. You can launch as many instances from as many different AMIs as you need.
Version release notes
Initial release. glm4:9b via Ollama on Ubuntu 24.04. Pull-on-boot.
Additional details
Usage instructions
For GPU inference: launch g4dn.xlarge ($0.53/hr). For 70B models: g5.12xlarge. t3.micro for free trial (CPU-only, very slow). First boot downloads model (~5-8GB). API: http://<public-ip>:11434/api/generate. OpenAI-compatible: http://<public-ip>:11434/v1/chat/completions
Support
Vendor support
For technical support, email support@waltsoft.net or visit
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.