Overview
The NVIDIA GPU-Optimized AMI is a virtual machine image for accelerating your Machine Learning, Deep Learning, Data Science, and HPC workloads on NVIDIA GPUs. Using this AMI, you can spin up a GPU-accelerated EC2 instance in minutes with a pre-installed Ubuntu OS, GPU driver, Docker, and the NVIDIA Container Toolkit.
This AMI provides easy access to NVIDIA's NGC Catalog, a hub for GPU-optimized software, for pulling and running performance-tuned, tested, and NVIDIA-certified Docker containers. The NGC Catalog provides free access to containerized AI, Data Science, and HPC applications, pre-trained models, AI SDKs, and other resources, enabling data scientists, developers, and researchers to focus on building and deploying solutions.
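For example, once an instance is running, an NGC container can be pulled and run directly with Docker. This is a minimal sketch; the container tag shown is illustrative and newer tags are published regularly:

```bash
# Pull a performance-tuned container from the NGC Catalog (the tag is illustrative)
docker pull nvcr.io/nvidia/pytorch:23.11-py3

# Run it with GPU access provided by the NVIDIA Container Toolkit
docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:23.11-py3
```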
This GPU-optimized AMI is free, with an option to purchase enterprise support offered through NVIDIA AI Enterprise. For details on how to get support for this AMI, scroll down to 'Support Information'.
NVIDIA GPU-Optimized AMI includes:
- Ubuntu Server OS
- NVIDIA Driver
- Docker-ce
- NVIDIA Container Toolkit
- AWS CLI, NGC CLI
- Miniconda, JupyterLab, Git
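As a quick sanity check after the instance boots, the pre-installed components above can be verified from a shell. This is a minimal sketch; exact version output will vary by AMI release:

```bash
nvidia-smi            # confirm the GPU driver is loaded and the GPU is visible
docker --version      # Docker-ce
nvidia-ctk --version  # NVIDIA Container Toolkit
aws --version         # AWS CLI
ngc --version         # NGC CLI
conda --version       # Miniconda
jupyter lab --version # JupyterLab
git --version         # Git
```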
Highlights
- Provides data scientists and developers fast and easy access to NVIDIA A100, A10, V100 and T4 GPUs in the cloud and GPU-optimized AI/HPC software in an environment that is fully certified by NVIDIA.
- Optimized for the highest performance across a wide range of workloads on NVIDIA GPUs.
- NVIDIA accelerates innovation by eliminating the complex do-it-yourself task of building and optimizing a complete deep learning software stack tuned specifically for GPUs.
Details
Typical total price
$3.06/hour
Pricing
Instance type | Product cost/hour | EC2 cost/hour | Total/hour |
---|---|---|---|
p3.2xlarge (recommended) | $0.00 | $3.06 | $3.06 |
p3.8xlarge | $0.00 | $12.24 | $12.24 |
p3.16xlarge | $0.00 | $24.48 | $24.48 |
p3dn.24xlarge | $0.00 | $31.212 | $31.212 |
p4d.24xlarge | $0.00 | $32.773 | $32.773 |
g4dn.xlarge | $0.00 | $0.526 | $0.526 |
g4dn.2xlarge | $0.00 | $0.752 | $0.752 |
g4dn.4xlarge | $0.00 | $1.204 | $1.204 |
g4dn.8xlarge | $0.00 | $2.176 | $2.176 |
g4dn.12xlarge | $0.00 | $3.912 | $3.912 |
Additional AWS infrastructure costs
Type | Cost |
---|---|
EBS General Purpose SSD (gp3) volumes | $0.08 per GB-month of provisioned storage |
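As an illustrative estimate only (actual AWS rates vary by region): running a g4dn.xlarge for 100 hours in a month with a 100 GB gp3 volume would cost roughly 100 × $0.526 + 100 × $0.08 = $52.60 + $8.00 ≈ $60.60, excluding data transfer and other AWS charges.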
Vendor refund policy
This AMI is provided free of charge.
Legal
Vendor terms and conditions
Content disclaimer
Delivery details
64-bit (x86) Amazon Machine Image (AMI)
Amazon Machine Image (AMI)
An AMI is a virtual image that provides the information required to launch an instance. Amazon EC2 (Elastic Compute Cloud) instances are virtual servers on which you can run your applications and workloads, offering varying combinations of CPU, memory, storage, and networking resources. You can launch as many instances from as many different AMIs as you need.
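As a hedged illustration of launching an instance from an AMI with the AWS CLI, the sketch below uses placeholder values; the AMI ID, key pair, and security group are not the real values for this listing:

```bash
# Launch a GPU instance from an AMI (all IDs below are placeholders)
aws ec2 run-instances \
  --image-id ami-0123456789abcdef0 \
  --instance-type g4dn.xlarge \
  --key-name my-key-pair \
  --security-group-ids sg-0123456789abcdef0 \
  --block-device-mappings '[{"DeviceName":"/dev/sda1","Ebs":{"VolumeSize":128,"VolumeType":"gp3"}}]'
```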
Version release notes
- Ubuntu Server: 22.04 (x86)
- NVIDIA TRD Driver: 550.127.05
- Docker-ce: 27.3.1
- NVIDIA Container Toolkit: 1.16.2-1
- AWS CLI: latest
- Miniconda: latest
- JupyterLab: latest (plus other Jupyter core packages)
- NGC CLI: 3.53.0
- Git, Python3-pip
Additional details
Usage instructions
For usage instructions and the quick start guide, please refer to: https://docs.nvidia.com/ngc/ngc-deploy-public-cloud/ngc-aws/index.html
Resources
Support
Vendor support
This AMI comes with an Enterprise Support option through NVIDIA AI Enterprise. For more information, please see https://www.nvidia.com/en-us/data-center/products/ai-enterprise-suite/support/
Free support for AWS images is available through the NVIDIA Developer Forum, technical documentation, and FAQs: https://devtalk.nvidia.com/default/board/200/nvidia-gpu-cloud-ngc-users/
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.
Customer reviews
Install Drivers
You need to first change the username from root to ubuntu in order for the drivers to be installed! I feel this should have been spelled out more clearly in the directions!
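Based on this review, the driver install appears to be triggered by logging in as the default ubuntu user rather than as root. A minimal sketch of connecting that way, with a placeholder key path and hostname:

```bash
# Connect as the default 'ubuntu' user (not root); key path and address are placeholders
ssh -i ~/.ssh/my-key.pem ubuntu@ec2-203-0-113-10.compute-1.amazonaws.com
```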
AMI is not configured as advertised.
None of the advertised utilities are installed in the AMI, nor is CUDA. This is current as of 3/14/24. It appears to be a raw installation of Ubuntu 22.04, by my estimation.
root@ip-172-31-38-109:/cuda-samples/Samples/5_Domain_Specific/nbody# jupyterlab --version
jupyterlab: command not found
root@ip-172-31-38-109:/cuda-samples/Samples/5_Domain_Specific/nbody# miniconda --version
miniconda: command not found
There's been a lot of troubleshooting so far with regard to attempting to get cuda installed, so I won't copy-paste my terminal.
Drivers auto-installed on login, not boot
I wanted to use this AMI in my automation to run ML jobs on our platform. What I needed was Ubuntu 22.04 (because podman is in its repos) with the NVIDIA drivers installed. The downside of this AMI is that the NVIDIA drivers are installed via /home/ubuntu/.bashrc and not via cloud-init. I looked at /var/tmp/nvidia/driver.sh and there was no variable to set to force a driver install at cloud-init. Since my automation runs at the end of cloud-init, this doesn't work for me.
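One possible, untested workaround suggested by this review's findings would be to invoke the bundled installer from cloud-init user data so the driver is present before boot-time automation runs, assuming /var/tmp/nvidia/driver.sh can run non-interactively:

```bash
#cloud-config
runcmd:
  # Run the AMI's driver installer at first boot instead of waiting for an
  # interactive login (untested; assumes the script works non-interactively)
  - [ bash, /var/tmp/nvidia/driver.sh ]
```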
Very good
Older reviews are no longer valid; as of the date of my review the image is very good. It has all the drivers required to run optimized code on various types of NVIDIA GPUs, CUDA 12.1 preinstalled, and also Miniconda and JupyterLab.
The machine is ready to run code on GPU very easily with everything you need already in place.
Missing drivers
This AMI should be preconfigured to run NVIDIA GPU Cloud (NGC) containers such as the PyTorch one; however, it fails on launch on AWS (on a p3.2xlarge instance).
After sshing in, I see this error message:
Installing drivers ...
modprobe: FATAL: Module nvidia not found in directory /lib/modules/6.2.0-1011-aws
And sure enough, running containers such as PyTorch (https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch) does not work:
~$ docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:23.11-py3
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: nvml error: driver not loaded: unknown.
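If a launch ends up in this state, a few hedged diagnostic checks (not an official NVIDIA procedure) can show whether the kernel module ever loaded and allow re-running the installer mentioned in an earlier review:

```bash
nvidia-smi                                 # fails with "driver not loaded" if the install never completed
lsmod | grep nvidia                        # check whether the nvidia kernel module is present
tail -n 50 /var/log/cloud-init-output.log  # look for driver/toolkit install errors from first boot
sudo bash /var/tmp/nvidia/driver.sh        # re-run the bundled installer (path noted in an earlier review)
```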