
Missing drivers

  • By AI researcher unhappy with NVIDIA software
  • on 12/18/2023

This should be preconfigured to run NVIDIA GPU Cloud (NGC) containers such as the PyTorch one. However, it fails at launch on AWS (tested on a p3.2xlarge instance).

After connecting over SSH, I see this error message:
```
Installing drivers ...
modprobe: FATAL: Module nvidia not found in directory /lib/modules/6.2.0-1011-aws
```
And sure enough, running containers such as PyTorch (https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch) does not work:

```
~$ docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:23.11-py3
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: nvml error: driver not loaded: unknown.
```
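
For anyone wanting to confirm the same condition on their own instance, here is a minimal sketch of a check. It assumes the driver module file is named `nvidia.ko` (possibly with a compression suffix) and sits somewhere under `/lib/modules/<kernel release>`. The function name `nvidia_module_status` is hypothetical and not something shipped with the image.

```shell
# Hypothetical helper (not part of the image): report whether an nvidia
# kernel module file exists for a given kernel release.
#   $1: kernel release (defaults to the running kernel, from `uname -r`)
#   $2: modules root   (defaults to /lib/modules; overridable for testing)
nvidia_module_status() {
    kernel="${1:-$(uname -r)}"
    root="${2:-/lib/modules}"
    # Assumption: the module file is nvidia.ko, optionally compressed
    # (nvidia.ko.zst, nvidia.ko.xz, ...). A missing directory counts as "missing".
    if find "$root/$kernel" -name 'nvidia.ko*' 2>/dev/null | grep -q .; then
        echo "present"
    else
        echo "missing"
    fi
}
```

Calling `nvidia_module_status` with no arguments checks the running kernel. A result of `missing` would be consistent with the `modprobe` error above, where no module was found for kernel `6.2.0-1011-aws`.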

