Deep Learning AMI with Source Code (CUDA 9, Ubuntu)

Comes with deep learning frameworks configured with CUDA 9. Includes Apache MXNet, TensorFlow, PyTorch, Keras 2.0 and Caffe2. THIS AMI IS NOT UPDATED ANYMORE. For latest AMI, visit https://aws.amazon.com/marketplace/pp/B077GCH38C Release tags/Branches used: Apache MXNet 1.1 (with Gluon) Caffe2 0.8.1...

Customer Reviews

4

no nvidia-smi and nvcc and tensorflow

  • By stni
  • on 01/02/2018

ubuntu@ip-172-31-33-144:/usr/local$ nvcc
The program 'nvcc' is currently not installed. You can install it by typing:
sudo apt install nvidia-cuda-toolkit

nvidia-smi command not found.

TensorFlow cannot be imported in Python.

How to use it?
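
The checks described in this review can be gathered into one small script. This is a minimal sketch, assuming a Python 3 interpreter is available on the instance; the function name `check_environment` is hypothetical and is not part of the AMI or of any framework.

```python
import importlib.util
import shutil


def check_environment(tools=("nvcc", "nvidia-smi"), modules=("tensorflow",)):
    """Return a dict mapping each tool or module name to True if it was found."""
    report = {}
    for tool in tools:
        # shutil.which returns the executable's full path, or None if it is not on PATH.
        report[tool] = shutil.which(tool) is not None
    for module in modules:
        # find_spec returns None when a top-level module is not importable
        # from the interpreter running this script.
        report[module] = importlib.util.find_spec(module) is not None
    return report


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'NOT FOUND'}")
```

A tool reported as missing here is only missing from PATH; it may still be installed somewhere else on the image. Likewise, a module is only checked against the interpreter that runs the script, so a package installed for a different Python version will show as not found.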


Fails to install my ssh key

  • By Bad AMI
  • on 12/20/2017

Unable to connect to the instance. Definitely not a problem with my key. Have tried launching from the marketplace and also directly from the control panel.


Tensorflow Batchnorm Issue but otherwise good

  • By Everett Berry
  • on 11/15/2017

This is a great instance for the CUDA versions and configuration, and once I fixed the issue below my training was very fast. HOWEVER, you should be very careful when using TensorFlow on this instance. It is a Frankenstein's monster of bleeding-edge TensorFlow (1.4-rc0) plus some PRs that have not even been merged to master, included to take advantage of the Voltas and CUDA 9.

My issue was:
'AttributeError: can't set attribute' while using the BatchNormalization layer in TensorFlow. It relates to this PR (https://github.com/tensorflow/tensorflow/pull/13388), where a 'dtype' is added to BatchNorm to allow for FP16 and FP32 operations. The TensorFlow included in this AMI has an extra line, 'self.dtype = dtype', on line 145 of /usr/local/lib/python2.7/dist-packages/tensorflow/python/layers/normalization.py, which causes the error above when using the normal BatchNorm API. Commenting this line out fixes the problem.

Weirdly, this assignment on line 145 is not included in the PR (although the dates and authors match), so I think there must have been a rebase or something. Regardless, the line exists in the TensorFlow in this AMI and will cause you pain on almost any neural network, because almost all of them use BatchNorm. I couldn't figure out where to post this because the code on GitHub does not have this problem.
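
One plausible way an assignment like 'self.dtype = dtype' produces this error in Python is when `dtype` is exposed as a read-only property on a parent class. The sketch below illustrates that mechanism only. The classes `BaseLayer`, `BrokenBatchNorm`, and `FixedBatchNorm` are hypothetical stand-ins written for this example; they are not TensorFlow's code, and the review does not confirm that this is the exact cause on the AMI.

```python
class BaseLayer:
    """Hypothetical base class that exposes dtype as a read-only property."""

    def __init__(self, dtype="float32"):
        self._dtype = dtype

    @property
    def dtype(self):
        # No setter is defined, so assigning to self.dtype raises AttributeError.
        return self._dtype


class BrokenBatchNorm(BaseLayer):
    """Mimics the extra assignment described in the review."""

    def __init__(self, dtype="float32"):
        super().__init__(dtype=dtype)
        self.dtype = dtype  # fails: the property has no setter


class FixedBatchNorm(BaseLayer):
    """Same layer with the assignment removed; the base class stores dtype."""

    def __init__(self, dtype="float32"):
        super().__init__(dtype=dtype)


def try_build(layer_cls):
    """Return 'ok' if the layer can be constructed, else the exception type name."""
    try:
        layer_cls(dtype="float16")
        return "ok"
    except AttributeError as exc:
        return type(exc).__name__


if __name__ == "__main__":
    print("BrokenBatchNorm:", try_build(BrokenBatchNorm))
    print("FixedBatchNorm:", try_build(FixedBatchNorm))
```

Under this assumption, removing the assignment works because the parent class already records the value. The exact wording of the error message varies across Python versions, so the sketch checks only the exception type.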

Other than that, this is a fine AMI, and I'm grateful to AWS for providing it and for their continued advances in GPUs.


Incomplete

  • By TF adm
  • on 11/08/2017

I tried to run testall, but it fails.
The TensorFlow test fails with keras not found.
Trying to upgrade TensorFlow breaks the install: it then runs in CPU-only mode, with no GPU.
This AMI needs work.
TensorFlow is 1.4.0-rc0.

