AWS Deep Learning AMI (Amazon Linux)
This document describes the latest changes, additions, known issues, and fixes for Deep Learning AMI (Amazon Linux).
Release Date: February 10, 2021
Created On: February 10, 2021
Last Updated: November 04, 2021
Deprecation Notice
The Amazon Linux AMI ended its standard support on December 31, 2020 and has entered a maintenance support phase. Consequently, there will no longer be updates to the Deep Learning AMI (Amazon Linux) in new releases. Previous releases will continue to be available.
Version 50.0
Release Date: 2021-10-07
Security
- Updated docker version to docker-20.10.7-3
Version 49.1
Release Date: 2021-09-20
Changed
- Updated kernel to 4.14.238-125.422.amzn1.x86_64
Version 49.0
Release Date: 2021-09-07
Changed
- Updated jupyterlab to >=3.1.7 and notebook to >=6.4.1 in all conda environments
Security
- Updated TensorFlow to 2.3.4 in conda environment tensorflow2_latest_p37.
Version 48.0
Release Date: 2021-07-13
Changed
- Added utility packages "imageio", "plotly", "smdebug", "shap", "opencv-python", "bokeh", and "seaborn" to python3 conda environments.
Version 47.0
Release Date: 2021-06-10
Changed
- Updated awscli version to 1.19.88
- Updated Horovod version from 0.21.0 to 0.22.0 in conda environment tensorflow2_latest_p37
Security
- Updated amazonei_tensorflow_p36 environment with tensorflow-1.15.5.
Version 46.0
Release Date: 2021-05-27
Fixed
- Fixed a race condition in the ec2-user creation process that caused ec2-user to have UID 501 instead of 500.
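One way to confirm the UID on a running instance is sketched below. The helper name has_uid is hypothetical and not part of the AMI; it only wraps the standard id command.

```shell
# Succeeds when USER exists and has the expected numeric UID.
# Usage: has_uid USER EXPECTED_UID   (has_uid is a hypothetical helper name)
has_uid() (
    actual=$(id -u "$1" 2>/dev/null) || exit 1   # unknown user: fail
    [ "$actual" = "$2" ]
)

# On this AMI, ec2-user is expected to have UID 500.
if has_uid ec2-user 500; then
    echo "ec2-user has UID 500"
else
    echo "ec2-user is missing or has a different UID" >&2
fi
```

Scripts or file permissions that depend on a fixed UID can run a check like this before relying on it.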
Security
- Removed vulnerable CUDA-10.0 components (Visual Profiler, Nsight EE, and JRE) from the CUDA-10.0 installation (/usr/local/cuda-10.0).
Version 45.0
Release Date: 2021-05-25
Changed
- Upgraded runc to latest
Version 44.1
Release Date: 2021-05-07
Changed
- Fixed an issue where the UID for ec2-user changed from 500 to 501.
Version 44.0
Release Date: 2021-04-26
Changed
- Updated Nvidia Tesla driver and Fabric Manager version to 450.119.03.
Version 43.1
Release Date: 2021-04-21
Fixed
- Fixed an issue that slowed down the instance launch speed.
Version 43.0
Release Date: 2021-03-24
Developer Note
The base conda environment is no longer activated by default, and the default python and python3 commands now point to /usr/bin/python and /usr/bin/python3 respectively. To automatically activate the base conda environment upon login, run: conda config --set auto_activate_base true
Also, note that ~/.dlami has been removed to fix a problem where shell environment variables such as PATH and LD_LIBRARY_PATH contained many repeated entries, which sometimes led to issues. If you relied on ~/.dlami to set up your PATH or LD_LIBRARY_PATH environment variables, we highly recommend configuring them explicitly.
For more detail, see the release notes below.
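As a minimal sketch of such an explicit configuration, the snippet below could go in ~/.bashrc. The helper prepend_once is hypothetical, and the CUDA 10.0 directories are illustrative assumptions; substitute the directories your workload actually needs. The helper skips entries that are already present, which avoids the repeated entries described above.

```shell
# Print a colon-separated list with DIR prepended, unless DIR is already an entry.
# Usage: prepend_once DIR LIST   (prepend_once is a hypothetical helper name)
prepend_once() (
    dir=$1
    list=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;                   # already present: unchanged
        *)          printf '%s\n' "${dir}${list:+:$list}" ;;   # prepend; no stray colon if LIST is empty
    esac
)

# Assumed example directories; adjust to your CUDA version and layout.
PATH=$(prepend_once /usr/local/cuda-10.0/bin "$PATH")
LD_LIBRARY_PATH=$(prepend_once /usr/local/cuda-10.0/lib64 "${LD_LIBRARY_PATH:-}")
export PATH LD_LIBRARY_PATH

# Optional: activate the base conda environment at login again.
# conda config --set auto_activate_base true
```

Because the helper is idempotent, sourcing the file repeatedly leaves PATH and LD_LIBRARY_PATH unchanged after the first run.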
Changed
- Upgraded jupyterlab to version 3.0.8 in all python3 environments.
- Updated amazonei_tensorflow2_p36 conda environment to use Tensorflow 2.3
Fixed
- The old installation of OpenMPI in /usr/local/mpi caused /opt/amazon/openmpi/bin/mpirun to be linked incorrectly. To fix the link issue, we removed the /usr/local/mpi installation. The OpenMPI installation in /opt/amazon/openmpi remains available.
- Added two symlinks (activate, deactivate) to $HOME/anaconda3/condabin to fix a conda issue that removes $HOME/anaconda3/bin from PATH.
- Removed duplicated and non-existent shell environment definitions that had been polluting shell environment variables such as PATH and LD_LIBRARY_PATH. As a result, ~/.dlami and /etc/profile.d/var.sh have been removed, and /etc/profile.d/dlami.sh has been added. Due to this change, the base conda environment is no longer activated by default, and the default python and python3 commands now point to /usr/bin/python and /usr/bin/python3 respectively. To automatically activate the base conda environment upon login, run: conda config --set auto_activate_base true
Security
- Updated utility packages Flask-Cors, lxml, urllib3, PyYAML, Jinja2, and aiohttp to newer versions to address vulnerabilities associated with these packages.
- Updated package cryptography to address CVE-2020-36242
Version 42.0
Release Date: 2021-03-08
Security
- Updated tensorflow2_p36 to use tensorflow 2.1.3 to address security vulnerabilities
- Upgraded Tensorflow in tensorflow_p36 conda environment to version 1.15.5 to address security vulnerabilities
- Updated pyyaml to 2.4.1 to mitigate CVE-2020-14343
Version 41.1
Release Date: 2021-03-04
Changed
- Pinned IPython to major version 7 in all python3.x conda environments.
Version 41.0
Release Date: 2021-02-27
Security
- Patched system python2 and python3 for CVE-2021-3177
- Patched python in most conda environments to address CVE-2021-3177. This change also resulted in dependency upgrades. Python in conda environments amazonei_pytorch_p36 and amazonei_tensorflow2_p36 stays unchanged and will be upgraded in later versions. Conda environments with python2.7 are not patched for CVE-2021-3177 and will be patched when a patch is available.
Version 40.0
Release Date: 2021-02-09
Changed
- Added health check tools ei and health_check to PATH for pytorch Elastic Inference.
- Upgraded amazonei_mxnet_p36 conda environment to use elastic inference library eimx-1.7.0.
Fixed
- Improved conda environment isolation by automatically recovering environment variables such as PATH, LD_LIBRARY_PATH, CUDA_PATH, and CUDA_HOME during deactivation. Also, activating a conda environment now automatically links to the correct CUDA bin path, so users are no longer required to manually set up symlinks or environment variables. Finally, we fixed a problem during conda environment activation in which certain environment variables were set twice.
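A rough way to observe the recovery behaviour is to snapshot the four variables before activation and compare them after deactivation. The helper env_snapshot below is hypothetical, and the conda lines are commented out because they require conda on the instance; tensorflow2_latest_p37 is used only as an example environment name.

```shell
# Print the variables named in the note above, one NAME=value per line.
# Unset variables are printed with an empty value. (env_snapshot is a hypothetical helper.)
env_snapshot() (
    for name in PATH LD_LIBRARY_PATH CUDA_PATH CUDA_HOME; do
        eval "value=\${$name-}"          # indirect lookup of the variable called $name
        printf '%s=%s\n' "$name" "$value"
    done
)

before=$(env_snapshot)
# conda activate tensorflow2_latest_p37
# conda deactivate
after=$(env_snapshot)

if [ "$before" = "$after" ]; then
    echo "environment variables restored"
else
    echo "environment variables differ after deactivation" >&2
fi
```

Calling env_snapshot while the environment is active should also show whether the CUDA bin path was added to PATH.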
Version 39.1
Release Date: 2021-01-26
Security
- Patched CVE-2021-3156.
Version 39.0
Release Date: 2021-01-19
Changed
- Upgraded pytorch_latest_p37 conda environment to use PyTorch 1.7.1, torchvision 0.8.2 with CUDA 10.1
Added
- Added amazonei_pytorch_latest_p36 which provides PyTorch 1.5.1 support on Elastic Inference.
Version 38.0
Release Date: 2020-12-08
Changed
- Added the SageMaker Clarify Binary to calculate the Bias Metrics in python3 conda environments excluding the Neuron environments.
- Upgraded sagemaker-python-sdk, boto3, botocore, and awscli in all python3 conda environments to the latest versions.
Fixed
- Fixed an issue where the additional bin directories (such as /home/ubuntu/anaconda3/envs/tensorflow2_latest_p37/cpu/bin) were not added to PATH when activating the conda environment. This issue could be seen in conda environments tensorflow_p27, tensorflow_p36, tensorflow_p37, tensorflow2_latest_p37, and mxnet_latest_p37.
Version 37.0
Release Date: 2020-12-01
Changed
- Upgraded conda to version 4.8.4 that comes with new enhancements and bug-fixes. This change will affect Deep Learning AMI (Amazon Linux), Deep Learning AMI (Amazon Linux 2), Deep Learning AMI (Ubuntu 16.04), and Deep Learning AMI (Ubuntu 18.04) users.
- Upgraded tensorflow in conda env tensorflow2_p36 from version 2.1.0 to 2.1.2 for Deep Learning AMI (Amazon Linux), Deep Learning AMI (Amazon Linux 2), Deep Learning AMI (Ubuntu 16.04), and Deep Learning AMI (Ubuntu 18.04).
- Upgraded Tensorflow in tensorflow_p36 conda environment to version 1.15.4. This change applies to Deep Learning AMI (Amazon Linux).
Removed
- Removed broken test scripts present in $HOME/src/bin directory. This change affects Deep Learning AMI (Amazon Linux), Deep Learning AMI (Amazon Linux 2), Deep Learning AMI (Ubuntu 16.04), and Deep Learning AMI (Ubuntu 18.04) users.
Security
- Includes TensorFlow upgrade from 2.1.0 to 2.1.2, a security patch from Google.
- Includes TensorFlow upgrade from 1.15.3 to 1.15.4, a security patch from Google.
Version 36.0
Release Date: 2020-11-02
Changed
- Upgraded EFA installer to version 1.10.0.
- Upgraded sagemaker to version 2 in all python3.x conda environments. This upgrade contains breaking changes, and the sagemaker python sdk maintainers have provided tools to automatically upgrade v1 code to v2.
- Updated pytorch_latest_p36 conda environment to PyTorch 1.7.0
Security
- Updated tensorflow2_latest_p37 conda environment to Tensorflow 2.3.1 to address security vulnerabilities.
Version 35.0
Release Date: 2020-10-08
Changed
- Updated NVIDIA Driver to version 450.80.02
Removed
- Removed CUDA 11.0 and NVIDIA Fabric Manager from Deep Learning AMI (Amazon Linux)
Fixed
- Resolved an issue where python packages installed (using pip) at the system level were conflicting with packages installed by operating system package managers (yum or apt).
Version 34.0
Release Date: 2020-09-11
Changed
- Added torchserve to pytorch_latest_p36 conda environment
- An additional environment, mxnet_latest_p37, has been added for MXNet-1.7.0. mxnet_p27 and mxnet_p36 environments continue to provide MXNet 1.6.
Version 33.0
Release Date: 2020-08-19
Changed
- Upgraded ei-tool to version 1.7.0.
- Added tensorflow-serving 2.3.0 to tensorflow2_latest_p37 environment.
- Upgraded tensorflow-serving-api to version 2.3.0 in tensorflow2_latest_p37 environments.
Version 32.0
Release Date: 2020-08-07
Changed
- The Conda environment ‘tensorflow2_latest_p37’ has been upgraded to use TensorFlow v2.3. Note that this environment does not contain TensorFlow Serving 2.3 (TFS-2.3), which is not yet generally available. TFS-2.3 will be added to this environment once TFS-2.3 has been officially released.
- SageMaker Python SDK is currently pinned at version 1.72.0. It will be upgraded in upcoming releases.
Version 31.0
Release Date: 2020-08-03
Changed
- Improved time to activate tensorflow2_latest_p37 Conda environment
- Upgraded Conda environments tensorflow_p27, tensorflow_p36 to Tensorflow v1.15.3
- Upgraded Conda environment pytorch_latest_p36 to use pytorch=1.6.0 and torchvision=0.7.0
- Upgraded Conda environments mxnet_p27, mxnet_p36, amazonei_mxnet_p27, amazonei_mxnet_p36, aws_neuron_mxnet_p36, amazonei_pytorch_p36 to use multi-model-server==1.1.2
- CUDA 8.0/9.0/9.2 have been removed from the AMI
Fixed
- Fixed an error where the shared object file libopencv_dnn.so.4.2 could not be opened.
Version 30.0
Release Date: 2020-07-19
Changed
- An additional environment, tensorflow2_latest_p37, has been added for TensorFlow 2.2 with Python 3.7, CUDA 10.2 and NCCL 2.6.4. tensorflow2_p27 and tensorflow2_p36 environments continue to provide TensorFlow 2.1.
- For environment tensorflow2_latest_p37:
- Added tensorflow2 serving which is available as tensorflow2_latest_model_server
- As per guidance from the Keras project, customers must switch to using tf.keras instead of keras. The tensorflow2_latest_p37 environment does not contain standalone keras libraries.
- Updated Horovod to version 0.19.4
- DLAMI ships with internally tested OpenMPI installed at /opt/amazon/openmpi/bin/mpirun.
- EFA on the DLAMI has only been tested with OpenMPI 4.0.2 installed by the EFA Installer version 1.7.1 at /opt/amazon/openmpi.
- AWS OFI NCCL plugin updated for CUDA 10.0, 10.1 and 10.2.
- To run tests on multi-node applications with EFA, LD_LIBRARY_PATH needs to be explicitly added to tests as explained in the document tutorial-efa-using. For example, for CUDA 10.0: -x LD_LIBRARY_PATH=/usr/local/cuda-10.0/efa/lib:/usr/local/cuda-10.0/lib:/usr/local/cuda-10.0/lib64:/usr/local/cuda-10.0:/opt/amazon/efa/lib64:/opt/amazon/openmpi/lib64:$LD_LIBRARY_PATH
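The LD_LIBRARY_PATH value in the example above can be generated per CUDA version with a small helper. This is a sketch: the function name efa_ld_library_path is hypothetical, and it assumes that the CUDA 10.1 and 10.2 installations follow the same directory layout as the CUDA 10.0 example, which the note does not state. Unlike the literal example, it omits the trailing colon when LD_LIBRARY_PATH is empty.

```shell
# Print the library search path from the note above for a given CUDA version.
# Usage: efa_ld_library_path 10.0   (hypothetical helper name)
# Assumption: CUDA 10.1 and 10.2 use the same layout as the documented 10.0 paths.
efa_ld_library_path() (
    cuda=/usr/local/cuda-$1
    printf '%s\n' "$cuda/efa/lib:$cuda/lib:$cuda/lib64:$cuda:/opt/amazon/efa/lib64:/opt/amazon/openmpi/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
)

# Hypothetical launch; the host file and test binary are placeholders:
# /opt/amazon/openmpi/bin/mpirun -x LD_LIBRARY_PATH="$(efa_ld_library_path 10.0)" \
#     --hostfile my-hosts -n 2 ./my_nccl_test
```

The -x option asks Open MPI to export the named variable to the launched processes, so every node sees the same search path.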
Version 29.1
Release Date: 2020-06-14
Changed
- An additional environment, pytorch_latest_p36, has been added for PyTorch 1.5.0. pytorch_p27 and pytorch_p36 environments continue to provide PyTorch 1.4.
- Upgraded amazonei_mxnet_p27 and amazonei_mxnet_p36 environment with numpy version > 1.16.0
- Enabled use of “conda activate” as an additional environment activation command while “source activate” continues to work. In addition, you can now automatically activate base conda environment upon login by running “conda config --set auto_activate_base true”.
Version 29.0
Release Date: 2020-05-20
Changed
- An additional environment, pytorch_latest_p36, has been added for PyTorch 1.5.0. pytorch_p27 and pytorch_p36 environments continue to provide PyTorch 1.4.
- Upgraded amazonei_mxnet_p27 and amazonei_mxnet_p36 environment with numpy version > 1.16.0
Version 28.1
Release Date: 2020-05-04
Changed
- Downgraded openmpi to version 3.1.0 in the base conda environment.
Version 28.0
Release Date: 2020-04-29
Changed
- Added amazonei_tensorflow2_p27 and amazonei_tensorflow2_p36 environments, with tests
- Updated TensorFlow to 2.1.0 and 1.15.2
- Includes amazonei_tensorflow_p27 and amazonei_tensorflow_p36 update to 1.15.2
- Removed sagemaker-pyspark
- Includes fix for the below issue:
- On Ubuntu 16 and Ubuntu 18 DLAMI, if the gcc version changes after running apt-get update && apt-get upgrade, nvidia-smi fails with the error message: “NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.” To work around the issue, the NVIDIA driver needs to be reinstalled.
- Anaconda upgrade:
- Upgraded conda to the latest version, 4.8.3, and upgraded the anaconda distribution version. With the latest conda version, uninstalling any package that belongs to the anaconda distribution also removes the anaconda distribution from that environment in the DLAMI. Refer to this GitHub issue for more information: https://github.com/conda/conda/issues/9807#issuecomment-610913759
- For environments having python 2.7, anaconda distribution version is 2019.10
- For environments having python 3.6, anaconda distribution version is 2020.02
- For environments starting with aws_neuron and environment amazonei_pytorch_p36, anaconda distribution is unpinned
Version 27.0
Release Date: 2020-03-04
Changed
- Increased AMI size to 105 GB. Available space is now 15 GB
- Added aws_neuron_pytorch_p36 environment, with test
- Added amazonei_pytorch_p36 environment, with test
- Includes Tensorflow update to 1.15.2
- Includes Tensorflow update to 2.1.0
- Includes Mxnet update to 1.6.0
- Includes Pytorch update to 1.4.0