AWS Deep Learning AMI (Amazon Linux 2)
This document describes the latest changes, additions, known issues, and fixes for Deep Learning AMI (Amazon Linux 2).
Release Date: February 10, 2021
Created On: February 10, 2021
Last Updated: February 05, 2025
AMI Name format:
- Deep Learning Proprietary Nvidia Driver AMI (Amazon Linux 2) Version ${XX.X}
- Deep Learning OSS Nvidia Driver AMI (Amazon Linux 2) Version ${XX.X}
Supported EC2 Instances:
- Please refer to Important changes to DLAMI
- Deep Learning with OSS Nvidia Driver supports G4dn, G5, G6, Gr6, P4d, P4de, P5.
- Deep Learning with Proprietary Nvidia Driver supports G3 (G3.16x not supported), P3, P3dn
The AMI includes the following:
- Supported AWS Service: EC2
- Operating System: Amazon Linux 2
- Compute Architecture: x86
- Conda environments framework and python versions:
- Deep Learning OSS Nvidia Driver AMI (Amazon Linux 2):
- python3: Python 3.10
- tensorflow2_p310: TensorFlow 2.16, Python 3.10
- pytorch_p310: PyTorch 2.2, Python 3.10
- Deep Learning Proprietary Nvidia Driver AMI (Amazon Linux 2):
- python3: Python 3.10
- tensorflow2_p310: TensorFlow 2.16, Python 3.10
- pytorch_p310: PyTorch 2.2, Python 3.10
- NVIDIA Driver:
- OSS Nvidia driver: 550.144.03
- Proprietary Nvidia driver: 550.144.03
- NVIDIA CUDA12.1-12.4 stack:
- CUDA, NCCL and cuDNN installation path: /usr/local/cuda-xx.x/
- Default CUDA: 12.1
- PATH /usr/local/cuda points to CUDA12.1
- Updated below env vars:
- LD_LIBRARY_PATH to have /usr/local/cuda-12.1/lib:/usr/local/cuda-12.1/lib64:/usr/local/cuda-12.1:/usr/local/cuda-12.1/targets/x86_64-linux/lib
- PATH to have /usr/local/cuda-12.1/bin/:/usr/local/cuda-11.8/include/
- For any different CUDA version, please update LD_LIBRARY_PATH accordingly.
- Compiled NCCL Version for CUDA 12.1-12.4: 2.22.3
- NCCL Tests Location:
- all_reduce, all_gather and reduce_scatter: /usr/local/cuda-xx.x/efa/test-cuda-xx.x/
- To run NCCL tests, LD_LIBRARY_PATH must include the paths below.
- Common PATHs are already added to LD_LIBRARY_PATH:
- /opt/amazon/efa/lib:/opt/amazon/openmpi/lib:/opt/aws-ofi-nccl/lib:/usr/local/lib:/usr/lib
- For any different CUDA version, please update LD_LIBRARY_PATH accordingly.
- EFA Installer: 1.38.0
- GDRCopy: 2.4
- AWS OFI NCCL: 1.13.0
- System location: /usr/local/cuda-xx.x/efa
- This is added to run NCCL tests located at /usr/local/cuda-xx.x/efa/test-cuda-xx.x/
- The PyTorch package also ships with a dynamically linked AWS OFI NCCL plugin, installed as the conda package aws-ofi-nccl-dlc, and PyTorch uses that package instead of the system AWS OFI NCCL.
- NCCL Tests Location: /usr/local/cuda-xx.x/efa/test-cuda-xx.x/
- AWS CLI v2 at /usr/local/bin/aws2 and AWS CLI v1 at /usr/local/bin/aws
- EBS volume type: gp3
- Query AMI-ID with SSM Parameter (example region is us-east-1):
- OSS Nvidia Driver:
- aws ssm get-parameter --name /aws/service/deeplearning/ami/x86_64/multi-framework-oss-nvidia-driver-amazon-linux-2/latest/ami-id --region us-east-1 --query "Parameter.Value" --output text
- Proprietary Nvidia Driver:
- aws ssm get-parameter --name /aws/service/deeplearning/ami/x86_64/multi-framework-proprietary-nvidia-driver-amazon-linux-2/latest/ami-id --region us-east-1 --query "Parameter.Value" --output text
- Query AMI-ID with AWSCLI (example region is us-east-1):
- OSS Nvidia Driver:
- aws ec2 describe-images --region us-east-1 --owners amazon --filters 'Name=name,Values=Deep Learning OSS Nvidia Driver AMI (Amazon Linux 2) Version ??.?' 'Name=state,Values=available' --query 'reverse(sort_by(Images, &CreationDate))[:1].ImageId' --output text
- Proprietary Nvidia Driver:
- aws ec2 describe-images --region us-east-1 --owners amazon --filters 'Name=name,Values=Deep Learning Proprietary Nvidia Driver AMI (Amazon Linux 2) Version ??.?' 'Name=state,Values=available' --query 'reverse(sort_by(Images, &CreationDate))[:1].ImageId' --output text
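The CUDA stack notes above ask you to update LD_LIBRARY_PATH and PATH when switching away from the default CUDA 12.1. A minimal sketch of that step follows, assuming the /usr/local/cuda-xx.x layout described in this document. cuda_lib_path is a hypothetical helper name, not something shipped on the AMI, and 12.4 is only an example version.

```shell
# Hypothetical helper (not shipped on the AMI): print the library search
# path for one CUDA stack, mirroring the four directories the AMI sets
# for the default CUDA 12.1.
cuda_lib_path() (
    root="/usr/local/cuda-$1"    # $1 is a version such as 12.4
    printf '%s\n' "${root}/lib:${root}/lib64:${root}:${root}/targets/x86_64-linux/lib"
)

# Example use in an interactive shell (assumes /usr/local/cuda-12.4 exists):
#   export LD_LIBRARY_PATH="$(cuda_lib_path 12.4):${LD_LIBRARY_PATH}"
#   export PATH="/usr/local/cuda-12.4/bin:${PATH}"
```

Prepending leaves the existing CUDA 12.1 entries in place but lets the selected stack's libraries be found first. It is worth confirming that the target directory exists before exporting.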
Notice
EFA Updates from 1.37 to 1.38 (Released on 2025-02-05)
- EFA now bundles the AWS OFI NCCL plugin, which can now be found in /opt/amazon/ofi-nccl rather than the original /opt/aws-ofi-nccl/. If updating your LD_LIBRARY_PATH variable, please ensure that you modify your OFI NCCL location properly.
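As a rough illustration of the path change above, the following sketch rewrites the old plugin prefix to the new one in a colon-separated path list. swap_ofi_nccl_prefix is a hypothetical helper name, and the sketch assumes the subdirectory under the new prefix (for example lib) matches the old layout, which should be verified on the instance.

```shell
# Hypothetical helper: replace the pre-1.38 AWS OFI NCCL prefix with the
# EFA-bundled one in a colon-separated path list passed as $1.
swap_ofi_nccl_prefix() {
    printf '%s\n' "$1" | sed 's#/opt/aws-ofi-nccl#/opt/amazon/ofi-nccl#g'
}

# Example use (assumption: the layout under the new prefix is unchanged):
#   export LD_LIBRARY_PATH="$(swap_ofi_nccl_prefix "$LD_LIBRARY_PATH")"
```

Entries that do not mention the old prefix pass through untouched, so running the helper a second time has no further effect.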
Neuron Conda Environment Removal
- Deep Learning Proprietary Nvidia Driver AMIs released after July 18, 2024 will be shipped without neuron conda environments for PyTorch and TensorFlow. Please use the Neuron DLAMIs on the DLAMI Release Notes instead, to utilize neuron environments.
Audit Package Removal
- DLAMIs released between March 26, 2024 (2024-03-26) and April 12, 2024 (2024-04-12) were shipped without the audit package. If you require this package for your logging and monitoring needs, please migrate your workflows to the latest DLAMI, which has the audit package installed.
Horovod
- Horovod is removed from the current pytorch_p310 and tensorflow2_p310 conda environments on the DLAMI. Customers can install the Horovod libraries on their DLAMIs for their distributed training jobs by following the Horovod guidelines.
Release Date: 2025-02-05
AMI Names:
- Deep Learning Proprietary Nvidia Driver AMI (Amazon Linux 2) Version 80.2
- Deep Learning OSS Nvidia Driver AMI (Amazon Linux 2) Version 80.4
Updated
- Upgraded EFA version from 1.37.0 to 1.38.0
- EFA now bundles the AWS OFI NCCL plugin, which can now be found in /opt/amazon/ofi-nccl rather than the original /opt/aws-ofi-nccl/. If updating your LD_LIBRARY_PATH variable, please ensure that you modify your OFI NCCL location properly.
- Upgraded Nvidia Container Toolkit from 1.17.3 to 1.17.4
Release Date: 2025-01-17
AMI Names:
- Deep Learning Base OSS Nvidia Driver AMI (Amazon Linux 2) Version 68.3
- Deep Learning Base Proprietary Nvidia Driver AMI (Amazon Linux 2) Version 66.0
Updated
- Upgraded Nvidia driver from version 550.127.05 to 550.144.03 to address CVEs listed in the NVIDIA GPU Display Driver Security Bulletin for January 2025
Release Date: 2024-12-09
AMI Names:
- Deep Learning OSS Nvidia Driver AMI (Amazon Linux 2) Version 80.1
- Deep Learning Proprietary Nvidia Driver AMI (Amazon Linux 2) Version 79.9
Updated
- Upgraded Nvidia Container Toolkit from version 1.17.0 to 1.17.3
Release Date: 2024-11-11
AMI Names:
- Deep Learning OSS Nvidia Driver AMI (Amazon Linux 2) Version 79.9
- Deep Learning Proprietary Nvidia Driver AMI (Amazon Linux 2) Version 79.7
Updated
- Upgraded Nvidia Container Toolkit from version 1.16.2 to 1.17.0, addressing the security vulnerability CVE-2024-0134.
Release Date: 2024-10-22
AMI Names:
- Deep Learning OSS Nvidia Driver AMI (Amazon Linux 2) Version 79.6
- Deep Learning Proprietary Nvidia Driver AMI (Amazon Linux 2) Version 79.6
Updated
- Upgraded Nvidia driver from version 550.90.07 to 550.127.05 to address CVEs listed in the NVIDIA GPU Display Driver Security Bulletin for October 2024
Release Date: 2024-10-03
AMI Names:
- Deep Learning OSS Nvidia Driver AMI (Amazon Linux 2) Version 79.3
- Deep Learning Proprietary Nvidia Driver AMI (Amazon Linux 2) Version 79.3
Updated
- Upgraded Nvidia Container Toolkit from version 1.16.1 to 1.16.2, addressing the security vulnerability CVE-2024-0133.
Release Date: 2024-07-18
AMI Names:
- Deep Learning OSS Nvidia Driver AMI (Amazon Linux 2) Version 78.6
- Deep Learning Proprietary Nvidia Driver AMI (Amazon Linux 2) Version 78.7
Updated
- Removed aws_neuron_pytorch_p38 and aws_neuron_tensorflow_p38 conda environments from the Deep Learning Proprietary Nvidia Driver AMI.
- Removed Inf1 instance family support from the Deep Learning Proprietary Nvidia Driver AMI.
Release Date: 2024-06-06
AMI Names:
- Deep Learning OSS Nvidia Driver AMI (Amazon Linux 2) Version 78.5
- Deep Learning Proprietary Nvidia Driver AMI (Amazon Linux 2) Version 78.5
Updated
- Updated Nvidia driver version to 535.183.01 from 535.161.08
Release Date: 2024-05-17
AMI Names:
- Deep Learning OSS Nvidia Driver AMI (Amazon Linux 2) Version 78.1
- Deep Learning Proprietary Nvidia Driver AMI (Amazon Linux 2) Version 78.1
Updated
- Updated torchserve from v0.8.2 to v0.11.0 in the pytorch_p310 environment.
Release Date: 2024-05-07
AMI Names:
- Deep Learning OSS Nvidia Driver AMI (Amazon Linux 2) Version 78.0
- Deep Learning Proprietary Nvidia Driver AMI (Amazon Linux 2) Version 78.0
Updated
- TensorFlow version updated from 2.15 to 2.16 in the tensorflow2_p310 environment.
- Updated EFA version from version 1.30 to version 1.32
- Updated AWS OFI NCCL plugin from version 1.7.4 to version 1.9.1
- Updated Nvidia container toolkit from version 1.13.5 to version 1.15.0
- NOTE: Version 1.15.0 does NOT include the nvidia-container-runtime and nvidia-docker2 packages. It is recommended to use the nvidia-container-toolkit packages directly by following the Nvidia container toolkit docs.
Added
- Added CUDA12.3 stack with CUDA12.3, NCCL 2.21.5, CuDNN 8.9.7
Removed
- Removed CUDA11.7, CUDA12.0 stacks present at /usr/local/cuda-11.7 and /usr/local/cuda-12.0
- Removed nvidia-docker2 package and its command nvidia-docker as part of Nvidia container toolkit update from 1.13.5 to 1.15.0 which does NOT include the nvidia-container-runtime and nvidia-docker2 packages.
Release Date: 2024-04-04
AMI Names:
- Deep Learning OSS Nvidia Driver AMI (Amazon Linux 2) Version 77.0
- Deep Learning Proprietary Nvidia Driver AMI (Amazon Linux 2) Version 77.0
Updated
- PyTorch version updated from 2.1 to 2.2 in the pytorch_p310 environment.
- For OSS Nvidia driver DLAMIs, added support for G6 and Gr6 EC2 instances. Please refer to the EC2 instance selection page for more information.
Release Date: 2024-03-29
AMI Names:
- Deep Learning OSS Nvidia Driver AMI (Amazon Linux 2) Version 76.8
- Deep Learning Proprietary Nvidia Driver AMI (Amazon Linux 2) Version 76.9
Updated
- Updated Nvidia driver from 535.104.12 to 535.161.08 in both Proprietary and OSS Nvidia driver DLAMIs.
- The new supported instances for each DLAMI are as follows:
- Deep Learning with Proprietary Nvidia Driver supports G3 (G3.16x not supported), P3, P3dn, Inf1
- Deep Learning with OSS Nvidia Driver supports G4dn, G5, P4d, P4de.
Removed
- Removed G4dn, G5, G3.16x EC2 instances support from Proprietary Nvidia driver DLAMI.
Version 76.8
Release Date: 2024-03-20
AMI Names:
- Deep Learning Proprietary Nvidia Driver AMI (Amazon Linux 2) Version 76.8
Added
- Added awscliv2 in the AMI as /usr/local/bin/aws2, alongside awscliv1 as /usr/local/bin/aws on Proprietary Nvidia Driver AMI
Version 76.7
Release Date: 2024-03-20
AMI Names:
- Deep Learning OSS Nvidia Driver AMI (Amazon Linux 2) Version 76.7
Added
- Added awscliv2 in the AMI as /usr/local/bin/aws2, alongside awscliv1 as /usr/local/bin/aws on OSS Nvidia Driver AMI
- Updated the OSS Nvidia driver DLAMI with G4dn and G5 support. Current instance support is as follows:
- Deep Learning Base Proprietary Nvidia Driver AMI (Amazon Linux 2) supports P3, P3dn, G3, G5, G4dn.
- Deep Learning Base OSS Nvidia Driver AMI (Amazon Linux 2) supports G4dn, G5, P4, P5.
- OSS Nvidia driver DLAMIs are recommended to be used for G4dn, G5, P4, P5.
Version 76.3
Release Date: 2024-02-14
Updated
- Updated TensorFlow from 2.13.0 to 2.15.0
- Updated EFA from 1.29.0 to 1.30.0
- Updated AWS-OFI-NCCL from 1.7.3-aws to 1.7.4-aws
- Updated Nvidia Driver to 535.104.12 on Deep Learning Proprietary Nvidia Driver AMI
- Updated Nvidia Driver to 535.154.05 on Deep Learning OSS Nvidia Driver AMI
Version 76.2
Release Date: 2024-02-02
AMI Names:
- Deep Learning Proprietary Nvidia Driver AMI (Amazon Linux 2) Version 76.2
- Deep Learning OSS Nvidia Driver AMI (Amazon Linux 2) Version 76.4
Security
- Updated runc package version to consume patch for CVE-2024-21626.
Version 76.1
Release Date: 2023-12-27
Updated
- Updated PyTorch from 2.0.1 to 2.1.0
Version 75.1
Release Date: 2023-11-17
AMI Names:
Please refer to Important changes to DLAMI
- Deep Learning OSS Nvidia Driver AMI (Amazon Linux 2) Version 75.1
- Deep Learning Proprietary Nvidia Driver AMI (Amazon Linux 2) Version 75.1
Added
- AWS Deep Learning AMI (DLAMI) is split into two separate groups:
- DLAMI that uses Nvidia Proprietary Driver (to support P3, P3dn, G3, G5, G4dn).
- DLAMI that uses Nvidia OSS Driver to enable EFA (to support P4, P5).
- Please refer to the public announcement for more information on the DLAMI split.
- AWS cli queries for above are in the release notes under bullet point Query AMI-ID with AWSCLI (example region is us-east-1)
Updated
- EFA updated from 1.26.1 to 1.29.0
- GDRCopy updated from 2.3 to 2.4
Version 74.4
Release Date: 2023-10-27
Updated
- AWS OFI NCCL Plugin updated from version 1.7.2 to version 1.7.3
- Updated CUDA 12.0-12.1 directories with NCCL version 2.18.5
- CUDA12.1 updated as the default CUDA Version
- Updated LD_LIBRARY_PATH to have /usr/local/cuda-12.1/targets/x86_64-linux/lib/:/usr/local/cuda-12.1/lib:/usr/local/cuda-12.1/lib64:/usr/local/cuda-12.1 and PATH to have /usr/local/cuda-12.1/bin/
- For customers looking to change to any different CUDA version, please define the LD_LIBRARY_PATH and PATH variables accordingly.
- Updated Pillow from version 9.4.0 to 10.1.0 to fix SNYK-PYTHON-PILLOW-5918878 in all conda environments
- Updated opencv-python from 4.8.0.74 to 4.8.1.78 to fix SNYK-PYTHON-OPENCVPYTHON-5926695 in all conda environments
Added
- Kernel Live Patching is now enabled. Live patching enables customers to apply security vulnerability and critical bug patches to a running Linux kernel, without reboots or disruptions to running applications.
- Please note that live patching support for kernel 5.10.192 will end on 11/30/23.
- For more information please reference the official AWS documents here - https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/al2-live-patching.html
Version 74.0
Release Date: 2023-07-19
Updated
- Updated TensorFlow from 2.12 to 2.13
- Horovod has been removed from the conda environment in this release. See Notice for details on installing horovod.
Version 73.1
Release Date: 2023-06-12
Updated
- Updated PyTorch from 2.0.0 to 2.0.1
Version 73.0
Release Date: 2023-05-30
Removed
- Removed the following out-of-support frameworks:
- All MXNet conda environments such as mxnet_p38 and aws_neuron_mxnet_p37
- All Elastic Inference conda environments such as amazonei_mxnet_p36, amazonei_pytorch_latest_p36 and amazonei_tensorflow2_p36
Version 72.0
Release Date: 2023-05-10
Update
- Updated conda environment name from aws_neuron_pytorch_p37 to aws_neuron_pytorch_p38 to update PyTorch version to 1.13 with python3.8 support
- Updated conda environment name from aws_neuron_tensorflow2_p37 to aws_neuron_tensorflow2_p38 to update TensorFlow version to 2.10 with python3.8 support
Version 71.0
Release Date: 2023-03-30
Update
- Updated pytorch_p39 to pytorch_p310
- PyTorch version updated from 1.13.1 to 2.0 in the pytorch_p310 environment.
- The PyTorch package comes with a statically linked custom NCCL 2.16.2 that supports the dynamic buffer depth patch, so it won’t use the system NCCL. Custom NCCL source code is available at: https://github.com/NVIDIA/nccl/tree/inc_nsteps
- The PyTorch package also ships with a dynamically linked AWS OFI NCCL plugin, installed as the conda package aws-ofi-nccl-dlc, and PyTorch uses that package instead of the system AWS OFI NCCL.
- Torch.Compile Support
- PyTorch 2.0 includes torch.compile() for training. Please refer to the PyTorch documentation for usage.
- Torch.compile is a beta feature of PyTorch 2.0 and we are continuing to test it extensively on AWS. As of this release:
- Torch.compile has been tested on P4, P3 and G5 instances.
- Torch.compile has been tested with its default setting using TorchInductor as backend.
- Torch.compile has been tested at float32 precision for training
- NOTE: The “Triton” based Torch Inductor is not supported on G3 instances. In this case, “eager mode” will need to be used. Please see the OpenAI Triton GPU compatibility documentation.
- Horovod is supported in the current pytorch_p310 conda environment on the DLAMI. However, Horovod will be removed from the conda environment for the upcoming PyTorch v2.1 release. Customers can install the Horovod libraries on their DLAMIs for their distributed training jobs by following the guidelines.
Removed
- The fastai package is temporarily omitted because its inductor backend support is still pending. Once fastai adds support, we will add it back in an upcoming release.
Version 70.0
Release Date: 2023-03-24
Update
- Updated Nvidia Driver version from 515.65.01 to 525.85.12
- Added support for G5 instance type.
- TensorFlow version updated from 2.11.0 to 2.12.0 in the tensorflow2_p310 environment.
- Removed `tensorflow-serving-api` package since there is no v2.12.0 available at this time.
- Horovod is supported in the current `tensorflow2_p310` conda environment on the DLAMI. However, Horovod will be removed from the conda environment for the upcoming TensorFlow v2.13 release. Customers can install the Horovod libraries on their DLAMIs for their distributed training jobs by following the guidelines.
Version 69.1
Release Date: 2022-12-28
Update
- PyTorch version updated from 1.13.0 to 1.13.1 in the pytorch_p39 environment.
Version 69.0
Release Date: 2022-12-05
Update
- TensorFlow version updated from 2.10.0 to 2.11.0 in the tensorflow2_p310 environment.
Version 68.0
Release Date: 2022-11-09
Added
- Added cuda-11.7 at /usr/local/cuda-11.7/
Update
- PyTorch from 1.12 to 1.13 in pytorch_p39 environment.
- Updated NVIDIA Driver from 510.47.03 to 515.65.01
Version 67.0
Release Date: 2022-11-01
Update
- Updated base environment from Python 3.9 to Python 3.10
- Updated python3 environment from Python 3.9 to Python 3.10
- Updated tensorflow_p39 environment to tensorflow_p310, from Python 3.9 to Python 3.10
Version 66.4
Release Date: 2022-09-26
Update
- PyTorch version updated from 1.12.0 to 1.12.1 in the pytorch_p39 environment.
Version 66.2
Release Date: 2022-09-19
Update
- TensorFlow version updated from 2.9.2 to 2.10.0 in the tensorflow2_p39 environment.
Version 66.0
Release Date: 2022-09-15
Update
- TensorFlow patch version updated from 2.9.1 to 2.9.2 in the tensorflow2_p39 environment.
Version 65.0
Release Date: 2022-07-20
Update
- Updated TensorFlow to 2.9 in the tensorflow2_p39 environment and PyTorch to 1.12 in the pytorch_p39 environment
- Updated TensorFlow-Neuron 1.15 to TensorFlow-Neuron 2.8 in the aws_neuron_tensorflow2_p37 environment
- Removed TFS from the tensorflow2_p39 environment
Version 64.0
Release Date: 2022-07-15
Update
- Updated TensorFlow to 2.8
- Updated conda env tensorflow2_p38 to tensorflow2_p39 to update TensorFlow version from 2.7 with python 3.8 to 2.8 with python 3.9
- TFS will be deprecated in an upcoming release, as TFS recommends using Docker: https://github.com/tensorflow/serving#set-up
- EFA version updated to 1.16.0
Version 63.0
Release Date: 2022-07-01
Update
- Update Tensorflow Neuron Conda framework
- Updated conda env aws_neuron_tensorflow_p36 (Python 3.6) to aws_neuron_tensorflow_p37 (Python 3.7).
- Update Mxnet Neuron Conda framework
- Updated conda env aws_neuron_mxnet_p36 (Python 3.6) to aws_neuron_mxnet_p37 (Python 3.7).
- Update PyTorch Neuron Conda framework
- Updated conda env aws_neuron_pytorch_p36 (Python 3.6) to aws_neuron_pytorch_p37 (Python 3.7).
Version 62.1
Release Date: 2022-06-22
Update
- Updated opencv-python to >=4.6.0 in all conda environments
Version 62.0
Release Date: 2022-06-09
Update
- Updated Apache MXNet to 1.9
- Updated conda env mxnet_p37 to mxnet_p38 to update Apache MXNet version from 1.8 with python 3.7 to 1.9 with python 3.8.
- Updated PyTorch to 1.11
- Updated conda env pytorch_p38 to pytorch_p39 to update PyTorch version from 1.10 with python 3.8 to 1.11 with python 3.9
- Updated conda env python3 from python version 3.8 to 3.9
Version 61.2
Release Date: 2022-05-20
Fixed
- For conda environment tensorflow2_p38, fixed 'GLIBC_2.27' not found error in tensorflow2_model_server
Version 61.1
Release Date: 2022-04-28
Added
- Added Amazon CloudWatch Agent. For more details, please refer to https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Install-CloudWatch-Agent.html
- Added three systemd services that use predefined JSON files available at /opt/aws/amazon-cloudwatch-agent/etc/ to configure GPU metrics, running as the Linux user cwagent
- dlami-cloudwatch-agent@minimal
- Commands to enable GPU metrics:
- sudo systemctl enable dlami-cloudwatch-agent@minimal
- sudo systemctl start dlami-cloudwatch-agent@minimal
- It creates below metrics:
- "utilization_gpu",
- "utilization_memory"
- dlami-cloudwatch-agent@partial
- Commands to enable GPU metrics:
- sudo systemctl enable dlami-cloudwatch-agent@partial
- sudo systemctl start dlami-cloudwatch-agent@partial
- It creates below metrics:
- "utilization_gpu",
- "utilization_memory",
- "memory_total",
- "memory_used",
- "memory_free"
- dlami-cloudwatch-agent@all
- Commands to enable GPU metrics:
- sudo systemctl enable dlami-cloudwatch-agent@all
- sudo systemctl start dlami-cloudwatch-agent@all
- It creates all available GPU metrics
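The three services above follow the same enable and start pattern, so a small wrapper can generate the commands for a chosen profile. This is only a sketch: gpu_metrics_commands is a hypothetical function name, and it prints the commands rather than running them so they can be reviewed first.

```shell
# Hypothetical helper: print the systemctl commands for one of the three
# GPU-metrics profiles listed above (minimal, partial, or all).
gpu_metrics_commands() {
    case "$1" in
        minimal|partial|all)
            printf 'sudo systemctl enable dlami-cloudwatch-agent@%s\n' "$1"
            printf 'sudo systemctl start dlami-cloudwatch-agent@%s\n' "$1"
            ;;
        *)
            # Reject anything that is not one of the documented profiles.
            echo "unknown profile: $1 (expected minimal, partial, or all)" >&2
            return 1
            ;;
    esac
}

# Example: review the commands for the partial profile.
#   gpu_metrics_commands partial
```

Piping the output to sh, as in gpu_metrics_commands minimal | sh, would run both commands on the instance.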
Version 60.0
Release Date: 2022-03-18
Added
- Updated kernel version from 4.14 to 5.10. Current version is 5.10.102-99.473.amzn2.x86_64
Version 59.0
Release Date: 2022-03-04
Added
- Updated Nvidia Driver to 510.47.03
Version 58.0
Release Date: 2022-02-17
Updated
- Version-locked aws-neuron-dkms and tensorflow-model-server-neuron, because they would otherwise be updated to newer versions that are not supported by the Neuron packages present in the AMI
- Commands to unlock the packages so they can be updated to the latest versions:
sudo yum versionlock delete aws-neuron-dkms
sudo yum versionlock delete tensorflow-model-server-neuron
- Updated multi-model-server==1.1.8 in conda envs mxnet_p37, aws_neuron_mxnet_p36, amazonei_mxnet_p36
Version 57.0
Release Date: 2022-01-13
Added
- Added CUDA11.2 with the following components:
- cuDNN v8.1.1.33
- NCCL 2.8.4
- CUDA 11.2.2
Updated
- Updated symlink pip to pip3
- Updated the conda environment tensorflow2_p37 to tensorflow2_p38 which has the following components:
- TensorFlow 2.7.0
- Python 3.8
- Cuda 11.2
- Updated the conda environment pytorch_p37 to pytorch_p38 which has the following components:
- PyTorch 1.10.0
- Python 3.8
- Cuda 11.1
- Updated the conda environment mxnet_p36 to mxnet_p37 which has the following components:
- MXNet 1.8.0
- Python 3.7
- Cuda 11.0
- Updated the conda environment python3 with the following components:
- Python 3.8
- CUDA 11.0
Deprecations
- Deprecated support for the P2 instance type
- Deprecated and removed the tensorflow_p37 conda environment which has TF1.15.5
- Deprecated and removed the mxnet_latest_p37 conda environment which has MXNet 1.8
- Deprecated and removed the tensorflow2_latest_p37 conda environment which has TensorFlow 2.4
- Deprecated and removed the pytorch_latest_p37 conda environment which has PyTorch 1.8
- Deprecated and removed the amazonei_tensorflow_p36 conda environment which has TF1.15.5
- Deprecated python2.7 and removed related python2.7 packages such as "python-dev", "python-pip", and "python-tk"
Version 56.0
Release Date: 2021-12-27
Updated
- Updated multi-model-server package to version 1.1.7. For more information, please refer to https://github.com/awslabs/multi-model-server/releases/tag/v1.1.7
- Removed org.apache.ant_1.9.2.v201404171502\lib\ant-apache-log4j.jar from cuda versions as it is not being used and there is no risk to users who have the Log4j files. For more information, please refer to https://nvidia.custhelp.com/app/answers/detail/a_id/5294
Version 55.0
Release Date: 2021-12-01
Updated
- Updated Sagemaker PySDK to 2.70.0
Version 54.0
Release Date: 2021-11-24
Updated
- Updated EFA to 1.14.1
Version 53.0
Release Date: 2021-11-12
Updated
- Updated Neuron packages from aws-neuron-dkms=1.5.*, aws-neuron-runtime-base=1.5.*, aws-neuron-tools=1.6.* to aws-neuron-dkms=2.2.*, aws-neuron-runtime-base=1.6.*, aws-neuron-tools=2.0.*.
- Removed Neuron package aws-neuron-runtime=1.5.* because Neuron no longer has a runtime running as a daemon; the runtime is now integrated with the framework as a library.
- Updated MXNet to 1.8 in aws_neuron_mxnet_p36 conda environment.
Version 52.0
Release Date: 2021-10-21
Added
- Security scan reports are available at /opt/aws/dlami/info/. These reports contain scan results in JSON format for all included Conda environments.
Security
- Updated werkzeug to 2.0.2 in all the conda environments
Version 51.0
Release Date: 2021-10-08
Changed
- For every instance launched using the DLAMI, the tag "aws-dlami-autogenerated-tag-do-not-delete" is added, which allows AWS to collect instance type, instance ID, DLAMI type, and OS information. No information on the commands used within the DLAMI is collected or retained. No other information about the DLAMI is collected or retained. To opt out of usage tracking for your DLAMI, add a tag to your Amazon EC2 instance during launch. The tag should use the key OPT_OUT_TRACKING with the associated value set to true. For more information, see Tag your Amazon EC2 resources.
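One way to apply the opt-out tag at launch is through the AWS CLI --tag-specifications option. The sketch below only builds the tag specification string. opt_out_tag_spec is a hypothetical helper, and the AMI ID variable and instance type in the commented example are placeholders.

```shell
# Hypothetical helper: print an AWS CLI shorthand tag specification that
# sets OPT_OUT_TRACKING=true on the instance being launched.
opt_out_tag_spec() {
    printf '%s\n' 'ResourceType=instance,Tags=[{Key=OPT_OUT_TRACKING,Value=true}]'
}

# Example launch (AMI_ID and the instance type are placeholders):
#   aws ec2 run-instances --image-id "$AMI_ID" --instance-type g5.xlarge \
#       --tag-specifications "$(opt_out_tag_spec)"
```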
Security
- Updated docker version to docker-20.10.7-3
Version 50.0
Release Date: 2021-09-03
Changed
- Updated DLAMI conda environment pytorch_p36 from pytorch 1.4.0 to 1.7.1
- Updated jupyterlab>=3.1.7 and notebook>=6.4.1 versions in all conda environments
- Updated nvidia driver and fabric manager version to 450.142.00.
Security
- Updated TensorFlow to 2.4.3 in conda environment tensorflow2_latest_p37.
- Updated conda env from tensorflow2_p36 to tensorflow2_p37 and updated TensorFlow to 2.3.4 in conda environment tensorflow2_p37.
Version 49.0
Release Date: 2021-07-13
Changed
- Added utility packages "imageio", "plotly", "smdebug", "shap", "opencv-python", "bokeh", and "seaborn" to the python3 conda environments.
Version 48.0
Release Date: 2021-06-24
Changed
- Switched to pip instead of conda to install Neuron packages in conda environments `aws_neuron_pytorch_p36`, `aws_neuron_mxnet_p36`, and `aws_neuron_tensorflow_p36`
Version 47.0
Release Date: 2021-06-10
Changed
- Updated awscli version to 1.19.89
- Updated Horovod version from 0.21.0 to 0.22.0 in conda environment tensorflow2_latest_p37
Security
- Updated amazonei_tensorflow_p36 environment with tensorflow-1.15.5.
Version 46.0
Release Date: 2021-05-27
Security
- Removed vulnerable CUDA-10.0 components (Visual Profiler, Nsight EE, and JRE) from the CUDA-10.0 installation (/usr/local/cuda-10.0).
Version 45.0
Release Date: 2021-05-25
Changed
- Add python 3.7 environment for EIA PyTorch 1.5.1 in Deep Learning AMI (Amazon Linux 2).
- Add python 3.6 environment for EIA Tensorflow 1.5 and 2.3 in Deep Learning AMI (Amazon Linux 2).
- Upgraded runc to latest
Version 44.0
Release Date: 2021-04-26
Changed
- Updated Nvidia Tesla driver and Fabric Manager version to 450.119.03.
Version 43.1
Release Date: 2021-04-21
Fixed
- Fixed an issue that slowed down the instance launch speed.
Version 43.0
Release Date: 2021-03-24
Developer Note
The base conda environment is no longer activated by default, and the default python, python3 will now point to /usr/bin/python and /usr/bin/python3 respectively. To automatically activate base conda environment upon login, run: conda config --set auto_activate_base true.
Also, please note that ~/.dlami has been removed to fix a problem where shell environment variables such as PATH and LD_LIBRARY_PATH contained many repeated entries, which sometimes led to issues. If you relied on ~/.dlami to set up your PATH or LD_LIBRARY_PATH environment variables, it is highly recommended to configure them explicitly.
For more details, please check the release notes below.
Changed
- Upgraded jupyterlab to version 3.0.8 in all python3 environments.
Fixed
- The old installation of OpenMPI in /usr/local/mpi caused /opt/amazon/openmpi/bin/mpirun to be linked incorrectly. To fix the link issue, we removed the /usr/local/mpi installation; the OpenMPI installation in /opt/amazon/openmpi remains available.
- Added two symlinks (activate, deactivate) to $HOME/anaconda3/condabin to fix a conda issue that removes $HOME/anaconda3/bin from PATH.
- Removed duplicated and non-existent shell environment definitions that had been polluting shell environment variables such as PATH and LD_LIBRARY_PATH. As a result, ~/.dlami and /etc/profile.d/var.sh have been removed, and /etc/profile.d/dlami.sh has been added. Due to this change, the base conda environment is no longer activated by default, and the default python and python3 now point to /usr/bin/python and /usr/bin/python3 respectively. To automatically activate the base conda environment upon login, run: conda config --set auto_activate_base true
Security
- Updated utility packages Flask-Cors, lxml, urllib3, PyYAML, Jinja2, and aiohttp to newer versions to address vulnerabilities associated with these packages.
- Updated package cryptography to address CVE-2020-36242
Version 42.0
Release Date: 2021-03-08
Added
- Added TensorRT CUDA 11.0 installation
Changed
- Upgraded the pytorch_latest_p37 environment from PyTorch 1.7.1 to PyTorch 1.8.0. See the PyTorch release notes for details.
Security
- Upgraded tensorflow2_p36 to use tensorflow 2.1.3 to address security vulnerabilities
- Upgraded Tensorflow in tensorflow_p37 conda environment to version 1.15.5 to address security vulnerabilities
- Updated pyyaml to 5.4.1 to mitigate CVE-2020-14343
Version 41.0
Release Date: 2021-02-27
Security
- Patched system python2 and python3 for CVE-2021-3177
- Patched python in all conda environments to address CVE-2021-3177. This change also resulted in dependency upgrades.
Version 40.0
Release Date: 2021-02-09
Added
- Add python 3.6 environment amazonei_mxnet_p36 for EIA MXNet 1.7.0
Changed
- Upgraded tensorflow2_latest_p37 to use tensorflow 2.4.1 binaries with cuda11.0 and cuDNN8
- Added health check tools ei and health_check to PATH for pytorch Elastic Inference.
- Upgraded jupyterlab to major version 2
Fixed
- Improved conda environment isolation by automatically restoring environment variables such as PATH, LD_LIBRARY_PATH, CUDA_PATH, and CUDA_HOME during deactivation. Also, activating a conda environment now automatically links to the correct CUDA bin path, so users no longer need to set up symlinks or environment variables manually. Finally, we fixed a problem during conda environment activation in which certain environment variables were set twice.
Version 39.0
Release Date: 2021-01-19
Changed
- Upgraded pytorch_latest_p37 conda environment to use PyTorch 1.7.1, torchvision 0.8.2 with CUDA 11.0
- Updated cuDNN version to v8.0.5.39 in CUDA11.0 and CUDA11.1.
- Upgraded tensorflow2_latest_p37 to use tensorflow 2.4 binary with cuda11.0 and cuDNN8
Fixed
- Fixed an issue where amazon-linux-extras cannot be invoked correctly.