AWS Deep Learning Containers v7.0 for TensorFlow
Release Date: May 07, 2020
Created On: May 06, 2020
Last Updated: May 06, 2020
The AWS Deep Learning Containers for TensorFlow include containers for Training and Inference for CPU and GPU, optimized for performance and scale on AWS. These Docker images have been tested with Amazon SageMaker, EC2, ECS, and EKS and provide stable versions of NVIDIA CUDA, cuDNN, Intel MKL, Horovod and other required software components to provide a seamless user experience for deep learning workloads. All software components in these images are scanned for security vulnerabilities and updated or patched in accordance with AWS Security best practices.
Detailed Release Note Changes
Security Advisory
- AWS recommends that customers monitor critical security updates in the AWS Security Bulletin
Highlights of the Release
- Initial release of the TensorFlow Deep Learning Containers for TensorFlow 1.15 with Python 3.7 support
- Updated sagemaker-tensorflow-training for TensorFlow 1.15.2 Training to v10.1.0
Prepackaged Deep Learning Frameworks Included
- TensorFlow: TensorFlow is an open source software library for numerical computation using data flow graphs.
- branch/tag used : v1.15.2
- Justification : Stable and well tested
- Supported with CUDA 10.0 and Intel MKL-DNN v0.20-rc
- Horovod: Horovod is a distributed training framework. The goal of Horovod is to make it easy to take a single-GPU deep learning program and train it on multiple GPUs. Horovod nodes communicate directly with each other instead of going through a centralized node, and average gradients using the ring-allreduce algorithm.
- branch/tag used : v0.18.2
- Justification : Stable and well tested
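The ring-allreduce averaging mentioned in the Horovod entry can be illustrated with a small single-process simulation. This is only a sketch of the general technique, not Horovod's implementation: the function name ring_allreduce_average and the use of plain Python lists in place of GPU tensors are assumptions made for illustration.

```python
from typing import List


def ring_allreduce_average(worker_grads: List[List[float]]) -> List[List[float]]:
    """Simulate ring-allreduce so every worker ends with the averaged gradient.

    Hypothetical helper for illustration only; real frameworks run the same
    two phases over the network using MPI or NCCL.
    """
    n = len(worker_grads)
    length = len(worker_grads[0])
    if any(len(g) != length for g in worker_grads):
        raise ValueError("all workers must hold gradients of the same length")

    # Split the gradient index range into n contiguous chunks, one per worker.
    bounds = [(i * length) // n for i in range(n + 1)]
    chunks = [range(bounds[i], bounds[i + 1]) for i in range(n)]
    data = [list(g) for g in worker_grads]

    # Phase 1 (scatter-reduce): in each step, worker r sends one chunk to its
    # right-hand neighbour, which adds it to its own copy of that chunk.
    for step in range(n - 1):
        # Snapshot the payloads first so all sends in a step are simultaneous.
        sends = []
        for r in range(n):
            c = (r - step) % n
            sends.append((r, c, [data[r][i] for i in chunks[c]]))
        for r, c, payload in sends:
            dst = (r + 1) % n
            for i, value in zip(chunks[c], payload):
                data[dst][i] += value
    # Worker r now holds the complete sum for chunk (r + 1) % n.

    # Phase 2 (allgather): completed chunks travel around the ring, and each
    # receiver overwrites its partial copy with the finished sum.
    for step in range(n - 1):
        sends = []
        for r in range(n):
            c = (r + 1 - step) % n
            sends.append((r, c, [data[r][i] for i in chunks[c]]))
        for r, c, payload in sends:
            dst = (r + 1) % n
            for i, value in zip(chunks[c], payload):
                data[dst][i] = value

    # Dividing the summed gradients by the worker count gives the average.
    return [[value / n for value in worker] for worker in data]
```

Each worker only ever talks to its neighbour in the ring, which is why no centralized node is needed to collect and redistribute the gradients.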
Bill of Materials: List of all components
- CPU: Training Container
- sagemaker-tensorflow-training==10.1.0
- sagemaker-tensorflow==1.15.2.1.0.0
- sagemaker-experiments==0.1.7
- numpy==1.17.4
- OpenMPI=4.0.1
- Horovod=0.18.2
- scipy==1.2.2
- scikit-learn==0.20.3
- pandas==0.24.2
- Pillow==7.0.0
- h5py==2.10.0
- requests==2.22.0
- awscli==1.18.52
- smdebug==0.7.2
- GPU: Training Container
- sagemaker-tensorflow-training==10.1.0
- sagemaker-tensorflow==1.15.2.1.0.0
- cuda-command-line-tools-10-0
- cuda-cublas-10-0
- cuda-cufft-10-0
- cuda-curand-10-0
- cuda-cusolver-10-0
- cuda-cusparse-10-0
- libcudnn7=7.5.1.10-1+cuda10.0
- libnccl2=2.4.7-1+cuda10.0
- libnccl-dev=2.4.7-1+cuda10.0
- OpenMPI=4.0.1
- Horovod=0.18.2
- numpy==1.17.4
- scipy==1.2.2
- scikit-learn==0.20.3
- pandas==0.24.2
- Pillow==7.0.0
- h5py==2.10.0
- requests==2.22.0
- awscli==1.18.52
- smdebug==0.7.2
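To check a running container against the pinned versions listed above, one possible approach is to parse the pins and compare them with the installed packages. The sketch below is an assumption-laden illustration: parse_pins and find_mismatches are hypothetical helper names, entries without a version (such as cuda-cublas-10-0) are skipped, and both the "==" and "=" forms used in the list are accepted.

```python
import re
from typing import Dict, Optional, Tuple

# Matches "name==version" or "name=version", with an optional leading bullet.
_PIN_RE = re.compile(r"^\s*-?\s*([A-Za-z0-9_.\-]+?)\s*={1,2}\s*(\S+)\s*$")


def _normalize(name: str) -> str:
    """Lowercase the name and collapse runs of '-', '_' and '.' into '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_pins(text: str) -> Dict[str, str]:
    """Return {normalized package name: pinned version} for each pinned line."""
    pins = {}
    for line in text.splitlines():
        match = _PIN_RE.match(line)
        if match:
            pins[_normalize(match.group(1))] = match.group(2)
    return pins


def find_mismatches(
    expected: Dict[str, str], installed: Dict[str, str]
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each differing package to (expected version, installed version).

    The installed version is None when the package is missing entirely.
    """
    mismatches = {}
    for name, version in expected.items():
        found = installed.get(name)
        if found != version:
            mismatches[name] = (version, found)
    return mismatches
```

Because pip freeze prints lines in the same name==version form, its output can be passed to parse_pins to build the installed mapping. System packages such as OpenMPI or libcudnn7 are not reported by pip and would need a separate check.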
Python Support
Python 3.7 is supported in the containers for all of the installed deep learning frameworks.
End of Life Notices
The Python open source community officially ended support for Python 2 on January 1, 2020. The TensorFlow community has also announced that the TensorFlow 1.15 and TensorFlow 2.1 releases will be the last ones supporting Python 2. DLC releases with the next versions of the TensorFlow frameworks will not contain the Python 2 containers. Updates to the Python 2 DLC will be provided on previously published DLC versions only if there are security fixes published by the open source community for those versions. Previous releases of the TensorFlow DLC that contain Python 2 will continue to be available.
CPU Instance Type Support
The containers support CPU instance types. TensorFlow is built with support for the Intel MKL2019 DNN library.
GPU Instance Type Support
The containers support GPU instance types and contain the following software components for GPU support.
- CUDA 10.0 / cuDNN 7.5.1.10-1+cuda10.0 / NCCL 2.4.7-1+cuda10.0
AWS Regions Support
Available in the following regions:
Region | Code |
US East (Ohio) | us-east-2 |
US East (N. Virginia) | us-east-1 |
US West (Oregon) | us-west-2 |
US West (N. California) | us-west-1 |
Asia Pacific (Mumbai) | ap-south-1 |
Asia Pacific (Seoul) | ap-northeast-2 |
Asia Pacific (Singapore) | ap-southeast-1 |
Asia Pacific (Sydney) | ap-southeast-2 |
Asia Pacific (Tokyo) | ap-northeast-1 |
Canada (Central) | ca-central-1 |
EU (Frankfurt) | eu-central-1 |
EU (Ireland) | eu-west-1 |
EU (London) | eu-west-2 |
EU (Paris) | eu-west-3 |
SA (São Paulo) | sa-east-1 |
EU (Stockholm) | eu-north-1 |
AP East (Hong Kong) | ap-east-1 |
ME South (Bahrain) | me-south-1 |
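A deployment script may want to fail early when it targets a region outside this list. The following is a minimal sketch of such a guard. The region codes are copied from the table above, while the names SUPPORTED_REGIONS and is_supported_region are hypothetical.

```python
# Region codes listed for this release (copied from the table above).
SUPPORTED_REGIONS = frozenset({
    "us-east-2", "us-east-1", "us-west-2", "us-west-1",
    "ap-south-1", "ap-northeast-2", "ap-southeast-1", "ap-southeast-2",
    "ap-northeast-1", "ca-central-1", "eu-central-1", "eu-west-1",
    "eu-west-2", "eu-west-3", "sa-east-1", "eu-north-1",
    "ap-east-1", "me-south-1",
})


def is_supported_region(region_code: str) -> bool:
    """Return True if the region code appears in the release's region list.

    Surrounding whitespace and letter case are ignored.
    """
    return region_code.strip().lower() in SUPPORTED_REGIONS
```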
Build and Test
- Built on: c5.18xlarge
- Tested on: c4.8xlarge, c5.18xlarge, g3.16xlarge, m4.16xlarge, p2.16xlarge, p3.16xlarge, p3dn.24xlarge
- Tested with MNIST and ResNet50/ImageNet datasets on EC2, ECS AMI (Amazon Linux AMI 2.0.20190614), EKS AMI (1.11-v20190614), and Amazon SageMaker.
Known Issues
- Issue: Keras does not support Python 3.7. Use the Keras support built into TensorFlow (tf.keras) instead of the standalone keras package.
- Issue: The SageMaker Python SDK is not available for Python 3.7 and is not included in this version of the container. We will update the container with a Python 3.7 compatible SageMaker Python SDK in a future release.
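For training scripts that currently import the standalone keras package, the switch to tf.keras is mostly a change of import lines. The sketch below shows one way to rewrite the two most common import forms in a script's source text. The helper name use_tf_keras is hypothetical, other forms such as "import keras.backend as K" are left untouched, and the result should be reviewed by hand because tf.keras and standalone keras are not guaranteed to behave identically.

```python
import re

# "from keras import X" and "from keras.sub.module import X"
_FROM_RE = re.compile(
    r"^([ \t]*)from[ \t]+keras(\.[\w.]+)?[ \t]+import[ \t]+", re.MULTILINE
)
# A bare "import keras" on its own line
_IMPORT_RE = re.compile(r"^([ \t]*)import[ \t]+keras[ \t]*$", re.MULTILINE)


def use_tf_keras(source: str) -> str:
    """Rewrite standalone keras imports to their tensorflow.keras equivalents."""
    source = _FROM_RE.sub(
        lambda m: f"{m.group(1)}from tensorflow.keras{m.group(2) or ''} import ",
        source,
    )
    # Binding the name "keras" keeps later references such as keras.Model valid.
    source = _IMPORT_RE.sub(r"\1from tensorflow import keras", source)
    return source
```

For example, "from keras.layers import Dense" becomes "from tensorflow.keras.layers import Dense", while unrelated imports are returned unchanged.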