AWS Deep Learning Containers with TensorFlow 2.3.0


Release Date: August 07, 2020
Created On: August 08, 2020
Last Updated: September 21, 2020


The AWS Deep Learning Containers are available today with TensorFlow 2.3.0 support. You can launch the new versions of the Deep Learning Containers on Amazon SageMaker, Amazon Elastic Kubernetes Service (Amazon EKS), self-managed Kubernetes on Amazon EC2, and Amazon Elastic Container Service (Amazon ECS). For a complete list of frameworks and versions supported by the AWS Deep Learning Containers, see the release notes below.

The AWS Deep Learning Containers for TensorFlow include containers for Training and Inference for CPU and GPU, optimized for performance and scale on AWS. These Docker images have been tested with Amazon SageMaker, EC2, ECS, and EKS. They include stable versions of NVIDIA CUDA, cuDNN, Intel MKL, Horovod, and other required software components to provide a seamless user experience for deep learning workloads. All software components in these images are scanned for security vulnerabilities and updated or patched in accordance with AWS Security best practices.

More details can be found in AWS Marketplace, and a list of available containers can be found in our documentation. Get started quickly with the AWS Deep Learning Containers using the getting-started guides and the beginner- to advanced-level tutorials in our developer guide. You can also subscribe to our discussion forum to get launch announcements and post your questions.

Release Notes

Security Advisory

  1. AWS recommends that customers monitor critical security updates in the AWS Security Bulletin

Highlights of the Release

  • Upgraded TensorFlow to version 2.3.0
  • Upgraded TensorFlow-Serving to version 2.3.0
  • Upgraded SageMaker Debugger to version 0.9.3 for Py3 training images

For the latest updates, refer to the aws/deep-learning-containers GitHub repo.

Prepackaged Deep Learning Frameworks Included

  • TensorFlow: TensorFlow is an open source software library for numerical computation using data flow graphs.
    • Branch/tag used: v2.3.0
    • Supported with CUDA 10.2 and Intel MKL-DNN v0.21
  • Horovod: Horovod is a distributed training framework. The goal of Horovod is to make it easy to take a single-GPU deep learning program and train it on multiple GPUs. Horovod nodes communicate directly with each other instead of going through a centralized node, and they average gradients using the ring-allreduce algorithm.
  • SageMaker Python SDK: The SDK is an open source library for training and deploying machine learning models on Amazon SageMaker. With the SDK, you can train and deploy models using the popular deep learning frameworks Apache MXNet and TensorFlow. You can also train and deploy models with Amazon algorithms, which are scalable implementations of core machine learning algorithms that are optimized for SageMaker and GPU training. If you have your own algorithms built into SageMaker-compatible Docker containers, you can train and host models using these as well.
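
The ring-allreduce averaging mentioned in the Horovod entry can be illustrated with a small single-process simulation. This is a minimal sketch for intuition only, not Horovod's implementation (Horovod performs the exchange across processes using its communication backends). The function name ring_allreduce_average and the requirement that the gradient length be divisible by the number of workers are assumptions made for this example.

```python
from typing import List


def ring_allreduce_average(grads: List[List[float]]) -> List[List[float]]:
    """Simulate ring-allreduce averaging over len(grads) workers.

    grads[i] is worker i's local gradient. Every worker ends up with the
    element-wise average. Hypothetical helper for illustration only.
    """
    n = len(grads)
    size = len(grads[0])
    if any(len(g) != size for g in grads):
        raise ValueError("all workers must hold gradients of equal length")
    if size % n != 0:
        raise ValueError("this sketch assumes length divisible by worker count")
    chunk = size // n
    bufs = [list(g) for g in grads]  # each worker's working buffer

    def span(c: int) -> slice:
        return slice(c * chunk, (c + 1) * chunk)

    # Phase 1 (scatter-reduce): in step s, worker i sends chunk (i - s) mod n
    # to its right neighbor, which adds it to its own copy of that chunk.
    for step in range(n - 1):
        outgoing = [bufs[i][span((i - step) % n)] for i in range(n)]
        for i in range(n):
            dst = (i + 1) % n
            s = span((i - step) % n)
            bufs[dst][s] = [a + b for a, b in zip(bufs[dst][s], outgoing[i])]

    # Phase 2 (allgather): worker i now owns the full sum of chunk
    # (i + 1) mod n; completed chunks are passed around the ring.
    for step in range(n - 1):
        outgoing = [bufs[i][span((i + 1 - step) % n)] for i in range(n)]
        for i in range(n):
            dst = (i + 1) % n
            bufs[dst][span((i + 1 - step) % n)] = outgoing[i]

    # Divide the summed gradients by the worker count to get the average.
    return [[v / n for v in b] for b in bufs]
```

Each worker only ever exchanges data with its ring neighbor, which is why no centralized node is needed.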

Bill of Materials: List of all components

  • CPU: Training Container
    • awscli==1.18.114
    • h5py==2.10.0
    • horovod==0.19.5
    • numpy==1.18.5
    • pandas==1.1.0
    • requests==2.24.0
    • sagemaker==1.72.0
    • sagemaker-experiments==0.1.24
    • sagemaker-tensorflow==2.3.0.1.0.0
    • sagemaker-tensorflow-training==20.1.0
    • scikit-learn==0.23.0
    • scipy==1.4.1
    • smdebug==0.9.2
  • CPU: Inference Container
    • awscli==1.18.121
    • requests==2.22.0
    • tensorflow-serving-api==2.3.0
  • GPU: Training Container
    • awscli==1.18.114
    • h5py==2.10.0
    • horovod==0.19.5
    • numpy==1.18.5
    • pandas==1.1.0
    • requests==2.24.0
    • sagemaker==1.72.0
    • sagemaker-experiments==0.1.24
    • sagemaker-tensorflow==2.3.0.1.0.0
    • sagemaker-tensorflow-training==20.1.0
    • scikit-learn==0.23.0
    • scipy==1.4.1
    • smdebug==0.9.2
    • tensorflow-gpu==2.3.0
  • GPU: Inference Container
    • awscli==1.18.121
    • requests==2.22.0
    • tensorflow-serving-api-gpu==2.3.0
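
To confirm that a running container matches the pins listed above, the versions can be compared programmatically. The following is a minimal sketch, not an AWS-provided tool: the helper names parse_pins and find_mismatches are hypothetical, and the expected list shown is copied from the CPU Inference Container entry.

```python
from typing import Dict, Iterable, Optional, Tuple


def parse_pins(lines: Iterable[str]) -> Dict[str, str]:
    """Turn 'name==version' lines into a {name: version} mapping.

    Lines without '==' (blank lines, comments, editable installs) are skipped.
    Names are lower-cased and '_' is treated the same as '-'.
    """
    pins: Dict[str, str] = {}
    for line in lines:
        name, sep, version = line.strip().partition("==")
        if not sep or name.startswith("#"):
            continue
        pins[name.strip().lower().replace("_", "-")] = version.strip()
    return pins


def find_mismatches(
    expected: Dict[str, str], installed: Dict[str, str]
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {name: (expected_version, installed_version_or_None)}."""
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for name, want in expected.items():
        have = installed.get(name)
        if have != want:
            mismatches[name] = (want, have)
    return mismatches


# Pins copied from the "CPU: Inference Container" list.
CPU_INFERENCE_PINS = [
    "awscli==1.18.121",
    "requests==2.22.0",
    "tensorflow-serving-api==2.3.0",
]
```

Inside a container, the installed mapping could be built by passing the lines printed by `python -m pip freeze` to parse_pins; an empty result from find_mismatches then means every listed package is present at the listed version.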

Python Support

Python 3.7 is supported in the containers for the installed deep learning frameworks.

CPU Instance Type Support

The containers support CPU instance types. TensorFlow is built with support for the Intel MKL2019 DNN library.

GPU Instance Type Support

The containers support GPU instance types and contain the following software components for GPU support:

  • CUDA 10.2 / cuDNN 7.6.5.32-1+cuda10.2 / NCCL 2.7.6-1+cuda10.2

AWS Regions Support

Region                    Code
US East (Ohio)            us-east-2
US East (N. Virginia)     us-east-1
US West (Oregon)          us-west-2
US West (N. California)   us-west-1
Asia Pacific (Mumbai)     ap-south-1
Asia Pacific (Seoul)      ap-northeast-2
Asia Pacific (Singapore)  ap-southeast-1
Asia Pacific (Sydney)     ap-southeast-2
Asia Pacific (Tokyo)      ap-northeast-1
Canada (Central)          ca-central-1
EU (Frankfurt)            eu-central-1
EU (Ireland)              eu-west-1
EU (London)               eu-west-2
EU (Paris)                eu-west-3
SA (São Paulo)            sa-east-1
EU (Stockholm)            eu-north-1
AP East (Hong Kong)       ap-east-1
ME South (Bahrain)        me-south-1
China (Beijing)           cn-north-1
China (Ningxia)           cn-northwest-1
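
Deployment scripts may want to check a target Region against this list before pulling an image. The following is a minimal sketch under that assumption; the names SUPPORTED_REGIONS and is_supported_region are hypothetical, and the set only mirrors the Region codes listed above for this release.

```python
# Region codes copied from the list above (this release only).
SUPPORTED_REGIONS = frozenset({
    "us-east-2", "us-east-1", "us-west-2", "us-west-1",
    "ap-south-1", "ap-northeast-2", "ap-southeast-1",
    "ap-southeast-2", "ap-northeast-1", "ca-central-1",
    "eu-central-1", "eu-west-1", "eu-west-2", "eu-west-3",
    "sa-east-1", "eu-north-1", "ap-east-1", "me-south-1",
    "cn-north-1", "cn-northwest-1",
})


def is_supported_region(region_code: str) -> bool:
    """Return True if the Region code appears in the list for this release."""
    return region_code.strip().lower() in SUPPORTED_REGIONS
```

For example, is_supported_region("eu-west-3") is true, while a code that is not in the list, such as "af-south-1", is rejected.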

Build and Test

  • Built on: c5.18xlarge
  • Tested on: c4.8xlarge, c5.18xlarge, g3.16xlarge, m4.16xlarge, p2.16xlarge, p3.16xlarge, p3dn.24xlarge
  • Tested with MNIST and ResNet50/ImageNet datasets on EC2, ECS AMI (Amazon Linux AMI 2.0.20190614), EKS AMI (1.11-v20190614), and Amazon SageMaker.

Known Issues

  • No known issues

End of Life Notices

The Python open source community officially ended support for Python 2 on January 1, 2020. The TensorFlow community has also announced that the TensorFlow 2.1 release will be the last one supporting Python 2. DLC releases with the next versions of the TensorFlow framework will not contain the Python 2 containers. Updates to the Python 2 DLC will be provided on previously published DLC versions only if there are security fixes published by the open source community for those versions. Previous releases of the TensorFlow DLC that contain Python 2 will continue to be available.