Optimized TensorFlow 1.8 Now Available in the AWS Deep Learning AMIs to Accelerate Training on Amazon EC2 C5 and P3 Instances

Posted on: May 15, 2018

The AWS Deep Learning AMIs for Ubuntu and Amazon Linux now come with advanced optimizations for TensorFlow 1.8 to deliver higher-performance training on Amazon EC2 C5 and P3 instances.

For CPU-based training scenarios, the AMIs now include TensorFlow 1.8 built with Intel’s Advanced Vector Extensions (AVX), SSE, and FMA instruction sets to accelerate vector and floating-point computations. The AMIs are also fully configured with Intel MKL-DNN to accelerate the math routines used in neural network training on Amazon EC2 C5 instances. In our tests, training the ResNet-50 benchmark on the ImageNet dataset with a batch size of 32 on a c5.18xlarge instance was 7X faster with the optimized build than with the stock TensorFlow 1.8 binaries.
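As a rough way to observe the effect of these CPU optimizations, the sketch below times a large matrix multiplication, an operation that benefits directly from AVX/FMA and MKL-DNN. This is a minimal illustration, assuming an activated TensorFlow environment on a C5 instance; the matrix size and iteration count are arbitrary and are not taken from the benchmark described above.

```python
# Minimal CPU micro-benchmark sketch (illustrative only): times a large
# matrix multiplication, which the AVX/FMA and MKL-DNN optimizations in
# the AMI's TensorFlow build are intended to accelerate.
import time
import tensorflow as tf

# Build a graph with a single large matmul op (sizes are arbitrary).
a = tf.random_normal([4096, 4096])
b = tf.random_normal([4096, 4096])
c = tf.matmul(a, b)

with tf.Session() as sess:
    sess.run(c)  # warm-up run
    start = time.time()
    for _ in range(10):
        sess.run(c)
    print("avg matmul time: %.3f s" % ((time.time() - start) / 10))
```

Running the same snippet against a stock TensorFlow build on the same instance gives a simple point of comparison.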

In addition, to improve training performance for GPU-based scenarios, the AMIs include an optimized build of TensorFlow 1.8 fully configured with NVIDIA CUDA 9 and cuDNN 7 to take advantage of mixed-precision training on Volta V100 GPUs powering Amazon EC2 P3 instances.
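In TensorFlow 1.8, mixed-precision training is typically expressed by hand: float32 master weights, float16 compute, and loss scaling so small gradients are not flushed to zero. The sketch below shows that general pattern, which Volta V100 Tensor Cores accelerate; it is an illustrative example, not the AMI's or TensorFlow's canonical recipe, and the shapes, learning rate, and loss-scale value are placeholders.

```python
# Illustrative mixed-precision pattern: float32 master weights,
# float16 forward/backward compute, and static loss scaling.
import tensorflow as tf

x = tf.placeholder(tf.float16, [None, 1024])
labels = tf.placeholder(tf.float16, [None, 10])

# Master weights in float32, cast to float16 for the compute path.
w_fp32 = tf.get_variable("w", [1024, 10], dtype=tf.float32)
w_fp16 = tf.cast(w_fp32, tf.float16)

logits = tf.matmul(x, w_fp16)
loss = tf.reduce_mean(tf.squared_difference(logits, labels))

# Scale the loss so small float16 gradients survive, then unscale the
# gradients before applying them to the float32 master weights.
loss_scale = 128.0
grads = tf.gradients(tf.cast(loss, tf.float32) * loss_scale, [w_fp32])
train_op = tf.train.GradientDescentOptimizer(0.01).apply_gradients(
    [(grads[0] / loss_scale, w_fp32)])
```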

When you activate a virtual environment, the Deep Learning AMIs automatically deploy the higher-performance build of TensorFlow, as well as other deep learning frameworks such as Chainer and CNTK, optimized for the EC2 instance of your choice.
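After activating an environment, a quick sanity check like the sketch below confirms which TensorFlow version is in use and whether a GPU is visible, which is handy when moving between C5 (CPU) and P3 (GPU) instances. This is a minimal illustration; the environment names on the AMI are not shown here.

```python
# Quick post-activation check: prints the TensorFlow version and whether
# a GPU device is visible to TensorFlow.
import tensorflow as tf

print("TensorFlow version: %s" % tf.__version__)          # expect 1.8.x
print("GPU available: %s" % tf.test.is_gpu_available())   # True on P3 instances
```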

Get started with the AWS Deep Learning AMIs using our quick getting started tutorial, and visit our developer guide for more tutorials and resources. You can also subscribe to our discussion forum to get launch announcements and post your questions.