AWS Machine Learning Blog

Category: AWS Deep Learning AMIs

Model serving in Java with AWS Elastic Beanstalk made easy with Deep Java Library

Deploying your machine learning (ML) models to run on a REST endpoint has never been easier. Using AWS Elastic Beanstalk and Amazon Elastic Compute Cloud (Amazon EC2) to host your endpoint and Deep Java Library (DJL) to load your deep learning models for inference makes the model deployment process extremely easy to set up. Setting […]

How to run distributed training using Horovod and MXNet on AWS DL Containers and AWS Deep Learning AMIs

Distributed training of large deep learning models has become indispensable for computer vision (CV) and natural language processing (NLP) applications. Open source frameworks such as Horovod provide distributed training support for Apache MXNet, PyTorch, and TensorFlow. Converting your non-distributed Apache MXNet training script to use distributed training with Horovod only […]

Multi-GPU distributed deep learning training at scale with Ubuntu18 DLAMI, EFA on P3dn instances, and Amazon FSx for Lustre

AWS Deep Learning AMI (Ubuntu 18.04) is optimized for deep learning on EC2 Accelerated Computing Instance types, allowing you to scale out to multiple nodes for distributed workloads more efficiently and easily. It has a prebuilt Elastic Fabric Adapter (EFA), Nvidia GPU stack, and many deep learning frameworks (TensorFlow, MXNet, PyTorch, Chainer, Keras) for distributed […]

AWS Deep Learning AMIs now come with TensorFlow 1.13, MXNet 1.4, and support Amazon Linux 2

The AWS Deep Learning AMIs now come with MXNet 1.4.0, Chainer 5.3.0, and TensorFlow 1.13.1, which is custom-built directly from source and tuned for high-performance training across Amazon EC2 instances. AWS Deep Learning AMIs are now available on Amazon Linux 2. Developers can now use the AWS Deep Learning AMIs and Deep Learning Base AMI on […]

Deploy TensorFlow models with Amazon Elastic Inference using a flexible new Python API available in EI-enabled TensorFlow 1.12

Amazon Elastic Inference (EI) now supports the latest version of TensorFlow, 1.12. It provides EIPredictor, a new easy-to-use Python API function for deploying TensorFlow models using EI accelerators. You can now use this new Python API function within your inference scripts as an alternative to using TensorFlow Serving when running TensorFlow models with EI. EIPredictor allows […]

Scalable multi-node training with TensorFlow

We’ve heard from customers that scaling TensorFlow training jobs to multiple nodes and GPUs successfully is hard. TensorFlow has distributed training built-in, but it can be difficult to use. Recently, we made optimizations to TensorFlow and Horovod to help AWS customers scale TensorFlow training jobs to multiple nodes and GPUs. With these improvements, any AWS customer […]

PyTorch 1.0 preview now available in Amazon SageMaker and the AWS Deep Learning AMIs

Amazon SageMaker and the AWS Deep Learning AMIs (DLAMI) now provide an easy way to evaluate the PyTorch 1.0 preview release. PyTorch 1.0 adds seamless research-to-production capabilities, while retaining the ease-of-use that has enabled PyTorch to rapidly gain popularity. The AWS Deep Learning AMI comes pre-built with PyTorch 1.0, Anaconda, and Python packages, with CUDA and […]

New speed record set for training deep learning models on AWS

fast.ai, a research lab dedicated to making deep learning more accessible, has announced that they successfully trained the ResNet-50 deep learning model on a million images in 18 minutes using 16 Amazon EC2 P3.16xlarge instances. They accomplished this milestone by spending just $40. This new speed record illustrates how you can drastically cut down the training times for deep learning models, enabling you to bring your innovations to market faster and at a lower cost.

AWS Deep Learning AMIs now include ONNX, enabling model portability across deep learning frameworks

The AWS Deep Learning AMIs (DLAMI) for Ubuntu and Amazon Linux are now pre-installed and fully configured with Open Neural Network Exchange (ONNX), enabling model portability across deep learning frameworks. In this blog post we’ll introduce ONNX, and demonstrate how ONNX can be used on the DLAMI to port models across frameworks. What is ONNX? ONNX is an open […]

AWS Deep Learning AMIs now with optimized TensorFlow 1.9 and Apache MXNet 1.2 with Keras 2 support to accelerate deep learning on Amazon EC2 instances

The AWS Deep Learning AMIs for Ubuntu and Amazon Linux now come with an optimized build of TensorFlow 1.9, custom-built directly from source and fine-tuned for high-performance training across Amazon EC2 instances. In addition, the AMIs come with the latest Apache MXNet 1.2 with several performance and usability improvements, the new Keras 2-MXNet backend […]