AWS Machine Learning Blog

Category: AWS Deep Learning AMIs

AWS Deep Learning AMIs: New framework-specific DLAMIs for production complement the original multi-framework DLAMIs

Since its launch in November 2017, the AWS Deep Learning Amazon Machine Image (DLAMI) has been the preferred method for running deep learning frameworks on Amazon Elastic Compute Cloud (Amazon EC2). For deep learning practitioners and learners who want to accelerate deep learning in the cloud, the DLAMI comes pre-installed with AWS-optimized deep learning (DL) frameworks […]

Read More

Run AlphaFold v2.0 on Amazon EC2

After the article in Nature about the open-source of AlphaFold v2.0 on GitHub by DeepMind, many in the scientific and research community have wanted to try out DeepMind’s AlphaFold implementation firsthand. With compute resources through Amazon Elastic Compute Cloud (Amazon EC2) with Nvidia GPU, you can quickly get AlphaFold running and try it out yourself. […]

Read More

Train and deploy deep learning models using JAX with Amazon SageMaker

Amazon SageMaker is a fully managed service that enables developers and data scientists to quickly and easily build, train, and deploy machine learning (ML) models at any scale. Typically, you can use the pre-built and optimized training and inference containers that have been optimized for AWS hardware. Although those containers cover many deep learning workloads, you may have […]

Read More

Join AWS at NVIDIA GTC 21, April 12–16

Starting Monday, April 12, 2021, the NVIDIA GPU Technology Conference (GTC) is offering online sessions for you to learn AWS best practices to accomplish your machine learning (ML), virtual workstations, high performance computing (HPC), and Internet of Things (IoT) goals faster and more easily. Amazon Elastic Compute Cloud (Amazon EC2) instances powered by NVIDIA GPUs […]

Read More

Model serving in Java with AWS Elastic Beanstalk made easy with Deep Java Library

Deploying your machine learning (ML) models to run on a REST endpoint has never been easier. Using AWS Elastic Beanstalk and Amazon Elastic Compute Cloud (Amazon EC2) to host your endpoint and Deep Java Library (DJL) to load your deep learning models for inference makes the model deployment process extremely easy to set up. Setting […]

Read More

How to run distributed training using Horovod and MXNet on AWS DL Containers and AWS  Deep Learning AMIs

Distributed training of large deep learning models has become an indispensable way of model training for computer vision (CV) and natural language processing (NLP) applications. Open source frameworks such as Horovod provide distributed training support to Apache MXNet, PyTorch, and TensorFlow. Converting your non-distributed Apache MXNet training script to use distributed training with Horovod only […]

Read More

Multi-GPU distributed deep learning training at scale with Ubuntu18 DLAMI, EFA on P3dn instances, and Amazon FSx for Lustre

AWS Deep Learning AMI (Ubuntu 18.04) is optimized for deep learning on EC2 Accelerated Computing Instance types, allowing you to scale out to multiple nodes for distributed workloads more efficiently and easily. It has a prebuilt Elastic Fabric Adapter (EFA), Nvidia GPU stack, and many deep learning frameworks (TensorFlow, MXNet, PyTorch, Chainer, Keras) for distributed […]

Read More

AWS Deep Learning AMIs now come with TensorFlow 1.13, MXNet 1.4, and support Amazon Linux 2

The AWS Deep Learning AMIs now come with MXNet 1.4.0, Chainer 5.3.0, and TensorFlow 1.13.1, which is custom-built directly from source and tuned for high-performance training across Amazon EC2 instances. AWS Deep Learning AMIs are now available on Amazon Linux 2 Developers can now use the AWS Deep Learning AMIs and Deep Learning Base AMI on […]

Read More

Deploy TensorFlow models with Amazon Elastic Inference using a flexible new Python API available in EI-enabled TensorFlow 1.12

Amazon Elastic Inference (EI) now supports the latest version of TensorFlow­–1.12. It provides EIPredictor, a new easy-to-use Python API function for deploying TensorFlow models using EI accelerators. You can now use this new Python API function within your inference scripts as an alternative to using TensorFlow Serving when running TensorFlow models with EI. EIPredictor allows […]

Read More

Scalable multi-node training with TensorFlow

We’ve heard from customers that scaling TensorFlow training jobs to multiple nodes and GPUs successfully is hard. TensorFlow has distributed training built-in, but it can be difficult to use. Recently, we made optimizations to TensorFlow and Horovod to help AWS customers scale TensorFlow training jobs to multiple nodes and GPUs. With these improvements, any AWS customer […]

Read More