AWS Machine Learning Blog

Category: Amazon Elastic Inference

Reduce inference costs on Amazon EC2 for PyTorch models with Amazon Elastic Inference

You can now use Amazon Elastic Inference to accelerate inference and reduce inference costs for PyTorch models in both Amazon SageMaker and Amazon EC2. PyTorch is a popular deep learning framework that uses dynamic computational graphs. This allows you to easily develop deep learning models with imperative and idiomatic Python code. Inference is the process […]
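
As a rough illustration of what EI-accelerated PyTorch inference looks like on EC2, here is a minimal sketch. It assumes the Elastic Inference enabled PyTorch build (for example, from the AWS Deep Learning AMI) and a TorchScript model saved as model.pt; the two-argument form of torch.jit.optimized_execution is specific to that build, and the file name and input shape are placeholders.

```python
import torch

# Elastic Inference requires a TorchScript (traced or scripted) model,
# not an eager-mode nn.Module.
model = torch.jit.load('model.pt')
model.eval()

example = torch.rand(1, 3, 224, 224)  # hypothetical image-shaped input

# In the EI-enabled PyTorch build, optimized_execution accepts a second
# argument selecting the accelerator; 'eia:0' is the first attached device.
with torch.no_grad():
    with torch.jit.optimized_execution(True, {'target_device': 'eia:0'}):
        output = model(example)
```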

Increasing performance and reducing the cost of MXNet inference using Amazon SageMaker Neo and Amazon Elastic Inference

When running deep learning models in production, balancing infrastructure cost against model latency is always an important consideration. At re:Invent 2018, AWS introduced Amazon SageMaker Neo and Amazon Elastic Inference, two services that can make deep learning models more efficient to run. In most deep learning applications, making predictions using a trained model—a process called inference—can […]
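
To make the two options concrete, here is a hedged sketch using the SageMaker Python SDK of that era; the estimator definition, S3 paths, input shape, and role name are all placeholders, not code from the post.

```python
from sagemaker.mxnet import MXNet

# Hypothetical training job; the script, role, and bucket are placeholders.
estimator = MXNet(entry_point='train.py',
                  role='SageMakerRole',
                  train_instance_count=1,
                  train_instance_type='ml.p3.2xlarge',
                  framework_version='1.2.1')
estimator.fit('s3://my-bucket/training-data/')

# Option 1: compile the trained model with SageMaker Neo for a target
# instance family, then deploy the compiled artifact.
compiled_model = estimator.compile_model(
    target_instance_family='ml_c5',
    input_shape={'data': [1, 3, 224, 224]},
    output_path='s3://my-bucket/neo-output/',
    framework='mxnet',
    framework_version='1.2.1')
neo_predictor = compiled_model.deploy(initial_instance_count=1,
                                      instance_type='ml.c5.xlarge')

# Option 2: deploy the uncompiled model on a CPU instance with an
# Elastic Inference accelerator attached.
eia_predictor = estimator.deploy(initial_instance_count=1,
                                 instance_type='ml.m5.large',
                                 accelerator_type='ml.eia1.medium')
```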

Reduce ML inference costs on Amazon SageMaker for PyTorch models using Amazon Elastic Inference

Today, we are excited to announce that you can now use Amazon Elastic Inference to accelerate inference and reduce inference costs for PyTorch models in both Amazon SageMaker and Amazon EC2. PyTorch is a popular deep learning framework that uses dynamic computational graphs. This allows you to easily develop deep learning models with imperative and […]
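
On SageMaker, attaching an accelerator is a single deploy-time parameter. Here is a minimal sketch with the SageMaker Python SDK, assuming a TorchScript model archive already in S3; the bucket, role, and entry-point names are hypothetical.

```python
from sagemaker.pytorch import PyTorchModel

# framework_version must be one with EI-enabled images (1.3.1 at launch).
pytorch_model = PyTorchModel(model_data='s3://my-bucket/model/model.tar.gz',
                             role='SageMakerRole',
                             entry_point='inference.py',
                             framework_version='1.3.1')

# accelerator_type attaches an Elastic Inference accelerator to the
# CPU-backed endpoint instance.
predictor = pytorch_model.deploy(initial_instance_count=1,
                                 instance_type='ml.m5.large',
                                 accelerator_type='ml.eia2.medium')
```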

Optimizing TensorFlow model serving with Kubernetes and Amazon Elastic Inference

This post offers a deep dive into how to use Amazon Elastic Inference with Amazon Elastic Kubernetes Service (Amazon EKS). When you combine Elastic Inference with EKS, you can run low-cost, scalable inference workloads with your preferred container orchestration system. Elastic Inference is an increasingly popular way to run low-cost inference workloads on AWS. It allows you […]
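
From a client's perspective, an EI-backed TensorFlow Serving deployment on EKS looks like any other TF Serving endpoint; the EI-enabled serving binary only changes what runs server-side. A minimal sketch of a REST client, with a hypothetical service host and model name:

```python
import json

import numpy as np
import requests

# TensorFlow Serving's REST API listens on port 8501 by default and
# exposes models at /v1/models/<name>:predict.
url = 'http://my-eks-service:8501/v1/models/resnet:predict'
payload = {'instances': np.random.rand(1, 224, 224, 3).tolist()}

response = requests.post(url, data=json.dumps(payload))
response.raise_for_status()
predictions = response.json()['predictions']
print(len(predictions))
```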

Serving deep learning at Curalate with Apache MXNet, AWS Lambda, and Amazon Elastic Inference

This is a guest blog post by Jesse Brizzi, a computer vision research engineer at Curalate. At Curalate, we’re always coming up with new ways to use deep learning and computer vision to find and leverage user-generated content (UGC) and activate influencers. Some of these applications, like Intelligent Product Tagging, require deep learning models to […]

Optimizing costs in Amazon Elastic Inference with TensorFlow

Amazon Elastic Inference allows you to attach low-cost GPU-powered acceleration to Amazon EC2 and Amazon SageMaker instances, and reduce the cost of running deep learning inference by up to 75 percent. The EIPredictor API makes it easy to use Elastic Inference. In this post, we use the EIPredictor and walk through a step-by-step example of using TensorFlow with Elastic Inference. Additionally, we […]
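
For orientation, here is a minimal sketch of the EIPredictor pattern. The import path below matches the EI-enabled TensorFlow packaged on the AWS Deep Learning AMI at the time and should be treated as an assumption, as should the model directory and input name.

```python
import numpy as np

# Ships with EI-enabled TensorFlow on the Deep Learning AMI; the exact
# module path may differ by release.
from ei_for_tf.python.predictor.ei_predictor import EIPredictor

# Point the predictor at a TensorFlow SavedModel directory (placeholder path).
predictor = EIPredictor(model_dir='/home/ubuntu/resnet_savedmodel')

# Called like tf.contrib.predictor: a dict of named inputs in, a dict
# of named outputs back.
image = np.random.rand(1, 224, 224, 3).astype(np.float32)
outputs = predictor({'input': image})
```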

Running Java-based deep learning with MXNet and Amazon Elastic Inference

The new release of MXNet 1.4 for Amazon Elastic Inference now includes Java and Scala support. Apache MXNet is an open source deep learning framework used to build, train, and deploy deep neural networks. Amazon Elastic Inference (EI) is a service that allows you to attach low-cost GPU-powered acceleration to Amazon EC2 and Amazon SageMaker […]

Launch EI accelerators in minutes with the Amazon Elastic Inference setup tool for EC2

The Amazon Elastic Inference (EI) setup tool is a Python script that enables you to quickly get started with EI. Elastic Inference allows you to attach low-cost GPU-powered acceleration to Amazon EC2 and Amazon SageMaker instances to reduce the cost of running deep learning inference by up to 75 percent. If you are using EI for the […]
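
The main prerequisite the setup tool automates is network plumbing: Elastic Inference accelerators are reached through an AWS PrivateLink VPC endpoint. A hedged boto3 sketch of that one step, where the VPC, subnet, and security-group IDs are placeholders:

```python
import boto3

ec2 = boto3.client('ec2', region_name='us-west-2')

# Elastic Inference traffic flows over a PrivateLink interface endpoint
# for the regional elastic-inference.runtime service.
response = ec2.create_vpc_endpoint(
    VpcEndpointType='Interface',
    VpcId='vpc-0123456789abcdef0',
    ServiceName='com.amazonaws.us-west-2.elastic-inference.runtime',
    SubnetIds=['subnet-0123456789abcdef0'],
    SecurityGroupIds=['sg-0123456789abcdef0'],
    PrivateDnsEnabled=True)

print(response['VpcEndpoint']['VpcEndpointId'])
```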

Reducing deep learning inference cost with MXNet and Amazon Elastic Inference

Amazon Elastic Inference (Amazon EI) is a service that allows you to attach low-cost GPU-powered acceleration to Amazon EC2 and Amazon SageMaker instances. MXNet has supported Amazon EI since its initial release at AWS re:Invent 2018. In this blog post, we’ll explore the cost and performance benefits of using Amazon EI with MXNet. We’ll walk […]
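
In EI-enabled MXNet, the accelerator appears as one more context alongside mx.cpu() and mx.gpu(). A minimal sketch, assuming that build and a pre-trained checkpoint on disk; the model name and input shape are placeholders.

```python
import mxnet as mx
import numpy as np

# Load a symbol + params checkpoint saved as resnet-152-symbol.json /
# resnet-152-0000.params (hypothetical files).
sym, arg_params, aux_params = mx.model.load_checkpoint('resnet-152', 0)

# mx.eia() is the accelerator context added by the EI-enabled build;
# it is not present in stock Apache MXNet.
mod = mx.mod.Module(symbol=sym, context=mx.eia(), label_names=None)
mod.bind(for_training=False, data_shapes=[('data', (1, 3, 224, 224))])
mod.set_params(arg_params, aux_params, allow_missing=True)

batch = mx.io.DataBatch([mx.nd.array(np.random.rand(1, 3, 224, 224))])
mod.forward(batch, is_train=False)
prob = mod.get_outputs()[0].asnumpy()
```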

Model serving with Amazon Elastic Inference

Amazon Elastic Inference (EI) is a service that allows you to attach low-cost GPU-powered acceleration to Amazon EC2 and Amazon SageMaker instances. EI reduces the cost of running deep learning inference by up to 75%. Model Server for Apache MXNet (MMS) enables deployment of MXNet- and ONNX-based models for inference at scale. In this blog […]
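
Once MMS is running (with the EI-enabled MXNet build doing the actual computation), clients hit a plain REST API. A minimal sketch with a hypothetical model name and test image:

```python
import requests

# MMS serves inference requests at /predictions/<model_name>,
# port 8080 by default.
url = 'http://localhost:8080/predictions/squeezenet'

with open('kitten.jpg', 'rb') as f:  # hypothetical test image
    response = requests.post(url, data=f)

print(response.json())
```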