AWS Open Source Blog

Category: Amazon Machine Learning

Virtual GPU device plugin for inference workloads in Kubernetes

Machine learning (ML) has become a centerpiece for enterprise transformation. AWS provides a broad and deep set of ML capabilities for builders with all levels of expertise. Developers with no prior ML experience can seamlessly build sophisticated AI-driven applications using AWS AI services. Developers and data scientists can use Amazon SageMaker, a managed machine learning […]

Running TorchServe on Amazon Elastic Kubernetes Service

This article was contributed by Josiah Davis, Charles Frenzel, and Chen Wu. TorchServe is a model serving library that makes it easy to deploy and manage PyTorch models at scale in production environments. TorchServe removes the heavy lifting of deploying and serving PyTorch models with Kubernetes. TorchServe is built and maintained by AWS in collaboration […]

How Amazon retail systems run machine learning predictions with Apache Spark using Deep Java Library

Today more and more companies are taking a personalized approach to content and marketing. For example, retailers are personalizing product recommendations and promotions for customers. An important step toward providing personalized recommendations is to identify a customer’s propensity to take action for a certain category. This propensity is based on a customer’s preferences and past […]

Deploy machine learning models to Amazon SageMaker using the ezsmdeploy Python package and a few lines of code

Customers on AWS deploy trained machine learning (ML) and deep learning (DL) models in production using Amazon SageMaker, and using other services such as AWS Lambda, AWS Fargate, AWS Elastic Beanstalk, and Amazon Elastic Compute Cloud (Amazon EC2) to name a few. Amazon SageMaker provides SDKs and a console-only workflow to deploy trained models, and […]

Adopting machine learning in your microservices with DJL (Deep Java Library) and Spring Boot

Many AWS customers—startups and large enterprises—are on a path to adopt machine learning and deep learning in their existing applications. The reasons for machine learning adoption are dictated by the pace of innovation in the industry, with business use cases ranging from customer service (including object detection from images and video streams, sentiment analysis) to […]

Machine learning with AutoGluon, an open source AutoML library

If you work in data science, you might think that the hardest thing about machine learning is not knowing when you’ll be done. You start with a problem, a dataset, and an idea about how to solve it, but you never know whether your approach is going to work until later, after you’ve wasted time. […]

Why use Docker containers for machine learning development?

I like prototyping on my laptop as much as the next person. When I want to collaborate, I push my code to GitHub and invite collaborators. And when I want to run experiments and need more compute power, I rent CPU and GPU instances in the cloud, copy my code and dependencies over, and run […]

re:Cap part one – open source at re:Invent 2019

As the dust settles after another re:Invent closes, I wanted to put together a quick summary of all the open source-related announcements that happened in the run-up to this year’s re:Invent and the week itself. If you are interested in open source in mobile web development, DevOps, containers, security, big data and data analytics, […]

Announcing Gluon Time Series, an Open-Source Time Series Modeling Toolkit

Today, we announce the availability of Gluon Time Series (GluonTS), an Apache MXNet-based toolkit for time series analysis using the Gluon API. We are excited to give researchers and practitioners working with time series data access to this toolkit, which we have built for our own needs as applied scientists working on real-world industrial time […]

Best Practices for Optimizing Distributed Deep Learning Performance on Amazon EKS

Chinese version – In this post, we will demonstrate how to create a fully managed Kubernetes cluster on AWS using Amazon Elastic Container Service for Kubernetes (Amazon EKS), and how to run distributed deep learning training jobs using Kubeflow and the AWS FSx CSI driver. We will then discuss best practices to optimize machine learning training performance […]
