AWS Machine Learning Blog

Category: Amazon SageMaker

Monitoring in-production ML models at large scale using Amazon SageMaker Model Monitor

Machine learning (ML) models are impacting business decisions of organizations around the globe, from retail and financial services to autonomous vehicles and space exploration. For these organizations, training and deploying ML models into production is only one step towards achieving business goals. Model performance may degrade over time for several reasons, such as changing consumer […]

Read More

Training a reinforcement learning Agent with Unity and Amazon SageMaker RL

Unity is one of the most popular game engines that has been adopted not only for video game development but also by industries such as film and automotive. Unity offers tools to create virtual simulated environments with customizable physics, landscapes, and characters. The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables […]

Read More

Exploratory data analysis, feature engineering, and operationalizing your data flow into your ML pipeline with Amazon SageMaker Data Wrangler

According to The State of Data Science 2020 survey, data management, exploratory data analysis (EDA), feature selection, and feature engineering account for more than 66% of a data scientist’s time (see the following diagram). The same survey highlights that the top three biggest roadblocks to deploying a model in production are managing dependencies and environments, […]

Read More

Identifying training bottlenecks and system resource under-utilization with Amazon SageMaker Debugger

At AWS re:Invent 2020, AWS released the profiling functionality for Amazon SageMaker Debugger. In this post, we expand on the importance of profiling deep neural network (DNN) training, review some of the common performance bottlenecks you might encounter, and demonstrate how to use the profiling feature in Debugger to detect such bottlenecks. In the context […]

Read More

Using streaming ingestion with Amazon SageMaker Feature Store to make ML-backed decisions in near-real time

Businesses are increasingly using machine learning (ML) to make near-real-time decisions, such as placing an ad, assigning a driver, recommending a product, or even dynamically pricing products and services. ML models make predictions given a set of input data known as features, and data scientists easily spend more than 60% of their time designing […]

Read More

AWS and NVIDIA achieve the fastest training times for Mask R-CNN and T5-3B

Note: At the AWS re:Invent Machine Learning Keynote, we announced performance records for T5-3B and Mask R-CNN. This blog post includes updated numbers with additional optimizations since the keynote aired live on 12/8. At re:Invent 2019, we demonstrated the fastest training times on the cloud for Mask R-CNN, a popular instance segmentation model, and BERT, a […]

Read More

Customizing and reusing models generated by Amazon SageMaker Autopilot

Amazon SageMaker Autopilot automatically trains and tunes the best machine learning (ML) models for classification or regression problems while allowing you to maintain full control and visibility. This not only allows data analysts, developers, and data scientists to train, tune, and deploy models with little to no code, but you can also review a generated […]

Read More

Identify bottlenecks, improve resource utilization, and reduce ML training costs with the deep profiling feature in Amazon SageMaker Debugger

Machine learning (ML) has shown great promise across domains such as predictive analysis, speech processing, image recognition, recommendation systems, bioinformatics, and more. Training ML models is a time- and compute-intensive process, requiring multiple training runs with different hyperparameters before a model yields acceptable accuracy. CPU- and GPU-based distributed training with frameworks such as Horovod and […]

Read More

New Amazon SageMaker Neo features to run more models faster and more efficiently on more hardware platforms

Amazon SageMaker Neo enables developers to train machine learning (ML) models once and optimize them to run on any Amazon SageMaker endpoints in the cloud and supported devices at the edge. Since Neo was first announced at re:Invent 2018, we have been continuously working with the Neo-AI open-source communities and several hardware partners to increase […]

Read More

Model dynamism Support in Amazon SageMaker Neo

Amazon SageMaker Neo was launched at AWS re:Invent 2018. It delivered notable performance improvements on models with statically known input and output data shapes, typically image classification models. These models are usually composed of a stack of blocks that contain compute-intensive operators, such as convolution and matrix multiplication. Neo applies a series of optimizations to […]

Read More