AWS Machine Learning Blog

Category: Amazon SageMaker Neo

ML inferencing at the edge with Amazon SageMaker Edge and Ambarella CV25

Ambarella builds computer vision SoCs (systems on chips) based on CVflow, a highly efficient AI chip architecture that provides the deep neural network (DNN) processing required for edge inferencing use cases such as intelligent home monitoring and smart surveillance cameras. Developers convert models trained with frameworks such as TensorFlow or MXNet to Ambarella CVflow format […]
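
As a rough sketch of that flow (not code from the post), the snippet below starts a Neo compilation job targeting the CV25 with boto3. The job name, role, S3 paths, and input tensor name are placeholders; DataInputConfig must match your model.

```python
import boto3

sm = boto3.client("sagemaker")

# Hypothetical names and S3 paths; "amba_cv25" is the Neo target for this SoC.
sm.create_compilation_job(
    CompilationJobName="monitoring-cam-model-cv25",
    RoleArn="arn:aws:iam::123456789012:role/SageMakerRole",
    InputConfig={
        "S3Uri": "s3://my-bucket/model.tar.gz",
        "DataInputConfig": '{"input_1": [1, 224, 224, 3]}',  # NHWC for TensorFlow
        "Framework": "TENSORFLOW",
    },
    OutputConfig={
        "S3OutputLocation": "s3://my-bucket/cvflow-output/",
        "TargetDevice": "amba_cv25",
    },
    StoppingCondition={"MaxRuntimeInSeconds": 900},
)
```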


Unlock near-3x performance gains with XGBoost and Amazon SageMaker Neo

October 2021: This post has been updated with a new sample notebook for Amazon SageMaker Studio users. When a model is deployed to a production environment, inference speed matters. Models with fast inference speeds require fewer resources to run, which translates into cost savings, and applications that consume the models’ predictions benefit from the improved […]
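
For orientation (placeholder paths, not code from the post): compiling XGBoost with Neo reuses the create_compilation_job call sketched above; only the configuration changes. The feature count below is an assumption.

```python
# Plug these into boto3's create_compilation_job as InputConfig / OutputConfig.
input_config = {
    "S3Uri": "s3://my-bucket/xgboost-model.tar.gz",
    "DataInputConfig": '{"data": [1, 30]}',  # [batch, num_features]; adjust to your model
    "Framework": "XGBOOST",
}
output_config = {
    "S3OutputLocation": "s3://my-bucket/neo-xgb/",
    "TargetDevice": "ml_c5",  # compile for the instance family you deploy to
}
```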


Build reusable, serverless inference functions for your Amazon SageMaker models using AWS Lambda layers and containers

December 1, 2021: Amazon SageMaker Serverless Inference (preview) is a new inference option that enables you to easily deploy machine learning models for inference without having to configure or manage the underlying infrastructure. Watch for further details, and check out the official announcement to learn more! In AWS, you can host a trained model […]
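
The pattern is to bake the model and its dependencies into a Lambda layer or container image and cache the loaded model across warm invocations. A minimal sketch, assuming a scikit-learn model serialized with joblib at a hypothetical layer path:

```python
import json
import joblib  # assumed to be provided by a Lambda layer or container image

MODEL_PATH = "/opt/ml/model.joblib"  # /opt is where Lambda mounts layers
_model = None  # module-level cache reused across warm invocations

def handler(event, context):
    global _model
    if _model is None:  # cold start: load the model once
        _model = joblib.load(MODEL_PATH)
    features = json.loads(event["body"])["features"]
    prediction = _model.predict([features]).tolist()
    return {"statusCode": 200, "body": json.dumps({"prediction": prediction})}
```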


Reduce ML inference costs on Amazon SageMaker with hardware and software acceleration

Amazon SageMaker is a fully managed service that enables data scientists and developers to build, train, and deploy machine learning (ML) models at 50% lower TCO than self-managed deployments on Amazon Elastic Compute Cloud (Amazon EC2). Amazon Elastic Inference is a capability of SageMaker that delivers 20% better performance for model inference than AWS Deep Learning Containers on […]
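
Attaching an accelerator is a one-line change at deployment time. A sketch with the SageMaker Python SDK, assuming a TensorFlow artifact and a placeholder role; instance and accelerator sizes are illustrative:

```python
from sagemaker.tensorflow import TensorFlowModel

# Hypothetical artifact and role.
model = TensorFlowModel(
    model_data="s3://my-bucket/model.tar.gz",
    role="arn:aws:iam::123456789012:role/SageMakerRole",
    framework_version="2.3",
)

# A CPU instance plus an Elastic Inference accelerator for GPU-class inference.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.c5.large",
    accelerator_type="ml.eia2.medium",
)
```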


Monitor and manage anomaly detection models on a fleet of wind turbines with Amazon SageMaker Edge Manager

September 8, 2021: Amazon Elasticsearch Service has been renamed Amazon OpenSearch Service. In industrial IoT, running machine learning (ML) models on edge devices is necessary for many use cases, such as predictive maintenance, quality improvement, real-time monitoring, process optimization, and security. The energy industry, for instance, invests heavily in ML to automate […]
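
After a model is compiled with Neo, Edge Manager packages it for the device fleet. A sketch with boto3, using hypothetical job, model, and bucket names:

```python
import boto3

sm = boto3.client("sagemaker")

# Assumes a finished Neo compilation job for the anomaly detection model.
sm.create_edge_packaging_job(
    EdgePackagingJobName="wind-turbine-anomaly-pkg-v1",
    CompilationJobName="wind-turbine-anomaly-neo",
    ModelName="WindTurbineAnomalyDetector",
    ModelVersion="1.0",
    RoleArn="arn:aws:iam::123456789012:role/SageMakerRole",
    OutputConfig={"S3OutputLocation": "s3://my-bucket/edge-packages/"},
)
```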


New Amazon SageMaker Neo features to run more models faster and more efficiently on more hardware platforms

Amazon SageMaker Neo enables developers to train machine learning (ML) models once and optimize them to run on Amazon SageMaker endpoints in the cloud and on supported devices at the edge. Since Neo was first announced at re:Invent 2018, we have been working continuously with the Neo-AI open-source community and several hardware partners to increase […]
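
For illustration, Neo's compilation API can describe a target either as a named device or as a generic OS/architecture/accelerator triple. A hedged fragment with placeholder values:

```python
# OutputConfig using TargetPlatform instead of TargetDevice; it plugs into
# boto3's create_compilation_job as shown earlier on this page.
output_config = {
    "S3OutputLocation": "s3://my-bucket/neo-output/",  # hypothetical bucket
    "TargetPlatform": {"Os": "LINUX", "Arch": "ARM64", "Accelerator": "NVIDIA"},
}
```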


Model dynamism support in Amazon SageMaker Neo

Amazon SageMaker Neo was launched at AWS re:Invent 2018. It delivered notable performance improvements for models with statically known input and output data shapes, typically image classification models. Such models are usually composed of a stack of blocks that contain compute-intensive operators, such as convolution and matrix multiplication. Neo applies a series of optimizations to […]
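
"Dynamism" here means shapes or control flow that are only resolved at runtime. A toy PyTorch module (not from the post) showing a data-dependent output shape:

```python
import torch

class DynamicTopScores(torch.nn.Module):
    """Output length depends on the data, so the output shape is dynamic."""

    def forward(self, scores: torch.Tensor) -> torch.Tensor:
        keep = scores > 0.5   # data-dependent boolean mask
        return scores[keep]   # shape known only at runtime

# The two calls below return tensors of different lengths.
m = DynamicTopScores()
print(m(torch.tensor([0.9, 0.2, 0.7])))  # 2 elements
print(m(torch.tensor([0.1, 0.2, 0.3])))  # 0 elements
```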


Amazon SageMaker Neo makes it easier to get faster inference for more ML models with NVIDIA TensorRT

Amazon SageMaker Neo now uses the NVIDIA TensorRT acceleration library to speed up machine learning (ML) models on NVIDIA Jetson devices at the edge and on AWS g4dn and p3 instances in the AWS Cloud. Neo compiles models from TensorFlow, TFLite, MXNet, PyTorch, ONNX, and DarkNet to make optimal use of NVIDIA GPUs, providing […]
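
A hedged configuration fragment for that path, assuming a PyTorch ResNet artifact and placeholder buckets; it plugs into the create_compilation_job call shown earlier:

```python
# InputConfig / OutputConfig for boto3's create_compilation_job; names,
# shapes, and paths are placeholders.
input_config = {
    "S3Uri": "s3://my-bucket/resnet50.tar.gz",
    "DataInputConfig": '{"input0": [1, 3, 224, 224]}',
    "Framework": "PYTORCH",
}
output_config = {
    "S3OutputLocation": "s3://my-bucket/neo-trt/",
    "TargetDevice": "ml_g4dn",  # or an edge target such as "jetson_xavier"
}
```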


Optimizing ML models for iOS and macOS devices with Amazon SageMaker Neo and Core ML

Core ML is a machine learning (ML) model format, created and supported by Apple, for models that are compiled, deployed, and run on Apple devices. Developers who train their models in popular frameworks such as TensorFlow and PyTorch convert them to Core ML format to deploy them on Apple devices. AWS has automated the model conversion to Core […]
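
In Neo's compilation API, Core ML is simply another target. A hedged fragment with placeholder paths, assuming a TensorFlow image classifier:

```python
# InputConfig / OutputConfig for create_compilation_job (pattern shown above);
# bucket and tensor names are placeholders.
input_config = {
    "S3Uri": "s3://my-bucket/classifier.tar.gz",
    "DataInputConfig": '{"input_1": [1, 224, 224, 3]}',
    "Framework": "TENSORFLOW",
}
output_config = {
    "S3OutputLocation": "s3://my-bucket/coreml-output/",
    "TargetDevice": "coreml",  # emit a Core ML model artifact
}
```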


Speeding up TensorFlow, MXNet, and PyTorch inference with Amazon SageMaker Neo

Various machine learning (ML) optimizations are possible at every stage of the workflow, during or after training. Model compilation is one such optimization: it creates a more efficient implementation of a trained model. In 2018, we launched Amazon SageMaker Neo to compile ML models for many frameworks and many platforms. We created the ML compiler […]
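
Whichever framework you start from, compilation runs as an asynchronous job that you can poll. A minimal sketch with boto3 and a hypothetical job name:

```python
import time
import boto3

sm = boto3.client("sagemaker")

# Poll a Neo compilation job (name is hypothetical) until it finishes.
while True:
    status = sm.describe_compilation_job(
        CompilationJobName="mobilenet-neo-job"
    )["CompilationJobStatus"]
    if status in ("COMPLETED", "FAILED", "STOPPED"):
        break
    time.sleep(30)

print("Compilation finished with status:", status)
```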
