AWS Machine Learning Blog

Category: Tensorflow on AWS

Deploy variational autoencoders for anomaly detection with TensorFlow Serving on Amazon SageMaker

Anomaly detection is the process of identifying items, events, or occurrences that have different characteristics from the majority of the data. It has many applications in various fields, like fraud detection for credit cards, insurance, or healthcare; network intrusion detection for cybersecurity; KPI metrics monitoring for critical systems; and predictive maintenance for in-service equipment. There […]

Read More

How Contentsquare reduced TensorFlow inference latency with TensorFlow Serving on Amazon SageMaker

In this post, we present the results of a model serving experiment made by Contentsquare scientists with an innovative DL model trained to analyze HTML documents. We show how the Amazon SageMaker TensorFlow Serving solution helped Contentsquare address several serving challenges. Contentsquare’s challenge Contentsquare is a fast-growing French technology company empowering brands to build better […]

Read More

Maximize TensorFlow performance on Amazon SageMaker endpoints for real-time inference

Machine learning (ML) is realized in inference. The business problem you want your ML model to solve is the inferences or predictions that you want your model to generate. Deployment is the stage in which a model, after being trained, is ready to accept inference requests. In this post, we describe the parameters that you […]

Read More

Creating an end-to-end application for orchestrating custom deep learning HPO, training, and inference using AWS Step Functions

Amazon SageMaker hyperparameter tuning provides a built-in solution for scalable training and hyperparameter optimization (HPO). However, for some applications (such as those with a preference of different HPO libraries or customized HPO features), we need custom machine learning (ML) solutions that allow retraining and HPO. This post offers a step-by-step guide to build a custom deep […]

Read More

Implement checkpointing with TensorFlow for Amazon SageMaker Managed Spot Training

Customers often ask us how can they lower their costs when conducting deep learning training on AWS. Training deep learning models with libraries such as TensorFlow, PyTorch, and Apache MXNet usually requires access to GPU instances, which are AWS instances types that provide access to NVIDIA GPUs with thousands of compute cores. GPU instance types […]

Read More

Using container images to run TensorFlow models in AWS Lambda

TensorFlow is an open-source machine learning (ML) library widely used to develop neural networks and ML models. Those models are usually trained on multiple GPU instances to speed up training, resulting in expensive training time and model sizes up to a few gigabytes. After they’re trained, these models are deployed in production to produce inferences. […]

Read More
This dataset contains 500 images of bees that have been uploaded by iNaturalist users for the purposes of recording the observation and identification.

Training and deploying models using TensorFlow 2 with the Object Detection API on Amazon SageMaker

With the rapid growth of object detection techniques, several frameworks with packaged pre-trained models have been developed to provide users easy access to transfer learning. For example, GluonCV, Detectron2, and the TensorFlow Object Detection API are three popular computer vision frameworks with pre-trained models. In this post, we use Amazon SageMaker to build, train, and […]

Read More

AWS and NVIDIA achieve the fastest training times for Mask R-CNN and T5-3B

Note: At the AWS re:Invent Machine Learning Keynote we announced performance records for T5-3B and Mask-RCNN. This blog post includes updated numbers with additional optimizations since the keynote aired live on 12/8. At re:Invent 2019, we demonstrated the fastest training times on the cloud for Mask R-CNN, a popular instance segmentation model, and BERT, a […]

Read More

Visualizing TensorFlow training jobs with TensorBoard

TensorBoard is an open source toolkit for TensorFlow users that allows you to visualize a wide range of useful information about your model, from model graphs; to loss, accuracy, or custom metrics; to embedding projections, images, and histograms of weights and biases. This post demonstrates how to use TensorBoard with Amazon SageMaker training jobs, write […]

Read More

Introducing Amazon SageMaker Components for Kubeflow Pipelines

Today we’re announcing Amazon SageMaker Components for Kubeflow Pipelines. This post shows how to build your first Kubeflow pipeline with Amazon SageMaker components using the Kubeflow Pipelines SDK. Kubeflow is a popular open-source machine learning (ML) toolkit for Kubernetes users who want to build custom ML pipelines.  Kubeflow Pipelines is an add-on to Kubeflow that lets […]

Read More