AWS Machine Learning Blog

Category: Tensorflow on AWS

Train and deploy deep learning models using JAX with Amazon SageMaker

Amazon SageMaker is a fully managed service that enables developers and data scientists to quickly and easily build, train, and deploy machine learning (ML) models at any scale. Typically, you can use the pre-built and optimized training and inference containers that have been optimized for AWS hardware. Although those containers cover many deep learning workloads, […]

Read More

Run your TensorFlow job on Amazon SageMaker with a PyCharm IDE

As more machine learning (ML) workloads go into production, many organizations must bring ML workloads to market quickly and increase productivity in the ML model development lifecycle. However, the ML model development lifecycle is significantly different from an application development lifecycle. This is due in part to the amount of experimentation required before finalizing a […]

Read More

Deploy variational autoencoders for anomaly detection with TensorFlow Serving on Amazon SageMaker

Anomaly detection is the process of identifying items, events, or occurrences that have different characteristics from the majority of the data. It has many applications in various fields, like fraud detection for credit cards, insurance, or healthcare; network intrusion detection for cybersecurity; KPI metrics monitoring for critical systems; and predictive maintenance for in-service equipment. There […]

Read More

How Contentsquare reduced TensorFlow inference latency with TensorFlow Serving on Amazon SageMaker

In this post, we present the results of a model serving experiment made by Contentsquare scientists with an innovative DL model trained to analyze HTML documents. We show how the Amazon SageMaker TensorFlow Serving solution helped Contentsquare address several serving challenges. Contentsquare’s challenge Contentsquare is a fast-growing French technology company empowering brands to build better […]

Read More

Maximize TensorFlow performance on Amazon SageMaker endpoints for real-time inference

Machine learning (ML) is realized in inference. The business problem you want your ML model to solve is the inferences or predictions that you want your model to generate. Deployment is the stage in which a model, after being trained, is ready to accept inference requests. In this post, we describe the parameters that you […]

Read More

Creating an end-to-end application for orchestrating custom deep learning HPO, training, and inference using AWS Step Functions

Amazon SageMaker hyperparameter tuning provides a built-in solution for scalable training and hyperparameter optimization (HPO). However, for some applications (such as those with a preference of different HPO libraries or customized HPO features), we need custom machine learning (ML) solutions that allow retraining and HPO. This post offers a step-by-step guide to build a custom deep […]

Read More

Implement checkpointing with TensorFlow for Amazon SageMaker Managed Spot Training

Customers often ask us how can they lower their costs when conducting deep learning training on AWS. Training deep learning models with libraries such as TensorFlow, PyTorch, and Apache MXNet usually requires access to GPU instances, which are AWS instances types that provide access to NVIDIA GPUs with thousands of compute cores. GPU instance types […]

Read More

Using container images to run TensorFlow models in AWS Lambda

TensorFlow is an open-source machine learning (ML) library widely used to develop neural networks and ML models. Those models are usually trained on multiple GPU instances to speed up training, resulting in expensive training time and model sizes up to a few gigabytes. After they’re trained, these models are deployed in production to produce inferences. […]

Read More
This dataset contains 500 images of bees that have been uploaded by iNaturalist users for the purposes of recording the observation and identification.

Training and deploying models using TensorFlow 2 with the Object Detection API on Amazon SageMaker

With the rapid growth of object detection techniques, several frameworks with packaged pre-trained models have been developed to provide users easy access to transfer learning. For example, GluonCV, Detectron2, and the TensorFlow Object Detection API are three popular computer vision frameworks with pre-trained models. In this post, we use Amazon SageMaker to build, train, and […]

Read More

AWS and NVIDIA achieve the fastest training times for Mask R-CNN and T5-3B

Note: At the AWS re:Invent Machine Learning Keynote we announced performance records for T5-3B and Mask-RCNN. This blog post includes updated numbers with additional optimizations since the keynote aired live on 12/8. At re:Invent 2019, we demonstrated the fastest training times on the cloud for Mask R-CNN, a popular instance segmentation model, and BERT, a […]

Read More