AWS Machine Learning Blog

Tag: Amazon SageMaker

Introduction to the Amazon SageMaker Neural Topic Model

Structured and unstructured data are being generated at an unprecedented rate. Without the right tools to help organize, search, and understand this vast amount of information, it’s challenging to make the data useful. This is especially true for unstructured data, and it’s estimated that over 80% of the data in enterprises is unstructured. Text analytics […]

AWS internal use-case: Evaluating and adopting Amazon SageMaker within AWS Marketing

We’re the AWS Marketing Data Science team. We use advanced analytical and machine learning (ML) techniques so we can share insights into business problems across the AWS customer lifecycle, such as ML-driven scoring of sales leads, ML-based targeting segments, and econometric models for downstream impact measurement. Within Amazon, each team operates independently and owns the […]

Amazon SageMaker console now supports training job cloning

Today we are launching the training job cloning feature on the Amazon SageMaker console, which makes it much easier for you to create training jobs based on existing ones. When you use Amazon SageMaker, it’s common to run multiple training jobs using different training sets and identical configuration. It’s also common to adjust a specific […]

Using R with Amazon SageMaker

July 2022: This post was reviewed and updated for relevancy and accuracy, with an updated AWS CloudFormation template. December 2020: Post updated with changes required for Amazon SageMaker SDK v2. This blog post describes how to train, deploy, and retrieve predictions from a machine learning (ML) model using Amazon SageMaker and R. The model predicts abalone age […]

Using Pipe input mode for Amazon SageMaker algorithms

Today, we are introducing Pipe input mode support for the Amazon SageMaker built-in algorithms. With Pipe input mode, your dataset is streamed directly to your training instances instead of being downloaded first. This means that your training jobs start sooner, finish quicker, and need less disk space. Amazon SageMaker algorithms have been engineered to be […]

Perform a large-scale principal component analysis faster using Amazon SageMaker

In this blog post, we conduct a performance comparison for PCA using Amazon SageMaker, Spark ML, and Scikit-Learn on high-dimensional datasets. SageMaker consistently showed faster computational performance. Refer to Figures (1) and (2) at the bottom to see the speed improvements. Principal Component Analysis Principal Component Analysis (PCA) is an unsupervised learning algorithm that attempts to […]

Running fast.ai notebooks with Amazon SageMaker

Update 25 JAN 2019: fast.ai has released a new version of their library and MOOC, making the following blog post outdated. For the latest instructions on setting up the library and course on a SageMaker Notebook instance, please refer to the instructions outlined at https://course.fast.ai/start_sagemaker.html. fast.ai is an organization dedicated to making the power of deep learning accessible […]

Create a Word-Pronunciation sequence-to-sequence model using Amazon SageMaker

Amazon SageMaker seq2seq offers you a very simple way to make use of the state-of-the-art encoder-decoder architecture (including the attention mechanism) for your sequence-to-sequence tasks. You just need to prepare your sequence data in recordio-protobuf format and your vocabulary mapping files in JSON format. Then you need to upload them to Amazon Simple […]