AWS Machine Learning Blog

Category: Amazon SageMaker

Building predictive disease models using Amazon SageMaker with Amazon HealthLake normalized data

In this post, we walk you through the steps to build machine learning (ML) models in Amazon SageMaker with data stored in Amazon HealthLake, using two example predictive disease models that we trained on sample data from the MIMIC-III dataset. This dataset was developed by the MIT Laboratory for Computational Physiology and consists of de-identified healthcare […]

Read More

Using machine learning to predict vessel time of arrival with Amazon SageMaker

According to the International Chamber of Shipping, 90% of world commerce happens at sea. Vessels transport every possible kind of commodity, including raw materials and semi-finished and finished goods, making ocean transportation a key component of the global supply chain. Manufacturers, retailers, and end consumers are reliant on hundreds of thousands of ships […]

Read More

Creating high-quality machine learning models for financial services using Amazon SageMaker Autopilot

Machine learning (ML) is used throughout the financial services industry to perform a wide variety of tasks, such as fraud detection, market surveillance, portfolio optimization, loan solvency prediction, direct marketing, and many others. This breadth of use cases has created a need for lines of business to quickly generate high-quality and performant models that can […]

Read More

How to train procedurally generated game-like environments at scale with Amazon SageMaker RL

A gym is a toolkit for developing and comparing reinforcement learning algorithms. Procgen Benchmark is a suite of 16 procedurally generated gym environments designed to benchmark both sample efficiency and generalization in reinforcement learning. These environments are associated with the paper Leveraging Procedural Generation to Benchmark Reinforcement Learning (citation). Compared to Gym Retro, these environments have […]

Read More

Hosting a private PyPI server for Amazon SageMaker Studio notebooks in a VPC

Amazon SageMaker Studio notebooks provide a full-featured integrated development environment (IDE) for flexible machine learning (ML) experimentation and development. Security measures protect and support this versatile, collaborative environment. In some cases, such as to protect sensitive data or meet regulatory requirements, security protocols require that public internet access be disabled in the development environment. […]

Read More

Artificial intelligence and machine learning continues at AWS re:Invent

A fresh new year is here, and we wish you all a wonderful 2021. We signed off last year at AWS re:Invent on the artificial intelligence (AI) and machine learning (ML) track with the first ever machine learning keynote and over 50 AI/ML focused technical sessions covering industries, use cases, applications, and more. You can […]

Read More

Accelerating MLOps at Bayer Crop Science with Kubeflow Pipelines and Amazon SageMaker

This is a guest post by the data science team at Bayer Crop Science. Farmers have always collected and evaluated a large amount of data with each growing season: seeds planted, crop protection inputs applied, crops harvested, and much more. The rise of data science and digital technologies provides farmers with a wealth of new […]

Read More

Implementing a custom labeling GUI with built-in processing logic with Amazon SageMaker Ground Truth

Amazon SageMaker Ground Truth is a fully managed data labeling service that makes it easy to build highly accurate training datasets for machine learning. It offers easy access to Amazon Mechanical Turk and private human labelers, and provides them with built-in workflows and interfaces for common labeling tasks. A labeling team may wish to use […]

Read More

Extracting buildings and roads from AWS Open Data using Amazon SageMaker

Sharing data and computing in the cloud allows data users to focus on data analysis rather than data access. Open Data on AWS helps you discover and share public open datasets in the cloud. The Registry of Open Data on AWS hosts a large amount of public open data. The datasets range from genomics to climate to transportation […]

Read More

Control and audit data exploration activities with Amazon SageMaker Studio and AWS Lake Formation

Certain industries are required to audit all access to their data. This includes auditing exploratory activities performed by data scientists, who usually query data from within machine learning (ML) notebooks. This post walks you through the steps to implement access control and auditing capabilities on a per-user basis, using Amazon SageMaker Studio notebooks and AWS […]

Read More