AWS Machine Learning Blog

Hyperparameter optimization for fine-tuning pre-trained transformer models from Hugging Face

Large attention-based transformer models have obtained massive gains on natural language processing (NLP). However, training these gigantic networks from scratch requires a tremendous amount of data and compute. For smaller NLP datasets, a simple yet effective strategy is to use a pre-trained transformer, usually trained in an unsupervised fashion on very large datasets, and fine-tune […]

Diagnose model performance before deployment for Amazon Fraud Detector

With the growth in adoption of online applications and the rising number of internet users, digital fraud is on the rise year over year. Amazon Fraud Detector provides a fully managed service to help you better identify potentially fraudulent online activities using advanced machine learning (ML) techniques, and more than 20 years of fraud detection […]

Create audio for content in multiple languages with the same TTS voice persona in Amazon Polly

Amazon Polly is a leading cloud-based service that converts text into lifelike speech. Following the adoption of Neural Text-to-Speech (NTTS), we have continuously expanded our portfolio of available voices in order to provide a wide selection of distinct speakers in supported languages. Today, we are pleased to announce four new additions: Pedro speaking US Spanish, […]

New built-in Amazon SageMaker algorithms for tabular data modeling: LightGBM, CatBoost, AutoGluon-Tabular, and TabTransformer

July 2023: You can also use the newly launched JumpStart APIs, an extension of the SageMaker Python SDK. These APIs allow you to programmatically deploy and fine-tune a vast selection of JumpStart-supported pre-trained models on your own datasets. Please refer to Amazon SageMaker JumpStart models and algorithms now available via API for more details on how […]

Semantic segmentation data labeling and model training using Amazon SageMaker

In computer vision, semantic segmentation is the task of classifying every pixel in an image with a class from a known set of labels such that pixels with the same label share certain characteristics. It generates a segmentation mask of the input images. For example, the following images show a segmentation mask of the cat […]

Deep demand forecasting with Amazon SageMaker

Every business needs the ability to predict the future accurately in order to make better decisions and give the company a competitive advantage. With historical data, businesses can understand trends, make predictions of what might happen and when, and incorporate that information into their future plans, from product demand to inventory planning and staffing. If […]

Inspect your data labels with a visual, no code tool to create high-quality training datasets with Amazon SageMaker Ground Truth Plus

Launched at AWS re:Invent 2021, Amazon SageMaker Ground Truth Plus helps you create high-quality training datasets by removing the undifferentiated heavy lifting associated with building data labeling applications and managing the labeling workforce. All you do is share data along with labeling requirements, and Ground Truth Plus sets up and manages your data labeling workflow […]

Choose specific timeseries to forecast with Amazon Forecast

Today, we’re excited to announce that Amazon Forecast offers the ability to generate forecasts on a selected subset of items. This helps you to leverage the full value of your data, and apply it selectively on your choice of items reducing the time and effort to get forecasted results. Generating a forecast on ‘all’ items of the […]

Improve ML developer productivity with Weights & Biases: A computer vision example on Amazon SageMaker

July 2023: This post was reviewed for accuracy. This post is co-written with Thomas Capelle at Weights & Biases. As more organizations use deep learning techniques such as computer vision and natural language processing, the machine learning (ML) developer persona needs scalable tooling around experiment tracking, lineage, and collaboration. Experiment tracking includes metadata such as […]

How Cepsa used Amazon SageMaker and AWS Step Functions to industrialize their ML projects and operate their models at scale

This blog post is co-authored by Guillermo Ribeiro, Sr. Data Scientist at Cepsa. Machine learning (ML) has rapidly evolved from being a fashionable trend emerging from academic environments and innovation departments to becoming a key means to deliver value across businesses in every industry. This transition from experiments in laboratories to solving real-world problems in […]