AWS Machine Learning Blog

Category: Artificial Intelligence

Break through language barriers with Amazon Transcribe, Amazon Translate, and Amazon Polly

April 2024: This post was reviewed and updated for accuracy. Imagine a surgeon taking video calls with patients across the globe without the need of a human translator. What if a fledgling startup could easily expand their product across borders and into new geographical markets by offering fluid, accurate, multilingual customer support and sales, all […]

Use Amazon SageMaker Data Wrangler in Amazon SageMaker Studio with a default lifecycle configuration

If you use the default lifecycle configuration for your domain or user profile in Amazon SageMaker Studio and use Amazon SageMaker Data Wrangler for data preparation, then this post is for you. In this post, we show how you can create a Data Wrangler flow and use it for data preparation in a Studio environment […]

Secure Amazon SageMaker Studio presigned URLs Part 1: Foundational infrastructure

You can access Amazon SageMaker Studio notebooks from the Amazon SageMaker console via AWS Identity and Access Management (IAM) authenticated federation from your identity provider (IdP), such as Okta. When a Studio user opens the notebook link, Studio validates the federated user’s IAM policy to authorize access, and generates and resolves the presigned URL for […]

Secure Amazon SageMaker Studio presigned URLs Part 2: Private API with JWT authentication

In part 1 of this series, we demonstrated how to resolve an Amazon SageMaker Studio presigned URL from a corporate network using Amazon private VPC endpoints without traversing the internet. In this post, we will continue to build on top of the previous solution to demonstrate how to build a private API Gateway via Amazon API […]

Use a custom image to bring your own development environment to RStudio on Amazon SageMaker

March 2023: This post was reviewed and updated to comply with service version update.. RStudio on Amazon SageMaker is the industry’s first fully managed RStudio Workbench in cloud. You can quickly launch the familiar RStudio integrated development environment (IDE), and dial up and down the underlying compute resources without interrupting your work, making it easy […]

Text classification for online conversations with machine learning on AWS

Online conversations are ubiquitous in modern life, spanning industries from video games to telecommunications. This has led to an exponential growth in the amount of online conversation data, which has helped in the development of state-of-the-art natural language processing (NLP) systems like chatbots and natural language generation (NLG) models. Over time, various NLP techniques for […]

Hyperparameter optimization for fine-tuning pre-trained transformer models from Hugging Face

Large attention-based transformer models have obtained massive gains on natural language processing (NLP). However, training these gigantic networks from scratch requires a tremendous amount of data and compute. For smaller NLP datasets, a simple yet effective strategy is to use a pre-trained transformer, usually trained in an unsupervised fashion on very large datasets, and fine-tune […]

Diagnose model performance before deployment for Amazon Fraud Detector

With the growth in adoption of online applications and the rising number of internet users, digital fraud is on the rise year over year. Amazon Fraud Detector provides a fully managed service to help you better identify potentially fraudulent online activities using advanced machine learning (ML) techniques, and more than 20 years of fraud detection […]

Create audio for content in multiple languages with the same TTS voice persona in Amazon Polly

Amazon Polly is a leading cloud-based service that converts text into lifelike speech. Following the adoption of Neural Text-to-Speech (NTTS), we have continuously expanded our portfolio of available voices in order to provide a wide selection of distinct speakers in supported languages. Today, we are pleased to announce four new additions: Pedro speaking US Spanish, […]

New built-in Amazon SageMaker algorithms for tabular data modeling: LightGBM, CatBoost, AutoGluon-Tabular, and TabTransformer

July 2023: You can also use the newly launched JumpStart APIs, an extension of the SageMaker Python SDK. These APIs allow you to programmatically deploy and fine-tune a vast selection of JumpStart-supported pre-trained models on your own datasets. Please refer to Amazon SageMaker JumpStart models and algorithms now available via API for more details on how […]