AWS Machine Learning Blog

Category: Technical How-to

How to evaluate the quality of the synthetic data – measuring from the perspective of fidelity, utility, and privacy

In an increasingly data-centric world, enterprises must focus on both gathering valuable physical information and generating the information that they need but can’t easily capture. Data access, regulation, and compliance are an increasing source of friction for innovation in analytics and artificial intelligence (AI). For highly regulated sectors such as Financial Services, Healthcare, Life Sciences, […]

Augment fraud transactions using synthetic data in Amazon SageMaker

Developing and training successful machine learning (ML) fraud models requires access to large amounts of high-quality data. Sourcing this data is challenging because available datasets are sometimes not large enough or sufficiently unbiased to usefully train the ML model, and obtaining them may require significant cost and time. Regulation and privacy requirements further prevent data use or […]

Automatically identify languages in multi-lingual audio using Amazon Transcribe

If you operate in a country with multiple official languages or across multiple regions, your audio files can contain different languages. Participants may be speaking entirely different languages or may switch between languages. Consider a customer service call to report a problem in an area with a substantial multi-lingual population. Although the conversation could begin […]

Deploy Amazon SageMaker Autopilot models to serverless inference endpoints

Amazon SageMaker Autopilot automatically builds, trains, and tunes the best machine learning (ML) models based on your data, while allowing you to maintain full control and visibility. Autopilot can also deploy trained models to real-time inference endpoints automatically. If you have workloads with spiky or unpredictable traffic patterns that can tolerate cold starts, then deploying […]

Improve scalability for Amazon Rekognition stateless APIs using multiple regions

In a previous blog post, we described an end-to-end identity verification solution in a single AWS Region. The solution uses the Amazon Rekognition APIs DetectFaces for face detection and CompareFaces for face comparison. We think of those APIs as stateless APIs because they don’t depend on an Amazon Rekognition face collection. They’re also idempotent, meaning repeated […]

Use your own training scripts and automatically select the best model using hyperparameter optimization in Amazon SageMaker

The success of any machine learning (ML) pipeline depends not just on the quality of the model used, but also on the ability to train and iterate upon this model. One of the key ways to improve an ML model is by choosing better tunable parameters, known as hyperparameters. This process is known as hyperparameter optimization (HPO). However, […]

Build a robust text-based toxicity predictor

With the growth and popularity of online social platforms, people can stay more connected than ever through tools like instant messaging. However, this raises an additional concern about toxic speech, as well as cyber bullying, verbal harassment, or humiliation. Content moderation is crucial for promoting healthy online discussions and creating healthy online environments. To detect […]

Optimize hyperparameters with Amazon SageMaker Automatic Model Tuning

Machine learning (ML) models are taking the world by storm. Their performance relies on using the right training data and choosing the right model and algorithm. But it doesn’t end here. Typically, algorithms defer some design decisions to the ML practitioner to adopt for their specific data and task. These deferred design decisions manifest themselves […]

Implementing Amazon Forecast in the retail industry: A journey from POC to production

Amazon Forecast is a fully managed service that uses statistical and machine learning (ML) algorithms to deliver highly accurate time-series forecasts. Recently, based on Amazon Forecast, we helped one of our retail customers achieve accurate demand forecasting within 8 weeks. The solution improved the manual forecast by an average of 10% with regard to the […]

Build a cross-account MLOps workflow using the Amazon SageMaker model registry

A well-designed CI/CD pipeline is essential to scale any software development workflow effectively. When designing production CI/CD pipelines, AWS recommends leveraging multiple accounts to isolate resources, contain security threats, and simplify billing, and data science pipelines are no different. At AWS, we’re continuing to innovate to simplify the MLOps workflow. In this post, we discuss some […]