Artificial Intelligence
Category: Analytics
Prepare data from Snowflake for machine learning with Amazon SageMaker Data Wrangler
Data preparation remains a major challenge in the machine learning (ML) space. Data scientists and engineers need to write queries and code to get data from source data stores, and then write queries to transform this data and create features for model development and training. All of this data pipeline development […]
Build regression models with Amazon Redshift ML
June 2023: This post was reviewed and updated for accuracy. With the rapid growth of data, many organizations are finding it difficult to analyze their large datasets to gain insights. As businesses rely more and more on automation algorithms, machine learning (ML) has become a necessity to stay ahead of the competition. Amazon Redshift, a […]
Build BI dashboards for your Amazon SageMaker Ground Truth labels and worker metadata
This is the second in a two-part series on the Amazon SageMaker Ground Truth hierarchical labeling workflow and dashboards. In Part 1: Automate multi-modality, parallel data labeling workflows with Amazon SageMaker Ground Truth and AWS Step Functions, we looked at how to create multi-step labeling workflows for hierarchical label taxonomies using AWS Step Functions. In […]
How Genworth built a serverless ML pipeline on AWS using Amazon SageMaker and AWS Glue
This post is co-written with Liam Pearson, a Data Scientist at Genworth Mortgage Insurance Australia Limited. Genworth Mortgage Insurance Australia Limited is a leading provider of lenders mortgage insurance (LMI) in Australia; their shares are traded on the Australian Stock Exchange as ASX: GMA. Genworth Mortgage Insurance Australia Limited is a lenders mortgage insurer with over […]
Intelligent governance of document processing pipelines for regulated industries
Processing large documents like PDFs and static images is a cornerstone of today’s highly regulated industries. From healthcare information like doctor-patient visits and bills of health, to financial documents like loan applications, tax filings, research reports, and regulatory filings, these documents are integral to how these industries conduct business. The mechanisms by which these documents […]
Enable cross-account access for Amazon SageMaker Data Wrangler using AWS Lake Formation
Amazon SageMaker Data Wrangler is the fastest and easiest way for data scientists to prepare data for machine learning (ML) applications. With Data Wrangler, you can simplify the process of feature engineering and complete each step of the data preparation workflow, including data selection, cleansing, exploration, and visualization through a single visual interface. Data Wrangler […]
Perform interactive data processing using Spark in Amazon SageMaker Studio Notebooks
Amazon SageMaker Studio is the first fully integrated development environment (IDE) for machine learning (ML). With a single click, data scientists and developers can quickly spin up Studio notebooks to explore datasets and build models. You can now use Studio notebooks to securely connect to Amazon EMR clusters and prepare vast amounts of data for […]
Translate, redact, and analyze text using SQL functions with Amazon Athena, Amazon Translate, and Amazon Comprehend
October 2021 Update (v0.3.0): Added support for Amazon Comprehend DetectKeyPhrases. You have Amazon Simple Storage Service (Amazon S3) buckets full of files containing incoming customer chats, product reviews, and social media feeds, in many languages. Your task is to identify the products that people are talking about, determine if they’re expressing happy thoughts or sad […]
Setting up Amazon Personalize with AWS Glue
Data can be used in a variety of ways to satisfy the needs of different business units, such as marketing, sales, or product. In this post, we focus on using data to create personalized recommendations to improve end-user engagement. Most ecommerce applications consume a huge amount of customer data that can be used to provide […]
Applying voice classification in an Amazon Connect telemedicine contact flow
Given the rising demand for fast and effective COVID-19 detection, customers are exploring the usage of respiratory sound data, like coughing, breathing, and counting, to automatically diagnose COVID-19 based on machine learning (ML) models. University of Cambridge researchers built a COVID-19 sound application and demonstrated that a simple binary ML classifier can classify healthy and […]