AWS Machine Learning Blog

Category: Amazon Athena

Use Amazon SageMaker Canvas to build machine learning models using Parquet data from Amazon Athena and AWS Lake Formation

Data is the foundation for machine learning (ML) algorithms. One of the most common formats for storing large amounts of data is Apache Parquet, because it is compact and highly efficient. This means that business analysts who want to extract insights from the large volumes of data in their data warehouse must frequently use […]

Transform, analyze, and discover insights from unstructured healthcare data using Amazon HealthLake

Healthcare data is complex and siloed, and exists in various formats. An estimated 80% of data within organizations is considered to be unstructured or “dark” data that is locked inside text, emails, PDFs, and scanned documents. This data is difficult to interpret or analyze programmatically and limits how organizations can derive insights from it and […]

How Marubeni is optimizing market decisions using AWS machine learning and analytics

This post is co-authored with Hernan Figueroa, Sr. Manager Data Science at Marubeni Power International. Marubeni Power International Inc. (MPII) owns and invests in power business platforms in the Americas. An important vertical for MPII is asset management for renewable energy and energy storage assets, which are critical to reducing the carbon intensity of our […]

Extract non-PHI data from Amazon HealthLake, reduce complexity, and increase cost efficiency with Amazon Athena and Amazon SageMaker Canvas

In today’s highly competitive market, performing data analytics using machine learning (ML) models has become a necessity for organizations. It enables them to unlock the value of their data, identify trends and patterns, make predictions, and differentiate themselves from their competitors. For example, in the healthcare industry, ML-driven analytics can be used for diagnostic assistance and […]

Automate your time series forecasting in Snowflake using Amazon Forecast

This post is a joint collaboration with Andries Engelbrecht and James Sun of Snowflake, Inc. The cloud computing revolution has enabled businesses to capture and retain corporate and organizational data without capacity planning or data retention constraints. Now, with diverse and vast reserves of longitudinal data, companies are increasingly able to find novel and impactful […]

Configure a custom Amazon S3 query output location and data retention policy for Amazon Athena data sources in Amazon SageMaker Data Wrangler

Amazon SageMaker Data Wrangler reduces the time that it takes to aggregate and prepare data for machine learning (ML) from weeks to minutes in Amazon SageMaker Studio, the first fully integrated development environment (IDE) for ML. With Data Wrangler, you can simplify the process of data preparation and feature engineering, and complete each step of […]

Receive notifications for image analysis with Amazon Rekognition Custom Labels and analyze predictions

Amazon Rekognition Custom Labels is a fully managed computer vision service that allows developers to build custom models to classify and identify objects in images that are specific and unique to your business. Rekognition Custom Labels doesn’t require you to have any prior computer vision expertise. You can get started by simply uploading tens of […]

Build a system for catching adverse events in real-time using Amazon SageMaker and Amazon QuickSight

Social media platforms provide a channel of communication for consumers to talk about various products, including the medications they take. For pharmaceutical companies, monitoring and effectively tracking product performance provides customer feedback about the product, which is vital to maintaining and improving patient safety. However, when an unexpected medical occurrence resulting from a pharmaceutical product […]

Translate, redact, and analyze text using SQL functions with Amazon Athena, Amazon Translate, and Amazon Comprehend

October 2021 Update (v0.3.0): Added support for Amazon Comprehend DetectKeyPhrases. You have Amazon Simple Storage Service (Amazon S3) buckets full of files containing incoming customer chats, product reviews, and social media feeds, in many languages. Your task is to identify the products that people are talking about, determine if they’re expressing happy thoughts or sad […]

Control and audit data exploration activities with Amazon SageMaker Studio and AWS Lake Formation

Certain industries are required to audit all access to their data. This includes auditing exploratory activities performed by data scientists, who usually query data from within machine learning (ML) notebooks. This post walks you through the steps to implement access control and auditing capabilities on a per-user basis, using Amazon SageMaker Studio notebooks and AWS […]