Artificial Intelligence
Simplify data prep for generative AI with Amazon SageMaker Data Wrangler
Generative artificial intelligence (generative AI) models have demonstrated impressive capabilities in generating high-quality text, images, and other content. However, these models require massive amounts of clean, structured training data to reach their full potential. Most real-world data exists in unstructured formats like PDFs, which requires preprocessing before it can be used effectively. According to IDC, […]
Apply fine-grained data access controls with AWS Lake Formation in Amazon SageMaker Data Wrangler
We are happy to announce that SageMaker Data Wrangler now supports using Lake Formation with Amazon EMR to provide this fine-grained data access restriction.
Publish predictive dashboards in Amazon QuickSight using ML predictions from Amazon SageMaker Canvas
April 2024: This post was reviewed and updated for accuracy. Understanding business trends, customer behavior, sales revenue, increase in demand, and buyer propensity all start with data. Exploring, analyzing, interpreting, and finding trends in data is essential for businesses to achieve successful outcomes. Business analysts play a pivotal role in facilitating data-driven business decisions through […]
Access Snowflake data using OAuth-based authentication in Amazon SageMaker Data Wrangler
In this post, we show how to configure a new OAuth-based authentication feature for using Snowflake in Amazon SageMaker Data Wrangler. Snowflake is a cloud data platform that provides data solutions for data warehousing to data science. Snowflake is an AWS Partner with multiple AWS accreditations, including AWS competencies in machine learning (ML), retail, and […]
Accelerate time to insight with Amazon SageMaker Data Wrangler and the power of Apache Hive
Amazon SageMaker Data Wrangler reduces the time it takes to aggregate and prepare data for machine learning (ML) from weeks to minutes in Amazon SageMaker Studio. Data Wrangler enables you to access data from a wide variety of popular sources (Amazon S3, Amazon Athena, Amazon Redshift, Amazon EMR and Snowflake) and over 40 other third-party sources. […]
Achieve rapid time-to-value business outcomes with faster ML model training using Amazon SageMaker Canvas
Machine learning (ML) can help companies make better business decisions through advanced analytics. Companies across industries apply ML to use cases such as predicting customer churn, demand forecasting, credit scoring, predicting late shipments, and improving manufacturing quality. In this blog post, we’ll look at how Amazon SageMaker Canvas delivers faster and more accurate model training times enabling […]
Prepare data from Amazon EMR for machine learning using Amazon SageMaker Data Wrangler
Data preparation is a principal component of machine learning (ML) pipelines. In fact, it is estimated that data professionals spend about 80 percent of their time on data preparation. In this intensive competitive market, teams want to analyze data and extract more meaningful insights quickly. Customers are adopting more efficient and visual ways to build […]






