Posted On: Jun 8, 2021
Amazon SageMaker Data Wrangler reduces the time it takes to aggregate and prepare data for machine learning (ML) from weeks to minutes. With SageMaker Data Wrangler, you can simplify data preparation and feature engineering, and complete each step of the data preparation workflow, including data selection, cleansing, exploration, and visualization, from a single visual interface. Starting today, you can use Snowflake as a data source in Amazon SageMaker Data Wrangler to prepare data in Snowflake for ML.
With Snowflake as a data source for Amazon SageMaker Data Wrangler, you can connect to Snowflake without writing a single line of code. You can also join your data in Snowflake with data stored in Amazon S3 and with data queried through Amazon Athena and Amazon Redshift to prepare data for ML. Once connected, you can interactively query data stored in Snowflake, transform it with 300+ pre-configured data transformations, and use a set of pre-configured visualization templates to understand your data and identify potential errors and extreme values. You can also identify inconsistencies in your data preparation workflow and diagnose issues before models are deployed into production. Finally, you can export your data preparation workflow to Amazon S3 for use with other SageMaker features such as Amazon SageMaker Autopilot, Amazon SageMaker Feature Store, and Amazon SageMaker Pipelines.
To learn more about Snowflake integration with Amazon SageMaker Data Wrangler, see the blog post. To get started with Amazon SageMaker Data Wrangler, visit our documentation and webpage.