Posted On: Jun 26, 2023

Amazon SageMaker Data Wrangler now enables direct connection to Snowflake to prepare data and create features for machine learning (ML). SageMaker Data Wrangler reduces the time it takes to aggregate and prepare data for ML from weeks to minutes using a visual interface in Amazon SageMaker Studio. 

Starting today, customers can connect to Snowflake from SageMaker Data Wrangler without providing an Amazon Simple Storage Service (Amazon S3) storage integration or managing durable data copies in S3. This feature reduces the time spent on configuration and simplifies the connection between SageMaker Data Wrangler and Snowflake, making it easy to scale to a large number of users across your organization. You can browse Snowflake databases, schemas, and tables, query their data in SageMaker Data Wrangler, and join the results with data from other popular data sources such as S3, Amazon Athena, Amazon Redshift, Amazon EMR, and over 50 SaaS applications to create the right data set for ML. You can then quickly understand data quality, clean the data, and create features with 300+ built-in analyses and data transformations using SageMaker Data Wrangler’s visual interface. You can also train and deploy models with Amazon SageMaker Autopilot, and automate data preparation in feature engineering, training, or deployment pipelines using Amazon SageMaker Pipelines.

SageMaker Data Wrangler supports direct connection to Snowflake, at no additional charge, in all AWS Regions where SageMaker Data Wrangler is currently available. To learn more, see this blog post and the AWS technical documentation.