Posted On: Sep 8, 2022

Amazon SageMaker Canvas announces additional capabilities for exploratory data analysis (EDA) with advanced visualizations, enabling you to explore and analyze your data more thoroughly before building machine learning (ML) models. SageMaker Canvas is a visual point-and-click interface that enables business analysts to generate accurate ML predictions on their own, without requiring any machine learning experience or writing a single line of code.

Starting today, Amazon SageMaker Canvas provides new visualizations for EDA that help you understand your data before building a model. These visualizations add to the data preparation and exploration capabilities already offered by Canvas, such as flexible sample sizes, imputing missing values, replacing outliers, filtering, joining, and modifying datasets, and expanded timestamp formats. The visualizations help you analyze the relationships between features in your datasets. They present the data in an easy-to-read, interactive visual format, so you can discover insights that may go unnoticed with ad hoc querying. You can create them quickly through the Data Visualizer within SageMaker Canvas before building and training ML models. The new visualizations include:

  • Scatter Plots: These plots can be used to observe relationships between different numeric variables in your data. Dots are used to present values for two different numeric variables, with the position of each dot indicating the value for a particular data point on the horizontal and vertical axes.
  • Bar Charts: These charts can be used to summarize categorical data, with one bar per category for instant comparison. The height of each bar is proportional to the aggregated value for its category.
  • Box Plots: These plots summarize groups of numeric data through their quartiles, showing how the values in one or more groups are distributed and spread out.
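
Canvas creates these charts without any code, but it can help to see what the bar chart and box plot summarize. The following is a minimal sketch using only the Python standard library. It is not how Canvas computes its charts; the function names and sample data are hypothetical, and the quartile method (inclusive) is an assumption chosen for illustration.

```python
from collections import Counter
from statistics import quantiles


def box_plot_summary(values):
    """Return the five numbers a typical box plot draws."""
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    # The "inclusive" method is an assumption; tools differ on this choice.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


def bar_chart_counts(categories):
    """Count rows per category; each count would be one bar's height."""
    return dict(Counter(categories))


# Hypothetical sample data, not taken from any Canvas dataset.
prices = [10, 12, 14, 16, 18, 20, 22, 24, 26]
segments = ["retail", "wholesale", "retail", "online", "retail"]

summary = box_plot_summary(prices)
counts = bar_chart_counts(segments)
print(summary)
print(counts)
```

Here the count is the aggregation behind each bar; a sum or average per category would work the same way.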

All the EDA capabilities, including the new visualizations, are supported in all AWS Regions where SageMaker Canvas is available. To learn more about Canvas and the supported Regions, and to get started, see the Canvas documentation, the product page, and the FAQ page.