Posted On: May 5, 2022
Amazon SageMaker Canvas is a visual point-and-click interface that enables business analysts to generate accurate ML predictions on their own, without requiring any machine learning experience or writing a single line of code. SageMaker Canvas makes it easy to access and combine data from a variety of sources, automatically clean data and apply a variety of data adjustments, and build ML models to generate accurate predictions with a few clicks.
Today, Amazon SageMaker Canvas is announcing new data preparation features that make it easier to manage, explore, and modify datasets before building ML models. Key features include:
- Filtering rows to explore and modify datasets: You can now preview and remove rows with missing values and outliers. You can also specify additional conditions to preview and remove rows from your dataset. For example, for numeric data types, you can specify conditions such as greater than, less than, equal to, between, and more. The list of supported conditions varies by data type and is documented here.
- Expanded timestamp formats and transforms to extract date and time: You can now extract date and time information from a timestamp column and create new columns from it. This makes it easy to prepare and transform your time series data and add the transform to your model recipe with just a few clicks. Additionally, SageMaker Canvas now supports multiple timestamp formats, making it easier to work with time series data for forecasting problems. For a full list of timestamp formats and date and time extraction capabilities, please see here.
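SageMaker Canvas applies these steps through its visual interface, with no code required. For readers who want to see what the two operations amount to, the following is a minimal plain-Python sketch of the same ideas: dropping rows with missing values, keeping rows that satisfy a "between" condition, and extracting date and time parts from a timestamp column into new columns. It is not how Canvas implements them. The sample rows, column names, helper functions, and the two timestamp formats are all hypothetical and chosen only for illustration.

```python
from datetime import datetime

# Hypothetical sample rows; None marks a missing value.
rows = [
    {"order_id": 1, "amount": 25.0, "ordered_at": "2022-05-05 14:30:00"},
    {"order_id": 2, "amount": None, "ordered_at": "2022-05-06 09:15:00"},
    {"order_id": 3, "amount": 9800.0, "ordered_at": "05/07/2022 18:45"},
    {"order_id": 4, "amount": 40.0, "ordered_at": "05/08/2022 07:05"},
]

# Assumed formats for this sketch only; not the list Canvas supports.
TIMESTAMP_FORMATS = ("%Y-%m-%d %H:%M:%S", "%m/%d/%Y %H:%M")


def drop_missing(rows, column):
    """Remove rows whose value in `column` is missing."""
    return [r for r in rows if r[column] is not None]


def keep_between(rows, column, low, high):
    """Keep rows whose value in `column` lies in [low, high]."""
    return [r for r in rows if low <= r[column] <= high]


def parse_timestamp(text):
    """Try each assumed format in turn and return the first successful parse."""
    for fmt in TIMESTAMP_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized timestamp format: {text!r}")


def add_date_parts(rows, column):
    """Return copies of the rows with year, month, day, and hour columns added."""
    out = []
    for r in rows:
        ts = parse_timestamp(r[column])
        out.append({**r, "year": ts.year, "month": ts.month,
                    "day": ts.day, "hour": ts.hour})
    return out


# Filter first, then extract date and time parts from the remaining rows.
filtered = keep_between(drop_missing(rows, "amount"), "amount", 0, 1000)
prepared = add_date_parts(filtered, "ordered_at")
```

Here the row with a missing amount and the row with an out-of-range amount are removed, and the two remaining rows gain `year`, `month`, `day`, and `hour` columns even though their timestamps use different formats.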
In addition to these data preparation and transformation features, you can now rename datasets and columns for ease of use. Other usability updates include improved user-facing messages with recommended actions, and visibility into the dataset cell count before building an ML model.