Posted On: Oct 6, 2021
Amazon SageMaker Data Wrangler reduces the time it takes to aggregate and prepare data for machine learning (ML) from weeks to minutes. With SageMaker Data Wrangler, you can simplify data preparation and feature engineering, and complete each step of the data preparation workflow, including data selection, cleansing, exploration, and visualization, from a single visual interface.
Starting today, you can use new capabilities in Amazon SageMaker Data Wrangler that make it easier and faster to prepare data for ML: a new collection of time series transformations and two new time series visualizations that help you quickly generate insights from your time series data. The new time series transformations support missing value imputation, featurization of time series (for example, Fourier coefficients, autocorrelation statistics, and entropy), resampling operators that downsample or upsample datasets to a uniform frequency, time lag features, and rolling window functions. The new transformations also support more general operations such as grouping, unifying length, flattening, and exporting vector-valued columns.
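To make these transformation types concrete, here is a minimal pandas sketch of resampling to a uniform frequency, imputing a missing value, adding a time lag feature, and applying a rolling window function. It illustrates the general concepts only and is not how SageMaker Data Wrangler implements them; the `sales` column, the dates, and the values are hypothetical.

```python
import pandas as pd

# Hypothetical daily sales series with one missing day (2021-10-03).
idx = pd.to_datetime(
    ["2021-10-01", "2021-10-02", "2021-10-04", "2021-10-05", "2021-10-06"]
)
df = pd.DataFrame({"sales": [10.0, 12.0, 16.0, 18.0, 20.0]}, index=idx)

# Resample to a uniform daily frequency; the missing day appears as NaN.
uniform = df.resample("D").mean()

# Impute the missing value by linear interpolation between its neighbors.
filled = uniform.interpolate(method="linear")

# Time lag feature: the previous day's value.
filled["sales_lag_1"] = filled["sales"].shift(1)

# Rolling window function: mean over the current day and the two before it.
filled["sales_roll_mean_3"] = filled["sales"].rolling(window=3).mean()

print(filled)
```

Linear interpolation is only one possible imputation strategy; forward fill or a constant value may suit other datasets better.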
You can also now visualize seasonality and trends in your data and identify anomalies with the new time series visualizations in Amazon SageMaker Data Wrangler. For example, with the seasonality and trend visualization, you can separate seasonal effects from trends in your sales data. With the outlier detection visualization, you can identify outliers within your customer purchase datasets to detect changes in customer purchase behavior.
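As a rough illustration of what separating seasonal effects from trends means, the sketch below applies a classical additive decomposition to synthetic daily data and flags outliers in what remains. The announcement does not specify which algorithms the visualizations use, so the moving-average trend, the per-weekday median, and the z-score threshold of 3 are all assumptions chosen for simplicity, and the data is made up.

```python
import numpy as np
import pandas as pd

period = 7  # assumed weekly seasonality in daily data
n = 56      # eight weeks of synthetic data
rng = np.random.default_rng(0)

# Synthetic series: linear trend + weekly pattern + small noise.
t = np.arange(n)
weekly_pattern = np.array([3.0, 1.0, -2.0, -4.0, -1.0, 0.0, 3.0])  # sums to zero
values = 0.5 * t + weekly_pattern[t % period] + rng.normal(0.0, 0.3, n)
values[30] += 25.0  # inject one anomalous day

sales = pd.Series(values, index=pd.date_range("2021-08-01", periods=n, freq="D"))

# Trend: centered moving average over one full period,
# which averages out the weekly pattern.
trend = sales.rolling(window=period, center=True).mean()

# Seasonality: median detrended value for each position in the week.
detrended = sales - trend
seasonal = detrended.groupby(t % period).transform("median")

# Residual: what is left after removing trend and seasonality.
residual = sales - trend - seasonal

# Flag residuals more than 3 standard deviations from the residual mean.
z = (residual - residual.mean()) / residual.std()
outliers = residual[z.abs() > 3.0].index

print(list(outliers))
```

The median is used for the seasonal component so that a single anomalous day does not distort the estimate for its weekday. The first and last three days have no trend estimate because the centered window is incomplete there.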
To get started with the new capabilities of Amazon SageMaker Data Wrangler, upgrade to the latest release, open Amazon SageMaker Studio, and choose File > New > Flow from the menu or “new data flow” from the SageMaker Studio launcher. To learn more about the new time series transformations and visualizations, see the documentation.