Posted On: Sep 1, 2022
SageMaker Autopilot automatically builds, trains, and tunes the best machine learning models based on your data, while allowing you to maintain full control and visibility. Starting today, when you create an Autopilot experiment to train a machine learning model, you can customize how the data is split between training and validation. By default, Autopilot reserves 80 percent of the specified dataset for training and 20 percent for validation. With this release, you can customize the training and validation split percentages or, alternatively, provide two datasets, one for training and another for validation. This feature is available in both Amazon SageMaker Studio and the SageMaker Autopilot API.
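For API users, the two options above might look roughly like the following sketch. It builds the request for the `CreateAutoMLJob` operation using the `DataSplitConfig.ValidationFraction` field and the `ChannelType` field on input channels. The helper `build_autopilot_request`, the bucket paths, the role ARN, and the column name are hypothetical placeholders. Treating the two options as mutually exclusive and rejecting fractions outside the open interval (0, 1) are assumptions of this sketch, based on the "alternatively" wording above, not confirmed service behavior.

```python
def build_autopilot_request(job_name, role_arn, output_s3_uri, target_column,
                            train_s3_uri, validation_s3_uri=None,
                            validation_fraction=None):
    """Build keyword arguments for the SageMaker CreateAutoMLJob operation.

    Hypothetical helper: pass either a separate validation dataset or a
    custom validation fraction. With neither, Autopilot's default split applies.
    """
    # Assumption of this sketch: the two options are mutually exclusive.
    if validation_s3_uri is not None and validation_fraction is not None:
        raise ValueError("Pass a validation dataset or a validation fraction, not both")
    # Assumption of this sketch: a local sanity check on the fraction.
    if validation_fraction is not None and not 0 < validation_fraction < 1:
        raise ValueError("validation_fraction must be between 0 and 1")

    def channel(s3_uri, channel_type):
        # One input channel; ChannelType marks it as training or validation data.
        return {
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": s3_uri}
            },
            "TargetAttributeName": target_column,
            "ChannelType": channel_type,
        }

    request = {
        "AutoMLJobName": job_name,
        "InputDataConfig": [channel(train_s3_uri, "training")],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "RoleArn": role_arn,
    }
    if validation_s3_uri is not None:
        # Option 2: bring a separate validation dataset.
        request["InputDataConfig"].append(channel(validation_s3_uri, "validation"))
    elif validation_fraction is not None:
        # Option 1: customize the share of the dataset held out for validation.
        request["AutoMLJobConfig"] = {
            "DataSplitConfig": {"ValidationFraction": validation_fraction}
        }
    return request


# Example: hold out 30 percent of a single dataset for validation.
# All names and paths below are placeholders.
request = build_autopilot_request(
    job_name="example-autopilot-job",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    output_s3_uri="s3://example-bucket/autopilot-output/",
    target_column="label",
    train_s3_uri="s3://example-bucket/data/train.csv",
    validation_fraction=0.3,
)

# To submit the job (requires boto3 and AWS credentials):
# import boto3
# boto3.client("sagemaker").create_auto_ml_job(**request)
```

Building the request as a plain dictionary keeps the sketch runnable without AWS credentials; the commented lines at the end show how it could be passed to boto3.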
To make selecting training and validation datasets more efficient, this release also includes an improved user interface with a friendlier S3 browsing experience and a guided, step-by-step workflow that gives you full control over and visibility into the advanced settings.
To get started, update Amazon SageMaker Studio to the latest release and launch SageMaker Autopilot from either the SageMaker Studio Launcher or the Amazon SageMaker Data Wrangler Train Model workflow. To learn how to update Studio, see the documentation.
These new features and experiences are now available in all AWS Regions where SageMaker Autopilot is available. To get started, see Creating an Experiment with Autopilot and the SageMaker Autopilot API reference. To learn more, visit the SageMaker Autopilot product page.