Posted On: Jun 7, 2023
Today, AWS announced a new feature in SageMaker Pipelines, the ML workflow management service, that lets users run chosen steps of a pipeline as a sub-workflow. The new feature, called Selective Execution, allows you to run only the steps you select instead of rerunning the entire pipeline. As a Data Scientist, Applied Scientist, or ML Engineer iterating on a pipeline for experimentation and deployment of ML models at scale, you can use this feature to start a pipeline execution on just the steps you need, which can save hours of processing time and simplify managing the code used for executions.
When iterating on your ML model workflow in SageMaker Pipelines, you can use Selective Execution to try various configurations of run-time parameters such as instance type and count. You select the steps to run and provide any past execution of the pipeline as a reference. The outputs of the non-selected steps are taken automatically from the reference execution, so those steps are not rerun. As a result, selective executions help you save time and infrastructure costs when you run the workflow over multiple iterations during the experimentation and production stages of an ML model.
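As a rough illustration of the select-steps-plus-reference-execution idea described above, the sketch below builds the request for a selective execution. It assumes the request shape of the low-level SageMaker `StartPipelineExecution` API (`SelectiveExecutionConfig`, `SourcePipelineExecutionArn`, `SelectedSteps`, `PipelineParameters`) rather than the higher-level Python SDK. The helper name, pipeline name, ARN, step names, and the `TrainingInstanceType` parameter are all hypothetical placeholders, not values from this announcement.

```python
from typing import Any, Dict, List, Optional


def build_selective_execution_request(
    pipeline_name: str,
    reference_execution_arn: str,
    selected_steps: List[str],
    parameters: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a selective pipeline execution.

    Hypothetical helper: it only assembles a dictionary and makes no AWS calls.
    """
    if not selected_steps:
        raise ValueError("select at least one step to rerun")

    request: Dict[str, Any] = {
        "PipelineName": pipeline_name,
        "SelectiveExecutionConfig": {
            # Past execution whose outputs stand in for the non-selected steps.
            "SourcePipelineExecutionArn": reference_execution_arn,
            # Only these steps are rerun.
            "SelectedSteps": [{"StepName": name} for name in selected_steps],
        },
    }
    if parameters:
        # Run-time parameter overrides, for example a different instance type.
        request["PipelineParameters"] = [
            {"Name": name, "Value": value} for name, value in parameters.items()
        ]
    return request


# Placeholder values for illustration only.
request = build_selective_execution_request(
    pipeline_name="example-pipeline",
    reference_execution_arn=(
        "arn:aws:sagemaker:us-east-1:111122223333:"
        "pipeline/example-pipeline/execution/exampleexecutionid"
    ),
    selected_steps=["TrainModel", "EvaluateModel"],
    parameters={"TrainingInstanceType": "ml.m5.xlarge"},
)

# To actually start the execution (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("sagemaker").start_pipeline_execution(**request)
```

Keeping the request construction separate from the AWS call makes it easy to vary the selected steps or parameters between iterations and to inspect the request before anything runs.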
You can run selective executions in SageMaker Studio notebooks through the SageMaker Python SDK and collaborate using shareable, repeatable code. The new feature is available in all public AWS Regions where SageMaker Pipelines is available. To learn more, see the Amazon SageMaker Pipelines documentation and the Selective Execution section of the developer guide.