Posted On: Nov 29, 2023
Today, we are excited to announce the general availability of a simplified developer experience for Amazon SageMaker Pipelines. The improved Python SDK enables you to build machine learning (ML) workflows quickly with familiar Python syntax. Key features of the SDK include a new Python decorator (@step) for custom steps, a Notebook Jobs step type, and a workflow scheduler.
ML development often starts with monolithic Python code written for experimentation in a local development environment (e.g., Jupyter notebooks), before you decide to automate its execution through decoupled pipeline steps. With the new Amazon SageMaker Pipelines developer experience, you can convert your ML code into an automated Directed Acyclic Graph (DAG) of ML steps in a few minutes. To create an ML workflow, annotate your existing Python functions with the @step decorator and pass the final step to the pipeline creation API. Amazon SageMaker automatically interprets the dependencies between the annotated Python functions, creates a custom pipeline step for each of them, and generates the pipeline DAG. If your ML code is spread across multiple Python notebooks, you can chain them together to orchestrate a workflow of Notebook Jobs. Later, if you want to run the workflow on a recurring basis, you can configure an execution schedule with a single function call in the new Python SDK.
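To make the idea of inferring a DAG from decorated functions concrete, here is a minimal, pure-Python sketch. It is not the SageMaker SDK and does not call any AWS service: the `step` decorator, `DelayedCall`, `build_dag`, and `run` below are hypothetical names written only for illustration. The sketch assumes that calling a decorated function records the call instead of executing it, so that passing one step's return value into another is enough to reveal the dependency between them.

```python
import functools


class DelayedCall:
    """A recorded function call; nothing executes until run() is invoked."""

    def __init__(self, func, args):
        self.func = func
        self.args = args

    @property
    def name(self):
        return self.func.__name__

    def upstream(self):
        # Any argument that is itself a DelayedCall is a dependency.
        return [a for a in self.args if isinstance(a, DelayedCall)]


def step(func):
    """Toy stand-in for a @step decorator (hypothetical, not the real SDK)."""

    @functools.wraps(func)
    def wrapper(*args):
        return DelayedCall(func, args)

    return wrapper


def build_dag(final):
    """Walk back from the final step; map each step name to its upstream names."""
    dag = {}
    stack = [final]
    while stack:
        node = stack.pop()
        if node.name in dag:
            continue
        dag[node.name] = [u.name for u in node.upstream()]
        stack.extend(node.upstream())
    return dag


def run(final, cache=None):
    """Execute the graph, resolving dependencies first; each step runs once."""
    cache = {} if cache is None else cache
    if id(final) not in cache:
        resolved = [
            run(a, cache) if isinstance(a, DelayedCall) else a
            for a in final.args
        ]
        cache[id(final)] = final.func(*resolved)
    return cache[id(final)]


@step
def preprocess(raw):
    return [x * 2 for x in raw]


@step
def train(data):
    return sum(data) / len(data)


@step
def evaluate(model, data):
    return max(data) - model


# Ordinary-looking function calls wire the steps together.
data = preprocess([1, 2, 3])
model = train(data)
final = evaluate(model, data)

dag = build_dag(final)   # only the final step is needed to recover the graph
result = run(final)
print(dag)
print(result)
```

Only the last step is handed to `build_dag`, which mirrors the announcement's point that passing the final step is enough to recover the whole workflow. In the real service each step runs as its own job rather than in-process, so treat this purely as a mental model of the dependency tracking.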
To get started, create an ML workflow using one of the pre-built sample notebooks on GitHub and visualize it in the Amazon SageMaker Studio UI. Visit the Amazon SageMaker Pipelines developer guide for additional information.