Why Amazon SageMaker MLOps
Amazon SageMaker provides purpose-built tools for machine learning operations (MLOps) to help you automate and standardize processes across the ML lifecycle. Using SageMaker MLOps tools, you can easily train, test, troubleshoot, deploy, and govern ML models at scale to boost productivity of data scientists and ML engineers while maintaining model performance in production.
Benefits of SageMaker MLOps
Accelerate model development
Provision standardized data science environments
Standardizing ML development environments increases data scientist productivity and ultimately the pace of innovation by making it easy to launch new projects, rotate data scientists across projects, and implement ML best practices. Amazon SageMaker Projects offers templates to quickly provision standardized data scientist environments with well-tested and up-to-date tools and libraries, source control repositories, boilerplate code, and CI/CD pipelines.
Read the developer guide to automate MLOps with SageMaker Projects
Collaborate using MLflow during ML experimentation
ML model building is an iterative process, involving the training of hundreds of models to find the best algorithm, architecture, and parameters for optimal model accuracy. MLflow enables you to track the inputs and outputs across these training iterations, improving repeatability of trials and fostering collaboration among data scientists. With fully managed MLflow capabilities, you can create MLflow Tracking Servers for each team, facilitating efficient collaboration during ML experimentation.
Amazon SageMaker with MLflow manages the end-to-end machine learning lifecycle, streamlining model training, experiment tracking, and reproducibility across different frameworks and environments. It offers a single interface where you can visualize in-progress training jobs, share experiments with colleagues, and register models directly from an experiment.
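To make "tracking the inputs and outputs across training iterations" concrete, here is a minimal sketch using the open-source MLflow Python API. The helper names (`flatten_params`, `log_trial`) and the experiment name are hypothetical, and the sketch assumes the `MLFLOW_TRACKING_URI` environment variable points at your tracking server. For a SageMaker managed tracking server, that value is typically the server ARN and requires the `sagemaker-mlflow` plugin, which is an assumption here and not stated in the text above.

```python
import os


def flatten_params(config: dict, prefix: str = "") -> dict:
    """Flatten a nested config into dotted keys (MLflow params are flat key/value pairs)."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten_params(value, name))
        else:
            flat[name] = value
    return flat


def log_trial(config: dict, metrics: dict, experiment: str = "demo-experiment") -> str:
    """Record one training iteration (its inputs and outputs) as an MLflow run."""
    import mlflow  # lazy import: only needed when a run is actually logged

    # Assumption: MLFLOW_TRACKING_URI identifies the team's tracking server.
    mlflow.set_tracking_uri(os.environ["MLFLOW_TRACKING_URI"])
    mlflow.set_experiment(experiment)  # hypothetical experiment name
    with mlflow.start_run() as run:
        mlflow.log_params(flatten_params(config))  # inputs: hyperparameters
        mlflow.log_metrics(metrics)                # outputs: evaluation metrics
        return run.info.run_id
```

Calling `log_trial({"model": {"depth": 6}}, {"accuracy": 0.91})` once per training iteration would give each trial its own run, so colleagues can compare parameters and metrics side by side in the MLflow UI.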
Automate GenAI model customization workflows
With Amazon SageMaker Pipelines, you can automate the end-to-end ML workflow of data processing, model training, fine-tuning, evaluation, and deployment. Build your own model or customize a foundation model from SageMaker JumpStart with a few clicks in the Pipelines visual editor. You can configure SageMaker Pipelines to run automatically at regular intervals or when certain events occur (for example, when new training data arrives in Amazon S3).
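One common way to start a pipeline when new training data lands in S3 is an Amazon EventBridge rule that targets the pipeline. The sketch below is illustrative only: the function names, the rule name, and all ARNs are placeholders, and it assumes the bucket has EventBridge notifications enabled and that the supplied IAM role is allowed to start the pipeline.

```python
import json


def s3_upload_event_pattern(bucket: str, prefix: str) -> dict:
    """Build an EventBridge pattern matching new S3 objects under a key prefix."""
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {
            "bucket": {"name": [bucket]},
            "object": {"key": [{"prefix": prefix}]},
        },
    }


def create_pipeline_trigger(rule_name: str, pipeline_arn: str, role_arn: str,
                            bucket: str, prefix: str) -> None:
    """Create a rule that starts the given pipeline on matching uploads."""
    import boto3  # lazy import: only needed when the rule is actually created

    events = boto3.client("events")
    events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(s3_upload_event_pattern(bucket, prefix)),
        State="ENABLED",
    )
    # The target is the pipeline itself; EventBridge assumes role_arn to start it.
    events.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "start-pipeline", "Arn": pipeline_arn, "RoleArn": role_arn}],
    )
```

For a run at regular intervals instead, the same `put_rule` call accepts a `ScheduleExpression` such as `"rate(1 day)"` in place of `EventPattern`.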
Easily deploy and manage models in production
Quickly reproduce your models for troubleshooting
Often, you need to reproduce models in production to troubleshoot model behavior and determine the root cause. To help with this, Amazon SageMaker logs every step of your workflow, creating an audit trail of model artifacts, such as training data, configuration settings, model parameters, and learning gradients. Using lineage tracking, you can recreate models to debug potential issues.
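As a rough illustration of lineage tracking, the sketch below walks upstream from one lineage entity and groups what it finds by type. The function name `upstream_lineage` is hypothetical, the starting ARN is assumed to be a lineage entity such as a model artifact, and only the first page of results is read (pagination through `NextToken` is left out for brevity).

```python
from collections import defaultdict


def upstream_lineage(sm_client, start_arn: str, max_depth: int = 10) -> dict:
    """Group the ancestors of a lineage entity by lineage type.

    sm_client is expected to behave like boto3.client("sagemaker").
    """
    response = sm_client.query_lineage(
        StartArns=[start_arn],
        Direction="Ascendants",  # follow edges back toward data and training steps
        IncludeEdges=False,
        MaxDepth=max_depth,
    )
    grouped = defaultdict(list)
    for vertex in response.get("Vertices", []):
        grouped[vertex.get("LineageType", "Unknown")].append(vertex["Arn"])
    return dict(grouped)
```

With a real client, `upstream_lineage(boto3.client("sagemaker"), artifact_arn)` would return a dictionary keyed by lineage type, which is a reasonable starting point for locating the datasets and training steps behind a model you need to recreate.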
Centrally track and manage model versions
Building an ML application involves developing models, data pipelines, training pipelines, and validation tests. With Amazon SageMaker Model Registry, you can track model versions, their metadata (such as use case grouping), and baselines for model performance metrics in a central repository, making it easier to choose the right model for deployment based on your business requirements. In addition, SageMaker Model Registry automatically logs approval workflows for audit and compliance.
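As a sketch of how a deployment script might choose a model from the registry, the function below looks up the most recently created approved version in a model package group. The function name `latest_approved_model` and the group name in the usage note are hypothetical; the client is assumed to behave like `boto3.client("sagemaker")`.

```python
from typing import Optional


def latest_approved_model(sm_client, group_name: str) -> Optional[str]:
    """Return the ARN of the newest approved model version in a group, or None."""
    response = sm_client.list_model_packages(
        ModelPackageGroupName=group_name,
        ModelApprovalStatus="Approved",  # ignore pending or rejected versions
        SortBy="CreationTime",
        SortOrder="Descending",
        MaxResults=1,
    )
    summaries = response["ModelPackageSummaryList"]
    return summaries[0]["ModelPackageArn"] if summaries else None
```

For example, `latest_approved_model(boto3.client("sagemaker"), "churn-models")` would return `None` until a reviewer approves a version, for instance by calling `update_model_package` with `ModelApprovalStatus="Approved"`.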
Learn more about registering and deploying models with Model Registry