How it works
Amazon SageMaker helps you streamline the machine learning (ML) lifecycle by automating and standardizing MLOps practices across your organization. You can build, train, deploy, and manage ML models, whether you have only a few, hundreds of thousands, or even millions. With purpose-built tools for ML lifecycle management and built-in integrations with other AWS services, you can boost the productivity of data scientists and ML engineers while maintaining high model accuracy and enhancing security and compliance.
Automate ML workflows to scale model development
Amazon SageMaker Pipelines is a fully managed feature that helps you automate and orchestrate different steps of the ML workflow, including data loading, data transformation, model building, training, and tuning. With SageMaker Pipelines, you can process massive amounts of training data, run large-scale experiments, and build and retrain models at any scale. Share and reuse workflows to recreate or optimize models, helping you scale ML throughout your organization.
Collaborate across data science teams on large-scale experiments
Model building is an iterative process that involves training hundreds of different models in search of the best model architecture and parameters to achieve the required level of prediction accuracy. You can use SageMaker Experiments to automatically generate trials at scale based on parameters you select. SageMaker Experiments tracks model and training iterations by capturing and storing the input parameters, configurations, and results. You can then use SageMaker Studio to browse active experiments, search for previous experiments, and review and compare experiment results. This enables repeatability of trials and improves collaboration between data scientists in model development.
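The idea of generating trials from selected parameters, recording each trial's inputs and results, and then comparing them can be sketched in a few lines of plain Python. This is a minimal illustration of the concept only, not the SageMaker Experiments API: the function names and the `toy_score` stand-in for a training job are hypothetical.

```python
from itertools import product


def generate_trials(search_space):
    """Yield one parameter dict per combination in the search space."""
    names = sorted(search_space)
    for values in product(*(search_space[name] for name in names)):
        yield dict(zip(names, values))


def run_experiment(search_space, train_and_score):
    """Run every trial and store its input parameters and resulting score."""
    records = []
    for trial_id, params in enumerate(generate_trials(search_space)):
        score = train_and_score(params)
        records.append({"trial": trial_id, "params": params, "score": score})
    return records


def best_trial(records):
    """Return the recorded trial with the highest score."""
    return max(records, key=lambda record: record["score"])


def toy_score(params):
    # Hypothetical stand-in for a real training job:
    # the score peaks at learning_rate=0.1 and depth=4.
    return 1.0 - abs(params["learning_rate"] - 0.1) - 0.05 * abs(params["depth"] - 4)


if __name__ == "__main__":
    space = {"learning_rate": [0.01, 0.1, 0.5], "depth": [2, 4]}
    results = run_experiment(space, toy_score)
    print(best_trial(results))
```

Because every trial keeps its parameters next to its score, any run can be repeated later from the stored record, which is the repeatability property described above.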
Catalog model artifacts for traceability, reusability, and baselining
Many customers train hundreds of models for a given use case, each with multiple versions. Tracking these models, their versions, and the associated metadata is critical for repeatability and discoverability to enable model reuse and meet compliance requirements. With the SageMaker Model Registry, you can track model versions, their metadata (such as use case grouping), and baselines for model performance metrics in a central repository, where it is easy to choose the right model for deployment based on your business requirements. Model Registry automatically logs the approval workflows for audit and compliance. You can also use SageMaker Studio to browse and discover models in the model registry or access them through the SageMaker SDK.
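To make the registry idea concrete, here is a small in-memory sketch that groups versions by use case, stores performance metrics with each version, and logs every status change. It is an assumption-laden toy, not the SageMaker Model Registry or SDK: `ToyRegistry`, `ModelVersion`, and the status strings are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ModelVersion:
    group: str          # use case grouping, e.g. "churn"
    version: int        # 1-based version number within the group
    metrics: dict       # performance metrics recorded at registration
    status: str = "PendingApproval"  # illustrative status label


@dataclass
class ToyRegistry:
    versions: dict = field(default_factory=dict)   # group -> list of ModelVersion
    audit_log: list = field(default_factory=list)  # (group, version, event) tuples

    def register(self, group, metrics):
        """Add a new version to a group and log the registration."""
        entries = self.versions.setdefault(group, [])
        model = ModelVersion(group, len(entries) + 1, dict(metrics))
        entries.append(model)
        self.audit_log.append((group, model.version, "registered"))
        return model

    def set_status(self, group, version, status):
        """Change a version's approval status and log the change."""
        model = self.versions[group][version - 1]
        model.status = status
        self.audit_log.append((group, version, status))

    def best_approved(self, group, metric):
        """Return the approved version with the highest value for a metric."""
        approved = [m for m in self.versions.get(group, []) if m.status == "Approved"]
        if not approved:
            return None
        return max(approved, key=lambda m: m.metrics[metric])


if __name__ == "__main__":
    registry = ToyRegistry()
    registry.register("churn", {"auc": 0.81})
    registry.register("churn", {"auc": 0.88})
    registry.set_status("churn", 2, "Approved")
    print(registry.best_approved("churn", "auc"))
    print(registry.audit_log)
```

Restricting the deployment candidate to approved versions, and appending to the log on every change, mirrors the two properties the paragraph emphasizes: choosing a model against business requirements and keeping an audit record of approvals.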
Build CI/CD for ML to accelerate model deployment
Amazon SageMaker Projects brings CI/CD practices to ML, such as maintaining parity between development and production environments, source and version control, automated testing, and end-to-end automation. These practices help you standardize ML deployment processes and accelerate model deployment from days to minutes.
Track lineage for troubleshooting and compliance
Amazon SageMaker logs every step of your workflow, creating an audit trail of model artifacts, such as training data, platform configurations, model parameters, and learning gradients. Use audit trails to recreate models to debug potential issues and help support compliance requirements.
Maintain quality of predictions
The accuracy of ML models can deteriorate over time, a phenomenon known as model drift. Many factors can cause model drift, such as changes in model features. The accuracy of ML models can also be affected by concept drift, the difference between data used to train models and data used during inference. Amazon SageMaker Model Monitor helps you maintain quality by detecting model drift and concept drift in production in real time, and sending you alerts so you can take immediate action. SageMaker Model Monitor constantly monitors model performance characteristics such as accuracy, which measures the number of correct predictions compared to the total number of predictions, so you can address anomalies.
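As a rough illustration of the two quantities discussed above, the sketch below computes accuracy as correct predictions divided by total predictions, and flags a feature whose live values have moved away from the training baseline. The drift check (a mean shift measured in baseline standard deviations, with a default threshold of 2.0) is a deliberately simple stand-in chosen for this example; it is not how SageMaker Model Monitor works, and all function names are hypothetical.

```python
from statistics import mean, stdev


def accuracy(predictions, labels):
    """Number of correct predictions divided by the total number of predictions."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one prediction is required")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def mean_shift(baseline, live):
    """Distance between live and baseline means, in baseline standard deviations."""
    spread = stdev(baseline)
    if spread == 0:
        raise ValueError("baseline has no variance")
    return abs(mean(live) - mean(baseline)) / spread


def check_feature(baseline, live, threshold=2.0):
    """Report the shift for one feature and whether it exceeds the threshold."""
    shift = mean_shift(baseline, live)
    return {"shift": shift, "alert": shift > threshold}


if __name__ == "__main__":
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
    training_values = [10, 12, 11, 13, 9]
    print(check_feature(training_values, [20, 21, 19]))
    print(check_feature(training_values, [11, 12, 10]))
```

Note that accuracy needs ground-truth labels, which often arrive after the prediction is made, while the feature check only needs the inputs seen at inference time. That is why monitoring input data can surface problems before an accuracy drop becomes measurable.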
Detect potential bias in deployed models
Amazon SageMaker Model Monitor is integrated with Amazon SageMaker Clarify to improve visibility into potential bias. Although your initial data or model may not have been biased, changes in the world may cause bias to develop over time in a model that has already been trained. For example, a substantial change in home buyer demographics could cause a home loan application model to become biased if certain populations were not present in the original training data. Integration with SageMaker Clarify enables you to configure alerting systems such as Amazon CloudWatch to notify you if your model begins to develop bias.
Enhance security of data and models
Amazon SageMaker offers a comprehensive set of security features—including infrastructure security, data protection, authorization, authentication, monitoring, and auditability—to help your organization with security requirements that may apply to ML workloads. Using SageMaker, you can standardize security policies across the entire ML development process to increase your security posture and reduce the time it takes to provide data scientists with access to the data they need, while complying with your organization’s data security requirements.