Amazon SageMaker Clarify
Detect bias in ML data and models, and explain model predictions
Explain how input features contributed to your model predictions in real time.
Detect potential bias during data preparation, after model training, and in your deployed model.
Identify any shifts in bias and feature importance after deployment.
Amazon SageMaker Clarify provides machine learning (ML) developers with purpose-built tools to gain greater insights into their ML training data and models. SageMaker Clarify detects and measures potential bias using a variety of metrics so that ML developers can address potential bias and explain model predictions.
SageMaker Clarify can detect potential bias during data preparation, after model training, and in your deployed model. For instance, you can check for age-related bias in your dataset or in your trained model and receive a detailed report that quantifies different types of potential bias. SageMaker Clarify also provides feature importance scores that explain how your model makes predictions, and it can produce explainability reports in bulk or in real time through online explainability. You can use these reports to support customer or internal presentations, or to identify potential issues with your model.
Detect bias in your data and model predictions
Identify imbalances in data
With SageMaker Clarify, you can identify potential bias during data preparation without having to write your own code as part of Amazon SageMaker Data Wrangler. You specify input features, such as gender or age, and SageMaker Clarify runs an analysis job to detect potential bias in those features. SageMaker Clarify then provides a visual report with a description of the metrics and measurements of potential bias so that you can identify steps to remediate the bias. For example, if a financial dataset contains only a few examples of business loans to one age group compared to others, the bias metrics will flag that imbalance. You can then address it in your dataset and reduce the risk of a model that is disproportionately inaccurate for that age group.
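To make these metrics concrete, the sketch below (plain illustrative Python, not the Clarify API) computes two of the pre-training bias metrics that Clarify reports, Class Imbalance (CI) and Difference in Proportions of Labels (DPL), for a loan example like the one above. The counts are hypothetical:

```python
# Illustrative computation of two pre-training bias metrics reported
# by SageMaker Clarify (not the Clarify API itself).

def class_imbalance(n_advantaged: int, n_disadvantaged: int) -> float:
    """CI = (n_a - n_d) / (n_a + n_d); ranges from -1 to 1, 0 means balanced."""
    return (n_advantaged - n_disadvantaged) / (n_advantaged + n_disadvantaged)

def dpl(pos_advantaged: int, n_advantaged: int,
        pos_disadvantaged: int, n_disadvantaged: int) -> float:
    """DPL = q_a - q_d, the gap in positive-label proportions between groups."""
    return pos_advantaged / n_advantaged - pos_disadvantaged / n_disadvantaged

# Hypothetical dataset: 900 applications from one age group, 100 from
# another, with 450 and 30 approvals (positive labels) respectively.
ci = class_imbalance(900, 100)   # (900 - 100) / 1000 = 0.8 -> strong imbalance
gap = dpl(450, 900, 30, 100)     # 0.50 - 0.30 = 0.2 gap in approval rates
print(ci, gap)
```

A CI near zero and a DPL near zero indicate balanced representation and balanced outcomes; the values 0.8 and 0.2 here are what the report would surface as worth remediating.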
If imbalances are found, you can use SageMaker Data Wrangler to balance your data. SageMaker Data Wrangler offers three balancing operators for rebalancing unbalanced datasets: random undersampling, random oversampling, and SMOTE (Synthetic Minority Oversampling Technique). Read our blog post to learn more.
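As a sketch of what the simplest of these operators does, the following function (illustrative only, not the Data Wrangler API) implements random oversampling by duplicating minority-class rows at random until the classes match in size; SMOTE instead synthesizes new points by interpolating between minority-class neighbors:

```python
import random

def random_oversample(rows, labels, minority_label, seed=0):
    """Duplicate randomly chosen minority-class rows until both classes
    are equal in size. Minimal sketch of a random-oversampling operator."""
    rng = random.Random(seed)
    minority = [r for r, y in zip(rows, labels) if y == minority_label]
    majority = [r for r, y in zip(rows, labels) if y != minority_label]
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return rows + extra, labels + [minority_label] * len(extra)

# Hypothetical tiny dataset: 4 majority-class rows, 2 minority-class rows.
rows = [[1], [2], [3], [4], [5], [6]]
labels = [0, 0, 0, 0, 1, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels, minority_label=1)
print(len(balanced_rows), balanced_labels.count(0), balanced_labels.count(1))  # 8 4 4
```

Oversampling avoids discarding data (unlike undersampling) but repeats exact rows, which can encourage overfitting; that trade-off is why three operators are offered.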
Check your trained model for bias
After you’ve trained your model, you can run a SageMaker Clarify bias analysis through Amazon SageMaker Experiments to check your model for potential bias, such as predictions that produce a negative result more frequently for one group than for another. You specify the input features, such as age, for which you want to measure bias in the model outcomes, and SageMaker runs an analysis and provides a visual report that identifies the different types of bias for each feature, such as whether older groups receive more positive predictions than younger groups.
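One headline post-training metric in such a report is the gap in positive-prediction rates between groups, which Clarify calls Difference in Positive Proportions in Predicted Labels (DPPL). A minimal illustrative computation (not the Clarify API), using hypothetical predictions and an age facet:

```python
def positive_rate(predictions, groups, value):
    """Fraction of positive (1) predictions among members of one group."""
    group_preds = [p for p, g in zip(predictions, groups) if g == value]
    return sum(group_preds) / len(group_preds)

def dppl(predictions, groups, advantaged, disadvantaged):
    """DPPL: difference in positive-prediction rates between two groups."""
    return (positive_rate(predictions, groups, advantaged)
            - positive_rate(predictions, groups, disadvantaged))

# Hypothetical model outputs for eight applicants, split by age group.
preds = [1, 1, 0, 1, 0, 0, 0, 1]
age   = ["older", "older", "older", "older",
         "younger", "younger", "younger", "younger"]
print(dppl(preds, age, "older", "younger"))  # 0.75 - 0.25 = 0.5
```

A DPPL near zero means both groups receive positive predictions at similar rates; the 0.5 gap here is the kind of disparity the visual report would highlight.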
The AWS open-source method Fair Bayesian Optimization can help mitigate bias by tuning a model’s hyperparameters. Read our blog post to learn how to apply Fair Bayesian Optimization to mitigate bias while optimizing the accuracy of an ML model.
Monitor your model for bias
SageMaker Clarify helps data scientists and ML engineers monitor predictions for bias on a regular basis. Bias can be introduced or exacerbated in deployed ML models when the training data differs from the live data that the model sees during deployment. For example, the outputs of a model for predicting home prices can become biased if the mortgage rates used to train the model differ from current mortgage rates. SageMaker Clarify bias detection capabilities are integrated into Amazon SageMaker Model Monitor so that when SageMaker detects bias beyond a certain threshold, it automatically generates metrics that you can view in Amazon SageMaker Studio and through Amazon CloudWatch metrics and alarms.
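The threshold logic behind such monitoring can be sketched in a few lines. The class below is illustrative only (not the Model Monitor API): it keeps a sliding window of live predictions and flags when the gap in positive-prediction rates between two groups exceeds a configured threshold:

```python
from collections import deque

class BiasDriftMonitor:
    """Illustrative threshold-based bias monitor (not the Model Monitor API).
    Tracks a sliding window of (prediction, group) pairs and raises an alarm
    when the positive-prediction-rate gap between two groups drifts past a
    chosen threshold."""

    def __init__(self, group_a, group_b, threshold=0.1, window=1000):
        self.group_a, self.group_b = group_a, group_b
        self.threshold = threshold
        self.window = deque(maxlen=window)  # old observations age out

    def observe(self, prediction, group):
        self.window.append((prediction, group))

    def rate(self, group):
        preds = [p for p, g in self.window if g == group]
        return sum(preds) / len(preds) if preds else 0.0

    def in_alarm(self):
        return abs(self.rate(self.group_a) - self.rate(self.group_b)) > self.threshold

# Hypothetical live traffic: older applicants approved 8/10 times,
# younger applicants 4/10 times, against a 0.2 drift threshold.
mon = BiasDriftMonitor("older", "younger", threshold=0.2)
for p, g in ([(1, "older")] * 8 + [(0, "older")] * 2
             + [(1, "younger")] * 4 + [(0, "younger")] * 6):
    mon.observe(p, g)
print(mon.in_alarm())  # gap 0.8 - 0.4 = 0.4 > 0.2 -> True
```

In the managed service, crossing the threshold emits metrics you can route to CloudWatch alarms rather than a boolean in process; the sliding window plays the role of the monitoring schedule's analysis period.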
Explain model predictions
Understand which features contributed the most to model prediction
Explain your computer vision and NLP models
Monitor your model for changes in behavior
Explain individual model predictions in real time
Data scientists and ML engineers need tools to generate the insights required to debug and improve ML models through better feature engineering. These insights help them determine whether a model is making inferences based on noisy or irrelevant features and understand the limitations of their models and failure modes their models might encounter.
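Clarify's feature attributions are based on Shapley (SHAP) values, which it approximates efficiently at scale. For a model with only a few features, the values can be computed exactly by enumerating feature coalitions, as in this illustrative sketch (hypothetical toy model, not the Clarify implementation):

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, instance, baseline):
    """Exact Shapley attributions for a small feature count (illustrative;
    Clarify approximates these at scale). Features outside a coalition are
    replaced by their baseline values."""
    n = len(instance)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for subset in combinations(others, size):
                # Weight of this coalition in the Shapley average.
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                with_i = [instance[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [instance[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi

# Hypothetical linear model: for linear models the Shapley value of feature j
# reduces to w_j * (x_j - baseline_j).
predict = lambda x: 2 * x[0] + 3 * x[1] + 1
phi = shapley_values(predict, instance=[1.0, 2.0], baseline=[0.0, 0.0])
print(phi)  # [2.0, 6.0]
```

Note the attributions sum to the difference between the prediction for the instance (9) and for the baseline (1), the "efficiency" property that makes SHAP values useful for explaining an individual prediction.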
Bundesliga Match Facts, powered by AWS, provides a more engaging fan experience during soccer matches for Bundesliga fans around the world. With Amazon SageMaker Clarify, the Bundesliga can now interactively explain which key underlying factors led the ML model to predict a particular xGoals value. Knowing each feature's attribution and being able to explain outcomes helps in model debugging and increases confidence in ML algorithms, which results in higher-quality predictions.
"Amazon SageMaker Clarify seamlessly integrates with the rest of the Bundesliga Match Facts digital platform and is a key part of our long-term strategy of standardizing our ML workflows on Amazon SageMaker. By using AWS’s innovative technologies, such as machine learning, to deliver more in-depth insights and provide fans a better understanding of the split-second decisions made on the pitch, Bundesliga Match Facts enables viewers to gain deeper insights into the key decisions in each match."
Andreas Heyden, Executive Vice President of Digital Innovations, DFL Group
CAPCOM is a Japanese game company famous for titles such as the Monster Hunter series and Street Fighter. To keep users satisfied, CAPCOM needed to ensure game quality and identify likely churners and churn trends.
"The combination of AutoGluon and Amazon SageMaker Clarify enabled our customer churn model to predict customer churn with 94% accuracy. SageMaker Clarify helps us understand the model behavior by providing explainability through SHAP values. With SageMaker Clarify, we reduced the computation cost of SHAP values by up to 50% compared to a local calculation. The joint solution gives us the ability to better understand the model and improve customer satisfaction at a higher rate of accuracy with significant cost savings."
Masahiro Takamoto, Head of Data Group, CAPCOM
Domo is the Business Cloud, transforming the way business is managed by delivering Modern BI for All. With Domo, critical processes that took weeks, months, or more can now be done on the fly, in minutes or seconds, at unbelievable scale.
"Domo offers a scalable suite of data science solutions that are easy for anyone in an organization to use and understand. With Clarify, our customers are enabled with important insights on how their AI models are making predictions. The combination of Clarify with Domo helps to increase AI speed and intelligence for our customers by putting the power of AI into the hands of everyone across their business and ecosystems."
Ben Ainscough, Ph.D., Head of AI and Data Science, Domo
Varo Bank is a US-based digital bank and uses AI/ML to help make rapid, risk-based decisions to deliver its innovative products and services to customers.
"Varo has a strong commitment to the explainability and transparency of our ML models and we're excited to see the results from Amazon SageMaker Clarify in advancing these efforts."
Sachin Shetty, Head of Data Science, Varo Money