AWS Architecture Blog

Introducing the new AWS Well-Architected Machine Learning Lens

The AWS Well-Architected Framework provides you with a formal approach to compare your workloads against best practices. It also includes guidance on how to make improvements.

Machine learning (ML) algorithms discover and learn patterns in data, and construct mathematical models to predict future data. These solutions can revolutionize lives through better diagnoses of diseases, environmental protections, products and services transformation, and more.

Your ML models depend on the quality of input data to generate accurate results. As data changes over time, monitoring is required to continuously detect, correct, and mitigate issues. This improves accuracy and performance. It also may require you to retrain your model with the latest refined data.

Application workloads rely on step-by-step instructions to solve a problem. ML workloads enable algorithms to learn from data through an iterative and continuous cycle. We are announcing a brand-new version of the AWS Well-Architected Machine Learning Lens whitepaper. It complements and builds upon the Well-Architected Framework to address this difference between these two types of workloads.

The whitepaper provides you with a set of established cloud and technology agnostic best practices. You can apply this guidance and architectural principles when designing your ML workloads, or after your workloads have entered production as part of continuous improvement. The paper includes guidance and resources to help you implement these best practices on AWS.

The Well-Architected Machine Learning Lens components

The Lens includes four focus areas:

1. The Well-Architected Machine Learning Design Principles — A set of considerations that are used as the basis for a Well-Architected ML workload. These design principles are the guiding light for the collection of the best practices in the ML Lens.

2. The Well-Architected Machine Learning Lifecycle — This integrates the Well-Architected Framework into the Machine Learning Lifecycle as can be seen in figure 1.

    • The Well-Architected Framework pillars includes:
      1. Operational Excellence
      2. Security
      3. Reliability
      4. Performance Efficiency
      5. Cost Optimization
    • The Machine Learning Lifecycle phases referenced in the ML Lens include:
      1. Business goal identification
      2. ML problem framing
      3. Data processing (data collection, data pre-processing, feature engineering)
      4. Model development (training, tuning, evaluation)
      5. Model deployment (prediction, inference)
      6. Model monitoring
Figure 1. Well-Architected Machine Learning Lifecycle

Figure 1. Well-Architected Machine Learning Lifecycle

In the Well-Architected ML Lens whitepaper, the Well-Architected Machine Learning Lifecycle applies the Well-Architected Framework pillars to each of the lifecycle phases.

3. Cloud and technology agnostic best practices — These are best practices for each ML lifecycle phase across the Well-Architected Framework pillars. Best practices are accompanied by:

    • Implementation guidance that provides AWS implementation plans for each best practice with references to AWS technologies and resources.
    • Resources as a set of links to AWS documents, blogs, videos, and code examples as supporting resources to the best practices and their implementation plans.

4. ML Lifecycle architecture diagrams — These illustrate processes, technologies, and components that support many of the best practices, shown in Figure 2. They include: Feature stores, Model Registry, lineage tracker, alarm manager, scheduler, and more. Different pipeline technologies are illustrated using these architecture diagrams.

Figure 2. Machine Learning Lifecycle phases with expanded components

Figure 2. Machine Learning Lifecycle phases with expanded components

Where should you apply the Well-Architected Machine Learning Lens?

Use the Well-Architected ML Lens to:

  • Make informed decisions — Plan early and make informed decisions by reviewing best practices before a new workload design begins.
  • Build and deploy faster — Use the best practices to guide you through building new Well-Architected workloads across the ML lifecycle.
  • Lower or mitigate risks — Evaluate existing workloads regularly to identify, mitigate, and address potential issues early.
  • Learn AWS best practices — Use the provided implementation plans as guidance on implementing the best practices on AWS.

Conclusion

The new Well-Architected Machine Learning Lens whitepaper is available now. Use the Lens to help ensure that your ML workloads are architected with operational excellence, security, reliability, performance efficiency, and cost optimization in mind.

Special thanks to everyone across the AWS Solution Architecture and Machine Learning communities who contributed. These contributions encompassed diverse perspectives, expertise, and experiences in developing the new AWS Well-Architected Machine Learning Lens.

Haleh Najafzadeh

Haleh Najafzadeh

Haleh is a principal solutions architect leading the Machine Learning Lens content and strategy development at AWS Well-Architected. She has over 23 years of experience in applying scientific techniques to challenging industrial problems. Haleh holds a Ph.D. in Computer Science. She has a passion for sharing technology best practices, while enabling customers with architecting and implementing machine learning solutions at scale.