AWS Cloud Operations Blog
How to perform a Well-Architected Framework Review- Part 1
Is my workload well-architected? Is my team following cloud best practices? How do other customers implement solution X? What is the best way to configure service Y?
These are examples of questions I usually get from customers who want to validate whether their architecture is aligned with AWS best practices. The answers vary depending on the technology domain the customer operates in, but in general, there are proven design principles, and if customers follow them, the systems they build are likely to deliver their functionality as expected. These design principles and best practices are the core of the AWS Well-Architected Framework, and they span six pillars: Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimization, and Sustainability.
At AWS, there are best practices for everything, and conducting a Well-Architected Framework Review (WAFR) for your workload is no exception. A WAFR can be a big time commitment, depending on factors such as the team’s experience, the workload’s complexity, the pillars to review, and others that will be discussed later. Being aware of these best practices is key to ensuring that the time your team invests in the review results in the expected outcome: identifying architecture risks and addressing them. In this 3-part blog series, I share some of the lessons we learned from running many WAFRs with customers. In the first part, I show you how to prepare for a review. The second part covers how to run it, and the third covers how to identify architecture risks and create plans to remediate them.
Before we start, what is WAFR?
Building a technology system is no different from building any other product. There are practices and codes to follow when building a product to ensure it’s aligned with industry standards. However, just having practices in place is not enough. You also need to implement mechanisms to ensure that your teams are aware of these practices and are following them.
The consistent process of learning AWS best practices, measuring your architecture against them, identifying architecture risks, and creating an improvement plan to address those risks is what we call the AWS Well-Architected Framework Review.
Figure 1- Well-Architected Framework Review Cycle
What is the goal of WAFR? Why would I need to do it?
The ultimate goal of WAFR is to improve your systems’ architectures so that these systems can better support business needs. The architecture improvement process starts by reviewing the current architecture and comparing it against best practices. You do so by answering the review questions; there is a set of questions for each pillar. Each question validates whether a specific best practice is implemented in your architecture. Based on your answers, and with the help of the AWS Well-Architected Tool (AWS WA Tool), you identify areas in the architecture that represent high, medium, or low risks (more on that later). The next step is to prioritize the risks by identifying those with the highest impact. Then you create an improvement plan to address them. We will go through the details of each of these steps in this post and the following ones in the series.
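As a rough illustration of the priority-based approach described above, the following Python sketch sorts review findings by risk level and then by a team-assigned impact score. The `Finding` type, the `business_impact` score, and the sample findings are all hypothetical and not part of the AWS WA Tool; they only show one way a team might order its work.

```python
from dataclasses import dataclass

# Hypothetical ranking: a lower number means "address sooner".
RISK_ORDER = {"high": 0, "medium": 1, "low": 2}


@dataclass(frozen=True)
class Finding:
    pillar: str
    summary: str
    risk: str             # "high", "medium", or "low"
    business_impact: int  # illustrative 1-5 score assigned by the team


def prioritize(findings):
    """Sort by risk level first, then by descending business impact."""
    return sorted(findings, key=lambda f: (RISK_ORDER[f.risk], -f.business_impact))


# Made-up findings, for illustration only.
findings = [
    Finding("Cost Optimization", "No usage monitoring in place", "medium", 2),
    Finding("Security", "Shared administrator credentials", "high", 5),
    Finding("Reliability", "Backups are never tested", "high", 3),
    Finding("Operational Excellence", "Runbooks are out of date", "low", 4),
]

for finding in prioritize(findings):
    print(f"{finding.risk:<6} {finding.pillar}: {finding.summary}")
```

The two-part sort key keeps the risk level as the primary criterion, so a low-risk finding never jumps ahead of a high-risk one regardless of its impact score.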
Phases of WAFR
There are three phases for WAFR: Prepare, Review, and Improve. In the following sections, I will dive deep into each phase and give you some best practices to get the most out of it.
Figure 2- Well-Architected Framework Review Phases
Prepare
The preparation for WAFR, on average, starts about 3 weeks before the actual review date. The exact lead time depends on factors like the time needed to assemble the review team, how many pillars you plan to review, and the priority the organization gives to completing the review. During this phase, you decide on the team to invite to the review session, the workload to review, and the format of the review session. You also need to collect the data about the architecture that will help you answer the review questions.
Let’s dive deeper into each one.
1- Define a workload
The first step in preparing for WAFR is to identify the workload you want to review. A workload is a set of components (technology, people, processes) that together deliver business value to your organization. It’s the level of detail at which business and technology leaders communicate. For example, a website where your customers place and track orders, along with the infrastructure and processes that support its back-end, is a workload.
2- Define core team (sponsors)
A key component of a successful WAFR is engaging the right people from the beginning.
After identifying a workload to review, you need to identify the workload’s owners. We sometimes call them sponsors. A workload sponsor is a person (or team) who is ultimately responsible for the success (or failure) of the workload. This person should have the right level of authority to influence and take actions to address the risks identified in the architecture as a result of the review. Example actions include shifting teams’ priorities or hiring an external party.
You also need to identify a sponsor for each pillar. Depending on your organization’s structure and size, you may have one person responsible for multiple pillars, multiple teams responsible for one pillar, or a mix of both. The goal here is to ensure that you have the right person to answer the review questions for each pillar and, later, to address any risks identified in that pillar as part of the treatment plan.
You may also need to invite individuals from different teams to get a more holistic view of the pillar being reviewed. For example, to review the Reliability pillar, you may need to include SMEs on databases, networking, security, and operations. To review the Operational Excellence pillar, you may need to include enterprise architects, application developers, or business and finance representatives.
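One way to check that every pillar in scope has someone to answer its questions is to keep the sponsor assignments in a simple mapping and look for gaps before sending invitations. The sketch below is only an illustration; the `missing_sponsors` helper and the role names are hypothetical.

```python
# The six pillars, in the order the framework lists them.
PILLARS = [
    "Operational Excellence",
    "Security",
    "Reliability",
    "Performance Efficiency",
    "Cost Optimization",
    "Sustainability",
]


def missing_sponsors(pillars_in_scope, sponsors):
    """Return the in-scope pillars that have no sponsor assigned."""
    return [pillar for pillar in pillars_in_scope if not sponsors.get(pillar)]


# Hypothetical assignments: one person may cover several pillars,
# and one pillar may need several people.
sponsors = {
    "Operational Excellence": ["ops-lead"],
    "Security": ["security-lead"],
    "Reliability": ["ops-lead", "database-sme", "network-sme"],
    "Performance Efficiency": [],
}

gaps = missing_sponsors(PILLARS, sponsors)
print("Pillars still needing a sponsor:", gaps)
```

An empty list or a missing key both count as a gap, so a pillar that was added to the mapping but never staffed is still reported.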
3- Decide on the pillars and lenses
Ideally, you take a comprehensive look at the workload from the perspective of all six pillars. However, there may be situations where you need to focus only on specific pillars. For example, you may have changed your security practices and want to make sure you’re still aligned with best practices. In this case, you may choose to review only the Security Pillar.
It’s also recommended that you follow the order of the pillars as they are listed in the Well-Architected Framework: start with the Operational Excellence Pillar and finish with the Sustainability Pillar. However, your organization’s priorities might be different. In this phase, you also need to decide on the order in which to review the pillars.
One more thing to decide is whether you want to use AWS Well-Architected Lenses. The lenses extend the guidance offered by Well-Architected to specific industry and technology domains. For example, if your workload primarily uses serverless, then you may need to review it against the Serverless Application Lens. If you run a data analytics workload, you may need to include the Data Analytics Lens in your review, and so on. Check the list of available lenses here.
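If you register the workload in the AWS WA Tool programmatically, the pillar order and lens choices from this step can be captured in the request you send. The sketch below builds such a request as a plain dictionary. `build_workload_request` is a hypothetical helper, and the pillar identifiers and lens aliases are written as I understand the AWS WA Tool API to spell them, so treat them as assumptions and verify them against the API reference before use.

```python
# Pillar identifiers in framework order. These spellings are an assumption
# about the AWS WA Tool API; verify them before use.
FRAMEWORK_ORDER = [
    "operationalExcellence",
    "security",
    "reliability",
    "performance",
    "costOptimization",
    "sustainability",
]


def build_workload_request(name, description, review_owner, aws_regions,
                           priority_pillars=(), extra_lenses=()):
    """Build keyword arguments for a create_workload call (hypothetical helper)."""
    unknown = [p for p in priority_pillars if p not in FRAMEWORK_ORDER]
    if unknown:
        raise ValueError(f"Unknown pillar identifiers: {unknown}")
    # Prioritized pillars come first; the rest keep the framework order.
    ordered = list(dict.fromkeys([*priority_pillars, *FRAMEWORK_ORDER]))
    return {
        "WorkloadName": name,
        "Description": description,
        "Environment": "PRODUCTION",
        "ReviewOwner": review_owner,
        "AwsRegions": list(aws_regions),
        # "wellarchitected" is assumed to be the alias of the core framework lens.
        "Lenses": ["wellarchitected", *extra_lenses],
        "PillarPriorities": ordered,
    }


request = build_workload_request(
    name="order-tracking-site",  # illustrative values throughout
    description="Customer-facing order placement and tracking",
    review_owner="workload-sponsor@example.com",
    aws_regions=["us-east-1"],
    priority_pillars=["security"],
    extra_lenses=["serverless"],
)
print(request["PillarPriorities"])

# With credentials configured, the request could then be sent with boto3:
#   import boto3
#   boto3.client("wellarchitected").create_workload(**request)
```

Building the request separately from the API call makes it easy to review the chosen pillar order and lenses with the sponsors before anything is created in the tool.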
4- Decide on the session type
Depending on the selected pillars and the teams’ availability, you need to decide on the format of the review session. Your options include one full day covering all six pillars, or several sessions over several days for selected pillars. A full-day review is usually harder to schedule, but it’s the most valuable because all stakeholders get together to discuss best practices. This format usually uncovers the most improvement opportunities. Several shorter sessions are a good option if you have geographically distributed or larger teams and it’s hard to bring them together at the same time. This approach is easier to schedule, but it may require a little extra work to keep the different teams updated on milestones.
Communication across teams during the review is key because it helps the group answer the questions and uncover issues collectively. For that reason, it’s recommended to conduct WAFR as a live session rather than asynchronously (that is, by having teams answer the questions in the AWS WA Tool on their own and share the results later).
5- Collect the necessary data for the review
Before conducting the review session, it’s recommended that you collect details about the workload you’re reviewing. For example, gather any architecture diagrams or documents that explain the main components of the system, its back-end, and the main processes and teams responsible for operating it.
For more complex workloads, you can leverage AWS Trusted Advisor checks for automatic evaluation against best practices across cost optimization, performance, security, fault tolerance, and service limits. You can enable Trusted Advisor and have it check all accounts in your AWS Organization. Check more details here. You can then use the actions recommended by Trusted Advisor to get a better understanding of your compliance with some of the best practices, and incorporate these details in your review and later when developing a treatment plan. Here is an example of how to use AWS Well-Architected with AWS Trusted Advisor to achieve data-driven cost optimization.
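To see which Trusted Advisor checks are available before the review, you could list them through the AWS Support API and group them by category. The following is a minimal sketch, assuming a boto3 `support` client is passed in; `checks_by_category` and the `FakeSupportClient` stub (with made-up check names) are hypothetical and exist only so the example runs without AWS access. As far as I know, the AWS Support API requires an eligible support plan, so confirm that your account has access.

```python
from collections import defaultdict


def checks_by_category(support_client):
    """Group Trusted Advisor check names by category."""
    response = support_client.describe_trusted_advisor_checks(language="en")
    grouped = defaultdict(list)
    for check in response["checks"]:
        grouped[check["category"]].append(check["name"])
    return dict(grouped)


class FakeSupportClient:
    """Stand-in for boto3.client("support"); returns made-up checks."""

    def describe_trusted_advisor_checks(self, language):
        return {
            "checks": [
                {"id": "example-1", "name": "Example idle resource check",
                 "category": "cost_optimizing"},
                {"id": "example-2", "name": "Example open port check",
                 "category": "security"},
                {"id": "example-3", "name": "Example backup check",
                 "category": "fault_tolerance"},
                {"id": "example-4", "name": "Example unused address check",
                 "category": "cost_optimizing"},
            ]
        }


grouped = checks_by_category(FakeSupportClient())
for category, names in grouped.items():
    print(f"{category}: {len(names)} check(s)")

# With real credentials, the stub could be replaced by:
#   import boto3
#   grouped = checks_by_category(boto3.client("support", region_name="us-east-1"))
```

Passing the client in as a parameter keeps the grouping logic testable offline and lets you reuse it across accounts.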
Summary
In this blog post, we dived deep into the first phase of the Well-Architected Framework Review: the preparation phase. I shared some of the steps and lessons we learned from performing many reviews with customers. These recommendations will help your review go more smoothly and help you get the most out of every participant’s time. The steps include defining the workload to review, defining the right core team and sponsors, selecting the pillars and session type, and lastly, collecting the data you need in advance. These steps will get you ready for the review date. I will dive deeper into the review itself and the improvement phase in part 2 and part 3 of this blog series.