Use Amazon Athena and Amazon QuickSight to build custom reports of AWS Well-Architected Reviews

AWS Well-Architected helps cloud architects build secure, high-performing, resilient, and efficient infrastructure for their applications and workloads. Based on five pillars — operational excellence, security, reliability, performance efficiency, and cost optimization — AWS Well-Architected provides a consistent approach for customers and partners to evaluate architectures, and implement designs that can scale over time.

You can use the AWS Well-Architected Tool (AWS WA Tool) to review the state of your workloads and compare them to the latest AWS architectural best practices. Its API allows customers, ISVs, and AWS Partner Network to extend AWS Well-Architected functionality, best practices, measurements, and lessons learned into their existing architecture governance processes, applications, and workflows.

By building your own integrations with the AWS WA Tool, your enterprise can support a range of use cases, such as integrating AWS Well-Architected data into centralized reporting tools or ticketing and management solutions. The API can also be used to create automation for specific use cases.

This post presents a simple approach for aggregating the data of the workload reviews into a central data lake repository. It helps teams to analyze their organization’s Well-Architected maturity across multiple AWS accounts and workloads and perform centralized reporting on high-risk issues (HRIs).

Architecture overview

Many customers use multiple AWS accounts to provide administrative autonomy for their teams. The AWS WA Tool offers a simple way to share workloads with other AWS accounts. You can share a workload that you own in your account with AWS accounts used by other members of your review team or with a centralized AWS account. For more information, see sharing a Workload.

Figure 1: Architecture diagram

After workloads are defined in the AWS WA Tool, AWS Lambda can poll the AWS Well-Architected Tool API to extract the raw data and store it in an Amazon Simple Storage Service (Amazon S3) bucket.
AWS Glue crawlers are used to discover schema and store it in the AWS Glue Data Catalog.
Amazon Athena is then used for data preparation tasks like building views of the workload report data.
Amazon QuickSight can now query and build visualizations to discover insights into your Well-Architected Reviews.

This pattern can extend the visibility of HRIs identified in the AWS WA Tool to enable custom visualizations and more insights. The central management account would typically be managed by a Cloud Center of Excellence (CCoE) team, who can advise and act on emerging HRIs across their entire AWS application portfolio.

Prerequisites

Complete the following prerequisites:

Define and document a workload in the AWS WA Tool. For better understanding, refer to AWS documentation.
Create or use an existing S3 bucket where you will store the extracted AWS Well-Architected data.

Walkthrough

The AWS Well-Architected Tool API provides programmatic access to the AWS WA Tool in the AWS Management Console. The API can be used for managing:

workloads
milestones
lenses
reviews
workload shares
workload share invitations.

For more information, see the getting started with AWS Well-Architected Tool.

Extract, transform, and load Well-Architected workload data

To extract the Well-Architected workload data, we’ll create an AWS Lambda function to query the Well-Architected API. AWS Lambda makes three calls to the Well-Architected Tool API to obtain workload, lens, and milestone information. It performs some minor transformation of the data to normalize the JSON structure and stores the data in an S3 bucket.

Create Lambda function

Begin by navigating to the Lambda console and choosing the Create function
Choose the option to Author from scratch and then fill in extract-war-reports as the function name, “Python 3.6” as the runtime.
Under Permissions, select the option to “Create new role from template(s)”, assigning a role name of extract-war-reports_role.
Then, choose Create function. Lambda will then create a new function along with a role for executing the function.

Image of creating Lambda function showing data inputs provided in above text.

Figure 2: Lambda configuration

Now, go ahead and paste the following code into the function editor. This code handles calls to obtain the workload data and storing in Amazon S3. Select Deploy to commit the code changes.

import boto3
import json
import os
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

s3_bucket = os.environ['S3_BUCKET']
s3_key = os.environ['S3_KEY']

#################
# Boto3 Clients #
#################
wa_client = boto3.client('wellarchitected')
s3_client = boto3.client('s3')

##############
# Parameters #
##############
# The maximum number of results the API can return in a list workloads call.
list_workloads_max_results_maximum = 50
# The maximum number of results the API can return in a list answers call.
list_answers_max_results_maximum = 50
# The maximum number of results the API can return in a list milestones call.
list_milestone_max_results_maximum = 50


def get_all_workloads():
    # Get a list of all workloads
    list_workloads_result = wa_client.list_workloads(MaxResults=list_workloads_max_results_maximum)
    logger.info(f'Found {len(list_workloads_result)} Well-Archtected workloads.')
    workloads_all = list_workloads_result['WorkloadSummaries']
    while 'NextToken' in list_workloads_result:
        next_token = list_workloads_result['NextToken']
        list_workloads_result = wa_client.list_workloads(
            MaxResults=list_workloads_max_results_maximum, NextToken=next_token
        )
        workloads_all += list_workloads_result['WorkloadSummaries']
    return (workloads_all)

def get_milestones(workload_id):
    # # Get latest milestone review date
    milestones = wa_client.list_milestones(
        WorkloadId=workload_id, MaxResults=list_milestone_max_results_maximum
    )['MilestoneSummaries']

    # If workload has milestone get them.
    logger.info(f'Workload {workload_id} has {len(milestones)} milestones.')
    if milestones:
        for milestone in milestones:
            milestone['RecordedAt'] = milestone['RecordedAt'].isoformat()
    return milestones

def get_lens(workload_id):
    # Which lenses have been activated for this workload
    # print(workload_id)
    lens_reviews_result = wa_client.list_lens_reviews(
        WorkloadId=workload_id
    )['LensReviewSummaries']
    logger.info(f'Workload {workload_id} has used {len(lens_reviews_result)} lens')
    return lens_reviews_result

def get_lens_answers(workload_id, lens_reviews):
    # Loop through each activated lens
    # TODO - List Answers for each milestone
    list_answers_result = []
    for lens in lens_reviews:
        lens_name = lens['LensName']
        logger.info(f'Looking at {lens_name} answers for Workload {workload_id}')

        # Get All answers for the lens
        list_answers_reponse = wa_client.list_answers(
            WorkloadId=workload_id, LensAlias=lens['LensAlias'], MaxResults=list_answers_max_results_maximum
        )

        # Flatten the answer result to include LensAlias and Milestone Number
        for answer_result in list_answers_reponse['AnswerSummaries']:
            answer_result['LensAlias'] = list_answers_reponse['LensAlias']

            # if 'MilestoneNumber' in list_answers_reponse:
            #     print("MILSTONE_DETECTED")
            #     answer_result['MilestoneNumber'] = list_answers_reponse['MilestoneNumber']

            # Append Answers from each lens.  For reporting.
        list_answers_result.extend(list_answers_reponse['AnswerSummaries'])

    return  list_answers_result


def get_lens_summary(workload_id, lens_reviews):
    # Loop through each activated lens
    list_lens_review_summary = []
    for lens in lens_reviews:
        lens_name = lens['LensName']
        logger.info(f'Looking at {lens_name} Summary for Workload {workload_id}')
        list_lens_review_reponse = wa_client.get_lens_review(
            WorkloadId=workload_id, LensAlias=lens['LensAlias']
        )
        list_lens_review_reponse['LensReview']['UpdatedAt'] = list_lens_review_reponse['LensReview']['UpdatedAt'].isoformat()
        list_lens_review_summary.append(list_lens_review_reponse['LensReview'])
    return list_lens_review_summary

def lambda_handler(event, context):
    workloads_all = get_all_workloads()
    # Generate workload JSON file
    logger.info(f'Generate JSON object for each workload.')

    # print (workloads_all)

    for workload in workloads_all:
        # Get workload info from WAR Tool API,
        workload_id = workload['WorkloadId']

        milestones = get_milestones(workload_id)

        lens_reviews = get_lens(workload_id)
        lens_summary_result = get_lens_summary(workload_id, lens_reviews)
        list_answers_result = get_lens_answers(workload_id, lens_reviews)

        # Build JSON of workload data
        workload_report_data = {}
        workload_report_data['workload_id'] = workload['WorkloadId']
        workload_report_data['workload_name'] = workload['WorkloadName']
        workload_report_data['workload_owner'] = workload['Owner']

        workload_report_data['workload_lastupdated'] = workload['UpdatedAt'].isoformat()
        workload_report_data['lens_summary'] = lens_summary_result
        workload_report_data['milestones'] = milestones
        # workload_report_data['report_answers'] = list_answers_result['AnswerSummaries']
        workload_report_data['report_answers'] = list_answers_result
        logger.debug(workload_report_data)
        print(workload_report_data)

        # Write to S3
        file_name = workload_id + '.json'
        logger.info(f'Writing JSON object to s3://{s3_bucket}/{s3_key}{file_name}.')
        s3_client.put_object(
            Body=str(json.dumps(workload_report_data)),
            Bucket=s3_bucket,
            Key=f'{s3_key}{file_name}'
        )

Environment variables

Let’s add some environment variables to pass environment-specific settings to the code.

Select Edit under Environment variables, and add the following environment variables and select Save:

Key	Value
S3_BUCKET	S3 bucket name, where you will store the extracted AWS Well-Architected data, e.g. “well-architected-reporting-blog”
S3_KEY	path in which you want the extracted Well-Architected workload data to be stored, e.g. “WorkloadReports/”

Your environment variables should now look like this:

Image of creating environment variables showing data inputs provided in above text.

Figure 3: Environment variables configuration

Trigger configuration

Let’s configure an Amazon Eventbridge (Amazon CloudWatch Events) schedule to have AWS Lambda poll the Well-Architected Tool API to extract all shared workloads to the AWS WA Tool in your AWS management account. Expand the Designer drop-down list and then select EventBridge (CloudWatch Events) as a trigger. Fill in LambdaExtractWARReportsSchedule as the Rule name. Select Schedule expression, and fill in a suitable expression that meets your requirements, e.g. rate(1 hour) will configure the Lambda function once every hour.

Image of Lambda trigger configuration, showing data inputs provided in above text.

Figure 4: Lambda trigger configuration

Lambda timeout

You should also increase the function timeout to 3 minute to ensure that the function always has time to finish reading all the defined workload data.

IAM role configuration

Finally, navigate to the IAM console and open role “extract-war-reports_role”. Attach policy “WellArchitectedConsoleReadOnlyAccess ” in order to grant the permissions necessary for calling Well-Architected APIs. Also attach the following policy to the role, replacing <S3_BUCKET_NAME> with the name of your S3 bucket where you will store the extracted AWS Well-Architected data. This is the same bucket configured as an environment variable.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": "arn:aws:s3:::<S3_BUCKET_NAME>*",
            "Effect": "Allow"
        }
    ]
}

Use AWS Glue to catalog the Well-Architected workload data

AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy to prepare and load your data for analytics. AWS Glue provides crawlers to determine the schema and stores the metadata in the .

Create the Crawler

Open the AWS Glue console, and from the left navigation pane, choose Crawlers.
Select Add crawler and name the crawler well-architected-reporting, select Next.
Select Next to accept the defaults for Specify crawler source type.
Add the S3 path of the where you will store the extracted AWS Well-Architected data e.g. s3://well-architected-reporting-blog. Select
Select No and then Next to on the Add another data store step.
Select Create an IAM role and provide a name, e.g. well-architected-reporting , select Next.
Select Run on demand as the schedule frequency. Select Next.
Next select Add database, and fill-in a name e.g. war-reports. Select Create and then Next.
Review the configuration and select Finish to create the Crawler.

Image of Crawler configuration.

Figure 5: AWS Glue Crawler configuration

Run the Crawler

Find the crawler that was just created, select it, and then choose Run Crawler.
Wait for the crawler to complete running, which should take approximately one minute.
From the left navigation pane, choose Databases.
Find the database that was created during the Crawler creation, select it and choose View Tables.
In the Name field, you should see “workloadreports”. Select this and examine the metadata that was discovered during the crawler run, as shown in Figure 6.

Figure 6: workloadreports table name

Query the Well-Architected workload data with Amazon Athena

Although structured data remains the backbone of many data platforms, unstructured or semi structured data is used to enrich existing information or to create new insights. Amazon Athena enables you to analyze a variety of data, including:

Tabular data in comma-separated value (CSV) or Apache Parquet files
Data extracted from log files using regular expressions
JSON-formatted data

Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run. The AWS WA Tool data is represented as a nested JSON object. For example, running a simple SELECT query on the JSON data in Amazon Athena returns the following, where the underlying report_answers field is selected, but still represented in JSON format.

SELECT workload_id, workload_owner, report_answers 
FROM "YOUR_DATABASE_NAME"."workloadreports" 
LIMIT 4;

report_answers

Figure 7: report_answers

Create a view for the workload answers data

Follow these steps to create a view in Amazon Athena to present the JSON data as a tabular view for reporting and visualizing. A view in Amazon Athena is a logical, not physical table. The query that defines a view runs each time the view is referenced in a query. From within the Amazon Athena console, open a new query tab and execute the following query:

CREATE OR REPLACE VIEW well_architected_reports_view AS
SELECT workload_id,
         workload_name,
         workload_owner,
CAST(from_iso8601_timestamp(workload_lastupdated) AS timestamp) AS "timestamp",
         answers.QuestionTitle,
         answers.LensAlias,
         answers.pillarid,
         answers.risk
FROM "workloadreports"
CROSS JOIN unnest(report_answers) AS t(answers)

The SQL statement, UNNEST, takes the report_answers column from the original table as a parameter. It creates a new dataset with the new column answers, which is later cross-joined. The enclosing SELECT statement can then reference the new answers column directly. You can quickly query the view to see the result to understand how the report_answers are now represented.

SELECT workload_id, workload_owner, questionid, pillarid
FROM "YOUR_DATABASE_NAME"."well_architected_reports_view"
 LIMIT 4;

well_architected_reports_view

Figure 8: well_architected_reports_view

Create a view for the workload risk counts data

Now create a view for a summary of risks associated with each lens for each workload. Open a new query tab and execute the following query:

CREATE
        OR REPLACE VIEW well_architected_workload_lens_risk_view AS
SELECT workload_id,
         workload_name,
         lens.LensAlias,
         lens_pillar_summary.PillarId,
         lens_pillar_summary.RiskCounts.UNANSWERED,
         lens_pillar_summary.RiskCounts.HIGH,
         lens_pillar_summary.RiskCounts.MEDIUM,
         lens_pillar_summary.RiskCounts.NONE,
         lens_pillar_summary.RiskCounts.NOT_APPLICABLE
FROM "workloadreports"
CROSS JOIN unnest(lens_summary) AS t(lens)
CROSS JOIN unnest(lens.PillarReviewSummaries) AS tt(lens_pillar_summary)

Previewing the newly created well_architected_workload_risk_view:

SELECT *
FROM "YOUR_DATABASE_NAME"."well_architected_workload_risk_view"
LIMIT 4;

well_architected_workload_risk_view

Figure 9: well_architected_workload_risk_view

Create a view for the workload’s milestone data

A milestone records the state of a workload at a particular point in time. As a workload changes, you can add more milestones to measure progress. To enable visualization of this improvement across all workloads, create a view for a workload milestone. Open a new query tab and execute the following query:

CREATE
        OR REPLACE VIEW well_architected_workload_milestone_view AS
SELECT CAST(from_iso8601_timestamp(milestone.RecordedAt) AS timestamp) AS "timestamp",
         workload_id,
         workload_name,
         workload_owner,
         milestone.MilestoneName,
         milestone.MilestoneNumber,
         milestone.WorkloadSummary.ImprovementStatus,
         milestone.WorkloadSummary.RiskCounts.HIGH,
         milestone.WorkloadSummary.RiskCounts.MEDIUM,
         milestone.WorkloadSummary.RiskCounts.UNANSWERED,
         milestone.WorkloadSummary.RiskCounts.NONE,
         milestone.WorkloadSummary.RiskCounts.NOT_APPLICABLE
FROM “workloadreports"
CROSS JOIN unnest(milestones) AS t(milestone)

Previewing the newly created well_architected_workload_risk_view:

SELECT * 
FROM "YOUR_DATABASE_NAME"."well_architected_workload_milestone_view" 
 LIMIT 4;

well_architected_workload_milestone_view

Figure 10: well_architected_workload_milestone_view

Visualize the data with Amazon QuickSight

Now that you can query your data in Amazon Athena, you can use Amazon QuickSight to visualize the results.

Grant Amazon QuickSight access to Amazon Athena and your S3 bucket

First, grant Amazon QuickSight access to the S3 bucket where your Well-Architected data is stored.

Sign in to the Amazon QuickSight console.
In the upper right corner of the console, choose Admin/username, and then choose Manage QuickSight
Choose Security and permissions.
Under QuickSight access to AWS services, choose Add or remove.
Choose Amazon Athena, and then choose Next.
Give QuickSight access to the S3 bucket where you will store the extracted AWS Well-Architected data, e.g. “well-architected-reporting-blog”.

Create your datasets

Before you can analyze and visualize the data in QuickSight, you must create datasets for your Athena views and tables for each of the Athena views. Create QuickSight datasets for:

well_architected_workload_milestone_view
well_architected_workload_lens_risk_view
well_architected_reports_view

To create a dataset, on the Datasets page, choose New dataset, and then choose Athena.
Choose the Glue database that was created as part of the Glue Crawler creation step.
Choose Directly Query your data. Figure 11: Choose your table in Amazon QuickSight

Create your analysis

Now create a single-page dashboard using the three Athena datasets you just created.

In the QuickSight console, choose Create Analysis.
Choose one of the Athena datasets, and then choose Create Analysis.
Choose the pencil icon to add the other Athena datasets.
After you have added the datasets, they should be available for analysis.

Build your dashboard

Each dataset can be used to build visualizations for the following:

Number of conducted reviews
Number of HRIs
Top high-risk questions
Top high-risk workloads
Mix of risk across reviews, lenses, and pillars
Trends of risks across reviews, milestones, lenses, and pillars

This list not exhaustive. At AWS, we look forward to seeing the visualizations you’ll build for your organization. Figure 9 shows an example of a dashboard:

The Overview tab of the dashboard shows that 12 reviews have been conducted. It displays the top three high-risk questions and the workloads with the most high-risk items.

Figure 12: Dashboard example

Clean up the deployment

Manually delete the:

S3 bucket and the data stored in the bucket,
Lambda function,
Glue crawler and database,
Athena views, and the
QuickSight resources.

Conclusion

It’s exciting to see the custom integrations with the AWS Well-Architected Tool made possible with the Well-Architected APIs. Here are a few more examples of the functionality available with the new AWS Well-Architected Tool APIs:

Integrate AWS Well-Architected data into centralized reporting tools, or integrate with ticketing and management solutions.
Automation of best practice detection.
Provide insights to AWS customers based on their AWS Well-Architected Review, and recommend remediation steps and guidance.
Pre-populate information in the Well-Architected Tool for customers based on information that’s already known about them—streamlining the review process.

By leveraging the Well-Architected Tool APIs, companies can effectively govern workloads across many AWS accounts, stay up-to-date on the latest best practices, and scale Well-Architected principles across teams and systems.

About the author

Tom McMeekin is an Enterprise Solutions Architect with a career in technology spanning over 20 years. Tom has worked across a number of industry verticals including Telecommunications, Manufacturing, Infrastructure and Development, Utilities, Energy, and Retail. Throughout his career, he has focused on solving complex business problems through innovative technologies that deliver the right business outcomes for his customers.

AWS Cloud Operations Blog