AWS Cloud Operations Blog
Use Amazon Athena and Amazon QuickSight to build custom reports of AWS Well-Architected Reviews
AWS Well-Architected helps cloud architects build secure, high-performing, resilient, and efficient infrastructure for their applications and workloads. Based on five pillars — operational excellence, security, reliability, performance efficiency, and cost optimization — AWS Well-Architected provides a consistent approach for customers and partners to evaluate architectures, and implement designs that can scale over time.
You can use the AWS Well-Architected Tool (AWS WA Tool) to review the state of your workloads and compare them to the latest AWS architectural best practices. Its API allows customers, ISVs, and AWS Partners to extend AWS Well-Architected functionality, best practices, measurements, and lessons learned into their existing architecture governance processes, applications, and workflows.
By building your own integrations with the AWS WA Tool, your enterprise can support a range of use cases, such as integrating AWS Well-Architected data into centralized reporting tools or ticketing and management solutions. The API can also be used to create automation for specific use cases.
This post presents a simple approach for aggregating the data of the workload reviews into a central data lake repository. It helps teams to analyze their organization’s Well-Architected maturity across multiple AWS accounts and workloads and perform centralized reporting on high-risk issues (HRIs).
Architecture overview
Many customers use multiple AWS accounts to provide administrative autonomy for their teams. The AWS WA Tool offers a simple way to share workloads with other AWS accounts. You can share a workload that you own in your account with AWS accounts used by other members of your review team or with a centralized AWS account. For more information, see sharing a Workload.
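The same share can also be created through the API if you script this step. The following is a minimal boto3 sketch, assuming a placeholder workload ID and a placeholder central reporting account ID:
import boto3
import uuid

wa_client = boto3.client('wellarchitected')

# Share a workload you own with the central reporting account (read-only).
# Both IDs below are placeholders.
wa_client.create_workload_share(
    WorkloadId='11112222333344445555666677778888',
    SharedWith='123456789012',
    PermissionType='READONLY',
    ClientRequestToken=str(uuid.uuid4())  # idempotency token; boto3 can also generate this
)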
Figure 1: Architecture diagram
- After workloads are defined in the AWS WA Tool, AWS Lambda can poll the AWS Well-Architected Tool API to extract the raw data and store it in an Amazon Simple Storage Service (Amazon S3) bucket.
- AWS Glue crawlers are used to discover schema and store it in the AWS Glue Data Catalog.
- Amazon Athena is then used for data preparation tasks like building views of the workload report data.
- Amazon QuickSight can now query and build visualizations to discover insights into your Well-Architected Reviews.
This pattern can extend the visibility of HRIs identified in the AWS WA Tool to enable custom visualizations and more insights. The central management account would typically be managed by a Cloud Center of Excellence (CCoE) team, who can advise and act on emerging HRIs across their entire AWS application portfolio.
Prerequisites
Complete the following prerequisites:
- Define and document a workload in the AWS WA Tool. For more information, see the AWS documentation.
- Create or use an existing S3 bucket where you will store the extracted AWS Well-Architected data.
Walkthrough
The AWS Well-Architected Tool API provides programmatic access to the AWS WA Tool in the AWS Management Console. The API can be used for managing:
- workloads
- milestones
- lenses
- reviews
- workload shares
- workload share invitations.
For more information, see Getting started with the AWS Well-Architected Tool.
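As a small illustration of the share-invitation operations, the central account can accept pending workload shares programmatically. This is a hedged sketch, not part of the walkthrough steps that follow:
import boto3

wa_client = boto3.client('wellarchitected')

# List pending workload share invitations in the central account and accept them,
# so the shared workloads become visible in the AWS WA Tool.
invitations = wa_client.list_share_invitations()['ShareInvitationSummaries']
for invitation in invitations:
    wa_client.update_share_invitation(
        ShareInvitationId=invitation['ShareInvitationId'],
        ShareInvitationAction='ACCEPT'
    )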
Extract, transform, and load Well-Architected workload data
To extract the Well-Architected workload data, we’ll create an AWS Lambda function that queries the Well-Architected API. The function calls the Well-Architected Tool API to obtain workload, lens, and milestone information, performs some minor transformation to normalize the JSON structure, and stores the data in an S3 bucket.
Create Lambda function
- Begin by navigating to the Lambda console and choosing Create function.
- Choose the option to Author from scratch, fill in extract-war-reports as the function name, and select "Python 3.6" as the runtime.
- Under Permissions, select the option to "Create new role from template(s)", assigning a role name of extract-war-reports_role.
- Then choose Create function. Lambda will create the function along with a role for executing it.
Figure 2: Lambda configuration
Now, go ahead and paste the following code into the function editor. This code calls the Well-Architected Tool API to obtain the workload data and stores it in Amazon S3. Select Deploy to commit the code changes.
import boto3
import json
import os
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

s3_bucket = os.environ['S3_BUCKET']
s3_key = os.environ['S3_KEY']

#################
# Boto3 Clients #
#################
wa_client = boto3.client('wellarchitected')
s3_client = boto3.client('s3')

##############
# Parameters #
##############
# The maximum number of results the API can return in a list workloads call.
list_workloads_max_results_maximum = 50
# The maximum number of results the API can return in a list answers call.
list_answers_max_results_maximum = 50
# The maximum number of results the API can return in a list milestones call.
list_milestone_max_results_maximum = 50


def get_all_workloads():
    # Get a list of all workloads, paginating with NextToken where necessary.
    list_workloads_result = wa_client.list_workloads(MaxResults=list_workloads_max_results_maximum)
    workloads_all = list_workloads_result['WorkloadSummaries']
    while 'NextToken' in list_workloads_result:
        next_token = list_workloads_result['NextToken']
        list_workloads_result = wa_client.list_workloads(
            MaxResults=list_workloads_max_results_maximum, NextToken=next_token
        )
        workloads_all += list_workloads_result['WorkloadSummaries']
    logger.info(f'Found {len(workloads_all)} Well-Architected workloads.')
    return workloads_all


def get_milestones(workload_id):
    # Get the workload's milestones, converting timestamps to ISO 8601 strings.
    milestones = wa_client.list_milestones(
        WorkloadId=workload_id, MaxResults=list_milestone_max_results_maximum
    )['MilestoneSummaries']
    logger.info(f'Workload {workload_id} has {len(milestones)} milestones.')
    for milestone in milestones:
        milestone['RecordedAt'] = milestone['RecordedAt'].isoformat()
    return milestones


def get_lens(workload_id):
    # Which lenses have been activated for this workload.
    lens_reviews_result = wa_client.list_lens_reviews(
        WorkloadId=workload_id
    )['LensReviewSummaries']
    logger.info(f'Workload {workload_id} has used {len(lens_reviews_result)} lenses.')
    return lens_reviews_result


def get_lens_answers(workload_id, lens_reviews):
    # Loop through each activated lens.
    # TODO - List answers for each milestone
    list_answers_result = []
    for lens in lens_reviews:
        lens_name = lens['LensName']
        logger.info(f'Looking at {lens_name} answers for workload {workload_id}')
        # Get all answers for the lens.
        list_answers_response = wa_client.list_answers(
            WorkloadId=workload_id, LensAlias=lens['LensAlias'], MaxResults=list_answers_max_results_maximum
        )
        # Flatten the answer result to include the LensAlias.
        for answer_result in list_answers_response['AnswerSummaries']:
            answer_result['LensAlias'] = list_answers_response['LensAlias']
        # Append answers from each lens for reporting.
        list_answers_result.extend(list_answers_response['AnswerSummaries'])
    return list_answers_result


def get_lens_summary(workload_id, lens_reviews):
    # Loop through each activated lens.
    list_lens_review_summary = []
    for lens in lens_reviews:
        lens_name = lens['LensName']
        logger.info(f'Looking at {lens_name} summary for workload {workload_id}')
        lens_review_response = wa_client.get_lens_review(
            WorkloadId=workload_id, LensAlias=lens['LensAlias']
        )
        lens_review_response['LensReview']['UpdatedAt'] = lens_review_response['LensReview']['UpdatedAt'].isoformat()
        list_lens_review_summary.append(lens_review_response['LensReview'])
    return list_lens_review_summary


def lambda_handler(event, context):
    workloads_all = get_all_workloads()
    logger.info('Generating a JSON object for each workload.')
    for workload in workloads_all:
        # Get workload info from the Well-Architected Tool API.
        workload_id = workload['WorkloadId']
        milestones = get_milestones(workload_id)
        lens_reviews = get_lens(workload_id)
        lens_summary_result = get_lens_summary(workload_id, lens_reviews)
        list_answers_result = get_lens_answers(workload_id, lens_reviews)
        # Build JSON of workload data.
        workload_report_data = {}
        workload_report_data['workload_id'] = workload['WorkloadId']
        workload_report_data['workload_name'] = workload['WorkloadName']
        workload_report_data['workload_owner'] = workload['Owner']
        workload_report_data['workload_lastupdated'] = workload['UpdatedAt'].isoformat()
        workload_report_data['lens_summary'] = lens_summary_result
        workload_report_data['milestones'] = milestones
        workload_report_data['report_answers'] = list_answers_result
        logger.debug(workload_report_data)
        # Write to S3.
        file_name = workload_id + '.json'
        logger.info(f'Writing JSON object to s3://{s3_bucket}/{s3_key}{file_name}.')
        s3_client.put_object(
            Body=json.dumps(workload_report_data),
            Bucket=s3_bucket,
            Key=f'{s3_key}{file_name}'
        )
Environment variables
Let’s add some environment variables to pass environment-specific settings to the code.
Under Environment variables, select Edit, add the following environment variables, and then select Save:
| Key | Value |
| --- | --- |
| S3_BUCKET | The name of the S3 bucket where you will store the extracted AWS Well-Architected data, e.g. "well-architected-reporting-blog" |
| S3_KEY | The path under which the extracted Well-Architected workload data will be stored, e.g. "WorkloadReports/" |
Your environment variables should now look like this:
Figure 3: Environment variables configuration
Trigger configuration
Let’s configure an Amazon EventBridge (Amazon CloudWatch Events) schedule so that AWS Lambda polls the Well-Architected Tool API and extracts all workloads shared to the AWS WA Tool in your central AWS management account. Expand the Designer drop-down list and then select EventBridge (CloudWatch Events) as a trigger. Fill in LambdaExtractWARReportsSchedule as the Rule name. Select Schedule expression, and fill in a suitable expression that meets your requirements; for example, rate(1 hour) will run the Lambda function once every hour.
Figure 4: Lambda trigger configuration
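If you prefer to script the schedule instead of configuring it in the console, the following boto3 sketch creates an equivalent EventBridge rule; the function ARN (region and account ID) is a placeholder:
import boto3

events_client = boto3.client('events')
lambda_client = boto3.client('lambda')

# Placeholder ARN of the extract-war-reports function.
function_arn = 'arn:aws:lambda:us-east-1:123456789012:function:extract-war-reports'

# Create the hourly schedule rule.
rule_arn = events_client.put_rule(
    Name='LambdaExtractWARReportsSchedule',
    ScheduleExpression='rate(1 hour)',
    State='ENABLED'
)['RuleArn']

# Allow EventBridge to invoke the function, then add the function as the rule target.
lambda_client.add_permission(
    FunctionName='extract-war-reports',
    StatementId='AllowEventBridgeInvoke',
    Action='lambda:InvokeFunction',
    Principal='events.amazonaws.com',
    SourceArn=rule_arn
)
events_client.put_targets(
    Rule='LambdaExtractWARReportsSchedule',
    Targets=[{'Id': 'extract-war-reports', 'Arn': function_arn}]
)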
Lambda timeout
You should also increase the function timeout to 3 minutes to ensure that the function always has time to finish reading all the defined workload data.
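The environment variables above and the longer timeout can also be applied with a single boto3 call; the values shown are the example values used in this post:
import boto3

lambda_client = boto3.client('lambda')

# Apply the example environment variables and a 3-minute timeout to the function.
lambda_client.update_function_configuration(
    FunctionName='extract-war-reports',
    Timeout=180,
    Environment={
        'Variables': {
            'S3_BUCKET': 'well-architected-reporting-blog',
            'S3_KEY': 'WorkloadReports/'
        }
    }
)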
IAM role configuration
Finally, navigate to the IAM console and open the role extract-war-reports_role. Attach the policy WellArchitectedConsoleReadOnlyAccess to grant the permissions necessary for calling the Well-Architected APIs. Also attach the following policy to the role, replacing <S3_BUCKET_NAME> with the name of the S3 bucket where you store the extracted AWS Well-Architected data. This is the same bucket configured as an environment variable.
{ "Version": "2012-10-17", "Statement": [ { "Action": [ "s3:GetObject", "s3:PutObject" ], "Resource": "arn:aws:s3:::<S3_BUCKET_NAME>*", "Effect": "Allow" } ] }
Use AWS Glue to catalog the Well-Architected workload data
AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy to prepare and load your data for analytics. AWS Glue provides crawlers that determine the schema and store the metadata in the AWS Glue Data Catalog.
Create the Crawler
- Open the AWS Glue console, and from the left navigation pane, choose Crawlers.
- Select Add crawler, name the crawler well-architected-reporting, and select Next.
- Select Next to accept the defaults for Specify crawler source type.
- Add the S3 path where you will store the extracted AWS Well-Architected data, e.g. s3://well-architected-reporting-blog. Select Next.
- Select No and then Next on the Add another data store step.
- Select Create an IAM role and provide a name, e.g. well-architected-reporting, then select Next.
- Select Run on demand as the schedule frequency. Select Next.
- Next, select Add database and fill in a name, e.g. war-reports. Select Create and then Next.
- Review the configuration and select Finish to create the crawler. (A scripted alternative using boto3 is sketched after Figure 5.)
Figure 5: AWS Glue Crawler configuration
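The crawler setup referenced above can also be scripted. The following boto3 sketch assumes the IAM role well-architected-reporting already exists with the required AWS Glue and S3 read permissions, and uses the example bucket name from earlier:
import boto3

glue_client = boto3.client('glue')

# Create a crawler over the reporting bucket, writing metadata to the war-reports database.
glue_client.create_crawler(
    Name='well-architected-reporting',
    Role='well-architected-reporting',  # existing IAM role for the crawler (assumption)
    DatabaseName='war-reports',
    Targets={'S3Targets': [{'Path': 's3://well-architected-reporting-blog/'}]}
)

# Run the crawler on demand; it discovers the schema and creates the workloadreports table.
glue_client.start_crawler(Name='well-architected-reporting')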
Run the Crawler
- Find the crawler that was just created, select it, and then choose Run Crawler.
- Wait for the crawler to complete running, which should take approximately one minute.
- From the left navigation pane, choose Databases.
- Find the database that was created during the Crawler creation, select it and choose View Tables.
- In the Name field, you should see “workloadreports”. Select this and examine the metadata that was discovered during the crawler run, as shown in Figure 6.
Figure 6: workloadreports table name
Query the Well-Architected workload data with Amazon Athena
Although structured data remains the backbone of many data platforms, unstructured or semi-structured data is used to enrich existing information or to create new insights. Amazon Athena enables you to analyze a variety of data, including:
- Tabular data in comma-separated value (CSV) or Apache Parquet files
- Data extracted from log files using regular expressions
- JSON-formatted data
Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run. The AWS WA Tool data is represented as a nested JSON object. For example, running a simple SELECT query on the JSON data in Amazon Athena returns the following, where the underlying report_answers field is selected, but still represented in JSON format.
SELECT workload_id, workload_owner, report_answers
FROM "YOUR_DATABASE_NAME"."workloadreports"
LIMIT 4;
Figure 7: report_answers
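You can also run such queries outside of the console, for example to feed the results into other tooling. The following boto3 sketch assumes the example database name war-reports and an arbitrary query results location in the reporting bucket:
import time
import boto3

athena_client = boto3.client('athena')

query = '''
SELECT workload_id, workload_owner, report_answers
FROM "workloadreports"
LIMIT 4
'''

# Start the query; Athena writes results to the configured S3 output location.
execution_id = athena_client.start_query_execution(
    QueryString=query,
    QueryExecutionContext={'Database': 'war-reports'},
    ResultConfiguration={'OutputLocation': 's3://well-architected-reporting-blog/athena-results/'}
)['QueryExecutionId']

# Poll until the query finishes, then print the result rows.
while True:
    state = athena_client.get_query_execution(QueryExecutionId=execution_id)['QueryExecution']['Status']['State']
    if state in ('SUCCEEDED', 'FAILED', 'CANCELLED'):
        break
    time.sleep(1)

if state == 'SUCCEEDED':
    rows = athena_client.get_query_results(QueryExecutionId=execution_id)['ResultSet']['Rows']
    for row in rows:
        print([col.get('VarCharValue') for col in row['Data']])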
Create a view for the workload answers data
Follow these steps to create a view in Amazon Athena that presents the JSON data as a tabular view for reporting and visualizing. A view in Amazon Athena is a logical table, not a physical one; the query that defines a view runs each time the view is referenced in a query. From within the Amazon Athena console, open a new query tab and execute the following query:
CREATE
OR REPLACE VIEW well_architected_reports_view AS
SELECT workload_id,
workload_name,
workload_owner,
CAST(from_iso8601_timestamp(workload_lastupdated) AS timestamp) AS "timestamp",
answers.QuestionId,
answers.QuestionTitle,
answers.LensAlias,
answers.pillarid,
answers.risk
FROM "workloadreports"
CROSS JOIN unnest(report_answers) AS t(answers)
The UNNEST operator takes the report_answers array column from the original table and expands it into a new relation with the column answers, which is then cross-joined with the original table. The enclosing SELECT statement can then reference the new answers column directly. You can quickly query the view to see how the report_answers are now represented.
SELECT workload_id, workload_owner, questionid, pillarid
FROM "YOUR_DATABASE_NAME"."well_architected_reports_view"
LIMIT 4;
Figure 8: well_architected_reports_view
Create a view for the workload risk counts data
Now create a view for a summary of risks associated with each lens for each workload. Open a new query tab and execute the following query:
CREATE
OR REPLACE VIEW well_architected_workload_lens_risk_view AS
SELECT workload_id,
workload_name,
lens.LensAlias,
lens_pillar_summary.PillarId,
lens_pillar_summary.RiskCounts.UNANSWERED,
lens_pillar_summary.RiskCounts.HIGH,
lens_pillar_summary.RiskCounts.MEDIUM,
lens_pillar_summary.RiskCounts.NONE,
lens_pillar_summary.RiskCounts.NOT_APPLICABLE
FROM "workloadreports"
CROSS JOIN unnest(lens_summary) AS t(lens)
CROSS JOIN unnest(lens.PillarReviewSummaries) AS tt(lens_pillar_summary)
Previewing the newly created well_architected_workload_lens_risk_view:
SELECT *
FROM "YOUR_DATABASE_NAME"."well_architected_workload_lens_risk_view"
LIMIT 4;
Figure 9: well_architected_workload_lens_risk_view
Create a view for the workload’s milestone data
A milestone records the state of a workload at a particular point in time. As a workload changes, you can add more milestones to measure progress. To enable visualization of this improvement across all workloads, create a view for a workload milestone. Open a new query tab and execute the following query:
CREATE
OR REPLACE VIEW well_architected_workload_milestone_view AS
SELECT CAST(from_iso8601_timestamp(milestone.RecordedAt) AS timestamp) AS "timestamp",
workload_id,
workload_name,
workload_owner,
milestone.MilestoneName,
milestone.MilestoneNumber,
milestone.WorkloadSummary.ImprovementStatus,
milestone.WorkloadSummary.RiskCounts.HIGH,
milestone.WorkloadSummary.RiskCounts.MEDIUM,
milestone.WorkloadSummary.RiskCounts.UNANSWERED,
milestone.WorkloadSummary.RiskCounts.NONE,
milestone.WorkloadSummary.RiskCounts.NOT_APPLICABLE
FROM "workloadreports"
CROSS JOIN unnest(milestones) AS t(milestone)
Previewing the newly created well_architected_workload_milestone_view:
SELECT *
FROM "YOUR_DATABASE_NAME"."well_architected_workload_milestone_view"
LIMIT 4;
Figure 10: well_architected_workload_milestone_view
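This view only returns rows once your workloads have milestones recorded. If you want to record a milestone programmatically rather than in the console, a minimal boto3 sketch follows; the workload ID and milestone name are placeholders:
import boto3
import uuid

wa_client = boto3.client('wellarchitected')

# Record a milestone capturing the workload's current state.
wa_client.create_milestone(
    WorkloadId='11112222333344445555666677778888',
    MilestoneName='post-remediation-review',
    ClientRequestToken=str(uuid.uuid4())  # idempotency token; boto3 can also generate this
)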
Visualize the data with Amazon QuickSight
Now that you can query your data in Amazon Athena, you can use Amazon QuickSight to visualize the results.
Grant Amazon QuickSight access to Amazon Athena and your S3 bucket
First, grant Amazon QuickSight access to the S3 bucket where your Well-Architected data is stored.
- Sign in to the Amazon QuickSight console.
- In the upper right corner of the console, choose Admin/username, and then choose Manage QuickSight.
- Choose Security and permissions.
- Under QuickSight access to AWS services, choose Add or remove.
- Choose Amazon Athena, and then choose Next.
- Give QuickSight access to the S3 bucket where you will store the extracted AWS Well-Architected data, e.g. “well-architected-reporting-blog”.
Create your datasets
Before you can analyze and visualize the data in QuickSight, you must create a dataset for each of the Athena views. Create QuickSight datasets for:
- well_architected_workload_milestone_view
- well_architected_workload_lens_risk_view
- well_architected_reports_view
- To create a dataset, on the Datasets page, choose New dataset, and then choose Athena.
- Choose the Glue database that was created as part of the Glue Crawler creation step.
- Choose Directly Query your data. (A scripted alternative using boto3 is sketched after Figure 11.)
Figure 11: Choose your table in Amazon QuickSight
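As mentioned above, dataset creation can also be scripted. The following boto3 sketch registers Athena as a QuickSight data source and creates a direct-query dataset over well_architected_reports_view; the account ID, resource IDs, workgroup, and column list are assumptions based on the examples in this post:
import boto3

quicksight = boto3.client('quicksight')
account_id = '123456789012'  # placeholder AWS account ID

# Register Athena as a QuickSight data source (assumes the default 'primary' workgroup).
data_source_arn = quicksight.create_data_source(
    AwsAccountId=account_id,
    DataSourceId='well-architected-athena',
    Name='well-architected-athena',
    Type='ATHENA',
    DataSourceParameters={'AthenaParameters': {'WorkGroup': 'primary'}}
)['Arn']

# Create a direct-query dataset over one of the Athena views.
quicksight.create_data_set(
    AwsAccountId=account_id,
    DataSetId='well-architected-reports-view',
    Name='well_architected_reports_view',
    ImportMode='DIRECT_QUERY',
    PhysicalTableMap={
        'reports-view': {
            'RelationalTable': {
                'DataSourceArn': data_source_arn,
                'Schema': 'war-reports',                  # Glue database name
                'Name': 'well_architected_reports_view',  # Athena view name
                'InputColumns': [
                    {'Name': 'workload_id', 'Type': 'STRING'},
                    {'Name': 'workload_name', 'Type': 'STRING'},
                    {'Name': 'workload_owner', 'Type': 'STRING'},
                    {'Name': 'timestamp', 'Type': 'DATETIME'},
                    {'Name': 'questionid', 'Type': 'STRING'},
                    {'Name': 'questiontitle', 'Type': 'STRING'},
                    {'Name': 'lensalias', 'Type': 'STRING'},
                    {'Name': 'pillarid', 'Type': 'STRING'},
                    {'Name': 'risk', 'Type': 'STRING'}
                ]
            }
        }
    }
)
You may also need to grant your QuickSight user permissions on the data source and dataset before they appear in the console.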
Create your analysis
Now create a single-page dashboard using the three Athena datasets you just created.
- In the QuickSight console, choose Create Analysis.
- Choose one of the Athena datasets, and then choose Create Analysis.
- Choose the pencil icon to add the other Athena datasets.
- After you have added the datasets, they should be available for analysis.
Build your dashboard
Each dataset can be used to build visualizations for the following:
- Number of conducted reviews
- Number of HRIs
- Top high-risk questions
- Top high-risk workloads
- Mix of risk across reviews, lenses, and pillars
- Trends of risks across reviews, milestones, lenses, and pillars
This list is not exhaustive. At AWS, we look forward to seeing the visualizations you’ll build for your organization. Figure 12 shows an example of a dashboard:
Figure 12: Dashboard example
Clean up the deployment
Manually delete the following resources (a boto3 sketch covering most of them follows this list):
- the S3 bucket and the data stored in the bucket,
- the Lambda function,
- the Glue crawler and database,
- the Athena views, and
- the QuickSight resources.
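If you scripted the deployment, most of the cleanup can be scripted too. The following boto3 sketch uses the example resource names from this post; QuickSight datasets, analyses, and dashboards still need to be removed in the QuickSight console:
import boto3

s3 = boto3.resource('s3')
lambda_client = boto3.client('lambda')
glue_client = boto3.client('glue')

# Empty and delete the reporting bucket.
bucket = s3.Bucket('well-architected-reporting-blog')
bucket.objects.all().delete()
bucket.delete()

# Delete the Lambda function and the Glue crawler.
lambda_client.delete_function(FunctionName='extract-war-reports')
glue_client.delete_crawler(Name='well-architected-reporting')

# Deleting the Glue database also removes the crawled table and the Athena views,
# because Athena views are stored in the AWS Glue Data Catalog.
glue_client.delete_database(Name='war-reports')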
Conclusion
It’s exciting to see the custom integrations with the AWS Well-Architected Tool made possible with the Well-Architected APIs. Here are a few more examples of the functionality available with the new AWS Well-Architected Tool APIs:
- Integrate AWS Well-Architected data into centralized reporting tools, or integrate with ticketing and management solutions.
- Automate best practice detection.
- Provide insights to AWS customers based on their AWS Well-Architected Review, and recommend remediation steps and guidance.
- Pre-populate information in the Well-Architected Tool for customers based on information that’s already known about them—streamlining the review process.
By leveraging the Well-Architected Tool APIs, companies can effectively govern workloads across many AWS accounts, stay up-to-date on the latest best practices, and scale Well-Architected principles across teams and systems.
About the author
Tom McMeekin is an Enterprise Solutions Architect with a career in technology spanning over 20 years. Tom has worked across a number of industry verticals including Telecommunications, Manufacturing, Infrastructure and Development, Utilities, Energy, and Retail. Throughout his career, he has focused on solving complex business problems through innovative technologies that deliver the right business outcomes for his customers.