AWS Big Data Blog
Centrally tracking dashboard lineage, permissions, and more with Amazon QuickSight administrative dashboards
This post is co-written with Shawn Koupal, an Enterprise Analytics IT Architect at Best Western International, Inc.
A common ask from Amazon QuickSight administrators is to understand the lineage of a given dashboard (what analysis is it built from, what datasets are used in the analysis, and what data sources do those datasets use). QuickSight APIs allow us to capture the metadata from each object and build a complete picture of the linkages between each object. As a QuickSight administrator, you can build a dashboard that displays the lineage from dashboard to data source, along with the permissions for each asset type. It can be helpful to see all permissions assigned to each of your assets as well as the relationships between them, all in one place.
Solution overview
In this solution, you build an end-to-end data pipeline using QuickSight to ingest data from an AWS Glue table.
The following diagram illustrates the architecture of the solution.
You can invoke the QuickSight APIs via the AWS Software Development Kit (AWS SDK) or the AWS Command Line Interface (AWS CLI). For this post, we use the AWS SDK.
The solution starts with an AWS Lambda function that calls the QuickSight list
APIs (list_data_sources
, list_data_sets
, list_analyses
, list_templates
, and list_dashboards
) depending on the event message to build lists of assets in chunks of 100, which are iterated through by a second Lambda function. The reason for splitting the work into two functions is to work around the 15-minute time limit in Lambda. You can schedule the Lambda function to run on each asset type based on an event rule trigger. See the following code:
import boto3
import os
import time
import datetime
import json
AWS_ACCOUNT_ID=os.environ['AWS_ACCOUNT_ID']
AWS_REGION=os.environ['AWS_REGION']
QS_S3_BUCKET=os.environ['QS_S3_BUCKET']
DownloaderFunctionName=os.environ['DownloaderFunctionName']
client = boto3.client('quicksight', region_name=AWS_REGION)
lambda_client = boto3.client('lambda')
def invoke_downloader(iteration, apicall, list_results):
apicall=apicall.replace("list_data_sources","datasource").replace("list_data_sets","dataset").replace("list_analyses","analysis").replace("list_dashboards","dashboard").replace("list_templates","template")
msg = {"Iteration": iteration, "api": apicall, "Objects": list_results }
invoke_response = lambda_client.invoke(FunctionName=DownloaderFunctionName,
InvocationType='Event',
Payload=json.dumps(msg, default=datetime_handler))
def datetime_handler(x):
if isinstance(x, datetime.datetime):
return x.isoformat()
raise TypeError("Unknown type")
def file_cleanup(apicall):
#Replace the apicall with the S3 folder name
object_type=apicall.replace("list_data_sources","datasource").replace("list_data_sets","dataset").replace("list_analyses","analysis").replace("list_dashboards","dashboard").replace("list_templates","template")
s3_path='quicksight_lineage/'+object_type+'/'
s3_path2='quicksight_lineage/'+object_type+'_permissions/'
fileprefix="QuickSight_"+object_type
botoSession = boto3.Session (region_name = 'us-west-2')
s3_session = botoSession.resource('s3')
bucket = s3_session.Bucket(QS_S3_BUCKET)
#Delete Any files with prefix in s3_path and s3_path2
bucket.objects.filter(Prefix=s3_path+fileprefix).delete()
bucket.objects.filter(Prefix=s3_path2+fileprefix).delete()
def lambda_handler(event, context):
if event == {}:
#Call All APIs assests
apicall_list=['list_data_sources','list_data_sets','list_analyses','list_dashboards','list_templates']
elif event["api"] == 'datasource':
apicall_list=['list_data_sources']
elif event["api"] == 'dataset':
apicall_list=['list_data_sets']
elif event["api"] == 'analysis':
apicall_list=['list_analyses']
elif event["api"] == 'dashboard':
apicall_list=['list_dashboards']
elif event["api"] == 'template':
apicall_list=['list_templates']
else:
print("[WARN] Exception: Invalid Event Type.")
return
for apicall in apicall_list:
try:
#Clean up files from previous run
file_cleanup(apicall)
#Reset variables for each apicall
iteration=0
user_token = None
list_results={}
while True:
iteration+=1
print("Calling ",apicall, iteration)
if user_token is None:
exec("""results=client."""+apicall+"""(AwsAccountId='"""+AWS_ACCOUNT_ID+"""', MaxResults=100);""",globals(), list_results)
else:
exec("""results=client."""+apicall+"""(AwsAccountId='"""+AWS_ACCOUNT_ID+"""', MaxResults=100,NextToken='"""+user_token+"""');""",globals(), list_results)
invoke_downloader(iteration, apicall, list_results["results"])
user_token=list_results["results"]["NextToken"]
print(user_token)
except KeyError:
print("NextToken not found.")
The second Lambda function consumes the list of assets from the event parameter from the first function and uses the QuickSight describe
APIs (describe_datasource
, describe_dataset
, describe_analysis
, describe_template
, and describe_dashboard
). The details of each QuickSight asset are written to CSV files in an Amazon Simple Storage Service (Amazon S3) bucket in groups of 100. Because the first function calls the second function in parallel, it’s recommended to set the reserved concurrency to 2 in the second Lambda function to avoid throttling errors (if you use the AWS CloudFormation template provided later in this post, this is automatically configured for you). See the following code:
import boto3
import os
import time
import datetime
import json
AWS_ACCOUNT_ID=os.environ['AWS_ACCOUNT_ID']
AWS_REGION=os.environ['AWS_REGION']
QS_S3_BUCKET=os.environ['QS_S3_BUCKET']
client = boto3.client('quicksight', region_name=AWS_REGION)
lambda_client = boto3.client('lambda')
s3 = boto3.client('s3')
def process_dashboards(list_dashboard,iteration,object_type):
filename="QuickSight_"+object_type+"_"+iteration+".csv"
filePath = os.path.join("/tmp",filename)
f=open(filePath,"w")
#CSV Header
f.write("DashboardId,Name,SourceEntityArn,VersionCreatedTime,VersionNumber,CreatedTime,DataSetArns,LastPublishedTime,LastUpdatedTime" + '\n')
for dashboard in list_dashboard["DashboardSummaryList"]:
dashboard_desc= client.describe_dashboard(AwsAccountId=AWS_ACCOUNT_ID,DashboardId=dashboard["DashboardId"])
source_entity_arn = dashboard_desc["Dashboard"]["Version"]["SourceEntityArn"]
version_created_time = dashboard_desc["Dashboard"]["Version"]["CreatedTime"].isoformat()
version_number = str(dashboard_desc["Dashboard"]["Version"]["VersionNumber"])
created_time = dashboard_desc["Dashboard"]["CreatedTime"].isoformat()
last_published_time = dashboard_desc["Dashboard"]["LastPublishedTime"].isoformat()
last_updated_time = dashboard_desc["Dashboard"]["LastUpdatedTime"].isoformat()
try:
for arn in dashboard_desc["Dashboard"]["Version"]["DataSetArns"]:
f.write(dashboard["DashboardId"]+',"'+ dashboard["Name"] + '",' + source_entity_arn + ',' + version_created_time + ',' + version_number + ',' + created_time + ','+ arn + ',' + last_published_time + ',' + last_updated_time +'\n')
except Exception as e:
print(e)
dataset_arn=''
f.write(dashboard["DashboardId"]+',"'+ dashboard["Name"] + '",' + source_entity_arn + ',' + version_created_time + ',' + version_number + ',' + created_time + ','+ dataset_arn + ',' + last_published_time + ',' + last_updated_time +'\n')
f.close()
s3_path='quicksight_lineage/'+object_type+'/'
s3.upload_file("{}/{}".format("/tmp", filename), QS_S3_BUCKET, s3_path + filename)
def process_dashboards_permissions(list_dashboard,iteration,object_type):
filename="QuickSight_"+object_type+"_"+iteration+".csv"
filePath = os.path.join("/tmp",filename)
f=open(filePath,"w")
#CSV Header
f.write("DashboardId,Name,Principal,Permission,Iteration" + '\n')
for dashboard in list_dashboard["DashboardSummaryList"]:
try:
list_permissions = client.describe_dashboard_permissions(AwsAccountId=AWS_ACCOUNT_ID,DashboardId=dashboard["DashboardId"])
except:
print("Error Listing Permissions for:"+dashboard["DashboardId"])
continue
for permission in list_permissions["Permissions"]:
#If Action includes delete operation then principal has co-owner permissions
if "quicksight:DeleteDashboard" in permission["Actions"]:
action = "Co-Owner"
else:
action = "View"
f.write(dashboard["DashboardId"]+',"'+ dashboard["Name"] + '",' + permission["Principal"] + ',' + action + ',' + iteration +'\n')
f.close()
s3_path='quicksight_lineage/'+object_type+'/'
s3.upload_file("{}/{}".format("/tmp", filename), QS_S3_BUCKET, s3_path + filename)
def process_analysis(list_analyses,iteration,object_type):
filename="QuickSight_"+object_type+"_"+iteration+".csv"
filePath = os.path.join("/tmp",filename)
f=open(filePath,"w")
#CSV Header
f.write("AnalysisId,Name,AnalysisArn,CreatedTime,LastUpdatedTime,DataSetArn,Iteration" + '\n')
for analysis in list_analyses["AnalysisSummaryList"]:
#Call describe_analysis
analysis_desc= client.describe_analysis(AwsAccountId=AWS_ACCOUNT_ID,AnalysisId=analysis["AnalysisId"])
analysis_arn = analysis_desc["Analysis"]["Arn"]
created_time = analysis_desc["Analysis"]["CreatedTime"].isoformat()
last_updated_time = analysis_desc["Analysis"]["LastUpdatedTime"].isoformat()
try:
for arn in analysis_desc["Analysis"]["DataSetArns"]:
f.write(analysis["AnalysisId"]+',"'+ analysis["Name"] + '",' + analysis_arn + ',' + created_time + ',' + last_updated_time + ',' + arn + ',' + iteration +'\n')
except Exception as e:
print(e)
dataset_arn=''
f.write(analysis["AnalysisId"]+',"'+ analysis["Name"] + '",' + analysis_arn + ',' + created_time + ',' + last_updated_time + ',' + dataset_arn + ',' + iteration +'\n')
f.close()
s3_path='quicksight_lineage/'+object_type+'/'
s3.upload_file("{}/{}".format("/tmp", filename), QS_S3_BUCKET, s3_path + filename)
def process_analysis_permissions(list_analyses,iteration,object_type):
filename="QuickSight_"+object_type+"_"+iteration+".csv"
filePath = os.path.join("/tmp",filename)
f=open(filePath,"w")
#CSV Header
f.write("AnalysisId,Name,Principal,Permission,Iteration" + '\n')
for analysis in list_analyses["AnalysisSummaryList"]:
try:
list_permissions = client.describe_analysis_permissions(AwsAccountId=AWS_ACCOUNT_ID,AnalysisId=analysis["AnalysisId"])
except:
print("Error Listing Permissions for:"+analysis["AnalysisId"])
continue
for permission in list_permissions["Permissions"]:
#If Action includes delete operation then principal has co-owner permissions
if "quicksight:DeleteAnalysis" in permission["Actions"]:
action = "Co-Owner"
else:
action = "View"
f.write(analysis["AnalysisId"]+',"'+ analysis["Name"] + '",' + permission["Principal"] + ',' + action + ',' + iteration +'\n')
f.close()
s3_path='quicksight_lineage/'+object_type+'/'
s3.upload_file("{}/{}".format("/tmp", filename), QS_S3_BUCKET, s3_path + filename)
def process_templates(list_templates,iteration,object_type):
filename="QuickSight_"+object_type+"_"+iteration+".csv"
filePath = os.path.join("/tmp",filename)
f=open(filePath,"w")
#CSV Header
f.write("TemplateId,Name,TemplateArn,CreatedTime,LastUpdatedTime,SourceEntityArn,VersionNumber,Iteration" + '\n')
for template in list_templates["TemplateSummaryList"]:
#Call describe_template
template_desc= client.describe_template(AwsAccountId=AWS_ACCOUNT_ID,TemplateId=template["TemplateId"])
template_arn = template_desc["Template"]["Arn"]
created_time = template_desc["Template"]["CreatedTime"].isoformat()
last_updated_time = template_desc["Template"]["LastUpdatedTime"].isoformat()
source_entity_arn = template_desc["Template"]["Version"]["SourceEntityArn"]
version_number = str(template_desc["Template"]["Version"]["VersionNumber"])
f.write(template["TemplateId"]+',"'+ template["Name"] + '",' + template_arn + ',' + created_time + ',' + last_updated_time + ',' + source_entity_arn + ',' + version_number + ',' + iteration +'\n')
f.close()
s3_path='quicksight_lineage/'+object_type+'/'
s3.upload_file("{}/{}".format("/tmp", filename), QS_S3_BUCKET, s3_path + filename)
def process_templates_permissions(list_templates,iteration,object_type):
filename="QuickSight_"+object_type+"_"+iteration+".csv"
filePath = os.path.join("/tmp",filename)
f=open(filePath,"w")
#CSV Header
f.write("TemplateId,Name,Principal,Permission,Iteration" + '\n')
for template in list_templates["TemplateSummaryList"]:
try:
list_permissions = client.describe_template_permissions(AwsAccountId=AWS_ACCOUNT_ID,TemplateId=template["TemplateId"])
except:
print("Error Listing Permissions for:"+template["TemplateId"])
continue
for permission in list_permissions["Permissions"]:
#If Action includes delete operation then principal has co-owner permissions
if "quicksight:DeleteTemplate" in permission["Actions"]:
action = "Co-Owner"
else:
action = "View"
f.write(template["TemplateId"]+',"'+ template["Name"] + '",' + permission["Principal"] + ',' + action + ',' + iteration +'\n')
f.close()
s3_path='quicksight_lineage/'+object_type+'/'
s3.upload_file("{}/{}".format("/tmp", filename), QS_S3_BUCKET, s3_path + filename)
def process_datasources(list_data_sources,iteration,object_type):
filename="QuickSight_"+object_type+"_"+iteration+".csv"
filePath = os.path.join("/tmp",filename)
f=open(filePath,"w")
#CSV Header
f.write("DataSourceId,DataSourceArn,Name,Type,LastUpdatedTime,CreatedTime,Status,ErrorInfo,Iteration" + '\n')
global datasource_list
datasource_list=[]
for datasource in list_data_sources["DataSources"]:
datasource_id=datasource["DataSourceId"]
name=datasource["Name"]
datasource_type=datasource["Type"]
try:
status=datasource["Status"]
except:
status=''
CreatedTime=str(datasource["CreatedTime"])
LastUpdatedTime=str(datasource["LastUpdatedTime"])
try:
ErrorInfo="Type: "+datasource["ErrorInfo"]["Type"]+" Message: "+datasource["ErrorInfo"]["Message"]
except:
ErrorInfo="Null"
f.write( datasource_id + ',' + datasource["Arn"] + ',"' + name + '",' + datasource_type + ',' + LastUpdatedTime+ ',' + CreatedTime + ',' + status + ',' + ErrorInfo+ ',' + iteration +'\n')
f.close()
s3_path='quicksight_lineage/'+object_type+'/'
s3.upload_file("{}/{}".format("/tmp", filename), QS_S3_BUCKET, s3_path + filename)
def process_datasources_permissions(list_data_sources,iteration,object_type):
filename="QuickSight_"+object_type+"_"+iteration+".csv"
filePath = os.path.join("/tmp",filename)
f=open(filePath,"w")
#CSV Header
f.write("DataSourceID,Name,Principal,Permission,Iteration" + '\n')
for datasource in list_data_sources["DataSources"]:
try:
list_permissions = client.describe_data_source_permissions(AwsAccountId=AWS_ACCOUNT_ID,DataSourceId=datasource["DataSourceId"])
except:
print("Error Listing Permissions for:"+datasource["DataSourceId"])
continue
for permission in list_permissions["Permissions"]:
#If Action includes delete operation then principal has co-owner permissions
if "quicksight:DeleteDataSource" in permission["Actions"]:
action = "Co-Owner"
else:
action = "View"
f.write(datasource["DataSourceId"]+',"'+ datasource["Name"] + '",' + permission["Principal"] + ',' + action + ',' + iteration +'\n')
f.close()
s3_path='quicksight_lineage/'+object_type+'/'
s3.upload_file("{}/{}".format("/tmp", filename), QS_S3_BUCKET, s3_path + filename)
def process_datasets(list_datasets,iteration,object_type):
filename="QuickSight_"+object_type+"_"+iteration+".csv"
filePath = os.path.join("/tmp",filename)
f=open(filePath,"w")
#CSV Header
f.write('DatasetId,DataSetArn,Name,SpiceSize,ImportMode,LastUpdatedTime,CreatedTime,DataSourceArn,DataSourceName,DataSourceType,Source,Columns,Iteration' + '\n')
for dataset in list_datasets["DataSetSummaries"]:
try:
response= client.describe_data_set(AwsAccountId=AWS_ACCOUNT_ID,DataSetId=dataset["DataSetId"])
except Exception as e:
print("Dataset ID: ", dataset["DataSetId"], e)
f.write( dataset["DataSetId"] + ',' + dataset["Arn"] + ',"' + dataset["Name"] + '",' + '0' + ',' + dataset["ImportMode"] + ',' + str(dataset["LastUpdatedTime"])+ ','+ str(dataset["CreatedTime"])+ ',' + 'n/a' + ',"' + 'n/a' + '",' + 'n/a' + ',' + 'n/a' + ',"' + 'n/a'+ '",' + iteration +'\n')
continue
dataset_id=response["DataSet"]["DataSetId"]
dataset_name=response["DataSet"]["Name"]
dataset_size=response["DataSet"]["ConsumedSpiceCapacityInBytes"]
ImportMode=response["DataSet"]["ImportMode"]
LastUpdatedTime=response["DataSet"]["LastUpdatedTime"].isoformat()
CreatedTime=response["DataSet"]["CreatedTime"].isoformat()
try:
for key in response["DataSet"]["PhysicalTableMap"].keys():
if key == 's3PhysicalTable':
source='S3Source'
DataSourceArn=response["DataSet"]["PhysicalTableMap"]["s3PhysicalTable"]["S3Source"]["DataSourceArn"]
Columns=response["DataSet"]["PhysicalTableMap"]["s3PhysicalTable"]["S3Source"]["InputColumns"]
#SqlQuery="Null"
else:
try:
DataSourceArn=response["DataSet"]["PhysicalTableMap"][key]["RelationalTable"]["DataSourceArn"]
Columns=""
source="VisualEditor"
except:
DataSourceArn=response["DataSet"]["PhysicalTableMap"][key]["CustomSql"]["DataSourceArn"]
Columns=response["DataSet"]["PhysicalTableMap"][key]["CustomSql"]["Columns"]
source="CustomSql"
DataSourceName=""
DataSourceType=""
f.write( dataset_id + ',' + dataset["Arn"] + ',"' + dataset_name + '",' + str(dataset_size) + ',' + ImportMode + ',' + LastUpdatedTime+ ','+ CreatedTime+ ',' + DataSourceArn + ',"' + DataSourceName + '",' + DataSourceType + ',' + source + ',"' + str(Columns) + '",' + iteration +'\n')
except:
print("[DEBUG]: Exception in main write for: " + str(dataset))
f.write( dataset_id + ',' + dataset["Arn"] +',"' + dataset_name + '",' + str(dataset_size) + ',' + ImportMode + ',' + LastUpdatedTime+ ',' + CreatedTime + ',,,,Unknown,"' + str(Columns) + '",' + iteration +'\n')
f.close()
s3_path='quicksight_lineage/'+object_type+'/'
s3.upload_file("{}/{}".format("/tmp", filename), QS_S3_BUCKET, s3_path + filename)
def process_datasets_permissions(list_datasets,iteration,object_type):
filename="QuickSight_"+object_type+"_"+iteration+".csv"
filePath = os.path.join("/tmp",filename)
f=open(filePath,"w")
f.write('DataSetID,Name,Principal,Permission,Iteration'+'\n')
for dataset in list_datasets["DataSetSummaries"]:
try:
list_permissions = client.describe_data_set_permissions(AwsAccountId=AWS_ACCOUNT_ID,DataSetId=dataset["DataSetId"])
except:
print("Error Listing Permissions for:"+dataset["DataSetId"])
continue
for permission in list_permissions["Permissions"]:
#If Action includes delete operation then principal has co-owner permissions
if "quicksight:DeleteDataSet" in permission["Actions"]:
action = "Co-Owner"
else:
action = "View"
f.write(dataset["DataSetId"]+',"'+ dataset["Name"] + '",' + permission["Principal"] + ',' + action+ ',' + iteration +'\n')
f.close()
s3_path='quicksight_lineage/'+object_type+'/'
s3.upload_file("{}/{}".format("/tmp", filename), QS_S3_BUCKET, s3_path + filename)
def lambda_handler(event, context):
list_objects=event["Objects"]
iteration=str(event["Iteration"])
print("Iteration: ", iteration)
print("[INFO]Processing QuickSight:", event["api"] )
if event["api"] == 'datasource':
process_datasources(list_objects, iteration, event["api"])
process_datasources_permissions(list_objects, iteration, event["api"]+'_permissions')
elif event["api"] == 'dataset':
process_datasets(list_objects, iteration, event["api"])
process_datasets_permissions(list_objects, iteration, event["api"]+'_permissions')
elif event["api"] == 'analysis':
process_analysis(list_objects, iteration, event["api"])
process_analysis_permissions(list_objects, iteration, event["api"]+'_permissions')
elif event["api"] == 'dashboard':
process_dashboards(list_objects, iteration, event["api"])
process_dashboards_permissions(list_objects, iteration, event["api"]+'_permissions')
elif event["api"] == 'template':
process_templates(list_objects, iteration, event["api"])
process_templates_permissions(list_objects, iteration, event["api"]+'_permissions')
else:
print("[WARN] Exception: Invalid Event Type.")
return
Afterwards, the S3 bucket has the directory structure under the quicksight_lineage
folder as shown in the following screenshot.
You then use AWS Glue to store the metadata of each file in an AWS Glue table, which allows you to query the information from QuickSight using an Amazon Athena or Amazon Redshift Spectrum data source (if you run the CloudFormation stack, the tables are set up for you).
The following diagram shows the tables and relationships.
Walkthrough overview
The workflow is comprised of the following high-level steps:
- Deploy the CloudFormation template to build the Lambda functions, AWS Identity and Access Management (IAM) roles, S3 bucket, AWS Glue database, and AWS Glue tables.
- Run the Python Lambda functions to build CSV files that contain the QuickSight object details.
- Visualize the data in QuickSight. To do so, you must create your data source, dataset, and then analysis.
For this post, we use Athena as the query engine. To use Redshift Spectrum, you must modify the provided queries.
Prerequisites
For this walkthrough, you should have the following prerequisites:
- An AWS account
- An IAM user with access to AWS resources used in this solution (CloudFormation, IAM, Amazon S3, AWS Glue, Athena, QuickSight)
- Athena configured with a query result location
- QuickSight Enterprise Edition
Creating resources
Create your resources by launching the following CloudFormation stack:
During the stack creation process, you must provide an S3 bucket name in the S3BucketName
parameter (AWSAccountNumber
is appended to the bucket name provided to make it unique).
After the stack creation is successful, you have two Lambda functions, two S3 buckets, an AWS Glue database and tables, and the corresponding IAM roles and policies.
Running the Lambda function
To run your Lambda function, complete the following steps:
- On the Lambda console, navigate to the
QuickSight-Lineage-Dispatcher
function. - From the Select a test event menu, choose Configure test events.
- Select Create new test event.
You create one test event for all QuickSight assets.
- For Event name, enter all.
- Enter an empty JSON object (
{}
).
- Choose Test to run the Lambda function and generate CSV files of the assets.
Alternatively, you can create test events for each QuickSight object (Data Source
, DataSet
, Analysis
, Dashboard
, and Template)
for larger QuickSight
environments:
- Test event
DataSource
code:{ "api": "datasource" }
JSON - Test event
DataSet
code:{ "api": "dataset" }
JSON - Test event
Analysis
code:{ "api": "analysis" }
JSON - Test event
Dashboard
code:{ "api": "dashboard" }
JSON - Test event
Template
code:{ "api": "template" }
JSON
The following screenshot shows the configuration of a test event for Analysis
.
Creating your data source and lineage data set
In this step, you use QuickSight to access the tables in your AWS Glue database.
- Log in to QuickSight.
- Choose Manage QuickSight.
- Choose Security & permissions.
- Ensure that access to the S3 bucket (that was created through CloudFormation) is enabled.
- Choose New analysis.
- Choose New dataset.
- For the data source, choose Athena.
- For your data source name, enter
QuickSight-Lineage
. - Choose Create data source.
QuickSight prompts you to select your schema or database.
- Choose Use custom SQL.
- Update the query name
from New custom SQL
toQuickSight Lineage
. - Enter the following code into the query box:
select a.analysisid as analysis_id, a.name as analysis_name, a.analysisarn as analysis_arn, date_parse(substr(a.createdtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as analysis_createdtime, date_parse(substr(a.lastupdatedtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as analysis_lastupdatedtime, a.datasetarn as analysis_datasetarn, r.dashboardid as dashboard_id, r.name as dashboard_name, r.name||' - ID: '||r.dashboardid as dashboard_name_w_id, date_parse(substr(r.versioncreatedtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as dashboard_versioncreatedtime, r.versionnumber as dashboard_versionnumber , date_parse(substr(r.createdtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as dashboard_createdtime, date_parse(substr(r.lastpublishedtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as dashboard_lastpublishedtime , date_parse(substr(r.lastupdatedtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as dashboard_lastupdatedtime, d.datasetid as dataset_id, d.datasetarn as dataset_arn, d.name as dataset_name, d.spicesize as dataset_spicesize, d.importmode as dataset_importmode, date_parse(substr(d.lastupdatedtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as dataset_lastupdatedtime, date_parse(substr(d.createdtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as dataset_createdtime, d.source as dataset_source, d.columns as dataset_columns, s.datasourceid as datasource_id, s.datasourcearn as datasource_arn, s.name as datasource_name, s.type as datasource_type, date_parse(substr(s.lastupdatedtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as datasource_lastupdatedtime, date_parse(substr(s.createdtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as datasource_createdtime, s.status as datasource_status, s.errorinfo as datasource_errorinfo, t.templateid as template_id, t.name as template_name, t.templatearn as template_arn, date_parse(substr(t.createdtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as template_createtime, date_parse(substr(t.lastupdatedtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as template_lastupdatedtime from "quicksight-lineage"."dashboard" r left join "quicksight-lineage"."analysis" a on a.analysisarn = r.sourceentityarn and a.datasetarn=r.datasetarns left join "quicksight-lineage"."template" t on t.templatearn = r.sourceentityarn left join "quicksight-lineage"."dataset" d on d.datasetarn = r.datasetarns left join "quicksight-lineage"."datasource" s on s.datasourcearn = d.datasourcearn
SQL
- Choose Confirm query.
- Select Import to SPICE for quicker analytics.
- Choose Visualize.
In the new analysis, one empty visual is loaded by default.
- Change the visual type to pivot table.
- Choose (single-click) dashboard_name, analysis_name, template_name, dataset_name, and datasource_name in the Fields list.
You can search for name in field list to make this step easier
- Confirm that all fields were also added to the Rows
If you have assets with duplicates names, it can helpful to add the corresponding ID columns to the visual; for example, dashboard_id
, analysis_id
, template_id
, dataset_id
, datasource_id
.
Visualizing your assets and lineage
You now create five new visuals, one for each asset type (Dashboard
, Analysis
, Template
, Dataset
, Data Source
), to display the additional columns pulled from the APIs.
- From the Add menu, choose Add visual.
- For the first new visual, choose the table visual type.
- Search for
dashboard_
in Field List. - Choose (single-click) all matching columns.
- For the second visual, choose the table visual type.
- Search for
analysis_
in the Field List. - Choose (single-click) all matching columns.
- Move the second visual underneath the first visual.
- Repeat same steps for
template_
,dataset_
, anddatasource_
.
Creating your permissions data set
You now create your new data set.
- Leave the analysis by choosing the QuickSight logo on the top left.
- In the navigation pane, choose Datasets.
- Choose New dataset.
- Locate and choose the
QuickSight-Lineage
data source created earlier in the FROM EXISTING DATA SOURCES - In the QuickSight Lineage data source window, choose Create data set.
- Choose Use custom SQL.
- Update the name from New
custom SQL
toQuickSight Lineage Permissions
. - Enter the following code into the query box:
select distinct 'datasource' as QuickSightObjectType, sp.datasourceid as "QuickSightID",sp.name, split_part(principal,':',6) as principal,permission, r.name||' - ID: '||r.dashboardid as dashboard_name_w_id from "quicksight-lineage"."datasource_permissions" sp inner join "quicksight-lineage"."datasource" s on s.datasourceid = sp.datasourceid left join "quicksight-lineage"."dataset" d on s.datasourcearn = d.datasourcearn left join "quicksight-lineage"."dashboard" r on d.datasetarn = r.datasetarns union select distinct 'dataset' as QuickSightObjectType, dp.datasetid as "QuickSightID",dp.name, split_part(principal,':',6) as principal,permission, r.name||' - ID: '||r.dashboardid as dashboard_name_w_id from "quicksight-lineage"."dataset_permissions" dp inner join "quicksight-lineage"."dataset" d on d.datasetid = dp.datasetid left join "quicksight-lineage"."dashboard" r on d.datasetarn = r.datasetarns union select distinct 'analysis' as QuickSightObjectType, ap.analysisid as "QuickSightID",ap.name, split_part(principal,':',6) as principal,permission, r.name||' - ID: '||r.dashboardid as dashboard_name_w_id from "quicksight-lineage"."analysis_permissions" ap inner join "quicksight-lineage"."analysis" a on a.analysisid = ap.analysisid left join "quicksight-lineage"."dashboard" r on a.analysisarn = r.sourceentityarn union select distinct 'template' as QuickSightObjectType, tp.templateid as "QuickSightID",tp.name, split_part(principal,':',6) as principal,permission, r.name||' - ID: '||r.dashboardid as dashboard_name_w_id from "quicksight-lineage"."template_permissions" tp inner join "quicksight-lineage"."template" t on t.templateid = tp.templateid left join "quicksight-lineage"."dashboard" r on t.templatearn = r.sourceentityarn union select distinct 'dashboard' as QuickSightObjectType, dashboardid as "QuickSightID",name, split_part(principal,':',6) as principal,permission, name||' - ID: '||dashboardid as dashboard_name_w_id from "quicksight-lineage"."dashboard_permissions"
SQL
- Choose Edit / Preview data.
- Choose Apply.
- For Query mode, select SPICE.
- Choose Save.
- Navigate to the Analyses page and open the analysis created earlier.
- Choose the pencil icon to add the new dataset.
- Choose Add data set.
- Select QuickSight Lineage Permissions.
- Choose Select.
- Make sure the new
QuickSight Lineage Permissions
dataset is active in the Data set drop-down menu.
Visualizing your permissions
You now add a new visual to display permissions.
- Choose the table visual type.
- Choose (single-click) name, principal, and permission in the Field List.
- In the navigation pane, choose Filter.
- Choose +.
- Choose quicksightobjecttype.
- Choose the new filter.
- Deselect Select all.
- Select dashboard.
- Choose Apply.
- Choose Close.
- Move the new permissions visual so it’s to the right of the dashboard visual.
- On the new permissions visual, choose the menu options (…).
- Choose Duplicate visual.
- Repeat this step four times.
- Modify the
quicksightobjectype
filter on each new permission visual so you have one visual for each asset type. - Move the visual to the right of the corresponding asset type visual.
Creating parameters for filtering
At this point all the visuals are created; next you need to create a parameter. You can simplify the following steps by using the new simplified filter control creation process. For more information, see Amazon QuickSight adds support for on-sheet filter controls. The following steps still work fine, but to add filter controls to an analysis, you don’t need to create parameters anymore.
- Navigate to the Parameters menu.
- Choose Create one…
- For Name, enter
DashboardNameWithID
. - Choose Create.
- Choose Create a new control for a filter or a calculated field.
- For Display name, enter
Dashboard Name with ID
. - For Style, choose Single select drop down.
- For Values, select Link to a data set field.
- For Select a data set, choose QuickSight Lineage Permissions.
- For Select a column, choose dashboard_name_w_id.
- Choose Add.
- Choose the first visual (
Count of Records by Dashboard_name
,Template_name
,Dataset_name
,Datasource_name
, andAnalysis_name
). - Add a filter in the dashboard_name_w_id field.
- Choose the newly added filter.
- Set the filter scope to All visuals.
- For Filter type, choose Custom filter.
- Select Use parameters.
- From the drop-down menu, choose DashboardNameWithId.
- Choose Apply.
- Choose Close.
- Choose the first permissions visual (
Permission, Principal, and Name
). - Add a filter in the dashboard_name_w_id field.
- Set the filter scope to All visuals.
- For Filter type, choose Custom filter.
- Select Use parameters.
- From the drop-down menu, choose DashboardNameWithID.
- Choose Apply.
- Choose Close.
The analysis build is complete and can be published as a dashboard.
Creating additional visuals
You can also create additional visuals for different use cases.
Visualizing SPICE usage across all your SPICE datasets
To visualize Spice usage across your SPICE datasets, complete the following steps.
- Use the
QuickSight Lineage
dataset and choose the donut chart visual. - For Group/Color, add
dataset_name
. - For Value, add
dataset_spicesize
. - Change the aggregation of
dataset_spicesize
to Average because a dataset can be listed multiple times in the dataset if it is reused across multiple dashboards.
This visual can be useful to track down what is consuming SPICE storage.
Visualizing SPICE refreshes by hour
To visualize SPICE refreshes by hour, complete the following steps:
- Use the
QuickSight Lineage
dataset to create a vertical stacked bar chart. - For X axis, add
dataset_lastupdatetime
aggregated by HOUR. - For Value, add
dataset_id
aggregated by Count district. - For Group/Color, add
dataset_name
. - Create a filter on
dataset_importmode
equal to SPICE.
This visual can be useful to see when all the SPICE dataset refreshes last occurred. The source data is a snapshot in time, so you need to update the source data by running the Lambda function on a regular basis.
Cleaning up
To avoid incurring future charges, delete the resources you created in this walkthrough by deleting the CloudFormation stack. Also, be sure to delete the analysis and dataset (to free up SPICE usage).
Conclusion
We also created some visuals to display SPICE usage by data set as well as the last refresh time per data set, allowing you to view the health of your SPICE refreshes and to free up SPICE capacity by cleaning up older data sets.
Give this technique of building administrative dashboards from data collected via the QuickSight APIs a try, and share you feedback and questions in the comments.
About the Authors
Jesse Gebhardt is a senior global business development manager focused on analytics. He has spent over 10 years in the Business Intelligence industry. At AWS, he aids customers around the globe gain insight and value from the data they have stored in their data lakes and data warehouses. Jesse lives in sunny Phoenix, and is an amateur electronic music producer.
Arun Santhosh is a Specialized World Wide Solution Architect for Amazon QuickSight. Arun started his career at IBM as a developer and progressed on to be an Application Architect. Later, he worked as a Technical Architect at Cognizant. Business Intelligence has been his core focus in these prior roles as well.
Shawn Koupal is an Enterprise Analytics IT Architect at Best Western International, Inc.