
Centrally tracking dashboard lineage, permissions, and more with Amazon QuickSight administrative dashboards

https://aws.amazon.com/blogs/big-data/centrally-tracking-dashboard-lineage-permissions-and-more-with-amazon-quicksight-administrative-dashboards/

This post is co-written with Shawn Koupal, an Enterprise Analytics IT Architect at Best Western International, Inc. A common ask from Amazon QuickSight administrators is to understand the lineage of a given dashboard (what analysis is it built from, what datasets are used in the analysis, and what data sources do those datasets use).

QuickSight APIs allow us to capture the metadata from each object and build a complete picture of the linkages between each object. As a QuickSight administrator, you can build a dashboard that displays the lineage from dashboard to data source, along with the permissions for each asset type. It can be helpful to see all permissions assigned to each of your assets as well as the relationships between them, all in one place.
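For example (using made-up ARNs, not output from a real account), the chain can be walked directly from the fields those APIs return: a dashboard version carries a SourceEntityArn pointing at the analysis or template it was published from, plus a list of DataSetArns, and each dataset carries a DataSourceArn. A minimal sketch:

```python
# Walk dashboard -> analysis -> dataset -> data source using (hypothetical)
# response fragments shaped like the QuickSight describe-API output.
dashboard = {
    "DashboardId": "sales-dash",
    "Version": {
        "SourceEntityArn": "arn:aws:quicksight:us-east-1:111122223333:analysis/sales-analysis",
        "DataSetArns": ["arn:aws:quicksight:us-east-1:111122223333:dataset/sales-ds"],
    },
}
datasets = {
    "arn:aws:quicksight:us-east-1:111122223333:dataset/sales-ds": {
        "DataSourceArn": "arn:aws:quicksight:us-east-1:111122223333:datasource/sales-athena",
    },
}

def lineage(dashboard, datasets):
    """Yield one (dashboard_id, analysis_arn, dataset_arn, datasource_arn) row per dataset."""
    analysis_arn = dashboard["Version"]["SourceEntityArn"]
    for ds_arn in dashboard["Version"].get("DataSetArns", []):
        yield (dashboard["DashboardId"], analysis_arn, ds_arn,
               datasets[ds_arn]["DataSourceArn"])

rows = list(lineage(dashboard, datasets))
```

This is the same join the solution later performs in SQL over the exported CSV files.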


Solution overview

In this solution, you build an end-to-end data pipeline using QuickSight to ingest data from an AWS Glue table.

The following diagram illustrates the architecture of the solution.

You can invoke the QuickSight APIs via the AWS Software Development Kit (AWS SDK) or the AWS Command Line Interface (AWS CLI). For this post, we use the AWS SDK.

The solution starts with an AWS Lambda function that calls the QuickSight list APIs (list_data_sources, list_data_sets, list_analyses, list_templates, and list_dashboards) depending on the event message, building lists of assets in chunks of 100 that are iterated through by a second Lambda function. Splitting the work into two functions works around the 15-minute time limit in Lambda. You can schedule the Lambda function to run on each asset type based on an event rule trigger. See the following code:

import boto3
import os
import time
import datetime
import json

AWS_ACCOUNT_ID=os.environ['AWS_ACCOUNT_ID']
AWS_REGION=os.environ['AWS_REGION']
QS_S3_BUCKET=os.environ['QS_S3_BUCKET']
DownloaderFunctionName=os.environ['DownloaderFunctionName']

client = boto3.client('quicksight', region_name=AWS_REGION)
lambda_client = boto3.client('lambda')

def invoke_downloader(iteration, apicall, list_results):
    #Map the list API name to the asset type used as the event "api" value
    apicall=apicall.replace("list_data_sources","datasource").replace("list_data_sets","dataset").replace("list_analyses","analysis").replace("list_dashboards","dashboard").replace("list_templates","template")
    msg = {"Iteration": iteration, "api": apicall, "Objects": list_results}
    invoke_response = lambda_client.invoke(FunctionName=DownloaderFunctionName,
                                           InvocationType='Event',
                                           Payload=json.dumps(msg, default=datetime_handler))

def datetime_handler(x):
    if isinstance(x, datetime.datetime):
        return x.isoformat()
    raise TypeError("Unknown type")

def file_cleanup(apicall):
    #Replace the apicall with the S3 folder name
    object_type=apicall.replace("list_data_sources","datasource").replace("list_data_sets","dataset").replace("list_analyses","analysis").replace("list_dashboards","dashboard").replace("list_templates","template")

    s3_path='quicksight_lineage/'+object_type+'/'
    s3_path2='quicksight_lineage/'+object_type+'_permissions/'
    fileprefix="QuickSight_"+object_type
    botoSession = boto3.Session(region_name=AWS_REGION)
    s3_session = botoSession.resource('s3')
    bucket = s3_session.Bucket(QS_S3_BUCKET)
    #Delete any files with prefix in s3_path and s3_path2
    bucket.objects.filter(Prefix=s3_path+fileprefix).delete()
    bucket.objects.filter(Prefix=s3_path2+fileprefix).delete()

def lambda_handler(event, context):

    if event == {}:
        #Empty event: call the list APIs for all asset types
        apicall_list=['list_data_sources','list_data_sets','list_analyses','list_dashboards','list_templates']
    elif event["api"] == 'datasource':
        apicall_list=['list_data_sources']
    elif event["api"] == 'dataset':
        apicall_list=['list_data_sets']
    elif event["api"] == 'analysis':
        apicall_list=['list_analyses']
    elif event["api"] == 'dashboard':
        apicall_list=['list_dashboards']
    elif event["api"] == 'template':
        apicall_list=['list_templates']
    else:
        print("[WARN] Exception: Invalid Event Type.")
        return
    for apicall in apicall_list:
        try:
            #Clean up files from the previous run
            file_cleanup(apicall)
            #Reset variables for each apicall
            iteration=0
            user_token = None

            while True:
                iteration+=1
                print("Calling ", apicall, iteration)
                #Resolve the list API by name instead of building an exec() string
                list_api = getattr(client, apicall)
                if user_token is None:
                    results = list_api(AwsAccountId=AWS_ACCOUNT_ID, MaxResults=100)
                else:
                    results = list_api(AwsAccountId=AWS_ACCOUNT_ID, MaxResults=100, NextToken=user_token)

                invoke_downloader(iteration, apicall, results)
                #The last page has no NextToken; the KeyError below ends the loop
                user_token = results["NextToken"]
                print(user_token)
        except KeyError:
            print("NextToken not found.")
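The message the dispatcher hands to the downloader is a plain JSON document: the iteration number, the normalized asset type, and the raw list-API page. Because list responses contain datetime objects, they must be serialized with a handler like datetime_handler above. A small round-trip sketch with a made-up page:

```python
import datetime
import json

def datetime_handler(x):
    # Same serializer the dispatcher uses: datetimes become ISO-8601 strings.
    if isinstance(x, datetime.datetime):
        return x.isoformat()
    raise TypeError("Unknown type")

# Hypothetical page of list_dashboards output, trimmed to the relevant fields.
page = {
    "DashboardSummaryList": [
        {"DashboardId": "abc", "Name": "Sales",
         "CreatedTime": datetime.datetime(2021, 1, 15, 12, 30)},
    ],
    "NextToken": "token-2",
}
msg = {"Iteration": 1, "api": "dashboard", "Objects": page}
payload = json.dumps(msg, default=datetime_handler)
decoded = json.loads(payload)
```

The downloader receives exactly this decoded structure as its event parameter.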

The second Lambda function consumes the list of assets from the event parameter of the first function and uses the QuickSight describe APIs (describe_data_source, describe_data_set, describe_analysis, describe_template, and describe_dashboard). The details of each QuickSight asset are written to CSV files in an Amazon Simple Storage Service (Amazon S3) bucket in groups of 100. Because the first function calls the second function in parallel, it’s recommended to set the reserved concurrency to 2 in the second Lambda function to avoid throttling errors (if you use the AWS CloudFormation template provided later in this post, this is automatically configured for you). See the following code:

import boto3
import os
import time
import datetime
import json

AWS_ACCOUNT_ID=os.environ['AWS_ACCOUNT_ID']
AWS_REGION=os.environ['AWS_REGION']
QS_S3_BUCKET=os.environ['QS_S3_BUCKET']

client = boto3.client('quicksight', region_name=AWS_REGION)
lambda_client = boto3.client('lambda')
s3 = boto3.client('s3')

def process_dashboards(list_dashboard,iteration,object_type):
    filename="QuickSight_"+object_type+"_"+iteration+".csv"
    filePath = os.path.join("/tmp",filename)
    f=open(filePath,"w")
    #CSV Header
    f.write("DashboardId,Name,SourceEntityArn,VersionCreatedTime,VersionNumber,CreatedTime,DataSetArns,LastPublishedTime,LastUpdatedTime" + '\n')

    for dashboard in list_dashboard["DashboardSummaryList"]:
        dashboard_desc= client.describe_dashboard(AwsAccountId=AWS_ACCOUNT_ID,DashboardId=dashboard["DashboardId"])
        
        source_entity_arn = dashboard_desc["Dashboard"]["Version"]["SourceEntityArn"]
        version_created_time = dashboard_desc["Dashboard"]["Version"]["CreatedTime"].isoformat()
        version_number = str(dashboard_desc["Dashboard"]["Version"]["VersionNumber"])
        created_time = dashboard_desc["Dashboard"]["CreatedTime"].isoformat()

        last_published_time = dashboard_desc["Dashboard"]["LastPublishedTime"].isoformat()
        last_updated_time = dashboard_desc["Dashboard"]["LastUpdatedTime"].isoformat()
        try:
            for arn in dashboard_desc["Dashboard"]["Version"]["DataSetArns"]:
                f.write(dashboard["DashboardId"]+',"'+ dashboard["Name"] + '",' + source_entity_arn + ',' + version_created_time + ',' + version_number + ','   + created_time + ','+ arn + ',' + last_published_time + ',' + last_updated_time +'\n')
        except Exception as e:
            print(e)
            dataset_arn=''
            f.write(dashboard["DashboardId"]+',"'+ dashboard["Name"] + '",' + source_entity_arn + ',' + version_created_time + ',' + version_number + ','   + created_time + ','+ dataset_arn + ',' + last_published_time + ',' + last_updated_time +'\n')

    f.close()        
    s3_path='quicksight_lineage/'+object_type+'/'
    s3.upload_file("{}/{}".format("/tmp",  filename), QS_S3_BUCKET, s3_path + filename)            
        

def process_dashboards_permissions(list_dashboard,iteration,object_type):

    filename="QuickSight_"+object_type+"_"+iteration+".csv"
    filePath = os.path.join("/tmp",filename)
    f=open(filePath,"w")
    #CSV Header
    f.write("DashboardId,Name,Principal,Permission,Iteration" + '\n')
	
    for dashboard in list_dashboard["DashboardSummaryList"]:

        try:
            list_permissions = client.describe_dashboard_permissions(AwsAccountId=AWS_ACCOUNT_ID,DashboardId=dashboard["DashboardId"])
        except:
            print("Error Listing Permissions for:"+dashboard["DashboardId"])
            continue

        for permission in list_permissions["Permissions"]:
            #If Action includes delete operation then principal has co-owner permissions
            if "quicksight:DeleteDashboard" in permission["Actions"]:
                action = "Co-Owner"
            else:
                action = "View"

            f.write(dashboard["DashboardId"]+',"'+ dashboard["Name"] + '",' + permission["Principal"] +  ',' + action +  ',' + iteration +'\n')

    f.close()        
    s3_path='quicksight_lineage/'+object_type+'/'
    s3.upload_file("{}/{}".format("/tmp",  filename), QS_S3_BUCKET, s3_path + filename)      


def process_analysis(list_analyses,iteration,object_type):
    filename="QuickSight_"+object_type+"_"+iteration+".csv"
    filePath = os.path.join("/tmp",filename)
    f=open(filePath,"w")
    #CSV Header
    f.write("AnalysisId,Name,AnalysisArn,CreatedTime,LastUpdatedTime,DataSetArn,Iteration" + '\n')

    for analysis in list_analyses["AnalysisSummaryList"]:
        #Call describe_analysis
        analysis_desc= client.describe_analysis(AwsAccountId=AWS_ACCOUNT_ID,AnalysisId=analysis["AnalysisId"])

        analysis_arn = analysis_desc["Analysis"]["Arn"]
        created_time = analysis_desc["Analysis"]["CreatedTime"].isoformat()
        last_updated_time = analysis_desc["Analysis"]["LastUpdatedTime"].isoformat()

        try:
            for arn in analysis_desc["Analysis"]["DataSetArns"]:
                f.write(analysis["AnalysisId"]+',"'+ analysis["Name"] + '",' + analysis_arn + ',' + created_time + ','  + last_updated_time + ',' + arn + ',' + iteration  +'\n')
        except Exception as e:
            print(e)
            dataset_arn=''
            f.write(analysis["AnalysisId"]+',"'+ analysis["Name"] + '",' + analysis_arn + ',' + created_time + ','  + last_updated_time + ',' + dataset_arn  + ',' + iteration +'\n')

    f.close()        
    s3_path='quicksight_lineage/'+object_type+'/'
    s3.upload_file("{}/{}".format("/tmp",  filename), QS_S3_BUCKET, s3_path + filename)            

        
def process_analysis_permissions(list_analyses,iteration,object_type):

    filename="QuickSight_"+object_type+"_"+iteration+".csv"
    filePath = os.path.join("/tmp",filename)
    f=open(filePath,"w")
    #CSV Header
    f.write("AnalysisId,Name,Principal,Permission,Iteration" + '\n')
	
    for analysis in list_analyses["AnalysisSummaryList"]:

        try:
            list_permissions = client.describe_analysis_permissions(AwsAccountId=AWS_ACCOUNT_ID,AnalysisId=analysis["AnalysisId"])
        except:
            print("Error Listing Permissions for:"+analysis["AnalysisId"])
            continue
        for permission in list_permissions["Permissions"]:
            #If Action includes delete operation then principal has co-owner permissions
            if "quicksight:DeleteAnalysis" in permission["Actions"]:
                action = "Co-Owner"
            else:
                action = "View"

            f.write(analysis["AnalysisId"]+',"'+ analysis["Name"] + '",' + permission["Principal"] +  ',' + action +  ',' + iteration +'\n')
    
    f.close()        
    s3_path='quicksight_lineage/'+object_type+'/'
    s3.upload_file("{}/{}".format("/tmp",  filename), QS_S3_BUCKET, s3_path + filename)      


def process_templates(list_templates,iteration,object_type):
    filename="QuickSight_"+object_type+"_"+iteration+".csv"
    filePath = os.path.join("/tmp",filename)
    f=open(filePath,"w")
    #CSV Header
    f.write("TemplateId,Name,TemplateArn,CreatedTime,LastUpdatedTime,SourceEntityArn,VersionNumber,Iteration" + '\n')

    for template in list_templates["TemplateSummaryList"]:
        #Call describe_template
        template_desc= client.describe_template(AwsAccountId=AWS_ACCOUNT_ID,TemplateId=template["TemplateId"])

        template_arn = template_desc["Template"]["Arn"]
        created_time = template_desc["Template"]["CreatedTime"].isoformat()
        last_updated_time = template_desc["Template"]["LastUpdatedTime"].isoformat()
        source_entity_arn = template_desc["Template"]["Version"]["SourceEntityArn"]
        version_number = str(template_desc["Template"]["Version"]["VersionNumber"])
        f.write(template["TemplateId"]+',"'+ template["Name"] + '",' + template_arn + ',' + created_time + ','  + last_updated_time + ',' + source_entity_arn + ',' + version_number +  ',' + iteration  +'\n')

    f.close()        
    s3_path='quicksight_lineage/'+object_type+'/'
    s3.upload_file("{}/{}".format("/tmp",  filename), QS_S3_BUCKET, s3_path + filename)            

        
def process_templates_permissions(list_templates,iteration,object_type):

    filename="QuickSight_"+object_type+"_"+iteration+".csv"
    filePath = os.path.join("/tmp",filename)
    f=open(filePath,"w")
    #CSV Header
    f.write("TemplateId,Name,Principal,Permission,Iteration" + '\n')
	
    for template in list_templates["TemplateSummaryList"]:

        try:
            list_permissions = client.describe_template_permissions(AwsAccountId=AWS_ACCOUNT_ID,TemplateId=template["TemplateId"])
        except:
            print("Error Listing Permissions for:"+template["TemplateId"])
            continue
        for permission in list_permissions["Permissions"]:
            #If Action includes delete operation then principal has co-owner permissions
            if "quicksight:DeleteTemplate" in permission["Actions"]:
                action = "Co-Owner"
            else:
                action = "View"

            f.write(template["TemplateId"]+',"'+ template["Name"] + '",' + permission["Principal"] +  ',' + action +  ',' + iteration +'\n')
    
    f.close()        
    s3_path='quicksight_lineage/'+object_type+'/'
    s3.upload_file("{}/{}".format("/tmp",  filename), QS_S3_BUCKET, s3_path + filename)      


def process_datasources(list_data_sources,iteration,object_type):
    filename="QuickSight_"+object_type+"_"+iteration+".csv"
    filePath = os.path.join("/tmp",filename)
    f=open(filePath,"w")
    #CSV Header
    f.write("DataSourceId,DataSourceArn,Name,Type,LastUpdatedTime,CreatedTime,Status,ErrorInfo,Iteration" + '\n')

    global datasource_list
    datasource_list=[]
    for datasource in list_data_sources["DataSources"]:
        datasource_id=datasource["DataSourceId"]
        name=datasource["Name"]
        datasource_type=datasource["Type"]
        try:
            status=datasource["Status"]
        except:
            status=''
        CreatedTime=str(datasource["CreatedTime"])
        LastUpdatedTime=str(datasource["LastUpdatedTime"])
        try:
            ErrorInfo="Type: "+datasource["ErrorInfo"]["Type"]+" Message: "+datasource["ErrorInfo"]["Message"]
        except:
            ErrorInfo="Null"

        f.write( datasource_id + ',' + datasource["Arn"] + ',"' + name + '",'  + datasource_type + ',' + LastUpdatedTime+ ',' + CreatedTime + ',' + status + ',' + ErrorInfo+ ',' + iteration +'\n')

    f.close()        
    s3_path='quicksight_lineage/'+object_type+'/'
    s3.upload_file("{}/{}".format("/tmp",  filename), QS_S3_BUCKET, s3_path + filename)            


def process_datasources_permissions(list_data_sources,iteration,object_type):
    
    filename="QuickSight_"+object_type+"_"+iteration+".csv"
    filePath = os.path.join("/tmp",filename)
    f=open(filePath,"w")
    #CSV Header
    f.write("DataSourceID,Name,Principal,Permission,Iteration" + '\n')

    for datasource in list_data_sources["DataSources"]:
        try:
            list_permissions = client.describe_data_source_permissions(AwsAccountId=AWS_ACCOUNT_ID,DataSourceId=datasource["DataSourceId"])
        except:
            print("Error Listing Permissions for:"+datasource["DataSourceId"])
            continue
        for permission in list_permissions["Permissions"]:
            #If Action includes delete operation then principal has co-owner permissions
            if "quicksight:DeleteDataSource" in permission["Actions"]:
                action = "Co-Owner"
            else:
                action = "View"
                
            f.write(datasource["DataSourceId"]+',"'+ datasource["Name"] + '",' + permission["Principal"] +  ',' + action +  ',' + iteration +'\n')
    
    f.close()        
    s3_path='quicksight_lineage/'+object_type+'/'
    s3.upload_file("{}/{}".format("/tmp",  filename), QS_S3_BUCKET, s3_path + filename)            
    

def process_datasets(list_datasets,iteration,object_type):

    filename="QuickSight_"+object_type+"_"+iteration+".csv"
    filePath = os.path.join("/tmp",filename)
    f=open(filePath,"w")
    #CSV Header
    f.write('DatasetId,DataSetArn,Name,SpiceSize,ImportMode,LastUpdatedTime,CreatedTime,DataSourceArn,DataSourceName,DataSourceType,Source,Columns,Iteration' + '\n')
    
    for dataset in list_datasets["DataSetSummaries"]:
        
        try:
            response= client.describe_data_set(AwsAccountId=AWS_ACCOUNT_ID,DataSetId=dataset["DataSetId"])
        except Exception as e:
            print("Dataset ID: ", dataset["DataSetId"], e)
            f.write( dataset["DataSetId"] + ',' + dataset["Arn"] + ',"' + dataset["Name"] + '",' + '0' + ',' + dataset["ImportMode"] + ',' + str(dataset["LastUpdatedTime"])+ ','+ str(dataset["CreatedTime"])+ ',' + 'n/a' + ',"' + 'n/a' + '",' +  'n/a'  + ',' +  'n/a' + ',"'  + 'n/a'+ '",' + iteration +'\n')
            continue

        dataset_id=response["DataSet"]["DataSetId"]
        dataset_name=response["DataSet"]["Name"]
        dataset_size=response["DataSet"]["ConsumedSpiceCapacityInBytes"]
        ImportMode=response["DataSet"]["ImportMode"]
        LastUpdatedTime=response["DataSet"]["LastUpdatedTime"].isoformat()
        CreatedTime=response["DataSet"]["CreatedTime"].isoformat()
        #Defaults so the except branch below can reference them safely
        DataSourceArn=""
        Columns=""
        source="Unknown"

        try:
            for key in response["DataSet"]["PhysicalTableMap"].keys():
                
                if key == 's3PhysicalTable':
                    
                    source='S3Source'
                    DataSourceArn=response["DataSet"]["PhysicalTableMap"]["s3PhysicalTable"]["S3Source"]["DataSourceArn"]
                    Columns=response["DataSet"]["PhysicalTableMap"]["s3PhysicalTable"]["S3Source"]["InputColumns"]
                    #SqlQuery="Null"

                else:

                    try:
                        DataSourceArn=response["DataSet"]["PhysicalTableMap"][key]["RelationalTable"]["DataSourceArn"]
                        Columns=""
                        source="VisualEditor"
                    except:
                        DataSourceArn=response["DataSet"]["PhysicalTableMap"][key]["CustomSql"]["DataSourceArn"]
                        Columns=response["DataSet"]["PhysicalTableMap"][key]["CustomSql"]["Columns"]
                        source="CustomSql"

                DataSourceName=""
                DataSourceType=""
                
                f.write( dataset_id + ',' + dataset["Arn"] + ',"' + dataset_name + '",' + str(dataset_size) + ',' + ImportMode + ',' + LastUpdatedTime+ ','+ CreatedTime+ ',' + DataSourceArn + ',"' + DataSourceName + '",' +  DataSourceType  + ',' +  source + ',"'  + str(Columns) + '",' + iteration +'\n')
                
        except:
            print("[DEBUG]: Exception in main write for: " + str(dataset))
            f.write( dataset_id  + ',' + dataset["Arn"] +',"' + dataset_name + '",' + str(dataset_size) + ',' + ImportMode + ',' + LastUpdatedTime+ ',' + CreatedTime + ',,,,Unknown,"'  + str(Columns) + '",' + iteration +'\n')

    f.close()
    s3_path='quicksight_lineage/'+object_type+'/'
    s3.upload_file("{}/{}".format("/tmp",  filename), QS_S3_BUCKET, s3_path + filename)


def process_datasets_permissions(list_datasets,iteration,object_type):
    
    filename="QuickSight_"+object_type+"_"+iteration+".csv"
    filePath = os.path.join("/tmp",filename)
    f=open(filePath,"w")
    f.write('DataSetID,Name,Principal,Permission,Iteration'+'\n')

    for dataset in list_datasets["DataSetSummaries"]:
        try:
            list_permissions = client.describe_data_set_permissions(AwsAccountId=AWS_ACCOUNT_ID,DataSetId=dataset["DataSetId"])
        except:
            print("Error Listing Permissions for:"+dataset["DataSetId"])
            continue
        
        for permission in list_permissions["Permissions"]:
            #If Action includes delete operation then principal has co-owner permissions
            if "quicksight:DeleteDataSet" in permission["Actions"]:
                action = "Co-Owner"
            else:
                action = "View"
                
            f.write(dataset["DataSetId"]+',"'+ dataset["Name"] + '",' + permission["Principal"] +  ',' + action+  ',' + iteration +'\n')

    f.close()        
    s3_path='quicksight_lineage/'+object_type+'/'
    s3.upload_file("{}/{}".format("/tmp",  filename), QS_S3_BUCKET, s3_path + filename)            


def lambda_handler(event, context):

    list_objects=event["Objects"]
    iteration=str(event["Iteration"])
    
    print("Iteration: ", iteration)
    print("[INFO]Processing QuickSight:", event["api"] )
    
    if  event["api"] == 'datasource':
        process_datasources(list_objects, iteration, event["api"])
        process_datasources_permissions(list_objects, iteration, event["api"]+'_permissions')
    elif event["api"] == 'dataset':
        process_datasets(list_objects, iteration, event["api"])
        process_datasets_permissions(list_objects, iteration, event["api"]+'_permissions')
    elif event["api"] == 'analysis':
        process_analysis(list_objects, iteration, event["api"])
        process_analysis_permissions(list_objects, iteration, event["api"]+'_permissions')
    elif event["api"] == 'dashboard':
        process_dashboards(list_objects, iteration, event["api"])
        process_dashboards_permissions(list_objects, iteration, event["api"]+'_permissions')
    elif event["api"] == 'template':
        process_templates(list_objects, iteration, event["api"])
        process_templates_permissions(list_objects, iteration, event["api"]+'_permissions')
    else:
        print("[WARN] Exception: Invalid Event Type.")
        return
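The functions above quote only the Name column by hand when assembling each CSV row; if an asset name ever contained a double quote or a newline, the row would break. Python's csv module handles quoting for every field, so a hardened variant of the write step might look like this (a sketch, not the post's code):

```python
import csv
import io

def write_rows(header, rows):
    """Render header + rows as a CSV string, letting the csv module handle quoting."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL)
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()

out = write_rows(
    ["DashboardId", "Name", "Principal", "Permission", "Iteration"],
    # A name containing a comma and quotes is escaped automatically.
    [["abc", 'Sales, "EMEA"', "arn:aws:quicksight:us-east-1:111122223333:user/default/jdoe", "View", "1"]],
)
```

The same approach drops in for each process_* function by collecting rows in a list and writing them once before the upload.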

Afterwards, the S3 bucket has the directory structure under the quicksight_lineage folder as shown in the following screenshot.


You then use AWS Glue to store the metadata of each file in an AWS Glue table, which allows you to query the information from QuickSight using an Amazon Athena or Amazon Redshift Spectrum data source (if you run the CloudFormation stack, the tables are set up for you).

The following diagram shows the tables and relationships.


Walkthrough overview

The workflow consists of the following high-level steps:

  1. Deploy the CloudFormation template to build the Lambda functions, AWS Identity and Access Management (IAM) roles, S3 bucket, AWS Glue database, and AWS Glue tables.
  2. Run the Python Lambda functions to build CSV files that contain the QuickSight object details.
  3. Visualize the data in QuickSight. To do so, you must create your data source, dataset, and then analysis.

For this post, we use Athena as the query engine. To use Redshift Spectrum, you must modify the provided queries.

Prerequisites

For this walkthrough, you should have the following prerequisites:

  • An AWS account
  • An IAM user with access to AWS resources used in this solution (CloudFormation, IAM, Amazon S3, AWS Glue, Athena, QuickSight)
  • Athena configured with a query result location
  • QuickSight Enterprise Edition

Creating resources

Create your resources by launching the following CloudFormation stack:

During the stack creation process, you must provide an S3 bucket name in the S3BucketName parameter (AWSAccountNumber is appended to the bucket name provided to make it unique).

After the stack creation is successful, you have two Lambda functions, two S3 buckets, an AWS Glue database and tables, and the corresponding IAM roles and policies.

Running the Lambda function

To run your Lambda function, complete the following steps:

  1. On the Lambda console, navigate to the QuickSight-Lineage-Dispatcher function.
  2. From the Select a test event menu, choose Configure test events.
  3. Select Create new test event.

You create one test event for all QuickSight assets.

  4. For Event name, enter all.
  5. Enter an empty JSON object ({}).
  6. Choose Test to run the Lambda function and generate CSV files of the assets.

Alternatively, you can create test events for each QuickSight object (Data Source, DataSet, Analysis, Dashboard, and Template) for larger QuickSight environments:

  • Test event DataSource code: { "api": "datasource" }
  • Test event DataSet code: { "api": "dataset" }
  • Test event Analysis code: { "api": "analysis" }
  • Test event Dashboard code: { "api": "dashboard" }
  • Test event Template code: { "api": "template" }

The following screenshot shows the configuration of a test event for Analysis.
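These event payloads map one-to-one onto the list APIs the dispatcher calls; its if/elif chain amounts to the following lookup (a restatement of the first Lambda function, not new behavior):

```python
# Event "api" value -> QuickSight list API called by the dispatcher,
# mirroring the routing in the first Lambda function's lambda_handler.
API_ROUTING = {
    "datasource": "list_data_sources",
    "dataset": "list_data_sets",
    "analysis": "list_analyses",
    "dashboard": "list_dashboards",
    "template": "list_templates",
}

def apicalls_for(event):
    """Return the list APIs to call for a test event ({} means all asset types)."""
    if event == {}:
        return list(API_ROUTING.values())
    return [API_ROUTING[event["api"]]]
```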

Creating your data source and lineage data set

In this step, you use QuickSight to access the tables in your AWS Glue database.

  1. Log in to QuickSight.
  2. Choose Manage QuickSight.
  3. Choose Security & permissions.
  4. Ensure that access to the S3 bucket (that was created through CloudFormation) is enabled.
  5. Choose New analysis.
  6. Choose New dataset.
  7. For the data source, choose Athena.
  8. For your data source name, enter QuickSight-Lineage.
  9. Choose Create data source.

QuickSight prompts you to select your schema or database.

  10. Choose Use custom SQL.
  11. Update the query name from New custom SQL to QuickSight Lineage.
  12. Enter the following code into the query box:

select a.analysisid as analysis_id, a.name as analysis_name, a.analysisarn as analysis_arn,
       date_parse(substr(a.createdtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as analysis_createdtime,
       date_parse(substr(a.lastupdatedtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as analysis_lastupdatedtime,
       a.datasetarn as analysis_datasetarn,
       r.dashboardid as dashboard_id, r.name as dashboard_name,
       r.name||' - ID: '||r.dashboardid as dashboard_name_w_id,
       date_parse(substr(r.versioncreatedtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as dashboard_versioncreatedtime,
       r.versionnumber as dashboard_versionnumber,
       date_parse(substr(r.createdtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as dashboard_createdtime,
       date_parse(substr(r.lastpublishedtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as dashboard_lastpublishedtime,
       date_parse(substr(r.lastupdatedtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as dashboard_lastupdatedtime,
       d.datasetid as dataset_id, d.datasetarn as dataset_arn, d.name as dataset_name,
       d.spicesize as dataset_spicesize, d.importmode as dataset_importmode,
       date_parse(substr(d.lastupdatedtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as dataset_lastupdatedtime,
       date_parse(substr(d.createdtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as dataset_createdtime,
       d.source as dataset_source, d.columns as dataset_columns,
       s.datasourceid as datasource_id, s.datasourcearn as datasource_arn, s.name as datasource_name,
       s.type as datasource_type,
       date_parse(substr(s.lastupdatedtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as datasource_lastupdatedtime,
       date_parse(substr(s.createdtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as datasource_createdtime,
       s.status as datasource_status, s.errorinfo as datasource_errorinfo,
       t.templateid as template_id, t.name as template_name, t.templatearn as template_arn,
       date_parse(substr(t.createdtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as template_createtime,
       date_parse(substr(t.lastupdatedtime,1,26),'%Y-%m-%dT%H:%i:%s.%f') as template_lastupdatedtime
from "quicksight-lineage"."dashboard" r
left join "quicksight-lineage"."analysis" a on a.analysisarn = r.sourceentityarn and a.datasetarn = r.datasetarns
left join "quicksight-lineage"."template" t on t.templatearn = r.sourceentityarn
left join "quicksight-lineage"."dataset" d on d.datasetarn = r.datasetarns
left join "quicksight-lineage"."datasource" s on s.datasourcearn = d.datasourcearn

  13. Choose Confirm query.
  14. Select Import to SPICE for quicker analytics.
  15. Choose Visualize.
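The date_parse(substr(...,1,26),'%Y-%m-%dT%H:%i:%s.%f') pattern in this query exists because the Lambda functions wrote timestamps with Python's isoformat(), and the first 26 characters are exactly the YYYY-MM-DDTHH:MM:SS.ffffff portion. In Python terms:

```python
import datetime

# A timestamp as the Lambda functions write it to the CSV files.
ts = datetime.datetime(2021, 1, 15, 12, 30, 5, 123456).isoformat()
# isoformat() with microseconds yields exactly 26 characters,
# which is what substr(...,1,26) in the Athena query relies on.
trimmed = ts[:26]
parsed = datetime.datetime.strptime(trimmed, "%Y-%m-%dT%H:%M:%S.%f")
```

(Athena's date_parse uses %i/%s for minutes/seconds where Python's strptime uses %M/%S; the format strings describe the same layout.)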

In the new analysis, one empty visual is loaded by default.

  1. Change the visual type to pivot table.
  2. Choose (single-click) dashboard_name, analysis_name, template_name, dataset_name, and datasource_name in the Fields list.

You can search for name in the field list to make this step easier.

  3. Confirm that all fields were also added to the Rows field well.
If you have assets with duplicate names, it can be helpful to add the corresponding ID columns to the visual; for example, dashboard_id, analysis_id, template_id, dataset_id, and datasource_id.

Visualizing your assets and lineage

You now create five new visuals, one for each asset type (Dashboard, Analysis, Template, Dataset, and Data Source), to display the additional columns pulled from the APIs.

  1. From the Add menu, choose Add visual.
  2. For the first new visual, choose the table visual type.
  3. Search for dashboard_ in the Field List.
  4. Choose (single-click) all matching columns.
  5. For the second visual, choose the table visual type.
  6. Search for analysis_ in the Field List.
  7. Choose (single-click) all matching columns.
  8. Move the second visual underneath the first visual.
  9. Repeat the same steps for template_, dataset_, and datasource_.

Creating your permissions data set

You now create your new data set.

  1. Leave the analysis by choosing the QuickSight logo on the top left.
  2. In the navigation pane, choose Datasets.
  3. Choose New dataset.
  4. Locate and choose the QuickSight-Lineage data source created earlier in the FROM EXISTING DATA SOURCES
  5. In the QuickSight Lineage data source window, choose Create data set.
  6. Choose Use custom SQL.
  1. Update the name from New custom SQL to QuickSight Lineage Permissions.
  2. Enter the following code into the query box:

     select distinct 'datasource' as QuickSightObjectType,
            sp.datasourceid as "QuickSightID",
            sp.name,
            split_part(principal, ':', 6) as principal,
            permission,
            r.name || ' - ID: ' || r.dashboardid as dashboard_name_w_id
     from "quicksight-lineage"."datasource_permissions" sp
     inner join "quicksight-lineage"."datasource" s on s.datasourceid = sp.datasourceid
     left join "quicksight-lineage"."dataset" d on s.datasourcearn = d.datasourcearn
     left join "quicksight-lineage"."dashboard" r on d.datasetarn = r.datasetarns
     union
     select distinct 'dataset' as QuickSightObjectType,
            dp.datasetid as "QuickSightID",
            dp.name,
            split_part(principal, ':', 6) as principal,
            permission,
            r.name || ' - ID: ' || r.dashboardid as dashboard_name_w_id
     from "quicksight-lineage"."dataset_permissions" dp
     inner join "quicksight-lineage"."dataset" d on d.datasetid = dp.datasetid
     left join "quicksight-lineage"."dashboard" r on d.datasetarn = r.datasetarns
     union
     select distinct 'analysis' as QuickSightObjectType,
            ap.analysisid as "QuickSightID",
            ap.name,
            split_part(principal, ':', 6) as principal,
            permission,
            r.name || ' - ID: ' || r.dashboardid as dashboard_name_w_id
     from "quicksight-lineage"."analysis_permissions" ap
     inner join "quicksight-lineage"."analysis" a on a.analysisid = ap.analysisid
     left join "quicksight-lineage"."dashboard" r on a.analysisarn = r.sourceentityarn
     union
     select distinct 'template' as QuickSightObjectType,
            tp.templateid as "QuickSightID",
            tp.name,
            split_part(principal, ':', 6) as principal,
            permission,
            r.name || ' - ID: ' || r.dashboardid as dashboard_name_w_id
     from "quicksight-lineage"."template_permissions" tp
     inner join "quicksight-lineage"."template" t on t.templateid = tp.templateid
     left join "quicksight-lineage"."dashboard" r on t.templatearn = r.sourceentityarn
     union
     select distinct 'dashboard' as QuickSightObjectType,
            dashboardid as "QuickSightID",
            name,
            split_part(principal, ':', 6) as principal,
            permission,
            name || ' - ID: ' || dashboardid as dashboard_name_w_id
     from "quicksight-lineage"."dashboard_permissions"
  1. Choose Edit / Preview data.
  2. Choose Apply.
  3. For Query mode, select SPICE.
  1. Choose Save.
  2. Navigate to the Analyses page and open the analysis created earlier.
  3. Choose the pencil icon to add the new dataset.
  1. Choose Add data set.
  1. Select QuickSight Lineage Permissions.
  2. Choose Select.
  1. Make sure the new QuickSight Lineage Permissions dataset is active in the Data set drop-down menu.
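If you prefer to script this step, the same SPICE dataset can also be provisioned through the QuickSight create_data_set API. The sketch below only assembles the request parameters; the account ID, dataset ID, and data source ARN are placeholder assumptions, and the SQL string should be the query entered in the query box above:

```python
# Sketch: assemble create_data_set parameters for the permissions dataset.
# The dataset ID and column list mirror the console steps above; pass in
# your own account ID, data source ARN, and the SQL from the query box.
def build_dataset_request(account_id, datasource_arn, sql):
    columns = ["QuickSightObjectType", "QuickSightID", "name",
               "principal", "permission", "dashboard_name_w_id"]
    return {
        "AwsAccountId": account_id,
        "DataSetId": "quicksight-lineage-permissions",  # placeholder ID
        "Name": "QuickSight Lineage Permissions",
        "ImportMode": "SPICE",  # matches the Query mode selected above
        "PhysicalTableMap": {
            "permissions": {
                "CustomSql": {
                    "DataSourceArn": datasource_arn,
                    "Name": "QuickSight Lineage Permissions",
                    "SqlQuery": sql,
                    "Columns": [{"Name": c, "Type": "STRING"} for c in columns],
                }
            }
        },
    }
```

The resulting dictionary can then be passed to `boto3.client("quicksight").create_data_set(**params)`.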

Visualizing your permissions

You now add a new visual to display permissions.

  1. Choose the table visual type.
  2. Choose (single-click) name, principal, and permission in the Field List.
  3. In the navigation pane, choose Filter.
  4. Choose +.
  5. Choose quicksightobjecttype.
  1. Choose the new filter.
  2. Deselect Select all.
  3. Select dashboard.
  4. Choose Apply.
  1. Choose Close.
  2. Move the new permissions visual so it’s to the right of the dashboard visual.
  1. On the new permissions visual, choose the menu options (…).
  2. Choose Duplicate visual.
  3. Repeat this step four times.
  4. Modify the quicksightobjecttype filter on each new permission visual so you have one visual for each asset type.
  5. Move the visual to the right of the corresponding asset type visual.

Creating parameters for filtering

At this point all the visuals are created; next you need to create a parameter. You can simplify the following steps by using the new simplified filter control creation process. For more information, see Amazon QuickSight adds support for on-sheet filter controls. The following steps still work fine, but to add filter controls to an analysis, you don’t need to create parameters anymore.

  1. Navigate to the Parameters menu.
  2. Choose Create one.
  3. For Name, enter DashboardNameWithID.
  4. Choose Create.
  1. Choose Create a new control for a filter or a calculated field.
  2. For Display name, enter Dashboard Name with ID.
  3. For Style, choose Single select drop down.
  4. For Values, select Link to a data set field.
  5. For Select a data set, choose QuickSight Lineage Permissions.
  6. For Select a column, choose dashboard_name_w_id.
  7. Choose Add.
  1. Choose the first visual (Count of Records by Dashboard_name, Template_name, Dataset_name, Datasource_name, and Analysis_name).
  2. Add a filter in the dashboard_name_w_id field.
  3. Choose the newly added filter.
  4. Set the filter scope to All visuals.
  5. For Filter type, choose Custom filter.
  6. Select Use parameters.
  7. From the drop-down menu, choose DashboardNameWithID.
  8. Choose Apply.
  9. Choose Close.
  1. Choose the first permissions visual (Permission, Principal, and Name).
  2. Add a filter in the dashboard_name_w_id field.
  3. Set the filter scope to All visuals.
  4. For Filter type, choose Custom filter.
  5. Select Use parameters.
  6. From the drop-down menu, choose DashboardNameWithID.
  7. Choose Apply.
  8. Choose Close.

The analysis build is complete and can be published as a dashboard.
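Publishing can likewise be automated: the usual API route is create_template from the analysis, then create_dashboard from that template. A minimal parameter-building sketch, assuming a template ARN and dataset placeholders you have already defined (the dashboard ID and name are illustrative):

```python
# Sketch: assemble create_dashboard parameters that publish the analysis
# from a template. The dashboard ID/name and the template/dataset ARNs
# are illustrative placeholders, not values from this walkthrough.
def build_dashboard_request(account_id, template_arn, dataset_refs):
    return {
        "AwsAccountId": account_id,
        "DashboardId": "quicksight-lineage",  # placeholder ID
        "Name": "QuickSight Lineage",
        "SourceEntity": {
            "SourceTemplate": {
                "Arn": template_arn,
                "DataSetReferences": [
                    {"DataSetPlaceholder": placeholder, "DataSetArn": arn}
                    for placeholder, arn in dataset_refs
                ],
            }
        },
    }
```

The dictionary can be passed to `boto3.client("quicksight").create_dashboard(**params)`.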

Creating additional visuals

You can also create additional visuals for different use cases.

Visualizing SPICE usage across all your SPICE datasets

To visualize SPICE usage across your SPICE datasets, complete the following steps:

  1. Use the QuickSight Lineage dataset and choose the donut chart visual.
  2. For Group/Color, add dataset_name.
  3. For Value, add dataset_spicesize.
  4. Change the aggregation of dataset_spicesize to Average, because a dataset can appear multiple times in the data if it is reused across multiple dashboards.

This visual can be useful to track down what is consuming SPICE storage.
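The same SPICE usage numbers can also be read directly from the APIs: describe_data_set reports ConsumedSpiceCapacityInBytes for each SPICE dataset. A small aggregation sketch over already-fetched responses (no AWS calls), keying each dataset once to mirror the Average aggregation above:

```python
# Sketch: SPICE bytes per dataset from describe_data_set responses.
# Each dataset is keyed once by its ID, mirroring the Average aggregation
# used in the donut chart when a dataset appears in several lineage rows.
def spice_usage_by_dataset(descriptions):
    usage = {}
    for desc in descriptions:
        ds = desc["DataSet"]
        if ds.get("ImportMode") == "SPICE":
            usage[ds["DataSetId"]] = (ds["Name"],
                                      ds.get("ConsumedSpiceCapacityInBytes", 0))
    return {name: size for name, size in usage.values()}
```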

Visualizing SPICE refreshes by hour

To visualize SPICE refreshes by hour, complete the following steps:

  1. Use the QuickSight Lineage dataset to create a vertical stacked bar chart.
  2. For X axis, add dataset_lastupdatetime aggregated by HOUR.
  3. For Value, add dataset_id aggregated by Count distinct.
  4. For Group/Color, add dataset_name.
  5. Create a filter on dataset_importmode equal to SPICE.

This visual can be useful to see when all the SPICE dataset refreshes last occurred. The source data is a snapshot in time, so you need to refresh it by running the Lambda function on a regular basis.
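The hourly grouping in this chart is straightforward to reproduce in code as well; a sketch over list_data_sets summaries (which carry LastUpdatedTime and ImportMode), truncating each timestamp to the hour:

```python
from collections import Counter

# Sketch: count SPICE dataset refreshes per hour from list_data_sets
# summaries, the same HOUR grouping the stacked bar chart applies.
def refreshes_by_hour(dataset_summaries):
    hours = Counter()
    for ds in dataset_summaries:
        if ds.get("ImportMode") == "SPICE":
            ts = ds["LastUpdatedTime"].replace(minute=0, second=0, microsecond=0)
            hours[ts] += 1
    return hours
```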

Cleaning up

To avoid incurring future charges, delete the resources you created in this walkthrough by deleting the CloudFormation stack. Also, be sure to delete the analysis and dataset (to free up SPICE usage).

Conclusion

In this post, we showed how to use the QuickSight APIs to build a dashboard that centrally tracks the lineage and permissions of your QuickSight assets. We also created some visuals to display SPICE usage by data set as well as the last refresh time per data set, allowing you to view the health of your SPICE refreshes and to free up SPICE capacity by cleaning up older data sets.

Give this technique of building administrative dashboards from data collected via the QuickSight APIs a try, and share your feedback and questions in the comments.


About the Authors

Jesse Gebhardt is a senior global business development manager focused on analytics. He has spent over 10 years in the Business Intelligence industry. At AWS, he aids customers around the globe gain insight and value from the data they have stored in their data lakes and data warehouses. Jesse lives in sunny Phoenix, and is an amateur electronic music producer.

Arun Santhosh is a Specialized World Wide Solution Architect for Amazon QuickSight. Arun started his career at IBM as a developer and progressed on to be an Application Architect. Later, he worked as a Technical Architect at Cognizant. Business Intelligence has been his core focus in these prior roles as well.

Shawn Koupal is an Enterprise Analytics IT Architect at Best Western International, Inc.