AWS Big Data Blog

Monitor your Amazon QuickSight deployments using the new Amazon CloudWatch integration

Amazon QuickSight is a fully managed, cloud-native business intelligence (BI) service that makes it easy to connect to your data, create interactive dashboards, and share these with tens of thousands of users, either within the QuickSight interface or embedded in software as a service (SaaS) applications or web portals. With QuickSight providing insights to power your daily decisions, it becomes more important than ever for administrators and developers to ensure their QuickSight dashboards and data refreshes are operating smoothly and as expected.

We recently announced the availability of QuickSight metrics within Amazon CloudWatch, which enables developers and administrators to monitor the availability and performance of their QuickSight deployments in real time. With the availability of metrics related to dashboard views, visual load times, and data ingestion details into SPICE (the QuickSight in-memory data store), developers and administrators can ensure that end-users of QuickSight deployments have an uninterrupted experience with relevant data. CloudWatch integration is now available in QuickSight Enterprise Edition in all supported Regions. These metrics can be accessed via CloudWatch, and allow QuickSight deployments to be monitored similarly to other application deployments on AWS, with the ability to generate alarms on failures and to slice and dice historical events to view trends and identify optimization opportunities. Metrics are kept for a period of 15 months, allowing them to be used for historical comparison and trend analysis.

Feature overview

QuickSight emits the following metrics to track the performance and availability of dataset ingestions, dashboards, and visuals. In addition to individual asset metrics, QuickSight also emits aggregated metrics to track performance and availability of all dashboards and SPICE ingestions for an account in a Region.

| Metric | Description | Unit |
| --- | --- | --- |
| IngestionErrorCount | The number of failed ingestions. | Count |
| IngestionInvocationCount | The number of ingestions initiated. This includes scheduled and manual ingestions that are triggered through either the QuickSight console or through APIs. | Count |
| IngestionLatency | The time from ingestion initiation to completion. | Second |
| IngestionRowCount | The number of successful row ingestions. | Count |
| DashboardViewCount | The number of times that a dashboard has been loaded or viewed. This includes all access patterns such as web, mobile, and embedded. | Count |
| DashboardViewLoadTime | The time that it takes a dashboard to load. The time is measured starting from the navigation to the dashboard to when all visuals within the view port are rendered. | Millisecond |
| VisualLoadTime | The time it takes for a QuickSight visual to load, including the round-trip query time from the client to QuickSight and back to the client. | Millisecond |
| VisualLoadErrorCount | The number of times a QuickSight visual fails to complete a data load. | Count |

Access QuickSight metrics in CloudWatch

Use the following procedure to access QuickSight metrics in CloudWatch:

  1. Sign in to the AWS account associated with your QuickSight account.
  2. In the upper-left corner of the AWS Console Home, choose Services, and then choose CloudWatch.
  3. On the CloudWatch console, under Metrics in the navigation pane, choose All metrics, and choose QuickSight.
  4. To access individual metrics, choose Dashboard metrics, Visual metrics, or Ingestion metrics.
  5. To access aggregate metrics, choose Aggregate metrics.
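The same metrics can also be discovered programmatically. The following is a minimal sketch, assuming the AWS SDK for Python (boto3) is installed, credentials are configured, and QuickSight publishes under the `AWS/QuickSight` namespace. The helper names are hypothetical and chosen for illustration. Grouping by dimension names loosely mirrors the console categories; treating metrics without dimensions as the aggregate ones is an assumption.

```python
from collections import defaultdict

# Assumption: QuickSight metrics are published under this CloudWatch namespace.
NAMESPACE = "AWS/QuickSight"


def group_by_dimensions(metrics):
    """Group metric names by the sorted tuple of dimension names they carry."""
    groups = defaultdict(set)
    for metric in metrics:
        dim_names = tuple(sorted(d["Name"] for d in metric.get("Dimensions", [])))
        groups[dim_names].add(metric["MetricName"])
    return {dims: sorted(names) for dims, names in groups.items()}


def list_quicksight_metrics(region_name):
    """Fetch all QuickSight metrics in one Region (needs boto3 and credentials)."""
    import boto3  # imported lazily so group_by_dimensions works without the SDK

    client = boto3.client("cloudwatch", region_name=region_name)
    metrics = []
    paginator = client.get_paginator("list_metrics")
    for page in paginator.paginate(Namespace=NAMESPACE):
        metrics.extend(page["Metrics"])
    return metrics
```

Calling `group_by_dimensions(list_quicksight_metrics("us-east-1"))` would return a dictionary keyed by dimension names, which gives a quick view of which metrics are reported per asset and which are reported for the whole account.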

Visualize metrics on the CloudWatch console

You can use the CloudWatch console to visualize metric data generated from your QuickSight deployment. For more information, see Graphing metrics.
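If you prefer to pull the underlying data points instead of graphing them on the console, something like the following sketch could work. It assumes boto3, the `AWS/QuickSight` namespace, and a `DashboardId` dimension on per-dashboard metrics. The function names and the 14-day window are illustrative choices, not part of the feature.

```python
from datetime import datetime, timedelta, timezone

NAMESPACE = "AWS/QuickSight"  # assumed namespace for QuickSight metrics


def daily_view_request(dashboard_id, days=14, now=None):
    """Build GetMetricStatistics parameters for daily view totals of one dashboard."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": NAMESPACE,
        "MetricName": "DashboardViewCount",
        # Assumption: per-dashboard metrics carry a DashboardId dimension.
        "Dimensions": [{"Name": "DashboardId", "Value": dashboard_id}],
        "StartTime": end - timedelta(days=days),
        "EndTime": end,
        "Period": 86400,  # one data point per day
        "Statistics": ["Sum"],
    }


def total_views(datapoints):
    """Add up the daily sums returned by CloudWatch."""
    return sum(point["Sum"] for point in datapoints)


def fetch_daily_views(dashboard_id, region_name):
    """Return data points sorted by time (needs boto3 and credentials)."""
    import boto3

    client = boto3.client("cloudwatch", region_name=region_name)
    response = client.get_metric_statistics(**daily_view_request(dashboard_id))
    return sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
```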

Create an alarm using CloudWatch console

You can also create a CloudWatch alarm that monitors CloudWatch metrics for your QuickSight assets. CloudWatch automatically sends you a notification when the metric reaches a threshold you specify. For examples, see Using Amazon CloudWatch alarms.

Use case overview

Let’s consider a fictional company, OkTank, an independent software vendor (ISV) in the healthcare space. Hospitals across different regions of the country use OkTank’s application to manage their revenue. Hundreds of hospitals, with thousands of healthcare employees, access the application, which embeds multiple QuickSight dashboards covering business operations. In addition, OkTank provides an embedded authoring experience so that each hospital’s in-house data analysts can build their own dashboards for their BI needs.

All the dashboards are powered by a database cluster, and they have multiple ingestion schedules. Because their QuickSight usage is growing and hospitals’ in-house data analysts are contributing by bringing in more data and their own dashboards, OkTank wants to monitor and make sure they’re providing their readers with a consistent, performant, and uninterrupted experience on QuickSight.

OkTank has some key monitoring needs that they deem critical:

  • Monitoring console – They want a general monitoring console where they can monitor reader engagement in their account, the most popular dashboards, overall visual load performance, and overall ingestion performance.
  • Dashboard adoption and performance – They want to monitor traffic growth with respect to performance to make sure they’re meeting scaling needs.
  • Visual performance and availability – They have some visuals with complex queries and would like to make sure these queries are running fast enough without failures so that their readers have a performant and uninterrupted experience.
  • Ingestion failures – They want to be alerted if any scheduled ingestion fails, so that they can act right away and make sure their readers don’t experience any interruptions.

In the following sections, we discuss how OkTank meets each monitoring need in more detail.

Monitoring console

OkTank wants to have a general monitoring console to look at key KPIs, monitor reader engagement, and make sure their readers are getting a consistent and uninterrupted experience with QuickSight.

To create a monitoring console and add a KPI metric to it, OkTank takes the following steps:

  1. On the CloudWatch console, under Metrics in the navigation pane, choose Dashboards.
  2. Choose Create dashboard.
  3. Enter the dashboard name and choose Create dashboard.
  4. On the blank dashboard landing page, choose either Add a first widget or the plus sign to add a widget.
  5. In the Add widget section, choose Number.

  6. On the Browse tab, choose QuickSight.
  7. Choose Aggregate metrics.
  8. Select DashboardViewCount.
  9. Choose Create widget.
  10. On the options menu of the newly created widget, choose Edit.
  11. Enter the desired widget name.
  12. For Statistic, choose Sum.
  13. For Period, choose 1 day.
  14. Choose Update widget.

Using the same widget options, OkTank adds more KPIs to the console, such as the average dashboard load time across the Region during the day and the 10 most-viewed dashboards, to complete their monitoring console.
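The console steps for the Number widget can also be expressed as a dashboard definition. The sketch below builds the JSON body for a single-value widget that sums the aggregate DashboardViewCount over one day. It assumes the `AWS/QuickSight` namespace and that the aggregate metric is published without dimensions. The dashboard name, widget title, Region, and function names are placeholders.

```python
import json

NAMESPACE = "AWS/QuickSight"  # assumed namespace for QuickSight metrics


def number_widget(title, metric_name, stat="Sum", period=86400, region="us-east-1"):
    """Describe a single-value (Number) widget for an aggregate metric."""
    return {
        "type": "metric",
        "x": 0, "y": 0, "width": 6, "height": 4,
        "properties": {
            "title": title,
            "view": "singleValue",
            # Assumption: aggregate metrics have no dimensions.
            "metrics": [[NAMESPACE, metric_name]],
            "stat": stat,
            "period": period,  # 86400 seconds = 1 day
            "region": region,
        },
    }


def dashboard_body(widgets):
    """Serialize widgets into the JSON string that PutDashboard expects."""
    return json.dumps({"widgets": widgets})


def publish_dashboard(name, widgets, region_name):
    """Create or overwrite the dashboard (needs boto3 and credentials)."""
    import boto3

    client = boto3.client("cloudwatch", region_name=region_name)
    client.put_dashboard(DashboardName=name, DashboardBody=dashboard_body(widgets))
```

Keeping the widget builder separate from the publish call makes it easy to review the generated JSON before anything is written to the account.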

Dashboard adoption and performance

OkTank has some critical dashboards, and they want to monitor the adoption of each one and track its loading performance to make sure they can meet scaling needs.

They take the following steps to create a widget:

  1. On the monitoring console, choose the plus sign.
  2. In the Add widget section, choose Line.
  3. In the Add to this dashboard section, choose Metrics.
  4. On the Browse tab, choose QuickSight.
  5. Choose Dashboard metrics.
  6. Choose the DashboardViewCount and DashboardViewLoadTime metrics of the critical dashboard.
  7. Choose Create widget.

The newly created widget shows the critical dashboard’s view counts and load times together on one chart.
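As a rough programmatic equivalent, the following sketch describes a line widget that plots both metrics for one dashboard. Because view counts and load times have different units, it places load time on the right-hand axis. That layout, the choice of Sum and Average statistics, the `AWS/QuickSight` namespace, and the `DashboardId` dimension are assumptions made for illustration.

```python
import json

NAMESPACE = "AWS/QuickSight"  # assumed namespace for QuickSight metrics


def adoption_widget(dashboard_id, title, region="us-east-1", period=3600):
    """Describe a line widget with views (left axis) and load time (right axis)."""
    dims = ["DashboardId", dashboard_id]  # assumed per-dashboard dimension
    return {
        "type": "metric",
        "x": 0, "y": 4, "width": 12, "height": 6,
        "properties": {
            "title": title,
            "view": "timeSeries",
            "stacked": False,
            "region": region,
            "period": period,
            "metrics": [
                # Count metric: total views per period.
                [NAMESPACE, "DashboardViewCount", *dims,
                 {"stat": "Sum", "yAxis": "left"}],
                # Millisecond metric: average load time per period.
                [NAMESPACE, "DashboardViewLoadTime", *dims,
                 {"stat": "Average", "yAxis": "right"}],
            ],
        },
    }


# Example: inspect the JSON for a placeholder dashboard ID.
widget_json = json.dumps(adoption_widget("example-dashboard-id", "Critical dashboard"))
```

The resulting dictionary can be added to the `widgets` list of a dashboard body passed to the CloudWatch `PutDashboard` API.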

Visual performance and availability

OkTank has some visuals that run complex queries while loading. They want to provide their readers with a consistent and uninterrupted experience. In addition, they want to be alerted if a query fails or takes longer than the desired runtime.

They take the following steps to monitor and set up an alarm:

  1. On the monitoring console, choose the plus sign.
  2. In the Add widget section, choose Line.
  3. In the Add to this dashboard section, choose Metrics.
  4. On the Browse tab, choose QuickSight.
  5. Choose Visual metrics.
  6. Choose the VisualLoadTime metric of the critical visual and configure the time period on the menu above the chart.
  7. To get alerted in case the critical visual fails to load due to query failure, choose the VisualLoadErrorCount metric.

    The newly created widget shows the visual’s load performance over the selected time frame.
  8. On the Graphed metrics tab, select the VisualLoadErrorCount metric.
  9. On the Actions menu, choose Create alarm.
  10. For Metric name, enter a name.
  11. Confirm that the value for DashboardId matches the dashboard that has the visual.

    In the Conditions section, OkTank wants to be notified when the error count is greater than or equal to 1.
  12. For Threshold type, select Static.
  13. Select Greater/Equal.
  14. Enter 1.
  15. Choose Next.
  16. In the Notification section, choose Select an existing SNS topic or Create a new topic.
  17. If you’re creating a new topic, provide a name for the topic and email addresses of recipients.
  18. Choose Create topic.
  19. Enter an alarm name and optional description.
  20. Choose Next.
  21. Verify the details and choose Create alarm.

The alarm is now available on the CloudWatch console. If the visual fails to load, the VisualLoadErrorCount value becomes 1 or more (depending on the number of times the dashboard is invoked) and the alarm state is set to In alarm.

Choose the alarm to get more details.

You can scroll down for more information about the alarm.

OkTank also receives an email to the email endpoint defined in the Amazon Simple Notification Service (Amazon SNS) topic.
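The alarm configured in the preceding steps could be defined in code along these lines. This is a sketch under several assumptions: the `AWS/QuickSight` namespace, the dimension names `DashboardId`, `SheetId`, and `VisualId` for visual metrics, a 5-minute evaluation period, and an SNS topic that already exists. The alarm name pattern and function names are hypothetical.

```python
def visual_error_alarm(dashboard_id, sheet_id, visual_id, sns_topic_arn, period=300):
    """Build PutMetricAlarm parameters: alarm when one or more load errors occur."""
    return {
        "AlarmName": f"quicksight-visual-load-errors-{visual_id}",
        "AlarmDescription": "A QuickSight visual failed to complete a data load.",
        "Namespace": "AWS/QuickSight",  # assumed namespace
        "MetricName": "VisualLoadErrorCount",
        # Assumption: visual metrics are identified by these three dimensions.
        "Dimensions": [
            {"Name": "DashboardId", "Value": dashboard_id},
            {"Name": "SheetId", "Value": sheet_id},
            {"Name": "VisualId", "Value": visual_id},
        ],
        "Statistic": "Sum",
        "Period": period,
        "EvaluationPeriods": 1,
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",  # Greater/Equal
        # Periods with no data points mean no errors were reported.
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }


def create_alarm(params, region_name):
    """Create or update the alarm (needs boto3 and credentials)."""
    import boto3

    boto3.client("cloudwatch", region_name=region_name).put_metric_alarm(**params)
```

Setting `TreatMissingData` to `notBreaching` is a design choice for sparse error metrics: without it, periods in which the visual is never loaded would leave the alarm in an insufficient-data state.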

Ingestion failures

OkTank wants to be alerted if any scheduled SPICE data ingestion fails, so that they can act right away and make sure their readers don’t experience any interruptions. This allows the administrator to find out the root cause of the SPICE ingestion failure (for example, an overloaded database instance) and fix it to ensure the latest data is available in the dependent dashboards.

They take the following steps to monitor and set up an alarm:

  1. On the monitoring console, choose the plus sign.
  2. In the Add widget section, choose Line.
  3. In the Add to this dashboard section, choose Metrics.
  4. On the Browse tab, choose QuickSight.
  5. Choose Ingestion metrics.
  6. Choose the IngestionErrorCount metric of the dataset and configure the time period on the menu above the chart.
  7. Follow the same steps as in the previous section to set up an alarm.

When ingestion fails for the dataset, the alarm changes to the In alarm state and OkTank receives an email notification.

The following screenshot shows an example of the email.
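When many datasets need the same alarm, generating the definitions in a loop may be easier than repeating the console steps. The sketch below assumes the `AWS/QuickSight` namespace, a `DatasetId` dimension on ingestion metrics, and an hourly period. The dataset IDs, alarm name pattern, and function names are placeholders.

```python
def ingestion_failure_alarms(dataset_ids, sns_topic_arn, period=3600):
    """Build one PutMetricAlarm parameter set per dataset."""
    return [
        {
            "AlarmName": f"quicksight-ingestion-failed-{dataset_id}",
            "Namespace": "AWS/QuickSight",  # assumed namespace
            "MetricName": "IngestionErrorCount",
            # Assumption: ingestion metrics are identified by DatasetId.
            "Dimensions": [{"Name": "DatasetId", "Value": dataset_id}],
            "Statistic": "Sum",
            "Period": period,
            "EvaluationPeriods": 1,
            "Threshold": 1.0,
            "ComparisonOperator": "GreaterThanOrEqualToThreshold",
            "TreatMissingData": "notBreaching",
            "AlarmActions": [sns_topic_arn],
        }
        for dataset_id in dataset_ids
    ]


def create_alarms(alarms, region_name):
    """Create or update every alarm (needs boto3 and credentials)."""
    import boto3

    client = boto3.client("cloudwatch", region_name=region_name)
    for params in alarms:
        client.put_metric_alarm(**params)
```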

Conclusion

With QuickSight metrics in CloudWatch, QuickSight developers and administrators can observe and respond to the availability and performance of their QuickSight ecosystem in near-real time. They can monitor dataset ingestions, dashboards, and visuals to provide end-users of QuickSight and applications that embed QuickSight dashboards with a consistent, performant, and uninterrupted experience.

Try out QuickSight metrics in Amazon CloudWatch to monitor your Amazon QuickSight deployments, and share your feedback and questions in the comments.


About the Authors

Mayank Agarwal is a product manager for Amazon QuickSight, AWS’ cloud-native, fully managed BI service. He focuses on account administration, governance, and developer experience. He started his career as an embedded software engineer developing handheld devices. Prior to QuickSight, he led engineering teams at Credence ID, developing custom mobile embedded device and web solutions using AWS services that make biometric enrollment and identification fast, intuitive, and cost-effective for government, healthcare, and transaction security applications.

Raj Jayaraman is a Senior Specialist Solutions Architect for Amazon QuickSight. Raj focuses on helping customers develop sample dashboards, embed analytics, and adopt BI design patterns and best practices.