This Guidance deploys a configurable, end-to-end set of AWS data and analytics services to visualize your data. Previously, integrating multiple data and analytics components meant manually deploying and configuring each service, a time-consuming process that often required engineering effort. This Guidance is especially helpful for quickly iterating on, publishing, and testing analytics projects on the way to wider-scale implementations. With Accelerating Analytics on AWS, you can launch a single AWS CloudFormation template that deploys and configures multiple integrated AWS data and analytics services, so you can scale users’ access to data and insights quickly and with fewer resources.
An AWS Glue crawler crawls the Amazon S3 data bucket to obtain metadata about the user data. The crawler runs every day on a schedule and updates the Table metadata if new data is available in the bucket.
The crawler updates the metadata in an AWS Glue Data Catalog as a Database and Table, with permissions managed by Lake Formation.
Amazon Athena queries the Table using SQL queries issued by QuickSight when visuals are loaded.
A QuickSight subscription is created if one does not already exist. A new QuickSight Dataset and Analysis are created, using the IAM role to issue queries to Athena and access the data in the Amazon S3 bucket.
You can access the QuickSight analysis and start creating visuals to gain insight into your data.
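The crawler step above could be expressed with the AWS SDK for Python roughly as follows. This is a minimal sketch under stated assumptions, not the template that this Guidance deploys: the crawler name, role ARN, database name, bucket path, and the 01:00 UTC run time are all hypothetical placeholders. The request is built as a plain dictionary so it can be inspected without AWS credentials, and `boto3` is imported only when the call is actually sent.

```python
from typing import Any, Dict


def build_crawler_request(
    name: str, role_arn: str, database: str, s3_path: str
) -> Dict[str, Any]:
    """Build keyword arguments for the AWS Glue CreateCrawler API."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        # Run once per day at 01:00 UTC (the time of day is an assumption).
        "Schedule": "cron(0 1 * * ? *)",
        # Update existing table metadata when new data changes the schema.
        "SchemaChangePolicy": {
            "UpdateBehavior": "UPDATE_IN_DATABASE",
            "DeleteBehavior": "LOG",
        },
    }


def create_crawler(request: Dict[str, Any]) -> None:
    """Send the request to AWS Glue (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder runs without the SDK

    boto3.client("glue").create_crawler(**request)


# All values below are hypothetical placeholders.
crawler_request = build_crawler_request(
    name="example-daily-crawler",
    role_arn="arn:aws:iam::123456789012:role/example-glue-role",
    database="example_db",
    s3_path="s3://example-data-bucket/",
)
print(crawler_request["Schedule"])
```

Calling `create_crawler(crawler_request)` would register the crawler; keeping the builder separate from the API call makes the schedule and target easy to review before anything is provisioned.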
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
The AWS services used in this Guidance support logging to Amazon CloudWatch or AWS CloudTrail, where the information can be tracked and reviewed for additional customization. This provides a fast and easy way to review errors and respond to incidents appropriately.
The AWS Glue database and table created in this Guidance are secured using Lake Formation, and access is granted only to the IAM role used by QuickSight. This allows for secure authentication and authorization for both people and machine access.
The resources provisioned in this Guidance are private by default, and can only be modified with IAM identity-based policies.
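To illustrate what granting table access to a single IAM role through Lake Formation might look like, here is a hedged sketch using the request shape of the Lake Formation `GrantPermissions` API. The role ARN, database name, and table name are hypothetical, and the choice of `SELECT` and `DESCRIBE` is an assumption about the minimum a read-only consumer would need, not a statement of what this Guidance grants.

```python
from typing import Any, Dict


def build_grant_request(role_arn: str, database: str, table: str) -> Dict[str, Any]:
    """Build keyword arguments for the Lake Formation GrantPermissions API."""
    return {
        # The single principal allowed to read the table.
        "Principal": {"DataLakePrincipalIdentifier": role_arn},
        "Resource": {"Table": {"DatabaseName": database, "Name": table}},
        # Read-only access (assumed); no ALTER, DROP, or INSERT.
        "Permissions": ["SELECT", "DESCRIBE"],
    }


def grant_permissions(request: Dict[str, Any]) -> None:
    """Send the grant to Lake Formation (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder runs without the SDK

    boto3.client("lakeformation").grant_permissions(**request)


# All values below are hypothetical placeholders.
grant_request = build_grant_request(
    role_arn="arn:aws:iam::123456789012:role/example-quicksight-role",
    database="example_db",
    table="example_table",
)
print(grant_request["Permissions"])
```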
Because the AWS services used in this Guidance are serverless and use AWS managed endpoints and DNS, the implementation can depend on the high availability and resiliency to failure that are inherent in those services.
AWS CloudFormation automates deployment and provisioning of resources. Upon failure of one resource, the implementation rolls back all other provisioned resources, ensuring you have a reliable application-level architecture.
Additionally, CloudFormation logs resource provisioning and errors that can be accessed using CloudTrail and CloudWatch. QuickSight sends an email to notify account administrators when significant events occur.
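The rollback and error-review behavior described above can be sketched with the CloudFormation API. This is an illustration only: the stack name and template URL are hypothetical placeholders, `CAPABILITY_NAMED_IAM` is assumed because the Guidance creates an IAM role, and the sample events are made-up data in the shape returned by `DescribeStackEvents`.

```python
from typing import Any, Dict, List


def build_stack_request(stack_name: str, template_url: str) -> Dict[str, Any]:
    """Build keyword arguments for the CloudFormation CreateStack API."""
    return {
        "StackName": stack_name,
        "TemplateURL": template_url,
        # Assumed: the template creates IAM resources and so needs this.
        "Capabilities": ["CAPABILITY_NAMED_IAM"],
        # Roll back every provisioned resource if any one of them fails.
        "OnFailure": "ROLLBACK",
    }


def failed_events(events: List[Dict[str, Any]]) -> List[str]:
    """Summarize stack events whose status ends in _FAILED."""
    return [
        f'{e["LogicalResourceId"]}: {e.get("ResourceStatusReason", "no reason given")}'
        for e in events
        if e["ResourceStatus"].endswith("_FAILED")
    ]


def deploy_and_review(request: Dict[str, Any]) -> List[str]:
    """Create the stack, then list failures (needs boto3 and credentials)."""
    import boto3  # imported lazily so the helpers run without the SDK

    cfn = boto3.client("cloudformation")
    cfn.create_stack(**request)
    events = cfn.describe_stack_events(StackName=request["StackName"])
    return failed_events(events["StackEvents"])


# Hypothetical placeholders and made-up sample events.
stack_request = build_stack_request(
    "example-analytics-stack",
    "https://example-bucket.s3.amazonaws.com/template.yaml",
)
sample_events = [
    {"LogicalResourceId": "DataBucket", "ResourceStatus": "CREATE_COMPLETE"},
    {
        "LogicalResourceId": "GlueCrawler",
        "ResourceStatus": "CREATE_FAILED",
        "ResourceStatusReason": "Role not found",
    },
]
print(failed_events(sample_events))
```

In practice a caller would wait for the stack to reach a terminal state before reading its events; that wait is omitted here to keep the sketch short.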
The services selected for this Guidance are purpose-built to handle advanced analytics. For example, QuickSight is an AWS managed serverless business intelligence (BI) service that integrates with Athena to query data in Amazon S3.
A QuickSight analysis, where you analyze and visualize your data, is created when this Guidance is deployed, helping you gain insights from your data. Based on Athena tables, also created by this Guidance, you can experiment and create additional tables to query data in Amazon S3 or other supported data sources.
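As a starting point for experimenting with the Athena tables, the following sketch builds a request for the Athena `StartQueryExecution` API. The database name, table name, and results location are hypothetical placeholders, and the simple `SELECT ... LIMIT` query is only an example of the kind of SQL that could be issued against the table.

```python
from typing import Any, Dict


def build_query_request(
    database: str, table: str, output_location: str, limit: int = 10
) -> Dict[str, Any]:
    """Build keyword arguments for the Athena StartQueryExecution API."""
    # Double quotes delimit identifiers in Athena SELECT statements.
    query = f'SELECT * FROM "{database}"."{table}" LIMIT {int(limit)}'
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        # Athena writes query results to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(request: Dict[str, Any]) -> str:
    """Start the query and return its execution ID (needs boto3)."""
    import boto3  # imported lazily so the builder runs without the SDK

    response = boto3.client("athena").start_query_execution(**request)
    return response["QueryExecutionId"]


# All values below are hypothetical placeholders.
query_request = build_query_request(
    database="example_db",
    table="example_table",
    output_location="s3://example-athena-results/",
)
print(query_request["QueryString"])
```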
This Guidance uses managed services with pay-as-you-go pricing, which removes the overhead of maintaining infrastructure and reduces cost. The AWS services used in this Guidance are also provisioned in the same AWS Region to reduce data transfer charges, and QuickSight does not accrue any data transfer charges.
All services in this Guidance are serverless and do not need to run for extended periods of time. To match demand with the minimum resources, this Guidance deploys an Amazon S3 bucket that contains only customer-provisioned data. The AWS Glue crawler runs once per day to check for new data, and Athena is invoked only when using QuickSight.
This Guidance invokes Athena only when users interact with the corresponding data sets in QuickSight, ensuring limited provisioning of resources. Additionally, Amazon S3 and Athena automatically scale to accommodate data that you provide.
This Guidance implements architecture patterns that maintain consistently high utilization of deployed resources. For example, the AWS Glue crawler is invoked only once every day to crawl customer data in Amazon S3 buckets and to update the AWS Glue Data Catalog. Finally, this Guidance uses serverless AWS services that do not require continuous hardware provisioning, making it a more sustainable architecture.
A detailed guide is provided for you to experiment with and use within your AWS account. It examines each stage of building the Guidance, including deployment, usage, and cleanup, to prepare it for deployment.
The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.