AWS Big Data Blog
Category: Analytics
Preprocessing Data in Amazon Kinesis Analytics with AWS Lambda
Kinesis Analytics now gives you the option to preprocess your data with AWS Lambda. This gives you a great deal of flexibility in defining what data gets analyzed by your Kinesis Analytics application. In this post, I discuss some common use cases for preprocessing, and walk you through an example to help highlight its applicability.
Build a Schema-On-Read Analytics Pipeline Using Amazon Athena
In this post, I show how to build a schema-on-read analytical pipeline, similar to the one used with relational databases, using Amazon Athena. The approach is completely serverless, which allows the analytical platform to scale as more data is stored and processed via the pipeline.
Amazon QuickSight Now Allows Users to Create Analyses from Dashboards and Import Custom Date Formats
Starting today, QuickSight allows users to save the contents of a dashboard as an analysis within their account. As a dashboard user, you can now create an analysis that contains all visuals from the dashboard.
Query and Visualize AWS Cost and Usage Data Using Amazon Athena and Amazon QuickSight
If you’ve ever wondered whether there is a serverless way to consume and query your AWS Cost and Usage report data, wonder no more. The answer is yes, and this post introduces that solution and shows how simple it is to deploy.
Create Custom AMIs and Push Updates to a Running Amazon EMR Cluster Using Amazon EC2 Systems Manager
In this post, I show how Systems Manager Automation can be used to automate the creation and patching of custom Amazon Linux AMIs for EMR. I also show how you can use Run Command to send commands to all nodes of a running EMR cluster.
Unite Real-Time and Batch Analytics Using the Big Data Lambda Architecture, Without Servers!
In this post, I show you how you can use AWS services like AWS Glue to build a Lambda Architecture completely without servers. I use a practical demonstration to examine the tight integration between serverless services on AWS and create a robust data processing Lambda Architecture system.
Amazon QuickSight Now Supports Search, Filter Groups, and Amazon S3 Analytics Connector
I’m excited to share information about some new features in Amazon QuickSight. You can now search for datasets, analyses, and dashboards, and create filter groups with multiple filter conditions that are evaluated together using the OR operation. You can also use the built-in Amazon S3 analytics connector to visualize your S3 storage access patterns across multiple S3 buckets and configurations within a single Amazon QuickSight dashboard, which helps you optimize for cost.
Analyzing Salesforce Data with Amazon QuickSight
In this post, we will walk through creating a new data set based on Salesforce data, creating your analysis and adding visuals, creating an Amazon QuickSight dashboard, and working with filters.
From Data Lake to Data Warehouse: Enhancing Customer 360 with Amazon Redshift Spectrum
Achieving a 360° view of your customer has become increasingly challenging as companies embrace omni-channel strategies, engaging customers across websites, mobile, call centers, social media, physical sites, and beyond. The promise of a web where online and physical worlds blend makes understanding your customers more challenging, but also more important. Businesses that are successful in this […]
Analyzing AWS Cost and Usage Reports with Looker and Amazon Athena
In this post, I walk through setting up the data pipeline for cost and usage reports, Amazon S3, and Athena, and discuss some of the most common levers for cost savings. I surface tables through Looker, which comes with a host of pre-built data models and dashboards to make analysis of your cost and usage data simple and intuitive.