AWS Big Data Blog
Category: Analytics
Meet the Amazon EMR Team this Friday at a Tech Talk & Networking Event in Mountain View
Want to change the world with Big Data and Analytics? Come join us on the Amazon EMR team in Amazon Web Services! Meet the Amazon EMR team this Friday, April 7th, from 5:00 – 7:30 PM at Michael’s at Shoreline in Mountain View. We’ll feature short tech talks by EMR leadership who will talk about the past, […]
Encrypt and Decrypt Amazon Kinesis Records Using AWS KMS
Customers with strict compliance or data security requirements often require data to be encrypted at all times, including at rest or in transit within the AWS cloud. This post shows you how to build a real-time streaming application using Kinesis in which your records are encrypted while at rest or in transit. Amazon Kinesis overview […]
Top 10 Performance Tuning Tips for Amazon Athena
February 2024: This post was reviewed and updated to reflect changes in Amazon Athena engine version 3, including cost-based optimization and query result reuse. Amazon Athena is an interactive analytics service built on open source frameworks that make it straightforward to analyze data stored using open table and file formats in Amazon Simple Storage Service […]
Running R on Amazon Athena
This blog post has been translated into Japanese. Data scientists are often concerned about managing the infrastructure behind big data platforms while running SQL on R. Amazon Athena is an interactive query service that works directly with data stored in S3 and makes it easy to analyze data using standard SQL without the need to […]
Amazon Redshift Monitoring Now Supports End User Queries and Canaries
Ian Meyers is a Solutions Architecture Senior Manager with AWS. The serverless Amazon Redshift Monitoring utility lets you gather important performance metrics from your Redshift cluster’s system tables and persist the results in Amazon CloudWatch. This serverless solution leverages AWS Lambda to schedule custom SQL queries and process the results. With this utility, you can use […]
Analyzing VPC Flow Logs using Amazon Athena and Amazon QuickSight
February 2, 2022: Blog updated by Chaitanya Shah. February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. Organizations of different sizes, whether they migrate their applications to the cloud or build applications born in the cloud, make use of various cloud services to innovate and […]
Analyze Security, Compliance, and Operational Activity Using AWS CloudTrail and Amazon Athena
As organizations move their workloads to the cloud, audit logs provide a wealth of information on the operations, governance, and security of assets and resources. As the complexity of the workloads increases, so does the volume of audit logs being generated. It becomes increasingly difficult for organizations to analyze and understand what is happening in […]
Harmonize, Search, and Analyze Loosely Coupled Datasets on AWS
September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details. You have come up with an exciting hypothesis, and now you are keen to find and analyze as much data as possible to prove (or refute) it. There are many datasets that might be applicable, but they have been created […]
Scheduled Refresh for SPICE Data Sets on Amazon QuickSight
Jose Kunnackal is a Senior Product Manager for Amazon QuickSight. This blog post has been translated into Japanese. In November 2016, we launched Amazon QuickSight, a cloud-powered business analytics service that lets you quickly and easily visualize your data. QuickSight uses SPICE (Super-fast, Parallel, In-Memory Calculation Engine), a fully managed data store that enables blazing […]
Create Tables in Amazon Athena from Nested JSON and Mappings Using JSONSerDe
July 2024: This post was reviewed and updated for accuracy. February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. Most systems use JavaScript Object Notation (JSON) to log event information. Although it’s efficient and flexible, deriving information from JSON is […]









