AWS Big Data Blog

Category: Analytics

Build an optimized self-service interactive analytics platform with Amazon EMR Studio

Data engineers and data scientists are dependent on distributed data processing infrastructure like Amazon EMR to perform data processing and advanced analytics jobs on large volumes of data. In most mid-size and enterprise organizations, cloud operations teams own procuring, provisioning, and maintaining the IT infrastructures, and their objectives and best practices differ from the data […]

How Hudl built a cost-optimized AWS Glue pipeline with Apache Hudi datasets

This is a guest blog post co-written with Addison Higley and Ramzi Yassine from Hudl. Hudl Agile Sports Technologies, Inc. is a Lincoln, Nebraska based company that provides tools for coaches and athletes to review game footage and improve individual and team play. Its initial product line served college and professional American football teams. Today, […]

How SOCAR built a streaming data pipeline to process IoT data for real-time analytics and control

August 30, 2023: Amazon Kinesis Data Analytics has been renamed to Amazon Managed Service for Apache Flink. Read the announcement in the AWS News Blog and learn more. SOCAR is the leading Korean mobility company with strong competitiveness in car-sharing. SOCAR has become a comprehensive mobility platform in collaboration with Nine2One, an e-bike sharing service, […]

Learn more about Apache Flink and Amazon Kinesis Data Analytics with three new videos

August 30, 2023: Amazon Kinesis Data Analytics has been renamed to Amazon Managed Service for Apache Flink. Read the announcement in the AWS News Blog and learn more. Amazon Kinesis Data Analytics is a fully managed service for Apache Flink that reduces the complexity of building, managing, and integrating Apache Flink applications with other AWS […]

Enrich VPC Flow Logs with resource tags and deliver data to Amazon S3 using Amazon Kinesis Data Firehose

February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. VPC Flow Logs is an AWS feature that captures information about the network traffic flows going to and from network interfaces in Amazon Virtual Private Cloud (Amazon VPC). Visibility to the network […]

How Kyligence Cloud uses Amazon EMR Serverless to simplify OLAP

This post was co-written with Daniel Gu and Yolanda Wang, from Kyligence. Today, more than ever, organizations realize that modern business runs on data—almost all our interactions with business are based on data, and organizations must use analytics to understand, plan, and improve their operations. That is where Online Analytical Processing (OLAP) comes in. OLAP […]

Field-level security in Amazon OpenSearch Service

Amazon OpenSearch Service is fully open-source search and analytics engine that securely unlocks real-time search, monitoring, and analysis of business and operational data for use cases like application monitoring, log analytics, observability, and website search. But what if you have personal identifiable information (PII) data in your log data? How do you control and audit […]

Reduce cost and improve query performance with Amazon Athena Query Result Reuse

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon Simple Storage Service (Amazon S3) using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run on datasets at petabyte scale. You can use Athena to query […]

What’s new with Amazon QuickSight at AWS re:Invent 2022

AWS re:Invent is a learning conference hosted by AWS for the global cloud computing community. This year’s re:Invent will be held in Las Vegas, Nevada, from November 28 to December 2. Amazon QuickSight is the most popular cloud-native serverless BI service. This post walks you through the details of all QuickSight-related sessions and activities to […]

California State University Chancellor’s Office reduces cost and improves efficiency using Amazon QuickSight for streamlined HR reporting in higher education

The California State University Chancellor’s Office (CSUCO) sits at the center of America’s most significant and diverse 4-year universities. The California State University (CSU) serves approximately 477,000 students and employs more than 55,000 staff and faculty members across 23 universities and 7 off-campus centers. The CSU provides students with opportunities to develop intellectually and personally, […]