AWS Big Data Blog

Category: Amazon Kinesis

Stream Apache HBase edits for real-time analytics

Apache HBase is a non-relational database. To use the data, applications need to query the database to pull the data and changes from tables. In this post, we introduce a mechanism to stream Apache HBase edits into streaming services such as Apache Kafka or Amazon Kinesis Data Streams. In this approach, changes to data are […]

Unify log aggregation and analytics across compute platforms

Our customers want to make sure their users have the best experience running their application on AWS. To make this happen, you need to monitor and fix software problems as quickly as possible. Doing this gets challenging with the growing volume of data needing to be quickly detected, analyzed, and stored. In this post, we […]

Load CDC data by table and shape using Amazon Kinesis Data Firehose Dynamic Partitioning

Amazon Kinesis Data Firehose is the easiest way to reliably load streaming data into data lakes, data stores, and analytics services. Customers already use Amazon Kinesis Data Firehose to ingest raw data from various data sources using direct API call or by integrating Kinesis Data Firehose with Amazon Kinesis Data Streams including “change data capture” […]

Integrate Etleap with Amazon Redshift Streaming Ingestion (preview) to make data available in seconds

Amazon Redshift is a fully managed cloud data warehouse that makes it simple and cost-effective to analyze all your data using SQL and your extract, transform, and load (ETL), business intelligence (BI), and reporting tools. Tens of thousands of customers use Amazon Redshift to process exabytes of data per day and power analytics workloads. Etleap […]

The following diagram shows our solution architecture.

Effective data lakes using AWS Lake Formation, Part 2: Creating a governed table for streaming data sources

We announced the general availability of AWS Lake Formation transactions, row-level security, and acceleration at AWS re:Invent 2021. In Part 1 of this series, we explained how to set up a governed table and add objects to it. In this post, we expand on this example, and demonstrate how to ingest streaming data into governed tables […]

Now Available: Updated guidance on the Data Analytics Lens for AWS Well-Architected Framework

Nearly all businesses today require some form of data analytics processing, from auditing user access to generating sales reports. For all your analytics needs, the Data Analytics Lens for AWS Well-Architected Framework provides prescriptive guidance to help you assess your workloads and identify best practices aligned to the AWS Well-Architected Pillars: Operational Excellence, Security, Reliability, […]

Continuous monitoring with Sumo Logic using Amazon Kinesis Data Firehose HTTP endpoints

Amazon Kinesis Data Firehose streams data to AWS destinations such as Amazon Simple Storage Service (Amazon S3), Amazon Redshift, and Amazon OpenSearch Service. Additionally, Kinesis Data Firehose supports destinations to third-party partners. This ability to send data to third-party partners is a vital feature for customers who already use these AWS partner platforms; especially partners […]

Query your Amazon MSK topics interactively using Amazon Kinesis Data Analytics Studio

Amazon Kinesis Data Analytics Studio makes it easy to analyze streaming data in real time and build stream processing applications powered by Apache Flink using standard SQL, Python, and Scala. With a few clicks on the AWS Management Console, you can launch a serverless notebook to query data streams and get results in seconds. Kinesis […]

How NortonLifelock built a serverless architecture for real-time analysis of their VPN usage metrics

This post presents a reference architecture and optimization strategies for building serverless data analytics solutions on AWS using Amazon Kinesis Data Analytics. In addition, this post shows the design approach that the engineering team at NortonLifeLock took to build out an operational analytics platform that processes usage data for their VPN services, consuming petabytes of […]

Register now for Flink Forward Global, October 26-27, 2021

Flink Forward Global 2021 is a 2-day virtual conference for the Apache Flink and stream processing communities. Apache Flink is an open-source distributed engine for processing data streams that can support both streaming and batch workloads. Amazon Kinesis Data Analytics is a fully managed service for Apache Flink on AWS that reduces the complexity of […]