AWS Big Data Blog
Category: Amazon Kinesis
Build a streaming data mesh using Amazon Kinesis Data Streams
AWS provides two primary solutions for streaming ingestion and storage: Amazon Managed Streaming for Apache Kafka (Amazon MSK) or Amazon Kinesis Data Streams. These services are key to building a streaming mesh on AWS. In this post, we explore how to build a streaming mesh using Kinesis Data Streams.
Achieve full control over your data encryption using customer managed keys in Amazon Managed Service for Apache Flink
Encryption of both data at rest and in transit is a non-negotiable feature for most organizations. Furthermore, organizations operating in highly regulated and security-sensitive environments—such as those in the financial sector—often require full control over the cryptographic keys used for their workloads. Amazon Managed Service for Apache Flink makes it straightforward to process real-time data […]
Optimize industrial IoT analytics with Amazon Data Firehose and Amazon S3 Tables with Apache Iceberg
In this post, we show how to use AWS service integrations to minimize custom code while providing a robust platform for industrial data ingestion, processing, and analytics. By using Amazon S3 Tables and its built-in optimizations, you can maximize query performance and minimize costs without additional infrastructure setup.
Overcome your Kafka Connect challenges with Amazon Data Firehose
We’re happy to announce a new feature in the Amazon Data Firehose integration with Amazon MSK. You can now specify the Firehose stream to either read from the earliest position on the Kafka topic or from a custom timestamp to begin reading from your MSK topic. In this post of this series, we focus on managed data delivery from Kafka to your data lake.
Stream data from Amazon MSK to Apache Iceberg tables in Amazon S3 and Amazon S3 Tables using Amazon Data Firehose
In this post, we walk through two solutions that demonstrate how to stream data from your Amazon MSK provisioned cluster to Iceberg-based data lakes in Amazon S3 using Amazon Data Firehose.
PackScan: Building real-time sort center analytics with AWS Services
In this post, we explore how PackScan uses Amazon cloud-based services to drive real-time visibility, improve logistics efficiency, and support the seamless movement of packages across Amazon’s Middle Mile network.
How Airties achieved scalability and cost-efficiency by moving from Kafka to Amazon Kinesis Data Streams
Airties is a wireless networking company that provides AI-driven solutions for enhancing home connectivity. This post explores the strategies the Airties team employed during their migration from Apache Kafka to Amazon Kinesis Data Streams, the challenges they overcame, and how they achieved a more efficient, scalable, and maintenance-free streaming infrastructure.
Unify streaming and analytical data with Amazon Data Firehose and Amazon SageMaker Lakehouse
In this post, we show you how to create Iceberg tables in Amazon SageMaker Unified Studio and stream data to these tables using Firehose. With this integration, data engineers, analysts, and data scientists can seamlessly collaborate and build end-to-end analytics and ML workflows using SageMaker Unified Studio, removing traditional silos and accelerating the journey from data ingestion to production ML models.
Announcing end-of-support for Amazon Kinesis Client Library 1.x and Amazon Kinesis Producer Library 0.x effective January 30, 2026
Amazon Kinesis Client Library (KCL) 1.x and Amazon Kinesis Producer Library (KPL) 0.x will reach end-of-support on January 30, 2026. Accordingly, these versions will enter maintenance mode on April 17, 2025. During maintenance mode, AWS will provide updates only for critical bug fixes and security issues. Major versions in maintenance mode will not receive updates for new features or feature enhancements.
Deploy real-time analytics with StarTree for managed Apache Pinot on AWS
In this post, we introduce StarTree as a managed solution on AWS for teams seeking the advantages of Pinot. We highlight the key distinctions between open-source Pinot and StarTree, and provide valuable insights for organizations considering a more streamlined approach to their real-time analytics infrastructure.