AWS Big Data Blog

Category: Amazon Kinesis

Learn more about Apache Flink and Amazon Kinesis Data Analytics with three new videos

Amazon Kinesis Data Analytics is a fully managed service for Apache Flink that reduces the complexity of building, managing, and integrating Apache Flink applications with other AWS services. Apache Flink is an open-source framework and engine for stateful processing of data streams. It’s highly available and scalable, delivering high throughput and low latency for the […]

Enrich VPC Flow Logs with resource tags and deliver data to Amazon S3 using Amazon Kinesis Data Firehose

VPC Flow Logs is an AWS feature that captures information about the network traffic flows going to and from network interfaces in Amazon Virtual Private Cloud (Amazon VPC). Visibility to the network traffic flows of your application can help you troubleshoot connectivity issues, architect your application and network for improved performance, and improve security of […]

How a blockchain startup built a prototype solution to solve the need of analytics for decentralized applications with AWS Data Lab

This post is co-written with Dr. Quan Hoang Nguyen, CTO at Fantom Foundation. Here at Fantom Foundation (Fantom), we have developed a high performance, highly scalable, and secure smart contract platform. It’s designed to overcome limitations of the previous generation of blockchain platforms. The Fantom platform is permissionless, decentralized, and open source. The majority of […]

Ingest VPC flow logs into Splunk using Amazon Kinesis Data Firehose

In September 2017, during the annual Splunk.conf, Splunk and AWS jointly announced Amazon Kinesis Data Firehose integration to support Splunk Enterprise and Splunk Cloud as a delivery destination. This native integration between Splunk Enterprise, Splunk Cloud, and Kinesis Data Firehose is designed to make AWS data ingestion setup seamless, while offering a secure and fault-tolerant […]

Common streaming data enrichment patterns in Amazon Kinesis Data Analytics for Apache Flink

Stream data processing allows you to act on data in real time. Real-time data analytics can help you have on-time and optimized responses while improving overall customer experience. Apache Flink is a distributed computation framework that allows for stateful real-time data processing. It provides a single set of APIs for building batch and streaming jobs, […]

How NerdWallet uses AWS and Apache Hudi to build a serverless, real-time analytics platform

This is a guest post by Kevin Chun, Staff Software Engineer in Core Engineering at NerdWallet. NerdWallet’s mission is to provide clarity for all of life’s financial decisions. This covers a diverse set of topics: from choosing the right credit card, to managing your spending, to finding the best personal loan, to refinancing your mortgage. […]

Best practices to optimize cost and performance for AWS Glue streaming ETL jobs

AWS Glue streaming extract, transform, and load (ETL) jobs allow you to process and enrich vast amounts of incoming data from systems such as Amazon Kinesis Data Streams, Amazon Managed Streaming for Apache Kafka (Amazon MSK), or any other Apache Kafka cluster. It uses the Spark Structured Streaming framework to perform data processing in near-real […]

Analyze logs with Dynatrace Davis AI Engine using Amazon Kinesis Data Firehose HTTP endpoint delivery

This blog post is co-authored with Erick Leon, Sr. Technical Alliance Manager from Dynatrace. Amazon Kinesis Data Firehose is the easiest way to reliably load streaming data into data lakes, data stores, and analytics services. With just a few clicks, you can create fully-managed delivery streams that auto scale on demand to match the throughput of […]

BDB-2063-kda-keyspaces-architecture

Sink Amazon Kinesis Data Analytics Apache Flink output to Amazon Keyspaces using Apache Cassandra Connector

Amazon Keyspaces (for Apache Cassandra) is a scalable, highly available, and managed Apache Cassandra–compatible database service. With Amazon Keyspaces you don’t have to provision, patch, or manage servers, and you don’t have to install, maintain, or operate software. Amazon Keyspaces is serverless, so you only pay for the resources that you use. You can use […]

Stream change data to Amazon Kinesis Data Streams with AWS DMS

In this post, we discuss how to use AWS Database Migration Service (AWS DMS) native change data capture (CDC) capabilities to stream changes into Amazon Kinesis Data Streams. AWS DMS is a cloud service that makes it easy to migrate relational databases, data warehouses, NoSQL databases, and other types of data stores. You can use […]