AWS Big Data Blog

Category: Amazon Kinesis

Ingest VPC flow logs into Splunk using Amazon Kinesis Data Firehose

In September 2017, during the annual Splunk.conf, Splunk and AWS jointly announced Amazon Kinesis Data Firehose integration to support Splunk Enterprise and Splunk Cloud as a delivery destination. This native integration between Splunk Enterprise, Splunk Cloud, and Kinesis Data Firehose is designed to make AWS data ingestion setup seamless, while offering a secure and fault-tolerant […]

Common streaming data enrichment patterns in Amazon Kinesis Data Analytics for Apache Flink

Stream data processing allows you to act on data in real time. Real-time data analytics can help you have on-time and optimized responses while improving overall customer experience. Apache Flink is a distributed computation framework that allows for stateful real-time data processing. It provides a single set of APIs for building batch and streaming jobs, […]

How NerdWallet uses AWS and Apache Hudi to build a serverless, real-time analytics platform

This is a guest post by Kevin Chun, Staff Software Engineer in Core Engineering at NerdWallet. NerdWallet’s mission is to provide clarity for all of life’s financial decisions. This covers a diverse set of topics: from choosing the right credit card, to managing your spending, to finding the best personal loan, to refinancing your mortgage. […]

Best practices to optimize cost and performance for AWS Glue streaming ETL jobs

AWS Glue streaming extract, transform, and load (ETL) jobs allow you to process and enrich vast amounts of incoming data from systems such as Amazon Kinesis Data Streams, Amazon Managed Streaming for Apache Kafka (Amazon MSK), or any other Apache Kafka cluster. It uses the Spark Structured Streaming framework to perform data processing in near-real […]

Analyze logs with Dynatrace Davis AI Engine using Amazon Kinesis Data Firehose HTTP endpoint delivery

This blog post is co-authored with Erick Leon, Sr. Technical Alliance Manager from Dynatrace. Amazon Kinesis Data Firehose is the easiest way to reliably load streaming data into data lakes, data stores, and analytics services. With just a few clicks, you can create fully-managed delivery streams that auto scale on demand to match the throughput of […]

BDB-2063-kda-keyspaces-architecture

Sink Amazon Kinesis Data Analytics Apache Flink output to Amazon Keyspaces using Apache Cassandra Connector

Amazon Keyspaces (for Apache Cassandra) is a scalable, highly available, and managed Apache Cassandra–compatible database service. With Amazon Keyspaces you don’t have to provision, patch, or manage servers, and you don’t have to install, maintain, or operate software. Amazon Keyspaces is serverless, so you only pay for the resources that you use. You can use […]

Stream change data to Amazon Kinesis Data Streams with AWS DMS

In this post, we discuss how to use AWS Database Migration Service (AWS DMS) native change data capture (CDC) capabilities to stream changes into Amazon Kinesis Data Streams. AWS DMS is a cloud service that makes it easy to migrate relational databases, data warehouses, NoSQL databases, and other types of data stores. You can use […]

A serverless operational data lake for retail with AWS Glue, Amazon Kinesis Data Streams, Amazon DynamoDB, and Amazon QuickSight

Do you want to reduce stockouts at stores? Do you want to improve order delivery timelines? Do you want to provide your customers with accurate product availability, down to the millisecond? A retail operational data lake can help you transform the customer experience by providing deeper insights into a variety of operational aspects of your […]

Build a big data Lambda architecture for batch and real-time analytics using Amazon Redshift

With real-time information about customers, products, and applications in hand, organizations can take action as events happen in their business application. For example, you can prevent financial fraud, deliver personalized offers, and identify and prevent failures before they occur in near real time. Although batch analytics provides abilities to analyze trends and process data at […]

Use Amazon Kinesis Data Firehose to extract data insights with Coralogix

This is a guest blog post co-written by Tal Knopf at Coralogix. Digital data is expanding exponentially, and the existing limitations to store and analyze it are constantly being challenged and overcome. According to Moore’s Law, digital storage becomes larger, cheaper, and faster with each successive year. The advent of cloud databases is just one […]