Wouldn’t it be great if there were a user-friendly tool to generate test data and send it to Amazon Kinesis? Well, now there is—the Amazon Kinesis Data Generator (KDG).
In 2016, AWS introduced the EKK stack (Amazon Elasticsearch Service, Amazon Kinesis, and Kibana, an open source plugin from Elastic) as an alternative to ELK (Amazon Elasticsearch Service, the open source tool Logstash, and Kibana) for ingesting and visualizing Apache logs. One of the main features of the EKK stack is that the data transformation is handled via the Amazon Kinesis Firehose agent. In this post, we describe how to optimize the EKK solution by handling the data transformation in Amazon Kinesis Firehose through AWS Lambda.
To effectively replicate data from DynamoDB to Aurora, a reliable, scalable data replication (ETL) process needs to be built. In this post, I show you how to build such a process using a serverless architecture with AWS Lambda and Amazon Kinesis Firehose.
Last year, I published an AWS Security Blog post that showed how to optimize and visualize your security groups. Today’s post continues in the vein of that post by using Amazon Kinesis Firehose and AWS Lambda to enrich the VPC Flow Logs dataset and enhance your ability to optimize security groups. The capabilities in this post’s solution are based on the Lambda functions available in this VPC Flow Log Appender GitHub repository.
We have many customers who own and operate Elasticsearch, Logstash, and Kibana (ELK) stacks to load and visualize Apache web logs, among other log types. Amazon Elasticsearch Service provides Elasticsearch and Kibana in the AWS Cloud in a way that’s easy to set up and operate. Amazon Kinesis Firehose provides reliable, serverless delivery of Apache web logs (or other log data) to Amazon Elasticsearch Service. With Firehose, you can add an automatic call to an AWS Lambda function to transform records within Firehose. With these two technologies, you have an effective, easy-to-manage replacement for your existing ELK stack.
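As a rough illustration of the kind of record transformation described above, the following sketch parses one Apache Common Log Format line into JSON suitable for indexing. The function name `apache_line_to_json`, the regular expression, and the output field names are assumptions made for this example; they are not taken from the post.

```python
import json
import re

# Apache Common Log Format: host ident user [time] "request" status size
# (assumed layout; a combined-format log would need extra fields)
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def apache_line_to_json(line):
    """Return a JSON string for one log line, or None if it does not parse."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # Apache writes "-" when no response body was sent.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return json.dumps(fields, sort_keys=True)
```

A transformation function could call this helper on each decoded record and mark lines that return `None` as failed rather than indexing them.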
This blog post shows how to build a serverless architecture by using Amazon Kinesis Firehose, AWS Lambda, Amazon S3, Amazon Athena, and Amazon QuickSight to collect, store, query, and visualize flow logs.
Amazon Kinesis Firehose is a fully managed service for delivering real-time streaming data to destinations such as Amazon S3, Amazon Redshift, or Amazon Elasticsearch Service (Amazon ES). In this post, I introduce data transformation capabilities on your delivery streams, to seamlessly transform incoming source data and deliver the transformed data to your destinations.
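A minimal sketch of what a transformation function on a delivery stream might look like is shown here. It assumes the Firehose transformation contract in which each incoming record carries a `recordId` and base64-encoded `data`, and each returned record carries the same `recordId`, a `result` of `Ok`, `Dropped`, or `ProcessingFailed`, and re-encoded `data`. The uppercase transformation is a placeholder chosen only for illustration.

```python
import base64


def lambda_handler(event, context):
    """Transform each Firehose record and report a per-record result."""
    output = []
    for record in event["records"]:
        try:
            payload = base64.b64decode(record["data"]).decode("utf-8")
            # Placeholder transformation (an assumption for this sketch).
            transformed = payload.strip().upper() + "\n"
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(transformed.encode("utf-8")).decode("utf-8"),
            })
        except ValueError:
            # Undecodable payload: hand the original data back marked as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Every input `recordId` appears exactly once in the response, so that each record can be matched to its outcome.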
In this post, I show how you can build a business intelligence capability for streaming IoT device data using AWS serverless and managed services. You can be up and running in minutes, starting small but able to grow easily to millions of devices and billions of messages.
Log aggregation is critical to your operational infrastructure. A reliable, secure, and scalable log aggregation solution makes all the difference during a crunch-time debugging session. In this post, we explore an alternative to the popular log aggregation solution, the ELK stack (Elasticsearch, Logstash, and Kibana): the EKK stack (Amazon Elasticsearch Service, Amazon Kinesis, and Kibana).
In this post, I show an analytics pipeline which detects anomalies in real time for a web traffic stream, using the RANDOM_CUT_FOREST function available in Amazon Kinesis Analytics.
This is the second of two AWS Big Data posts on Writing SQL on Streaming Data with Amazon Kinesis Analytics. In the last post, I provided an overview of streaming data and key concepts, such as the basics of streaming SQL, and completed a walkthrough using a simple example. In this post, I cover more advanced stream processing concepts using Amazon Kinesis Analytics so that you can complete an end-to-end application.
This is the first of two AWS Big Data blog posts on Writing SQL on Streaming Data with Amazon Kinesis Analytics. In this post, I provide an overview of streaming data and key concepts like the basics of streaming SQL, and complete a walkthrough using a simple example. In the next post, I will cover more advanced stream processing concepts using Amazon Kinesis Analytics.
Streaming data technologies shorten the time to analyze and use your data from hours and days to minutes and seconds. Let’s walk through an example of using Amazon Kinesis Firehose, Amazon Redshift, and Amazon QuickSight to set up a streaming data pipeline and visualize Maryland traffic violation data in real time.
In this guest post, Anton Slutsky of MeetMe discusses a solution that uses Amazon Kinesis Firehose to optimize and streamline large-scale data ingestion at MeetMe, a popular social discovery platform that caters to more than a million active daily users. The Data Science team at MeetMe needed to collect and store approximately 0.5 TB per day of various types of data in a way that would expose it to data mining tasks, business-facing reporting, and advanced analytics. The team selected Amazon S3 as the target storage facility and faced the challenge of collecting large volumes of live data in a robust, reliable, scalable, and operationally affordable way.
Amazon Kinesis Agent is a stand-alone Java software application that provides an easy and reliable way to send data to Amazon Kinesis Streams and Amazon Kinesis Firehose. The agent monitors a set of files for new data and then sends it to Kinesis Streams or Kinesis Firehose continuously. It handles file rotation, checkpointing, and retrial upon failures. It also supports Amazon CloudWatch so that you can closely monitor and troubleshoot the data flow from the agent.
Elasticsearch is a popular open-source search and analytics engine. Amazon Elasticsearch Service is a managed service that makes it easy for you to deploy, run, and scale Elasticsearch in the AWS Cloud. You can now deliver your Kinesis Firehose data stream to an Amazon Elasticsearch Service cluster, which allows you to index and analyze server logs, clickstreams, and social media traffic.
In this post, we use Twitter public streams to analyze the performance of both Republican and Democratic candidates in near real time. We show you how to integrate Amazon Kinesis Firehose, AWS Lambda (Python function), and Amazon Elasticsearch Service to create an end-to-end, near real-time discovery platform.
This blog post walks you through a simple and effective way to persist data to Amazon S3 from Amazon Kinesis Streams using AWS Lambda and Amazon Kinesis Firehose.
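One way the pattern described above could be sketched: a Lambda function receives a batch of Kinesis Streams records, decodes them, and forwards them to a Firehose delivery stream, which then writes to Amazon S3. The helper name `forward_to_firehose` and the stream name `my-delivery-stream` are hypothetical. The sketch assumes the boto3 `put_record_batch` call, which accepts at most 500 records per request, and it does not retry failed records.

```python
import base64

MAX_BATCH = 500  # PutRecordBatch accepts at most 500 records per call


def forward_to_firehose(event, firehose_client, stream_name):
    """Decode Kinesis Streams records and send them to Firehose in batches."""
    records = [
        {"Data": base64.b64decode(r["kinesis"]["data"])}
        for r in event["Records"]
    ]
    failed = 0
    for start in range(0, len(records), MAX_BATCH):
        response = firehose_client.put_record_batch(
            DeliveryStreamName=stream_name,
            Records=records[start:start + MAX_BATCH],
        )
        # A production function would retry the failed records; this only counts them.
        failed += response.get("FailedPutCount", 0)
    return {"forwarded": len(records), "failed": failed}


def lambda_handler(event, context):
    import boto3  # imported here so the helper above can be tested without AWS

    # "my-delivery-stream" is a hypothetical name used for illustration.
    return forward_to_firehose(event, boto3.client("firehose"), "my-delivery-stream")
```

Passing the client in as an argument keeps the batching logic testable with a stub object in place of a real Firehose client.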