AWS Big Data Blog
Category: Amazon Kinesis
Exploring real-time streaming for generative AI Applications
Foundation models (FMs) are large machine learning (ML) models trained on a broad spectrum of unlabeled and generalized datasets. FMs, as the name suggests, provide the foundation to build more specialized downstream applications, and are unique in their adaptability. They can perform a wide range of different tasks, such as natural language processing, classifying images, […]
Build an end-to-end serverless streaming pipeline with Apache Kafka on Amazon MSK using Python
The volume of data generated globally continues to surge, from gaming, retail, and finance, to manufacturing, healthcare, and travel. Organizations are looking for more ways to quickly use the constant inflow of data to innovate for their businesses and customers. They have to reliably capture, process, analyze, and load the data into a myriad of […]
Invoke AWS Lambda functions from cross-account Amazon Kinesis Data Streams
A multi-account architecture on AWS is essential for enhancing security, compliance, and resource management by isolating workloads, enabling granular cost allocation, and facilitating collaboration across distinct environments. It also mitigates risks, improves scalability, and allows for advanced networking configurations. In a streaming architecture, you may have event producers, stream storage, and event consumers in a […]
Gain insights from historical location data using Amazon Location Service and AWS analytics services
Many organizations around the world rely on the use of physical assets, such as vehicles, to deliver a service to their end-customers. By tracking these assets in real time and storing the results, asset owners can derive valuable insights on how their assets are being used to continuously deliver business improvements and plan for future […]
Reference guide to analyze transactional data in near-real time on AWS
Business leaders and data analysts use near-real-time transaction data to understand buyer behavior to help evolve products. The primary challenge businesses face with near-real-time analytics is getting the data prepared for analytics in a timely manner, which can often take days. Companies commonly maintain entire teams to facilitate the flow of data from ingestion to […]
Architectural patterns for real-time analytics using Amazon Kinesis Data Streams, part 1
We’re living in the age of real-time data and insights, driven by low-latency data streaming applications. Today, everyone expects a personalized experience in any application, and organizations are constantly innovating to increase their speed of business operation and decision making. The volume of time-sensitive data produced is increasing rapidly, with different formats of data being […]
Run Kinesis Agent on Amazon ECS
February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. Kinesis Agent is a standalone Java software application that offers a straightforward way to collect and send data to Amazon Kinesis Data Streams and Amazon Kinesis Data Firehose. The agent continuously monitors […]
Amazon Kinesis Data Streams: celebrating a decade of real-time data innovation
Data is a key strategic asset for every organization, and every company is a data business at its core. However, in many organizations, data is typically spread across a number of different systems such as software as a service (SaaS) applications, operational databases, and data warehouses. Such data silos make it difficult to get unified […]
Processing large records with Amazon Kinesis Data Streams
In this post, we show you some different options for handling large records within Kinesis Data Streams and the benefits and disadvantages of each approach. We provide some sample code for each option to help you get started with any of these approaches with your own workloads.
Modernize a legacy real-time analytics application with Amazon Managed Service for Apache Flink
In this post, we discuss challenges with relational databases when used for real-time analytics and ways to mitigate them by modernizing the architecture with serverless AWS solutions. We introduce you to Amazon Managed Service for Apache Flink Studio and get started querying streaming data interactively using Amazon Kinesis Data Streams. We walk through a call center analytics solution that provides insights into the call center’s performance in near-real time through metrics that determine agent efficiency in handling calls in the queue. Key performance indicators (KPIs) of interest for a call center from a near-real-time platform could be calls waiting in the queue, highlighted in a performance dashboard within a few seconds of data ingestion from call center streams.