Streaming Data Solution for Amazon MSK

Continuously capture data for processing using a scalable and durable real-time data streaming service

Overview

Streaming Data Solution for Amazon MSK allows you to capture streaming data using Amazon Managed Streaming for Apache Kafka (Amazon MSK), a massively scalable storage service capable of handling high data volume from data producers. Producers can number in the thousands: each data source generates streaming data continuously and typically submits records simultaneously and in small sizes (kilobytes).

Streaming data includes a wide variety of data, such as log files generated by customers using mobile or web applications, ecommerce purchases, in-game player activity, information from social networks, financial trading floors, or geospatial services, and telemetry from connected devices or instrumentation in data centers.

This AWS Solution provides four AWS CloudFormation templates in which data flows through producers, streaming storage, consumers, and destinations. Similar to Streaming Data Solution for Amazon Kinesis, the templates are configured to apply best practices for monitoring functionality with dashboards and alarms and for securing data.
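
To make the four stages concrete, here is a minimal sketch of data moving from producers through streaming storage to a consumer and a destination. It is illustrative only: it uses an in-memory list as a stand-in for a Kafka topic and does not call Amazon MSK or any Kafka client library. The names `InMemoryTopic`, `produce`, and `consume`, and the sample event fields, are hypothetical and are not part of the solution's templates.

```python
import json
from dataclasses import dataclass, field


@dataclass
class InMemoryTopic:
    """Hypothetical stand-in for a Kafka topic: an append-only log of byte records."""
    records: list = field(default_factory=list)

    def append(self, value: bytes) -> int:
        """Store one record and return its offset (its position in the log)."""
        self.records.append(value)
        return len(self.records) - 1

    def read_from(self, offset: int) -> list:
        """Return all records at or after the given offset."""
        return self.records[offset:]


def produce(topic: InMemoryTopic, source_id: str, count: int) -> None:
    """Producer stage: a data source submits `count` small JSON records."""
    for seq in range(count):
        event = {"source": source_id, "seq": seq, "action": "click"}
        topic.append(json.dumps(event).encode("utf-8"))


def consume(topic: InMemoryTopic, offset: int, destination: list) -> int:
    """Consumer stage: decode records from `offset` on, deliver them,
    and return the offset to resume from next time."""
    batch = topic.read_from(offset)
    for raw in batch:
        destination.append(json.loads(raw))
    return offset + len(batch)


topic = InMemoryTopic()   # streaming storage (assumed in-memory stand-in)
destination = []          # stand-in for a destination such as a data lake

produce(topic, "mobile-app", 3)
produce(topic, "web-app", 2)
next_offset = consume(topic, 0, destination)

print(next_offset, len(destination))  # prints "5 5"
```

Tracking the returned offset is what lets a consumer resume where it left off: calling `consume(topic, next_offset, destination)` again delivers only records appended since the previous call.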


Benefits

Automated configuration
Automatically configure the AWS services necessary to easily capture, store, process, and deliver streaming data.
Four template options
Choose from four different AWS CloudFormation template options. Test new service combinations for your production environment, and improve existing applications.
Real-time use cases
Capture high-volume application logs, analyze clickstream data, continuously deliver to a data lake, and more.
Customizable source code
Customize the solution's boilerplate code, and then use the monitoring capabilities to quickly transition from testing to production.

Technical details

You can automatically deploy this architecture using the implementation guide and the accompanying AWS CloudFormation templates.

AWS Architecture Blog
Amazon MSK Backup for Archival, Replay, or Analytics

This post covers patterns and solutions that can be used to back up MSK topics to S3, which enables customers to reduce long-term data retention settings in MSK. Some customers store long-term data in MSK for data analytics and machine learning workloads. We share a pattern that simplifies this architecture by offloading topic data to S3 and using S3 for analytics and ML.

Read the blog 
Training
Data Analytics Fundamentals

In this self-paced course, you learn about the process for planning data analysis solutions and the various data analytic processes that are involved.

Enroll now 
