Apache Kafka is an open-source platform for building real-time streaming data pipelines and applications. However, Apache Kafka is difficult to architect, operate, and manage on your own. Amazon MSK is a fully managed service that makes it easy for you to build and run applications that use Apache Kafka to process streaming data without needing Apache Kafka infrastructure management expertise.

The resources on this page help you create an Amazon MSK cluster, use best practices when operating and monitoring your cluster, migrate Apache Kafka workloads to Amazon MSK, and use Amazon Kinesis Data Analytics for Apache Flink to analyze data streams.

Amazon MSK Master Class

The Amazon MSK Master Class is a 6-hour course of hands-on learning and best practices, and the most comprehensive content available on Amazon MSK. The course is designed and presented by Stephane Maarek in partnership with AWS. Stephane is an active member of the Apache Kafka community and an AWS Hero, and has taught Apache Kafka to over 100,000 students.

Learn how to:

  • Create an Amazon MSK cluster in a dedicated VPC and start producing and consuming from Apache Kafka topics
  • Set up an Amazon RDS PostgreSQL database to send event streams for change data capture (CDC)
  • Deploy stream processing applications, including using Amazon Kinesis Data Analytics for Apache Flink
  • Use best practices for operating and monitoring
  • Secure your Amazon MSK cluster, and more

Amazon MSK Labs

Getting Started Lab

Gain hands-on experience with Amazon MSK:

  • Set up an Amazon MSK cluster using both the Amazon MSK Console and AWS CLI
  • Add brokers and rebalance the cluster to adapt to additional producers, consumers, and growing workloads, both on the Console and using the CLI
  • Scale your Amazon MSK cluster to add more storage capacity
  • Gather all of the JMX metrics and work with the data in Prometheus, the popular open-source monitoring tool
  • Enable encryption-in-transit and authentication using various security features
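
The cluster setup, broker scaling, and storage scaling steps above can also be scripted with the AWS SDK for Python (boto3). The following is a minimal sketch, not the lab's own code. The function names are hypothetical helpers, and the Kafka version, instance type, volume size, subnet IDs, and security group IDs are placeholder assumptions that you would replace with your own values. The helpers only build request parameters; `call_msk` sends them and needs boto3 plus AWS credentials.

```python
"""Sketch: build Amazon MSK request parameters for boto3's "kafka" client."""


def build_create_cluster_request(cluster_name, subnet_ids, security_group_ids,
                                 kafka_version="3.6.0",          # assumed version
                                 instance_type="kafka.m5.large",  # assumed size
                                 volume_size_gib=100):            # assumed storage
    """Return keyword arguments for the kafka client's create_cluster call."""
    return {
        "ClusterName": cluster_name,
        "KafkaVersion": kafka_version,
        # One broker per subnet; the broker count must be a multiple
        # of the number of client subnets.
        "NumberOfBrokerNodes": len(subnet_ids),
        "BrokerNodeGroupInfo": {
            "InstanceType": instance_type,
            "ClientSubnets": list(subnet_ids),
            "SecurityGroups": list(security_group_ids),
            "StorageInfo": {"EbsStorageInfo": {"VolumeSize": volume_size_gib}},
        },
        # Encrypt traffic between clients and brokers and between brokers.
        "EncryptionInfo": {
            "EncryptionInTransit": {"ClientBroker": "TLS", "InCluster": True}
        },
        # Expose JMX and node metrics so that Prometheus can scrape them.
        "OpenMonitoring": {
            "Prometheus": {
                "JmxExporter": {"EnabledInBroker": True},
                "NodeExporter": {"EnabledInBroker": True},
            }
        },
    }


def build_add_brokers_request(cluster_arn, current_version, target_broker_count):
    """Return keyword arguments for the update_broker_count call."""
    return {
        "ClusterArn": cluster_arn,
        "CurrentVersion": current_version,
        "TargetNumberOfBrokerNodes": target_broker_count,
    }


def build_expand_storage_request(cluster_arn, current_version, target_volume_gib):
    """Return keyword arguments for the update_broker_storage call."""
    return {
        "ClusterArn": cluster_arn,
        "CurrentVersion": current_version,
        # "All" applies the new volume size to every broker in the cluster.
        "TargetBrokerEBSVolumeInfo": [
            {"KafkaBrokerNodeId": "All", "VolumeSizeGB": target_volume_gib}
        ],
    }


def call_msk(operation_name, request):
    """Send a request to Amazon MSK. Requires boto3 and AWS credentials."""
    import boto3  # imported here so the builders work without boto3 installed

    client = boto3.client("kafka")
    return getattr(client, operation_name)(**request)
```

For example, `call_msk("create_cluster", build_create_cluster_request("demo", subnets, groups))` would start cluster creation. The `current_version` value for the two update calls comes from the `CurrentVersion` field that `describe_cluster` returns under `ClusterInfo`.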

Clickstream Analytics Lab

This workshop teaches you how to analyze the performance of various products in an e-commerce site by ingesting, transforming, and analyzing real-time clickstream data.

  • Set up your data producer to deliver clickstreams to Apache Kafka topics in Amazon MSK
  • Configure and start an Amazon Kinesis Data Analytics for Apache Flink application to process and aggregate the clickstream data, and send the results to Amazon MSK and Amazon Elasticsearch Service
  • Create Kibana visualizations and a Kibana dashboard to visualize the real-time clickstream analytics
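
To make the aggregation step concrete, here is a minimal sketch in plain Python of the kind of windowed count a stream processing application might compute over clickstream events. It is not the workshop's Apache Flink application. The event fields `ts` (epoch seconds) and `product`, the function name, and the 60-second window size are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def tumbling_window_counts(events, window_seconds=60):
    """Count clicks per product in fixed, non-overlapping time windows.

    Each event is a dict with a hypothetical "ts" field (epoch seconds)
    and a hypothetical "product" field. The result maps each window's
    start time to a dict of per-product click counts.
    """
    windows = defaultdict(Counter)
    for event in events:
        # Round the timestamp down to the start of its window.
        window_start = (int(event["ts"]) // window_seconds) * window_seconds
        windows[window_start][event["product"]] += 1
    return {start: dict(counts) for start, counts in sorted(windows.items())}


if __name__ == "__main__":
    sample = [
        {"ts": 5, "product": "shoes"},
        {"ts": 42, "product": "shoes"},
        {"ts": 61, "product": "hats"},
    ]
    print(tumbling_window_counts(sample))
```

A real streaming job also has to handle late and out-of-order events and emit results continuously, which this batch-style sketch deliberately leaves out.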

Migration Lab

The Migration Lab teaches you different ways of migrating a self-managed Apache Kafka cluster, whether on Amazon EC2 or on premises, to Amazon MSK. Gain experience with different tools to complete a migration to Amazon MSK.

  • Migrate a self-managed cluster on Amazon EC2 or on premises to Amazon MSK
  • Set up monitoring and your pre-migration environment
  • Use tools like MirrorMaker 2.0
  • Migrate your client and perform the final cutover
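
As an illustration of the MirrorMaker 2.0 step, the sketch below renders a minimal one-way replication configuration from a self-managed cluster to Amazon MSK. It is not taken from the lab. The function name, the cluster aliases `source` and `target`, the bootstrap addresses, and the replication factor of 3 are assumptions, and the security settings that a real cluster would need are omitted.

```python
def render_mm2_properties(source_bootstrap, target_bootstrap,
                          topics=".*", replication_factor=3):
    """Render a minimal MirrorMaker 2.0 properties file as a string.

    "source" and "target" are arbitrary aliases chosen for this sketch.
    The "topics" value is a regular expression selecting what to replicate.
    """
    lines = [
        "clusters = source, target",
        f"source.bootstrap.servers = {source_bootstrap}",
        f"target.bootstrap.servers = {target_bootstrap}",
        # Replicate in one direction only: source to target.
        "source->target.enabled = true",
        f"source->target.topics = {topics}",
        # Replication factor for topics that MirrorMaker creates on the target.
        f"replication.factor = {replication_factor}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder broker addresses; replace them with your own.
    print(render_mm2_properties("onprem-broker:9092", "msk-broker:9092"))
```

The rendered text can be saved to a file and passed to the `connect-mirror-maker.sh` script that ships with Apache Kafka.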

Lenses

lenses.io

Lenses is a simple, powerful, and secure self-service DataOps platform. Operate with confidence on Apache Kafka and Amazon MSK. Lenses delivers broad access to data with fine-grained controls while empowering data gurus with SQL capabilities for data flows.

Datadog

Datadog is a SaaS-based monitoring and analytics platform for large-scale applications and infrastructure. Combining real-time logs, metrics from servers, containers, databases, and applications with end-to-end tracing, Datadog delivers actionable alerts and powerful visualizations to provide full-stack observability.

New Relic

With complete visibility from the customer to code to containers, New Relic gives you an easy way to manage the complexities of your AWS environment. Monitor Amazon MSK using the New Relic Prometheus OpenMetrics integration.

Sumo Logic

The Sumo Logic cloud-native analytics platform helps you collect, correlate, and analyze all types of machine data (logs, metrics, events) to reduce the time to identify, troubleshoot, and resolve performance, security, and compliance issues. Gather information on your Amazon MSK cluster for use with Telegraf.
