Amazon Managed Streaming for Apache Kafka features

Why Amazon MSK?

Amazon Managed Streaming for Apache Kafka (Amazon MSK) offers fully managed Apache Kafka, Kafka Connect, and Amazon MSK Replicator. Apache Kafka is a distributed data store optimized for ingesting and processing streaming data in real time. Amazon MSK provisions your cluster infrastructure, configures your Apache Kafka clusters, replaces servers when they fail, orchestrates server patches and upgrades, architects clusters for high availability, ensures data is durably stored and secured, sets up monitoring and alarms, and runs scaling to support load changes. With Amazon MSK, you can spend more of your time developing and running streaming event applications rather than managing your Apache Kafka clusters.

Scalable integration

Amazon MSK is the integration backbone for modern messaging and event-driven applications at the center of data ingest and processing services as well as microservice application architectures. There are multiple ways to integrate with other systems, including a variety of other AWS services, making application development simpler and faster. You can bring your own connector and deploy it on fully managed infrastructure through Amazon MSK Connect. Alternatively, you can choose from an ever-growing list of native integrations with other AWS services, such as Amazon S3, Amazon Redshift, Amazon Managed Service for Apache Flink, and AWS Lambda. Amazon MSK also integrates with AWS Identity and Access Management (IAM), AWS Certificate Manager (ACM), and AWS Key Management Service (AWS KMS) to provide secure, authenticated, and authorized client access to your data. You also have the option to enforce schema governance through AWS Glue Schema Registry.

Compatible with Apache Kafka

Supports all features

Amazon MSK supports all features of Apache Kafka out of the box and makes newer versions of Apache Kafka available within a few weeks of public availability.

Client compatibility

Amazon MSK maintains full compatibility with Apache Kafka’s open source client protocol, so applications and tools built for Apache Kafka work with Amazon MSK out of the box, with no application code changes.

Seamless version upgrades

You can upgrade Apache Kafka versions on provisioned clusters in only a few steps, allowing you to decide when to take advantage of features and bug fixes present in new Apache Kafka versions. Amazon MSK automates the deployment of version upgrades on running clusters to maintain client I/O availability for you.

Choose your own cluster type

Amazon MSK Provisioned

Amazon MSK Provisioned provides fine-grained control over your Apache Kafka cluster. You can choose your broker type, pre-provision server instances, select the type of storage you want, and pick the Apache Kafka version of choice. You can also choose when and by how much to scale your clusters in response to workload variation.

Amazon MSK Serverless

Amazon MSK Serverless fully manages your Apache Kafka cluster so you don’t have to estimate how much capacity you need for your workload or decide when to scale it in response to changes in traffic.
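
The difference between the two cluster types shows up in what you have to specify at creation time. The following is a minimal sketch, assuming the request shapes accepted by the `create_cluster_v2` operation of the boto3 `kafka` client. The function names, default instance type, Kafka version, and volume size are illustrative placeholders, not recommendations.

```python
def provisioned_request(name, subnets, security_groups, brokers=3,
                        instance_type="kafka.m5.large",
                        kafka_version="3.6.0", volume_gib=100):
    """Provisioned: you choose broker type, count, storage, and Kafka version."""
    return {
        "ClusterName": name,
        "Provisioned": {
            "KafkaVersion": kafka_version,
            "NumberOfBrokerNodes": brokers,
            "BrokerNodeGroupInfo": {
                "InstanceType": instance_type,
                "ClientSubnets": list(subnets),
                "SecurityGroups": list(security_groups),
                "StorageInfo": {"EbsStorageInfo": {"VolumeSize": volume_gib}},
            },
        },
    }


def serverless_request(name, subnets, security_groups):
    """Serverless: only networking and authentication, no capacity settings."""
    return {
        "ClusterName": name,
        "Serverless": {
            "VpcConfigs": [{
                "SubnetIds": list(subnets),
                "SecurityGroupIds": list(security_groups),
            }],
            "ClientAuthentication": {"Sasl": {"Iam": {"Enabled": True}}},
        },
    }


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("kafka").create_cluster_v2(**serverless_request(...))
```

Note that the serverless request carries no broker count, instance type, or storage size, which mirrors the capacity decisions you no longer make.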

No servers to manage

Fully managed clusters

No matter which cluster type you choose, with a few steps in the AWS Management Console, you can create a fully managed cluster that is highly available, secure, and backed by Amazon MSK advanced monitoring and detection systems that automatically maintain the operational health of your cluster.

Metadata management nodes included

Apache Kafka uses either Apache Kafka Raft (KRaft) or Apache ZooKeeper for metadata management. Amazon MSK allows you to create clusters in either mode on supported Apache Kafka versions. Amazon MSK also manages these additional metadata nodes for you at no additional cost.

Multiple broker types

Express brokers

Express brokers are one type of broker offered under MSK Provisioned. Express brokers make Apache Kafka simpler to manage, more cost-effective to run at scale, and more elastic with the same low latency that you expect. Compared with Standard brokers on Amazon MSK, Express brokers include virtually unlimited and elastic storage capacity that requires no management overhead, provide up to 3x more throughput per broker, scale up to 20x faster, and recover up to 90% faster.

Standard brokers

Standard brokers under MSK Provisioned offer the most flexibility to configure your cluster's performance. You can choose from a wide range of configurations on the cluster to tune dimensions, including availability, durability, throughput, and latency. On Standard brokers, you also control the storage configurations on your cluster and are responsible for managing storage provisioning and utilization.

Storage options

Fully managed storage

Express brokers include virtually unlimited and elastic storage capacity that requires no sizing, provisioning, or ongoing capacity management. Storage capacity automatically scales to accommodate your data retention needs, and you pay only for the storage you use.

Tiered storage

With tiered storage, you can store virtually unlimited data in Amazon MSK without the need to provision and manage storage capacity. You can enable tiered storage in a few steps for existing clusters and only pay for what you use. You can first store data in a performance-optimized primary storage tier, and then let Amazon MSK automatically tier data into the low-cost tier for longer retention. The feature is supported in all AWS Regions where Amazon MSK is available. To learn how to get started with tiered storage, visit the Amazon MSK Developer Guide.
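
The split between the primary tier and the low-cost tier is controlled per topic. The following is a minimal sketch, assuming the topic-level settings `remote.storage.enable`, `local.retention.ms`, and `retention.ms` described in the Amazon MSK Developer Guide. The helper name and the day-based inputs are illustrative.

```python
DAY_MS = 24 * 60 * 60 * 1000  # milliseconds in one day


def tiered_topic_config(total_retention_days, local_retention_days):
    """Topic settings that keep recent data on primary storage and
    let older data age into the low-cost tier until total retention expires."""
    if local_retention_days > total_retention_days:
        raise ValueError("local retention cannot exceed total retention")
    return {
        "remote.storage.enable": "true",
        "local.retention.ms": str(local_retention_days * DAY_MS),
        "retention.ms": str(total_retention_days * DAY_MS),
    }


if __name__ == "__main__":
    # Keep 1 day on primary storage and 30 days overall.
    for key, value in tiered_topic_config(30, 1).items():
        print(f"{key}={value}")
```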

Highly resilient

Multi-AZ deployments

All clusters are distributed across multiple Availability Zones (three is the default), and Amazon MSK replicates data across these Availability Zones at no additional cost. Your cluster availability is also backed by the Amazon MSK Service Level Agreement, which commits to 99.9% ("three 9s") availability.
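
To put an availability percentage in concrete terms, it can help to convert it into a downtime budget. The following is a small arithmetic sketch, assuming a 30-day month. The function name is illustrative, and the result is not a statement of how the SLA itself measures availability.

```python
def monthly_downtime_budget_minutes(availability_percent, days_in_month=30):
    """Minutes per month that may be unavailable at a given availability level."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100 - availability_percent) / 100


if __name__ == "__main__":
    budget = monthly_downtime_budget_minutes(99.9)
    print(f"99.9% over 30 days allows about {budget:.1f} minutes of downtime")
```

Under that assumption, 99.9% corresponds to roughly 43 minutes per month.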

Fast and automated recovery

Amazon MSK has automated systems that quickly detect and respond to issues. If a component fails, Amazon MSK automatically replaces it without downtime to your applications. We also automatically deploy software patches as needed to keep your cluster up to date and running smoothly.

Automated best practices

Amazon MSK Serverless and Express brokers in MSK Provisioned enforce best practice configurations, such as three-way replication, and reserve bandwidth for background operations, such as replication and recovery, so you can more easily achieve predictable availability of your cluster resources.

Easily set up cross-Region resilience

Using Amazon MSK Replicator, you can set up continual data replication to a secondary backup cluster in another Region, allowing you to build highly available and fault-tolerant multi-Region applications for increased resiliency. You can also use MSK Replicator to provide lower latency data access in different geographic Regions or to distribute data to your partners.

Highly secure

Private connectivity

Your Apache Kafka cluster runs in an Amazon Virtual Private Cloud (Amazon VPC) managed by Amazon MSK. Kafka clients in your own Amazon VPC can privately access the cluster through a cross-account elastic network interface that Amazon MSK deploys in your VPC. If your Kafka clients are spread across multiple VPCs or AWS accounts, you can still privately connect to your cluster by using the multi-VPC private connectivity feature. This feature removes the operational overhead of self-managing an AWS PrivateLink solution and scales as the Amazon MSK cluster scales, so you can maintain private connectivity to the cluster without making additional configuration changes. Because multi-VPC private connectivity allows overlapping IP addresses across connecting VPCs, it also removes the challenges of maintaining non-overlapping IP ranges, complex peering, and routing tables associated with other VPC connectivity solutions.

Granular access control

IAM access control is a no-cost security option that simplifies cluster authentication and Apache Kafka API authorization using IAM roles or user policies to control access. Using IAM access control, you no longer need to build and run one-off access management systems to control client authentication and authorization for Apache Kafka. Your clusters are secured using least-privilege permissions by default. For provisioned clusters, you can also use Simple Authentication and Security Layer (SASL)/Salted Challenge Response Authentication Mechanism (SCRAM) or mutual Transport Layer Security (TLS) authentication with Apache Kafka access control lists (ACLs) to control client access.
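
As an illustration of least-privilege IAM access control, the following sketch builds a policy document that lets a client connect to one cluster and write to one topic. It assumes the `kafka-cluster:` action names and the cluster and topic ARN formats used by Amazon MSK IAM access control. The function name and all argument values are hypothetical.

```python
import json


def producer_policy(region, account_id, cluster_name, cluster_uuid, topic):
    """IAM policy allowing a producer to connect and write to a single topic."""
    base = f"arn:aws:kafka:{region}:{account_id}"
    cluster_arn = f"{base}:cluster/{cluster_name}/{cluster_uuid}"
    topic_arn = f"{base}:topic/{cluster_name}/{cluster_uuid}/{topic}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Permission to open a connection to the cluster.
                "Effect": "Allow",
                "Action": ["kafka-cluster:Connect"],
                "Resource": cluster_arn,
            },
            {   # Permission to describe and write to one topic only.
                "Effect": "Allow",
                "Action": ["kafka-cluster:DescribeTopic",
                           "kafka-cluster:WriteData"],
                "Resource": topic_arn,
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder identifiers, not real resources.
    policy = producer_policy("us-east-1", "111122223333",
                             "demo-cluster", "example-uuid", "orders")
    print(json.dumps(policy, indent=2))
```

A consumer would need a different set of actions (for example, read and consumer-group permissions), so treat this as a starting point rather than a complete policy.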

Encryption at rest and in transit

Amazon MSK encrypts your data at rest without special configuration or third-party tools. For provisioned clusters, all data at rest can be encrypted using an AWS KMS key by default or your own key. You can also encrypt data in transit through TLS between brokers and between clients and brokers on your cluster. For serverless clusters, all data at rest is encrypted by default using service-managed keys and all data in transit is encrypted by default through TLS.
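
For provisioned clusters, these encryption choices are expressed as a single structure at cluster creation. The following is a minimal sketch, assuming the `EncryptionInfo` shape accepted by the boto3 `kafka` client. The helper name is illustrative, and the key ARN in the usage is a placeholder.

```python
CLIENT_BROKER_MODES = {"TLS", "TLS_PLAINTEXT", "PLAINTEXT"}


def encryption_info(kms_key_arn=None, client_broker="TLS", in_cluster=True):
    """Build an EncryptionInfo structure for a provisioned cluster.

    Omitting kms_key_arn leaves at-rest encryption on the default key.
    """
    if client_broker not in CLIENT_BROKER_MODES:
        raise ValueError(f"unknown client-broker mode: {client_broker}")
    info = {
        "EncryptionInTransit": {
            "ClientBroker": client_broker,  # encryption between clients and brokers
            "InCluster": in_cluster,        # encryption between brokers
        }
    }
    if kms_key_arn is not None:
        info["EncryptionAtRest"] = {"DataVolumeKMSKeyId": kms_key_arn}
    return info


if __name__ == "__main__":
    print(encryption_info())
```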

Connectivity over the internet

Amazon MSK offers an option to securely connect to the brokers of Amazon MSK clusters running Apache Kafka 2.6.0 or later versions over the internet. By enabling Public Access, authorized clients external to a private Amazon VPC can stream encrypted data in and out of specific Amazon MSK clusters.
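
Clients reach brokers on different ports depending on the authentication method and on whether they connect from inside the VPC or over the internet. The following sketch encodes the port numbers listed in the Amazon MSK Developer Guide as an assumption. The function and its argument names are illustrative.

```python
# Broker ports by authentication method (assumed from the MSK Developer Guide).
IN_VPC_PORTS = {"plaintext": 9092, "tls": 9094, "sasl_scram": 9096, "iam": 9098}
PUBLIC_PORTS = {"tls": 9194, "sasl_scram": 9196, "iam": 9198}


def broker_port(auth, public=False):
    """Return the broker port for an auth method, inside the VPC or public."""
    ports = PUBLIC_PORTS if public else IN_VPC_PORTS
    if auth not in ports:
        scope = "public access" if public else "in-VPC access"
        raise ValueError(f"{auth!r} is not available for {scope}")
    return ports[auth]


if __name__ == "__main__":
    print("IAM in VPC:", broker_port("iam"))
    print("IAM public:", broker_port("iam", public=True))
```

The public table has no plaintext entry because public access is intended for authenticated, encrypted clients.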

Scalable

On-demand broker scaling (provisioned clusters only)

You can scale your MSK Provisioned clusters by adding more brokers or moving to a larger-sized broker instance in minutes with no downtime. Similarly, you can scale down your cluster capacity by removing brokers or by moving to a smaller-sized broker instance.
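
Changing the broker count can be scripted. The following is a minimal sketch, assuming the `update_broker_count` operation of the boto3 `kafka` client and assuming that the target count must be a multiple of the number of Availability Zones the cluster spans. The helper name and the values in the usage comment are placeholders.

```python
def broker_count_update(cluster_arn, current_version, target_brokers, az_count=3):
    """Build keyword arguments for a broker-count change.

    current_version is the cluster's current version string as returned
    by a describe call; it is not the Apache Kafka version.
    """
    if target_brokers <= 0 or target_brokers % az_count != 0:
        raise ValueError(
            f"target broker count must be a positive multiple of {az_count}")
    return {
        "ClusterArn": cluster_arn,
        "CurrentVersion": current_version,
        "TargetNumberOfBrokerNodes": target_brokers,
    }


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   kafka = boto3.client("kafka")
#   kafka.update_broker_count(**broker_count_update(arn, version, 6))
```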

Cluster scaling (serverless clusters only)

Amazon MSK Serverless clusters automatically adjust compute and storage resources available to your workloads in response to your application’s throughput needs.

Automatic partition management

Amazon MSK integrates with Cruise Control, a popular open source tool for Apache Kafka that automatically manages partition assignments on your behalf. For serverless clusters, Amazon MSK automatically manages partition assignments for you.

Automatic storage scaling

You can seamlessly scale up the amount of storage provisioned per broker to match storage requirement changes using the AWS Management Console or AWS Command Line Interface (AWS CLI). You can also create an auto scaling policy to automatically expand your storage to meet growing streaming requirements.
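
An auto scaling policy for broker storage is defined through Application Auto Scaling. The following is a sketch, assuming the `kafka` service namespace, the `kafka:broker-storage:VolumeSize` scalable dimension, and the `KafkaBrokerStorageUtilization` predefined metric. The helper name, policy name, target utilization, and capacity bounds are illustrative.

```python
def storage_autoscaling_requests(cluster_arn, max_gib, target_utilization=60,
                                 min_gib=1, policy_name="msk-storage-scaling"):
    """Return (register_scalable_target kwargs, put_scaling_policy kwargs)."""
    if not 10 <= target_utilization <= 80:
        raise ValueError("pick a target utilization between 10 and 80 percent")
    common = {
        "ServiceNamespace": "kafka",
        "ResourceId": cluster_arn,
        "ScalableDimension": "kafka:broker-storage:VolumeSize",
    }
    target = dict(common, MinCapacity=min_gib, MaxCapacity=max_gib)
    policy = dict(
        common,
        PolicyName=policy_name,
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "TargetValue": float(target_utilization),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "KafkaBrokerStorageUtilization",
            },
            # Storage only grows; scale-in is disabled in this sketch.
            "DisableScaleIn": True,
        },
    )
    return target, policy


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   aas = boto3.client("application-autoscaling")
#   target, policy = storage_autoscaling_requests(cluster_arn, max_gib=1000)
#   aas.register_scalable_target(**target)
#   aas.put_scaling_policy(**policy)
```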

Configurable

Cluster configuration

With Amazon MSK, you can choose how configurable you want your clusters to be. Express brokers come preconfigured with the Amazon MSK recommended best practice defaults. This provides optimal availability, durability, and throughput performance out of the box. You can customize select configurations to meet the specific needs of your workload. On the other hand, Standard brokers give you the flexibility to modify more than 30 different cluster configurations. This allows you to tailor the availability, price performance, and overall cluster behavior to your exact requirements. You also have access to Kafka's full suite of dynamic and topic-level configurations, helping you to further refine your experience. For more information, see the Custom MSK configurations documentation.
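
A custom configuration for Standard brokers is a set of Apache Kafka server properties. The following is a minimal sketch that renders a few standard Kafka broker properties into the bytes payload that the `create_configuration` operation of the boto3 `kafka` client takes as `ServerProperties`. The helper name and the chosen values are illustrative, and whether a given property is modifiable on your cluster is something to check in the Custom MSK configurations documentation.

```python
def render_server_properties(overrides):
    """Render a dict of Kafka broker settings as a properties-file payload."""
    lines = []
    for key in sorted(overrides):
        value = overrides[key]
        if isinstance(value, bool):
            value = "true" if value else "false"  # properties files use lowercase
        lines.append(f"{key}={value}")
    return ("\n".join(lines) + "\n").encode("utf-8")


if __name__ == "__main__":
    payload = render_server_properties({
        "auto.create.topics.enable": False,
        "default.replication.factor": 3,
        "min.insync.replicas": 2,
    })
    print(payload.decode("utf-8"), end="")

# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("kafka").create_configuration(
#       Name="example-config", ServerProperties=payload)
```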

Observable

Monitoring streaming performance with Amazon CloudWatch metrics

You can visualize and monitor key metrics using CloudWatch to understand and maintain streaming application performance. A default set of over 50 metrics is available at no additional cost. You can also enable more enhanced broker-level and topic-level monitoring to troubleshoot specific issues. Enhanced metrics are billed at standard CloudWatch rates.
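
Moving from the default metric set to enhanced monitoring is a single setting on the cluster. The following is a sketch, assuming the `update_monitoring` operation of the boto3 `kafka` client and its `EnhancedMonitoring` levels. The helper name is illustrative.

```python
# Monitoring levels, from least to most detailed.
MONITORING_LEVELS = (
    "DEFAULT",
    "PER_BROKER",
    "PER_TOPIC_PER_BROKER",
    "PER_TOPIC_PER_PARTITION",
)


def monitoring_update(cluster_arn, current_version, level):
    """Build keyword arguments that switch a cluster's monitoring level."""
    if level not in MONITORING_LEVELS:
        raise ValueError(f"level must be one of {MONITORING_LEVELS}")
    return {
        "ClusterArn": cluster_arn,
        "CurrentVersion": current_version,
        "EnhancedMonitoring": level,
    }


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("kafka").update_monitoring(
#       **monitoring_update(arn, version, "PER_TOPIC_PER_BROKER"))
```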

Exporting JMX and Node metrics to a Prometheus server with open monitoring (provisioned clusters only)

Open monitoring with Prometheus lets you monitor Amazon MSK using solutions such as Datadog, Lenses, New Relic, Sumo Logic, or a Prometheus server, and easily migrate your existing monitoring dashboards to Amazon MSK. For more information, see the Open monitoring with Prometheus documentation.
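
With open monitoring enabled, a Prometheus server scrapes each broker directly. The following sketch generates a file-based service discovery target list, assuming the JMX exporter listens on port 11001 and the Node exporter on port 11002 as described in the Open monitoring with Prometheus documentation. The broker hostnames are placeholders.

```python
import json

JMX_EXPORTER_PORT = 11001   # Kafka JMX metrics (assumed port)
NODE_EXPORTER_PORT = 11002  # host-level CPU and disk metrics (assumed port)


def prometheus_targets(broker_hosts):
    """Build a Prometheus file_sd target list for a set of broker hostnames."""
    hosts = list(broker_hosts)
    return [
        {"labels": {"job": "jmx"},
         "targets": [f"{h}:{JMX_EXPORTER_PORT}" for h in hosts]},
        {"labels": {"job": "node"},
         "targets": [f"{h}:{NODE_EXPORTER_PORT}" for h in hosts]},
    ]


if __name__ == "__main__":
    # Placeholder hostnames; use your cluster's broker DNS names.
    brokers = ["b-1.example.kafka.us-east-1.amazonaws.com",
               "b-2.example.kafka.us-east-1.amazonaws.com"]
    print(json.dumps(prometheus_targets(brokers), indent=2))
```

The printed JSON can be saved to a file that a `file_sd_configs` entry in the Prometheus configuration points to.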

Exporting broker logs to a destination of your choice

Broker logs enable you to troubleshoot your Apache Kafka applications and analyze their communications with your MSK cluster. You can deliver Apache Kafka broker logs to one or more of the following destination types: Amazon CloudWatch Logs, Amazon Simple Storage Service (Amazon S3), and Amazon Data Firehose. You can also log Amazon MSK API calls with AWS CloudTrail.
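
The three log destinations can be enabled independently. The following is a minimal sketch, assuming the `LoggingInfo` structure accepted by the boto3 `kafka` client. The helper name and the resource names in the usage are placeholders.

```python
def logging_info(log_group=None, bucket=None, prefix="", delivery_stream=None):
    """Build a LoggingInfo structure that enables the requested destinations."""
    broker_logs = {}
    if log_group:
        broker_logs["CloudWatchLogs"] = {"Enabled": True, "LogGroup": log_group}
    if bucket:
        broker_logs["S3"] = {"Enabled": True, "Bucket": bucket, "Prefix": prefix}
    if delivery_stream:
        broker_logs["Firehose"] = {"Enabled": True,
                                   "DeliveryStream": delivery_stream}
    if not broker_logs:
        raise ValueError("choose at least one log destination")
    return {"BrokerLogs": broker_logs}


if __name__ == "__main__":
    # Placeholder names for a log group and a bucket.
    print(logging_info(log_group="/msk/demo/broker",
                       bucket="example-log-bucket", prefix="msk/"))
```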

Deeply integrated

Breadth and depth

We offer a wide variety of AWS integrations in Amazon MSK. These integrations include the following:

AWS Identity and Access Management (IAM) for Apache Kafka and service-level API access control
Amazon Managed Service for Apache Flink for running fully managed Apache Flink applications to process streaming data within Apache Kafka
Amazon Managed Service for Apache Flink Studio for running interactive streaming SQL and long-running SQL jobs using Apache Flink SQL
AWS Glue Schema Registry for centrally controlling and evolving schemas
AWS IoT Core for IoT event streaming into Amazon MSK
AWS Database Migration Service (AWS DMS) for change data capture and analytics
Amazon VPC for private client connectivity and network isolation
AWS KMS for at-rest encryption
AWS Certificate Manager Private Certificate Authority for mutual TLS client authentication
AWS Secrets Manager for secure storage and management of SASL/SCRAM secrets
AWS CloudFormation for deploying Amazon MSK in code
Amazon CloudWatch for cluster-, broker-, topic-, consumer-, and partition-level metrics

Next steps

FAQs

Find answers to frequently asked questions

Learn more

Documentation

Learn how to get started with the Amazon MSK Developer Guide
