Amazon MSK is a fully managed service for building and running applications that use Apache Kafka to process streaming data. Operating Apache Kafka clusters yourself is complex and time-consuming. With Amazon MSK, you can run production applications on Apache Kafka without needing Apache Kafka infrastructure management expertise, so you spend less time managing infrastructure and more time building applications.
Support for native Apache Kafka APIs and tools
Amazon MSK supports native Apache Kafka APIs and existing open-source tools built against those APIs. This enables existing Apache Kafka applications to work with Amazon MSK clusters without changes to application code. You continue to use Apache Kafka’s APIs and the open-source ecosystem to populate data lakes, stream changes to and from databases, and power machine learning and analytics applications.
No servers to manage
With a few clicks in the Amazon MSK console, you can create a fully managed Apache Kafka cluster that follows Apache Kafka’s deployment best practices, or you can create your own cluster using your own custom configuration. Once you create your desired configuration, Amazon MSK automatically provisions, configures, and manages the operations of your Apache Kafka cluster and Apache ZooKeeper nodes.
Fully managed Apache Kafka upgrades
You can upgrade the Apache Kafka version on an Amazon MSK cluster in a few clicks to take advantage of the features and bug fixes in newer Apache Kafka versions. Amazon MSK provides a fully managed upgrade experience, using a rolling update of the Apache Kafka brokers to enable in-place upgrades for customers who follow high-availability best practices.
Apache ZooKeeper included
Apache ZooKeeper is required to run Apache Kafka; it coordinates cluster tasks and maintains state for resources interacting with the cluster. Amazon MSK manages the Apache ZooKeeper nodes for you. Each Amazon MSK cluster includes the appropriate number of Apache ZooKeeper nodes for your Apache Kafka cluster at no additional cost.
Automatic recovery and patching
Amazon MSK continuously monitors the health of your clusters and replaces unhealthy brokers without downtime for your applications. Amazon MSK manages the availability of your Apache ZooKeeper nodes so you will not need to start, stop, or directly access the nodes yourself. Amazon MSK also deploys software patches as needed to keep your cluster up to date and running smoothly.
Amazon MSK uses multi-AZ replication for high availability. Data replication is included at no additional cost.
Your Apache Kafka clusters run in an Amazon VPC managed by Amazon MSK. Your clusters are available to your own Amazon VPCs, subnets, and security groups based on the configuration you specify. You have complete control of your network configuration, and IP addresses from your VPCs are attached to your Amazon MSK resources through elastic network interfaces (ENIs).
Encryption and security
Amazon MSK encrypts your data at rest without special configuration or third-party tools. By default, all data is encrypted at rest with an AWS Key Management Service (KMS) customer master key (CMK), or you can supply your own CMK.
Amazon MSK also encrypts data in transit using TLS, both between brokers and between clients and brokers on your cluster. To authenticate and authorize producers and consumers within your cluster, Amazon MSK supports TLS-based certificate authentication, SASL/SCRAM authentication secured by AWS Secrets Manager, and Apache Kafka access control lists (ACLs).
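As an illustration of what SASL/SCRAM authentication over TLS looks like from the client side, the following minimal sketch builds the standard Apache Kafka client properties for it. The function name is hypothetical, the SCRAM-SHA-512 mechanism is an assumption not stated above, and in practice the credentials would be read from AWS Secrets Manager rather than passed in as literals.

```python
def scram_client_properties(username: str, password: str) -> dict:
    """Build Kafka client properties for SASL/SCRAM over TLS (illustrative helper)."""
    for value in (username, password):
        # Quotes or backslashes would need JAAS escaping; this sketch rejects them.
        if '"' in value or "\\" in value:
            raise ValueError("credentials must not contain quotes or backslashes")

    jaas = (
        "org.apache.kafka.common.security.scram.ScramLoginModule required "
        f'username="{username}" password="{password}";'
    )
    return {
        # SASL authentication carried over a TLS-encrypted connection.
        "security.protocol": "SASL_SSL",
        # Assumption: the cluster is set up for SCRAM-SHA-512.
        "sasl.mechanism": "SCRAM-SHA-512",
        "sasl.jaas.config": jaas,
    }
```

The returned keys are ordinary Apache Kafka client settings, so they can be written to a client properties file or mapped onto the equivalent options of whichever Kafka client library you use.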
You can start with a few brokers within an Amazon MSK cluster. Then, using the AWS Management Console or AWS CLI, you can scale up to hundreds of brokers per cluster. Submit a limit increase request if you need more than 15 brokers per cluster or more than 30 brokers per account.
Alternatively, you can scale your Amazon MSK clusters by changing the size or family of your Apache Kafka brokers. Changing the size or family of your brokers gives you the flexibility to adjust your MSK cluster’s compute capacity as your workloads change.
You can seamlessly scale up the amount of storage provisioned per broker to match changes in storage requirements using the AWS Management Console or AWS CLI. You can also create an auto scaling policy that automatically expands your storage to meet your streaming requirements.
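The broker limits mentioned above (15 brokers per cluster and 30 brokers per account before a limit increase is needed) can be checked ahead of a scaling operation. The sketch below is a hypothetical planning helper, not part of any AWS SDK; the function and constant names are illustrative.

```python
# Default limits quoted in the text above; higher values require a limit increase.
DEFAULT_MAX_BROKERS_PER_CLUSTER = 15
DEFAULT_MAX_BROKERS_PER_ACCOUNT = 30


def needs_limit_increase(target_cluster_brokers: int, other_account_brokers: int = 0) -> bool:
    """Return True if the planned broker count exceeds either default limit.

    target_cluster_brokers: brokers the cluster should have after scaling.
    other_account_brokers: brokers already running in other clusters in the account.
    """
    if target_cluster_brokers < 1 or other_account_brokers < 0:
        raise ValueError("broker counts must be non-negative, with at least one target broker")

    over_cluster = target_cluster_brokers > DEFAULT_MAX_BROKERS_PER_CLUSTER
    over_account = (
        target_cluster_brokers + other_account_brokers > DEFAULT_MAX_BROKERS_PER_ACCOUNT
    )
    return over_cluster or over_account
```

For example, scaling a single cluster to 12 brokers stays within the per-cluster limit, but it would still need a limit increase if 20 brokers already run elsewhere in the same account.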
Amazon MSK makes it easier for AWS customers to build end-to-end solutions by providing native AWS integrations out of the box. You can run fully managed Apache Flink applications on data within Amazon MSK, encrypt data at rest using AWS KMS, and authenticate clients to Amazon MSK using AWS Certificate Manager Private CAs or client credentials secured by AWS Secrets Manager. You can also deploy Amazon MSK using code with AWS CloudFormation, privately connect clients within an Amazon VPC to Amazon MSK, use AWS Identity and Access Management (IAM) for fine-grained service-level API control, and integrate with the AWS Glue Schema Registry to centrally control and evolve your data schemas.
Integrates with AWS Glue Schema Registry
AWS Glue Schema Registry, a serverless feature of AWS Glue, enables you to validate and control the evolution of streaming data using registered Apache Avro schemas, at no additional charge. Through Apache-licensed serializers and deserializers, the Schema Registry integrates with Java applications developed for Apache Kafka, Amazon Managed Streaming for Apache Kafka (MSK), Amazon Kinesis Data Streams, Apache Flink, Amazon Kinesis Data Analytics for Apache Flink, and AWS Lambda. When data streaming applications are integrated with the Schema Registry, you can improve data quality and safeguard against unexpected changes using compatibility checks that govern schema evolution.
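To make the idea of a compatibility check concrete, the sketch below shows one rule that backward-compatibility checks typically enforce for Apache Avro record schemas: a field added in a new schema version needs a default value, so that readers using the new schema can still decode records written with the old one. This is a deliberately simplified illustration, not the Schema Registry's implementation; the function name and the `Order` schemas are hypothetical, and real checks cover many more cases (type changes, nested records, aliases).

```python
def added_fields_without_default(old_schema: dict, new_schema: dict) -> list:
    """List fields that the new Avro record schema adds without a default value.

    An empty list means this one backward-compatibility rule is satisfied.
    """
    old_names = {field["name"] for field in old_schema["fields"]}
    return [
        field["name"]
        for field in new_schema["fields"]
        if field["name"] not in old_names and "default" not in field
    ]


# Hypothetical schema versions for an "Order" record.
ORDER_V1 = {
    "type": "record",
    "name": "Order",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "amount", "type": "double"},
    ],
}

# Adds "currency" with a default: old records can still be read.
ORDER_V2 = {
    "type": "record",
    "name": "Order",
    "fields": ORDER_V1["fields"]
    + [{"name": "currency", "type": "string", "default": "USD"}],
}

# Adds "region" without a default: old records have no value to fill in.
ORDER_V3 = {
    "type": "record",
    "name": "Order",
    "fields": ORDER_V1["fields"] + [{"name": "region", "type": "string"}],
}
```

Under this rule, moving from `ORDER_V1` to `ORDER_V2` passes, while moving from `ORDER_V1` to `ORDER_V3` is flagged because of the `region` field.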
Amazon MSK deploys a best-practice cluster configuration for Apache Kafka by default, and lets customers tune more than 30 different cluster configuration settings while supporting all dynamic and topic-level configurations. For more information, see Custom MSK Configurations in the documentation.
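A custom configuration is essentially a set of Apache Kafka broker properties that override the defaults. The following minimal sketch renders such overrides in the `key=value` form used by Kafka's server properties files. The helper name is hypothetical, and the three properties shown are standard Apache Kafka broker settings chosen as examples; check the documentation for which settings your cluster version allows you to change.

```python
def render_server_properties(overrides: dict) -> bytes:
    """Render broker property overrides as UTF-8 encoded key=value lines."""
    lines = []
    for key in sorted(overrides):  # sorted for stable, reviewable output
        value = overrides[key]
        if isinstance(value, bool):
            # Kafka expects lowercase booleans, not Python's True/False.
            value = "true" if value else "false"
        lines.append(f"{key}={value}")
    return ("\n".join(lines) + "\n").encode("utf-8")


# Example overrides (illustrative values, not recommendations).
EXAMPLE_OVERRIDES = {
    "auto.create.topics.enable": False,
    "default.replication.factor": 3,
    "min.insync.replicas": 2,
}
```

Calling `render_server_properties(EXAMPLE_OVERRIDES)` yields the bytes you would supply as the server properties of a custom configuration through the console, the AWS CLI, or an SDK; that upload step is omitted here.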