Amazon MSK Replicator
Reliable and automated data replication for Amazon MSK clusters
Why MSK Replicator?
Amazon MSK Replicator is a feature of Amazon MSK that enables you to reliably replicate data across Amazon MSK clusters and between Kafka deployments, including those running on-premises, on AWS, or on other cloud providers, as well as Kafka-protocol-compatible services like Confluent Platform, Aiven, Redpanda, WarpStream, or AutoMQ. You can set up replication in just a few clicks, without needing expertise in open-source tools, writing code, or managing infrastructure. MSK Replicator automatically provisions and scales underlying resources, so you can easily build multi-region applications and pay only for the data you replicate.
Benefits
With MSK Replicator, there is no need to provision infrastructure or to set up and run Apache Kafka MirrorMaker 2. MSK Replicator drives best practices through design, monitoring, and automation, so you can easily replicate data and metadata between Apache Kafka clusters.
MSK Replicator provides a serverless experience and scales the underlying resources up and down, so you don’t have to plan replication infrastructure capacity, and you pay only for what you need to replicate data across your Apache Kafka clusters.
MSK Replicator can help applications stay available and performant for business continuity. If a single AWS Region becomes degraded, your application can redirect to a different Region that already has the data replicated from the primary Region.
Amazon MSK Replicator simplifies migration from self-managed Kafka and other Kafka cloud environments to MSK Express brokers through managed replication of topic data and metadata. With built-in monitoring and reliable bidirectional replication, MSK Replicator accelerates data migrations compared to traditional tools like MirrorMaker 2 (MM2). During migrations, MSK Replicator keeps consumer group offsets in sync between the source and destination clusters in both directions, so your applications can seamlessly switch between clusters without losing their place. This lets you migrate producers and consumers in any order without duplicate message processing.
Use cases
Distribute streaming data between Amazon MSK and external Kafka clusters — whether on-premises, at the edge, or across other cloud providers. MSK Replicator enables bidirectional data flow, allowing you to selectively push data from MSK to external Kafka deployments or pull data from external clusters into MSK, supporting multi-cloud and hybrid architectures where workloads span multiple environments.
Consolidate streaming data from multiple Kafka deployments into a single Amazon MSK cluster for centralized analytics and insights.
Enable bidirectional replication with safe rollback mechanisms. Migrate applications back and forth between clusters if needed, ensuring business continuity. If operating at the edge or on-premises, create disaster recovery clusters in Amazon MSK to maintain continuous data replication and seamless failover when your primary environment is unavailable.
Move your existing Kafka workloads to Amazon MSK Express brokers from other Kafka deployments, including those running on-premises, on AWS, or on other cloud providers, as well as from Kafka-protocol-compatible services like Confluent Platform, Aiven, Redpanda, WarpStream, or AutoMQ.
Replicate data across regions for low-latency access and improved application resilience. With multi-active replication, you can automatically propagate writes to multiple regions.
Fitch Group
Amazon MSK Replicator powers our FitchRatingsPro web platform, helping us deliver world-class financial data and insights to customers through extensive data feeds and API interfaces. MSK Replicator has simplified our multi-region architecture, replicating data across clusters to manage traffic and keep FitchRatingsPro reliable no matter where customers connect. The seamless data replication while preserving topic names has been a game-changer, allowing us to serve customers with low-latency, high-availability access to critical business data.
Derek Ferguson, Chief Software Officer, Fitch Group