[SEO Subhead]
This Guidance demonstrates how to implement dual-write capabilities from Apache Cassandra to Amazon Keyspaces (for Apache Cassandra) with minimal downtime during the data migration. Included are AWS CloudFormation templates that significantly reduce the complexity of setting up the key components, such as the Amazon Managed Streaming for Apache Kafka (Amazon MSK) cluster, Amazon Keyspaces, and the Apache Cassandra cluster, reducing manual configuration effort. These templates, along with additional scripts for Apache Kafka Sink connectors, allow for the simultaneous insertion of data into Amazon Keyspaces and Apache Cassandra databases to facilitate a seamless migration process.
Note: [Disclaimer]
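For illustration only, the sketch below shows the dual-write outcome at the application level using the DataStax Python driver: the same row is written to both the self-managed Apache Cassandra cluster and Amazon Keyspaces. In this Guidance, the dual write is performed by the Kafka Sink connectors rather than by application code, and the contact point, keyspace, table, and helper names shown here are placeholder assumptions.

# Conceptual dual-write sketch (illustrative only; the Guidance performs this
# step with Kafka Sink connectors). The cluster address, keyspace, table name,
# and helper function are assumptions for the example.
from cassandra.cluster import Cluster

# Session against the existing self-managed Apache Cassandra cluster
cassandra_session = Cluster(["10.0.1.10"]).connect("migration")

# Session against Amazon Keyspaces (TLS and SigV4 setup omitted here;
# see the connection sketch under the Security pillar below)
keyspaces_session = build_keyspaces_session()  # hypothetical helper

INSERT_CQL = (
    "INSERT INTO orders (order_id, customer_id, total) "
    "VALUES (%s, %s, %s)"
)

def dual_write(order_id, customer_id, total):
    """Write the same row to both databases so they stay in sync."""
    params = (order_id, customer_id, total)
    cassandra_session.execute(INSERT_CQL, params)
    keyspaces_session.execute(INSERT_CQL, params)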
Architecture Diagram

[Architecture diagram description]
Step 1
The Apache Kafka console producer command line interface (CLI), hosted on an Amazon Elastic Compute Cloud (Amazon EC2) instance and functioning as a Kafka client, produces and publishes messages directly to a Kafka topic within the Amazon Managed Streaming for Apache Kafka (Amazon MSK) cluster.
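The same publish step can also be scripted; the sketch below uses the kafka-python client in place of the console producer CLI. The broker address, topic name, and message payload are placeholder assumptions, and MSK brokers are typically reached over TLS from within the VPC.

# Minimal producer sketch with kafka-python (an alternative to the console
# producer CLI). Broker address and topic name are placeholder assumptions.
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=["b-1.example.kafka.us-east-1.amazonaws.com:9094"],
    security_protocol="SSL",  # MSK brokers typically use TLS in transit
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Each message is later picked up by the sink connectors and written to
# both Apache Cassandra and Amazon Keyspaces.
producer.send("cassandra-migration-topic", {"order_id": "1001", "total": 42.5})
producer.flush()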
Get Started

Deploy this Guidance
Well-Architected Pillars

The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
Amazon MSK is a fully managed Apache Kafka service that automates complex administrative tasks like setup, scaling, and patching. Managing Kafka Connect connectors directly within the Amazon MSK service not only automates operations but also optimizes them for handling high-volume data streams with minimal downtime, supporting continuous improvement and operational resilience. Moreover, you can use Amazon CloudWatch to monitor the metrics published by Amazon MSK to quickly identify and troubleshoot any issues that arise. This monitoring capability allows for quick detection of anomalies and performance bottlenecks, making it easier to maintain system reliability and meet your service level agreements.
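As one example of that monitoring, the boto3 sketch below retrieves a broker throughput metric for an MSK cluster from CloudWatch; the cluster name and time window are assumptions.

# Fetch an MSK broker metric from CloudWatch (cluster name is an assumption).
from datetime import datetime, timedelta
import boto3

cloudwatch = boto3.client("cloudwatch")

response = cloudwatch.get_metric_statistics(
    Namespace="AWS/Kafka",
    MetricName="BytesInPerSec",
    Dimensions=[{"Name": "Cluster Name", "Value": "msk-migration-cluster"}],
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    Period=300,
    Statistics=["Average"],
)

# Print the hourly throughput samples in chronological order
for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Average"])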
Security
By configuring a combination of AWS PrivateLink, Amazon VPC, and Amazon VPC endpoints, these services work in tandem to help ensure that all data transfers occur within the private AWS network. This setup minimizes potential attack vectors by keeping critical infrastructure off the public internet and restricting access to trusted entities only. Specifically, PrivateLink facilitates secure data transmission within AWS, while Amazon VPC helps ensure that both the Amazon MSK cluster and the Apache Cassandra Amazon EC2 instances operate in a secure, isolated network environment, accessible only through specific, controlled points. In addition, Amazon VPC endpoints for Amazon Keyspaces allow secure, private connectivity between your VPC and Amazon Keyspaces, removing the need to use public endpoints. Lastly, AWS Identity and Access Management (IAM) roles provide fine-grained access control so that only authorized users and systems can access specific AWS resources.
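For example, an application or connector host inside the VPC can reach Amazon Keyspaces over TLS with SigV4-signed IAM credentials using the AWS cassandra-sigv4 plugin for the DataStax Python driver, as sketched below. The Region, endpoint hostname, and certificate path are assumptions.

# Connect to Amazon Keyspaces over TLS with IAM (SigV4) authentication.
# Region, endpoint hostname, and certificate path are assumptions; traffic
# stays on the private network when resolved through a VPC endpoint.
from ssl import SSLContext, PROTOCOL_TLSv1_2, CERT_REQUIRED
import boto3
from cassandra.cluster import Cluster
from cassandra_sigv4.auth import SigV4AuthProvider

ssl_context = SSLContext(PROTOCOL_TLSv1_2)
ssl_context.load_verify_locations("sf-class2-root.crt")  # Starfield root CA bundle
ssl_context.verify_mode = CERT_REQUIRED

# IAM credentials from the environment or instance role sign each request
auth_provider = SigV4AuthProvider(boto3.Session(region_name="us-east-1"))

cluster = Cluster(
    ["cassandra.us-east-1.amazonaws.com"],  # Keyspaces endpoint for the Region
    port=9142,
    ssl_context=ssl_context,
    auth_provider=auth_provider,
)
session = cluster.connect()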
Reliability
Amazon MSK is a resilient streaming service that automatically manages data replication and failover across its Kafka brokers in multiple Availability Zones (AZs) for message handling. Additionally, Amazon Keyspaces enhances data availability through automatic three-way replication across three AZs within an AWS Region. Amazon EC2 instances hosting Apache Cassandra are deployed across private subnets in different AZs through Amazon VPC, distributing resources to mitigate risks from single points of failure. Lastly, PrivateLink secures data transfers to Amazon Keyspaces for reliable and protected data flow without exposure to the public internet.
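To take advantage of that replication, writes can be issued at the LOCAL_QUORUM consistency level, which Amazon Keyspaces supports for durable writes across AZs. A brief sketch follows; the keyspace and table names are assumptions.

# Issue a write at LOCAL_QUORUM so it is acknowledged by a quorum of
# replicas across Availability Zones (keyspace/table names are assumptions).
from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

statement = SimpleStatement(
    "INSERT INTO migration.orders (order_id, customer_id, total) "
    "VALUES (%s, %s, %s)",
    consistency_level=ConsistencyLevel.LOCAL_QUORUM,
)
# 'session' is a connected driver session, such as the one created in the
# connection sketch under the Security pillar above
session.execute(statement, ("1001", "42", 42.5))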
Performance Efficiency
Amazon Keyspaces is a managed, serverless database service that automatically scales capacity to match the demand of incoming writes from Amazon MSK, supporting efficient processing without latency issues. This automation supports consistent performance even during high volumes of writes. Amazon Keyspaces also offers workload isolation at the table level so that the performance of one table is not affected by the workload of another. This feature supports predictable performance across tables by maintaining dedicated resources for each table.
Cost Optimization
Amazon MSK is a fully managed Kafka service that removes the need for manual provisioning and management of Kafka clusters, thus minimizing operational overhead and reducing resource waste. Amazon Keyspaces eliminates the need for you to invest in hardware upfront. You can offload essential operational tasks such as provisioning, patching, and managing servers, as well as installing, maintaining, and operating database software, to AWS.
Sustainability
With Amazon Keyspaces, you can choose on-demand or provisioned capacity mode so you can optimize the use of reads and writes based on your traffic patterns, preventing the over-provisioning of your resources. This efficient use of infrastructure conserves resources and reduces energy waste.
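For instance, a table's capacity mode can be switched between on-demand and provisioned through the Amazon Keyspaces API; the boto3 sketch below sets on-demand mode for a placeholder table.

# Switch a table to on-demand capacity mode (keyspace and table names are
# assumptions). Use 'PROVISIONED' with read/write capacity units instead
# when traffic is steady and predictable.
import boto3

keyspaces = boto3.client("keyspaces")

keyspaces.update_table(
    keyspaceName="migration",
    tableName="orders",
    capacitySpecification={"throughputMode": "PAY_PER_REQUEST"},
)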
Related Content

[Title]
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.