Setting Data in Motion at the Mobile Edge with AWS Wavelength and Confluent
By Joseph Morais, Staff Cloud Partner Solutions Architect – Confluent
By Robert Belson, Developer Advocate, Edge Computing – AWS
One of the most exciting evolutions in cloud computing is the proliferation of hybrid cloud and edge computing services.
Giving developers and enterprises more choices for where to run applications lets customers satisfy increasingly complex requirements for latency, data sovereignty, high availability, and beyond.
In this post, learn how the Confluent for AWS Wavelength solution was developed to bring Confluent’s data-in-motion technology to the network edge.
We’ll demonstrate how the solution extends Confluent for Kubernetes (CFK) to Amazon EKS self-managed nodes in AWS Wavelength Zones to create a geo-distributed Kafka cluster at scale for industrial IoT and low-latency 5G use cases.
Founded by the creators of Apache Kafka, Confluent is an AWS Data and Analytics Competency Partner that enables organizations to harness business value from stream data.
The Shift to the Edge
While edge computing allows for a whole new world of customer experiences and real-time use cases, the implementation of edge architectures often introduces multi-faceted challenges to customers.
The conversation-starter is almost always cost. If an application in the cloud were to be distributed to every edge environment, the cost would increase by a multiplicative factor (corresponding to the number of edges), not to mention the incremental management complexity to operate each of these topologically distinct edge locations.
Operational staffing is often a known constraint, so having staff available to support a global edge computing use case can also be a daunting task. Furthermore, given the inherently ambiguous definition of “edge,” lighter-weight edge devices running production applications do not have the same availability and durability guarantees as cloud infrastructure. This can result in outages and downstream customer impacts.
Through AWS edge computing, we can address these challenges directly by providing customers a breadth of solutions. From AWS Regions and the metropolitan edge to the 5G and disconnected edge, customers have more choices than ever to balance their availability and durability requirements, as well as their budget.
AWS Wavelength brings the best of compute and storage services to the edge of carrier 5G networks to deliver ultra-reliable, secure, and low-latency immersive experiences. However, real-time applications for Industry 4.0 use cases often necessitate high-throughput data streaming. In the absence of a managed Kafka or data streaming solution at the edge today, we saw customers express the desire to set their data in motion for 5G-connected devices at scale.
Figure 1 – AWS Wavelength end-to-end latency visualization.
Demystifying Edge Use Cases
We have seen a paradigm shift in the industry, where user experiences and business operations need to happen in real time, powered by data about what’s actually happening in the world.
Companies need to build experiences to meet consumer and employee expectations to differentiate themselves from competitors. In an effort to accelerate these digital initiatives and lower operational costs while increasing development speeds, many businesses are adopting a cloud-first approach.
For example, you might have a large fleet of delivery trucks dropping off packages around the country, and you want your customers and backend systems to know immediately when a package has been successfully delivered. Conversely, if there’s a delay or some other problem, you also want to know about that immediately and update your customers.
Or, you may have thousands of retail locations around the country and want to update your backend inventory systems, as well as in-store discounts and promotions, to respond to real-time customer activities within each store.
Since 2021, AWS and Confluent—in close collaboration with Verizon—have been co-developing joint solutions for the edge cloud. Starting with AWS Wavelength, we explored how to extend data-in-motion solutions via Confluent using the Verizon 5G Ultra Wideband network.
Working closely with early customers and partners, we are excited to announce Confluent for AWS Wavelength, a fully automated solution for delivering low-latency mobile applications without the customer having to purchase or manage dedicated hardware.
This joint solution, available today as a Terraform module, scales seamlessly to all Wavelength Zones within a given AWS Region and lets customers use the Confluent Platform without having to manage the complexities of 5G networks, edge replication, or Kafka cluster configuration.
In this section, we’ll highlight the critical infrastructure and software components that, together, enable Confluent to seamlessly operate in mobile edge computing (MEC) environments.
- AWS Wavelength: To deploy containers at the network edge, nodes are needed in each Wavelength Zone for the underlying Confluent deployment.
- Amazon EKS: Building on existing services developers know and love, the AWS-managed control plane is leveraged to orchestrate containers at the network edge.
- Confluent for Kubernetes (CFK): Confluent’s Kubernetes-native distribution is used to easily extend geo-distributed Kafka clusters closer to end users.
- Cluster linking: To abstract away the hub-and-spoke topology of AWS Wavelength, a managed replication mechanism ensures data generated at the edge can be sent (when needed) back to the cloud.
Figure 2 – Confluent for AWS Wavelength (one site).
AWS Wavelength is an infrastructure offering optimized for mobile edge computing applications. Wavelength Zones are AWS infrastructure deployments that embed AWS compute and storage services within communications service providers’ data centers at the edge of 5G networks.
This allows application traffic from 5G devices to reach workloads running in Wavelength Zones without leaving the telecommunications network. As a result, mobile traffic does not have to egress over the public internet to the cloud and incur an incremental latency penalty.
Furthermore, because the traffic is segmented and isolated to this network, your architecture is not accessible from the public internet and is therefore more secure. As a result, AWS Wavelength allows customers to take advantage of the benefits of modern 5G networks without tackling the complexities of the underlying network itself.
AWS Wavelength provides fully managed infrastructure, including Kubernetes clusters out of the box using Amazon Elastic Kubernetes Service (Amazon EKS) and node groups in any of the generally available Wavelength Zones located closer to your edge endpoints.
Amazon EKS is a managed Kubernetes service that makes it easy to run Kubernetes on AWS and on premises. It automatically manages the availability and scalability of the Kubernetes control plane nodes responsible for scheduling containers, managing application availability, storing cluster data, and other key tasks.
With Amazon EKS, you can take advantage of the performance, scale, reliability, and availability of AWS infrastructure, as well as integrations with AWS networking and security services.
Confluent for Kubernetes (CFK)
On top of Amazon EKS, Confluent for Kubernetes (CFK) is deployed on nodes in AWS Wavelength to orchestrate Confluent’s enterprise-ready version of Kafka in a declarative, API-driven manner. Using the same technology that powers Confluent Cloud, a fully managed offering, Confluent for Kubernetes makes it easy to deploy a performant, complete, production-grade data-in-motion platform based on Apache Kafka to each of your Wavelength Zones.
You can use CFK to deploy Confluent Platform natively to Kubernetes environments and build your own private cloud Kafka service. With CFK, you can achieve the same simplicity, flexibility, and efficiency of the cloud without the headaches and burdens of complex, Kafka-related infrastructure operations.
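To make the declarative model more concrete, the sketch below builds a minimal CFK Kafka custom resource for each Wavelength Zone and prints it as JSON, a format that kubectl apply accepts alongside YAML. Treat it as an illustration under stated assumptions rather than the solution's actual configuration: the field names follow the CFK custom resource as we recall it and should be checked against the CFK documentation, the zone IDs are examples, the namespace naming scheme is hypothetical, and the image tags and storage size are placeholders.

```python
import json


def kafka_manifest(zone_id: str, replicas: int = 3) -> dict:
    """Build an illustrative CFK Kafka custom resource for one Wavelength Zone."""
    return {
        "apiVersion": "platform.confluent.io/v1beta1",
        "kind": "Kafka",
        "metadata": {
            "name": "kafka",
            # Hypothetical convention: one namespace per Wavelength Zone.
            "namespace": f"confluent-{zone_id}",
        },
        "spec": {
            "replicas": replicas,
            "image": {
                # Placeholder tags: substitute the versions you actually run.
                "application": "confluentinc/cp-server:<version>",
                "init": "confluentinc/confluent-init-container:<version>",
            },
            # Illustrative size only; choose capacity based on retention needs.
            "dataVolumeCapacity": "10Gi",
        },
    }


# Example zone IDs; use the Wavelength Zones enabled in your own account.
zones = ["us-east-1-wl1-bos-wlz-1", "us-east-1-wl1-nyc-wlz-1"]
manifests = [kafka_manifest(zone) for zone in zones]

for manifest in manifests:
    print(json.dumps(manifest, indent=2))
```

Because each zone's cluster is described by the same small document, adding another Wavelength Zone amounts to generating one more manifest rather than hand-configuring another Kafka cluster.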
Confluent Cluster Linking
Layering Confluent Cluster Linking on top replicates and seamlessly links data collected and processed in a Wavelength Zone to Confluent Cloud, providing a pipeline to the cloud for aggregating or sharing data in real time. This unlocks deep integration with AWS-native services or ISV solutions in the parent AWS Region, and provides a pattern to build lightning-fast edge applications meshed across all supported locations.
Confluent Cluster Linking is a built-in capability that mirrors data, topic configuration and structure, and consumer offsets from one Confluent cluster to another in real time. A cluster link between a Confluent Platform cluster in your data center—or in this case AWS Wavelength—and a Confluent Cloud cluster in AWS is a single secure, scalable hybrid data bridge that can be used by hundreds of topics, applications, and data systems.
As an example, cluster linking unlocks the ability to aggregate data back to the parent AWS region to utilize multi-site datasets via an Amazon Simple Storage Service (Amazon S3) bucket, or to improve the failure detection model using Amazon SageMaker before pushing the model back to the edge, as shown in Figure 3.
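The mirroring behavior described above can be pictured with a small in-memory model. This is a toy sketch only: it does not use any Confluent API, and the Cluster, Topic, and sync_link names are hypothetical. It shows the three things a cluster link carries from the edge to the cloud (records at their original offsets, topic configuration, and consumer offsets) and that each sync copies only what is new.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    config: dict
    records: list = field(default_factory=list)  # list index doubles as the offset


@dataclass
class Cluster:
    name: str
    topics: dict = field(default_factory=dict)
    consumer_offsets: dict = field(default_factory=dict)  # (group, topic) -> offset


def sync_link(source: Cluster, destination: Cluster, topic_names: list) -> int:
    """Mirror new records, topic config, and consumer offsets; return records copied."""
    copied = 0
    for name in topic_names:
        src = source.topics[name]
        mirror = destination.topics.setdefault(name, Topic(name, dict(src.config)))
        mirror.config = dict(src.config)             # keep topic configuration in step
        new_records = src.records[len(mirror.records):]
        mirror.records.extend(new_records)           # appended in order, so offsets match
        copied += len(new_records)
    for (group, topic), offset in source.consumer_offsets.items():
        if topic in topic_names:
            destination.consumer_offsets[(group, topic)] = offset
    return copied


# Hypothetical edge cluster in a Wavelength Zone and its cloud-side destination.
edge = Cluster("wavelength-edge")
edge.topics["sensor-readings"] = Topic("sensor-readings", {"retention.ms": "86400000"})
cloud = Cluster("confluent-cloud")

edge.topics["sensor-readings"].records.extend(["r0", "r1", "r2"])
edge.consumer_offsets[("alerts-app", "sensor-readings")] = 2
first = sync_link(edge, cloud, ["sensor-readings"])   # copies the three existing records

edge.topics["sensor-readings"].records.extend(["r3", "r4"])
second = sync_link(edge, cloud, ["sensor-readings"])  # copies only the two new records
```

In the real feature the link runs continuously rather than on demand; the explicit sync call here only makes the incremental copy easy to see.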
Confluent provides Kafka Streams and ksqlDB to allow this processing to occur at the edge. Kafka Streams allows users to run event-stream applications built in Java or Scala alongside their events, while ksqlDB provides a SQL-like abstraction layer on top of streams. This provides a myriad of possibilities when coupled with the low latency of AWS Wavelength for building near-real-time applications at the edge.
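As a rough picture of the kind of processing such an application performs, the sketch below computes a per-device average over fixed, non-overlapping (tumbling) time windows. It is plain Python, not the Kafka Streams or ksqlDB API, and the device names, readings, and one-second window are made-up illustration values.

```python
from collections import defaultdict


def tumbling_window_avg(events, window_ms):
    """Average values per (device_id, window_start_ms).

    events is an iterable of (timestamp_ms, device_id, value) tuples.
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for timestamp_ms, device_id, value in events:
        # Align each event to the start of the window that contains it.
        window_start = timestamp_ms - (timestamp_ms % window_ms)
        acc = sums[(device_id, window_start)]
        acc[0] += value
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Hypothetical sensor events: (timestamp in ms, device, reading).
events = [
    (0, "pump-1", 10.0),
    (400, "pump-1", 20.0),
    (900, "pump-2", 5.0),
    (1200, "pump-1", 30.0),
]
averages = tumbling_window_avg(events, window_ms=1000)
# averages[("pump-1", 0)] is 15.0: the mean of the two readings in the first second.
```

A streaming engine maintains the same kind of aggregate incrementally and emits results as events arrive; the batch loop here is only the simplest way to show the windowing arithmetic.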
Edge Machine Learning with Confluent for Wavelength
Confluent for AWS Wavelength opens up possibilities for delivering machine learning (ML)-based Kafka Streams applications at the edge, including failure detection. This use case takes advantage of the low latency of the service and ensures that devices in a single factory, for example, receive timely notification of imminent failure.
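To illustrate the shape of such a failure-detection step, here is a minimal sketch that flags a sensor reading when it deviates sharply from recent history. A rolling z-score stands in for a trained ML model purely for illustration; the FailureDetector class, window size, threshold, and temperature readings are all hypothetical and not part of the solution.

```python
from collections import deque
from statistics import mean, pstdev


class FailureDetector:
    """Flag readings that sit far outside the recent rolling window."""

    def __init__(self, window=5, threshold=3.0):
        self.history = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, value):
        """Return True if value is more than threshold std devs from the window mean."""
        alert = False
        if len(self.history) == self.history.maxlen:
            mu = mean(self.history)
            sigma = pstdev(self.history)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                alert = True
        if not alert:
            # Only normal readings update the baseline, so one spike does not skew it.
            self.history.append(value)
        return alert


# Hypothetical temperature readings from one device, with a single spike.
readings = [70.0, 71.0, 69.0, 70.5, 70.0, 70.2, 95.0, 70.1]
detector = FailureDetector(window=5, threshold=3.0)
alerts = [detector.observe(reading) for reading in readings]
# Only the 95.0 reading is flagged.
```

Running a check like this next to the devices, rather than in the parent Region, is what lets the notification arrive within the latency budget the use case depends on.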
Figure 3 – Confluent for AWS Wavelength (multi-site).
Whether you are a traditional enterprise with an on-premises data footprint or a digital native, Confluent and AWS can meet you where you are on your cloud journey.
Enterprises can migrate data to AWS Wavelength with Confluent and power real-time analytics and apps on a unified data platform to improve customer experiences and backend operations.
Supporting everything from fraud detection and predictive maintenance to customer retention, Confluent for Wavelength accelerates your journey to the cloud with a complete data-in-motion platform powered by Apache Kafka.
Get started with your first Confluent deployment at the edge, or stream one of Confluent’s APN Immersion Days or community events to deep dive into the architecture and customer use cases. Learn more by reaching out to Confluent or visiting the Confluent partner page.
- Terraform module for Confluent on AWS Wavelength
- Confluent and Verizon at re:Invent 2021, including “Confluent at the edge on AWS” (ARC328-S)
- Confluent and Verizon joint APN Immersion Day with live agro-tech use case demo
Confluent – AWS Partner Spotlight
Confluent is an AWS Data and Analytics Competency Partner that was founded by the creators of Apache Kafka and enables organizations to harness business value from stream data.