AWS Public Sector Blog

Enabling resilient hybrid edge architectures with AWS

Organizations operating in austere environments require robust, resilient architectures that maintain operations regardless of connectivity status. This post explores how to implement and deploy resilient hybrid edge architectures using Amazon Web Services (AWS), with a focus on network resilience, data synchronization, security, and configuration management. We’ll cover best practices for deploying and managing hybrid architectures that enable continuous operations in austere conditions.

AWS hybrid architectures combine the scalability and innovation of the cloud with the reliability and low latency of local processing. Customers can use services such as AWS Systems Manager for unified operations management, AWS Outposts to extend AWS infrastructure to on-premises locations, and AWS Identity and Access Management (IAM) to support robust security. By implementing AWS hybrid edge best practices, organizations can build resilient architectures that maintain operations in challenging environments. Combining terrestrial and space-based communications gives them resilient connectivity across distributed environments. Resilient hybrid architectures enable organizations to achieve:

  • Consistent operations across cloud and on-premises environments
  • Seamless workload portability
  • Unified security and governance
  • Reliable local processing capabilities

Using the PACE approach for network resiliency

Network resilience is critical for maintaining operations in austere environments. Resiliency through redundancy creates a framework designed to keep edge workloads operating during extenuating circumstances. AWS recommends applying the Primary, Alternate, Contingency, Emergency (PACE) framework across multiple connectivity options to maintain continuous availability. The primary path uses AWS Direct Connect for dedicated, high-bandwidth connectivity with multiple connection paths and automated failover capabilities. The alternate path is created by linking two or more edge sites, using multiple internet service provider (ISP) connections for redundancy and dynamic routing protocols for automatic path selection. In the following diagram, each edge site is anchored to a different Availability Zone, which geographically separates the primary and alternate paths for each edge site. For contingency situations, the architecture integrates satellite networking to provide redundant connectivity to cloud resources. In emergency scenarios where all external connectivity is compromised, local processing continues with static stability on AWS Outposts for up to 7 days of disconnected operations, supported by local caching and data synchronization mechanisms.

The following figure shows an edge resilient architecture that uses AWS Outposts, AWS Direct Connect, and terrestrial and nonterrestrial network paths:

Figure 1: Edge resilient architecture
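The PACE ordering described above can be sketched as a simple selection rule: use the healthy path with the highest priority, and fall back to local-only processing when no path is available. This is a minimal illustration, not an AWS API. The `PaceTier`, `Path`, and `select_path` names, the path names, and the health flags are all hypothetical, and in practice path selection is handled by routing protocols and failover mechanisms rather than application code.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Iterable, Optional


class PaceTier(IntEnum):
    """PACE tiers in priority order (lower value = preferred)."""
    PRIMARY = 1      # for example, AWS Direct Connect
    ALTERNATE = 2    # for example, ISP connections through a linked edge site
    CONTINGENCY = 3  # for example, satellite networking


@dataclass(frozen=True)
class Path:
    name: str        # hypothetical label for the connectivity path
    tier: PaceTier
    healthy: bool    # assumed to come from some external health check


def select_path(paths: Iterable[Path]) -> Optional[Path]:
    """Return the healthy path with the most preferred tier.

    Returns None when no path is healthy, which corresponds to the
    emergency tier: local processing continues without connectivity.
    """
    healthy = [p for p in paths if p.healthy]
    return min(healthy, key=lambda p: p.tier, default=None)


# Hypothetical scenario: both terrestrial paths are down.
paths = [
    Path("direct-connect", PaceTier.PRIMARY, healthy=False),
    Path("isp-via-peer-site", PaceTier.ALTERNATE, healthy=False),
    Path("leo-satcom", PaceTier.CONTINGENCY, healthy=True),
]
active = select_path(paths)
print(active.name if active else "emergency: local processing only")
# prints "leo-satcom"
```

Modeling the emergency tier as the absence of a path, rather than as a fourth path, reflects the text's description: in that tier the edge site keeps working locally instead of switching to another link.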

Terrestrial and nonterrestrial network communications

A robust hybrid architecture uses both terrestrial and nonterrestrial communication paths to maintain continuous connectivity in austere environments. For terrestrial communications, AWS Direct Connect provides dedicated network connections from on-premises data centers to AWS, while AWS Transit Gateway streamlines network architecture by centrally managing connections between virtual private clouds (VPCs) and on-premises networks. For nonterrestrial communications, AWS Ground Station integrates satellite communications directly with AWS services, and low Earth orbit (LEO) satellite communications (SATCOM) provide global broadband coverage. Together, terrestrial and nonterrestrial solutions create a communication framework that maintains operational continuity regardless of ground-based infrastructure limitations or disruptions.

Data transport and synchronization options

Efficient data transport and synchronization are critical for maintaining consistency across hybrid environments, particularly in austere locations where connectivity might be intermittent or bandwidth is limited. AWS DataSync automates and accelerates data transfers between on-premises storage systems and AWS so that critical information flows reliably between edge locations and the cloud. AWS Storage Gateway provides on-premises applications with seamless access to cloud storage, creating a unified storage experience that bridges local and remote resources. Additionally, AWS Systems Manager enables the management of file transfers and system states across distributed environments, providing visibility and control over data synchronization operations throughout the hybrid architecture. These integrated services work together to keep data consistent, accessible, and properly replicated across the components of the hybrid infrastructure.
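The underlying pattern for intermittent connectivity is store-and-forward: products are buffered locally while the link is down and sent in order once it returns. The sketch below illustrates that idea only. It does not use or represent the AWS DataSync, AWS Storage Gateway, or AWS Systems Manager APIs, and the `StoreAndForward` class, the `upload` callable, and the item names are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, List


class StoreAndForward:
    """Buffer items locally and forward them in order when the link allows."""

    def __init__(self, upload: Callable[[str], bool]) -> None:
        # `upload` is a hypothetical callable that returns True on success.
        self._upload = upload
        self._pending: Deque[str] = deque()

    def submit(self, item: str) -> None:
        """Queue an item, then try to send everything that is pending."""
        self._pending.append(item)
        self.flush()

    def flush(self) -> int:
        """Send pending items in order; stop at the first failure."""
        sent = 0
        while self._pending:
            if not self._upload(self._pending[0]):
                break
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        return len(self._pending)


# Simulated link state and destination (stand-ins for a real transfer).
link = {"up": False}
uploaded: List[str] = []


def upload(item: str) -> bool:
    if not link["up"]:
        return False
    uploaded.append(item)
    return True


queue = StoreAndForward(upload)
queue.submit("product-001")   # link down: stays buffered
queue.submit("product-002")   # link down: stays buffered
buffered_while_down = queue.pending

link["up"] = True             # connectivity restored
sent_after_restore = queue.flush()
print(buffered_while_down, sent_after_restore, uploaded)
# prints "2 2 ['product-001', 'product-002']"
```

Stopping at the first failed upload keeps items in their original order, so the cloud copy converges on the edge copy after connectivity is restored.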

Demonstration and testing

AWS recently created a demonstration named Project Independence for organizations that need to process geospatial data in austere environments. This environment includes AWS Outposts and third-party servers operating inside an AWS Modular Data Center (AWS MDC). AWS MDC is a seamless, cost-effective service for defense and intelligence agencies to deploy infrastructure around the world to run low latency applications. The solution in the demonstration enabled the ingestion of satellite data through AWS Ground Station for processing and analysis, and durable cloud storage through Amazon Simple Storage Service (Amazon S3). PACE redundancy allowed AWS Outposts to process data using edge compute and store products in the cloud for further use. When the primary link was removed, the secondary link over SATCOM provided a path to the Region to continue operations.

Satellite data processing levels 0 through 2

Satellite data processing begins with Level 0, where raw sensor data is downlinked using AWS Ground Station and stored in Amazon S3 with communications artifacts removed. This foundational data progresses to Level 1 processing on AWS Outposts and/or third-party servers within the AWS MDC. At Level 1, the imagery is georeferenced and adjusted for known sources of error or interference, such as atmospheric distortion and sensor calibration issues. Lastly, Level 2 processing transforms this corrected data into specific data-rich products, such as sea surface temperature maps or calibrated visible light imagery, which are ready for analysis and decision-making. Throughout this pipeline, the resilient connectivity of the AWS MDC, enabled by AWS Direct Connect and SATCOM, maintains continuous processing during PACE scenarios.

Figure 2 illustrates the Project Independence demonstration architecture, which covers satellite data processing from ingestion through edge computing to visualization. The architecture consists of four primary components working in concert. AWS Ground Station receives National Oceanic and Atmospheric Administration (NOAA) satellite broadcasts and transfers the data to the AWS US East (N. Virginia) Region, where software-defined radio (SDR) components demodulate the satellite signal and store the source data durably in Amazon S3. The AWS MDC serves as the edge computing environment, housing multiple compute resources including an AWS Outposts Rack, AWS Outposts Servers, and third-party servers. These compute resources download the source data and apply government off-the-shelf (GOTS) algorithms from NOAA to produce weather imagery at the edge.

Figure 2: Demonstration architecture

The architecture is specifically designed to demonstrate resilient operations across multiple connectivity scenarios. Under normal conditions, data flows seamlessly from satellite to final product using terrestrial networks. However, the system maintains operational continuity during degraded conditions by using LEO satellite broadband as an alternate connectivity path. The AWS MDC continues local processing when disconnected from the control plane, showcasing the architecture’s ability to support mission-critical operations in austere or contested environments where reliable connectivity can’t be guaranteed.

Outpost service link connection status

The following graph monitors the connectivity status of two AWS Outposts service links throughout an observation period. During a network failover from AWS Direct Connect to SATCOM, one service link, which is represented by the orange line, experienced a brief disconnection. This is visible in the graph as a momentary drop to zero. This blip lasted only seconds before connectivity was restored and the service link returned to its normal operational state using SATCOM instead of AWS Direct Connect.

Figure 3: Outpost service link connection showing failover from terrestrial to nonterrestrial paths

Despite this brief interruption to the service link during the transport layer transition, the Outposts themselves remained fully operational and continued processing downlinked satellite data without interruption. This demonstrates an important architectural characteristic: The AWS Outposts local compute capabilities are resilient to transient service link disruptions, meaning mission-critical processing workloads maintain continuity during network failover events.
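A drop to zero like the one described above can be located programmatically in a series of status samples. The following is a minimal sketch under stated assumptions: the sample data is invented for illustration (it is not the data behind the graph), status is assumed to be 1 for connected and 0 for disconnected, and `find_outages` is a hypothetical helper rather than part of any AWS SDK.

```python
from typing import List, Tuple

Sample = Tuple[int, int]  # (timestamp in seconds, status: 1 = up, 0 = down)


def find_outages(samples: List[Sample]) -> List[Tuple[int, int]]:
    """Return (start, end) timestamps for each run of status 0.

    `start` is the first sample observed down; `end` is the first sample
    observed up again, or the last timestamp if the link never recovers.
    """
    outages: List[Tuple[int, int]] = []
    start = None
    for ts, status in samples:
        if status == 0 and start is None:
            start = ts                      # outage begins
        elif status == 1 and start is not None:
            outages.append((start, ts))     # outage ends
            start = None
    if start is not None:                   # still down at end of data
        outages.append((start, samples[-1][0]))
    return outages


# Invented samples at 10-second intervals with one brief drop.
samples = [(0, 1), (10, 1), (20, 0), (30, 1), (40, 1)]
outages = find_outages(samples)
print(outages)
# prints "[(20, 30)]"
```

The measured duration is bounded by the sampling interval, so a disconnection that lasts only seconds may appear as a single zero sample.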

Total workflow duration

The following graph shows the processing time for the satellite data workflow over a 3-hour period. Under normal operations with terrestrial fiber connectivity, the end-to-end workflow completes in 120 to 300 seconds. At around 16:00 UTC, when connectivity switched from terrestrial fiber to SATCOM, the total workflow duration rose to a range of 442 to 900 seconds because of the higher latency inherent in space-based networking paths. After terrestrial fiber connectivity was restored, performance returned to baseline levels.

Figure 4: Total workflow duration in seconds, showing the increase and decrease around the nonterrestrial failover

This demonstrates the resilience of the AWS MDC architecture: Despite the performance impact during the SATCOM failover period, mission-critical satellite data processing continued uninterrupted, maintaining operational continuity during adverse network conditions.

The system maintained operational capability throughout the connectivity transition, proving that AWS Direct Connect with SATCOM backup enables continuous processing during PACE scenarios, albeit with expected performance trade-offs. Using a more robust SATCOM option that provides higher throughput can decrease the total workflow time.
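To put the trade-off in proportion, the reported ranges can be compared directly. The short calculation below uses only the figures stated in this section (120 to 300 seconds on terrestrial fiber, 442 to 900 seconds on SATCOM). Comparing the range endpoints to each other is an assumption made for illustration, because the text does not say which runs correspond to which.

```python
# Workflow duration ranges reported in the text, in seconds: (min, max).
baseline = (120, 300)  # terrestrial fiber
satcom = (442, 900)    # during SATCOM failover


def slowdown(before: float, after: float) -> float:
    """Return how many times longer `after` is than `before`."""
    return after / before


low = slowdown(baseline[0], satcom[0])   # shortest runs compared
high = slowdown(baseline[1], satcom[1])  # longest runs compared

print(f"shortest runs: {low:.2f}x longer")
print(f"longest runs:  {high:.2f}x longer")
# prints "shortest runs: 3.68x longer" then "longest runs:  3.00x longer"
```

By this rough comparison, the workflow took about three to four times longer over SATCOM while still completing.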

Process Level 0

As shown in the following graph, the Level 0 processing time remained consistently stable at approximately 150 seconds throughout the observation period, demonstrating that compute-intensive operations are unaffected by network connectivity changes. Because Level 0 processing relies exclusively on local compute resources within the AWS MDC, such as AWS Outposts and third-party servers, the workflow maintained nominal performance regardless of whether data transport occurred over terrestrial fiber or SATCOM.

Figure 5: Level 0 processing time at the edge while transitioning between terrestrial and nonterrestrial network paths

This stability highlights a key architectural advantage: By collocating processing resources with the data, compute operations continue at full performance during network failover events so mission-critical data processing workloads are insulated from connectivity disruptions.

Conclusion

Global edge and austere deployments represent a strategic imperative for organizations operating in challenging environments where traditional cloud connectivity can’t be guaranteed. By implementing hybrid architectures that seamlessly integrate AWS Cloud services with on-premises infrastructure through solutions such as AWS Outposts, AWS Systems Manager, AWS Direct Connect, and satellite networking, organizations can achieve true operational resilience. The PACE approach to network resiliency provides a sophisticated framework for building architectures that adapt to varying connectivity conditions. This enables consistent operations across distributed environments, seamless workload portability, and reliable local processing capabilities while maintaining unified security postures and governance models.

Success in austere deployments requires careful planning around data synchronization strategies, automated failover mechanisms, and static stability on AWS Outposts that enables disconnected operations. As global operations increasingly extend to remote locations—whether supporting military operations, remote industrial sites, maritime deployments, or disaster recovery scenarios—the ability to deploy resilient hybrid architectures becomes a competitive differentiator. Organizations that embrace these architectural patterns position themselves to operate confidently and provide continuity under many conditions. They can maintain critical functions that continue independently at edge locations while facilitating eventual consistency with central cloud resources after connectivity is restored. Engage your AWS account team today to learn more about architecting for edge resiliency in austere environments.

Zack Colbert

Zack Colbert is a Technical Sales Representative specializing in edge and secure networking solutions for National Security customers. With a strong focus on providing comprehensive training on edge technologies, he empowers customers to effectively implement and manage secure solutions at the edge. Zack combines his technical expertise with a passion for customer success, ensuring that each client is equipped to meet their unique security challenges.

Mark Simonds

Mark Simonds is a Senior Solutions Architect at AWS specializing in national security solutions. He has experience working with large processing systems and satellite-based communications.

Wallace Cole

Wallace Cole is a Principal Solutions Architect at AWS specializing in edge computing and national security solutions. He leads technical responses for government cloud initiatives, develops training programs for secure environments, and provides architectural guidance for high-performance workload implementations. Wallace focuses on helping organizations leverage AWS capabilities for tactical edge computing, isolated operations, and mission-critical workloads in highly regulated environments.