Maximising application resiliency with AWS Global Accelerator
AWS services, including AWS Global Accelerator, are designed for operational resiliency and to avoid single points of failure. In Global Accelerator, that resiliency comes from the following components and operational practices:
- Global static anycast IP addresses
- Network zones
- Cell-based architecture
- Shuffle sharding
- Multi-Region Amazon Route 53 health checks
We’ll discuss each of these in detail in this blog post. First, we’ll explain how AWS defines resiliency and ways that you might deploy an AWS application to improve its resiliency. Next, we’ll provide an overview of Global Accelerator, and then we’ll dive into the specifics of how the service provides inherent resiliency. Resiliency is built into the architecture of Global Accelerator, along with functionality that can improve the resiliency posture of applications you deploy with it.
Note: This post covers the key architectural underpinnings of Global Accelerator that provide resiliency in the service and in applications deployed with it. Design considerations to ensure application endpoint resiliency are not in scope for this post.
Resiliency with AWS services
At AWS, resiliency is defined as the capability of a workload or service to recover when stressed by load (more requests for service), attacks (either accidental through a bug, or deliberate through intention), or failure of a workload component. Non-resilient architectures result in a poor customer experience because the application or service is unavailable more often. This might be caused by single points of failure in the architecture, or by end-user requests being served by overloaded or unhealthy application components.
When you use a service that’s designed with inherent resiliency, it contributes to the performance and high availability of your application. Availability and performance are both crucial, since they determine the end-user experience for a cloud-based application.
For applications that provide an internet-facing service to clients, these factors are affected by “internet weather”, that is, fluctuations in traffic quality over the public internet. The impact of internet weather generally grows with the geographical distance between users and applications. To limit this exposure, many customers deploy their applications in multiple AWS Regions so that they can serve their global customer base by routing client requests to geographically close application endpoints. However, this results in complex DNS architectures, especially if customers need to dynamically distribute traffic across application endpoints in different Regions.
This contributes to another challenge, because typically, failing over between geographically-distributed application endpoints requires manual intervention. One alternative is to use Amazon Route 53 health checks to automatically fail over. With Route 53, you can set up automatic multi-Region failover with health checks on application endpoints, and optimize DNS responses based on the availability and latencies that are reported by different application endpoints. However, using this technique, your application’s availability can be impacted by delays caused by DNS caching and static DNS entries.
AWS Global Accelerator is a networking service that helps improve the resiliency, availability, and performance of multi-Region applications. It can also help reduce your dependence on DNS for failover. In the next section, we’ll summarize Global Accelerator and introduce automatic failover for endpoints behind accelerators.
About AWS Global Accelerator
Global Accelerator is a networking service that uses the AWS global network infrastructure to send end-user traffic to workloads hosted in multiple AWS Regions. Global Accelerator provides two global static anycast IPv4 addresses (or two IPv4 addresses and two IPv6 addresses), announced from AWS Edge locations, that can be used to connect to applications hosted in multiple AWS Regions. The Edge locations near users act as entry points, and the traffic is carried over the AWS backbone network to a Regional application endpoint closest to the end user.
Global Accelerator supports endpoints such as Application Load Balancers (ALBs), Network Load Balancers (NLBs), Amazon Elastic Compute Cloud (Amazon EC2) instances, and Elastic IP addresses, hosted in one or more Regions. Global Accelerator’s automatic routing optimization routes traffic to healthy application endpoints, while the backbone provides low latency and minimizes jitter and packet loss for end-user traffic. Global Accelerator can improve end-user performance by up to 60%.
Global Accelerator also provides automatic failover for application endpoints distributed within a Region, or across multiple Regions, which improves the overall resiliency posture of applications deployed behind an accelerator. We discuss how Global Accelerator uses Route 53 health checkers later in this post, in the section Multi-Region Route 53 health checkers.
Global static anycast IP addresses
Global Accelerator improves the availability of your workload by providing two global static anycast IPv4 addresses (or two IPv4 addresses and two IPv6 addresses for dual-stack accelerators). These IP addresses provide a single point of entry for global traffic toward application endpoints running in one AWS Region or multiple Regions.
The physical infrastructure for Global Accelerator is hosted in AWS Edge locations, and these deployments are called Global Accelerator Points of Presence (POPs). An end-user request is routed to the nearest Edge location that advertises the accelerator’s static IP addresses, and then the request goes over the AWS global backbone to the nearest healthy application endpoint (see Figure 1).
Currently, AWS has 96 POPs in 46 countries and 84 cities, from which the static anycast IP addresses are advertised over the internet. You can see the current list of Global Accelerator POPs here.
Global Accelerator’s anycast routing provides resiliency against the failure of individual Global Accelerator POP locations. Because anycast IP addresses are advertised to the internet from multiple Edge locations, if the closest Global Accelerator POP fails, traffic is automatically forwarded to the next closest Global Accelerator POP location (see Figure 2).
Figure 1: Request routed to closest AWS Regional application endpoint
Figure 2: Failover to another Regional application endpoint
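The POP failover behavior described above can be illustrated with a small simulation. This is only a sketch: real anycast path selection is done by internet routing (BGP), not by a distance lookup, and the `Pop` class, the `distance_km` field, and the POP names are hypothetical stand-ins that we introduce for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pop:
    name: str
    distance_km: float  # hypothetical stand-in for network proximity to the client
    healthy: bool = True  # False models a POP that stopped advertising the IPs


def pick_pop(pops: list[Pop]) -> Optional[Pop]:
    """Return the closest POP that is still advertising, or None if there is none."""
    candidates = [p for p in pops if p.healthy]
    return min(candidates, key=lambda p: p.distance_km, default=None)


pops = [Pop("pop-a", 40.0), Pop("pop-b", 310.0), Pop("pop-c", 1200.0)]
print(pick_pop(pops).name)  # pop-a

# Simulate a failure of the closest POP: traffic moves to the next closest one.
pops[0] = Pop("pop-a", 40.0, healthy=False)
print(pick_pop(pops).name)  # pop-b
```

The client does nothing in either case. It keeps sending to the same static IP address, and the choice of POP changes underneath it.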
Network zones
At each Edge location, Global Accelerator uses two independent network zones. Network zones host a shared infrastructure and serve multiple customers (see Figure 3). Similar to an Availability Zone (AZ) in an AWS Region, a network zone is an isolated unit with its own set of physical infrastructure, and both network zones are deployed in every Global Accelerator POP. Each network zone is managed independently and serves IP addresses from a unique IP subnet. There are strict change control policies in place to make sure that any changes to the service are applied to only one network zone at a time, whether they’re software updates or engineer-driven changes.
Global Accelerator uses network zones to advertise IP addresses to provide fault tolerance and isolation. When you create an accelerator, you’re assigned two global anycast static IP addresses, and each IP address is advertised from a different network zone. Since network zones are isolated from each other, they provide protection against the failure of physical infrastructure in one network zone within a single Edge location (see Figure 4).
Figure 3: Network zone architecture
Figure 4: Network event affecting one zone
You can configure DNS records to point your application domain name to the anycast IP addresses, or you can use the DNS record associated with the accelerator that Global Accelerator provides. You can also associate health checks with these A records to track the availability of each IP address. By using both IP addresses within your application, you protect your application’s availability from individual hardware component failures inside a single Global Accelerator POP.
Global Accelerator is designed for 99.995% availability when you use both accelerator IP addresses to serve customer requests. AWS provides a service commitment of 99.99% of monthly uptime for Global Accelerator.
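One way to use both static IP addresses from a client is to try them in order and fall back when the first is unreachable. The sketch below assumes a TCP client that already knows both addresses. The function name `connect_with_fallback` and its `connect` parameter are our own, and the addresses in the usage note are documentation-range placeholders, not real accelerator IPs.

```python
import socket
from typing import Callable, Optional, Sequence


def connect_with_fallback(
    addresses: Sequence[str],
    port: int,
    timeout: float = 2.0,
    connect: Callable[..., socket.socket] = socket.create_connection,
) -> socket.socket:
    """Try each accelerator IP in order and return the first connection that succeeds."""
    last_error: Optional[Exception] = None
    for ip in addresses:
        try:
            # Each static IP is advertised from a different network zone, so a
            # failure on one address does not imply a failure on the other.
            return connect((ip, port), timeout=timeout)
        except OSError as exc:  # covers refused connections and timeouts
            last_error = exc
    raise ConnectionError(f"no address in {list(addresses)} was reachable") from last_error
```

A caller would use it as `connect_with_fallback(["192.0.2.10", "198.51.100.10"], 443)`. The `connect` parameter exists only so that the fallback logic can be exercised without a network, for example by passing a function that raises `OSError` for one of the addresses.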
Limiting the blast radius – Cell-Based Architecture and Shuffle Sharding
As you can see in Figures 3 and 4, network zones are shared by multiple customers. To limit the impact of one customer’s issues on other customers, Global Accelerator uses the additional safeguards of a cell-based architecture and shuffle sharding.
To create more protection for customers, each network zone is partitioned into four cells. Each cell is isolated from the others and has a similar configuration for serving customer traffic (see Figure 5). Each cell is supported by multiple physical hosts within a network zone and is managed independently through software automation.
In addition, each customer is associated with a pair of cells, with each cell in a separate network zone. As a result, if one cell is impacted, end-user traffic is still served from the customer’s cell in the other network zone.
A cellular architecture helps reduce the blast radius of an issue caused by a single customer. Instead of affecting every customer in a network zone, the issue affects only a specific cell in the zone, and therefore only the customers who share that cell with the first customer.
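To make the effect of cells on blast radius concrete, the following sketch places 1,000 hypothetical customers into the four cells of one network zone and counts how many share a cell with a single noisy customer. The hash-based placement in `cell_for` is an assumption made for this example; this post does not describe how Global Accelerator actually places customers in cells.

```python
import hashlib

CELLS_PER_ZONE = 4  # each network zone is partitioned into four cells


def cell_for(customer: str, zone: str) -> int:
    """Map a customer to a cell in `zone` with a stable hash (assumed placement scheme)."""
    digest = hashlib.sha256(f"{zone}:{customer}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % CELLS_PER_ZONE


def blast_radius(noisy: str, customers: list[str], zone: str) -> list[str]:
    """Return the customers in `zone` that share a cell with `noisy`, excluding `noisy`."""
    cell = cell_for(noisy, zone)
    return [c for c in customers if c != noisy and cell_for(c, zone) == cell]


customers = [f"customer-{i}" for i in range(1000)]
affected = blast_radius("customer-0", customers, zone="zone-1")
print(f"{len(affected)} of {len(customers) - 1} other customers share the cell")
```

With four cells and an even spread, roughly a quarter of the other customers in the zone share a cell with the noisy one, instead of all of them.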
Global Accelerator also uses a methodology called shuffle sharding to further reduce the impact of customer issues across network zones. Using shuffle sharding to allocate customers to cells minimizes the scenarios in which the same set of customers share cells in different network zones.
Sharding is a process of slicing up the infrastructure and then allocating it to different customers’ applications. With standard sharding, customers are divided into groups and associated with a specific shard. This limits the blast radius emanating from issues with one customer to only those customers that share a shard with that customer. Shuffle sharding improves on this by creating virtual shards, and then putting each customer in more than one shard. Then, each virtual shard is mapped to the underlying physical resources.
Shuffle sharding helps when one customer’s resources are overwhelmed, because it limits the impact on other customers in the network zone. With shuffle sharding, customer B and customer C don’t share the same cells in both network zones (see Figure 5). When there’s a scaling event with customer B, the cells used by customer B can be impacted in both network zones. Customer C, however, can avoid the impact by switching to the other IP address, advertised by the other network zone, where it doesn’t share a cell with customer B (see Figure 6).
Figure 5: Cells per zone
Figure 6: Limiting impact to a single network zone with shuffle sharding
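The benefit of shuffle sharding across two network zones can be estimated with a short simulation. The sketch assumes that each customer gets one cell per zone, chosen independently at random. That is a common way to implement shuffle sharding, but it is our assumption here, not a description of the service’s internal allocator. The names `assign_cells` and `impact_of` are hypothetical.

```python
import random

CELLS_PER_ZONE = 4
ZONES = ("zone-1", "zone-2")


def assign_cells(customers: list[str], seed: int = 0) -> dict[str, tuple[int, ...]]:
    """Give every customer one independently chosen cell in each network zone."""
    rng = random.Random(seed)  # seeded so that the example is repeatable
    return {c: tuple(rng.randrange(CELLS_PER_ZONE) for _ in ZONES) for c in customers}


def impact_of(noisy: str, assignment: dict[str, tuple[int, ...]]) -> dict[str, int]:
    """Count other customers by the number of zones in which they share a cell with `noisy`."""
    noisy_cells = assignment[noisy]
    summary = {"unaffected": 0, "one_zone": 0, "both_zones": 0}
    labels = ("unaffected", "one_zone", "both_zones")
    for customer, cells in assignment.items():
        if customer == noisy:
            continue
        shared = sum(a == b for a, b in zip(cells, noisy_cells))
        summary[labels[shared]] += 1
    return summary


customers = [f"customer-{i}" for i in range(1000)]
print(impact_of("customer-0", assign_cells(customers)))
```

Customers counted under `one_zone` are in the position of customer C: they can switch to the IP address advertised by the other network zone. Only customers under `both_zones` have no unaffected cell, and with four cells per zone and independent placement that is about 1 in 16 of them on average.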
Multi-Region Route 53 health checkers
Global Accelerator automatically checks the health of application endpoints by using Amazon Route 53 health checks. Route 53 operates its health checkers in multiple AWS Regions worldwide. By continuously monitoring the health of application endpoints, Global Accelerator can quickly detect an unhealthy endpoint and route traffic to healthy endpoints.
Each Route 53 health checker independently monitors the health of the application endpoints associated with an accelerator. In addition, each health checker operates independently across Regions and uses independent infrastructure across Availability Zones. Each health checker is also managed individually, using software automation. These precautions make sure that events that impact the health checkers in one Region don’t affect those in other Regions, which can continue to check endpoints.
In the unlikely event that all Route 53 health checkers globally are impacted, Global Accelerator continues to serve traffic to application endpoints, using the last known endpoint health.
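The fallback to the last known endpoint health can be sketched as follows. The majority vote used to combine checker reports is an assumption made for this example, because this post does not describe how the service aggregates reports. Only the final fallback step is taken from the text above. The function name and Region labels are hypothetical.

```python
from typing import Optional


def endpoint_health(reports: dict[str, Optional[bool]], last_known: bool) -> bool:
    """Combine per-Region health checker reports into one verdict for an endpoint.

    A report of None means that the checker itself is unavailable.
    """
    votes = [r for r in reports.values() if r is not None]
    if not votes:
        # Every checker is impaired: keep serving based on the last known health.
        return last_known
    # Assumed rule: healthy only if a strict majority of working checkers agree.
    return sum(votes) * 2 > len(votes)


# One checker is impaired, and the other two still agree that the endpoint is healthy.
print(endpoint_health({"region-a": True, "region-b": True, "region-c": None}, last_known=False))  # True

# All checkers are impaired, so the last known state is used.
print(endpoint_health({"region-a": None, "region-b": None, "region-c": None}, last_known=True))  # True
```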
Conclusion
In this post, we explained how AWS Global Accelerator uses static anycast IP addresses, network zones, and other AWS system design constructs, like cell-based architecture and shuffle sharding, to deliver a highly resilient global service. You can use the static IP addresses provided by Global Accelerator in your application to maximize the availability of your application endpoints to your end users over the internet. In addition, Global Accelerator continuously monitors endpoint health to provide automatic failover for your application endpoints.
To learn more, check out AWS Global Accelerator docs and other AWS Global Accelerator blog posts.