Well-Architecting online applications with CloudFront and AWS Global Accelerator
Worldwide, millions of customers are actively using AWS to build applications for every imaginable use case, with a variety of Regions in which they can deploy infrastructure. An AWS Region is a physical location where AWS clusters data centers and operates regional services, like Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3). In the specific case of online applications, user traffic may traverse multiple public networks to reach an AWS customer’s regional infrastructure. Customers who want to address the performance and reliability drawbacks of traversing uncontrolled networks should consider adding AWS edge services to their architectures.
AWS edge services like Amazon CloudFront and AWS Global Accelerator operate across hundreds of globally distributed Points of Presence (PoPs) outside of AWS Regions. Users are served from these PoPs within 20 to 30 milliseconds on average, and, when needed, their traffic is carried back to customers’ regional infrastructure over the AWS global network instead of going over the public internet. The AWS Global Infrastructure is a purpose-built, highly available, and low-latency private infrastructure built on a global, fully redundant, metro fiber network that is linked via terrestrial and trans-oceanic cables across the world.
In addition to performance and reliability, AWS edge services help customers enhance their resiliency against infrastructure Distributed Denial of Service (DDoS) attacks. AWS edge services customers benefit from a larger and more distributed DDoS mitigation system, with a mitigation capacity of multiple hundreds of Tbps across PoPs. AWS edge services also employ advanced DDoS mitigation techniques such as SYN Proxy, which protects against SYN floods by sending SYN cookies to challenge new connections before they are allowed to continue upstream.
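To make the SYN cookie idea concrete, here is a minimal sketch of how a proxy could derive its initial sequence number from the connection details instead of storing state for every SYN. The secret, the time-slot length, and the function names are all assumptions for illustration; this is not how AWS implements SYN Proxy, and real SYN cookies also encode TCP options such as the MSS.

```python
import hashlib
import hmac
import time

SECRET = b"hypothetical-edge-secret"  # assumption: a per-server secret key
SLOT_SECONDS = 64                     # assumption: cookie validity granularity


def syn_cookie(src_ip, src_port, dst_ip, dst_port, client_isn, now=None):
    """Derive a 32-bit initial sequence number from the connection 4-tuple."""
    slot = int((time.time() if now is None else now) // SLOT_SECONDS)
    msg = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}|{client_isn}|{slot}".encode()
    digest = hmac.new(SECRET, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def ack_is_valid(src_ip, src_port, dst_ip, dst_port, client_isn, ack_number, now=None):
    """Accept the final ACK only if it echoes cookie + 1 for the current or previous slot."""
    now = time.time() if now is None else now
    for age in (0, SLOT_SECONDS):
        cookie = syn_cookie(src_ip, src_port, dst_ip, dst_port, client_isn, now - age)
        if ack_number == (cookie + 1) % 2**32:
            return True
    return False
```

Because the cookie can be recomputed from the returning ACK, a spoofed SYN that never completes the handshake costs the proxy no memory, and only validated connections are forwarded upstream.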
In this blog post, I explain how online applications can be well-architected using CloudFront and Global Accelerator.
CloudFront, a foundational component for web applications
Amazon CloudFront is Amazon’s Content Delivery Network (CDN). To use this service, customers create a CloudFront distribution, configure their origin (any origin that has a publicly accessible domain name), attach a valid TLS certificate using AWS Certificate Manager, and then configure their authoritative DNS server to point their web application’s domain name to the distribution’s generated domain name. When users navigate to the web application, the DNS resolution phase dynamically routes their HTTP(S) requests to the best CloudFront PoP in terms of latency and availability. Once the PoP is selected, the user’s TCP connection, including the TLS handshake, is terminated on one of the PoP’s servers, and the user then sends the HTTP request. If the content is cached in one of the cache layers of CloudFront, the request is fulfilled locally by CloudFront. Otherwise, the request is forwarded to the origin. To learn more about these steps, watch A Few Milliseconds in the Life of an HTTP Request.
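The last two steps of that flow (serve from cache when possible, otherwise forward to the origin) can be illustrated with a toy model. The class below is a simplified sketch, not CloudFront's implementation: the `EdgeCache` name, the single cache layer, the fixed TTL, and the `fetch_from_origin` callable are all assumptions.

```python
import time


class EdgeCache:
    """Toy model of a PoP cache: serve locally on a hit, fetch from the origin on a miss."""

    def __init__(self, fetch_from_origin, ttl_seconds=60, clock=time.monotonic):
        self._fetch = fetch_from_origin   # hypothetical callable: path -> body
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}                  # path -> (expires_at, body)

    def get(self, path):
        entry = self._store.get(path)
        if entry and entry[0] > self._clock():
            return entry[1], "Hit"        # fulfilled locally, no origin round trip
        body = self._fetch(path)          # cache miss or expired entry: go to the origin
        self._store[path] = (self._clock() + self._ttl, body)
        return body, "Miss"
```

Only the first request for an object, and the first one after its TTL expires, reaches the origin; every other request is answered from the PoP.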
In general, users interact with web applications using HTTP(S), which makes CloudFront the recommended component in the architecture. Typically, customers use CloudFront in use cases such as:
- Full website delivery. Olx uses CloudFront to deliver their e-commerce brands across their worldwide markets.
- API protection and acceleration. Slack halved their API response times globally by leveraging CloudFront as a reverse proxy.
- Adaptive video streaming (VoD/Live). M6 uses CloudFront to improve video playback quality in terms of startup time, bitrate, and buffering rates.
- Software download. Nordcurrent uses CloudFront caching to offload their infrastructure, and gamer wait times have been reduced by up to 90%.
In the following paragraphs, you will learn more about how CloudFront improves the security, reliability, and performance of web applications. CloudFront also allows you to optimize your egress traffic costs. In fact, the pricing model for CloudFront is primarily based on the number of HTTP(S) requests and bytes served from PoPs. When the origin is hosted on AWS, those charges replace Data Transfer Out charges that would otherwise apply to the origin.
CloudFront processes incoming HTTP(S) requests and only forwards well-formed ones, which protects customers against certain attacks at both the session layer (e.g., TLS abuse, Slowloris) and the application layer (e.g., malformed HTTP(S) requests).
On their side of the shared responsibility model, customers can implement controls in CloudFront to enhance their security posture. For example, you can configure access control using signed URLs, or associate a web ACL from AWS Web Application Firewall (AWS WAF). When used with CloudFront, AWS WAF rules are enforced in CloudFront PoPs, which allows HTTP(S) inspection at the scale of millions of requests per second. You can find in-depth information about CloudFront’s security controls in this whitepaper.
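As an illustration of the signed URL control, the sketch below assembles a URL with a canned policy, following the format as I understand it from the CloudFront documentation. The RSA signing step is deliberately left to a caller-supplied `rsa_sign` function (hypothetical), since it requires the private key and a cryptography library; the helper names are also assumptions.

```python
import base64
import json
from urllib.parse import urlencode


def canned_policy(resource_url, expires_epoch):
    """Build the compact JSON canned policy for a single URL and expiry time."""
    policy = {"Statement": [{"Resource": resource_url,
                             "Condition": {"DateLessThan": {"AWS:EpochTime": expires_epoch}}}]}
    return json.dumps(policy, separators=(",", ":"))  # the policy must contain no whitespace


def cloudfront_b64(data):
    """Base64-encode, then swap the characters that are not URL-safe (+ = /)."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded.replace("+", "-").replace("=", "_").replace("/", "~")


def signed_url(resource_url, expires_epoch, key_pair_id, rsa_sign):
    """rsa_sign is a caller-supplied (hypothetical) function: bytes -> signature bytes."""
    signature = rsa_sign(canned_policy(resource_url, expires_epoch).encode("utf-8"))
    query = urlencode({"Expires": expires_epoch,
                       "Signature": cloudfront_b64(signature),
                       "Key-Pair-Id": key_pair_id})
    separator = "&" if "?" in resource_url else "?"
    return f"{resource_url}{separator}{query}"
```

CloudFront verifies the signature at the PoP, so expired or tampered URLs are rejected before the request can reach the origin.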
CloudFront measures the internet in real time to route requests to the best available PoP and avoid potential issues, such as internet congestion. You can also leverage CloudFront to increase the resiliency of your web applications in the event of an origin failure. For example, customers can configure caching, origin failover, and custom error pages for graceful failover. More sophisticated origin failover can be achieved in combination with Amazon Route 53 or AWS Lambda@Edge.
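The origin failover behavior mentioned above can be sketched as follows. This is a simplified model under stated assumptions: the two origins are plain callables, and the set of status codes that trigger failover is an example choice rather than a CloudFront default.

```python
FAILOVER_STATUS_CODES = {500, 502, 503, 504}  # assumption: codes chosen to trigger failover


def fetch_with_failover(path, primary, secondary, failover_codes=FAILOVER_STATUS_CODES):
    """Try the primary origin; retry on the secondary if it errors or cannot be reached.

    primary and secondary are hypothetical callables: path -> (status_code, body).
    """
    try:
        status, body = primary(path)
        if status not in failover_codes:
            return status, body, "primary"
    except (ConnectionError, TimeoutError):
        pass  # treat connection problems like a failover status code
    status, body = secondary(path)
    return status, body, "secondary"
```

A response such as 404 is returned from the primary as-is in this sketch, because it is not in the failover set; whether to fail over on such codes is a per-application decision.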
Since CloudFront processes HTTP(S) requests in distributed PoPs, the performance of web applications is enhanced thanks to the following:
- Serving content over modern internet protocols such as HTTP/2 or TLS 1.3, even if the origin doesn’t support them. For example, HTTP/2 multiplexes requests over the same TCP connection, which addresses the Head of Line Blocking issue of HTTP/1.1 and, as a consequence, helps web applications load faster.
- Serving cacheable content locally from CloudFront PoPs. By avoiding a round trip to the origin and the origin response time, requests can shave off tens to hundreds of milliseconds in latency.
- Persisting connections to the origin. Sometimes, the request must be forwarded to the origin, such as when the content is not present in local cache or when it is purely dynamic, such as APIs. Requests forwarded over persistent connections from PoPs do not need to establish a new TCP/TLS connection to the origin, which removes the latency of multiple round trips.
- Moving application logic to CloudFront. Some application logic like authorization, routing, or redirections can be moved to CloudFront and executed with lower latency thanks to the distributed nature of PoPs. CloudFront natively provides features like compression, signed URLs, and path-based routing. For more advanced logic, customers can use AWS Lambda@Edge or CloudFront Functions.
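The benefit of persistent connections to the origin can be estimated with simple round-trip arithmetic. The sketch below assumes one round trip for the TCP handshake, one for a full TLS 1.3 handshake, and one for the request itself, and ignores server processing time; the latency figures are hypothetical.

```python
def first_byte_latency_ms(rtt_ms, new_connection, tls_round_trips=1):
    """Round trips before the first response byte: optional TCP + TLS setup, then the request.

    Assumes 1 RTT for the TCP handshake and tls_round_trips for TLS
    (1 for a full TLS 1.3 handshake, 2 for TLS 1.2); server time is ignored.
    """
    setup_round_trips = (1 + tls_round_trips) if new_connection else 0
    return (setup_round_trips + 1) * rtt_ms


# Hypothetical numbers: user <-> PoP is 20 ms, PoP <-> origin is 100 ms, user <-> origin is 120 ms.
direct = first_byte_latency_ms(120, new_connection=True)
via_pop = (first_byte_latency_ms(20, new_connection=True)
           + first_byte_latency_ms(100, new_connection=False))
print(direct, via_pop)  # 360 160
```

In this model, the handshakes are paid over the short user-to-PoP path, and the long PoP-to-origin path costs a single round trip because the connection is already open.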
Global Accelerator, acceleration at the network level
AWS Global Accelerator is a networking service that improves the performance, reliability, and security of your online applications using the AWS Global Infrastructure. AWS Global Accelerator can be deployed in front of your Network Load Balancers, Application Load Balancers, Amazon EC2 instances, and Elastic IPs, any of which could serve as Regional endpoints for your application.
To use this service, you create an accelerator, which provides two global static anycast IPv4 addresses that act as a fixed entry point to your application. With Global Accelerator, you can have multiple application endpoints in one or more AWS Regions, all reachable through the same anycast IP addresses. You then configure your authoritative DNS server to point your web application’s domain name to the accelerator’s dedicated static IPs. These anycast IPs are announced from all Global Accelerator PoPs so that user traffic enters at the nearest PoP and is then forwarded to the Regional endpoint over the AWS global network. You can either use the IP addresses that are allocated by default or bring your own IP address range (BYOIP) for the accelerators in your account. To learn more about Global Accelerator, I recommend watching How AWS Global Accelerator improves performance, a talk from re:Invent 2020.
Since AWS Global Accelerator operates at layer 4 of the OSI model, it can be used with any TCP/UDP application. Pricing consists of an hourly fee per accelerator plus a Data Transfer-Premium fee, which is charged on top of standard Data Transfer Out charges. Example use cases include:
- UDP/TCP-based multiplayer gaming. JoyCity saw network timeouts drop by a factor of 8 in some countries thanks to Global Accelerator.
- Voice and Video over IP. CrazyCall uses Global Accelerator to ensure that their customers get the best quality of service.
- IoT. BBPOS improved latency by an average of 25% using Global Accelerator’s fixed IPs to ingest data from mobile point-of-sale devices.
- Video ingest and FTP uploads. FlowPlayer uses AWS Global Accelerator to improve the performance and availability of video ingest for their users around the world.
- Other use cases include VPN, Git, and AdTech bidding. Customers can also consider Global Accelerator for HTTP workloads such as non-cacheable APIs in specific scenarios (described later in this post).
In the following paragraphs, you will learn more about how Global Accelerator improves the security, reliability, and performance of your user-facing applications.
In addition to increasing the resiliency of applications against layer 3 and layer 4 DDoS attacks, Global Accelerator only accepts traffic on configured listeners, which reduces the attack surface at the edge. Moreover, Global Accelerator doesn’t require exposing applications built on Application Load Balancers and EC2 instances directly to the internet. Customers can keep these endpoints in a private subnet in their VPC, which shields the origin from direct attacks.
Global Accelerator has a fault-isolating design that increases the availability of online applications. An accelerator’s two IP addresses are serviced by independent network zones. Similar to Availability Zones, these network zones are isolated units in an edge PoP with their own physical infrastructure and serve static IP addresses from a unique IP subnet. If one static IP address becomes unavailable due to IP address blocking or unreachable networks, applications can fall back to a healthy static IP address from the other isolated network zone.
Global Accelerator also provides a traffic management feature that propagates configuration changes to all edge PoPs within seconds. Health checks are run from each edge PoP to every Availability Zone and Region to monitor the health status of endpoints, so any change in endpoint health is detected within seconds. This allows customers to implement highly available, multi-Region architectures with fast failover capability without any dependency on DNS.
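The combination of health checks and routing can be pictured as each PoP choosing the closest Region that still has a healthy endpoint. The function below is a minimal sketch of that idea only; the data shapes and names are hypothetical, and the real service also takes settings such as traffic dials and endpoint weights into account.

```python
def pick_endpoint(endpoint_groups, latency_ms, healthy):
    """Return the healthy endpoint in the lowest-latency Region, or None if all are down.

    endpoint_groups: Region name -> list of endpoint IDs (all hypothetical)
    latency_ms:      Region name -> measured latency from this PoP
    healthy:         set of endpoint IDs that passed their latest health check
    """
    for region in sorted(endpoint_groups, key=lambda r: latency_ms[r]):
        candidates = [e for e in endpoint_groups[region] if e in healthy]
        if candidates:
            return region, candidates[0]
    return None
```

Because the decision is made at the PoP on every new connection, failing over to another Region does not depend on clients refreshing a DNS record.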
Finally, Global Accelerator’s static IP addresses provide benefits to your application’s availability in three main scenarios:
- When your end-users have devices or browsers that are not DNS aware. In this scenario, Global Accelerator allows you to modify and replace application endpoints, or even move endpoints to a different Region at will, without needing to make any client-facing changes.
- When your end-users have devices or browsers that do not honor DNS TTLs. With Global Accelerator’s static IP addresses, your application’s availability is not affected by devices and resolvers that cache DNS records for longer than you intend.
- When your end-users need to allow-list a small set of fixed IP addresses. This is common in enterprise deployments where there are strict firewalling policies. It also appears in IoT deployments. Consequently, availability risks due to misconfigured IP addresses on devices or firewalls are reduced.
Global Accelerator routes user traffic to the nearest PoP using BGP Anycast. From there, Global Accelerator carries your user traffic to your Regional endpoints over the Amazon backbone. Global Accelerator further enhances performance thanks to the following techniques:
- Jumbo frame support. By enabling jumbo frames between the AWS edge location and the application endpoint in the AWS Region, Global Accelerator is able to send and receive up to 6X more data (payload) in each packet. Jumbo frame support cuts down the total time required to transmit data between users and your application.
- TCP termination at the edge. Global Accelerator reduces initial TCP setup time by establishing a TCP connection between the client and the AWS PoP closest to the client. Almost concurrently, a second TCP connection is made between the PoP and the application endpoint in the AWS Region.
- Large receive side window, TCP buffers and congestion window. For TCP terminated traffic, Global Accelerator is able to receive and buffer larger amounts of data from your application in a shorter time period by tuning receive side window and TCP buffer settings on the AWS edge infrastructure. This provides faster downloads to your clients, who are now fetching data in a shorter time directly from the AWS edge. By transmitting data over the AWS global network, Global Accelerator can scale up the TCP congestion window to send larger amounts of data than usually possible via the public internet.
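Two of these techniques lend themselves to back-of-the-envelope arithmetic. The sketch below assumes a standard MTU of 1500 bytes, a jumbo MTU of 9001 bytes, and 40 bytes of IPv4 and TCP headers per packet; the window sizes and the 100 ms round-trip time are hypothetical values chosen for illustration.

```python
HEADER_BYTES = 40  # assumption: 20-byte IPv4 header + 20-byte TCP header, no options


def payload_per_packet(mtu_bytes):
    """Bytes of application data that fit in one packet of the given MTU."""
    return mtu_bytes - HEADER_BYTES


def window_limited_throughput_mbps(window_bytes, rtt_ms):
    """Upper bound for one TCP connection: at most one window per round trip."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1_000_000


standard, jumbo = payload_per_packet(1500), payload_per_packet(9001)
print(round(jumbo / standard, 2))                                # 6.14
print(round(window_limited_throughput_mbps(65_535, 100), 2))     # 5.24
print(round(window_limited_throughput_mbps(1_048_576, 100), 2))  # 83.89
```

The first figure matches the "up to 6X" payload claim for jumbo frames. The other two show why larger windows matter on long paths: with a 100 ms round trip, a 64 KB window caps a connection at roughly 5 Mbps, while a 1 MB window raises that ceiling to roughly 84 Mbps.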
Considerations for web applications
Customers use Amazon CloudFront for most HTTP(S)-based web applications. Customers should consider AWS Global Accelerator for HTTP(S) workloads in the following common scenarios:
- Static IPs, including BYOIP. Customers may want to expose their APIs through a limited number of static IPs to their partners or to their devices with hard coded IPs.
- Turnkey Global Traffic Management. Customers looking for an off-the-shelf solution to implement a multi-Region architecture for their APIs can use Global Accelerator instead of building this solution on CloudFront with Amazon Route 53 or Lambda@Edge.
- Accelerating tens of thousands of domain names. CloudFront and AWS Certificate Manager have quotas on the number of domains that can be configured (excluding wildcard setups such as *.example.com). In this scenario, customers such as SaaS providers that expose tens of thousands of APIs on custom domain names can use Global Accelerator with an Amazon EC2 fleet behind a Network Load Balancer to handle the very large number of TLS certificates.
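These rules of thumb can be summarized as a small decision helper. The function below is only a sketch that encodes the guidance in this post; its name and parameters are assumptions, and a real architecture decision should weigh more factors, such as caching needs and cost.

```python
def suggest_edge_service(protocol, needs_static_ips=False,
                         needs_turnkey_multi_region=False,
                         custom_domains_exceed_quotas=False):
    """Encode the rules of thumb above; a starting point, not a substitute for design review."""
    if protocol.lower() not in ("http", "https"):
        return "Global Accelerator"   # any other TCP/UDP workload
    if needs_static_ips or needs_turnkey_multi_region or custom_domains_exceed_quotas:
        return "Global Accelerator"   # HTTP(S) workload matching one of the scenarios above
    return "CloudFront"               # default for HTTP(S)-based web applications
```

For example, `suggest_edge_service("https")` returns "CloudFront", while `suggest_edge_service("https", needs_static_ips=True)` and `suggest_edge_service("udp")` both return "Global Accelerator".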
In this blog post, I explained how you can use CloudFront and Global Accelerator to improve your online application across three pillars of the AWS Well-Architected Framework: Security, Reliability, and Performance. AWS applies the same considerations when designing its own services. For example, the Edge Optimized Endpoint of Amazon API Gateway is integrated with CloudFront to accelerate dynamic APIs, and AWS Amplify Hosting uses CloudFront behind the scenes to cache static and server-side rendered web applications. The same approach is seen in AWS Site-to-Site VPN, which provides an acceleration option through Global Accelerator to improve the performance of VPN tunnels. Another example is Amazon S3 Multi-Region Access Points, an Amazon S3 feature powered by Global Accelerator that allows you to define Amazon S3 endpoints that span buckets in multiple AWS Regions.
Finally, CloudFront and Global Accelerator are illustrations of the breadth of services offered by AWS. We are committed to providing a range of options to our customers instead of a one-size-fits-all solution. Please consider well-architecting your online applications using these AWS edge services to maximize performance, security, and availability of your applications.