Accelerate and protect your websites using Amazon CloudFront and AWS WAF

Internet users increasingly expect responsive web applications and APIs with lower latency and higher availability. Additionally, publicly accessible web applications and APIs are exposed to threats such as commonly occurring vulnerabilities described in the OWASP Top 10, SQL injection, automated requests, and HTTP floods (Denial of Service (DoS)) that can affect availability, compromise security, or consume excessive resources.

Developers who want to keep their web applications performant, resilient, and secure add Amazon CloudFront’s global edge network and AWS WAF to their hosting infrastructure. Together, these services help protect web applications against attacks and against unpredictable traffic spikes that impact performance and availability. In this post, you learn the basic concepts of configuring CloudFront and AWS WAF so that you can add them to your web application technology stack.

Why you should use CloudFront together with AWS WAF

CloudFront is a reverse proxy that serves as a single point of entry for your application globally. Viewers anywhere in the world connect to one of CloudFront’s hundreds of edge locations nearest to them: CloudFront either responds directly from its cache or forwards the request to your application. Because traffic is spread across CloudFront’s edge locations, you can leverage CloudFront’s network architecture to deliver low latency applications that can withstand large bursts of traffic or sustained traffic as compared to web applications served from single locations. This makes CloudFront the ideal place to protect against DDoS and other application attacks.

AWS WAF, a web application firewall, analyzes incoming requests and blocks the aforementioned threats before they reach your servers. In addition to protecting against common web exploits and volumetric attacks, AWS WAF can protect against more sophisticated attacks, such as account creation fraud, unauthorized access to user accounts, and bots that attempt to evade detection, as shown in the following figure.

Figure 1: Architecture of a web application with Amazon CloudFront and AWS WAF

Adding CloudFront and AWS WAF to your application technology stack has the following benefits:

Content acceleration: With caching, compression, and modern internet protocols like HTTP/3 and TLS 1.3. CloudFront accelerates both static and dynamic applications by terminating TLS connections close to viewers at distributed edge locations, maintaining persistent connections to your origin to avoid costly round-trip connection establishment, and carrying traffic to your origin over the AWS private global network rather than the public internet.
High availability: With origin failovers, connection retries, and multi-Region architectures.
Security controls: With TLS policy enforcements, HTTP protocol validation, authorization using tokens, and geo-blocking. DDoS protection at Layer 3 and Layer 4 is included through AWS Shield Standard, and you can configure AWS WAF to block malicious Layer 7 requests before they reach your web servers.

Understand the building blocks of CloudFront

A distribution is the most basic CloudFront construct. It serves as the container for your application configuration, which tells CloudFront how to serve your application and where to route requests. You can access your application through a unique cloudfront.net URL assigned to your distribution, or by attaching one or more custom domains.

Every distribution is composed of origins, cache behaviors, and settings. An origin is where you host the web application that CloudFront sits in front of: this could be an Amazon Simple Storage Service (Amazon S3) bucket, Amazon API Gateway endpoint, Application Load Balancer (ALB), or any publicly accessible URL. Cache behaviors – or routes – are URI path patterns that instruct CloudFront on how to process requests, if and how to cache your content, and the origin to which matching requests should be routed. Global settings cover the TLS certificate, domain names, viewer-facing protocols (HTTP/2, HTTP/3, TLS 1.3), and AWS WAF association.

When CloudFront receives a request, CloudFront attempts to match the URI path to the correct cache behavior based on the path pattern and priority you defined. Every CloudFront distribution includes a default cache behavior that matches when no other cache behaviors match the request. For example, you can configure a cache behavior for the /api/* path that routes to API Gateway and disables caching, while your default cache behavior routes to your static website in an S3 bucket where all content is cached.
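The matching logic described above can be sketched in a few lines of Python. This is only an illustration, not CloudFront's implementation: the origin names are hypothetical, and `fnmatchcase` from the standard library is used as an approximation of path pattern matching (it also accepts `[...]` character sets, which this sketch does not rely on).

```python
from fnmatch import fnmatchcase

# Hypothetical ordered cache behaviors: (path pattern, origin name, caching enabled).
# Earlier entries have higher priority.
CACHE_BEHAVIORS = [
    ("/api/*", "api-gateway-origin", False),
]

# The default cache behavior matches when nothing else does.
DEFAULT_BEHAVIOR = ("*", "s3-static-site-origin", True)


def match_behavior(uri_path, behaviors=CACHE_BEHAVIORS, default=DEFAULT_BEHAVIOR):
    """Return the first behavior whose pattern matches the path, else the default."""
    for behavior in behaviors:
        pattern = behavior[0]
        # Case-sensitive wildcard match of the URI path against the pattern.
        if fnmatchcase(uri_path, pattern):
            return behavior
    return default


print(match_behavior("/api/orders"))   # routed to the API origin, caching disabled
print(match_behavior("/index.html"))   # falls through to the default behavior
```

Because the first matching pattern wins in this sketch, the order of the list matters: more specific patterns belong ahead of broader ones.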

Policies – If you need more granular control over how CloudFront caches your content or what request data CloudFront forwards to your origin, then CloudFront has the concept of policies: reusable configuration templates that can be applied to one or more distributions at the cache behavior level. A Cache Policy defines cache settings, including what components of the HTTP request to include in your cache key. An Origin Request Policy defines additional request data to forward to your origin beyond the HTTP request data included in the cache key. A Response Header Policy defines HTTP headers to append to the response, and/or HTTP headers to remove from the response. For each type of policy, you can create a custom policy specific to your application, or use a managed policy provided by CloudFront. To learn more about CloudFront Policies, read this AWS documentation.

Edge functions – If you need to customize the processing of HTTP requests and responses beyond what the previously mentioned building blocks allow, then you can associate edge functions with the cache behaviors that need custom logic, such as HTTP redirections, URL rewriting, advanced cache key normalization, and customized authorization. Edge functions are code that you write and that CloudFront executes close to viewers to manipulate HTTP requests and responses with low latency, as shown in the following figure. To learn more about Edge functions, read this AWS documentation.

Figure 2: Relationships between CloudFront configuration constructs. The numbers on the arrows represent the multiplicity expressed in UML. For example, a cache behavior is associated with only one origin, while an origin can be associated with multiple cache behaviors.
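To make the edge function idea more concrete, here is a minimal sketch of an HTTP redirection, assuming a Lambda@Edge function written in Python and attached to the viewer request event of a cache behavior. The `/old-page` and `/new-page` paths are hypothetical, and a real function would likely need more robust handling.

```python
def handler(event, context):
    """Viewer-request handler: redirect one legacy path, pass everything else through."""
    # Lambda@Edge delivers the CloudFront request inside the first record.
    request = event["Records"][0]["cf"]["request"]

    if request["uri"] == "/old-page":  # hypothetical legacy path
        # Returning a response object short-circuits the request at the edge.
        return {
            "status": "301",
            "statusDescription": "Moved Permanently",
            "headers": {
                "location": [{"key": "Location", "value": "/new-page"}],
            },
        }

    # Returning the request lets CloudFront continue processing it normally.
    return request
```

Returning the request unchanged keeps the function transparent for every path it does not explicitly handle.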

CloudFront policies

Cache Policies – When CloudFront receives a request, it creates a cache key from elements of the HTTP request, such as URI path, query strings, headers, and cookies. That cache key is a unique identifier used by CloudFront to store and retrieve cached content for your distribution. The instructions to compose the cache key live in cache policies. Optimizing the configuration of cache keys is one of the most effective ways to improve your cache hit rate – the percentage of requests served from CloudFront’s cache. A simple question to ask when deciding whether to include an HTTP request element in your cache key is: does your application serve different content based on the value of that request element?

For example, consider two requests: (1) /about.html and (2) /about.html?utm_medium=social. Because we only need the URI path to serve the html file, and query strings do not impact the html file we serve, we would not include query strings in our cache key. This makes sure both requests generate the same cache key. Alternatively, if an application serves articles that have relative paths such as /articles?id=1234, then we should include the id query string parameter in the cache key because different ids lead to different articles, and thus should be treated as separate requests. CloudFront makes it easier for you to get started with managed cache policies. For example, you can use the CachingDisabled managed cache policy to disable caching for dynamic content and APIs and the CachingOptimized managed cache policy to enable caching for static content.
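The two scenarios above can be sketched as follows. This is a simplified model that only considers the URI path and query strings (a real cache key can also include headers and cookies), and the `cache_key` function and its `query_keys_in_key` parameter are names invented for this illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_key(url, query_keys_in_key=()):
    """Build a simplified cache key from the path and selected query parameters."""
    parts = urlsplit(url)
    # Keep only the query parameters that the (hypothetical) cache policy includes.
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in query_keys_in_key
    ]
    return parts.path + ("?" + urlencode(kept) if kept else "")


# No query strings in the cache key: both requests share one cached object.
print(cache_key("/about.html"))
print(cache_key("/about.html?utm_medium=social"))

# The id parameter is part of the cache key: each article is cached separately.
print(cache_key("/articles?id=1234&utm_medium=social", query_keys_in_key=("id",)))
```

Dropping the tracking parameter from the key is what lets both `/about.html` requests hit the same cached object.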

In addition to the cache key, you can also configure TTL boundaries in a cache policy when caching is enabled. CloudFront honors the Cache-Control header sent by the origin, as long as it’s above the Minimum TTL and below the Maximum TTL configured in the cache policy. Otherwise, CloudFront uses the Minimum TTL or Maximum TTL respectively. If the origin does not return a Cache-Control header, then CloudFront uses the Default TTL.
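The TTL rule described above amounts to clamping the origin's value between the two boundaries. The following sketch models only that rule, assuming the origin expresses freshness as a `max-age` value in seconds; other Cache-Control directives are deliberately ignored, and the function name is illustrative.

```python
def effective_ttl(origin_max_age, min_ttl, default_ttl, max_ttl):
    """Return the TTL in seconds under the simplified rule described above.

    origin_max_age is None when the origin sends no Cache-Control header.
    """
    if origin_max_age is None:
        # No header from the origin: fall back to the Default TTL.
        return default_ttl
    # Otherwise honor the origin's value, bounded by Minimum and Maximum TTL.
    return max(min_ttl, min(origin_max_age, max_ttl))


# Example boundaries (illustrative values): 1 minute, 1 hour, 1 day.
print(effective_ttl(300, min_ttl=60, default_ttl=3600, max_ttl=86400))   # within bounds
print(effective_ttl(10, min_ttl=60, default_ttl=3600, max_ttl=86400))    # raised to minimum
print(effective_ttl(None, min_ttl=60, default_ttl=3600, max_ttl=86400))  # default applies
```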

Origin Request Policies – When CloudFront does not respond to an HTTP request from cache, it forwards the HTTP request to the origin. When forwarding the request to the origin, CloudFront mainly includes the query strings, cookies, and headers of the incoming HTTP request that are configured in the cache policy. If you need additional elements from the HTTP request forwarded to the origin that are not part of your cache key, then you should configure those in the Origin Request Policy. Additionally, you can include headers generated by CloudFront, such as the viewer’s device type or location.
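As a rough model of this behavior for headers, the set forwarded to the origin is the union of what the cache policy and the origin request policy list. The sketch below is a simplification under that assumption: it ignores query strings, cookies, and any headers CloudFront forwards by default, and the function and parameter names are hypothetical.

```python
def build_origin_request_headers(viewer_headers, cache_policy_headers,
                                 origin_request_policy_headers,
                                 cloudfront_headers=None):
    """Return the headers forwarded to the origin under this simplified model."""
    # Header names are case-insensitive, so compare them in lowercase.
    allowed = {name.lower() for name in cache_policy_headers}
    allowed |= {name.lower() for name in origin_request_policy_headers}

    forwarded = {
        name.lower(): value
        for name, value in viewer_headers.items()
        if name.lower() in allowed
    }
    # CloudFront-generated headers are only forwarded when a policy lists them.
    for name, value in (cloudfront_headers or {}).items():
        if name.lower() in allowed:
            forwarded[name.lower()] = value
    return forwarded


headers = build_origin_request_headers(
    viewer_headers={"Accept-Language": "fr", "User-Agent": "example-agent"},
    cache_policy_headers=["Accept-Language"],
    origin_request_policy_headers=["CloudFront-Viewer-Country"],
    cloudfront_headers={"CloudFront-Viewer-Country": "FR",
                        "CloudFront-Is-Mobile-Viewer": "false"},
)
print(headers)
```

In this example the viewer's country reaches the origin without becoming part of the cache key, which keeps the cache hit rate unaffected.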

Response Header Policies – Finally, Response Header Policies allow you to add HTTP response headers to the response generated by your origin before the response is sent to viewers. This makes it easy to configure CORS, add Cache-Control headers if the origin servers do not already set them, or add various HTTP security headers such as HTTP Strict Transport Security (HSTS) and Content Security Policy (CSP) without modifying your application code.

Model CloudFront configuration to your application specifics

To configure a CloudFront distribution for your web application (such as www.example.com), consider the following modeling exercise: First you need to identify the different parts of your web application and how each should be processed by CloudFront. For example:

Figure 3: Application delivery requirements

Next, translate the preceding table to a CloudFront configuration. First, you need to select which content should be served by the default cache behavior. It’s usually used for content that cannot be identified with just a few cache behaviors. In this case, we use the default cache behavior for HTML content. Then, for each content type, map a cache behavior:

Figure 4: CloudFront configuration

Finally, work backward from the preceding table, and configure the following constructs in the following order:

Required cache policies, required origin request policies, required response header policies, required edge functions, and a TLS certificate covering www.example.com. To use a TLS certificate with CloudFront, you must create it using AWS Certificate Manager (ACM) in the US East (N. Virginia) Region (us-east-1).
CloudFront distribution with default cache behavior for HTML pointing to the ALB origin. Configure the desired global settings such as minimum TLS versions, logging, AWS WAF, etc.
Additional Origins.
Additional cache behaviors for images, CSS, JavaScript and APIs, with each one configured with an origin.

Add AWS WAF WebACL to the CloudFront distribution

To quickly use AWS WAF with your CloudFront distribution, select the Enable security protections option in the Web Application Firewall (WAF) section of the CloudFront console. CloudFront handles creating and configuring AWS WAF for your distribution with out-of-the-box protections recommended by AWS for all applications. Alternatively, you can create your own AWS WAF web ACL – the container for your AWS WAF configuration – in the US East (N. Virginia) Region (us-east-1) and associate it with your CloudFront distribution manually.

Once enabled, AWS WAF evaluates every HTTP request your distribution receives against your configured rules within single-digit milliseconds, since it runs on the same physical edge hosts as CloudFront. Based on the rule evaluation, AWS WAF instructs CloudFront on how to process the request (such as block, forward, or challenge).

You can optionally add additional rules of different types:

Custom rules are rules you create and manage. Custom rules can be conditional based on the attributes of the inspected HTTP request (such as IP, headers, cookies, URL).
Managed Rules from AWS, or vendors on the AWS Marketplace, are added as configurable rule groups to your web ACL. For example, you can add AWS Managed Rules rule groups such as Core Rule Set, Known Bad Inputs, and Anonymous IP list (these are included automatically when using the Enable security protections option mentioned earlier). More advanced managed rules such as Bot Control and Account Takeover Prevention are recommended with client-side SDK integration. Some managed rules may incur additional fees per processed request.
Shield Advanced subscribers can benefit from its managed rules for automatic application layer DDoS mitigation.

Each rule can be configured with a specific action to take when it matches. For example, you can configure a rule to allow or count requests (with the possibility to send signal headers upstream), block requests (with the possibility to return a custom response), rate limit requests, or challenge using CAPTCHA or JavaScript.

AWS WAF evaluates rules in order. If there is a terminating match such as block or allow, subsequent rules are not evaluated. Non-terminating rules such as count can emit signals called labels that can be used in the logic of subsequent rules. For example, instead of blocking requests from an identified suspicious IP using an IP reputation-based rule, you can configure this rule to count and add a custom label to the request, and in a subsequent rule, rate limit requests tagged with this label.
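The evaluation order and the label mechanism can be modeled with a small simulation. This is a conceptual sketch only, not the AWS WAF API: the `Rule` class, the rule names, the `suspicious-ip` label, the example IP addresses, and the `requests_in_window` field (standing in for a real rate counter) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

SUSPICIOUS_IPS = {"203.0.113.7"}  # hypothetical IP reputation list


@dataclass
class Rule:
    name: str
    matches: Callable[[dict, set], bool]  # receives the request and labels so far
    action: str                           # "allow", "block", or "count"
    label: Optional[str] = None           # label emitted when the rule matches


def evaluate(rules, request, default_action="allow"):
    """Evaluate rules in order; stop at the first terminating match."""
    labels = set()
    for rule in rules:
        if not rule.matches(request, labels):
            continue
        if rule.label:
            labels.add(rule.label)
        if rule.action in ("allow", "block"):  # terminating actions
            return rule.action, labels
        # "count" is non-terminating: keep evaluating subsequent rules.
    return default_action, labels


rules = [
    # Count and label suspicious IPs instead of blocking them outright.
    Rule("ip-reputation", lambda req, labels: req["ip"] in SUSPICIOUS_IPS,
         "count", label="suspicious-ip"),
    # Block labeled requests only once they exceed a (simplified) rate threshold.
    Rule("rate-limit-suspicious",
         lambda req, labels: "suspicious-ip" in labels
         and req["requests_in_window"] > 100,
         "block"),
]

print(evaluate(rules, {"ip": "203.0.113.7", "requests_in_window": 500}))
print(evaluate(rules, {"ip": "203.0.113.7", "requests_in_window": 5}))
```

The second rule never fires for requests that the first rule did not label, which is how a count rule can feed a more targeted rule later in the list.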

Considerations for configuring rules in a Web ACL

Consider the following when designing rules in your AWS WAF web ACL.

First, complete a threat modeling exercise to determine the set of rules needed to protect your web application. For example, determine whether your application requires protection against SQL injections, and whether you have a business challenge caused by malicious bot activity.

Second, for each rule, make sure you scope it to where it’s specifically needed in your web application. For example, only enable protection against SQL injection on the paths of your application that require it. This reduces false positives, and it also reduces the cost of invoking rules with additional fees, such as the Bot Control managed rule group.

Third, once you know which rules you need, consider the order of evaluation. By placing less expensive terminating rules ahead of more expensive rules, you can optimize your AWS WAF cost by limiting the execution of the more expensive rules. For example, place rate-based rules and IP reputation based rules ahead of rules such as Bot Control. Similarly, allow trusted sources of traffic such as search engines ahead of other rules to avoid mistakenly blocking them.

Finally, you need to consider the following operational aspects of managing AWS WAF rules:

Keep the total web ACL capacity unit (WCU) consumption below 5000. The web ACL WCU equals the sum of the WCUs of the configured rules, each of which depends on the complexity of the rule.
Establish a methodology to manage false positives.
Think about the deployment model: manual using the console, automated using AWS Firewall Manager, or automated using infrastructure as code (IaC) tools such as AWS CloudFormation.
Enable visibility for your rules using Amazon CloudWatch metrics, or AWS WAF access logs.
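The WCU budget mentioned in the list above is simple arithmetic that can be checked before a deployment. The sketch below assumes you already know the WCU of each rule; the rule names and per-rule numbers are placeholders chosen for illustration, so check the actual values reported for your own rules.

```python
MAX_WEB_ACL_WCU = 5000  # limit stated in this post


def check_capacity(rules, limit=MAX_WEB_ACL_WCU):
    """Return (total WCU, whether the total fits within the limit)."""
    used = sum(wcu for _name, wcu in rules)
    return used, used <= limit


# Placeholder rules as (name, WCU) pairs; the numbers are illustrative only.
EXAMPLE_RULES = [
    ("rate-limit", 2),
    ("core-rule-set", 700),
    ("known-bad-inputs", 200),
    ("bot-control", 50),
]

used, fits = check_capacity(EXAMPLE_RULES)
print(f"{used} WCU used, within limit: {fits}")
```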

Conclusion

In this post, you learned the basic concepts of CloudFront and AWS WAF. With this high-level methodology to model the configuration of these AWS Edge Services, you can apply what you learned to one of your web applications, thereby improving and securing the online experience of your users.

To learn more, see the CloudFront Developer Guide, and the AWS WAF Developer Guide.

If you have feedback about this post, submit comments in the following Comments section. If you have questions about this post, start a new thread on the AWS WAF forum, the CloudFront Forum or contact AWS Support.

Achraf Souk

Achraf Souk is leading the Edge Specialist Solutions Architects team in EMEA. This team helps companies and developers to secure and accelerate their web applications using AWS Edge Services.

Cristian Graziano

Cristian Graziano is a Senior Product Manager with Amazon CloudFront based out of Seattle. He works across product, engineering, and UX to help first-time and experienced AWS customers quickly onboard, configure, and manage Amazon CloudFront and related AWS services.
