Increase origin offload
Overview
Increasing the Cache Hit Ratio (CHR) with CloudFront improves the performance of a web application and reduces the load on its origin. CHR is the ratio of HTTP requests served from the CloudFront cache to the total number of requests. Requests served from the CloudFront cache benefit from lower latency (e.g. a shorter time to last byte), making CHR a good indicator of both origin offload and application performance. The CHR of a CloudFront distribution can be monitored using the Cache hit rate CloudWatch metric. To increase CHR, you can optimize the caching configuration in CloudFront, enable Origin Shield, and optimize your application behavior.
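For illustration, the following is a minimal sketch of retrieving that metric with the AWS SDK for JavaScript v3. It assumes that the additional CloudFront metrics (which include Cache hit rate) are enabled for the distribution; the distribution ID is a placeholder.

```typescript
// Sketch: read the Cache hit rate metric for a distribution (AWS SDK v3).
// Assumes additional CloudFront metrics are enabled; the distribution ID is a placeholder.
import {
  CloudWatchClient,
  GetMetricStatisticsCommand,
} from "@aws-sdk/client-cloudwatch";

// CloudFront publishes its metrics in the us-east-1 Region.
const cloudwatch = new CloudWatchClient({ region: "us-east-1" });

async function getCacheHitRate(distributionId: string): Promise<void> {
  const now = new Date();
  const result = await cloudwatch.send(
    new GetMetricStatisticsCommand({
      Namespace: "AWS/CloudFront",
      MetricName: "CacheHitRate",
      Dimensions: [
        { Name: "DistributionId", Value: distributionId },
        { Name: "Region", Value: "Global" },
      ],
      StartTime: new Date(now.getTime() - 24 * 3600 * 1000), // last 24 hours
      EndTime: now,
      Period: 3600, // one datapoint per hour
      Statistics: ["Average"],
    })
  );
  for (const dp of result.Datapoints ?? []) {
    // Average percentage of cacheable requests served from the cache.
    console.log(dp.Timestamp, dp.Average);
  }
}

getCacheHitRate("EDFDVBD6EXAMPLE").catch(console.error);
```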
Increase cache time to live (TTL)
You can control how long an object is cached in CloudFront using the Cache-Control header sent by your origin, bounded by the Time To Live (TTL) settings configured in Cache Policies. Increasing TTLs has a positive impact on CHR; consequently, it's recommended to do the following (an origin-side sketch follows the list):
- Configure Cache-Control headers on the origin to better control TTL in CloudFront, and leverage browser caching.
- Cache static assets as immutable objects (e.g. Cache-Control: max-age=31536000, immutable), and version their URL path (e.g. /static/app.1be87a.js).
- Implement ETags on your objects so that CloudFront can send conditional HTTP requests (If-None-Match) to your origin and receive lightweight 304 Not Modified responses instead of full objects.
- For more dynamic content such as HTML, strike the right balance between caching efficiency (high TTL) and how much stale content the application can tolerate (low TTL).
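As one way to apply these recommendations, here is a minimal origin-side sketch, assuming a Node.js origin behind CloudFront; the paths, TTL values, and response bodies are illustrative, not prescriptive.

```typescript
// Sketch of origin-side Cache-Control handling for a hypothetical Node.js origin.
import { createServer } from "node:http";
import { createHash } from "node:crypto";

const html = "<html><body>About us</body></html>";

createServer((req, res) => {
  if (req.url?.startsWith("/static/")) {
    // Versioned static assets (e.g. /static/app.1be87a.js): cache for a year, immutable.
    res.setHeader("Cache-Control", "public, max-age=31536000, immutable");
    res.end("// static asset body");
  } else {
    // Dynamic HTML: short TTL plus an ETag so CloudFront can revalidate cheaply.
    const etag = `"${createHash("sha256").update(html).digest("hex")}"`;
    res.setHeader("Cache-Control", "public, max-age=60");
    res.setHeader("ETag", etag);
    if (req.headers["if-none-match"] === etag) {
      res.statusCode = 304; // nothing changed: respond without a body
      res.end();
    } else {
      res.end(html);
    }
  }
}).listen(8080);
```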
Optimize cache key settings
The cache key settings, configured using Cache Policies, dictate whether or not CloudFront reuses a cached object for an HTTP request. An optimized cache key configuration results in a one-to-one relation between unique cache keys and unique objects. Consider a web application that serves the same /about.html regardless of the appended query parameters (e.g. /about.html?utm_medium=social). If the cache key is configured to include the utm_medium query parameter, CloudFront caches different URLs (e.g. /about.html?utm_medium=social and /about.html?utm_medium=email) using two distinct cache keys. This results in two cache misses to the origin, even though both requests are for exactly the same file on the origin, which is suboptimal.
The first best practice for optimizing cache key settings is to include in the cache key only the request attributes that actually vary the origin's responses. To achieve this, it's recommended to do the following (a CDK-based sketch follows the list):
- Configure separate Cache Behaviors for objects that require different cache key settings.
- If query parameters, headers or cookies vary responses at the origin, only include the ones that actually do (e.g. the user_id cookie instead of all cookies).
- Use Response Headers Policies in CloudFront to manage CORS, instead of adding the CORS headers (e.g. Origin, Access-Control-Request-Method, Access-Control-Request-Headers) to the cache key and managing CORS at the origin level.
- Offload access control to CloudFront using signed URLs, CloudFront Functions, or Lambda@Edge, instead of adding the Authorization header to the cache key and managing access control at the origin level.
- Use an Origin Request Policy in CloudFront to forward HTTP request attributes that the origin needs but that don't vary its responses, instead of adding them to the cache key.
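A minimal sketch of these settings with the AWS CDK (aws-cdk-lib) is shown below; the resource names, the origin domain, and the allow-listed lang query parameter, user_id cookie, and CloudFront-Viewer-Country header are illustrative assumptions.

```typescript
// Sketch of cache key tuning with the AWS CDK; names, domain, and the
// allow-listed attributes are illustrative assumptions.
import { Duration, Stack, StackProps } from "aws-cdk-lib";
import * as cloudfront from "aws-cdk-lib/aws-cloudfront";
import * as origins from "aws-cdk-lib/aws-cloudfront-origins";
import { Construct } from "constructs";

export class OffloadStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Cache key: only the attributes that vary origin responses.
    const cachePolicy = new cloudfront.CachePolicy(this, "AppCachePolicy", {
      minTtl: Duration.seconds(1),
      defaultTtl: Duration.minutes(5),
      maxTtl: Duration.days(1),
      queryStringBehavior: cloudfront.CacheQueryStringBehavior.allowList("lang"),
      cookieBehavior: cloudfront.CacheCookieBehavior.allowList("user_id"),
      headerBehavior: cloudfront.CacheHeaderBehavior.none(),
      enableAcceptEncodingGzip: true,
      enableAcceptEncodingBrotli: true,
    });

    // Attributes the origin needs but that don't vary responses are
    // forwarded without being added to the cache key.
    const originRequestPolicy = new cloudfront.OriginRequestPolicy(this, "AppOriginRequestPolicy", {
      queryStringBehavior: cloudfront.OriginRequestQueryStringBehavior.all(),
      headerBehavior: cloudfront.OriginRequestHeaderBehavior.allowList("CloudFront-Viewer-Country"),
      cookieBehavior: cloudfront.OriginRequestCookieBehavior.none(),
    });

    // CORS managed by CloudFront, so CORS request headers stay out of the cache key.
    const responseHeadersPolicy = new cloudfront.ResponseHeadersPolicy(this, "AppCorsPolicy", {
      corsBehavior: {
        accessControlAllowCredentials: false,
        accessControlAllowHeaders: ["*"],
        accessControlAllowMethods: ["GET", "HEAD", "OPTIONS"],
        accessControlAllowOrigins: ["https://www.example.com"],
        originOverride: true,
      },
    });

    new cloudfront.Distribution(this, "AppDistribution", {
      defaultBehavior: {
        origin: new origins.HttpOrigin("origin.example.com"),
        cachePolicy,
        originRequestPolicy,
        responseHeadersPolicy,
      },
    });
  }
}
```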
The second best practice is to normalize request attributes before adding them to the cache key, to reduce the cardinality of their possible values (each of which results in a unique cache key). To achieve this, it's recommended to:
- Send query parameters, when used, in a consistent order and with consistent casing.
- Use CloudFront-generated headers such as CloudFront-Is-Mobile-Viewer to identify the device type, instead of adding the User-Agent header to the cache key.
- Use CloudFront Functions to apply more advanced normalizations (see the sketch below), such as: re-ordering and lowercasing query parameters; serving a different version of a web page based on whether a cookie is present, instead of adding the cookie value to the cache key; reducing the variance of responses that vary by country, when the same response can be sent to a group of countries; further reducing the cardinality of Accept-Encoding when compression is needed, or of the CloudFront device detection headers when multiple of them are used.
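As an example of the first normalization, the following viewer-request function sketch lowercases and sorts query parameters so that equivalent URLs share a single cache key. It is loosely based on the query string normalization example in the CloudFront documentation; the body is plain JavaScript (also valid TypeScript), and the exact handling of the event object should be verified against the current CloudFront Functions programming model.

```typescript
// Viewer-request CloudFront Function sketch: normalize query parameters so that
// /about.html?B=2&a=1 and /about.html?a=1&b=2 map to the same cache key.
function handler(event) {
  var request = event.request;
  var pairs = [];

  for (var key in request.querystring) {
    var entry = request.querystring[key];
    var name = key.toLowerCase(); // normalize parameter name casing
    if (entry.multiValue) {
      // A parameter repeated in the URL is exposed as a multiValue array.
      entry.multiValue.forEach(function (item) {
        pairs.push(name + "=" + item.value);
      });
    } else {
      pairs.push(name + "=" + entry.value);
    }
  }

  // Re-emit the query string with parameters in a deterministic (sorted) order.
  request.querystring = pairs.sort().join("&");
  return request;
}
```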
Enable Origin Shield
By default, CloudFront reduces the number of cache misses to the origin by using two layers of caching: one at the Point of Presence (PoP) level, and another at the Regional Edge Cache (REC) level. A CloudFront PoP is nominally associated with one of the 10+ RECs globally. When a request results in a cache miss at the PoP level, CloudFront checks the cache of the associated REC to fulfill the request, and only if the object is not cached in that REC does CloudFront forward the request to the origin. RECs are isolated from each other to maintain high availability, and consequently do not share their caches. As a result, when popular objects are requested from different locations around the world, CloudFront sends multiple requests for the same objects to the origin from multiple RECs.
To reduce the number of requests forwarded to the origin, you can enable Origin Shield in CloudFront, which adds a third layer of caching between the RECs and your origin. Instead of requesting objects directly from the origin, RECs first try to fulfill requests from the Origin Shield cache.
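A minimal sketch of enabling Origin Shield with the AWS CDK is shown below; the origin domain and the chosen Region are assumptions, and Origin Shield should generally be placed in the AWS Region closest to your origin.

```typescript
// Sketch: enable Origin Shield on an origin with the AWS CDK (aws-cdk-lib).
// The domain name and the shield Region are placeholder assumptions.
import * as origins from "aws-cdk-lib/aws-cloudfront-origins";

export const shieldedOrigin = new origins.HttpOrigin("origin.example.com", {
  originShieldRegion: "eu-west-1", // ideally the Region with the lowest latency to the origin
});

// The origin is then attached to a distribution behavior as usual, e.g.
// defaultBehavior: { origin: shieldedOrigin, cachePolicy }.
```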
Optimize application behavior
Consider application-level changes that can positively influence the cache hit ratio. Examples include the following (a short client-side sketch follows the list):
- Reducing the cardinality of objects served by your application. When your application produces multiple renditions of the same asset, for example different sizes of an image to fit different screens, consider limiting the number of possible values for width and height.
- Leveraging browser caching. This can be done using the Cache-Control header sent by your origin. You can differentiate TTLs between CloudFront and the browser either by combining the max-age and s-maxage directives in the Cache-Control header, or by using the Cache-Control header for the browser and controlling CloudFront TTLs with Cache Policies.
- Using byte-range requests in your clients to download only what is needed.
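For example, a Cache-Control value of max-age=60, s-maxage=86400 lets the browser cache an object for one minute while CloudFront caches it for a day. The sketch below illustrates the byte-range point from the client side; the URL, object path, and requested range are placeholder assumptions.

```typescript
// Sketch: download only the first 1 MiB of an object with an HTTP Range request.
// The CloudFront domain and object path are placeholders.
async function fetchFirstMebibyte(url: string): Promise<ArrayBuffer> {
  const response = await fetch(url, {
    headers: { Range: "bytes=0-1048575" }, // first 1 MiB only
  });
  // 206 Partial Content confirms that the range was honored.
  if (response.status !== 206) {
    throw new Error(`Expected a partial response, got ${response.status}`);
  }
  return response.arrayBuffer();
}

fetchFirstMebibyte("https://dxxxxxxxxxxxxx.cloudfront.net/videos/intro.mp4")
  .then((body) => console.log(`Downloaded ${body.byteLength} bytes`))
  .catch(console.error);
```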