AWS Security Blog

How to Prevent Hotlinking by Using AWS WAF, Amazon CloudFront, and Referer Checking

by Alex Smith | in How-to guides

At some point, you might have to deal with hotlinking: when third parties embed in their websites the content they find on your websites. The third-party website does not incur the cost of hosting the content, which means your website can end up paying for the content other sites use.

Now, you can use AWS WAF to help prevent hotlinking. AWS WAF is a web application firewall that is closely integrated with Amazon CloudFront (AWS’s content delivery network [CDN]), and it can help protect your web applications from common web exploits that could affect application availability, compromise security, and consume excessive resources. In this blog post, I will show you how to prevent hotlinking by using header inspection in AWS WAF, while still taking advantage of the improved user experience from a CDN such as CloudFront.

Process overview

You can address hotlinking in a number of ways. For instance, you can validate the Referer header (sent by a browser to indicate to the server which page they were referred from) at your web server (for example, by using the Apache module mod_rewrite), and issue either a redirect back to your site’s main page, or return a “403 Forbidden” error to the visitor’s browser.

If you are using a CDN such as CloudFront to speed up your site’s delivery of content, validating the Referer header at the web server becomes less practical. The CDN stores a copy of your content at the edge of its network of servers, so even if your web server validates the original request’s headers (in this case, the Referer), additional requests for that content must be validated by the CDN itself because they are unlikely to hit the origin web server.

The following diagram illustrates this process.

Process diagram

As illustrated in the preceding diagram, when (1) a request comes in from an end-user client to (2) an Amazon CloudFront edge location, the edge location attempts to return a cached copy of the file requested. This request, if fulfilled from the cache, is considered a cache hit. In the case of a cache miss—when the content is either not in the edge, or is not valid (for instance, if the content is out of date)—the (3) request goes back to the origin (for example, the origin could be Amazon S3) for a new copy of the object. In the case of a cache hit, the origin cannot apply any validation to the end user’s request, because the edge server does not need to contact the origin in order to fulfil the end user’s request.
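
To make the cache-hit problem concrete, here is a minimal Python sketch of an edge cache sitting in front of an origin that validates the Referer. The class names, the allowed prefix, and the file path are assumptions made for illustration, and the model ignores real CloudFront behavior such as cache expiry and header forwarding.

```python
from typing import Dict, Optional, Tuple

ALLOWED_PREFIX = "https://example.com/"  # assumption: the only valid referring site


class Origin:
    """Hypothetical origin server that validates the Referer on every request it sees."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self.files = files

    def fetch(self, path: str, referer: Optional[str]) -> Tuple[int, bytes]:
        if referer is None or not referer.startswith(ALLOWED_PREFIX):
            return 403, b""
        if path not in self.files:
            return 404, b""
        return 200, self.files[path]


class Edge:
    """Hypothetical edge location that caches successful responses by path only."""

    def __init__(self, origin: Origin) -> None:
        self.origin = origin
        self.cache: Dict[str, bytes] = {}

    def get(self, path: str, referer: Optional[str]) -> Tuple[int, bytes, str]:
        if path in self.cache:
            # Cache hit: the origin is never contacted, so its check never runs.
            return 200, self.cache[path], "hit"
        status, body = self.origin.fetch(path, referer)
        if status == 200:
            self.cache[path] = body
        return status, body, "miss"


edge = Edge(Origin({"/logo.png": b"png-bytes"}))
first = edge.get("/logo.png", "https://example.net/")   # cache miss, origin rejects
second = edge.get("/logo.png", "https://example.com/")  # cache miss, origin allows
third = edge.get("/logo.png", "https://example.net/")   # cache hit, served anyway
```

In this model, the third request comes from an unauthorized page but is still served, because the edge answers from its cache without consulting the origin.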

I will show you how to inspect headers with AWS WAF to block or allow requests at the CDN.

Solution implementation—two approaches

Terms

The following list includes key terms I use in this post:

  • AWS WAF configurations consist of a web access control list (web ACL), which is associated with a given CloudFront distribution.
  • Each web ACL is a collection of one or more rules, and each rule can have one or more match conditions.
  • Match conditions are made up of one or more filters, which inspect components of the request (such as its headers or URI) to match for certain conditions.
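
One way to picture how these terms nest is the following Python sketch. The class and field names are my own shorthand for illustration and are not AWS WAF API types.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Filter:
    part: str        # part of the request to inspect, e.g. "header:referer" or "uri"
    match_type: str  # e.g. "contains" or "starts_with"
    value: str       # value to match


@dataclass
class MatchCondition:
    name: str
    filters: List[Filter] = field(default_factory=list)


@dataclass
class Rule:
    name: str
    conditions: List[MatchCondition] = field(default_factory=list)


@dataclass
class RuleEntry:
    rule: Rule
    action: str  # "allow" or "block"


@dataclass
class WebACL:
    name: str
    default_action: str
    rules: List[RuleEntry] = field(default_factory=list)


# Hypothetical names; the structure mirrors the list above.
referer_condition = MatchCondition(
    name="referer-contains-example-com",
    filters=[Filter(part="header:referer", match_type="contains", value="example.com/")],
)
acl = WebACL(
    name="static-hotlink-protection",
    default_action="block",
    rules=[RuleEntry(rule=Rule("allow-own-referer", [referer_condition]), action="allow")],
)
```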

AWS WAF setup

I will show two approaches to preventing hotlinking:

  1. A separate subdomain – All static files (such as images or styling components like CSS) to be protected are separated onto a separate subdomain such as static.example.com so that I only need to validate the Referer header.
  2. The same domain – Static files sit under a folder on the same domain. In this case, I will also extend our example to check for an empty Referer header.

Approach 1: A separate subdomain

In this case, I create an AWS WAF web ACL that contains a single rule with a single match condition, which in turn comprises a single filter. The match condition checks the Referer header and verifies that it contains a given value. If the rule is matched, the traffic is allowed. Otherwise, the default action blocks the traffic. In the following steps, I show how to set this up by using the AWS WAF console.
Step 1: Determine what you need to protect.
Because I have all of my static files on a separate subdomain (static.example.com), accessed only from example.com, I will block hotlinking for any files accessible under static.example.com when the request’s Referer does not contain example.com/.

Because AWS WAF web ACLs can be applied only to Amazon CloudFront, be sure that you already have a distribution set up to serve this traffic. In this blog post, I will not cover the creation of CloudFront distributions, but this video covers this in more detail.
Step 2: Create and name a new web ACL.
Because this is the first time I have created a web ACL, I open the AWS WAF console (shown in the following screenshot) and then click Get started.

Screenshot of the AWS WAF console

If you have created a web ACL before, click Create web ACL on the AWS WAF console landing page.

Image of web ACL button

I then provide the name of the web ACL I am creating. At the same time, the page will automatically populate an associated Amazon CloudWatch metric name. CloudWatch is a monitoring service that allows you to gather and report on metrics of various services. This CloudWatch metric can be used later to report on how your newly created AWS WAF configuration is being used. After I have supplied the name of the web ACL, I click Next to go to the next page.
Step 3: Create a string match condition on Referer
In the String match conditions section, I click Create condition. I could use several types of conditions, but for AWS WAF to evaluate the string value of a Referer header, I choose a string match condition (see the following screenshot). This string match condition will inspect the Referer header on web requests for any string containing example.com/, which allows pages on any site under my domain to embed the content. In this case, I will not allow a blank Referer. I will assume that only our website can embed content under this domain.

If you need to increase security further, you can add match conditions that accept only valid Referer values by using Starts With (be sure to include the protocol, such as http:// or https://). For example, by using a value such as https://example.com/ (with the trailing slash), you could prevent someone from registering stealfromexample.com and using that to hotlink your content, or from including example.com/ elsewhere in the referring URL, such as in its path.
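
The difference between the two match types can be sketched in a few lines of Python. The helper names and the sample Referer values are assumptions for illustration; AWS WAF performs the real matching.

```python
def contains(referer: str, value: str) -> bool:
    """Model of a Contains match on the Referer value."""
    return value in referer


def starts_with(referer: str, value: str) -> bool:
    """Model of a Starts With match on the Referer value."""
    return referer.startswith(value)


# Hypothetical Referer values: two legitimate, two from third parties.
referers = [
    "https://example.com/page.html",
    "https://www.example.com/page.html",
    "https://stealfromexample.com/page.html",
    "https://example.net/example.com/page.html",
]

contains_results = [contains(r, "example.com/") for r in referers]
starts_with_results = [starts_with(r, "https://example.com/") for r in referers]
```

In this sketch, Contains accepts all four values, including the two third-party ones, while Starts With accepts only the first. Note that it also rejects www.example.com, so a stricter match needs one condition per legitimate origin.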

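I have also included a Transformation, which changes the header value to lowercase before it is matched. Most modern browsers already send the domain in lowercase, so this is rarely required; however, the string match itself is case sensitive, and the transformation guards against Referer values that arrive in mixed case.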
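
As a small illustration of why the lowercase step matters, the following Python sketch compares a match with and without it. The function name and the mixed-case Referer are assumptions for illustration.

```python
def referer_matches(referer: str, value: str, lowercase: bool) -> bool:
    """Model of a Contains match, optionally lowercasing the header value first."""
    candidate = referer.lower() if lowercase else referer
    return value in candidate


mixed_case = "https://Example.COM/page.html"  # hypothetical mixed-case Referer

without_transform = referer_matches(mixed_case, "example.com/", lowercase=False)
with_transform = referer_matches(mixed_case, "example.com/", lowercase=True)
```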

Be sure to click Add another filter after you have entered the configuration information, or your original filter will not be populated. Then click Create.

Image of creating the string match condition

When the string match condition has been populated, it will appear in the String match conditions section (see the following screenshot). I can now use this string match condition in a rule.

Image of the string match condition section

I click Next to go to the Create rules page.
Step 4: Create a new rule with the specified string match condition.
I now need to create a new rule that will filter based on the string match condition I just created.

First, I click Create rule, which allows me to specify a Name for the rule, an associated CloudWatch metric name, and the logic behind how the conditions are applied (as shown in the following screenshot).

Image of specifying logic behind how conditions are applied

After I have specified the conditions, I click Create, and the new rule is added automatically to the web ACL. As shown in the following screenshot, I set this newly created rule to Allow, and the Default Action to Block. I then click Next.

Image showing the setting of the rule to Allow and the Default Action to Block
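
The logic of this web ACL (one Allow rule, with Block as the default action) can be modeled in Python as follows. This is only a sketch of the decision; the function name is hypothetical and the actual evaluation happens inside AWS WAF.

```python
from typing import Optional


def approach1_action(referer: Optional[str]) -> str:
    """Return the action the Approach 1 web ACL is expected to take."""
    # Rule (Allow): the lowercased Referer contains example.com/
    if referer is not None and "example.com/" in referer.lower():
        return "ALLOW"
    # Default action (Block): everything else, including a missing Referer
    return "BLOCK"


own_site = approach1_action("https://example.com/")
third_party = approach1_action("https://example.net/")
no_referer = approach1_action(None)
```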
Step 5: Associate the new rule with the relevant CloudFront distribution (and test with cURL).
From the Resource drop-down list on the Choose AWS resource page, I can choose the relevant CloudFront distribution used for my static site delivery, which will allow me to easily associate the newly created AWS WAF web ACL with this distribution.

Image of choosing the CloudFront distribution

I click Review and create, which gives me a review page covering all of the details I have covered so far.

Image of the Review and create page

I have checked that this is the correct distribution, so I can click Confirm and create. This will begin the process of associating the web ACL with my CloudFront distribution, which will typically take around 10–15 minutes.

The result

Now when I request files without the whitelisted Referer header, the requests are blocked at the CDN. However, valid requests still are allowed through.
When a third party embeds our content (request blocked at the CDN)

» curl -H "Referer: https://example.net/" -I https://static.example.com/favicon.ico
« HTTP/1.1 403 Forbidden

When I embed our content (request allowed through the CDN)

» curl -H "Referer: https://example.com/" -I https://static.example.com/favicon.ico
« HTTP/1.1 200 OK
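
If you would rather script these checks than run cURL by hand, a rough Python equivalent using only the standard library might look like the following. The function names are hypothetical, and `status_for` contacts the network, so it is defined here but not called.

```python
import urllib.error
import urllib.request
from typing import Optional


def build_head_request(url: str, referer: Optional[str] = None) -> urllib.request.Request:
    """Build a HEAD request (like curl -I), optionally with a Referer header."""
    headers = {"Referer": referer} if referer is not None else {}
    return urllib.request.Request(url, headers=headers, method="HEAD")


def status_for(url: str, referer: Optional[str] = None) -> int:
    """Send the request and return the HTTP status code, including error statuses."""
    request = build_head_request(url, referer)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is still what we want.
        return err.code


request = build_head_request(
    "https://static.example.com/favicon.ico", referer="https://example.net/"
)
```

With the web ACL from this approach in place, I would expect `status_for` to return 403 for the example.net Referer and 200 for the example.com Referer, matching the cURL output above.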

With Approach 1, I must make the request with a whitelisted Referer header, and in this case, all paths are filtered. In Approach 2, I will allow a blank Referer header, and I also will show how to filter by a given URL path.

Approach 2: All content under the same domain, with filtering by path

In this second approach, I will create an AWS WAF web ACL that contains multiple rules with additional match conditions, which in turn comprise multiple filters. As with the first approach, the match condition looks at the Referer header; however, I now validate it in two ways: first, I validate whether it starts with my expected value, and if not, I move on to my second validation, which checks whether the request has any “URL style” Referer header at all. Because a request with no Referer header fails both checks and is allowed, I can access the assets directly in a browser when they are not embedded in a website, while still getting protection against hotlinking.

I also validate the path (in this case /wp-content) used in the request, which allows AWS WAF to protect individual folders under a single domain name.
Step 1: Determine what you need to protect.
Unlike the first approach, in which I filter on everything under a domain, in this second approach I will filter based on the path, /wp-content. This allows me to protect my uploaded content that sits under /wp-content, without having to move it onto a separate subdomain.
Step 2: Create and name a new web ACL.
As with the previous approach, I create a new web ACL in the AWS WAF console by clicking Create web ACL. I will assume you have already created the web ACL from Approach 1, but if not, check Approach 1’s instructions about this step.

As I did in the first approach, I supply the name of the web ACL I am creating. After I have supplied the name of the web ACL, I click Next to go to the next page.
Step 3: Create string match conditions on the Referer
For Approach 2, I am assuming that everything exists under a single domain, so rather than using the catch-all example.com/, I choose the more secure https://example.com/, and I mark the header as Starting With this value. Because I am explicitly filtering on one header, I need to watch out for two things:

  • Switching between www.example.com and example.com in my application.
  • Switching between https:// and http:// in my application.

If either of these switches occurs, I will see a “403 Forbidden” error returned instead of my embedded files. In this example, all content is delivered directly through https://example.com/.
First string match condition
To create these match conditions, I click Create condition next to String match condition, as shown in the following screenshot.

Image of first step in creating new string match condition

I then configure the new match conditions and filters, as shown in the following screenshot.

Image of configuring new string match condition

Again, remember to click Add another filter before you click Create, or the filter will not be added to the condition.
Second string match condition
After I have created this string match condition for Approach 2, I need to create two more string match conditions—one for the URL path (/wp-content) itself, and one to validate whether there is no Referer, which is useful in scenarios with noncompliant client applications or where you need to directly link (from an email, for instance).

For the URL itself, I want to protect content under /wp-content, so I will create a string match to validate that case. I go through the same steps as before. This time, I change the part of the request to filter on URI, and the value to match as /wp-content, as shown in the following screenshot.

Image of filtering content under /wp-content

Again, I click Add another filter, and then click Create to create my second string match condition, which is shown in the following screenshot.
Third string match condition
With the first two string match conditions created, I move on to create my final string match condition, which I will use to determine whether the Referer is set or not set.

Again, I click Create condition above our previously created conditions. This time, I will create a filter matching on the Referer header, and match on the presence of ://.

Image of matching on the presence of ://

Again, I click Add another filter before clicking Create. I then have all three of the string match conditions that I will use, as shown in the following screenshot.

Image of filter with all three string match conditions
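
For readers who prefer to see the three conditions as data, here is a sketch that expresses them as Python dictionaries. The field names follow my understanding of the ByteMatchTuple structure in the AWS WAF API, the condition names are hypothetical, and the lowercase transformation and the Starts With constraint on the URI are assumptions. Check everything against the API reference before sending it to AWS; the sketch itself makes no API calls.

```python
from typing import Any, Dict, Optional


def byte_match_tuple(
    field_type: str,
    target: bytes,
    constraint: str,
    header: Optional[str] = None,
    transformation: str = "LOWERCASE",
) -> Dict[str, Any]:
    """Build one filter in the shape I believe the WAF API expects."""
    field_to_match: Dict[str, str] = {"Type": field_type}
    if header is not None:
        field_to_match["Data"] = header  # header name, used when Type is HEADER
    return {
        "FieldToMatch": field_to_match,
        "TargetString": target,
        "TextTransformation": transformation,
        "PositionalConstraint": constraint,
    }


# Hypothetical condition names mapped to one filter each.
CONDITIONS = {
    "referer-is-example-com": byte_match_tuple(
        "HEADER", b"https://example.com/", "STARTS_WITH", header="referer"
    ),
    "uri-is-wp-content": byte_match_tuple("URI", b"/wp-content", "STARTS_WITH"),
    "referer-is-set": byte_match_tuple("HEADER", b"://", "CONTAINS", header="referer"),
}
```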

I click Next to go to the Create rules page.
Step 4: Create two rules with the specified string match conditions created in Step 3.
Creating the rules in Approach 2 is more complex than in Approach 1. Now I need to create two rules: one that validates a valid Referer header, and one that catches requests whose Referer header is set to anything else, leaving requests with no Referer header to the default action.

Rule 1: Validate a Referer header.
This first rule matches when the Referer header starts with the expected value (https://example.com/) and the URI matches /wp-content. First, I click Create rule, which allows me to specify a Name for the rule, an associated CloudWatch metric name, and the logic behind how the conditions are applied (as shown in the following screenshot).

Image of creating Rule 1

When the rule has been populated as above, I click Create and the rule is automatically added to my web ACL.

Rule 2: Validate requests with no Referer header.
This second rule is similar to the first, and matches when the URI matches /wp-content and the Referer header includes ://. I use this as a simple way to check whether the Referer header has been set at all. If it has (and the first rule has not already allowed the request), I choose to block the request. I configure that action after all the rules have been created and added to the web ACL.

Image of creating Rule 2

Again, once the rule has been populated as above, I press Create and the rule is automatically added to my web ACL.

After I have created both of these rules, and the rules have been added to my web ACL, I can take advantage of the AWS WAF ordering capabilities, which control the order in which rules are applied. I have chosen to:

  • Match for the path of /wp-content and a Referer that is valid. If so, allow the request.
  • Match for a path of /wp-content and a Referer that is invalid. If so, block the request.
  • Otherwise, allow all requests by default (for paths that are not /wp-content).

This order of operations results in the following rule configuration.

Image of the rule configuration
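
The effect of this ordering can be sketched in Python as an ordered list of rules in which the first matching rule decides the action. The function names are hypothetical, and the Starts With match on the URI and the lowercase step are assumptions; the real evaluation is performed by AWS WAF.

```python
from typing import Callable, List, Optional, Tuple

Condition = Callable[[str, Optional[str]], bool]


def uri_is_wp_content(uri: str, referer: Optional[str]) -> bool:
    return uri.startswith("/wp-content")


def referer_is_valid(uri: str, referer: Optional[str]) -> bool:
    return referer is not None and referer.lower().startswith("https://example.com/")


def referer_is_set(uri: str, referer: Optional[str]) -> bool:
    return referer is not None and "://" in referer


# Rules are evaluated in order; every condition in a rule must match.
RULES: List[Tuple[str, List[Condition]]] = [
    ("ALLOW", [uri_is_wp_content, referer_is_valid]),  # Rule 1
    ("BLOCK", [uri_is_wp_content, referer_is_set]),    # Rule 2
]
DEFAULT_ACTION = "ALLOW"


def evaluate(uri: str, referer: Optional[str] = None) -> str:
    """Return the action of the first matching rule, or the default action."""
    for action, conditions in RULES:
        if all(condition(uri, referer) for condition in conditions):
            return action
    return DEFAULT_ACTION


image = "/wp-content/uploads/2013/03/shareable-image.jpg"
embedded_by_me = evaluate(image, "https://example.com/")
embedded_by_third_party = evaluate(image, "https://example.net/")
direct_link = evaluate(image)
other_path = evaluate("/index.html", "https://example.net/")
```

The ordering matters: a valid Referer also contains ://, so if Rule 2 were evaluated first, it would block my own pages as well.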
Step 5: Associate the new rule with the relevant CloudFront distribution (and test with cURL).
From the Resource drop-down list on the Choose AWS resource page, I can choose the relevant CloudFront distribution used for my static site delivery, which will allow me to easily associate the newly created AWS WAF web ACL with this distribution.

Image of associating the distribution with the web ACL

I click Review and create, which gives me a page that shows all of the details I have covered so far in Approach 2.

Image of the Review and create page

If the details look right, I click Confirm and create. Again, it will take 10–15 minutes to push changes out.

The result

As with Approach 1, I have filtering at the CDN, but this time the filtering is based on the path and direct linking is allowed (without a Referer header).

Here I use cURL to verify that the new AWS WAF web ACL correctly protects my content. I use the -H argument to send a different Referer header to the CloudFront distribution, which allows me to test as if I am embedding my content in an unauthorized page.
When a third party embeds our content

» curl -H "Referer: https://example.net/" -I https://example.com/wp-content/uploads/2013/03/shareable-image.jpg
« HTTP/1.1 403 Forbidden

When our content is directly linked (with no Referer)

» curl -I https://example.com/wp-content/uploads/2013/03/shareable-image.jpg
« HTTP/1.1 200 OK

When I embed our content

» curl -H "Referer: https://example.com/" -I https://example.com/wp-content/uploads/2013/03/shareable-image.jpg
« HTTP/1.1 200 OK

If you have comments about this blog post, submit them in the “Comments” section below. If you have questions about this solution or its implementation, start a new thread on the AWS WAF forum.

– Alex