AWS Partner Network (APN) Blog
Contentful Delivers Secure and Low-Latency Media Files Worldwide Using Amazon CloudFront and Lambda@Edge
By Leonardo Freitas dos Santos, Software Engineer – Contentful
By Darragh O’Flanagan, Sr. Partner Solutions Architect – AWS
In the digital-first era, being a content platform means serving thousands of requests per second to users all over the world.
In this post, we will explore how Contentful delivers media files using Amazon CloudFront with Lambda@Edge, achieving low latency in every region worldwide while also running custom code at the edge that enables security features and high availability.
Amazon CloudFront is a global content delivery network (CDN) service built for high-speed, low-latency performance, security, and developer ease-of-use.
Lambda@Edge is a feature of Amazon CloudFront that lets you run code closer to users of your application, which improves performance and reduces latency.
Additionally, Amazon CloudFront integrates with AWS Shield and AWS WAF to provide layered protection against attacks at the network, transport, and application layers. CloudFront also integrates with a variety of other AWS services when building a service architecture.
Contentful is an AWS Competency Partner with an AWS-qualified software offering that lets digital builders leverage content structuring, orchestration, and reusability to manage the full lifecycle of content across the organization.
Challenges of Delivering Low-Latency Assets
Contentful extends the capabilities of a headless content management system (CMS) by offering an open and extensible platform that sits at the heart of the modern tech stack. Using an API-first approach, Contentful enables teams to scale, manage, and deliver content predictably across multiple channels such as websites, mobile apps, or other digital products.
Delivering assets to users worldwide with high availability and low latency can be quite challenging if you need custom code running before delivering them. Contentful has a handful of reasons to have such requirements.
The first is to ensure archived assets are not available anymore. It’s possible to put a tombstone (a deletion mark used for soft-deleting the file) on them on Amazon Simple Storage Service (Amazon S3) and proceed to restore them after they are unarchived, but this seems more complex.
Instead, Contentful runs a validation that aligns the delivery of the asset file with its metadata status, blocking the file from being served if necessary.
Contentful also has image transformation capabilities you can perform on the fly. If such transformation parameters are supplied, you need to route the request to the transformation API first instead of just going to Amazon S3.
To get load balancing and automatic failover for disaster recovery, Contentful keeps a replica (backup) S3 bucket for all assets.
Contentful serves from the replica even when the primary is healthy. Otherwise, an outdated or broken replica could go unnoticed until an outage, simply because it had never been used before that moment.
Lastly, Contentful supports signed asset URLs through embargoed assets, which gives customers fine-grained control over who can access asset URLs, from where, and for how long.
One option would be to spin up a service, deployed in every region, that acts as the entry point for asset delivery requests and performs these validations and routing decisions. This is where Lambda@Edge comes in instead.
Contentful attaches Lambda code to the existing CloudFront distribution. The code can be executed before and/or after the cache layer, for both the request and the response, and all of it runs close to the user's location.
Contentful can then, as an example, run cheap JSON Web Token (JWT) validations before the cache layer, and have dynamic origin selection, load balancing, and further validations after the cache layer.
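To make the "cheap validation before the cache layer" idea concrete, here is a minimal sketch of a viewer-request handler that checks an HS256-signed JWT using only the Python standard library. It is an illustration, not Contentful's implementation: the `token` query parameter, the `SECRET` value, and the choice of HS256 are all assumptions, and `sign_token` exists only to produce a token for the example.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional
from urllib.parse import parse_qs

# Hypothetical shared secret; real key material would be loaded securely.
SECRET = b"example-secret"


def _b64url_decode(part: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def sign_token(claims: dict, secret: bytes = SECRET) -> str:
    """Build an HS256 JWT. Only used here to create a token for the example."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def is_token_valid(token: str, secret: bytes = SECRET,
                   now: Optional[float] = None) -> bool:
    """Check the signature and the `exp` claim; any parsing error means invalid."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        if header.get("alg") != "HS256":
            return False
        expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(),
                            hashlib.sha256).digest()
        # Constant-time comparison avoids leaking how many bytes matched.
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return False
        claims = json.loads(_b64url_decode(payload_b64))
        exp = claims.get("exp")
        current = time.time() if now is None else now
        return isinstance(exp, (int, float)) and exp > current
    except (ValueError, TypeError, AttributeError):
        return False


def handler(event, context):
    """Viewer-request trigger: reject the request before the cache is consulted."""
    request = event["Records"][0]["cf"]["request"]
    # "token" is an assumed query parameter name.
    token = parse_qs(request.get("querystring", "")).get("token", [""])[0]
    if not is_token_valid(token):
        return {"status": "403", "statusDescription": "Forbidden",
                "body": "Invalid or expired token"}
    return request  # Returning the request lets CloudFront continue processing.
```

Because this check needs no network call, it fits the viewer-request stage, where every request pays its cost whether or not the asset is cached.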
Solution Overview
Let’s put the specific embargoed assets use case aside, and focus on the primary asset distribution use case. Here is the big picture of Contentful’s infrastructure for delivering assets:
Figure 1 – Asset delivery request diagram.
Let’s go through this diagram on how requests are fulfilled:
- CloudFront point of presence (PoP) receives the request and checks if there’s a cached and still valid version to serve.
- If the request cannot be served from cache, the Lambda@Edge function is executed in the region closest to the user before the request goes to the origin.
- It performs some sync validations to ensure you don’t spend time on an origin request for a malformed URL.
- Next, it performs an async asset validation to request some metadata for that asset. This prevents, as an example, archived assets from being served.
- It can also modify the request to change the origin from which the file is fetched. This happens on Contentful’s images.ctfassets.net distributions whenever the user provides transformation parameters. Instead of going to Amazon S3 to fetch the asset, the platform sends the request to an image transformation API that fetches the asset, performs the transformations, and returns the result to CloudFront.
- When the origin should still be S3, Contentful performs DNS load balancing between the two S3 buckets. This enables load balancing and automatic failover: if one of the buckets is down, its DNS health check fails and traffic is redirected to the healthy one.
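The origin-request steps above can be sketched as a single handler. This is a simplified illustration under stated assumptions, not Contentful's code: `TRANSFORM_API_DOMAIN`, the `TRANSFORM_PARAMS` set, the `ASSET_PATH` URL shape, and the `is_archived` placeholder are all hypothetical, and the metadata check is shown as a plain synchronous call for brevity. The `request` and `origin` fields follow the Lambda@Edge event structure.

```python
import re
from urllib.parse import parse_qs

# Hypothetical values, for illustration only.
TRANSFORM_API_DOMAIN = "transform.example.com"
TRANSFORM_PARAMS = {"w", "h", "fm", "q", "fit"}
# Assumed URL shape: /<space>/<asset>/<hash>/<filename>
ASSET_PATH = re.compile(r"^/[A-Za-z0-9]+/[A-Za-z0-9]+/[A-Za-z0-9]+/[^/]+$")


def is_archived(uri: str) -> bool:
    """Placeholder for the asset metadata lookup; always says 'not archived'."""
    return False


def error(status: str, description: str) -> dict:
    # Returning a response object short-circuits the origin request.
    return {"status": status, "statusDescription": description}


def handler(event, context, archived_check=is_archived):
    """Origin-request trigger: validate, then pick the origin."""
    request = event["Records"][0]["cf"]["request"]

    # 1. Cheap synchronous check: skip the origin for malformed URLs.
    if not ASSET_PATH.match(request["uri"]):
        return error("400", "Bad Request")

    # 2. Metadata check: archived assets must not be served.
    if archived_check(request["uri"]):
        return error("404", "Not Found")

    # 3. Dynamic origin selection: transformation parameters reroute the
    #    request from S3 to the (hypothetical) image transformation API.
    params = parse_qs(request.get("querystring", ""))
    if any(name in params for name in TRANSFORM_PARAMS):
        request["origin"] = {
            "custom": {
                "domainName": TRANSFORM_API_DOMAIN,
                "port": 443,
                "protocol": "https",
                "path": "",
                "sslProtocols": ["TLSv1.2"],
                "readTimeout": 30,
                "keepaliveTimeout": 5,
                "customHeaders": {},
            }
        }
        # The Host header must match the new origin's domain name.
        request["headers"]["host"] = [
            {"key": "Host", "value": TRANSFORM_API_DOMAIN}
        ]

    # Otherwise the configured S3 origin is left untouched.
    return request
```

The failover between the two S3 buckets does not appear in the handler because, as described above, it happens at the DNS level rather than in code.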
It’s worth mentioning that Contentful has every async validation heavily cached because having the Lambda that runs close to the user always go to the datastores would counteract the improved latency and performance of Lambda@Edge.
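One simple way to cache such lookups is an in-memory cache with a time-to-live, kept at module level so it survives across invocations served by the same warm Lambda execution environment. The post does not say where or how Contentful caches its validations, so the `TTLCache` class and the 60-second TTL below are purely illustrative.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # Injectable so expiry can be tested without sleeping.
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[Hashable], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # Fresh hit: no datastore round trip.
        value = loader(key)  # Miss or expired: call the slow lookup once.
        self._entries[key] = (now + self._ttl, value)
        return value


# Module-level instance: reused while the execution environment stays warm.
metadata_cache = TTLCache(ttl_seconds=60)
```

A handler would then call `metadata_cache.get_or_load(asset_id, fetch_metadata)`, where `fetch_metadata` stands in for whatever function performs the real lookup. The trade-off is staleness: a change in an asset's status can take up to one TTL to be noticed at a given edge location.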
With Lambda@Edge, Contentful avoids the cost of maintaining such a distributed service worldwide, as Lambda executions are fairly inexpensive and only charged on demand.
Achieving the desired functionality was also much less complex. Contentful can focus on handling the request and on core business logic instead of spending time scaffolding boilerplate code.
Also, disaster recovery is easily managed with automatic failover to the healthy bucket. Contentful can not only react swiftly to disasters but also remain fully operational, with no functionality impaired.
Finally, performance is a big benefit of this setup. Even with all of those features in place, the code runs close to the user and all external dependencies are heavily cached. As a result, of the one million requests Contentful serves per minute, even the slowest 5% complete in under 100 ms on average in most regions. The overall average is lower still, at around 20 ms.
Conclusion
With Lambda@Edge, Contentful can easily ship code that runs close to the user to deliver secure assets with asynchronous validations, high availability, and dynamic origin requests.
In the digital-first era, where everything is connected, it’s important that those assets are also delivered with low latency without sacrificing any feature or security, and Amazon CloudFront with Lambda@Edge is a key part of achieving this.
Because the complexity of running edge compute operations is abstracted away, it is easier to focus on what matters and deliver secure assets with asynchronous validations and dynamic origin requests. Lastly, high availability and automatic failover mean being able to swiftly recover from disasters and keep delivery of media files up and running for customers even in unpredictable scenarios.
If you want to empower your delivery of files worldwide with custom code, low latency, and easy development, Lambda@Edge and the new Amazon CloudFront Functions might be just what you are looking for.