Cutting Costs with AWS Lambda for Highly Scalable Image Processing

By Tomislav Capan, Principal Solution Architect at Toptal

Let’s say you’ve got thousands of users who love to upload big and heavy images from their 40+ megapixel smartphone cameras. Each image needs to be processed into several variations for thumbnails, previews, and reasonably-sized images for web and mobile display.

All of that processing eats up computing power (CPU) and random access memory (RAM). The more concurrent uploads you get, the more they wreak havoc on your servers. As you scale, this approach simply doesn’t hold up.

Toptal is an AWS Partner Network (APN) Advanced Consulting Partner with the AWS Lambda Service Delivery designation. We provide application development and DevOps solutions for Amazon Web Services (AWS) customers.

Toptal helped build and automate the cloud infrastructure for LEVELS, a social network with an integrated payment function. LEVELS also finds and applies the VIP benefits available to users through the credit card programs they belong to, the things they own, or the places they live.

With LEVELS, a guest can visit a luxury restaurant, be immediately recognized by the staff, taken to preferred seating, and pay for dinner without ever taking out a credit card or managing logistics. Payments are processed through LEVELS’ built-in “Get Up and Go” payment method, available in more than 6,000 locations globally.

This allows users to focus more on enjoying unique experiences and less on managing them. They can share their experiences with other users on the LEVELS social network through verified testimonials, and get endorsed by one of the LEVELS partners.

In this post, I will review a horizontally scalable solution applied to LEVELS for their image upload processing. This type of serverless solution can reduce the strain on your application programming interface (API) servers, and eliminate the need for running separate servers to handle spikes without crashing.

How to Optimize the Delivery of Images

LEVELS accepts users’ uploaded images, which can easily reach more than 8000 x 5000 pixels in size and a few megabytes in weight on modern smartphones. Because of this, there was a clear need for LEVELS to optimize the delivery of such images by providing smaller sizes and thumbnails.

The main goal was to reduce the bandwidth usage by the web and mobile apps, thereby speeding up image delivery within the app.

Let’s take a look at the three solutions we proposed to LEVELS for resizing the user uploaded images.

Resize on the API Server

The LEVELS team’s initial idea was to simply resize images into the desired size variations on the API server, or as a background job on Amazon Elastic Compute Cloud (Amazon EC2) instances. A proof-of-concept (PoC) image processing setup running on AWS Lambda yielded the following results for the largest images:

Processing time exceeded two seconds per uploaded image.
ImageMagick used 1.5 GB of RAM, the maximum available on AWS Lambda at the time of the PoC.

Such processing times and RAM usage called for bigger Amazon EC2 instances than the application itself needs. Adding multiple concurrent user uploads into the equation makes the problem worse. LEVELS ended up heavily over-provisioning the Amazon EC2 instance size just to handle occasional spikes.

This was not the best way to approach the problem or spend the money. Working on this solution opened an opportunity to take the AWS Lambda route once LEVELS committed to using the AWS Cloud for their infrastructure needs.

Processing as Background Jobs

One could argue this is solvable by queuing the image processing. Handing this over to background workers would allow them to tackle it at their own pace, while auto scaling could bring up more resources as needed.

Although this solves the problem of over-provisioning to some extent, the resulting eventual consistency has real-life implications. While the queue drains during a spike, creating the image variations may take unacceptably long, further deferring the availability of the resized images.

Finally, don’t forget the cost of cool-down periods until the extra resources get shut down after the spike has passed.

Horizontal Scaling with AWS Lambda

Because image uploads are stochastic, unpredictable events, the best way to handle them is with horizontal scaling instead of vertical scaling.

The most cost-effective solution would:

Bring up just enough computing resources.
Have these resources at the time of the events.
Release the resources immediately after the event.

The final solution we proposed to LEVELS was to leverage AWS Lambda, which can handle processing tasks at scale as they come in.

This serverless solution eliminates the additional cost of running extra Amazon EC2 capacity that often sits idle. Instead, processing with Lambda costs next to nothing per image.

Serverless Computing Model

I strongly recommend you take a moment to understand the serverless computing model. It’s a cloud-computing execution model whose cost is based on the actual amount of resources consumed by an application, rather than on pre-purchased units of capacity.

The serverless computing model enables the cloud provider to:

Run the server.
Dynamically manage the allocation of machine resources.
Scale quickly and respond to the number of events happening concurrently.

The capabilities described above fit our use case with LEVELS perfectly. Users’ uploads happen rather unpredictably and with highly varying frequency throughout the day. With the serverless computing model, each event brings up a new processing instance without any delay, the task is handled, and the processing instance stops.

This serverless solution has the following benefits for LEVELS:

Infinitely horizontally scalable.
Instantly available.
No significant warm-up times.
No cool-down periods.
Practically no limits.
No queues.
Predictable processing time.
Charges only for the running time and computing resources used for the task.

LEVELS Image Processing Walkthrough

To increase the images’ loading speed and save the bandwidth used by the mobile and web applications, LEVELS needed images smaller than the uploaded originals.

The LEVELS team specified four different image sizes they would use, plus a medium-sized square crop and a small-sized square crop thumbnail. That’s a total of six variations for each original image, optimized for use in different application contexts.
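To make the six variations concrete, here is a minimal sketch of how such a specification could be expressed and how target dimensions could be computed. The variation names and pixel sizes are hypothetical (the actual LEVELS sizes are not given in this post), and the choice never to upscale small originals is my assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variation:
    name: str
    width: int
    height: int
    square_crop: bool = False


# Hypothetical names and sizes: four bounded sizes plus two square crops.
VARIATIONS = (
    Variation("large", 2048, 2048),
    Variation("medium", 1024, 1024),
    Variation("small", 512, 512),
    Variation("tiny", 256, 256),
    Variation("square_medium", 512, 512, square_crop=True),
    Variation("square_small", 128, 128, square_crop=True),
)


def fit_within(src_w, src_h, max_w, max_h):
    """Scale (src_w, src_h) to fit inside the box, keeping the aspect ratio.

    Images already smaller than the box are left at their original size.
    """
    scale = min(max_w / src_w, max_h / src_h, 1.0)
    return max(1, round(src_w * scale)), max(1, round(src_h * scale))


def output_size(src_w, src_h, variation):
    """Return the pixel size a variation would have for a given original."""
    if variation.square_crop:
        # A square crop cannot be larger than the shorter side of the original.
        side = min(variation.width, src_w, src_h)
        return side, side
    return fit_within(src_w, src_h, variation.width, variation.height)
```

For an 8000 x 5000 original, the hypothetical "large" variation above would come out at 2048 x 1280 pixels.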

Additionally, we at Toptal decided that a backup copy of each original should be created, providing added insurance against accidental data loss. Those backup copies were to be stored in a separate location with highly restricted access permissions to reduce the risk of deletion, whether accidental or deliberate in case of a security breach.

All image variations would be produced immediately on upload so that they could be readily available when requested by the application’s end users.

Here’s how you can do this for optimal results.

Step 1: Image Manipulation Tool

Producing image variations requires an image manipulation tool, and ImageMagick is a widely used one. Until mid-2019, ImageMagick was available as part of the AWS Lambda runtime, but now it must be included separately.

Customers can add custom runtime dependencies through AWS Lambda Layers. There’s an ImageMagick Lambda Layer freely available in the AWS Serverless Application Repository. With just a click of a Deploy button, it becomes available to all Lambda functions under the AWS account.

Figure 1 – Deploying the ImageMagick Lambda Layer to your AWS account.

The ImageMagick Lambda Layer needs to be added to the Lambda function, where it becomes available to the function’s code.

This is done in the AWS Lambda Function Management Console by selecting Layers in the Designer section. From there, select Add a Layer in the Layers section opened below.

Figure 2 – Adding a layer to the AWS Lambda function.

In the Add Layer to Function screen that opens, choose the “Select from list of runtime compatible layers” option, and then from the Compatible Layers drop-down choose a layer named “image-magick.” Finally, from the Version drop-down, choose the latest version available. Click the Add button to complete the action.

Figure 3 – Selecting the ImageMagick layer to add.
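If you prefer to script this step rather than click through the console, the same result can be reached through the Lambda API. The sketch below is not what we ran for LEVELS; it assumes a boto3 Lambda client is passed in, and the function name and layer ARN in the usage comment are placeholders. Because updating the Layers setting replaces the whole list, the helper first reads the layers already attached.

```python
def attach_layer(lambda_client, function_name, layer_version_arn):
    """Attach a layer version to a function, keeping layers already attached.

    `lambda_client` is expected to behave like boto3.client("lambda").
    Returns the resulting list of layer version ARNs.
    """
    config = lambda_client.get_function_configuration(FunctionName=function_name)
    current = [layer["Arn"] for layer in config.get("Layers", [])]
    if layer_version_arn in current:
        return current  # nothing to do, the layer is already attached

    updated = current + [layer_version_arn]
    # Layers is a full replacement list, so the existing ARNs are sent again.
    lambda_client.update_function_configuration(
        FunctionName=function_name,
        Layers=updated,
    )
    return updated


# Usage (placeholder names, requires boto3 and AWS credentials):
#   import boto3
#   attach_layer(boto3.client("lambda"), "my-resize-function",
#                "<ARN of the deployed image-magick layer version>")
```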

Once the ImageMagick layer has been added to the Lambda function, it’s displayed in the function’s Layers section.

Figure 4 – Successfully added ImageMagick layer.
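With the layer in place, the function code can call the ImageMagick command-line tools. The following is a minimal sketch of building a convert command for a bounded resize and for a centered square crop. The binary path /opt/bin/convert is an assumption about where the layer places the executable, and the -auto-orient and -strip options are my additions rather than something the LEVELS setup is known to use.

```python
import subprocess

# Assumption: the layer unpacks its binaries under /opt/bin.
CONVERT_BIN = "/opt/bin/convert"


def resize_command(src_path, dst_path, width, height, square_crop=False):
    """Build the ImageMagick argument list for one image variation."""
    cmd = [CONVERT_BIN, src_path, "-auto-orient", "-strip"]
    size = f"{width}x{height}"
    if square_crop:
        # "^" fills the box completely; -extent then trims the overflow
        # around the center, producing an exact width x height crop.
        cmd += ["-resize", size + "^", "-gravity", "center", "-extent", size]
    else:
        # ">" shrinks only images larger than the box, keeping aspect ratio.
        cmd += ["-resize", size + ">"]
    cmd.append(dst_path)
    return cmd


def run_resize(src_path, dst_path, width, height, square_crop=False):
    """Run the conversion; raises CalledProcessError if ImageMagick fails."""
    subprocess.run(
        resize_command(src_path, dst_path, width, height, square_crop),
        check=True,
    )
```

Passing the arguments as a list, without a shell, means the ">" and "^" geometry flags reach ImageMagick as written instead of being interpreted as shell operators.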

Step 2: Storage

For object storage, LEVELS uses Amazon Simple Storage Service (Amazon S3). Each image upload is a triggering event for creating the required variations of the image, as well as a backup copy of the original.

To achieve our intended outcome, we need three Amazon S3 buckets: an original uploads bucket, processed images bucket, and backup bucket. The original uploads bucket would store all image originals uploaded by the users, while the remaining two buckets would store the processed variations and backup copy, respectively.

Step 3: Separate Action Functions

We wanted the image processing logic and backup copy creation logic decoupled so they would become two separate Lambda functions.

With such a configuration, having Amazon S3 trigger a Lambda function directly won’t work, because two different Lambda functions need to be triggered. Amazon Simple Notification Service (Amazon SNS) can trigger multiple Lambda functions from a single event.

We created a designated SNS topic for new uploads, allowing S3 to send a notification to that SNS topic on each image upload. The two Lambda functions are subscribed to that topic, so each is triggered on every upload event and does its own work independently.
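Because the notification travels through SNS, each Lambda function receives the S3 event wrapped inside an SNS message rather than the S3 event itself. Here is a minimal sketch of unwrapping it; the function name is hypothetical, and this is an illustration rather than the code deployed for LEVELS.

```python
import json
from urllib.parse import unquote_plus


def uploaded_objects(sns_event):
    """Return (bucket, key) pairs for the S3 objects in an SNS-delivered event."""
    objects = []
    for sns_record in sns_event.get("Records", []):
        # SNS carries the original S3 notification as a JSON string.
        s3_event = json.loads(sns_record["Sns"]["Message"])
        # Messages without "Records" (such as the S3 test event) are skipped.
        for s3_record in s3_event.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # S3 URL-encodes object keys in notifications (spaces become "+").
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            objects.append((bucket, key))
    return objects
```

Both the processing function and the backup function could start with a helper like this and then diverge in what they do with each bucket and key.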

The result of running two functions was:

Six image variations created and stored in a designated S3 bucket for processed images.
Backup copy created and stored in a backup S3 bucket.
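The backup function is the simpler of the two. As a rough sketch, assuming a boto3 S3 client is passed in and that the backup keeps the same object key (both are my assumptions, and the bucket names in the usage comment are placeholders), it can be reduced to a single server-side copy:

```python
def backup_original(s3_client, source_bucket, key, backup_bucket):
    """Copy one uploaded original into the backup bucket under the same key.

    `s3_client` is expected to behave like boto3.client("s3"). The copy is
    performed by S3 itself, so the image is never downloaded into the function.
    """
    s3_client.copy_object(
        Bucket=backup_bucket,
        Key=key,
        CopySource={"Bucket": source_bucket, "Key": key},
    )
    return f"s3://{backup_bucket}/{key}"


# Usage (placeholder bucket names, requires boto3 and AWS credentials):
#   import boto3
#   backup_original(boto3.client("s3"), "my-uploads-bucket",
#                   "uploads/images/original/photo.jpg", "my-backup-bucket")
```

Keeping this function separate means its execution role only needs read access to the uploads bucket and write access to the backup bucket, which fits the goal of highly restricted permissions on the backups.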

Step 4: Content Delivery Network Configuration

Finally, the Amazon CloudFront content delivery network (CDN) was configured to serve images for geographically optimized content delivery.

The original uploads and processed images S3 buckets were configured as the CloudFront Distribution Origins.

Figure 5 – CloudFront Distribution images serving Amazon S3 Origins configuration.

Fetching the particular images—either an original or one of the variations—is controlled through CloudFront Distribution Behaviors based on the image path prefix: the /uploads/images/* path points to the processed images S3 bucket, while the /uploads/images/original/* path points to the originals bucket, with higher precedence.

Figure 6 – CloudFront Distribution images serving Behaviors configuration.
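The precedence rule matters because the /uploads/images/* pattern also matches every path under /uploads/images/original/. The sketch below only simulates the first-match-wins evaluation in plain Python to illustrate the ordering; it is not CloudFront configuration, the origin labels are hypothetical, and Python's fnmatch is used as an approximation of CloudFront's wildcard matching.

```python
from fnmatch import fnmatchcase

# Ordered by precedence: the more specific pattern must come first.
BEHAVIORS = [
    ("/uploads/images/original/*", "originals-bucket"),
    ("/uploads/images/*", "processed-bucket"),
]


def origin_for(path, behaviors=BEHAVIORS, default=None):
    """Return the origin of the first behavior whose pattern matches the path."""
    for pattern, origin in behaviors:
        if fnmatchcase(path, pattern):
            return origin
    return default
```

If the two entries were swapped, requests for originals would be matched by the broader pattern first and routed to the processed images bucket, where the object does not exist.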

Such a configuration allows originals to be instantly accessed and served over the CDN after being uploaded, while the variations become available after the Lambda processing finishes (it typically takes a few seconds).

This was a requirement from the LEVELS team, as they needed to display the image to the end user immediately after a successful upload, even if that meant serving a non-optimized, full-size image in this particular case. That was a price they were willing to pay, given the additional waiting time before the optimized variations become available.

In Figure 7 below, you see the complete solution processing flow for resizing the uploaded images and serving those images over the CDN.

Figure 7 – LEVELS image processing flow.

Additional Resources

Delving more deeply into the topic opens up alternative approaches to the problem, such as:

On-the-fly resizing trades the storage cost for a small delay in serving the image for the first time. To learn more, read Resize Images on the Fly with Amazon S3, AWS Lambda, and Amazon API Gateway.
Dynamic sizing enables you to produce images of any size at request. To learn more, read Resizing Images with Amazon CloudFront & Lambda@Edge.

Both AWS blog posts provide readily available code samples to use in crafting your own solution.

Summary

By applying the AWS Lambda solution, LEVELS no longer needs to worry about scaling their resources. Nor is there a concern about handling bursts of uploaded image traffic. This solution allowed for cost-effective image processing.

Our team at Toptal added another image size variation after the initial solution was implemented. The easiest way to produce the new variation for the existing images in this setup was to reprocess all the images, recreating all size variations and adding the new one.

This resulted in processing roughly 300,000 images totaling more than 8 GB. The total one-time cost was $27.50, or less than 0.01¢ ($0.0001) per image.
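The per-image figure follows directly from the two numbers above. As a quick check, using the rounded image count of 300,000:

```python
def cost_per_image(total_cost_usd, image_count):
    """Return the average cost of one image in US dollars."""
    return total_cost_usd / image_count


per_image_usd = cost_per_image(27.50, 300_000)
per_image_cents = per_image_usd * 100

# Both comparisons hold: under $0.0001, which is under 0.01 cents, per image.
print(per_image_usd < 0.0001, per_image_cents < 0.01)
```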

A serverless computing model with AWS Lambda is a natural fit for utility classes of tasks, such as preparing and sending transactional emails, push notifications, resizing images, or any kind of background processing triggered by events occurring in the system.

For LEVELS, comparing the approaches discussed in this post, the serverless computing model was the best fit for processing users’ uploads that are stochastic in nature.

The content and opinions in this blog are those of the third party author and AWS is not responsible for the content or accuracy of this post.

Toptal – APN Partner Spotlight

Toptal is an APN Advanced Technology Partner that enables companies to scale their teams, on demand. With Toptal, business leaders are no longer limited by painful internal recruiting processes and geographical constraints. Instead, they can instantly connect and start working with the perfect AWS-certified talent for the job.
