Optimizing the cost of serverless web applications
Web application backends are one of the most frequent types of serverless use-case for customers. The pay-for-value model can make it cost-efficient to build web applications using serverless tools.
While serverless cost is generally correlated with level of usage, there are architectural decisions that impact cost efficiency. The impact of these choices is more significant as your traffic grows, so it’s important to consider the cost-effectiveness of different designs and patterns.
This blog post reviews some common areas in web applications where you may be able to optimize cost. It uses the Happy Path web application as a reference example, which you can read about in the introductory blog post.
Serverless web applications generally use a combination of the services in the following diagram. I cover each of these areas to highlight common areas for cost optimization.
The API management layer: Selecting the right API type
Most serverless web applications use an API between the frontend client and the backend architecture. Amazon API Gateway is a common choice since it is a fully managed service that scales automatically. There are three types of API offered by the service – REST APIs, WebSocket APIs, and the more recent HTTP APIs.
HTTP APIs offer many of the features in the REST APIs service, but the cost is often around 70% less. It supports Lambda service integration, JWT authorization, CORS, and custom domain names. It also has a simpler deployment model than REST APIs. This feature set tends to work well for web applications, many of which mainly use these capabilities. Additionally, HTTP APIs will gain feature parity with REST APIs over time.
The Happy Path application is designed for 100,000 monthly active users. It uses HTTP APIs, and you can inspect the backend/template.yaml to see how to define these in the AWS Serverless Application Model (AWS SAM). If you have existing AWS SAM templates that are using REST APIs, in many cases you can change these easily:
Content distribution layer: Optimizing assets
Amazon CloudFront is a content delivery network (CDN). It enables you to distribute content globally across 216 Points of Presence without deploying or managing any infrastructure. It reduces latency for users who are geographically dispersed and can also reduce load on other parts of your service.
A typical web application uses CDNs in a couple of different ways. First, there is the distribution of the application itself. For single-page application frameworks like React or Vue.js, the build processes create static assets that are ideal for serving over a CDN.
The second common CDN use-case for web apps is for user-generated content (UGC). Many apps allow users to upload images, which are then shared with other users. A typical photo from a 12-megapixel smartphone is 3–9 MB in size. This high resolution is not necessary when photos are rendered within web apps. Displaying the high-resolution asset results in slower download performance and higher data transfer costs.
The Happy Path application uses a Resizer Lambda function to optimize these uploaded assets. This process creates two different optimized images depending upon which component loads the asset.
The upload S3 bucket shows the original size of the upload from the smartphone:
The distribution S3 bucket contains the two optimized images at different sizes:
The distribution file sizes are 98–99% smaller. For a busy web application, using optimized image assets can make a significant difference to data transfer and CloudFront costs.
Additionally, you can convert to highly optimized file formats such as WebP to reduce file size even further. Not all browsers support this format, but you can use CSS on the frontend to fall back to other types if needed:
<img src="myImage.webp" onerror="this.onerror=null; this.src='myImage.jpg'">
The data layer
AWS offers many different database and storage options that can be useful for web applications. Billing models vary by service and Region. By understanding the data access and storage requirements of your app, you can make informed decisions about the right service to use.
Generally, it’s more cost-effective to store binary data in S3 than a database. First, when the data is uploaded, you can upload directly to S3 with presigned URLs instead of proxying data via API Gateway or another service.
If you are using Amazon DynamoDB, it’s best practice to store larger items in S3 and include a reference token in a table item. Part of DynamoDB pricing is based on read capacity units (RCUs). For binary items such as images, it is usually more cost-efficient to use S3 for storage.
Many web developers who are new to serverless are familiar with using a relational database, so choose Amazon RDS for their database needs. Depending upon your use-case and data access patterns, it may be more cost effective to use DynamoDB instead. RDS is not a serverless service so there are monthly charges for the underlying compute instance. DynamoDB pricing is based upon usage and storage, so for many web apps may be a lower-cost choice.
This layer includes services like Amazon SQS, Amazon SNS, and Amazon EventBridge, which are essential for decoupling serverless applications. Each of these have a request-based pricing component, where 64 KB of a payload is billed as one request. For example, a single SQS message with a 256 KB payload is billed as four requests. There are two optimization methods common for web applications.
1. Combine messages
Many messages sent to these services are much smaller than 64 KB. In some applications, the publishing service can combine multiple messages to reduce the total number of publish actions to SNS. Additionally, by either eliminating unused attributes in the message or compressing the message, you can store more data in a single request.
For example, a publishing service may be able to combine multiple messages together in a single publish action to an SNS topic:
- Before optimization, a publishing service sends 100,000,000 1KB-messages to an SNS topic. This is charged as 100 million messages for a total cost of $50.00.
- After optimization, the publishing service combines messages to send 1,562,500 64KB-messages to an SNS topic. This is charged as 1,562,500 messages for a total cost of $0.78.
2. Filter messages
In many applications, not every message is useful for a consuming service. For example, an SNS topic may publish to a Lambda function, which checks the content and discards the message based on some criteria. In this case, it’s more cost effective to use the native filtering capabilities of SNS. The service can filter messages and only invoke the Lambda function if the criteria is met. This lowers the compute cost by only invoking Lambda when necessary.
For example, an SNS topic receives messages about customer orders and forwards these to a Lambda function subscriber. The function is only interested in canceled orders and discards all other messages:
- Before optimization, the SNS topic sends all messages to a Lambda function. It evaluates the message for the presence of an order canceled attribute. On average, only 25% of the messages are processed further. While SNS does not charge for delivery to Lambda functions, you are charged each time the Lambda service is invoked, for 100% of the messages.
- After optimization, using an SNS subscription filter policy, the SNS subscription filters for canceled orders and only forwards matching messages. Since the Lambda function is only invoked for 25% of the messages, this may reduce the total compute cost by up to 75%.
3. Choose a different messaging service
For complex filtering options based upon matching patterns, you can use EventBridge. The service can filter messages based upon prefix matching, numeric matching, and other patterns, combining several rules into a single filter. You can create branching logic within the EventBridge rule to invoke downstream targets.
EventBridge offers a broader range of targets than SNS destinations. In cases where you publish from an SNS topic to a Lambda function to invoke an EventBridge target, you could use EventBridge instead and eliminate the Lambda invocation. For example, instead of routing from SNS to Lambda to AWS Step Functions, instead create an EventBridge rule that routes events directly to a state machine.
Business logic layer
Step Functions allows you to orchestrate complex workflows in serverless applications while eliminating common boilerplate code. The Standard Workflow service charges per state transition. Express Workflows were introduced in December 2019, with pricing based on requests and duration, instead of transitions.
For workloads that are processing large numbers of events in shorter durations, Express Workflows can be more cost-effective. This is designed for high-volume event workloads, such as streaming data processing or IoT data ingestion. For these cases, compare the cost of the two workflow types to see if you can reduce cost by switching across.
Lambda is the on-demand compute layer in serverless applications, which is billed by requests and GB-seconds. GB-seconds is calculated by multiplying duration in seconds by memory allocated to the function. For a function with a 1-second duration, invoked 1 million times, here is how memory allocation affects the total cost in the US East (N. Virginia) Region:
|Memory (MB)||GB/S||Compute cost||Total cost|
|128||125,000||$ 2.08||$ 2.28|
|512||500,000||$ 8.34||$ 8.54|
|1024||1,000,000||$ 16.67||$ 16.87|
|1536||1,500,000||$ 25.01||$ 25.21|
|2048||2,000,000||$ 33.34||$ 33.54|
|3008||2,937,500||$ 48.97||$ 49.17|
There are many ways to optimize Lambda functions, but one of the most important choices is memory allocation. You can choose between 128 MB and 3008 MB, but this also impacts the amount of virtual CPU as memory increases. Since total cost is a combination of memory and duration, choosing more memory can often reduce duration and lower overall cost.
Instead of manually setting the memory for a Lambda function and running executions to compare duration, you can use the AWS Lambda Power Tuning tool. This uses Step Functions to run your function against varying memory configurations. It can produce a visualization to find the optimal memory setting, based upon cost or execution time.
Web application backends are one of the most popular workload types for serverless applications. The pay-per-value model works well for this type of workload. As traffic grows, it’s important to consider the design choices and service configurations used to optimize your cost.
Serverless web applications generally use a common range of services, which you can logically split into different layers. This post examines each layer and suggests common cost optimizations helpful for web app developers.