Performance Efficiency in AWS Multi-Tenant SaaS Environments

By Humberto Somensi, Partner Solutions Architect – AWS SaaS Factory

Maximizing performance is a key area of focus for all architects. Achieving optimal performance, however, can be challenging in software-as-a-service (SaaS) environments where multi-tenant workloads can make it difficult to efficiently profile and scale your environment.

While performance efficiency principles apply to all modern solutions, there are unique elements and considerations when these principles are analyzed in the context of multi-tenant SaaS solutions.

This post dives deep into the challenges, opportunities, and best practices of efficiently managing performance in multi-tenant SaaS environments on Amazon Web Services (AWS). To illustrate these practices, I'll review them through the lens of an example multi-tenant search application.

Performance Efficiency in SaaS Environments

The performance profile of traditional, installed software environments is often simpler to model. These environments are generally focused on profiling the behavior of a single customer. Here, the loads tend to be more predictable.

When you look at a SaaS environment, it’s challenging to create a performance model that accommodates the unpredictable nature of multi-tenant workloads.

With SaaS, performance requirements are continually changing. New tenants can join and leave the system at any time, and tenants’ usage patterns may change on a day-to-day or even an hour-to-hour basis. Moreover, one tenant can adversely impact the experience of other tenants, creating a noisy neighbor scenario.

As a SaaS provider, your goal is to efficiently provide a consistent quality of service to your tenants, spanning your service’s range of tiers and configurations. If you don’t effectively address these dynamics, it can impact customer experience, costs, and agility, ultimately creating barriers for service adoption and growth.

In the next sections, I’ll review the main aspects of these challenges and describe how AWS can assist SaaS providers with overcoming them and building efficient solutions for their customers.

Tenant-Aware Metrics and Insights

Addressing performance-related challenges in SaaS environments starts with producing tenant-aware metrics and insights. These metrics allow you to detect tenant consumption trends, understand how tenants are experiencing the service, and evaluate how the service is responding to tenant workload variations.

The blog post Capturing and Visualizing Multi-Tenant Metrics Inside a SaaS Application on AWS describes in more detail the value of tenant-aware metrics in SaaS solutions. It also deep dives into a solution that provides all of the infrastructure to support the ingestion, aggregation, and analysis of SaaS metric data.

Let’s look at the metrics topic in the context of performance efficiency by looking at an example application:

Figure 1 – Example multi-tenant search cluster deployment.

The example application includes a search service that uses Amazon OpenSearch Service, where tenant users can upload items which will then be searched by tenant customers.

Two models can be considered here: a silo model for premium tenants, and a pooled model for basic tier tenants. This means premium tenants will have dedicated domains (or clusters) and basic tier tenants will use shared domains. Also, each tenant has its own index size and search rate, represented in the figure by the differently sized tenant icons and arrows.
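As a minimal sketch of how the silo and pooled models might surface in application code, the function below resolves the search target for a tenant based on its tier. The endpoint URLs, the index naming scheme, and the choice of one index per tenant inside the shared domain are all assumptions made for illustration; they are not part of the example application described in this post.

```python
from dataclasses import dataclass

# Hypothetical endpoint of the shared (pooled) domain.
POOLED_DOMAIN = "https://search-pooled.example.com"


@dataclass(frozen=True)
class Tenant:
    tenant_id: str
    tier: str  # "premium" or "basic"


def resolve_search_target(tenant: Tenant) -> tuple[str, str]:
    """Return (domain endpoint, index name) for a tenant."""
    if tenant.tier == "premium":
        # Silo model: one dedicated domain per premium tenant (assumed naming).
        return f"https://search-{tenant.tenant_id}.example.com", "items"
    # Pooled model: shared domain, with an assumed index-per-tenant layout.
    return POOLED_DOMAIN, f"items-{tenant.tenant_id}"
```

Keeping this decision in one place means a tenant can move between tiers without changes to the rest of the search code.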

To manage the performance of the search service, the first step is to define metrics that correlate to how tenants are experiencing the service. In this case, Search Latency captures the performance that tenants perceive. Search Rate and Inventory Size represent variable, tenant-specific workload attributes.

Finally, you need metrics that describe the overall health of your infrastructure resources. Amazon CloudWatch metrics for OpenSearch provide infrastructure health and consumption data.

The key takeaway is that SaaS applications must produce metrics describing how each tenant is imposing load on the system, how the system is responding to tenant load, and how each tenant is experiencing the service. This data allows you to detect changes in tenants’ consumption profiles, build strategies to accommodate distinct tenant needs, and optimize the application’s performance efficiency.
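One possible way to publish such metrics is the CloudWatch Embedded Metric Format (EMF), where a structured JSON log line carries both the metric values and the tenant context. The sketch below only builds and prints such a record. The namespace, the dimension names, and the sample values are assumptions for illustration; this post does not prescribe a specific publishing mechanism.

```python
import json
import time


def tenant_metric_record(tenant_id, tier, search_latency_ms, search_rate, inventory_size):
    """Build one CloudWatch Embedded Metric Format (EMF) log record."""
    return {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # epoch milliseconds
            "CloudWatchMetrics": [
                {
                    "Namespace": "SaaS/Search",  # hypothetical namespace
                    "Dimensions": [["TenantId", "Tier"]],
                    "Metrics": [
                        {"Name": "SearchLatency", "Unit": "Milliseconds"},
                        {"Name": "SearchRate", "Unit": "Count/Second"},
                        {"Name": "InventorySize", "Unit": "Count"},
                    ],
                }
            ],
        },
        # Dimension values and metric values live at the top level of the record.
        "TenantId": tenant_id,
        "Tier": tier,
        "SearchLatency": search_latency_ms,
        "SearchRate": search_rate,
        "InventorySize": inventory_size,
    }


# Sample values are made up; the record is written as a single JSON log line.
print(json.dumps(tenant_metric_record("tenant-42", "basic", 87.5, 12.0, 15000)))
```

Using the tenant identifier as a dimension creates a separate metric series per tenant, so with a large tenant base you may prefer to keep it as a plain log field and aggregate with log queries instead.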

Now that we have defined the problem and reviewed the importance of tenant-aware metrics, let’s see how we can use them to tackle the main challenges in efficiently managing performance in SaaS environments.

Aligning Resource Consumption and Tenant Usage

Once you have instrumented your SaaS solution with tenant-aware metrics, you can start to look at how these metrics can be used to create better alignment between your infrastructure costs and tenant activity.

The main goal is to reduce infrastructure waste by eliminating the need to over-provision and instead relying on AWS elasticity to allocate resources as tenants’ activity requires.

Figure 2 shows a hypothetical example of an environment where resource consumption is aligned with tenant activity.

Figure 2 – Aligning tenant activity and consumption.

Correlating tenants’ performance needs with resource consumption is a key milestone for SaaS providers. The scaling policies and overall design strategy must allow the service to adjust resource consumption to meet tenant demands.

In the example above, resource consumption is directly linked to the size and utilization of the cluster, while tenant demand and performance are measured by the tenant metrics: Search Latency, Search Rate, and Inventory Size.

Now, let’s consider a scenario: a spike in tenant consumption is detected by an increase in Search Rate, while Search Latency indicates that performance is degraded. In this case, scaling events can increase the number of nodes in the cluster and also increase the number of replica shards, boosting the system’s ability to handle the load.

Search Latency should reflect these changes and return to expected levels. This is a simplified example to illustrate the value of tenant-aware metrics in creating accurate performance policies. In practice, managing the performance of OpenSearch clusters can involve many metrics and techniques.
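The scenario above can be sketched as a small decision function that turns the tenant metrics into a desired cluster shape. The thresholds, the one-node-at-a-time step, and the node cap are hypothetical values chosen for illustration, and applying the result to a real domain is left out.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    data_nodes: int
    replicas: int


def plan_scaling(state, search_rate, search_latency_ms,
                 rate_threshold=500.0, latency_slo_ms=200.0, max_nodes=10):
    """Return the desired ClusterState for the observed tenant metrics."""
    # Scale only when demand is high AND tenants are actually seeing slow searches.
    overloaded = search_rate > rate_threshold and search_latency_ms > latency_slo_ms
    if not overloaded or state.data_nodes >= max_nodes:
        return state
    # Add one node and one replica so searches can spread across more shard copies.
    return ClusterState(state.data_nodes + 1, state.replicas + 1)
```

Requiring both signals avoids scaling out on a traffic spike that the cluster is already absorbing without degraded latency.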

Finding the right metrics for a given service can require some iteration. By observing tenant consumption and activity patterns, you’ll be able to develop a clearer view of which metrics best represent the performance profile of your multi-tenant service. Longer-term, this could also influence the partitioning and tiering strategy of your solution.

The challenges with finding the right scaling metrics are somewhat different in serverless environments. When using serverless services, resources are consumed only while work is being done, and you pay only for what you use. This simplifies scaling tasks and shortens the path to performance efficiency.

The Serverless SaaS – Reference Solution provides a sample implementation that covers many of the common patterns and strategies that are used when creating multi-tenant solutions in Serverless environments.

Noisy Neighbor Mitigation

In SaaS environments, the ability to understand consumption profiles, detect deviations from expected consumption patterns and safeguard the experience of all tenants is a key requirement.

The goal is to create a multi-tenant experience that limits the opportunities for cross-tenant performance impacts. This means tenants should be able to operate within their expected usage quotas and defined SLA, and that new tenants can be added to the system without affecting the experience of other tenants.

Let’s return to our example application: consider that a single basic tier tenant is uploading thousands of items to their inventory. This may cause performance degradation for other tenants sharing the cluster by increasing the index size disproportionately.

By leveraging the tenant-aware metrics Inventory Size and Search Latency, you can detect both the impact of this activity on other tenants and the cause of the issue. A simple solution is to limit the number of items that can be uploaded to basic tier tenants’ inventories, preventing such conditions from occurring. Basic tier tenants will operate under lower limits than premium tenants, which also creates a tier boundary that aligns infrastructure consumption with business outcomes.
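A tier-based inventory limit of this kind could be enforced with a check before each upload is accepted. The limits and names below are hypothetical, and a real service would read the current inventory size from its own data store.

```python
# Hypothetical per-tier inventory limits (number of items).
INVENTORY_LIMITS = {"basic": 10_000, "premium": 1_000_000}


class InventoryLimitExceeded(Exception):
    """Raised when an upload would exceed the tenant's tier limit."""


def check_upload(tier, current_items, new_items):
    """Return the new inventory size, or raise if it would exceed the tier limit."""
    limit = INVENTORY_LIMITS[tier]
    total = current_items + new_items
    if total > limit:
        raise InventoryLimitExceeded(
            f"{tier} tier limit is {limit} items; upload would reach {total}"
        )
    return total
```

Rejecting the upload up front keeps the shared index from growing past the size the pooled cluster was planned for.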

Another possible scenario is if a tenant’s consumption pattern suddenly increases beyond agreed levels, causing the service to overload. This can happen, for example, if a tenant mistakenly runs performance tests against the live service.

Scaling policies that help the service absorb the load are an option, but they may not react fast enough to prevent the service from degrading. Another option is to impose limits, potentially by tier, and apply throttling to control the load level placed on the system. The Search Rate metric can indicate which tenant is imposing unexpected load on the system and allow you to decide if you want to accept this load or throttle it.
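To make the throttling idea concrete, here is a minimal in-process token bucket keyed by tenant, with limits that depend on the tier. It is an illustrative sketch rather than the mechanism used by any AWS service; the rates, burst sizes, and function names are assumptions, and a multi-instance service would need shared state instead of a local dictionary.

```python
import time


class TokenBucket:
    """Allows `rate` requests per second on average, with bursts up to `burst`."""

    def __init__(self, rate, burst, now=None):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Refill in proportion to elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical (requests per second, burst) limits per tier.
TIER_LIMITS = {"basic": (5, 10), "premium": (50, 100)}
_buckets = {}


def allow_search(tenant_id, tier, now=None):
    """Return True if this tenant's search request should be accepted."""
    bucket = _buckets.get(tenant_id)
    if bucket is None:
        rate, burst = TIER_LIMITS[tier]
        bucket = _buckets[tenant_id] = TokenBucket(rate, burst, now)
    return bucket.allow(now)
```

Because each tenant has its own bucket, a tenant that exhausts its allowance is throttled without reducing the capacity available to its neighbors.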

Amazon API Gateway usage plans can be an effective tool for throttling and quota management, since usage plans can be assigned to tenants’ keys and configured to control the rate of access to a given resource. For services that have critical SLAs, where noisy neighbor mitigation strategies or scaling policies do not ensure the appropriate levels of service, isolating tenant resources in silos is a valid approach.
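As a rough sketch of tier-based usage plans, the helper below builds the arguments for the API Gateway create_usage_plan operation. The tier limits, plan names, API ID, and stage are placeholders, and the boto3 calls are shown only as comments because they require AWS credentials and an existing API and key.

```python
# Hypothetical per-tier limits; in practice, derive these from tenant metrics.
TIER_PLANS = {
    "basic": {"rate": 5.0, "burst": 10, "daily_quota": 10_000},
    "premium": {"rate": 50.0, "burst": 100, "daily_quota": 1_000_000},
}


def usage_plan_params(tier, api_id, stage):
    """Build keyword arguments for API Gateway's create_usage_plan call."""
    plan = TIER_PLANS[tier]
    return {
        "name": f"{tier}-search-plan",
        "apiStages": [{"apiId": api_id, "stage": stage}],
        "throttle": {"rateLimit": plan["rate"], "burstLimit": plan["burst"]},
        "quota": {"limit": plan["daily_quota"], "period": "DAY"},
    }


params = usage_plan_params("basic", "a1b2c3", "prod")  # placeholder API ID and stage

# With boto3 and credentials available, the plan could be created and attached
# to a tenant's API key roughly like this (tenant_key_id is a placeholder):
#   client = boto3.client("apigateway")
#   plan = client.create_usage_plan(**params)
#   client.create_usage_plan_key(
#       usagePlanId=plan["id"], keyId=tenant_key_id, keyType="API_KEY")
```

One plan per tier, with each tenant's key attached to the plan for its tier, keeps the limits aligned with the tiering model rather than with individual tenants.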

Enabling Varying Performance Levels

SaaS providers will often use performance to create a value boundary that requires tenants to move to higher-level tiers to get a higher level of performance. Premium tier tenants can have increased quotas and stricter SLAs, whereas basic tier tenants can be capped at a certain usage level or experience higher latency during peak hours.

It’s important to align the performance expectation with the business outcomes for a given tier. In our example, basic tier tenants don’t warrant dedicated domains, whereas premium tenants do.

In this case, the design of the service reflects the different tiering propositions. These differences will also manifest themselves in the scaling policies: premium tiers can accept a higher level of waste in exchange for ensuring quality of service, whereas basic tier tenants get a leaner approach in which some risk of degradation is acceptable.
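One way to express this difference is a per-tier target utilization that sizes premium clusters with more spare capacity than basic ones. The percentages and the simple capacity model below are assumptions for illustration only.

```python
import math

# Hypothetical target utilization per tier, in percent.
# A lower target leaves more headroom (more waste, less risk of degradation).
TARGET_UTILIZATION_PCT = {"premium": 50, "basic": 80}


def nodes_needed(tier, peak_search_rate, rate_per_node):
    """Nodes required to serve the peak rate without exceeding the tier's target utilization."""
    usable_capacity = rate_per_node * TARGET_UTILIZATION_PCT[tier]
    return max(1, math.ceil(peak_search_rate * 100 / usable_capacity))
```

With these assumed numbers, a peak of 1,000 searches per second on nodes that can each handle 250 yields 8 nodes for a premium tenant and 5 for a basic tier cluster, which makes the cost of the extra headroom explicit.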

The point is to create a value boundary between tiers: design optimizations can be introduced to support a higher level of service to premium tenants, but the effort to introduce these optimizations should be offset with enough value to the customer and the business to justify the investment.

Summary

In this post, I have reviewed challenges, opportunities, and best practices of efficiently managing performance in SaaS environments. I also described how tenant-aware metrics enable you to profile the workload needs of your tenants, understand how changes in workload patterns are reflected in consumption patterns, and ultimately build strategies to ensure your tenants have the appropriate level of service.

If you’d like to know more about the topics above, the AWS Well-Architected SaaS Lens Performance Efficiency pillar dives deep into performance management in SaaS environments. It also provides best practices and resources to help you design and improve performance efficiency in your SaaS application.

Get Started With the Well-Architected SaaS Lens

The AWS Well-Architected SaaS Lens focuses on SaaS workloads and is intended to drive critical thinking for developing and operating SaaS workloads. Each question in the lens has a list of best practices, and each best practice has a list of improvement plans to help guide you in implementing them.

The lens can be applied to existing workloads, or used for new workloads you define in the tool. You can use it to improve the application you’re working on, or to get visibility into multiple workloads used by the department or area you are working with.

The SaaS Lens is available in all regions where the AWS Well-Architected Tool is offered, as described in the AWS Regional Services List. There is no cost for using the Well-Architected Tool.

If you’re an AWS customer, find current AWS Partners that can conduct a review by learning about AWS Well-Architected Partners and AWS SaaS Competency Partners.

About AWS SaaS Factory

AWS SaaS Factory helps organizations at any stage of the SaaS journey. Whether looking to build new products, migrate existing applications, or optimize SaaS solutions on AWS, we can help. Visit the AWS SaaS Factory Insights Hub to discover more technical and business content and best practices.

SaaS builders are encouraged to reach out to their account representative to inquire about engagement models and to work with the AWS SaaS Factory team.