Security practices in AWS multi-tenant SaaS environments
Securing software-as-a-service (SaaS) applications is a top priority for all application architects and developers. Doing so in an environment shared by multiple tenants can be even more challenging. Identity frameworks and concepts can take time to understand, and forming tenant isolation in these environments requires deep understanding of different tools and services.
While security is a foundational element of any software application, specific considerations apply to SaaS applications. This post dives into the challenges, opportunities and best practices for securing multi-tenant SaaS environments on Amazon Web Services (AWS).
SaaS application security considerations
Single tenant applications are often deployed for a specific customer, and typically only deal with this single entity. While security is important in these environments, the threat profile does not include potential access by other customers. Multi-tenant SaaS applications have unique security considerations when compared to single tenant applications.
In particular, multi-tenant SaaS applications must pay special attention to identity and tenant isolation. These considerations are in addition to the security measures all applications must take. This blog post reviews concepts related to identity and tenant isolation, and how AWS can help SaaS providers build secure applications.
SaaS applications are accessed by individual principals (often referred to as users). These principals may be interactive (for example, through a web application) or machine-based (for example, through an API). Each principal is uniquely identified, and is usually associated with information about the principal, including email address, name, role and other metadata.
In addition to the unique identification of each individual principal, a SaaS application has another construct: a tenant. A paper on multi-tenancy defines a tenant as a group of one or more users sharing the same view on an application they use. This view may differ for different tenants. Each individual principal is associated with a tenant, even if it is only a 1:1 mapping. A tenant is uniquely identified, and contains information about the tenant administrator, billing information and other metadata.
When a principal makes a request to a SaaS application, the principal provides their tenant and user identifier along with the request. The SaaS application validates this information and makes an authorization decision. In well-designed SaaS applications, this authorization step should not rely on a centralized authorization service. A centralized authorization service is a single point of failure in an application. If it fails, or is overwhelmed with requests, the application will no longer be able to process requests.
There are two key techniques to providing this type of experience in a SaaS application: using an identity provider (IdP) and representing identity or authorization in a token.
Using an Identity Provider (IdP)
In the past, some web applications often stored user information in a relational database table. When a principal authenticated successfully, the application issued a session ID. For subsequent requests, the principal passed the session ID to the application. The application made authorization decisions based on this session ID. Figure 1 provides an example of how this setup worked.
In applications larger than a simple web application, this pattern is suboptimal. Each request usually results in at least one database query or cache look up, creating a bottleneck on the data store holding the user or session information. Further, because of the tight coupling between the application and its user management, federation with external identity providers becomes difficult.
When designing your SaaS application, you should consider the use of an identity provider like Amazon Cognito, Auth0, or Okta. Using an identity provider offloads the heavy lifting required for managing identity by having user authentication, including federation, handled by external identity providers. Figure 2 provides an example of how a SaaS provider can use an identity provider in place of the self-managed solution shown in Figure 1.
Once a user authenticates with an identity provider, the identity provider issues a standardized token. This token is the same regardless of how a user authenticates, which means your application does not need to build in support for multiple different authentication methods tenants might use.
Identity providers also commonly support federated access. Federated access means that a third party maintains the identities, but the identity provider has a trust relationship with this third party. When a customer tries to log in with an identity managed by the third party, the SaaS application’s identity provider handles the authentication transaction with the third-party identity provider.
This authentication transaction commonly uses a protocol like Security Assertion Markup Language (SAML) 2.0. The SaaS application’s identity provider manages the interaction with the tenant’s identity provider. The SaaS application’s identity provider issues a token in a format understood by the SaaS application. Figure 3 provides an example of how a SaaS application can provide support for federation using an identity provider.
For an example, see How to set up Amazon Cognito for federated authentication using Azure AD.
Representing identity with tokens
Identity is usually represented by signed tokens. JSON Web Signatures (JWS), often referred to as JSON Web Tokens (JWT), are signed JSON objects used in web applications to demonstrate that the bearer is authorized to access a particular resource. These JSON objects are signed by the identity provider, and can be validated without querying a centralized database or service.
The token contains several key-value pairs, called claims, which are issued by the identity provider. Besides several claims relating to the issuance and expiration of the token, the token can also contain information about the individual principal and tenant.
Sample access token claims
The example below shows the claims section of a typical access token issued by Amazon Cognito in JWT format.
The principal, and the tenant the principal is associated with, are represented in this token by the combination of the user identifier (the sub claim) and the tenant ID in the cognito:groups claim. In this example, the SaaS application represents a tenant by creating a Cognito group per tenant. Other identity providers may allow you to add a custom attribute to a user that is reflected in the access token.
When a SaaS application receives a JWT as part of a request, the application validates the token and unpacks its contents to make authorization decisions. The claims within the token set what is known as the tenant context. Much like the way environment variables can influence a command line application, the tenant context influences how the SaaS application processes the request.
By using a JWT, the SaaS application can process a request without frequent reference to an external identity provider or other centralized service.
Tenant isolation is foundational to every SaaS application. Each SaaS application must ensure that one tenant cannot access another tenant’s resources. The SaaS application must create boundaries that adequately isolate one tenant from another.
Determining what constitutes sufficient isolation depends on your domain, deployment model and any applicable compliance frameworks. The techniques for isolating tenants from each other depend on the isolation model and the applications you use. This section provides an overview of tenant isolation strategies.
Your deployment model influences isolation
How an application is deployed influences how tenants are isolated. SaaS applications can use three types of isolation: silo, pool, and bridge.
Silo deployment model
The silo deployment model involves customers deploying one set of infrastructure per tenant. Depending on the application, this may mean a VPC-per-tenant, a set of containers per tenant, or some other resource that is deployed for each tenant. In this model, there is one deployment per tenant, though there may be some shared infrastructure for cross-tenant administration. Figure 4 shows an example of a siloed deployment that uses a VPC-per-tenant model.
Pool deployment model
The pool deployment model involves a shared set of infrastructure for all tenants. Tenant isolation is implemented logically in the application through application-level constructs. Rather than having separate resources per tenant, isolation enforcement occurs within the application. Figure 5 shows an example of a pooled deployment model that uses serverless technologies.
In Figure 5, an AWS Lambda function that retrieves an item from an Amazon DynamoDB table shared by all tenants needs temporary credentials issued by the AWS Security Token Service. These credentials only allow the requester to access items in the table that belong to the tenant making the request. A requester gets these credentials by assuming an AWS Identity and Access Management (IAM) role. This allows a SaaS application to share the underlying infrastructure, while still isolating tenants from one another. See Isolation enforcement depends on service below for more details on this pattern.
Bridge deployment model
The bridge model combines elements of both the silo and pool models. Some resources may be separate, others may be shared. For example, suppose your application has a shared application layer and an Amazon Relational Database Service (RDS) instance per tenant. The application layer evaluates each request and connects to the database for the tenant that made the request.
This model is useful in a situation where each tenant may require a certain response time and one set of resources acts as a bottleneck. In the RDS example, the application layer could handle the requests imposed by the tenants, but a single RDS instance could not.
The decision on which isolation model to implement depends on your customer’s requirements, compliance needs or industry needs. You may find that some customers can be deployed onto a pool model, while larger customers may require their own silo deployment.
Your tiering strategy may also influence the type of isolation model you use. For example, a basic tier customer might be deployed onto pooled infrastructure, while an enterprise tier customer is deployed onto siloed infrastructure.
For more information about different tenant isolation models, read the tenant isolation strategies whitepaper.
Isolation enforcement depends on service
Most SaaS applications will need somewhere to store state information. This could be a relational database, a NoSQL database, or some other storage medium which persists state. SaaS applications built on AWS use various mechanisms to enforce tenant isolation when accessing a persistent storage medium.
IAM provides fine grain access controls access for the AWS API. Some services, like Amazon Simple Storage Service (Amazon S3) and DynamoDB, provide the ability to control access to individual objects or items with IAM policies. When possible, your application should use IAM’s built-in functionality to limit access to tenant resources. See Isolating SaaS Tenants with Dynamically Generated IAM Policies for more information about using IAM to implement tenant isolation.
AWS IAM also offers the ability to restrict access to resources based on tags. This is known as attribute-based access control (ABAC). This technique allows you to apply tags to supported resources, and make access control decisions based on which tags are applied. This is a more scalable access control mechanism than role-based access control (RBAC), because you do not need to modify an IAM policy each time a resource is added or removed. See How to implement SaaS tenant isolation with ABAC and AWS IAM for more information about how this can be applied to a SaaS application.
Some relational databases offer features that can enforce tenant isolation. For example, PostgreSQL offers a feature called row level security (RLS). Depending on the context in which the query is sent to the database, only tenant-specific items are returned in the results. See Multi-tenant data isolation with PostgreSQL Row Level Security for more information about row level security in PostgreSQL.
Other persistent storage mediums do not have fine grain permission models. They may, however, offer some kind of state container per tenant. For example, when using MongoDB, each tenant is assigned a MongoDB user and a MongoDB database. The secret associated with the user can be stored in AWS Secrets Manager. When retrieving a tenant’s data, the SaaS application first retrieves the secret, then authenticates with MongoDB. This creates tenant isolation because the associated credentials only have permission to access collections in a tenant-specific database.
Generally, if the persistent storage medium you’re using offers its own permission model that can enforce tenant isolation, you should use it, since this keeps you from having to implement isolation in your application. However, there may be cases where your data store does not offer this level of isolation. In this situation, you would need to write application-level tenant isolation enforcement. Application-level tenant isolation means that the SaaS application, rather than the persistent storage medium, makes sure that one tenant cannot access another tenant’s data.
This post reviews the challenges, opportunities and best practices for the unique security considerations associated with a multi-tenant SaaS application, and describes specific identity considerations, as well as tenant isolation methods.
If you’d like to know more about the topics above, the AWS Well-Architected SaaS Lens Security pillar dives deep on performance management in SaaS environments. It also provides best practices and resources to help you design and improve performance efficiency in your SaaS application.
Get Started with the AWS Well-Architected SaaS Lens
The AWS Well-Architected SaaS Lens focuses on SaaS workloads, and is intended to drive critical thinking for developing and operating SaaS workloads. Each question in the lens has a list of best practices, and each best practice has a list of improvement plans to help guide you in implementing them.
The lens can be applied to existing workloads, or used for new workloads you define in the tool. You can use it to improve the application you’re working on, or to get visibility into multiple workloads used by the department or area you’re working with.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Security Hub forum. To start your 30-day free trial of Security Hub, visit AWS Security Hub.
Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.