AWS Partner Network (APN) Blog

Tenant Switching and Custom Permissions in a Multi-Tenant Serverless Application

By John Ingram, Sr. Software Developer – 56K.Cloud
By Michael Schmid, Sr. Partner Solutions Architect – AWS


Many providers of software-as-a-service (SaaS) applications want to reach as many different customers as possible to scale their offering while optimizing cost and operational efficiency. Achieving this while meeting security requirements and customer feature demands can be challenging.

A serverless model, where you rely on managed services and resources scale precisely with demand without upfront investment, is a compelling fit for a SaaS application.

This post builds upon the core concepts presented in Building a Multi-Tenant SaaS Solution Using AWS Serverless Services to propose a cost-effective pooled multi-tenant serverless architecture that provides:

  • Isolation of tenant data.
  • Ability for users to belong to multiple tenants.
  • Seamless switching between tenants for users.
  • Custom user roles with custom permissions by tenant admins.

56K.Cloud is an AWS Partner and leading consulting and engineering firm providing turn-key solutions. It specializes in Amazon Web Services (AWS) across the whole process, from cloud advisory and migration to prototyping and managed services.

With a strong background in application development, 56K.Cloud has deep expertise in serverless architectures and guides customers through building reliable, secure, and cost-effective serverless applications similar to the one presented in this post.

Solution Overview

The proposed multi-tenant architecture uses the managed services shown in Figure 1.

Figure 1 – Multi-tenant serverless architecture overview.

Protecting Data with Amazon DynamoDB

Amazon DynamoDB is a fully managed NoSQL database, known for cost-effectiveness, scalability, and consistent performance. Proper data isolation between tenants is crucial, and a single-table design where all application data is stored in one table offers several advantages:

  • Reduced operational overhead; a single table to provision, and no separate resources for new tenants.
  • Streamlined feature deployments and database updates with a single database.
  • Consistent query performance through effective data modeling for required access patterns.

Relying only on application logic for tenant isolation carries risks. Trusting business logic and database queries alone does not guarantee tenant data isolation; an unintentional application misconfiguration with undesirable side-effects could always slip through the cracks.

Setting AWS Identity and Access Management (IAM) policies on the DynamoDB table to enforce access controls provides an extra layer of security and acts as a best-practice guardrail that upholds the principle of least privilege.

DynamoDB provides a dynamodb:LeadingKeys condition which can be used to implement row-based access control by granting access to only certain partition keys.

However, placing all tenant data within a single partition would mean a high volume of read and write requests to a single partition, which could cause unwanted throttling (hot partitions). IAM supports wildcards for the dynamodb:LeadingKeys condition, meaning one could implement partition key sharding where the data written to the table are spread evenly across multiple partitions. This results in better parallelism and higher overall throughput while still being able to secure partitions.
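Partition key sharding can be pictured as appending a deterministic shard suffix to the tenant's partition key. The following is a minimal sketch rather than the authors' implementation: the `TENANT#{tenant_id}#{shard}` key layout, the shard count of 8, and both helper names are assumptions made for illustration.

```python
import hashlib

SHARD_COUNT = 8  # assumed number of shards per tenant


def sharded_partition_key(tenant_id: str, item_id: str,
                          shard_count: int = SHARD_COUNT) -> str:
    """Return a partition key such as 'TENANT#acme#3'.

    The shard is derived from a stable hash of the item ID, so the same
    item always maps to the same shard.
    """
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shard_count
    return f"TENANT#{tenant_id}#{shard}"


def all_shard_keys(tenant_id: str, shard_count: int = SHARD_COUNT) -> list:
    """List every partition key of a tenant, e.g. to fan out a query."""
    return [f"TENANT#{tenant_id}#{n}" for n in range(shard_count)]
```

With this layout, a wildcard condition value such as `TENANT#${aws:PrincipalTag/tenant_id}#*` would cover all shards of one tenant. Placing the `#` delimiter before the wildcard keeps a tenant ID that is a prefix of another tenant ID from matching the wrong partitions.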

A table’s primary index is not the sole access method. Secondary indexes, both local and global, cater to additional access patterns. For instance, to allow users to belong to multiple tenants, the primary index can have a one-to-many tenant-to-user relationship (partition key TENANT#{tenant_id} and sort key USER#{user_id}).

A global secondary index can flip this to a one-to-many user-to-tenant relationship (partition key USER#{user_id} and sort key TENANT#{tenant_id}). Secondary indexes can also be secured by adding the relevant index resources to the IAM policy.
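The two relationships can be illustrated with a single membership item that carries both key pairs. This is a sketch under assumptions: the attribute names `PK`, `SK`, `GSI1PK`, and `GSI1SK` are hypothetical, and the two lookup helpers imitate index queries over an in-memory list instead of calling DynamoDB.

```python
def membership_item(tenant_id: str, user_id: str) -> dict:
    """One item per (tenant, user) pair, addressable from both sides."""
    return {
        "PK": f"TENANT#{tenant_id}",      # primary index partition key
        "SK": f"USER#{user_id}",          # primary index sort key
        "GSI1PK": f"USER#{user_id}",      # global secondary index partition key
        "GSI1SK": f"TENANT#{tenant_id}",  # global secondary index sort key
    }


def users_of_tenant(items: list, tenant_id: str) -> list:
    """Imitates a query on the primary index (tenant-to-user)."""
    pk = f"TENANT#{tenant_id}"
    return [i["SK"].split("#", 1)[1] for i in items
            if i["PK"] == pk and i["SK"].startswith("USER#")]


def tenants_of_user(items: list, user_id: str) -> list:
    """Imitates a query on the global secondary index (user-to-tenant)."""
    pk = f"USER#{user_id}"
    return [i["GSI1SK"].split("#", 1)[1] for i in items
            if i.get("GSI1PK") == pk]
```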

In summary, the IAM policy uses tenant_id and user_id principal tag variables to restrict access to the matching partitions.


Figure 2 – Sample IAM policy restricting access to partitions of a specific tenant or user in a DynamoDB table.
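Since the figure itself is an image, the following sketch shows what such a policy might look like when built as a Python dictionary. It is not a reproduction of Figure 2: the list of allowed actions, the `#*` shard suffix, and the function name are assumptions, while `dynamodb:LeadingKeys` and the `${aws:PrincipalTag/...}` policy variables are standard IAM features.

```python
import json


def tenant_scoped_policy(table_arn: str) -> dict:
    """Build an IAM policy limiting access to one tenant's or user's partitions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [  # assumed set of actions
                    "dynamodb:GetItem",
                    "dynamodb:Query",
                    "dynamodb:PutItem",
                    "dynamodb:UpdateItem",
                    "dynamodb:DeleteItem",
                ],
                # The table itself plus its secondary indexes.
                "Resource": [table_arn, f"{table_arn}/index/*"],
                "Condition": {
                    "ForAllValues:StringLike": {
                        "dynamodb:LeadingKeys": [
                            "TENANT#${aws:PrincipalTag/tenant_id}",
                            # Assumed sharded layout, one key per shard.
                            "TENANT#${aws:PrincipalTag/tenant_id}#*",
                            "USER#${aws:PrincipalTag/user_id}",
                        ]
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    arn = "arn:aws:dynamodb:eu-central-1:123456789012:table/app"  # placeholder
    print(json.dumps(tenant_scoped_policy(arn), indent=2))
```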

Identity Management

Amazon Cognito is an identity provider built around two types of pools: user pools, which authenticate users and store their properties, and identity pools, which provide temporary AWS credentials to access resources and services.

The main goals are:

  • Use Cognito as the sole identity provider.
  • Provide tenant-scoped access to the DynamoDB table.
  • Allow users to belong to multiple tenants.

The IAM policy contains a PrincipalTag variable. When exchanging for credentials, this variable needs to be replaced by the user’s tenant ID. The user pool is configured with a custom:tenant_id user attribute. To map this attribute to the IAM PrincipalTag condition and serve temporary tenant-scoped AWS credentials, an identity pool is configured with:

  • User pool as the identity provider.
  • Custom claim mapping between the custom:tenant_id attribute and PrincipalTag variable.
  • IAM role as the authenticated role.
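The claim mapping in the list above could be configured with the Cognito identity `SetPrincipalTagAttributeMap` operation. The sketch below is an assumption about how one might do this with boto3, not the authors' setup: the function name is hypothetical, the client is passed in (for example `boto3.client("cognito-identity")`), and mapping `user_id` to the `sub` claim is a guess, since the post does not say which claim feeds that tag.

```python
def map_principal_tags(identity_client, identity_pool_id: str,
                       region: str, user_pool_id: str) -> dict:
    """Map ID token claims to principal tags for a user pool provider.

    identity_client is expected to behave like
    boto3.client("cognito-identity").
    """
    provider_name = f"cognito-idp.{region}.amazonaws.com/{user_pool_id}"
    return identity_client.set_principal_tag_attribute_map(
        IdentityPoolId=identity_pool_id,
        IdentityProviderName=provider_name,
        UseDefaults=False,
        PrincipalTags={
            "tenant_id": "custom:tenant_id",  # tag key -> token claim
            "user_id": "sub",                 # assumed claim for the user ID
        },
    )
```

For the tags to be applied, the trust policy of the authenticated role also needs to allow `sts:TagSession` in addition to `sts:AssumeRoleWithWebIdentity`.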

At this stage, we have a means of mapping users to a tenant and serving scoped credentials for them via the identity pool. However, we need to meet the requirement of a many-to-many relationship between users and tenants. We propose leveraging Cognito user groups, where each tenant has a corresponding group and Cognito allows users to belong to multiple groups. Furthermore, this approach allows Cognito user groups to appear in the ID token.

Since a user can belong to multiple tenants, the custom:tenant_id attribute needs to be mutable to request scoped credentials for the relevant tenant. However, the Cognito App Client used by the frontend should lack write privileges on this sensitive attribute; only the control plane Lambda should have write access.

Authentication Flows

To provide more context on how the above concepts work together, let’s go over key user flows of a multi-tenant application.

Signing Up and Creating a Tenant


Figure 3 – Step by step flow of user registration and tenant creation.

  1. New user sign-up.
  2. Account confirmation via SMS or email code.
  3. Upon confirmation, a Lambda trigger saves the user’s profile information to DynamoDB, as Cognito should not be used as a primary datastore.
  4. Amazon Cognito returns an ID token for the frontend application to authenticate API requests.
  5. Frontend application presents a form to create a new tenant, calling the create tenant API with the ID token and tenant details.
  6. Amazon API Gateway’s Lambda authorizer validates the user’s identity using the ID token and, upon success, performs the following tasks:
    1. Create random Tenant ID.
    2. Create new tenant partition in DynamoDB.
    3. Create a new group in the Cognito user pool with the tenant ID as name.
    4. Add the user to the group.
    5. Update the user’s Tenant ID attribute in the user pool.
    6. Call the InitiateAuth API with REFRESH_TOKEN_AUTH to get new tokens for the user’s updated status.
  7. New tokens are returned for subsequent API calls.
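The tenant creation tasks in step 6 can be sketched as one function. This is an illustration under assumptions, not the authors' code: the function name, the `METADATA` sort key, and the item attributes are hypothetical, and the two clients are passed in so they can be `boto3.client("cognito-idp")` and `boto3.client("dynamodb")` or test doubles. The sketch also assumes an app client without a client secret; otherwise a `SECRET_HASH` auth parameter would be required.

```python
import uuid


def create_tenant(idp, dynamodb, *, user_pool_id: str, client_id: str,
                  table_name: str, username: str, tenant_name: str,
                  refresh_token: str):
    """Create a tenant, attach the user to it, and return fresh tokens."""
    tenant_id = uuid.uuid4().hex  # 6.1 random tenant ID

    # 6.2 First item in the new tenant partition (assumed item shape).
    dynamodb.put_item(
        TableName=table_name,
        Item={
            "PK": {"S": f"TENANT#{tenant_id}"},
            "SK": {"S": "METADATA"},
            "tenant_name": {"S": tenant_name},
        },
    )
    # 6.3 Group named after the tenant ID, 6.4 membership.
    idp.create_group(UserPoolId=user_pool_id, GroupName=tenant_id)
    idp.admin_add_user_to_group(
        UserPoolId=user_pool_id, Username=username, GroupName=tenant_id)
    # 6.5 Set the user's active tenant.
    idp.admin_update_user_attributes(
        UserPoolId=user_pool_id,
        Username=username,
        UserAttributes=[{"Name": "custom:tenant_id", "Value": tenant_id}],
    )
    # 6.6 Refresh so the new ID token reflects the group and attribute.
    result = idp.initiate_auth(
        AuthFlow="REFRESH_TOKEN_AUTH",
        AuthParameters={"REFRESH_TOKEN": refresh_token},
        ClientId=client_id,
    )
    return tenant_id, result["AuthenticationResult"]
```

Because these calls are not atomic, a production version would likely need compensation or retry logic when a later step fails.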

Scoped Database Access

Decoding the ID token (Figure 4) reveals attributes that reflect the Cognito configuration:

  • cognito:groups – list of user’s affiliated groups.
  • custom:tenant_id – ID of user’s active tenant.


Figure 4 – Example payload of an ID Token.
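To inspect these claims, the payload segment of the token can be decoded with the standard library. The sketch below only decodes and does not verify the signature, so it is suitable for inspection but not for making trust decisions; the helper names are hypothetical.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the middle (payload) segment of a JWT without verifying it."""
    payload_b64 = token.split(".")[1]
    # JWTs strip base64 padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def active_tenant(token: str) -> str:
    """Return the user's active tenant from the custom:tenant_id claim."""
    return decode_jwt_payload(token)["custom:tenant_id"]


def tenant_groups(token: str) -> list:
    """Return the groups (one per tenant) from the cognito:groups claim."""
    return decode_jwt_payload(token).get("cognito:groups", [])
```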

The API endpoints handling CRUD database operations rely on a Lambda authorizer that does the following:

  • Validates ID token’s signature.
  • Leverages Cognito’s GetCredentialsForIdentity to acquire short-lived access credentials based on the ID token and identity pool’s authenticated role.
  • Returns credentials, tenant, and user IDs in the authorizer response context object as key-value pairs.
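The credential exchange and authorizer response described in the list above might look like the following sketch. Signature validation is left out, the function names and context key names are assumptions, and `identity_client` is expected to behave like `boto3.client("cognito-identity")`. The `GetId` call is included because `GetCredentialsForIdentity` takes an identity ID rather than an identity pool ID.

```python
def tenant_scoped_credentials(identity_client, *, identity_pool_id: str,
                              provider_name: str, id_token: str) -> dict:
    """Exchange a validated ID token for short-lived AWS credentials."""
    logins = {provider_name: id_token}
    identity_id = identity_client.get_id(
        IdentityPoolId=identity_pool_id, Logins=logins)["IdentityId"]
    creds = identity_client.get_credentials_for_identity(
        IdentityId=identity_id, Logins=logins)["Credentials"]
    return {
        "accessKeyId": creds["AccessKeyId"],
        "secretAccessKey": creds["SecretKey"],
        "sessionToken": creds["SessionToken"],
    }


def authorizer_response(principal_id: str, method_arn: str,
                        context: dict) -> dict:
    """Build an Allow response; context values must be strings, numbers,
    or booleans, so lists such as groups have to be serialized first."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": "Allow",
                "Resource": method_arn,
            }],
        },
        "context": context,
    }
```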

After successful authentication, Amazon API Gateway sends the authorization context with AWS credentials to your Lambda function, allowing it to initialize the DynamoDB SDK and make database calls restricted to the user’s permissions.


Figure 5 – Initializing the DynamoDB SDK using credentials obtained from the caller’s role.
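As a stand-in for the figure, the following Python sketch shows one way a Lambda function could build a DynamoDB client from the authorizer context. It assumes an API Gateway REST API event, where the context appears under `requestContext.authorizer`, and the context key names `accessKeyId`, `secretAccessKey`, and `sessionToken` are hypothetical.

```python
def client_kwargs_from_event(event: dict) -> dict:
    """Translate the authorizer context into boto3 credential arguments."""
    ctx = event["requestContext"]["authorizer"]
    return {
        "aws_access_key_id": ctx["accessKeyId"],
        "aws_secret_access_key": ctx["secretAccessKey"],
        "aws_session_token": ctx["sessionToken"],
    }


def dynamodb_client_for_caller(event: dict):
    """Create a DynamoDB client that acts with the caller's scoped role."""
    import boto3  # imported lazily; only needed when a client is created
    return boto3.client("dynamodb", **client_kwargs_from_event(event))
```

Every request made through this client is evaluated against the tenant-scoped IAM policy, so a query for another tenant's partition key is denied by IAM regardless of the application logic.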

Switching Between Tenant Environments

SaaS apps often enable users to switch between multiple organizations, and the proposed architecture facilitates this by allowing users to change their active tenant.

The tenant_id Cognito user attribute was defined as mutable, so the control plane Lambda can update it through the UpdateUserAttributes API. This is a sensitive operation and should succeed only if the user belongs to the requested tenant, which is verified by cross-referencing the user’s Cognito user group membership.

On success, Cognito issues a new ID token for API calls, and database operations are scoped to the new tenant ID.


Figure 6 – Step-by-step flow of a user switching tenants with a single click.

  1. User wants to change active tenants. The app retrieves the list of tenants the user is a member of from DynamoDB using a global secondary index with the one-to-many user-to-tenant relationship.
  2. App displays the list of tenants in a menu.
  3. User selects Tenant B, triggering an API call to change the user’s active tenant, passing Tenant B’s ID in the payload.
  4. The authorizer validates the JWT ID token, extracts the cognito:groups claim containing the user’s groups, and passes the list back in the authorizer’s response.
  5. Lambda verifies that Tenant B’s ID is included in the group list, uses its admin access to Cognito to update the user’s tenant ID attribute in the user pool, and then refreshes the user’s tokens.
  6. Refreshed tokens are returned in the API response.
  7. Subsequent API requests by the app will be scoped to Tenant B.
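Steps 4 to 6 can be sketched as follows. The sketch makes several assumptions: the authorizer passes the groups as a comma-separated string (authorizer context values cannot be lists), the admin variant `AdminUpdateUserAttributes` is used because the Lambda acts with admin access, the app client has no client secret, and all function and exception names are hypothetical.

```python
class NotATenantMember(Exception):
    """Raised when the user does not belong to the requested tenant."""


def parse_groups(raw: str) -> list:
    """Turn the comma-separated groups string from the context into a list."""
    return [g for g in raw.split(",") if g]


def switch_tenant(idp, *, user_pool_id: str, client_id: str, username: str,
                  groups: list, requested_tenant_id: str,
                  refresh_token: str) -> dict:
    """Change the active tenant after checking group membership."""
    # Reject before touching Cognito if the user is not in the tenant group.
    if requested_tenant_id not in groups:
        raise NotATenantMember(requested_tenant_id)

    idp.admin_update_user_attributes(
        UserPoolId=user_pool_id,
        Username=username,
        UserAttributes=[
            {"Name": "custom:tenant_id", "Value": requested_tenant_id},
        ],
    )
    # New tokens carry the updated custom:tenant_id claim.
    result = idp.initiate_auth(
        AuthFlow="REFRESH_TOKEN_AUTH",
        AuthParameters={"REFRESH_TOKEN": refresh_token},
        ClientId=client_id,
    )
    return result["AuthenticationResult"]
```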

Custom Roles for Specific Tenant Environments

Until now, we’ve focused on limiting access to tenant data. But what if we want to give customers freedom to create their own roles and permissions within their organization?

A proliferation of Cognito user groups, IAM roles, and policies is not practical, and you risk reaching IAM and Cognito resource quotas. Moreover, IAM policies cannot control access to DynamoDB sort keys, a crucial DynamoDB feature.

To set hard boundaries between tenants, IAM roles can be used, but for internal customer roles an alternative solution could be explored. This introduces custom business logic, which can be handled in various ways, both custom and fully service-based, including using Amazon Verified Permissions. We’ll examine using DynamoDB features and user attributes to support custom user roles in a simple and cost-effective manner without introducing new services, suitable for applications with relatively simple permission logic.

The tenant admin creates roles, which are simply DynamoDB items with an ID, a name, and a list of actions. A role ID can be assigned to each Cognito user as a custom:role attribute. This attribute plays no part in the authorization and credential exchange discussed previously; having it in the ID token simply reduces database queries by making the role ID readily available to the application Lambda. When the request reaches the backend, the role ID is used to query the role item in the database and validate the requested action’s permissions.

To illustrate, consider an aircraft reservation system with user roles like pilots, instructors, and maintenance chiefs. Only the maintenance chief can create or update aircraft data. Figure 7 shows the flow of a CRUD request against the database, which has tenant roles with custom actions attached.


Figure 7 – Fine-grained tenant roles administered by tenants that are free to implement their own permissions.
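A role item and the corresponding permission check could be modeled as in this sketch. The `ROLE#{role_id}` sort key, the attribute names, and action names such as `aircraft:update` are assumptions chosen to match the aircraft reservation example.

```python
def role_item(tenant_id: str, role_id: str, name: str, actions: list) -> dict:
    """A tenant-defined role stored in the tenant's own partition."""
    return {
        "PK": f"TENANT#{tenant_id}",
        "SK": f"ROLE#{role_id}",
        "name": name,
        "actions": list(actions),
    }


def is_allowed(role: dict, action: str) -> bool:
    """Check whether the role's action list contains the requested action."""
    return action in role["actions"]


# Hypothetical roles for the aircraft reservation example.
maintenance_chief = role_item(
    "tenant-a", "r1", "Maintenance chief",
    ["aircraft:create", "aircraft:update", "reservation:read"])
pilot = role_item(
    "tenant-a", "r2", "Pilot",
    ["reservation:create", "reservation:read"])
```

Because the role item lives under the tenant's partition key, the tenant-scoped IAM policy also prevents one tenant from reading or altering another tenant's roles.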

Next, Figure 8 demonstrates a DynamoDB write transaction with a condition check, allowing the write request to succeed only for the maintenance chief role.


Figure 8 – Using the DynamoDB SDK to perform an all-or-nothing transaction write request.
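In place of the figure, the sketch below builds the request for such a transaction in the low-level DynamoDB format. It is not the authors' code: the key layout, the `actions` attribute (assumed to be stored as a list or string set), and the function name are assumptions. The request combines a `ConditionCheck` on the role item with a `Put` of the new item, so the write only happens if the role contains the requested action.

```python
def guarded_put_request(table_name: str, tenant_id: str, role_id: str,
                        action: str, item: dict) -> dict:
    """Build TransactWriteItems arguments: write `item` only if the role
    identified by role_id includes `action`."""
    return {
        "TransactItems": [
            {
                "ConditionCheck": {
                    "TableName": table_name,
                    "Key": {
                        "PK": {"S": f"TENANT#{tenant_id}"},
                        "SK": {"S": f"ROLE#{role_id}"},
                    },
                    # A name placeholder avoids clashes with reserved words.
                    "ConditionExpression": "contains(#actions, :action)",
                    "ExpressionAttributeNames": {"#actions": "actions"},
                    "ExpressionAttributeValues": {":action": {"S": action}},
                }
            },
            {"Put": {"TableName": table_name, "Item": item}},
        ]
    }
```

A caller would pass the result to the scoped client, for example `client.transact_write_items(**request)`. If the condition check fails, DynamoDB cancels the whole transaction and the item is not written.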

Conclusion

To ensure data isolation in a multi-tenant SaaS application, use IAM roles as the foundational layer, leveraging Amazon Cognito and Amazon API Gateway to securely assume these tenant-aware roles on behalf of users, rather than relying solely on DynamoDB partitioning strategies.

Once this foundational layer is in place, you can build upon it to offer commonly seen SaaS application features such as tenant switching and giving tenants freedom to create their own roles and permissions within their organization.



56K.Cloud – AWS Partner Spotlight

56K.Cloud is an AWS Partner and leading consulting and engineering firm providing turn-key solutions. It specializes in AWS across the whole process, from cloud advisory and migration to prototyping and managed services.
