AWS Database Blog
Choose the right PostgreSQL data access pattern for your SaaS application
In a multi-tenant environment, the process of identifying the database isolation strategy most suitable for your workload is essential. Your isolation strategy ensures that one tenant cannot access another tenant’s data, and defines how the actions of tenants affect one another. The isolation strategy is driven by a combination of business, domain, security, cost, resiliency, compliance and technology requirements.
AWS provides guidance in the SaaS Tenant Isolation Strategies whitepaper and the decision matrix for SaaS applications using PostgreSQL to help guide towards the optimal isolation strategy for your workload. Once you have chosen a partitioning model (how the data will be organized and stored), you need a secure way for tenants to access their data with isolation from other tenants.
In this post, we show you what the different data access patterns look like once you’ve selected a data isolation strategy. It focuses on the considerations for each pattern, and provides code-samples for multi-tenant database access with Amazon Relational Database Service (Amazon RDS) for PostgreSQL and Amazon Aurora PostgreSQL-Compatible Edition.
You can find the code samples associated with this post in this Github repository. For deployment instructions, follow the instructions in the repository.
Overview of multi-tenant data access
There are three basic SaaS partitioning models: silo, bridge, and pool. In the silo model, each tenant has their own unique physical resource. The bridge model logically separates tenants on a physical resource, for example, a database-per-tenant on a shared physical database. In the pool model, all tenants share a single logical construct on a physical resource, for example, a single schema shared by all tenants.
The data access pattern varies depending on the partitioning model of the database, the available AWS Identify and Access Management (AWS IAM) granularity, and the compute resource accessing it. The following diagram provides an overview of the factors that will be discussed throughout this blog post:
Authentication options for Amazon RDS for PostgreSQL & Aurora PostgreSQL-Compatible Edition
Access to a specific Amazon RDS for PostgreSQL or Amazon Aurora database can be restricted via AWS IAM roles with IAM authentication or using AWS Secrets Manager for native username/password authentication.
AWS IAM authentication generates short-lived credentials. This improves the application security posture because there is no need to store and update long-lived credentials. This reduces the risk of a valid credential being compromised. AWS IAM authentication is particularly well-suited to a siloed or bridged architecture as typically these have a lower number of concurrent users than with a pooled approach. This is important as there are defined upper limits for the number of AWS IAM authentication requests per second that can be made on a single AWS RDS instance.
The ability to obtain these permissions varies based on the isolation strategy used. For example, if an Amazon Elastic Compute Cloud (Amazon EC2) instance is used for each tenant, an IAM instance profile can be attached that only allows this instance to access the tenants database. The same is true for AWS Lambda Execution Roles or Amazon ECS Task roles.
When a single Amazon EC2 instance or AWS Lambda function is shared between multiple tenants, then additional mechanisms are required. You can implement Dynamic IAM session policies via a Token Vending Machine service. Alternatively, you can use attribute-based access control (ABAC) to provide fine-grained tenant access at runtime.
An alternative to AWS IAM authentication is AWS Secrets Manager. AWS Secrets Manager lets you use password authentication included in your chosen database engine and has several other benefits, including:
- Automatic credential management
- Secrets are stored encrypted at rest
- Protect access to secrets using AWS IAM policies
- Update secret values without redeploying application
- Automatically rotate secrets on a schedule
There are several multi-tenant deployment models for databases that can benefit from using AWS Secrets Manager. This includes environments with many concurrent connections, such as bridge or pooled mode, or if you cannot change your application to support temporary access credentials.
The isolation patterns and trade-offs between each approach are explored in the next section.
Silo database isolation with siloed compute
This strategy provides the strongest isolation with a separate database instance and separate compute instance per tenant. By using compute and database instances per tenant, additional isolation controls are available. To set up network isolation, you can create database and compute resources in separate network segments, such as using different subnets for virtual private clouds (VPCs). Access to the resource in these network segments can then be controlled via route tables, security groups and network access control lists. At the most extreme, resources can be isolated by AWS account. This level of isolation is useful if there are concerns around noisy neighbours at an AWS application programming interface (API) or service quota level. In addition, dedicated compliance or encryption requirements can be driving factors for this level of isolation.
In this model, an IAM role is associated with a specific tenant’s compute resource. This could be an AWS IAM instance profile associated with an Amazon EC2 instance or an execution role associated with an AWS Lambda function. This AWS IAM policy is then scoped to only grant permissions to the database that contains that tenant’s data.
If using an IAM role per tenant, the maximum number of tenants you can scale to is dependent on the IAM service quotas. It is recommended to instead implement scope down session policies or attribute-based-access-control (ABAC) as discussed later in this post. If you need to use a separate IAM role per tenant for compliance or operational reasons, then you may need to split your tenants across AWS accounts to scale beyond the service quotas of a single account.
The trade-offs for this increased isolation are a higher cost and increased operational complexity. Each tenant requires their own infrastructure, which for production should be run in a highly-available configuration. This leads to a higher cost per tenant and means that tooling must be introduced to manage backups, maintenance and upgrades across these separate tenant stacks. There are also service quotas which limit the number of resources that can be created per AWS account & Region.
The following diagram visualizes the outlined approach:
As the silo example is similar to the following bridge isolation patterns, refer to the implementation code in the next section which uses a database per tenant on the same physical RDS instance.
Bridge database isolation with siloed compute
The bridge isolation model acknowledges that there is a sliding scale between the silo and pool models. The main motivation for this pattern is cost reduction by sharing the database instance between multiple tenants.
There are two ways to implement the bridge isolation model on a PostgreSQL instance: databases and schemas. Its recommended to use one database per tenant instead of one schema per tenant. Compared with databases, schemas are logical constructs in PostgreSQL and do not separate out data on disk. Schemas also store files on a disk within the same directory in a database — in extreme cases, a system with many tenants can hit certain system limits, such as inode exhaustion.
This approach has the same isolation characteristics at the instance level, where you can use native access controls to restrict which users have access to which database objects, but you lose the network isolation controls.
You can find an example implementation of this pattern in this GitHub repository.
The second example uses AWS Secrets Manager. This represents the same level of isolation as the previous example for the compute and database. This means using one secret per tenant. Each tenant compute resource has a role associated with it that only provides access to that specific secret.
The following diagram visualizes the outlined approach. You can find an example implementation of this pattern in our repo.
Bridge database Isolation with pooled compute
In this pattern, multiple tenants share a single compute resource but continue to have separate databases. The motivation for using a shared compute resource can either be cost (especially for Amazon EC2 or containers) or less operational overhead, because only one deployment resource is maintained. However, the complexity of enforcing security guardrails for every tenant increases.
In this model, a tenant must be identified at runtime, not based on the current compute instance. A common pattern in SaaS architectures is to use a tenant identifier on each request. One method is to use a JSON Web Token (JWT) that provides identifying information that is signed cryptographically. Using a signed JWT lets you trust that the information is not modified and correctly identifies the tenant.
To maintain tenant isolation when using AWS Secrets Manager, you must ensure that tenants only have access to their own secret. With the AWS Security Token Service (STS), a session policy can be passed in when the AWS IAM Role is assumed that scopes down the permissions to the current tenant’s secret. In addition, it makes use of Attribute-Based-Access-Control (ABAC) and dynamically sets the tenant via a session tag. This means only a single AWS IAM role needs to be created and the tenant context can be provided dynamically. The application assumes the tenant aware AWS IAM Role and passes in the tenant claim from the token to scope down the permissions for the session. This restricts the application to only access the tenant specific secret and database.
The following diagram visualizes the outlined approach. You can find an example implementation of this pattern in our repo.
This access pattern can also be implemented with AWS IAM Auth instead of AWS Secrets Manager. An important consideration with IAM Auth in this model is that the IAM policy can only restrict access to the database instance, not to the database or any database objects within it. As such, create separate database users for each of your tenants and dynamically interpolate the resource ARN as shown in this sample.
Pool database Isolation with pooled compute
Instead of a database per tenant, a single database or schema can be shared between tenants. This reduces operational complexity even further since only a single schema and secret needs to be managed for all tenants. It also allows connections to the database to be pooled more effectively and avoids the need to dynamically retrieve the permissions for the secrets. However, noisy neighbours might have a larger impact on performance and errors related to a single schema affect all tenants simultaneously.
The pool model can introduce cost efficiencies over other isolation and partitioning strategies due to economies of scale from sharing resources between tenants. This model reduces service quota impacts and simplifies maintenance and backup operations. For considerations on multi-tenant backup patterns, see managed database backup and recovery in a multi-tenant SaaS application.
From a security perspective, cross-tenant data access can be prevented via PostgreSQLs built-in row level security (RLS) mechanism. Row level security works by applying a policy that limits which rows can be accessed, which is based on a defined condition. In a multi-tenant SaaS application, this condition could be where the tenant_id property on a table is equal to a variable indicating the current tenant.
The current tenant’s information can be retrieved from the authentication context (for example from a JWT token) and set as a session variable with the following SQL statement:
This RLS policy statement ensures that only records for the current tenants are retrieved.
The isolation here is weak as anyone with access to query the table can modify the session variable. To ensure sufficient tenant isolation is enforced, combine RLS policies with secure coding standards. These coding standards should ensure that all SQL statements have additional predicates (WHERE
clauses) that explicitly filter queries to return only a given tenants data. All SQL code changes should go through a peer review process to ensure that the session variable is not being manipulated and that these coding standards are met. For more information on RLS, read Multi-tenant data isolation with PostgreSQL Row Level Security.
The following diagram visualizes the outlined approach. You can find an example implementation of this pattern in our repo.
This access pattern can also be implemented with AWS IAM Auth instead of AWS Secrets Manager. As in previous examples, this only grants authorization to the database instance, so you still need to rely on the row-level security policy and application logic to enforce your tenant isolation. See the GitHub repo for an example implementation here.
Combining multiple isolation strategies
Often, a SaaS application will be comprised of multiple partitioning and isolation models. This could be for tiering, where “premium” tier tenants have their own dedicated resources. It could also be at a service level, where some services may require a separate approach to tenant isolation. Consider that any combination of the above approaches can be used across the application. An Amazon RDS instance can simultaneously use both IAM and password-based authentication. If your SaaS application uses a microservice architecture, using one database for multiple services is an anti-pattern. If you are using multiple isolation strategies and authentication modes on the same database instance, you may need to decompose your services into more dedicated data stores.
The table that follows illustrates the considerations when implementing one of the isolation strategies outlined in this post:
Isolation Strategy | Partitioning Model | Silo Database & Silo Compute |
Bridge Database & Silo Compute | Bridge Database & Pool Compute | Pool Database & Pool Compute |
Authentication Method | IAM / Secrets Manager | IAM / Secrets Manager | IAM / Secrets Manager | IAM/ Secrets Manager | |
Security Layers | Network, Compute IAM role, Session Policies, IAM ABAC, Password auth | Compute IAM role, Session Policies, IAM ABAC, Password auth | Application, IAM Session Policies, IAM ABAC, Password auth | Application, Row-Level-Security (RLS), Password auth | |
Noisy Neighbor Impact | None | Moderate | Moderate | High | |
Operational Impact | Minimal blast radius | Increased blast radius | Increased blast radius | Large blast radius | |
Cost Efficiency | Least efficient | Moderately efficient | Moderately efficient | Most efficient | |
Tenant Onboarding | Complex & time consuming | Complex | Less complex | Simple & fast | |
Tenant Metering & Billing | Simple | Moderate effort | Moderate effort | Complex | |
Monitoring | Least effort | Moderate effort | Moderate effort | Significant effort | |
Implementation Options | Account, VPC, Subnet or DB Instance per tenant | Database or table per tenant | Database or table per tenant | Single shared table for all tenants |
Conclusion
In this post, we explored the different access patterns for Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL-Compatible Edition in a multi-tenant environment. It explores how factors such as AWS IAM Auth, compute isolation model and the database isolation model influence your implementation strategy. The patterns in this post help you make an informed decision about your SaaS workload. As with every collection of patterns, it doesn’t claim to be complete and there are new access patterns evolving from future developments. To view the full code implementation examples for all the patterns in this post, please visit the GitHub repository.
About the Authors
Josh Hart is a Senior Solutions Architect at Amazon Web Services (AWS). He works with ISV customers in the UK to help them build and modernize their SaaS applications on AWS.
Maximilian Schellhorn works as a Solutions Architect at Amazon Web Services (AWS). Previously, he worked for over 10 years as a Software Engineer and Architect on distributed system design and monolith-to-microservice transformations.