Multi-tenancy implementation with Postgres: It's simpler than you imagine

Explore the simplicity of implementing multi-tenancy for your applications using Postgres.

YijunDeveloper

February 29, 202412 min read

Multi-tenancy implementation with Postgres: It's simpler than you imagine

In some of our previous articles, we delved into the concept of multi-tenancy and its applications in products and real-world business scenarios. In this article, we'll explore how to implement a multi-tenant architecture for your application using Postgres from a technical standpoint.

If you're unfamiliar with multi-tenancy, you can refer to our previously published articles:

A brief overview of multi-tenant architecture

Single-tenant architecture vs multi-tenant architecture

Single-tenant architecture and multi-tenant architecture differ in aspects such as data isolation, resource utilization, scalability, management and maintenance, and security.

In single-tenant architecture, each customer has an independent data space, leading to lower resource utilization but relatively simpler for customization. Typically, single-tenant software tailored to specific customer needs, such as inventory systems for a particular fabric supplier or a personal blog web app. The commonality among them is that each customer occupies a separate instance of the application service, facilitating customization to meet specific requirements.

In a multi-tenant architecture, multiple tenants share the same underlying resources, resulting in higher resource utilization. However, it's crucial to ensure data isolation and security.

Multi-tenant architecture is often the preferred software architecture when service providers offer standardized services to different customers. These services typically have low levels of customization, and all customers share the same application instance. When an application requires an update, updating one application instance is equivalent to updating the application for all customers.

For instance, CRM (Customer Relationship Management) is a standardized requirement. These systems typically use a multi-tenant architecture to provide the same service to all tenants:

Tenant Data Isolation in multi-tenancy

In a multi-tenant architecture, all tenants share the same underlying resources, making the isolation of resources between tenants crucial. This isolation doesn't necessarily need to be physical; it simply requires ensuring that resources between tenants are not visible to each other.

In the design of the architecture, various degrees of resource isolation between tenants can be achieved:

In general, the more resources shared among tenants, the lower the cost of system iteration and maintenance. Conversely, the fewer shared resources, the higher the cost.

A simple but practical multi-tenant architecture

In this article, we will use a CRM system as an example to introduce a simple yet practical multi-tenant architecture.

We recognize that all tenants use the same standard services, so we decided to have all tenants share the same basic resources, and we will implemented data isolation between different tenants at the database level using Postgres's Row-Level Security.

Additionally, we will create a separate data connection for each tenant to facilitate better management of tenant permissions.

Next, we will introduce how to implement this multi-tenant architecture.

Implement multi-tenant architecture

Add tenant identifier for all resources

In a CRM system, we will have a lot of resources and they're stored in different tables. For instance, customer information is stored in the customers table.

Before implementing multi-tenancy, these resources are not associated with any tenant:

To differentiate the tenants owning different resources, we introduce a tenants table to store tenant information (where db_user and db_user_password are used to store the database connection information for each tenant, will be detailed below). Additionally, we add a tenant_id field to each resource to identify which tenant it belongs to:

Now, each resource is associated with a tenant_id, theoretically enabling us to add a where clause to all queries to restrict access to resources for each tenant:

select * from customers where tenant_id = "specific_tenant_id"

At first glance, this seems simple and feasible. However, it will have the following issues:

Almost every query will include this where clause, cluttering the code and making it harder to maintain, especially when writing complex join statements.
Newcomers to the codebase may easily forget to add this where clause.
Data between different tenants isn't truly isolated, as each tenant still has permissions to access data belonging to other tenants.

Therefore, we will not adopt this approach. Instead, we will use Postgres' Row Level Security to address these concerns. However, before proceeding, we will create a dedicated database account for each tenant to access this shared database.

Setup DB roles for tenants

It’s a good practice to assign a database role to each user who can connect to the database. This allows for better control over each user's access to the database, facilitating isolation of operations between different users and improving system stability and security.

Since all tenants have the same database operation permissions, we can create a base role to manage these permissions:

-- Create base role for all tenants
create role crm_tenant noinherit;

Then, to differentiate each tenant role, a role inherited from the base role is assigned to each tenant upon creation:

-- Create role for tenant with id 'x2euic'
create role crm_tenant_x2euic with inherit login password 'pa55w0rd' in role crm_tenant;

Next, the database connection information for each tenant will be stored in the tenants table:

id	db_user	db_user_password
x2euic	crm_tenant_x2euic	pa55w0rd

This mechanism provides each tenant with its own database role, and these roles share the permissions granted to the crm_tenant role.

We can then define the permission scope for tenants using the crm_tenant role:

Tenants should have CRUD access to all CRM system resource tables.
Tables not related to CRM system resources should be invisible to tenants (assuming only systems table).
Tenants should not be able to modify the tenants table, and only the id and db_user fields should be visible to them for querying their own tenant id when performing database operations.

-- Grant CRUD access to tenants for all tables
grant select, insert, update, delete
	on all tables in schema public
  to crm_tenant;

-- Exclude tables which are not CRM system resources
revoke all privileges on table systems from crm_tenant;

-- Add limitations for tenants table
revoke all privileges on table tenants from crm_tenant;
-- Allow limited select
grant select (id, db_user) on table tenants to crm_tenant;

Once the roles for tenants are set up, when a tenant requests access to the service, we can interact with the database using the database role representing that tenant:

Secure tenant data using Postgres Row-Level Security

So far, we've established corresponding database roles for tenants, but this doesn't restrict data access between tenants. Next, we'll leverage Postgres' Row-Level Security feature to limit each tenant's access to their own data.

In Postgres, tables can have row security policies that control which rows can be accessed by queries or modified by data manipulation commands. This feature is also known as RLS (Row-Level Security).

By default, tables have no row security policies. To utilize RLS, you need to enable it for the table and create security policies that execute every time the table is accessed.

Taking the customers table in the CRM system as an example, we'll enable RLS and create a security policy to restrict each tenant only have access to their own customers' data:

-- Enable Row-Level security
alter table customers enable row level security;

-- Create security policy
create policy crm_tenant_access_policy
on customers
for all
to crm_tenant
as restrictive
using (tenant_id = (select id from tenants where db_user = current_user))
with check (tenant_id = (select id from tenants where db_user = current_user))

In the statement creating the security policy:

for all (optional) indicates that this access policy will be used for select, insert, update, and delete operations on the table. You can specify an access policy for specific operations using for followed by the command keyword.
to crm_tenant indicates that this policy applies to users with the database role crm_tenant, meaning all tenants.
as restrictive specifies the enforcement mode of the policy, indicating that access should be strictly limited. By default, a table can have multiple policies, multiple permissive policies will be combined with an OR relationship. In this scenario, we declare this policy as restrictive because we want this policy check to be mandatory for users belonging to CRM system tenants.
using expression defines the conditions for actual access, restricting the current querying database user to only view data belonging to their respective tenant. This constraint applies to rows selected by a command (select, update, or delete).
with check expression defines the constraint necessary when modifying data rows (insert or update), guaranteeing that tenants can only add or update records for themselves.

Using RLS to constrain tenant access to our resource tables offers several benefits:

This policy effectively adds where tenant_id = (select id from tenants where db_user = current_user) to all query operations (select, update, or delete). For instance, when you execute select * from customers, it's equivalent to executing select * from customers where tenant_id = (select id from tenants where db_user = current_user). This eliminates the need to explicitly add where conditions in the application code, simplifying it and reducing the likelihood of errors.
It centrally controls data access permissions between different tenants at the database level, mitigating the risk of vulnerabilities or inconsistencies in the application, thus enhancing system security.

However, there are some points to note:

RLS policies are executed for every row of data. If the query conditions within the RLS policy are too complex, it could significantly impact system performance. Fortunately, our tenant data check query is simple enough and won't affect performance. If you plan to implement other functionalities using RLS later, you can follow the Row-Level Security performance recommendations from Supabase to optimize RLS performance.
RLS policies don't automatically populate tenant_id during insert operations. They only restrict tenants to inserting their own data. This means that when inserting data, we still need to provide the tenant ID, which is inconsistent with the querying process and can lead to confusion during development, increasing the likelihood of errors (this will be addressed in subsequent steps).

In addition to the customers table, we need to apply the same operations to all CRM system resource tables (this process may be a bit tedious, but we can write a program to configure it during table initialization), thus isolating data from different tenants.

Create trigger function for data inserting

As mentioned earlier, RLS (Row-Level Security) allows us to execute queries without worrying about the existence of tenant_id, as the database handles it automatically. However, for insert operations, we still need to manually specify the corresponding tenant_id.

To achieve a similar convenience as RLS for data insertion, we need the database be able to handle tenant_id automatically during data insertion.

This has a clear benefit: at the application development level, we no longer need to consider which tenant the data belongs to, reducing the likelihood of errors and easing our mental burden when developing multi-tenant applications.

Fortunately, Postgres provides powerful trigger functionality.

Triggers are special functions associated with tables that automatically execute specific actions (such as insert, update, or delete) when performed on the table. These actions can be triggered at the row level (for each row) or statement level (for the entire statement). With triggers, we can execute custom logic before or after specific database operations, allowing us to easily achieve our goal.

First, let's create a trigger function set_tenant_id to be executed before each data insertion:

create function set_tenant_id() returns trigger as
$$ begin
  if new.tenant_id is not null then
    return new;
  end if;

  select tenants.id into new.tenant_id
    from tenants
    where tenants.db_user = current_user;

  return new;
end; $$ language plpgsql;

Next, associate this trigger function with the customers table for insert operations (similar to enabling RLS for a table, this trigger function needs to be associated with all relevant tables):

create trigger set_tenant_id before insert on customers
  for each row execute procedure set_tenant_id();

This trigger ensures that inserted data contains the correct tenant_id. If the new data already includes a tenant_id, the trigger function does nothing. Otherwise, it automatically populates the tenant_id field based on the information of the current user in the tenants table.

This way, we achieve automatic handling of tenant_id at the database level during data insertion by tenants.

Summary

In this article, we delve into the practical application of multi-tenant architecture, using a CRM system as an example to demonstrate a practical solution utilizing PostgreSQL database.

We discuss database role management, access control, and PostgreSQL's Row-Level Security feature to ensure data isolation between tenants. Additionally, we utilize trigger functions to reduce the cognitive burden of developers in managing different tenants.

That's all for this article. If you want to further enhance your multi-tenant application with user access management, you can refer to An easy guide to begin with Logto organizations - for building a multi-tenant app for more insights.

Try Logto