AWS Database Blog

Performance impact of idle PostgreSQL connections

The first post of this series, Resources consumed by idle PostgreSQL connections, talked about how PostgreSQL manages connections and how even idle connections consume memory and CPU. In this post, I discuss how idle connections impact PostgreSQL performance.

Transaction rate impact

When PostgreSQL needs data, it first looks for the required page in its own shared buffers. If it can’t find the page in shared buffers, it fetches the page from the operating system (OS) cache, if available. If the page isn’t available in the OS cache, it’s read from the storage volume. The page lookup from share buffer is the fastest, followed by the OS cache lookups. The page fetch from the storage volume is slowest.

As the count of the PostgreSQL connections increases, the free memory available for OS cache goes down. This causes the OS to remove pages from the cache. The next lookup of these pages results in a fetch from the storage volume and is therefore slower.

If the instance is low on free memory, it starts using the swap space, which is again on the storage volume and therefore slow. Using swap space helps free up some memory, but if the swapped pages are again needed by the OS, they have to be read back, which results in increased I/O utilization. For more information, see Swap Management.

The impact of free memory on performance depends on the workload, working dataset size, and total available memory. If the working dataset size is smaller than the total memory available, the decrease in free memory doesn’t have any noticeable impact. However, if the working dataset size is greater than the available memory, the impact is noticeable.

Performance tests

The following sections show the performance test results generated using the PostgreSQL simple benchmarking utility pgbench. The Amazon RDS for PostgreSQL instance used in the tests was db.m5.large instance type with 2 vCPUs and 8GB memory. The IO1 EBS volume was used with 3000 IOPS.

Each test has two phases. In the first phase, pgbench was run for 600 seconds against Amazon RDS for PostgreSQL server with no other traffic on the database. This provided a baseline transaction rate.

In the second phase, 1,000 connections were opened before running pgbench again. Each of these 1,000 connections fetched one row from the tables in the PostgreSQL internal schema named information_schema. The following are the steps performed for each of these 1,000 connections:

  1. Open a connection.
  2. Fetch the names of all the tables and views in information_schema:
    SELECT table_schema||'.'||table_name as relname from information_schema.tables WHERE table_schema='information_schema';
  1. In a loop, run select on each of these tables with LIMIT 1:
    SELECT * FROM information_schema.columns LIMIT 1;
  1. Repeat the steps for all 1,000 connections.
  2. Leave the connections in idle state.

When you restart the instance, it doesn’t have any pages cached in the memory. The first time you run pgbench, it loads the required pages from the storage. The subsequent pgbench runs have some pages already available in cache so they can use them instead of loading from storage.

To minimize the impact of page caching, an initialization step was performed, in which pgbench ran before starting the actual test. The results of this initial run weren’t used in any analysis. This made sure that any common pages that can be cached were already available for the two test runs.

The following chart shows how the instance memory drops from around 4.88 GB to 90 MB as these 1,000 connections are opened.

The following chart shows how the instance memory drops from around 4.88 GB to 90 MB as these 1,000 connections are opened.

As explained in the previous post in this series, although these connections were idle, they consumed memory and CPU resources. The results show that there is a performance impact due to these idle connections.

Transaction rate test #1: Standard pgbench

In the first test, pgbench ran in a standard configuration with 100 client connections. The following code is a result for this pgbench run:

transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1000
query mode: simple
number of clients: 100
number of threads: 2
duration: 600 s
number of transactions actually processed: 749572
latency average = 80.058 ms
tps = 1249.096708 (including connections establishing)
tps = 1249.116996 (excluding connections establishing)

With 1,000 idle connections, the result changed to the following:

transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1000
query mode: simple
number of clients: 100
number of threads: 2
duration: 600 s
number of transactions actually processed: 684434
latency average = 87.686 ms
tps = 1140.430155 (including connections establishing)
tps = 1140.449899 (excluding connections establishing)

The results show that the transaction rate dropped from 1,249 to 1,140 transactions per second. This is an 8.7% drop in the transaction rate due to the idle connections.

Transaction rate test #2: Select-only pgbench

Because the memory consumed by the idle connections reduces the memory available for page cache, the impact of these idle connections was expected to be more evident on the read queries. To test this, I ran pgbench using the -S configuration, which runs a built-in select-only script for the benchmarking. The following output is a result for this test run:

transaction type: <builtin: select only>
scaling factor: 1000
query mode: simple
number of clients: 100
number of threads: 2
duration: 600 s
number of transactions actually processed: 1181937
latency average = 50.778 ms
tps = 1969.344251 (including connections establishing)
tps = 1969.377751 (excluding connections establishing)

With 1,000 idle connections, the result changed to the following:

transaction type: <builtin: select only>
scaling factor: 1000
query mode: simple
number of clients: 100
number of threads: 2
duration: 600 s
number of transactions actually processed: 966656
latency average = 62.095 ms
tps = 1610.440842 (including connections establishing)
tps = 1610.470585 (excluding connections establishing)

The results show that the transaction rate dropped from 1,969 to 1,610 transactions per second (an 18.2% drop).

Transaction rate test #3: Custom pgbench

To see the impact of these idle connections on a read-intensive workload, I performed another test with a custom pgbench query that was reading a large number of rows. The following code is the content of the custom pgbench script that ran using pgbench:

\set nbranches :scale
\set naccounts 100000 * :scale
\set aid random(1, :naccounts)
\set bid random(1, :nbranches)
BEGIN;
SELECT * FROM pgbench_accounts WHERE aid >= :aid AND aid < (:aid + 5000) AND bid=:bid LIMIT 1;
END;

Each transaction of this custom pgbench script reads 5,000 rows from the pgbench_accounts table and returns only one of these rows. The idea it to make each transaction read more pages.

The pgbench run with the custom script and without any idle connections shows the following result:

transaction type: pgbench_script.sql
scaling factor: 5000
query mode: simple
number of clients: 100
number of threads: 2
duration: 600 s
number of transactions actually processed: 227484
latency average = 264.140 ms
tps = 378.586790 (including connections establishing)
tps = 378.592772 (excluding connections establishing)

With 1,000 idle connections that run queries against information schema, the result changed to the following:

transaction type: pgbench_script.sql
scaling factor: 5000
query mode: simple
number of clients: 100
number of threads: 2
duration: 600 s
number of transactions actually processed: 124114
latency average = 484.485 ms
tps = 206.404854 (including connections establishing)
tps = 206.507645 (excluding connections establishing)

The results show that the transaction rate dropped from 378 to 206 transactions per second (a 46% drop).

You can use Amazon RDS Performance Insights to view the engine wait events details during the two pgbench runs. The following figure shows that majority of the time was spent in the DataFileRead wait event in both the runs. In the second run with 1,000 idle connections, almost all the time is consumed by this wait event. The DataFileRead event highlights the waiting for a read from a relation data file.

The DataFileRead event highlights the waiting for a read from a relation data file.

The following chart shows the read throughput Amazon CloudWatch metric for the two pgbench runs.

The following chart shows the read throughput Amazon CloudWatch metric for the two pgbench runs.

During the first run, the read throughput was around 87 MB/s, whereas in the second run with 1,000 open connections, the read throughput increased to around 117 MB/s. This happened because the idle connections consumed the free OS memory that resulted in smaller OS cache. Due to this, more pages needed to be fetched from the disk instead of reading from the OS cache. Because a read from disk is much slower than a read from memory, it results in a degradation in query performance.

Connection poolers

Connection poolers help minimize the impact of database connections by maintaining a pool of connections that can be shared by all the client connections. You can use an application connection pooler that is part of your application code or an independent connection pooler such as pgbouncer or Amazon RDS Proxy. These connection poolers help limit the number of connections you can open and also reuse the connections so that each new client connection doesn’t need to open a new database connection. The following sections provide more details of pgbouncer and RDS Proxy.

pgbouncer

pgbouncer is a lightweight connection pooler that allows the following three modes:

  • Session mode – Each application connection is tied to one database connection. If the connection goes into an idle state, pgbouncer can’t reuse it for other application connections.
  • Transaction mode – pgbouncer can reuse the open connection as soon as the application connection completes the transaction.
  • Statement mode – The connection can be reused for other clients as soon as one SQL statement is completed.

In most applications, using transaction mode pooling gives the best results. To test the benefit of connection poolers, I installed pgbouncer on the server running pgbench. pgbouncer was configured to allow up to 5,000 client connections but only open a maximum of 200 connections against our test RDS PostgreSQL instance. pgbench was then run against this pgbouncer pool.

The following code is the test result of the custom pgbench run with pgbouncer:

transaction type: pgbench_script.sql
scaling factor: 5000
query mode: simple
number of clients: 100
number of threads: 2
duration: 600 s
number of transactions actually processed: 227064
latency average = 264.600 ms
tps = 377.928241 (including connections establishing)
tps = 377.928476 (excluding connections establishing)

While the pgbench run is in progress, you can look at the pgbouncer pool to see the connections status:

pgbouncer=# show pools;
-[ RECORD 1 ]-----------
database   | pgbench
user       | postgres
cl_active  | 100
cl_waiting | 0
sv_active  | 100
sv_idle    | 0
sv_used    | 0
sv_tested  | 0
sv_login   | 0
maxwait    | 0
maxwait_us | 0
pool_mode  | transaction

This pool status output shows that there are 100 client connections (cl_active) that result in 100 active server connections (sv_active).

In the second test run, I opened 1,000 connections and left them in idle state. The pooler didn’t need to maintain any server connections for these idle client connections. The following pool stats show that there are 1,000 client connections but there is only one idle PostgreSQL connection:

pgbouncer=# show pools;
-[ RECORD 1 ]-----------
database   | pgbench
user       | postgres
cl_active  | 1000
cl_waiting | 0
sv_active  | 0
sv_idle    | 1
sv_used    | 0
sv_tested  | 0
sv_login   | 0
maxwait    | 0
maxwait_us | 0
pool_mode  | transaction

Connection poolers have different settings to control how they behave. For more information, see pgbouncer Configuration.

With these 1,000 idle client connections still open, running pgbench using pgbouncer results in the following:

transaction type: pgbench_script.sql
scaling factor: 5000
query mode: simple
number of clients: 100
number of threads: 2
duration: 600 s
number of transactions actually processed: 226827
latency average = 264.935 ms
tps = 377.451418 (including connections establishing)
tps = 377.451655 (excluding connections establishing)

This shows that when connection pooler is used, there is no impact on performance with or without the 1,000 idle connections. Pgbouncer shows the following status while pgbench in is progress:

pgbouncer=# show pools;
-[ RECORD 1 ]-----------
database   | pgbench
user       | postgres
cl_active  | 1100
cl_waiting | 0
sv_active  | 100
sv_idle    | 0
sv_used    | 0
sv_tested  | 0
sv_login   | 0
maxwait    | 0
maxwait_us | 0
pool_mode  | transaction

There are 1,100 total client connections (cl_active) but only 100 server connections (sv_active) are actively being used.

In this test, the RDS instance has only 2 vCPUs, therefore 100 processes running in parallel can result in a lot of context switching, causing some performance degradation. Depending on the workload, you can benefit from reducing the maximum number of connections to a smaller number. The following code is a test result with pgbouncer configured to allow a maximum of 20 database connections:

transaction type: pgbench_script.sql
scaling factor: 5000
query mode: simple
number of clients: 100
number of threads: 2
duration: 600 s
number of transactions actually processed: 256267
latency average = 234.286 ms
tps = 426.828543 (including connections establishing)
tps = 426.828801 (excluding connections establishing)

You get a higher transaction rate with a smaller connection pool size. pgbouncer shows the following status while pgbench in is progress:

pgbouncer=# show pools;
-[ RECORD 1 ]-----------
database   | pgbench
user       | postgres
cl_active  | 20
cl_waiting | 80
sv_active  | 20
sv_idle    | 0
sv_used    | 0
sv_tested  | 0
sv_login   | 0
maxwait    | 0
maxwait_us | 125884
pool_mode  | transaction

At any given time, only 20 client connections (cl_active) are active. The remaining 80 connections (cl_waiting) wait for their turn and are assigned a connection as soon as one of the database connections is available. This result shows that more connections don’t necessarily mean more throughput. This smaller number of database connections helps reduce the context switching and resource contention, and therefore improves the overall performance for this custom workload. This is true in general for other databases as well.

RDS Proxy

As shown in the previous section, using connection poolers can help improve performance. Depending on your workload and the available instance resources, the performance improvement can be significant if a connection pooler is used to handle a high number of connections.

However, because all connections now route through the connection pooler, it can become a single point of failure. It’s essential to make sure you set it up for high availability. This can be challenging. AWS solves this problem by providing a fully managed, highly available database proxy: RDS Proxy.

RDS Proxy lets you set up and manage the connection pooler with just a few clicks. RDS Proxy allows you to set a maximum threshold for the connections that can be opened against the database. All client connections then use this connection pool. This helps with idle connections and makes sure that any unexpected surge in connection count doesn’t degrade the database performance.

The following figure shows the RDS Proxy usage stats for a pgbench run that opened 100 client connections. In this test, RDS Proxy was configured to only allow only 1% of the max connections allowed by the database.

In this test, RDS Proxy was configured to only allow only 1% of the max connections allowed by the database.

In another test, 1,000 idle connections were opened up before starting the pgbench run. The following figure shows that only two database connections that were already opened by RDS Proxy were used for the 1,000 new idle connections. When the pgbench run started, RDS Proxy opened up additional connections.

In another test, 1,000 idle connections were opened up before starting the pgbench run.

RDS Proxy also provides the option to configure the idle client connection timeout. By default, the proxy closes any client connection that is idle for 30 minutes. You can change this parameter as per your workload requirements. For more information, see Creating an RDS Proxy.

Summary

The results in this post show that a higher number of database connections doesn’t mean higher throughput. As you increase the number of database connections, the context switching and resource contention also increases, which impacts performance.

The post also shows that PostgreSQL connections consume resources even when they’re idle, so the common assumption that idle connections don’t have any performance impact is not correct.

If your application is designed in a way that results in a high number of connections, irrespective of whether they’re active or idle, you should consider making changes so that your memory and CPU resources aren’t wasted just to manage these connections. The change may be in the application so that it limits the number of connections, or the solution may be using a connection pooler. You can consider using pgbouncer or go with RDS Proxy, which is a managed service that allows you to set up connection pooling with a few clicks.


About the Author

Yaser Raja is a Senior Consultant with Professional Services team at Amazon Web Services. He works with customers to build scalable, highly available and secure solutions in AWS cloud. His focus area is homogenous and heterogeneous migrations of on-premise databases to AWS RDS and Aurora PostgreSQL.