n8n: PostgreSQL connection pool becomes permanently congested after a DB proxy outage, blocking self-recovery

Error Message

{"level":"warn","message":"Database connection timed out","metadata":{"file":"db-connection.js","function":"ping"}}

Bug Description

After a brief PostgreSQL proxy outage, n8n can get stuck permanently returning 503 Database is not ready! even after the proxy fully recovers. Only a container restart fixes it.

Root cause summary

DbConnection.ping() runs SELECT 1 every 2 seconds to check DB connectivity. When the proxy hangs (accepts TCP connections but stalls on the PostgreSQL handshake), each ping query hangs until the ping's internal 5-second timeout fires. The problem is that timing out the ping does not cancel the underlying query — it keeps running in the background, holding a pg-pool connection slot for up to connectTimeoutMS (default 20s). With only 2 pool slots (the default), the pool fills up within a few seconds and stays congested. When the proxy recovers, there are no free slots for a fresh connection to get through.

Code detail

The issue is in packages/@n8n/db/src/connection/db-connection.ts:

private async ping() {
    if (!this.dataSource.isInitialized) return;
    const abortController = new AbortController();
    try {
        await Promise.race([
            this.dataSource.query('SELECT 1'),
            setTimeoutP(this.timeout, undefined, { signal: abortController.signal }).then(() => {
                throw new OperationalError('Database connection timed out');
            }),
        ]);
        this.connectionState.connected = true;
        return;
    } catch (error) {
        this.connectionState.connected = false;
        ...
    } finally {
        abortController.abort();       // cancels the timer only — NOT the query
        this.scheduleNextPing();
    }
}

abortController.abort() only cancels the setTimeoutP timer. The this.dataSource.query('SELECT 1') promise is not cancelled — it continues running in the background, holding its pg-pool connection slot for up to connectTimeoutMS (default 20,000ms).
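
This is a general property of Promise.race rather than anything specific to n8n. The following is a minimal sketch, not the n8n code: `fakeQuery`, `timeoutAfter`, `racedPing`, and the `slotsInUse` counter are hypothetical stand-ins for `dataSource.query('SELECT 1')`, the timer, and the pool. It shows that the losing promise keeps its slot after the race has settled.

```typescript
// Hypothetical counter standing in for occupied pg-pool slots.
let slotsInUse = 0;

// Hypothetical stand-in for dataSource.query('SELECT 1') against a hung proxy:
// it holds a slot until it finishes, regardless of who is still awaiting it.
function fakeQuery(durationMs: number): Promise<string> {
  slotsInUse++;
  return new Promise((resolve) => {
    setTimeout(() => {
      slotsInUse--; // the slot is released only when the query itself ends
      resolve('ok');
    }, durationMs);
  });
}

// Rejects after ms milliseconds, like the timer branch in ping().
function timeoutAfter(ms: number): Promise<never> {
  return new Promise((_, reject) => {
    setTimeout(() => reject(new Error('Database connection timed out')), ms);
  });
}

// Races a slow query against a shorter timeout and reports how many
// slots are still held at the moment the race settles.
async function racedPing(
  queryMs: number,
  timeoutMs: number,
): Promise<{ timedOut: boolean; slotsHeldAfterRace: number }> {
  try {
    await Promise.race([fakeQuery(queryMs), timeoutAfter(timeoutMs)]);
    return { timedOut: false, slotsHeldAfterRace: slotsInUse };
  } catch {
    // The race is over, but nothing cancelled the query promise.
    return { timedOut: true, slotsHeldAfterRace: slotsInUse };
  }
}
```

Calling `racedPing(200, 20)` in this sketch reports a timeout while one slot is still held; the slot is freed only once the 200 ms query finishes on its own.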

Congestion cascade

With default settings (poolSize: 2, connectTimeoutMS: 20s, ping timeout 5s), against a proxy that hangs rather than refusing connections:

Time	Event
t=0s	Proxy breaks. Ping 1: `SELECT 1` hangs → holds pool slot 1 for 20s. Ping times out at 5s.
t=2s	Ping 2: `SELECT 1` hangs → holds pool slot 2 for 20s. Pool is now full.
t=4s+	Subsequent pings queue their `connect()` call; the 5s ping timeout fires before any slot frees. Steady state: pool permanently congested.

When the proxy recovers, both pool slots are still occupied by the previously leaked queries. New pings cannot get a fresh connection through until those slots clear (up to 20s each). Even if a leaked query eventually succeeds after recovery, its result is discarded — the corresponding ping already timed out and moved on. connectionState.connected has no path back to true without a restart.
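
The timeline above can be reproduced with a small model. This is a rough sketch under simplifying assumptions, not a simulation of pg-pool: it takes the ping start times and hold duration from the table, and it drops a ping that finds no free slot instead of queueing it, which understates the real congestion. `CascadeParams` and `occupiedSlotsAt` are hypothetical names.

```typescript
interface CascadeParams {
  poolSize: number;       // number of pool slots (2 in the report)
  pingIntervalMs: number; // time between ping starts (2,000 ms in the table)
  slotHoldMs: number;     // how long a hung query keeps its slot (20,000 ms)
}

// Counts occupied pool slots at time tMs after the proxy starts hanging.
// Simplification: a ping that finds no free slot is dropped, not queued.
function occupiedSlotsAt(tMs: number, p: CascadeParams): number {
  if (p.pingIntervalMs <= 0) {
    throw new Error('pingIntervalMs must be positive');
  }
  let releaseTimes: number[] = [];
  for (let start = 0; start <= tMs; start += p.pingIntervalMs) {
    // Free every slot whose hung query has given up by the time this ping starts.
    releaseTimes = releaseTimes.filter((release) => release > start);
    if (releaseTimes.length < p.poolSize) {
      releaseTimes.push(start + p.slotHoldMs); // this ping leaks a slot
    }
  }
  return releaseTimes.filter((release) => release > tMs).length;
}

const reportDefaults: CascadeParams = { poolSize: 2, pingIntervalMs: 2000, slotHoldMs: 20000 };
console.log(occupiedSlotsAt(0, reportDefaults));     // 1
console.log(occupiedSlotsAt(2000, reportDefaults));  // 2
console.log(occupiedSlotsAt(60000, reportDefaults)); // 2
```

In this model every slot that frees up is immediately re-taken by the next hung ping, so the pool stays full for as long as the proxy keeps hanging.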

To Reproduce

1. Deploy n8n in queue mode with PostgreSQL accessed via a connection proxy (ProxySQL, PgBouncer, HAProxy TCP, etc.).
2. Put the proxy into a state where it accepts TCP connections but hangs on the PostgreSQL startup handshake (e.g., a certificate error that breaks the proxy's backend connections while the proxy's frontend remains open).
3. Observe repeated "Database connection timed out" warnings in the n8n logs, confirming that the ping loop is running but each SELECT 1 is hanging.
4. Restore the proxy to a healthy state.
5. Observe that n8n does not recover: it continues returning 503 Database is not ready! indefinitely.

Expected behavior

After the proxy recovers and accepts PostgreSQL connections normally, the next successful SELECT 1 in ping() should set connectionState.connected = true and restore service without requiring a container restart.

Debug Info

Log output during the failure (repeated every ~5s):

{"level":"warn","message":"Database connection timed out","metadata":{"file":"db-connection.js","function":"ping"}}

No recovery log ("Database connection recovered") is ever emitted, even after the proxy is restored.

Suggested Fix

The mismatch between N8N_DB_PING_TIMEOUT (5s) and DB_POSTGRESDB_CONNECTION_TIMEOUT (20s) is the proximate cause of slot accumulation — pings give up in 5s but the underlying connection hangs for 20s.

Option 1 — correct fix: make the query cancellable when the ping times out, so the pool slot is released immediately. This requires either a pg client-level abort or wrapping the query in a connection with a short statement_timeout/connect_timeout that the DB itself enforces.
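
One possible shape for Option 1 is sketched below. It is an illustration only, not a patch against db-connection.ts: the `Probe` type and `pingWithDeadline` are hypothetical, and how a real probe honors the abort signal (for example by tearing down a dedicated client's socket) depends on the driver and is not shown. The point is that the timeout aborts the probe itself instead of only abandoning its promise.

```typescript
// Hypothetical contract: a probe runs the health query and MUST release
// its connection (and reject) as soon as the signal is aborted.
type Probe = (signal: AbortSignal) => Promise<void>;

// Returns true if the probe completed within timeoutMs, false otherwise.
// On timeout the probe is told to abort, so it cannot keep holding a slot.
async function pingWithDeadline(probe: Probe, timeoutMs: number): Promise<boolean> {
  const controller = new AbortController();

  // Rejects as soon as the controller is aborted.
  const deadline = new Promise<never>((_, reject) => {
    controller.signal.addEventListener(
      'abort',
      () => reject(new Error('Database connection timed out')),
      { once: true },
    );
  });

  // Aborting the controller both fails the race and notifies the probe.
  const timer = setTimeout(() => controller.abort(), timeoutMs);

  try {
    await Promise.race([probe(controller.signal), deadline]);
    return true;
  } catch {
    return false;
  } finally {
    clearTimeout(timer); // no stray timer once the ping has settled
  }
}
```

With a probe that honors the signal, a timed-out ping leaves no connection behind, so the next ping after the proxy recovers starts with a free slot.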

Option 2 — mitigation: document that DB_POSTGRESDB_CONNECTION_TIMEOUT should be set below N8N_DB_PING_TIMEOUT (e.g., 3,000ms vs 5,000ms) so the connection attempt itself fails before the ping timer fires, freeing the slot within the ping window and preventing accumulation.
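
For Option 2, a startup check along the following lines could flag the risky configuration. This is a sketch under assumptions: `checkDbTimeouts` and `TimeoutCheck` are hypothetical, the variable names and the 5,000 ms / 20,000 ms fallbacks are taken from this report, and both values are assumed to be in milliseconds.

```typescript
interface TimeoutCheck {
  ok: boolean;              // true when the connection attempt fails before the ping does
  pingTimeoutMs: number;
  connectTimeoutMs: number;
}

// Compares the two timeouts named in this report. Fallback values are the
// defaults quoted in the report, not values read from n8n itself.
function checkDbTimeouts(env: Record<string, string | undefined>): TimeoutCheck {
  const pingTimeoutMs = Number(env['N8N_DB_PING_TIMEOUT'] ?? 5000);
  const connectTimeoutMs = Number(env['DB_POSTGRESDB_CONNECTION_TIMEOUT'] ?? 20000);
  if (!isFinite(pingTimeoutMs) || !isFinite(connectTimeoutMs)) {
    throw new Error('Timeout values must be numeric (milliseconds assumed)');
  }
  return { ok: connectTimeoutMs < pingTimeoutMs, pingTimeoutMs, connectTimeoutMs };
}
```

Passing `process.env` to this function at startup would report `ok: false` for the default 20,000 ms versus 5,000 ms pairing and `ok: true` for the 3,000 ms versus 5,000 ms example above.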

Operating System

Ubuntu Server 22.04 (host); Alpine Linux (n8n Docker image)

n8n Version

2.19.2

Node.js Version

bundled in Docker image

Database

PostgreSQL

Execution mode

queue

Hosting

self hosted
