n8n: PostgreSQL connection pool becomes permanently congested after a DB proxy outage, blocking self-recovery

Bug Description

After a brief PostgreSQL proxy outage, n8n can get stuck permanently returning 503 Database is not ready! even after the proxy fully recovers. Only a container restart fixes it.

Root cause summary

DbConnection.ping() runs SELECT 1 every 2 seconds to check DB connectivity. When the proxy hangs (accepts TCP connections but stalls on the PostgreSQL handshake), each ping query hangs until the ping's internal 5-second timeout fires. Timing out the ping does not cancel the underlying query: it keeps running in the background, holding a pg-pool connection slot for up to connectTimeoutMS (default 20s). With only 2 pool slots (the default), the pool is full after two pings (about 2 seconds) and stays congested. When the proxy recovers, there are no free slots for a fresh connection to get through.

Code detail

The issue is in packages/@n8n/db/src/connection/db-connection.ts:

private async ping() {
    if (!this.dataSource.isInitialized) return;
    const abortController = new AbortController();
    try {
        await Promise.race([
            this.dataSource.query('SELECT 1'),
            setTimeoutP(this.timeout, undefined, { signal: abortController.signal }).then(() => {
                throw new OperationalError('Database connection timed out');
            }),
        ]);
        this.connectionState.connected = true;
        return;
    } catch (error) {
        this.connectionState.connected = false;
        ...
    } finally {
        abortController.abort();       // cancels the timer only — NOT the query
        this.scheduleNextPing();
    }
}

abortController.abort() only cancels the setTimeoutP timer. The this.dataSource.query('SELECT 1') promise is not cancelled — it continues running in the background, holding its pg-pool connection slot for up to connectTimeoutMS (default 20,000ms).
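The effect can be reproduced without a database. The following is a minimal sketch, not n8n's code: FakePool, rejectAfter, and pingOnce are hypothetical names, and the fake query only increments a counter for a fixed time to stand in for a held pool slot. It illustrates that when the timer wins Promise.race, the losing query promise keeps running and its slot stays held.

```typescript
// Hypothetical stand-in for a pool: counts slots that are currently held.
class FakePool {
  held = 0;

  // Simulates a query that hangs for `hangMs` while holding a slot.
  query(hangMs: number): Promise<void> {
    this.held += 1;
    return new Promise<void>((resolve) => {
      setTimeout(() => {
        this.held -= 1; // the slot is released only when the query settles
        resolve();
      }, hangMs);
    });
  }
}

// Rejects after `ms` milliseconds, like the timer branch in ping().
function rejectAfter(ms: number, message: string): Promise<never> {
  return new Promise<never>((_, reject) => {
    setTimeout(() => reject(new Error(message)), ms);
  });
}

// Mirrors the shape of ping(): race the query against a timer.
// Nothing here cancels the query when the timer wins.
async function pingOnce(pool: FakePool, hangMs: number, timeoutMs: number): Promise<boolean> {
  try {
    await Promise.race([pool.query(hangMs), rejectAfter(timeoutMs, "Database connection timed out")]);
    return true;
  } catch {
    return false;
  }
}
```

With a fresh FakePool, pingOnce(pool, 200, 20) resolves to false after about 20 ms while pool.held is still 1; the counter only drops back to 0 once the 200 ms fake query settles on its own.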

Congestion cascade

With default settings (poolSize: 2, connectTimeoutMS: 20s, ping timeout 5s), against a proxy that hangs rather than refusing connections:

| Time | Event |
|------|-------|
| t=0s | Proxy breaks. Ping 1: SELECT 1 hangs → holds pool slot 1 for 20s. Ping times out at 5s. |
| t=2s | Ping 2: SELECT 1 hangs → holds pool slot 2 for 20s. Pool is now full. |
| t=4s+ | Subsequent pings queue their connect() call; the 5s ping timeout fires before any slot frees. Steady state: pool permanently congested. |

When the proxy recovers, both pool slots are still occupied by the previously leaked queries. New pings cannot get a fresh connection through until those slots clear (up to 20s each). Even if a leaked query eventually succeeds after recovery, its result is discarded — the corresponding ping already timed out and moved on. connectionState.connected has no path back to true without a restart.
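The slot accounting in this cascade can be sketched as a small discrete-time simulation. This is a simplified model under stated assumptions, not a trace of pg-pool: simulateOutage and its config fields are hypothetical names, one ping is assumed to start every 2 seconds as in the timeline, a slot taken at time t is assumed to stay occupied until t plus the connect timeout, and queued waiters are ignored (a ping that finds no free slot is simply counted as starved).

```typescript
interface PoolSimConfig {
  poolSize: number;          // pg-pool slots (2 in the report)
  connectTimeoutSec: number; // how long a hung connect holds its slot
  pingIntervalSec: number;   // how often a new ping starts
  outageSec: number;         // how long the proxy keeps hanging
}

interface PoolSimResult {
  acquired: number; // pings that obtained a slot (and then hung in it)
  starved: number;  // pings that found every slot occupied
}

// Simplified model: a slot taken at time t stays occupied until
// t + connectTimeoutSec while the proxy hangs.
function simulateOutage(cfg: PoolSimConfig): PoolSimResult {
  let releaseTimes: number[] = [];
  let acquired = 0;
  let starved = 0;
  for (let t = 0; t < cfg.outageSec; t += cfg.pingIntervalSec) {
    // Drop slots whose hung connect attempt has timed out by now.
    releaseTimes = releaseTimes.filter((release) => release > t);
    if (releaseTimes.length < cfg.poolSize) {
      releaseTimes.push(t + cfg.connectTimeoutSec);
      acquired += 1;
    } else {
      starved += 1;
    }
  }
  return { acquired, starved };
}

// Defaults from the report versus a 3-second connect timeout.
const defaults = simulateOutage({ poolSize: 2, connectTimeoutSec: 20, pingIntervalSec: 2, outageSec: 20 });
const shortConnect = simulateOutage({ poolSize: 2, connectTimeoutSec: 3, pingIntervalSec: 2, outageSec: 20 });
console.log(defaults, shortConnect);
```

Under this model, a 20-second outage with the default settings lets only the first 2 of 10 pings acquire a slot and starves the other 8, whereas a 3-second connect timeout frees a slot before each new ping and starves none.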

To Reproduce

  1. Deploy n8n in queue mode with PostgreSQL accessed via a connection proxy (ProxySQL, PgBouncer, HAProxy TCP, etc.)
  2. Put the proxy into a state where it accepts TCP connections but hangs on the PostgreSQL startup handshake (e.g., a certificate error that breaks the proxy's backend connections while the proxy's frontend remains open)
  3. Observe repeated "Database connection timed out" warnings in n8n logs — confirming the ping loop is running but each SELECT 1 is hanging
  4. Restore the proxy to a healthy state
  5. Observe that n8n does not recover — continues returning 503 Database is not ready! indefinitely

Expected behavior

After the proxy recovers and accepts PostgreSQL connections normally, the next successful SELECT 1 in ping() should set connectionState.connected = true and restore service without requiring a container restart.

Debug Info

Log output during the failure (repeated every ~5s):

{"level":"warn","message":"Database connection timed out","metadata":{"file":"db-connection.js","function":"ping"}}

No recovery log ("Database connection recovered") is ever emitted, even after the proxy is restored.

Suggested Fix

The mismatch between N8N_DB_PING_TIMEOUT (5s) and DB_POSTGRESDB_CONNECTION_TIMEOUT (20s) is the proximate cause of slot accumulation — pings give up in 5s but the underlying connection hangs for 20s.

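Option 1 (correct fix): make the query cancellable when the ping times out, so the pool slot is released immediately. This requires either a pg client-level abort or running the ping on a connection with short timeouts: a client-side connect timeout to bound the stalled handshake, plus a statement_timeout that the DB itself enforces. Note that statement_timeout only takes effect once a session has been established, so on its own it cannot bound a hanging handshake.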
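One possible shape for Option 1 is sketched below. It is an illustration under assumptions, not the n8n, TypeORM, or pg API: PingConnection, timeoutAfter, cancellablePing, and HangingConnection are hypothetical names, and destroy() is assumed to tear down the socket and free the slot immediately. The idea is that the ping owns its connection and always destroys it, so a timed-out ping cannot leave a slot behind.

```typescript
// Hypothetical minimal connection shape; not the pg or TypeORM API.
interface PingConnection {
  query(sql: string): Promise<unknown>;
  destroy(): void; // assumed to close the socket and free the slot at once
}

// A timer promise that rejects after `ms`, plus a way to cancel the timer.
function timeoutAfter(ms: number): { promise: Promise<never>; cancel: () => void } {
  let cancel = (): void => {};
  const promise = new Promise<never>((_, reject) => {
    const timer = setTimeout(() => reject(new Error("Database connection timed out")), ms);
    cancel = () => clearTimeout(timer);
  });
  return { promise, cancel };
}

// Runs SELECT 1 on a dedicated connection and always destroys that
// connection afterwards, whether the ping succeeded, failed, or timed out.
async function cancellablePing(open: () => PingConnection, timeoutMs: number): Promise<boolean> {
  const conn = open();
  const timeout = timeoutAfter(timeoutMs);
  try {
    await Promise.race([conn.query("SELECT 1"), timeout.promise]);
    return true;
  } catch {
    return false;
  } finally {
    timeout.cancel();
    conn.destroy(); // the slot is released even though the query never settled
  }
}

// Fake connection for demonstration: the query never settles, like a stalled handshake.
class HangingConnection implements PingConnection {
  static open = 0;
  constructor() {
    HangingConnection.open += 1;
  }
  query(_sql: string): Promise<unknown> {
    return new Promise<unknown>(() => {});
  }
  destroy(): void {
    HangingConnection.open -= 1;
  }
}
```

Calling cancellablePing(() => new HangingConnection(), 20) resolves to false and leaves HangingConnection.open at 0. The trade-off of this design is one extra connection per ping instead of reusing a pooled one, which a real fix would need to weigh.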

Option 2 (mitigation): document that DB_POSTGRESDB_CONNECTION_TIMEOUT should be set below N8N_DB_PING_TIMEOUT (e.g., 3,000 ms versus 5,000 ms) so the connection attempt itself fails before the ping timer fires, freeing the slot within the ping window and preventing accumulation.
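A deployment could check this ordering at startup. The helper below is a hedged sketch: checkTimeoutOrdering and readMs are hypothetical names, and it assumes both variables are expressed in milliseconds and fall back to the values quoted in this report (20,000 ms and 5,000 ms) when unset.

```typescript
// Assumption: both variables hold milliseconds and default to the values
// quoted in the report (20,000 ms connect timeout, 5,000 ms ping timeout).
const DEFAULT_CONNECT_TIMEOUT_MS = 20000;
const DEFAULT_PING_TIMEOUT_MS = 5000;

// Parses a millisecond value, falling back when it is unset or not numeric.
function readMs(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isFinite(parsed) ? parsed : fallback;
}

// Returns null when the ordering is safe, otherwise a human-readable warning.
function checkTimeoutOrdering(env: Record<string, string | undefined>): string | null {
  const connectMs = readMs(env["DB_POSTGRESDB_CONNECTION_TIMEOUT"], DEFAULT_CONNECT_TIMEOUT_MS);
  const pingMs = readMs(env["N8N_DB_PING_TIMEOUT"], DEFAULT_PING_TIMEOUT_MS);
  if (connectMs < pingMs) return null;
  return `DB_POSTGRESDB_CONNECTION_TIMEOUT (${connectMs} ms) should be below N8N_DB_PING_TIMEOUT (${pingMs} ms)`;
}
```

With an empty environment the assumed defaults apply and the function returns a warning; with the connect timeout set to "3000" and the ping timeout set to "5000" it returns null.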

Operating System

Ubuntu Server 22.04 (host); Alpine Linux (n8n Docker image)

n8n Version

2.19.2

Node.js Version

bundled in Docker image

Database

PostgreSQL

Execution mode

queue

Hosting

self hosted
