n8n: job.finished() leaks EventEmitter listeners and setInterval in queue mode when removeOnComplete/removeOnFail is true [1 comment, 2 participants]

n8n-io/n8n#30392 (fetched 2026-05-14 03:45:24)

Bug Description

In queue mode, webhook/main pods permanently leak Bull job.finished() resources when a global:completed or global:failed event is missed (e.g. during a Redis reconnect). Over time this causes the process to OOM-crash.

Root Cause

WorkflowRunner.enqueueExecution() calls job.finished() to wait for the worker to complete a Bull job. Internally, Bull's job.finished() does three things:

  1. Adds a global:completed listener to the queue EventEmitter
  2. Adds a global:failed listener to the queue EventEmitter
  3. Starts a watchdog setInterval every 5 seconds that polls scripts.isFinished()

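These are cleaned up by removeListeners() only once the job is seen to finish, either through one of the two events or through a watchdog poll that reports the job as completed or failed.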

scaling.service.ts sets removeOnComplete: true and removeOnFail: true, so completed and failed jobs are removed from Redis immediately. If the global:completed/global:failed event is missed (Redis reconnect, timing issue), the watchdog is the only remaining path, and it calls scripts.isFinished(). Because the job no longer exists in Redis, the call returns -1 (not found). The watchdog interprets this as "still running" and keeps polling forever, so removeListeners() is never called.

Each affected job permanently retains:

  • 2 Bull EventEmitter listeners on the queue object
  • 1 setInterval handle (5s polling, never cleared)
  • The activeExecutions[executionId] entry (closure holds full executionData, request body, and HTTP response socket via httpResponse)
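
The mechanism described above can be modeled in a few lines. This is a simplified sketch, not Bull's source: modelFinished, PendingWait, JobStatus, pollOnce, and forceCleanup are hypothetical names, and the 'missing' status stands in for the "not found" result the issue reports. It only illustrates why a wait whose event was missed, and whose job has already been removed, never releases its listeners or its interval.

```typescript
import { EventEmitter } from 'node:events';

// 'missing' stands in for the "not found" result described in the issue.
type JobStatus = 'completed' | 'failed' | 'active' | 'missing';

interface PendingWait {
  settled: boolean;
  pollOnce: () => void;     // one watchdog tick, exposed so it can be driven by hand
  forceCleanup: () => void; // manual cleanup, used here only to end the demo
}

// Hypothetical model of the wait: two listeners plus a 5 s watchdog interval.
function modelFinished(
  queue: EventEmitter,
  jobId: string,
  getStatus: (id: string) => JobStatus,
): PendingWait {
  const onCompleted = (id: string): void => {
    if (id === jobId) settle();
  };
  const onFailed = (id: string): void => {
    if (id === jobId) settle();
  };

  function removeListeners(): void {
    clearInterval(timer);
    queue.removeListener('global:completed', onCompleted);
    queue.removeListener('global:failed', onFailed);
  }

  function settle(): void {
    wait.settled = true;
    removeListeners();
  }

  function pollOnce(): void {
    const status = getStatus(jobId);
    if (status === 'completed' || status === 'failed') settle();
    // 'active' and 'missing' both fall through, so the wait stays pending.
  }

  queue.on('global:completed', onCompleted);
  queue.on('global:failed', onFailed);
  const timer = setInterval(pollOnce, 5000);

  const wait: PendingWait = { settled: false, pollOnce, forceCleanup: removeListeners };
  return wait;
}
```

In this model, if the event for the job is never delivered and getStatus keeps returning 'missing', every call to pollOnce leaves both listeners and the interval in place, which matches the per-job retention listed above.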

Impact

  • Webhook pods OOM-crash on a daily basis under production load
  • Each leaked entry holds the full execution payload + open HTTP socket
  • Confirmed present in all 1.x versions and 2.x up to 2.20.7

To Reproduce

  1. Run n8n in queue mode with separate webhook and worker pods
  2. Trigger a brief Redis reconnection while jobs are in flight (or under sustained high load where events can be occasionally missed)
  3. Monitor Object.keys(activeExecutions).length on the webhook pod — it grows monotonically and never shrinks
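
The monitoring in step 3 could be sketched as below. sampleLeakIndicators and growsWithoutShrinking are hypothetical helpers, and how to obtain references to the queue emitter and to activeExecutions inside a running n8n process is not covered by the issue, so both are passed in as plain parameters. The listener counts use the standard Node.js EventEmitter listenerCount method.

```typescript
import { EventEmitter } from 'node:events';

interface LeakSample {
  activeExecutions: number;
  completedListeners: number;
  failedListeners: number;
}

// Take one snapshot of the three counters the issue says are retained per leaked job.
function sampleLeakIndicators(
  queue: EventEmitter,
  activeExecutions: Record<string, unknown>,
): LeakSample {
  return {
    activeExecutions: Object.keys(activeExecutions).length,
    completedListeners: queue.listenerCount('global:completed'),
    failedListeners: queue.listenerCount('global:failed'),
  };
}

// True when activeExecutions never decreases across the samples and ends higher than it started.
function growsWithoutShrinking(samples: LeakSample[]): boolean {
  const [first, ...rest] = samples;
  if (first === undefined || rest.length === 0) return false;
  let previous = first.activeExecutions;
  for (const sample of rest) {
    if (sample.activeExecutions < previous) return false;
    previous = sample.activeExecutions;
  }
  return previous > first.activeExecutions;
}
```

On a healthy pod the counts should rise and fall with in-flight work, so growth is only suspicious when it persists across samples taken over a long enough window to include idle periods.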

Expected behavior

job.finished() should resolve (or reject) for every job regardless of whether removeOnComplete/removeOnFail is set, and all associated EventEmitter listeners and timers should always be cleaned up after a job completes or fails.
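
One way to picture this expected behavior is a wait whose watchdog treats "job not found" as a terminal state, so cleanup runs on every path. The sketch below is an illustration under assumptions, not the fix adopted by n8n or Bull: waitWithGuaranteedCleanup, Outcome, and the 'removed' outcome are hypothetical, and 'missing' again stands in for the not-found result.

```typescript
import { EventEmitter } from 'node:events';

type JobStatus = 'completed' | 'failed' | 'active' | 'missing';
// 'removed' means the job finished and was deleted before its result could be observed.
type Outcome = 'completed' | 'failed' | 'removed';

// Hypothetical wait that always releases its listeners and interval once it settles.
function waitWithGuaranteedCleanup(
  queue: EventEmitter,
  jobId: string,
  getStatus: (id: string) => JobStatus,
  onSettled: (outcome: Outcome) => void,
  pollMs = 5000,
): { pollOnce: () => void } {
  let settled = false;

  const settle = (outcome: Outcome): void => {
    if (settled) return; // an event and a poll may race; settle only once
    settled = true;
    clearInterval(timer);
    queue.removeListener('global:completed', onCompleted);
    queue.removeListener('global:failed', onFailed);
    onSettled(outcome);
  };

  const onCompleted = (id: string): void => {
    if (id === jobId) settle('completed');
  };
  const onFailed = (id: string): void => {
    if (id === jobId) settle('failed');
  };

  const pollOnce = (): void => {
    const status = getStatus(jobId);
    if (status === 'completed' || status === 'failed') settle(status);
    else if (status === 'missing') settle('removed'); // terminal instead of "still running"
  };

  queue.on('global:completed', onCompleted);
  queue.on('global:failed', onFailed);
  const timer = setInterval(pollOnce, pollMs);

  return { pollOnce };
}
```

A 'removed' outcome carries no result, so a caller using this approach would have to learn whether the job succeeded from another source. The sketch also assumes the wait starts only after the job has been enqueued, since otherwise 'missing' could simply mean the job does not exist yet.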

Debug Info

core

  • n8nVersion: 1.123.26
  • platform: docker (self-hosted)
  • nodeJsVersion: 24.13.1
  • nodeEnv: production
  • database: postgres
  • executionMode: scaling (multi-main)
  • concurrency: -1
  • license: enterprise (sandbox)
  • consumerId: a891fcab-3387-4047-ba6a-a954540e45f1

storage

  • success: none
  • error: none
  • progress: false
  • manual: true
  • binaryMode: memory

pruning

  • enabled: true
  • maxAge: 336 hours
  • maxCount: 2000000 executions

client

  • userAgent: mozilla/5.0 (macintosh; intel mac os x 10_15_7) applewebkit/537.36 (khtml, like gecko) chrome/148.0.0.0 safari/537.36
  • isTouchDevice: false

security

  • secureCookie: false

Generated at: 2026-05-13T10:50:34.320Z

Operating System

node:24.13.1-alpine

n8n Version

1.123.26

Node.js Version

24.13.1

Database

PostgreSQL

Execution mode

queue

Hosting

self hosted
