openclaw - ✅(Solved) Fix mcp/channel-bridge: pendingClaudePermissions / pendingApprovals leak — no TTL, no close-clear, no cap [1 pull requests, 1 participants]

GitHub stats
openclaw/openclaw#71646 (fetched 2026-04-26 05:10:15)
Comments: 0 · Participants: 1 · Timeline: 1 · Reactions: 0
Timeline (top): cross-referenced ×1

OpenClawChannelBridge (src/mcp/channel-bridge.ts) holds two instance-bound pending Maps that lack the same three guards their siblings already have, so a long-running openclaw mcp serve process accumulates entries monotonically.

| Collection | Cap | Close-clear | TTL / sweeper |
| --- | --- | --- | --- |
| queue (Array, L48) | QUEUE_LIMIT=1000 while-shift (L355) | n/a | n/a |
| pendingWaiters (Set, L49) | n/a | close() clears (L148) | per-waiter setTimeout fallback (L265) |
| pendingClaudePermissions (Map, L50) | none | none | none |
| pendingApprovals (Map, L51) | none | none | expiresAtMs is stored at L382 but never drives a timer/sweeper |


Fix Action

Fixed

PR fix notes

PR #71648: fix(mcp): bound pendingClaudePermissions / pendingApprovals via TTL sweeper + close clear

Description (problem / solution / changelog)

Summary

  • Problem: OpenClawChannelBridge (src/mcp/channel-bridge.ts) holds two instance-bound pending Maps — pendingClaudePermissions (L50) and pendingApprovals (L51) — that lack TTL/sweeper, close-clear, and cap. Sibling collections in the same class (queue cap=1000 at L355, pendingWaiters close-clear at L148 + per-waiter setTimeout fallback at L265) are all bounded; only these two are not.
  • Why it matters: long-running openclaw mcp serve accumulates entries on (a) missed Claude permission replies (operator typos / cross-channel responses / --dangerously-skip-permissions) and (b) gateway WebSocket drops that silence *.approval.resolved frames (onClose at L122-124 is reject-only, no missed-event catchup).
  • What changed: lazy-start a 5-min sweepPendingExpired() interval, .unref()'d so it never keeps the process alive. Per-Map TTL — 1h for Claude permissions (anchored at createdAtMs), respects payload.expiresAtMs for approvals (falls back to trackedAtMs + 30min when absent). close() now clears both Maps and stops the sweeper. Handler entry early-returns when this.closed to prevent post-close ghost writes.
  • What did NOT change: public listPendingApprovals shape, message protocols, or any other class collection. cap/FIFO is intentionally split to a follow-up PR (one-thing-per-PR; sweeper covers the slow leak, cap addresses adversarial bursts).
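
The changes summarized above could be sketched roughly as follows. This is a minimal illustration and not the PR's code: the class name `PendingTracker`, the `trackPermission` method, the `sweeperRunning` getter, and the injectable `now` clock are assumptions made for the example. Only the TTL values, `sweepPendingExpired`, `trackApproval`, `close()`, and the wrapper shapes come from the description.

```typescript
// Hypothetical sketch of the sweeper pattern described above; not the actual channel-bridge.ts code.
const PENDING_CLAUDE_PERMISSION_TTL_MS = 60 * 60 * 1000; // 1h, per the PR notes
const PENDING_APPROVAL_DEFAULT_TTL_MS = 30 * 60 * 1000; // 30min fallback, per the PR notes
const PENDING_SWEEP_INTERVAL_MS = 5 * 60 * 1000; // 5min sweep tick, per the PR notes

type PermissionEntry = { request: unknown; createdAtMs: number };
type ApprovalEntry = { approval: { expiresAtMs?: number }; trackedAtMs: number };

class PendingTracker {
  readonly pendingClaudePermissions = new Map<string, PermissionEntry>();
  readonly pendingApprovals = new Map<string, ApprovalEntry>();
  private sweeper: ReturnType<typeof setInterval> | undefined;
  private closed = false;
  private readonly now: () => number;

  // The clock is injectable (an assumption of this sketch) so sweeps can be exercised without real timers.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  trackPermission(id: string, request: unknown): void {
    if (this.closed) return; // early return prevents post-close ghost writes
    this.pendingClaudePermissions.set(id, { request, createdAtMs: this.now() });
    this.ensureSweeper();
  }

  trackApproval(id: string, approval: { expiresAtMs?: number }): void {
    if (this.closed) return;
    this.pendingApprovals.set(id, { approval, trackedAtMs: this.now() });
    this.ensureSweeper();
  }

  sweepPendingExpired(): void {
    const now = this.now();
    for (const [id, entry] of this.pendingClaudePermissions) {
      if (now - entry.createdAtMs >= PENDING_CLAUDE_PERMISSION_TTL_MS) {
        this.pendingClaudePermissions.delete(id);
      }
    }
    for (const [id, entry] of this.pendingApprovals) {
      // Prefer the gateway-supplied expiry; otherwise anchor on when the entry was first tracked.
      const expiresAt =
        entry.approval.expiresAtMs ?? entry.trackedAtMs + PENDING_APPROVAL_DEFAULT_TTL_MS;
      if (now >= expiresAt) this.pendingApprovals.delete(id);
    }
  }

  close(): void {
    this.closed = true;
    if (this.sweeper !== undefined) clearInterval(this.sweeper);
    this.sweeper = undefined;
    this.pendingClaudePermissions.clear();
    this.pendingApprovals.clear();
  }

  get sweeperRunning(): boolean {
    return this.sweeper !== undefined;
  }

  // Lazy start: no interval exists until the first pending entry is added.
  private ensureSweeper(): void {
    if (this.sweeper !== undefined) return;
    this.sweeper = setInterval(() => this.sweepPendingExpired(), PENDING_SWEEP_INTERVAL_MS);
    // unref() exists on Node.js timers only; the cast keeps the sketch type-checking in other environments.
    (this.sweeper as unknown as { unref?: () => void }).unref?.();
  }
}
```

Deleting from a Map while iterating it with `for...of` is safe in JavaScript, which is why the sweep does not need to collect keys first.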

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Memory / storage
  • Integrations
  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Closes #71646.

  • This PR fixes a bug or regression

Root Cause

Upstream commit history shows channel-bridge.ts evolving across refactor/seam-split commits over the last 6 weeks, but no commit added a TTL/sweep/cap guard for these two pending Maps. The asymmetry is structural:

| Collection | Cap | Close-clear | TTL / sweeper | Status |
| --- | --- | --- | --- | --- |
| queue (Array, L48) | QUEUE_LIMIT=1000 while-shift | n/a | n/a | bounded |
| pendingWaiters (Set, L49) | n/a | close() clears (L148) | per-waiter setTimeout (L265) | bounded |
| pendingClaudePermissions (Map, L50) | none | none | none | leaks |
| pendingApprovals (Map, L51) | none | none | expiresAtMs stored at L382 but never drives a timer | leaks |

pendingClaudePermissions deletes an entry only when the operator reply matches /^(yes|no)\s+([a-km-z]{5})$/i AND has(requestId) is true (L455-459), so its cleanup depends on a single conditional path. pendingApprovals deletes only on the matching *.resolved frame (L389), another single conditional path. Neither Map has an unconditional cleanup path.
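
To make the narrowness of that cleanup condition concrete, here is a small sketch built around the reply pattern quoted above. The `handleOperatorReply` function, the stand-in `pending` Map, and the lowercasing of the captured ID are assumptions for illustration; the real handler in channel-bridge.ts may differ.

```typescript
// The reply pattern quoted above: "yes" or "no", whitespace, then exactly five letters from a-z minus "l".
const REPLY_PATTERN = /^(yes|no)\s+([a-km-z]{5})$/i;

// Hypothetical stand-in for the bridge's pendingClaudePermissions Map.
const pending = new Map<string, { toolName: string }>();
pending.set("abcde", { toolName: "shell" });

// Returns true only when the reply matches the pattern AND names a tracked request.
// Every other reply leaves the Map untouched, which is the leak path described above.
function handleOperatorReply(text: string): boolean {
  const match = REPLY_PATTERN.exec(text);
  if (!match) return false;
  // Lowercasing is an assumption of this sketch, made because the pattern is case-insensitive.
  const requestId = (match[2] ?? "").toLowerCase();
  if (!pending.has(requestId)) return false;
  pending.delete(requestId);
  return true;
}
```

With this sketch, `handleOperatorReply("yes abcd")` (four letters), `handleOperatorReply("ok abcde")` (wrong verb), and `handleOperatorReply("no zzzzz")` (well-formed but untracked ID) all return false and leave the `abcde` entry in place; only `handleOperatorReply("yes abcde")` removes it.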

Regression Test Plan

New file src/mcp/channel-bridge.test.ts (7 it blocks, fake timers, no production-branch divergence):

  1. pendingClaudePermissions evicted by sweeper at TTL boundary
  2. pendingApprovals evicted at expiresAtMs
  3. pendingApprovals evicted at default TTL fallback when expiresAtMs is absent
  4. pendingApprovals evicted even when both createdAtMs and expiresAtMs are absent (regression test for the now-anchored fallback bug caught in pre-PR review)
  5. close() clears both Maps, stops the sweeper interval, and vi.getTimerCount() === 0
  6. handleClaudePermissionRequest is a no-op after close() — prevents post-close ghost writes
  7. Sweeper interval is not started before any pending entry is added (lazy-init)

channel-server.test.ts:366 fixture updated to the new {request, createdAtMs} wrapper shape.

User-visible / Behavior Changes

  • Long-running openclaw mcp serve sees pending Map sizes converge to a steady state instead of growing monotonically.
  • Operators who never reply to a Claude permission_request no longer keep the entry forever; it expires after 1h and the next permission_request for the same tool starts fresh.
  • Approvals announced over a dropped gateway WebSocket are auto-evicted at expiresAtMs (or trackedAtMs + 30min if the gateway omitted expiresAtMs).
  • No protocol surface changes; client tools see no difference.

Security Impact

None. The fix is purely about bounded memory growth in an opt-in long-running daemon. No auth / scope / token / sandbox surface touched. CODEOWNERS protected paths (/src/cron/service/jobs.ts, /src/cron/stagger.ts, /src/agents/*auth*, /src/agents/sandbox*) untouched.

Repro + Verification

Environment

  • pnpm 10.33.0 / Node (project default) / vitest 4.1.5
  • Worktree: clean checkout of upstream/main + this branch
  • Platform: macOS (Darwin 25.4.0)

Steps

pnpm install
pnpm check                                                       # type + lint
pnpm build                                                       # build
pnpm exec vitest run src/mcp/channel-bridge.test.ts              # new tests
pnpm exec vitest run src/mcp/channel-server.test.ts              # existing mcp tests after fixture sync
pnpm exec vitest run --config test/vitest/vitest.unit.config.ts  # unit project regression

Expected

All green. New file 7/7. Existing channel-server 6/6. Unit project 1575/1575 (was 1573 pre-fix; +2 from new file's count vs prior file count).

Actual

src/mcp/channel-bridge.test.ts: Tests 7 passed (7), 640ms
src/mcp/channel-server.test.ts: Tests 6 passed (6)
unit project: Test Files 189 passed (189), Tests 1575 passed (1575), 6.48s
pnpm check: clean
pnpm build: clean

Evidence

  • src/mcp/channel-bridge.ts:50-51 — pre-fix declarations of the two leaking Maps
  • src/mcp/channel-bridge.ts:136-152 — pre-fix close() body (only pendingWaiters.clear())
  • src/mcp/channel-bridge.ts:355-356 — pre-existing queue cap pattern (sibling reference)
  • src/mcp/channel-bridge.ts:265-267 — pre-existing per-waiter setTimeout (sibling reference)
  • Diff: ~80 lines prod (channel-bridge.ts +89 -4) + 9 lines fixture sync (channel-server.test.ts) + 173 lines new test file

Human Verification

I confirm I have reviewed every diff hunk, ran the build/check/test suites locally, and the regression tests fail on upstream/main (no fix) and pass on this branch (with fix).

Review Conversations

This is a fresh PR. No prior review threads.

Compatibility / Migration

Fully backwards-compatible. No public API, no protocol field, no config surface changed. The internal Map value shape ({request, createdAtMs} wrapper, {approval, trackedAtMs} wrapper) is private — only the test that previously cast through Record<string, unknown> was updated to match.

Risks and Mitigations

  • Risk: TTL constants (PENDING_CLAUDE_PERMISSION_TTL_MS=1h, PENDING_APPROVAL_DEFAULT_TTL_MS=30min, PENDING_SWEEP_INTERVAL_MS=5min) are hardcoded. Mitigation: chosen for low operator-disruption (1h is longer than typical permission-think time; 30min matches existing expiresAtMs field intent). Happy to migrate to a config surface in a follow-up if requested.
  • Risk: PR #56420 (sessionKey binding, OPEN) touches the same set() line in handleClaudePermissionRequest. Mitigation: line-level rebase only; semantic axes are orthogonal (binding vs leak). Whichever PR merges first, the other rebases cleanly.
  • Risk: cap/FIFO is not in this PR — adversarial bursts within a single sweep tick (5min) could still grow large in pathological cases. Mitigation: deferred to a follow-up PR (one-thing-per-PR). Sweeper closes the slow leak that motivated the report; cap is a separate concern.
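
For readers wondering what the deferred cap/FIFO guard might look like, one possible shape is sketched below. It is not taken from the follow-up PR; the helper name `setWithCap` and the idea of passing the limit as a parameter are assumptions. It relies only on the fact that a JavaScript Map iterates in insertion order.

```typescript
// Hypothetical helper: insert into a Map while keeping at most `limit` entries,
// evicting the oldest insertions first (Maps iterate in insertion order).
function setWithCap<K, V>(map: Map<K, V>, key: K, value: V, limit: number): void {
  // Re-inserting an existing key moves it to the back, so it counts as the newest entry.
  map.delete(key);
  map.set(key, value);
  while (map.size > limit) {
    const oldest = map.keys().next();
    if (oldest.done) break;
    map.delete(oldest.value);
  }
}
```

A cap of this kind bounds the worst case within a single sweep tick, while the TTL sweeper still handles the slow leak.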

🤖 AI-Assisted (openclaw-audit pipeline). 5-agent post-harness cross-review (proceed: 4 real + 1 fix-insufficient → cap deferred to follow-up). 3-agent pre-PR cross-review round 1 caught a fallback-recalculation bug in the original sweeper (createdAtMs ?? now re-anchored expiry on every tick) → fixed via trackedAtMs instance bookkeeping; round 2 cleared 3/3 real-problem-real-fix.
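
The fallback-recalculation bug mentioned in the note above can be illustrated with two small predicates. This is a reconstruction from the one-line description (`createdAtMs ?? now` re-anchoring expiry on every tick), so the function names and the exact expression are assumptions rather than the original code.

```typescript
const PENDING_APPROVAL_DEFAULT_TTL_MS = 30 * 60 * 1000; // 30min, per the PR notes

type Approval = { createdAtMs?: number; expiresAtMs?: number };

// Buggy shape (reconstructed): when createdAtMs is absent the anchor becomes "now",
// so the computed expiry moves forward on every sweep and is never reached.
function isExpiredBuggy(approval: Approval, now: number): boolean {
  const expiresAt =
    approval.expiresAtMs ?? (approval.createdAtMs ?? now) + PENDING_APPROVAL_DEFAULT_TTL_MS;
  return now >= expiresAt;
}

// Fixed shape: the anchor is recorded once, when the entry is first tracked.
function isExpiredFixed(entry: { approval: Approval; trackedAtMs: number }, now: number): boolean {
  const expiresAt =
    entry.approval.expiresAtMs ?? entry.trackedAtMs + PENDING_APPROVAL_DEFAULT_TTL_MS;
  return now >= expiresAt;
}
```

For an approval with neither `createdAtMs` nor `expiresAtMs`, `isExpiredBuggy` returns false no matter how large `now` is, while `isExpiredFixed` returns true once `now` reaches `trackedAtMs` plus 30 minutes. This is the case covered by regression test 4 in the test plan.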

Changed files

  • src/mcp/channel-bridge.test.ts (added, +202/-0)
  • src/mcp/channel-bridge.ts (modified, +85/-16)
  • src/mcp/channel-server.test.ts (modified, +6/-3)

Summary

OpenClawChannelBridge (src/mcp/channel-bridge.ts) holds two instance-bound pending Maps that lack the same three guards their siblings already have, so a long-running openclaw mcp serve process accumulates entries monotonically.

| Collection | Cap | Close-clear | TTL / sweeper |
| --- | --- | --- | --- |
| queue (Array, L48) | QUEUE_LIMIT=1000 while-shift (L355) | n/a | n/a |
| pendingWaiters (Set, L49) | n/a | close() clears (L148) | per-waiter setTimeout fallback (L265) |
| pendingClaudePermissions (Map, L50) | none | none | none |
| pendingApprovals (Map, L51) | none | none | expiresAtMs is stored at L382 but never drives a timer/sweeper |

Why it matters

  • pendingClaudePermissions: Claude SDK sends a notifications/claude/channel/permission_request for every tool call (channel-server.ts:42-49). The entry is only removed when the operator replies with yes <id> / no <id> matching /^(yes|no)\s+([a-km-z]{5})$/i (channel-bridge.ts:455-459). Missed/typo'd/cross-channel responses leave the entry forever. requestId is fresh per request so there is no overwrite.
  • pendingApprovals: trackApproval (L369-) sets the entry on exec.approval.requested / plugin.approval.requested and only deletes on the matching *.resolved event (L389). A gateway WebSocket drop (channel-bridge.ts:122-124 onClose is reject-only) silently loses the resolved frame and the entry persists.
  • close() (L136-152) clears pendingWaiters but does not clear either pending Map.

In a long-running openclaw mcp serve process (one that stays up for hours, with claudeChannelMode != 'off' and busy automation), both Maps grow monotonically.
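
The pendingApprovals leak path can be reproduced in miniature. The sketch below models the pre-fix behaviour as described in the bullets above; the event payload shape (an `id` field), the `onGatewayEvent` name, and the numbers are assumptions chosen for the demonstration.

```typescript
// Hypothetical event shapes; only the event type strings come from the issue text.
type GatewayEvent =
  | { type: "exec.approval.requested"; id: string; expiresAtMs?: number }
  | { type: "exec.approval.resolved"; id: string };

const pendingApprovals = new Map<string, { id: string; expiresAtMs?: number }>();

// Pre-fix behaviour as described above: the only delete is on the matching resolved event.
function onGatewayEvent(event: GatewayEvent): void {
  if (event.type === "exec.approval.requested") {
    pendingApprovals.set(event.id, event);
  } else {
    pendingApprovals.delete(event.id);
  }
}

// Simulate 100 approvals where every resolved frame after the 10th is lost to a dropped socket.
for (let i = 0; i < 100; i++) {
  const id = `approval-${i}`;
  onGatewayEvent({ type: "exec.approval.requested", id });
  const socketUp = i < 10;
  if (socketUp) onGatewayEvent({ type: "exec.approval.resolved", id });
}

console.log(pendingApprovals.size); // 90 entries remain, with nothing left to evict them
```

Each lost resolved frame leaves one entry behind permanently, so the Map size tracks the number of dropped frames over the life of the process.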

Affected paths

  • src/mcp/channel-bridge.ts:50 (pendingClaudePermissions decl)
  • src/mcp/channel-bridge.ts:51 (pendingApprovals decl)
  • src/mcp/channel-bridge.ts:136-152 (close() body — pending Map clears missing)
  • src/mcp/channel-bridge.ts:273-295 (handleClaudePermissionRequest set without TTL)
  • src/mcp/channel-bridge.ts:369-391 (trackApproval stores expiresAtMs but no expiry mechanism)

Severity

P2 — slow memory-growth leak in opt-in long-running CLI path. Not an immediate OOM, but unbounded over time. Maintainer-priority axes hit: memory + reliability + plugin loading (MCP).

Notes

  • AI-assisted (openclaw-audit pipeline, FIND-mcp-memory-001/002 → CAND-025 epic).
  • The 5-agent post-harness cross-review and the 3-agent pre-PR cross-review both returned "proceed" (consensus that the problem is real and the fix is sufficient).
  • Upstream-dup check clean: the last 6 weeks of channel-bridge.ts commits are all refactor/seam-split work that does not touch the leak; PR #56420 (sessionKey binding, OPEN) is orthogonal.
  • Fix PR follows.

extent analysis

TL;DR

Implementing TTL or sweeper mechanisms for pendingClaudePermissions and pendingApprovals Maps can mitigate the memory leak.

Guidance

  • Add a TTL (time-to-live) mechanism to pendingClaudePermissions to automatically remove entries after a certain time period.
  • Implement a sweeper mechanism to periodically clean up expired entries in pendingApprovals based on the stored expiresAtMs value.
  • Modify the close() method to clear both pendingClaudePermissions and pendingApprovals Maps.
  • Consider adding a maximum size limit to the Maps to prevent unbounded growth.

Example

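// Example of implementing TTL for pendingClaudePermissions
const TTL_MS = 60 * 60 * 1000; // 1 hour
const SWEEP_INTERVAL_MS = 60 * 1000; // sweep every minute
// Each value is expected to record its insertion time as createdAtMs.
const pendingClaudePermissions = new Map();
const sweeper = setInterval(() => {
  const now = Date.now();
  for (const [requestId, entry] of pendingClaudePermissions) {
    if (now - entry.createdAtMs > TTL_MS) {
      pendingClaudePermissions.delete(requestId);
    }
  }
}, SWEEP_INTERVAL_MS);
sweeper.unref(); // Node.js: do not keep the process alive just for the sweeper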

Notes

The provided example is a basic illustration and may need to be adapted to the specific requirements of the openclaw mcp serve process.

Recommendation

Apply a workaround by implementing TTL or sweeper mechanisms for the affected Maps, as this will help mitigate the memory leak and prevent unbounded growth.
