openclaw - ✅(Solved) Fix Memory leak: Gateway process memory grows indefinitely (OpenClaw 2026.4.5-4.9) [1 pull requests, 1 comments, 2 participants]

GitHub stats
  • Issue: openclaw/openclaw#63643
  • Fetched: 2026-04-10 03:42:26
  • Comments: 1
  • Participants: 2
  • Timeline events: 6 (referenced ×4, commented ×1, cross-referenced ×1)
  • Reactions: 0

Fix Action

Fixed

PR fix notes

PR #63709: fix: plug three memory leaks in long-running gateway (#63643)

Description (problem / solution / changelog)

Summary

Fixes #63643

Long-running gateway instances (24/7 deployments) exhibit steady memory growth of ~15-20 MB/hour even when idle. This PR plugs three independent memory leaks found via deep code audit.

Root Cause

Three module-level Map instances grow monotonically without cleanup:

1. seqByRun in agent-events.ts

clearAgentRunContext(runId) deletes the run context from runContextById but forgets to delete the corresponding sequence counter from seqByRun. Every completed agent run (including heartbeats) leaves a permanent runId → number entry.

Fix: Call seqByRun.delete(runId) in clearAgentRunContext, so the sequence counter is removed together with the run context.
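
The shape of this leak and its fix can be sketched as follows. This is a minimal illustration, not the actual contents of agent-events.ts: the names runContextById, seqByRun, and clearAgentRunContext come from the PR description, while AgentRunContext, registerAgentRunContext, and nextSeq are hypothetical stand-ins.

```typescript
// Hypothetical sketch; the real agent-events.ts is not shown in the source.
type AgentRunContext = { sessionKey?: string }; // assumed shape

const runContextById = new Map<string, AgentRunContext>();
const seqByRun = new Map<string, number>();

// Assumed helper: returns the next 1-based sequence number for a run.
function nextSeq(runId: string): number {
  const next = (seqByRun.get(runId) ?? 0) + 1;
  seqByRun.set(runId, next);
  return next;
}

// Assumed helper: stores the context for a run.
function registerAgentRunContext(runId: string, ctx: AgentRunContext): void {
  runContextById.set(runId, ctx);
}

function clearAgentRunContext(runId: string): void {
  runContextById.delete(runId);
  // The fix: drop the sequence counter together with the context.
  // Without this line, every finished run leaves one entry behind.
  seqByRun.delete(runId);
}
```

After clearAgentRunContext(runId), a later run that reuses the same runId starts again at sequence 1, which is the behavior the regression test in the PR checks.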

2. controlPlaneBuckets in control-plane-rate-limit.ts

Each unique deviceId|clientIp combination creates a rate-limit bucket. Buckets are reset when their window expires, but the map entry is never removed. With connection churn (especially connId fallback keys), the map grows without bound.

Fix: Add pruneStaleControlPlaneBuckets() that removes buckets whose window expired more than 5 minutes ago. Called from the existing 60s gateway maintenance timer.
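
A pruning pass of this kind might look like the sketch below. Only the names controlPlaneBuckets and pruneStaleControlPlaneBuckets and the 5-minute grace period come from the PR description; the Bucket shape, the WINDOW_MS value, and the nowMs parameter are assumptions made for illustration.

```typescript
// Hypothetical sketch; the real bucket layout in
// control-plane-rate-limit.ts is not shown in the source.
type Bucket = { count: number; windowStartMs: number }; // assumed shape

const WINDOW_MS = 60_000; // assumed rate-limit window length
const STALE_GRACE_MS = 5 * 60_000; // 5 minutes, per the PR description

const controlPlaneBuckets = new Map<string, Bucket>();

// Removes buckets whose window ended more than STALE_GRACE_MS ago.
// Returns the number of removed entries. Safe on an empty map.
function pruneStaleControlPlaneBuckets(nowMs: number = Date.now()): number {
  let removed = 0;
  // Deleting the entry currently being visited is safe during Map iteration.
  controlPlaneBuckets.forEach((bucket, key) => {
    const windowEndMs = bucket.windowStartMs + WINDOW_MS;
    if (nowMs - windowEndMs > STALE_GRACE_MS) {
      controlPlaneBuckets.delete(key);
      removed += 1;
    }
  });
  return removed;
}
```

Passing the current time as a parameter keeps the function deterministic in tests; a periodic caller such as the 60s maintenance timer can rely on the Date.now() default.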

3. TRANSCRIPT_SESSION_KEY_CACHE in session-transcript-key.ts

Each unique transcript file path is cached permanently with no TTL or size limit.

Fix: Add a 256-entry LRU cap using Map insertion order.
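
A Map iterates its keys in insertion order, so the first key is always the oldest one and can be evicted when the cap is exceeded. The sketch below shows one way to build a 256-entry cap on that property. The cache name and the cap come from the PR description; getCachedSessionKey, setCachedSessionKey, and the string value type are hypothetical, and whether the real code refreshes recency on reads is not stated in the source.

```typescript
// Hypothetical sketch; the real session-transcript-key.ts is not shown.
const MAX_CACHE_ENTRIES = 256; // cap stated in the PR description

const TRANSCRIPT_SESSION_KEY_CACHE = new Map<string, string>();

// Assumed accessor: on a hit, re-insert the entry so it becomes the newest.
function getCachedSessionKey(transcriptPath: string): string | undefined {
  const hit = TRANSCRIPT_SESSION_KEY_CACHE.get(transcriptPath);
  if (hit !== undefined) {
    TRANSCRIPT_SESSION_KEY_CACHE.delete(transcriptPath);
    TRANSCRIPT_SESSION_KEY_CACHE.set(transcriptPath, hit);
  }
  return hit;
}

// Assumed mutator: insert as newest, then evict the oldest if over the cap.
function setCachedSessionKey(transcriptPath: string, sessionKey: string): void {
  TRANSCRIPT_SESSION_KEY_CACHE.delete(transcriptPath);
  TRANSCRIPT_SESSION_KEY_CACHE.set(transcriptPath, sessionKey);
  if (TRANSCRIPT_SESSION_KEY_CACHE.size > MAX_CACHE_ENTRIES) {
    // The first key in iteration order is the least recently used.
    const oldest = TRANSCRIPT_SESSION_KEY_CACHE.keys().next().value;
    if (oldest !== undefined) {
      TRANSCRIPT_SESSION_KEY_CACHE.delete(oldest);
    }
  }
}
```

The delete-then-set pair is what moves an existing key to the end of the iteration order; a plain set on an existing key would update the value but keep its original position.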

Testing

  • agent-events regression test: Verifies that clearAgentRunContext resets the sequence counter (seq restarts from 1), proving the old entry was deleted.
  • control-plane-rate-limit test: Verifies stale bucket pruning and empty-map safety.

All 11 tests pass.

Impact

  • Fixes unbounded memory growth in long-running gateway processes
  • Zero behavior change for existing functionality
  • Minimal code footprint: +161/-1 lines across 9 files (totals from the changed-files list below)

Root cause analysis and fix by 鲁班, cross-validated by B仔.

Changed files

  • src/gateway/control-plane-rate-limit.test.ts (added, +40/-0)
  • src/gateway/control-plane-rate-limit.ts (modified, +27/-0)
  • src/gateway/server-maintenance.ts (modified, +5/-0)
  • src/gateway/server-methods/nodes.ts (modified, +9/-0)
  • src/gateway/server/ws-connection.ts (modified, +2/-0)
  • src/gateway/session-transcript-key.test.ts (modified, +40/-0)
  • src/gateway/session-transcript-key.ts (modified, +11/-0)
  • src/infra/agent-events.test.ts (modified, +24/-0)
  • src/infra/agent-events.ts (modified, +3/-1)

Bug Description

Environment:

  • Platform: Windows (Node.js standalone gateway)
  • Version: OpenClaw 2026.4.5 (later upgraded to 4.9, issue persisted)
  • Node.js: v22.22.2
  • Gateway mode: local

Problem: The Gateway process (openclaw gateway) exhibits a progressive memory leak. Over the course of approximately 12-24 hours of idle operation, the process memory footprint grows from ~180MB to ~400MB+, eventually requiring a restart.

Symptoms:

  • Memory usage climbs steadily even when the agent is idle (no active conversations)
  • The leak is observable in Windows Task Manager
  • No OOM crashes observed yet, but memory growth is consistent and monotonic
  • After restarting the gateway, memory resets to baseline (~180MB)

Timeline of observation:

  • Observed over multiple days (2026-04-07 to 2026-04-09)
  • Memory growth rate: approximately 15-20MB per hour during idle
  • Gateway was started clean on 2026-04-09 after upgrading from 4.5 to 4.9
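
The reported growth rate can be cross-checked against the reported endpoints with simple arithmetic. The helper below is a hypothetical illustration (MemorySample and growthRateMbPerHour are not part of OpenClaw): growth from ~180 MB to ~400 MB is 220 MB, which is about 18 MB/hour if it took 12 hours and about 9 MB/hour if it took the full 24 hours.

```typescript
// Hypothetical helper for estimating a leak rate from manual readings,
// for example values read from Task Manager.
type MemorySample = { hoursSinceStart: number; memoryMb: number };

// Average growth in MB/hour between the first and last sample.
function growthRateMbPerHour(samples: MemorySample[]): number {
  if (samples.length < 2) {
    throw new Error("need at least two samples");
  }
  const first = samples[0];
  const last = samples[samples.length - 1];
  const elapsedHours = last.hoursSinceStart - first.hoursSinceStart;
  if (elapsedHours <= 0) {
    throw new Error("samples must span a positive time interval");
  }
  return (last.memoryMb - first.memoryMb) / elapsedHours;
}

// Endpoints from the report, assuming the growth took 12 hours.
const twelveHourRate = growthRateMbPerHour([
  { hoursSinceStart: 0, memoryMb: 180 },
  { hoursSinceStart: 12, memoryMb: 400 },
]);
```

A rate that stays roughly constant across several such intervals, as opposed to one that flattens out, is what distinguishes a monotonic leak from ordinary warm-up allocation.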

Additional context:

  • Session locks accumulate over time (observed 3-4 lock files, though processes were alive)
  • Plugin system loaded (46 plugins loaded, 52 disabled)
  • No active cron jobs causing frequent wakeups during observation period

Expected behavior: Gateway process memory should remain stable during idle operation without continuous growth.

Possible related factors:

  • Session transcript accumulation
  • Periodic heartbeat/keepalive mechanism
  • Session store (sessions.json) growing to 109 entries
  • Hooks system (emotion-analyzer, exec-guard, emotion-dynamic-soul) enabled

Reported via OpenClaw agent (self-reported during normal operation)

extent analysis

TL;DR

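Based on the issue report alone, the leading suspects for the progressive memory leak in the OpenClaw gateway process are the accumulation of session locks and the growth of the session store. PR #63709 instead traces the growth to three module-level Map instances that are never pruned.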

Guidance

  • Investigate the session transcript accumulation and session store growth to determine if they are contributing to the memory leak.
  • Review the plugin system and hooks configuration to ensure that they are not causing unnecessary memory allocation or retention.
  • Monitor the gateway process for any periodic tasks or cron jobs that may be triggering the memory growth, even if none were observed during the initial investigation.
  • Consider implementing a mechanism to periodically clean up or rotate the session store to prevent it from growing indefinitely.

Example

No specific code snippet can be provided without further information on the OpenClaw gateway's internal implementation. However, a potential approach to addressing the session store growth could involve implementing a periodic task to clean up or archive old session entries.

Notes

The issue report by itself does not pin down the cause; it points to the session store and plugin system as possible contributing factors. The root-cause analysis in PR #63709 attributes the growth to three unbounded module-level maps: seqByRun, controlPlaneBuckets, and TRANSCRIPT_SESSION_KEY_CACHE.

Recommendation

If a build containing PR #63709 is not available, a periodic task to clean up or rotate the session store can be tried as a workaround. The PR's root-cause analysis does not list the session store among the leaks, so this may not stop the growth.
