openclaw - 💡(How to fix) Fix Memory bloat from unbounded session/task record accumulation [1 participants]

GitHub stats
openclaw/openclaw#73114 (fetched 2026-04-28 06:27:26)
Comments: 0 · Participants: 1 · Timeline: 1 (cross-referenced ×1) · Reactions: 0


Problem

Gateway memory usage grows to 2.5GB+ within hours due to unbounded accumulation of session and task records.

Detailed Analysis

1. sessions.json — 471 entries, 26.8MB on disk, loaded entirely at startup

Every session ever created is stored and reloaded:

| Session Type | Count | Size  |
|--------------|-------|-------|
| subagent     | 375   | 18MB  |
| cron         | 89    | 8MB   |
| dreaming     | 5     | 470KB |
| main         | 1     | 50KB  |
| feishu       | 1     | 93KB  |

Each entry has 29 fields, including a skillsSnapshot (~41KB per entry). That is 471 × 41KB ≈ 19MB just for duplicated skills snapshots — the same skill prompt repeated 471 times.

In V8, parsed JSON objects are estimated to take 5–7x their serialized size, so 26.8MB on disk becomes roughly 135–190MB in heap (26.8 × 5 ≈ 134, 26.8 × 7 ≈ 188).

2. Zombie running tasks (13 tasks, oldest 26 days)

openclaw tasks list --status running reported 13 tasks still in running state, with creation dates from April 2 to April 21. These were dead subagent/exec-approval CLI tasks whose processes had terminated, but the task state was never updated.

After manual cancellation: task count went from 377 → 364.
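
The manual cleanup above could in principle be automated by a periodic sweep. The following is a minimal sketch, not OpenClaw's implementation: the `TaskRecord` shape, the `sweepZombies` function, and the injected `isAlive` check are all hypothetical names chosen for illustration. Only the `failed` and `timed_out` states come from the issue text. In Node.js, a liveness check is commonly written as `process.kill(pid, 0)` inside a try/catch, which is why it is passed in as a parameter here.

```typescript
// Hypothetical task shape; field names are illustrative, not OpenClaw's schema.
interface TaskRecord {
  id: string;
  status: "running" | "succeeded" | "failed" | "timed_out" | "cancelled";
  pid?: number;      // process id of the CLI runtime, if it was recorded
  createdAt: number; // epoch milliseconds
}

// Injected liveness check so the sweep stays a pure, testable function.
type IsAlive = (pid: number) => boolean;

// Returns a new list in which dead or overdue "running" tasks are closed out.
function sweepZombies(
  tasks: TaskRecord[],
  isAlive: IsAlive,
  now: number,
  maxAgeMs: number,
): TaskRecord[] {
  return tasks.map((task): TaskRecord => {
    if (task.status !== "running") return task;
    // The process is known and has exited: mark the task as failed.
    if (task.pid !== undefined && !isAlive(task.pid)) {
      return { ...task, status: "failed" };
    }
    // No process id was recorded and the task is older than the limit.
    if (task.pid === undefined && now - task.createdAt > maxAgeMs) {
      return { ...task, status: "timed_out" };
    }
    return task;
  });
}
```

Tasks that are still alive, or already in a terminal state, pass through unchanged, so the sweep is safe to run repeatedly.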

3. task_runs SQLite — 377 records, 12MB

No TTL or archival. Records accumulate forever.
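
A TTL policy for these records can be sketched as a simple partition by age. This is an illustration under stated assumptions: the `TaskRun` shape, the `finishedAt` field, and the `splitByTtl` function are hypothetical, and the real `task_runs` schema is not shown in the issue. Records that are still running (no `finishedAt`) are always kept.

```typescript
// Hypothetical row shape; the real task_runs columns are not given in the issue.
interface TaskRun {
  id: string;
  status: string;
  finishedAt?: number; // epoch milliseconds; undefined while still running
}

interface TtlSplit {
  keep: TaskRun[];
  archive: TaskRun[];
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Splits records into those to keep and those whose TTL has expired.
function splitByTtl(runs: TaskRun[], now: number, ttlDays: number): TtlSplit {
  const cutoff = now - ttlDays * DAY_MS;
  const split: TtlSplit = { keep: [], archive: [] };
  for (const run of runs) {
    // Only finished records can expire; running ones are never archived.
    const expired = run.finishedAt !== undefined && run.finishedAt < cutoff;
    (expired ? split.archive : split.keep).push(run);
  }
  return split;
}
```

The `archive` half could then be written to a separate store and deleted from the live table in one transaction.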

4. Memory breakdown (vmmap)

| Region                         | Resident             |
|--------------------------------|----------------------|
| Memory Tag 255 (V8 JIT + heap) | 2.1 GB               |
| MALLOC zones                   | 340 MB               |
| ReadOnly Libraries (code)      | 320 MB               |
| Total Physical                 | 2.5 GB (peak 2.8 GB) |

After restart: dropped to 1.2 GB, but will grow again as sessions accumulate.

Root Cause

  • No session archival: completed subagent/cron sessions are never removed from sessions.json
  • No TTL on task records: task_runs grows unbounded
  • skillsSnapshot duplication: each of the 471 entries stores the full list of available skills (~41KB each), totaling ~19MB of duplicated data
  • V8 rarely returns memory: once the heap has grown, its pages are not promptly released to the OS, so resident memory stays high even after garbage collection
  • Zombie task detection gap: cli runtime tasks that die are not auto-transitioned to failed/timed_out

Suggested Fixes

  1. Session archival: Archive completed sessions older than N days (default 7d) to a separate file. Only keep active/recent sessions in the live sessions.json.
  2. skillsSnapshot dedup: Store skills snapshot once, reference by version hash in each session entry.
  3. Task TTL: Auto-archive task_runs records older than N days.
  4. Zombie task detection: Periodically check cli runtime tasks whose processes have exited and update their status.
  5. Memory-aware pruning: When heap exceeds a threshold, trigger automatic session archival.
  6. Lazy loading: Do not load all historical session metadata at startup.
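
Fixes 1 and 5 can be sketched together as follows. This is a rough illustration, not the project's code: the `SessionEntry` fields, `archiveSessions`, and `archiveIfHeapHigh` are hypothetical names, and the 7-day default is taken from the suggestion above. The heap figure is passed in as a number; in Node.js it could come from `process.memoryUsage().heapUsed`.

```typescript
// Hypothetical session shape; only the fields this sketch needs.
interface SessionEntry {
  type: string;       // e.g. "subagent", "cron", "main"
  completed: boolean;
  updatedAt: number;  // epoch milliseconds
}

type SessionStore = Record<string, SessionEntry>;

interface ArchiveResult {
  live: SessionStore;
  archived: SessionStore;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Fix 1: move completed sessions older than maxAgeDays out of the live store.
function archiveSessions(
  store: SessionStore,
  now: number,
  maxAgeDays = 7,
): ArchiveResult {
  const cutoff = now - maxAgeDays * MS_PER_DAY;
  const live: SessionStore = {};
  const archived: SessionStore = {};
  for (const [key, entry] of Object.entries(store)) {
    const stale = entry.completed && entry.updatedAt < cutoff;
    (stale ? archived : live)[key] = entry;
  }
  return { live, archived };
}

// Fix 5: run the archival only when heap usage exceeds a threshold.
function archiveIfHeapHigh(
  store: SessionStore,
  now: number,
  heapUsedBytes: number,
  thresholdBytes: number,
  maxAgeDays = 7,
): ArchiveResult {
  if (heapUsedBytes <= thresholdBytes) return { live: store, archived: {} };
  return archiveSessions(store, now, maxAgeDays);
}
```

Active sessions are never archived regardless of age, so a long-running main session stays in the live store.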

Environment

  • OpenClaw: 2026.4.24
  • Node.js: v23.11.0
  • OS: macOS 26.3.1 (arm64)
  • Memory: 16 GB (gateway using 2.5 GB / 15.4%)

extent analysis

TL;DR

Implement session archival and task TTL to prevent unbounded memory growth.

Guidance

  • Archive completed sessions older than a specified number of days (e.g., 7 days) to reduce memory usage.
  • Implement a TTL (time to live) for task records to auto-archive older records and prevent infinite growth.
  • Consider deduplicating the skillsSnapshot data to reduce memory usage.
  • Periodically check for zombie tasks and update their status so dead tasks do not linger in the running state.
  • Explore lazy loading of historical session metadata to reduce memory usage at startup.

Example

// Example of deduplicated skillsSnapshot
{
  "sessions": [
    {
      "id": 1,
      "skillsSnapshot": "v1"
    },
    {
      "id": 2,
      "skillsSnapshot": "v1"
    }
  ],
  "skillsSnapshots": {
    "v1": ["skill1", "skill2", ...]
  }
}
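
A function that produces the layout above could be sketched as follows. The names `dedupeSnapshots`, `RawSession`, and `fnv1a` are hypothetical, and the snapshot is assumed to be a list of strings as in the example. The FNV-1a hash is a small stand-in used to keep the sketch dependency-free; a real implementation would more likely use a cryptographic hash such as SHA-256 to rule out collisions.

```typescript
// Input: each session carries its own full copy of the snapshot.
interface RawSession {
  id: number;
  skillsSnapshot: string[];
}

// Output: each session holds only a hash reference to a shared snapshot.
interface SlimSession {
  id: number;
  skillsSnapshot: string;
}

interface DedupedStore {
  sessions: SlimSession[];
  skillsSnapshots: Record<string, string[]>;
}

// 32-bit FNV-1a over UTF-16 code units (illustrative, not collision-proof).
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, "0");
}

// Stores each distinct snapshot once and replaces per-session copies with its key.
function dedupeSnapshots(sessions: RawSession[]): DedupedStore {
  const skillsSnapshots: Record<string, string[]> = {};
  const slim = sessions.map((session): SlimSession => {
    const key = fnv1a(JSON.stringify(session.skillsSnapshot));
    if (!(key in skillsSnapshots)) skillsSnapshots[key] = session.skillsSnapshot;
    return { id: session.id, skillsSnapshot: key };
  });
  return { sessions: slim, skillsSnapshots };
}
```

With 471 sessions sharing one ~41KB snapshot, this layout would store the snapshot once plus 471 short keys instead of roughly 19MB of copies.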

Notes

The provided suggestions are based on the analysis of the issue and may require additional implementation details. The effectiveness of these suggestions may vary depending on the specific use case and requirements of the application.

Recommendation

Apply the suggested fixes, starting with session archival and task TTL, to prevent unbounded memory growth and reduce the memory usage of the gateway.
