openclaw - ✅(Solved) Fix Session trajectory-path files accumulate indefinitely, causing node.list latency to degrade over time (20-35s observed at ~950 sessions) [1 pull requests, 1 comments, 2 participants]

openclaw/openclaw#73000, fetched 2026-04-28 06:28:52

Session trajectory-path files (.trajectory-path.json) are never pruned from the agent sessions directory. As the gateway accumulates sessions over days of operation, directory scan–dependent operations — particularly node.list and session-write-lock — degrade proportionally. At ~950 files we observe node.list calls taking 20–35 seconds, causing UI stalls, missed Slack WebSocket pong frames, and cron scheduling delays.

This is distinct from #72826 (synchronous sessionStore maintenance / OOM), which is closed and fixed in 2026.4.25. That fix addresses sessions.json write contention; it does not address trajectory-path file accumulation or directory-scan-based latency.


PR fix notes

PR #72978: fix(gateway): limit session row enrichment

Description (problem / solution / changelog)

Summary

  • avoid building enriched session rows before sessions.list filtering/sorting/limit is applied
  • avoid cloning session stores in the gateway combined-store read path
  • add a regression test proving transcript-backed enrichment is not run for rows outside the requested limit
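
The first bullet describes a common ordering fix: apply the cheap steps (filter, sort, limit) first and run the expensive enrichment only on the rows that survive. The sketch below illustrates that idea under stated assumptions. The names `SessionRow`, `EnrichedSessionRow`, `ListOptions`, and `listSessions` are hypothetical and are not taken from the PR; the real gateway types and sort order are not shown in the source.

```typescript
// Hypothetical row shapes; the real gateway types are not shown in the source.
interface SessionRow {
  id: string;
  updatedAt: number; // epoch milliseconds
}

interface EnrichedSessionRow extends SessionRow {
  title: string; // stands in for any transcript-backed field
}

interface ListOptions {
  limit: number;
  filter?: (row: SessionRow) => boolean;
}

// Filter, sort, and truncate the cheap rows first, then enrich only the survivors.
function listSessions(
  rows: readonly SessionRow[],
  options: ListOptions,
  enrich: (row: SessionRow) => EnrichedSessionRow,
): EnrichedSessionRow[] {
  const keep = options.filter ?? (() => true);
  return rows
    .filter(keep) // returns a new array, so the sort below does not mutate the input
    .sort((a, b) => b.updatedAt - a.updatedAt) // newest first (assumed ordering)
    .slice(0, Math.max(0, options.limit))
    .map((row) => enrich(row)); // enrichment runs at most `limit` times
}
```

With this ordering, the number of enrichment calls is bounded by the requested limit rather than by the total number of stored sessions, which is the property the regression test in the third bullet is described as checking.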

Testing

  • pnpm exec vitest run --config test/vitest/vitest.gateway.config.ts src/gateway/session-utils.subagent.test.ts
  • pnpm build
  • pnpm ui:build

Changed files

  • src/config/sessions/combined-store-gateway.ts (modified, +2/-2)
  • src/gateway/session-utils.subagent.test.ts (modified, +76/-0)
  • src/gateway/session-utils.ts (modified, +49/-19)


Environment

  • OpenClaw version: 2026.4.24 (v2026.4.25 with #72826 fix not yet installed)
  • Host: 2 vCPU, 8 GB RAM, Ubuntu 24.04
  • Agent: main (primary agent handling WhatsApp, Slack, and ~50 scheduled crons)
  • Operational age: ~7 days continuous operation

Observed data

sessions directory

$ ls ~/.openclaw/agents/main/sessions/*.trajectory-path.json | wc -l
949

$ du -sh ~/.openclaw/agents/main/sessions/
29M    ~/.openclaw/agents/main/sessions/

$ ls -lh ~/.openclaw/agents/main/sessions/sessions.json
-rw------- 1 root root 759K  sessions.json     <- capped at 25 entries via maxEntries

Despite session.maintenance.maxEntries=25 keeping sessions.json at 759 KB, the directory holds 949 .trajectory-path.json pointer files and 16 active .trajectory.jsonl files (largest: 2.3 MB). Every session ever created leaves a permanent .trajectory-path.json entry that is never removed.

node.list latency (gateway journal, 2026-04-27)

[ws] res node.list 34632ms
[ws] res node.list 20231ms

Early in the week, when the directory held far fewer files, node.list was sub-second. The web UI polls node.list on every page load; at 20-35s, the control panel becomes unusable during stalls.
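
To track how often these stalls occur, the journal lines above can be parsed mechanically. The following is a minimal sketch that assumes the log format is exactly as shown (`[ws] res node.list <n>ms`); the function names `nodeListLatenciesMs` and `countSlowCalls` are hypothetical helpers, not part of OpenClaw.

```typescript
// Matches lines such as "[ws] res node.list 34632ms" from the gateway journal.
// The format is taken from the excerpts in this report; other variants are not handled.
const NODE_LIST_LINE = /\[ws\] res node\.list (\d+)ms/;

// Extract every node.list latency (in milliseconds) from a block of journal text.
function nodeListLatenciesMs(journal: string): number[] {
  const latencies: number[] = [];
  for (const line of journal.split("\n")) {
    const match = NODE_LIST_LINE.exec(line);
    if (match) {
      latencies.push(Number(match[1]));
    }
  }
  return latencies;
}

// Count calls at or above a threshold, e.g. 5000 ms for the Slack pong timeout
// mentioned under Impact.
function countSlowCalls(latenciesMs: readonly number[], thresholdMs: number): number {
  return latenciesMs.filter((ms) => ms >= thresholdMs).length;
}
```

Feeding a day's journal through `nodeListLatenciesMs` and comparing the counts before and after a cleanup would give a rough check on whether latency follows the file count.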

session-write-lock contention (same day)

[session-write-lock] releasing lock held for 73416ms (max=15000ms):
  ~/.openclaw/agents/main/sessions/sessions.json.lock

Lock held for 73 seconds vs. the documented 15-second max.

Previous day (2026-04-26, ~400 fewer files)

[ws] res node.list 43422ms     <- full event-loop stall

Latency tracks file count, not gateway uptime.


Root cause hypothesis

node.list (and related operations that walk the session store) appears to stat or read entries in the sessions directory, so its cost grows in proportion to the number of files present. .trajectory-path.json files are created for every session but never removed, even when the session is pruned from sessions.json by the maxEntries maintenance pass.

The result: sessions.json stays bounded by maxEntries, but the directory grows without bound. Orphaned trajectory-path files (and their .jsonl counterparts) continue to inflate directory-scan costs indefinitely.


Suggested fixes

  1. Prune orphaned .trajectory-path.json files when a session is evicted from sessions.json by the maintenance pass. The corresponding .jsonl and any .jsonl.reset.* rotations should be removed at the same time.

  2. Include trajectory-path cleanup in openclaw sessions cleanup --enforce (added in #72826). The command currently enforces maxEntries on sessions.json only; it should also sweep the directory and remove .trajectory-path.json / .trajectory.jsonl files whose session ID is no longer present in sessions.json.

  3. Make node.list independent of trajectory-path file count. If the operation does a readdir to enumerate active sessions, it should only consult sessions.json (bounded by maxEntries) rather than the on-disk directory. The on-disk count and the in-memory store count are already intentionally divergent once eviction begins.
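
The orphan sweep proposed in fixes 1 and 2 reduces to a set difference between the files on disk and the session IDs still present in sessions.json. The sketch below shows only that selection step, as a pure function with no file-system access. It assumes file names have the form `<sessionId>.trajectory-path.json` or `<sessionId>.trajectory.jsonl`, optionally followed by `.reset.<something>` for rotated .jsonl files; the actual naming scheme is not confirmed by the source, and both function names are hypothetical.

```typescript
// Assumed suffixes, based on the file kinds named in this report.
const SESSION_FILE_SUFFIXES = [".trajectory-path.json", ".trajectory.jsonl"] as const;

// Extract the session ID from a file name, or return null for unrelated files
// such as sessions.json or sessions.json.lock.
function sessionIdFromFileName(fileName: string): string | null {
  // Treat "x.jsonl.reset.<anything>" as a rotation of "x.jsonl" (assumed convention).
  const base = fileName.replace(/(\.jsonl)\.reset\..+$/, "$1");
  for (const suffix of SESSION_FILE_SUFFIXES) {
    if (base.endsWith(suffix) && base.length > suffix.length) {
      return base.slice(0, -suffix.length);
    }
  }
  return null;
}

// Return the file names whose session ID is no longer present in the live set.
function findOrphanedSessionFiles(
  fileNames: readonly string[],
  liveSessionIds: ReadonlySet<string>,
): string[] {
  return fileNames.filter((name) => {
    const id = sessionIdFromFileName(name);
    return id !== null && !liveSessionIds.has(id);
  });
}
```

A caller would pass the directory listing and the IDs read from sessions.json, then decide what to do with the result. Because this report notes that deleting files without knowing which are actively referenced is risky, a real implementation would likely run inside the maintenance pass, where the set of live sessions is authoritative, rather than as an external script.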


Workaround (applied locally)

None applied: manually deleting trajectory files is risky without knowing which are actively referenced. We run a daily gateway restart as a partial mitigation (it clears in-memory pressure but does not reduce the on-disk file count).

Note: Upgrading to 2026.4.25 (which fixes #72826) is expected to reduce session-write-lock hold time via async maintenance, but will not reduce node.list latency since the trajectory-path file count is unaffected.


Impact

  • UI lagginess: Control panel becomes unresponsive during 20-35s node.list stalls
  • Slack socket reconnections: 34-second event-loop stalls cause @slack/socket-mode's 5s pong timeout to fire, producing ~50-90 socket reconnects per 15-minute window
  • Cron scheduling jitter: Jobs fire late because the scheduler cannot acquire the event loop during a stall

Happy to share the directory listing or sessions.json (redacted) if useful.

extent analysis

TL;DR

Prune orphaned .trajectory-path.json files when a session is evicted from sessions.json to reduce directory scan latency.

Guidance

  • Identify and remove .trajectory-path.json files whose session ID is no longer present in sessions.json to prevent directory growth.
  • Consider modifying node.list to only consult sessions.json instead of the on-disk directory to reduce latency.
  • Implement a cleanup mechanism, such as including trajectory-path cleanup in openclaw sessions cleanup --enforce, to regularly remove orphaned files.
  • Monitor directory size and file count to ensure the fix is effective.

Example

No code snippet is provided as the issue does not require a specific code change, but rather a logical modification to the existing session maintenance process.

Notes

The provided openclaw version (2026.4.24) does not have the fix for #72826, which addresses a different issue. Upgrading to 2026.4.25 will not resolve the node.list latency issue.

Recommendation

Prune orphaned .trajectory-path.json files as part of evicting a session from sessions.json, since this directly addresses the hypothesized root cause of the issue.
