openclaw - 💡(How to fix) Fix sessions.list latency around 10s and fixed 10s pi-trajectory-flush timeout under moderate session load [2 comments, 2 participants]

GitHub stats
openclaw/openclaw#75839 · Fetched 2026-05-02 05:29:16
Comments: 2 · Participants: 2 · Timeline: 7 · Reactions: 2
Timeline (top): commented ×2, subscribed ×2, closed ×1, mentioned ×1

Summary

We are observing consistent performance issues in OpenClaw related to session handling.

There are two related symptoms:

  1. sessions.list consistently takes about 10 to 16 seconds under moderate session load.
  2. pi-trajectory-flush regularly times out at exactly 10000 ms.

Local cleanup and pruning improve disk usage and general stability, but do not resolve the core latency.

Environment

  • OpenClaw: current local Docker deployment
  • Image: openclaw:local
  • Platform: Debian on ARM64 / Raspberry Pi
  • Storage: local filesystem
  • Deployment mode: Docker
  • Session store: agents/main/sessions/sessions.json

Observed state before local cleanup and pruning:

  • sessions.json: about 4.1 MB
  • Session entries: about 153
  • Active session directory had many trajectory and session artifact files
  • Trajectory files totaled several hundred MB

After cleanup and limiting active trajectories, sessions.list still remained around 10 seconds.

Problem 1: sessions.list latency

The sessions.list command consistently takes around 10 to 16 seconds.

Observed log examples:

⇄ res ✓ sessions.list 10186ms
⇄ res ✓ sessions.list 10230ms
⇄ res ✓ sessions.list 15978ms
⇄ res ✓ sessions.list 16379ms

This continued even after:

  • removing stale plugin-runtime-deps
  • deleting stale SQLite temporary files
  • deleting stale session temporary files
  • archiving large trajectory files
  • limiting active trajectory files to about 200
  • archiving session artifacts such as .reset, .bak, .deleted, .checkpoint

The current evidence suggests the cause is not only raw disk usage but also the session store loading path itself.
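
To track whether a change actually moves the latency, the durations can be pulled out of the gateway log and summarized. The following is a minimal sketch, assuming log lines keep the `sessions.list <N>ms` shape shown in the samples above; the function names are illustrative and not part of OpenClaw.

```typescript
// Hypothetical helper: extract sessions.list durations (in ms) from log text.
// Assumes each relevant line contains "sessions.list <digits>ms", as in the samples above.
function parseSessionsListLatencies(logText: string): number[] {
  const pattern = /sessions\.list\s+(\d+)ms/g;
  const durations: number[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(logText)) !== null) {
    durations.push(Number(match[1]));
  }
  return durations;
}

// Summarize min, max, and mean of the observed durations.
function summarizeLatencies(
  durations: number[],
): { min: number; max: number; mean: number } | null {
  if (durations.length === 0) return null;
  const sum = durations.reduce((acc, d) => acc + d, 0);
  return {
    min: Math.min(...durations),
    max: Math.max(...durations),
    mean: sum / durations.length,
  };
}

const sampleLog = [
  "⇄ res ✓ sessions.list 10186ms",
  "⇄ res ✓ sessions.list 10230ms",
  "⇄ res ✓ sessions.list 15978ms",
  "⇄ res ✓ sessions.list 16379ms",
].join("\n");

const latencySummary = summarizeLatencies(parseSessionsListLatencies(sampleLog));
console.log(latencySummary);
```

Running the same summary before and after each cleanup step makes it easier to see which mitigations, if any, change the numbers.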

Problem 2: pi-trajectory-flush timeout

The agent cleanup step pi-trajectory-flush regularly times out at exactly 10000 ms.

Observed log examples:

agent cleanup timed out: runId=... sessionId=... step=pi-trajectory-flush timeoutMs=10000

Local code inspection suggests the timeout is hardcoded, or defaults, to 10000 ms and is not externally configurable.
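
For illustration, a configurable timeout could be resolved roughly as follows. This is a sketch only: the two environment variable names are the ones proposed in this issue, not settings OpenClaw is known to read today, and `resolveFlushTimeoutMs` and `withTimeout` are hypothetical helpers. The environment is passed in as a plain object so the logic stays independent of the runtime.

```typescript
// Sketch only: the variable names below are proposals from this issue and are
// NOT known to be honored by OpenClaw.
const DEFAULT_CLEANUP_TIMEOUT_MS = 10000; // matches the observed timeoutMs=10000

type Env = Record<string, string | undefined>;

// Resolve the timeout: step-specific variable first, then the general one, then the default.
function resolveFlushTimeoutMs(env: Env): number {
  const candidates = [
    env["OPENCLAW_TRAJECTORY_FLUSH_TIMEOUT_MS"],
    env["OPENCLAW_AGENT_CLEANUP_TIMEOUT_MS"],
  ];
  for (const raw of candidates) {
    if (raw === undefined || raw.trim() === "") continue;
    const parsed = Number(raw);
    // Ignore values that are not positive finite numbers.
    if (isFinite(parsed) && parsed > 0) return Math.floor(parsed);
  }
  return DEFAULT_CLEANUP_TIMEOUT_MS;
}

// Run a cleanup step, rejecting if it does not settle within timeoutMs.
function withTimeout<T>(step: Promise<T>, timeoutMs: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );
    step.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

A caller would wrap the flush step as `withTimeout(flush(), resolveFlushTimeoutMs(env), "pi-trajectory-flush")`, where `flush` stands in for whatever the real cleanup step is. On slow storage such as a Raspberry Pi, a larger value could then be set without patching the code.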

Local workarounds applied

The following local mitigations were applied successfully to improve disk usage and general stability:

  • cleanup of stale plugin-runtime-deps
  • cleanup of main.sqlite.tmp-*
  • cleanup of sessions.json.*.tmp
  • archiving old and large *.trajectory.jsonl
  • limiting active trajectory files to 200
  • archiving session artifacts:
    • *.jsonl.reset.*
    • *.jsonl.bak-*
    • *.trajectory.jsonl.deleted.*
    • *.checkpoint.*.jsonl
  • local maintenance job for nightly cleanup
  • planned local configuration workaround:
    • session.maintenance.maxEntries
    • session.maintenance.pruneDays
    • OPENCLAW_SESSION_CACHE_TTL_MS=120000

These workarounds reduce pressure and improve stability, but they do not address the root cause of the observed sessions.list latency.
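
The cleanup rules above can be expressed as a small planning function for a nightly maintenance job. This is a hedged sketch: `planCleanup` and its types are hypothetical, it only decides what to do with each file name, and it covers the count limit on active trajectories but not the "old and large" criteria, which would need thresholds the report does not state. Deleting or archiving the files is left to the caller.

```typescript
// Hypothetical planner for the cleanup rules listed above.
interface FileInfo { name: string; mtimeMs: number }
interface CleanupPlan { remove: string[]; archive: string[]; keep: string[] }

// Convert a simple glob (only "*" wildcards) into an anchored RegExp.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp("^" + escaped + "$");
}

// Stale temporary files: safe to delete according to the report.
const STALE_TEMP_PATTERNS = ["main.sqlite.tmp-*", "sessions.json.*.tmp"].map(globToRegExp);
// Session artifacts: moved out of the active session directory.
const ARTIFACT_PATTERNS = [
  "*.jsonl.reset.*",
  "*.jsonl.bak-*",
  "*.trajectory.jsonl.deleted.*",
  "*.checkpoint.*.jsonl",
].map(globToRegExp);
const ACTIVE_TRAJECTORY_PATTERN = globToRegExp("*.trajectory.jsonl");

function planCleanup(files: FileInfo[], maxActiveTrajectories = 200): CleanupPlan {
  const plan: CleanupPlan = { remove: [], archive: [], keep: [] };
  const trajectories: FileInfo[] = [];
  for (const file of files) {
    if (STALE_TEMP_PATTERNS.some((re) => re.test(file.name))) plan.remove.push(file.name);
    else if (ARTIFACT_PATTERNS.some((re) => re.test(file.name))) plan.archive.push(file.name);
    else if (ACTIVE_TRAJECTORY_PATTERN.test(file.name)) trajectories.push(file);
    else plan.keep.push(file.name);
  }
  // Keep the newest N trajectory files active; archive the rest.
  trajectories.sort((a, b) => b.mtimeMs - a.mtimeMs);
  trajectories.forEach((file, index) => {
    (index < maxActiveTrajectories ? plan.keep : plan.archive).push(file.name);
  });
  return plan;
}
```

Separating the plan from the file operations makes a dry run trivial: print the plan first, and only act on it once the lists look right.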

Expected behavior

  • sessions.list should remain responsive with about 100 to 200 sessions.
  • A few MB of sessions.json should not lead to consistent 10 to 16 second latency.
  • Session store cache misses should not cause long UI stalls.
  • pi-trajectory-flush timeout should be configurable or adaptive.

Suggested improvements

  1. Improve session store loading and indexing

    • avoid full parse/clone for each cache miss where possible
    • consider indexed or incremental session metadata loading
    • reduce synchronous filesystem work in the sessions.list path
  2. Improve session cache behavior

    • smarter invalidation
    • avoid unnecessary full deep clone via JSON serialization if possible
    • allow better tuning for dashboard polling patterns
  3. Make trajectory flush timeout configurable

    • for example via environment variable such as OPENCLAW_AGENT_CLEANUP_TIMEOUT_MS
    • or allow a specific OPENCLAW_TRAJECTORY_FLUSH_TIMEOUT_MS
  4. Consider a lightweight sessions.list mode

    • no heavy per-session processing
    • explicit metadata depth flags
    • dashboard-oriented fast path
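
Suggestions 1, 2, and 4 can be combined in one pattern: parse the store only when the file changes, project each entry to the few fields a list view needs, and hand out a frozen snapshot instead of a deep clone. The sketch below is illustrative only. The `SessionEntry` shape and every name in it are assumptions, not OpenClaw's actual session store types, and the file access is injected as two callbacks so the logic can be shown without real filesystem calls.

```typescript
// Illustrative only: entry shape and names are assumptions, not OpenClaw's real types.
interface SessionEntry { id: string; updatedAt: number; title?: string; history?: unknown }
interface SessionSummary { id: string; updatedAt: number; title: string }
interface StoreSignature { mtimeMs: number; size: number }

function createSessionSummaryCache(
  readSignature: () => StoreSignature, // e.g. backed by a stat() of sessions.json
  readRaw: () => string, // e.g. backed by reading the file contents
) {
  let signature: StoreSignature | null = null;
  let summaries: ReadonlyArray<SessionSummary> = [];
  let parseCount = 0; // number of cache misses, exposed for observation

  return {
    // Return lightweight summaries; re-parse only when the store file changed.
    list(): ReadonlyArray<SessionSummary> {
      const current = readSignature();
      const unchanged =
        signature !== null &&
        signature.mtimeMs === current.mtimeMs &&
        signature.size === current.size;
      if (!unchanged) {
        const store = JSON.parse(readRaw()) as Record<string, SessionEntry>;
        parseCount += 1;
        // Keep only list-view fields and freeze the result so it can be shared
        // between callers without a defensive deep clone.
        summaries = Object.freeze(
          Object.keys(store).map((key) =>
            Object.freeze({
              id: store[key].id,
              updatedAt: store[key].updatedAt,
              title: store[key].title ?? "",
            }),
          ),
        );
        signature = current;
      }
      return summaries;
    },
    getParseCount(): number {
      return parseCount;
    },
  };
}
```

Invalidating on modification time and size rather than on a fixed TTL means a polling dashboard pays for a parse only when the store was written. The trade-off is that a same-size rewrite within the filesystem's timestamp granularity could be missed, so a real implementation might add a content hash or a short TTL as a backstop.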

Reproduction outline

  1. Run OpenClaw with around 100 to 200 sessions.
  2. Let agents/main/sessions/sessions.json grow to a few MB.
  3. Call sessions.list from the dashboard or API.
  4. Observe latency around 10 seconds, especially after cache expiry or cache invalidation.
  5. Run agent interactions and observe recurring pi-trajectory-flush timeout warnings at 10000 ms.
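
To help separate raw JSON cost from everything else in the sessions.list path, the parse and clone steps can be timed in isolation on the target hardware. The following sketch builds a synthetic store in memory. The entry shape is invented for illustration; only the approximate size (about 4.1 MB) and entry count (about 153) come from the report.

```typescript
// Synthetic reproduction sketch: entry shape is invented, size and count follow the report.
function buildSyntheticStore(entryCount: number, targetBytes: number): string {
  const perEntry = Math.floor(targetBytes / entryCount);
  const store: Record<string, { id: string; updatedAt: number; payload: string }> = {};
  for (let i = 0; i < entryCount; i++) {
    const id = "session-" + i;
    // ASCII filler so that string length equals byte length.
    store[id] = { id, updatedAt: i, payload: "x".repeat(perEntry) };
  }
  return JSON.stringify(store);
}

// Time one parse plus one JSON round-trip clone, the pattern the issue suspects on cache misses.
function timeParseAndClone(raw: string): { parseMs: number; cloneMs: number; entries: number } {
  const t0 = Date.now();
  const parsed = JSON.parse(raw) as Record<string, unknown>;
  const t1 = Date.now();
  const cloned = JSON.parse(JSON.stringify(parsed)) as Record<string, unknown>;
  const t2 = Date.now();
  return { parseMs: t1 - t0, cloneMs: t2 - t1, entries: Object.keys(cloned).length };
}

const syntheticRaw = buildSyntheticStore(153, 4.1 * 1024 * 1024);
const timing = timeParseAndClone(syntheticRaw);
console.log("bytes:", syntheticRaw.length, timing);
```

If these timings turn out far below the observed 10 seconds on the same device, that would point toward per-session processing or synchronous filesystem work rather than JSON handling alone. A real sessions.json has nested structure that this filler does not model, so the numbers are only a rough lower bound.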

Notes

This issue was identified during operational maintenance on a Raspberry Pi based OpenClaw installation. Filesystem cleanup and session artifact archiving reduced disk usage significantly and lowered general system pressure, but sessions.list remained slow. This suggests the remaining issue is in the session loading and cleanup implementation rather than only in accumulated local artifacts.

extent analysis

TL;DR

Improve session store loading and indexing to reduce latency in sessions.list and make the pi-trajectory-flush timeout configurable to prevent regular timeouts.

Guidance

  • Investigate optimizing the session store loading path to avoid full parsing and cloning for each cache miss, considering indexed or incremental session metadata loading.
  • Review the session cache behavior to implement smarter invalidation and avoid unnecessary deep cloning via JSON serialization.
  • Introduce a configuration option for the pi-trajectory-flush timeout, such as an environment variable OPENCLAW_AGENT_CLEANUP_TIMEOUT_MS, to prevent regular timeouts.
  • Consider implementing a lightweight sessions.list mode with explicit metadata depth flags for a faster dashboard-oriented path.

Example

The issue contains no direct code references, so no upstream code example is given. The key steps are optimizing session store loading and making the timeouts configurable.

Notes

The local workarounds improved disk usage and stability but do not address the root cause of the sessions.list latency. The remaining issue appears to lie in the session loading and cleanup implementation rather than only in accumulated local artifacts.

Recommendation

Address the root cause upstream by improving session store loading and indexing and by making the pi-trajectory-flush timeout configurable. These changes target the observed latency and timeouts directly, whereas the local workarounds only reduce pressure.
