openclaw - 💡(How to fix) Fix SessionWriteLockTimeoutError: long LLM turns block incoming system events (Wren/cron updates lost) [1 comments, 2 participants]

GitHub stats
openclaw/openclaw#75949 (fetched 2026-05-03 04:44:02)
Comments: 1 · Participants: 2 · Timeline: 2 · Reactions: 2
Timeline (top): closed ×1, commented ×1

Summary

Long-running agent turns hold the session JSONL write lock for their entire duration. System events that arrive during an active turn (e.g. cron heartbeats, or openclaw agent calls from peer agents) wait for the lock and fail with SessionWriteLockTimeoutError once the 10 s timeout expires. The delivery is then silently dropped: there is no retry and no queue.

Observed error

SessionWriteLockTimeoutError: session file locked (timeout 10000ms): pid=3447 /Users/page/.openclaw/agents/nova/sessions/f6bf05fa-f292-4f94-8fc1-d89d7f34d74b.jsonl.lock
lane task error: lane=main durationMs=14610 error="SessionWriteLockTimeoutError: ..."
lane task error: lane=session:agent:nova:main durationMs=14614 error="SessionWriteLockTimeoutError: ..."
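
Because the drop is silent on the receiving side, one way to notice it is to scan the logs for this error. The following is a minimal sketch of a parser for the first log line above. The function name parseLockTimeout and the regular expression are illustrative assumptions inferred from this single sample, not part of OpenClaw, and the log format may differ between versions.

```javascript
// Hypothetical helper: extract fields from a SessionWriteLockTimeoutError line.
// The pattern is inferred from the one log sample in this issue, not from
// OpenClaw's source, so treat it as an assumption.
const LOCK_TIMEOUT_RE =
  /SessionWriteLockTimeoutError: session file locked \(timeout (\d+)ms\): pid=(\d+) (\S+\.lock)/;

function parseLockTimeout(line) {
  const match = LOCK_TIMEOUT_RE.exec(line);
  if (!match) return null; // not a full lock-timeout line
  const [, timeoutMs, pid, lockPath] = match;
  // The session ID is the lock file's base name without the ".jsonl.lock" suffix.
  const fileName = lockPath.split("/").pop();
  const sessionId = fileName.replace(/\.jsonl\.lock$/, "");
  return { timeoutMs: Number(timeoutMs), pid: Number(pid), lockPath, sessionId };
}

const sample =
  "SessionWriteLockTimeoutError: session file locked (timeout 10000ms): pid=3447 " +
  "/Users/page/.openclaw/agents/nova/sessions/f6bf05fa-f292-4f94-8fc1-d89d7f34d74b.jsonl.lock";

console.log(parseLockTimeout(sample));
```

The truncated "lane task error" lines do not carry the lock path, so the sketch returns null for them.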

Incident

  • Date: 2026-05-02, ~13:41–14:21 AEST (03:41–04:21 UTC)
  • Platform: macOS (arm64), OpenClaw 2026.4.26, Node v24.14.0
  • Session: f6bf05fa-f292-4f94-8fc1-d89d7f34d74b (Nova agent, Telegram group channel)
  • Lock holder: PID 3447 (openclaw-gateway), which was processing a multi-tool response turn with ~10 exec calls and file reads
  • Blocked sender: Wren (Codex app-server thread) attempting to deliver SDLC gate completion updates via openclaw agent --agent nova

Impact: Wren's updates (GATE-2 complete, GATE-3 active, GATE-4 contract tests passing) were lost for ~40 minutes. The receiving agent had no indication updates were dropped and reported stale state to the operator.

Steps to reproduce

  1. Start a long agent turn involving multiple tool calls (exec, file reads, API calls), i.e. anything that keeps the turn running for more than 10 s in total
  2. While the turn is in progress, send a system event or openclaw agent message to the same session
  3. Observe SessionWriteLockTimeoutError in logs; the incoming message is dropped
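
The timing in these steps can be captured in a toy model. This is a sketch of the behaviour described in this issue, not OpenClaw's code: the function classifyDelivery and its parameters are hypothetical, and it assumes a sender simply waits up to the lock timeout and then gives up.

```javascript
// Toy model of the reported behaviour (assumption, not OpenClaw's implementation).
// A turn holds the lock over [turnStartMs, turnEndMs). A sender arriving at
// arrivalMs waits at most lockTimeoutMs for the lock, then gives up.
function classifyDelivery(turnStartMs, turnEndMs, arrivalMs, lockTimeoutMs = 10000) {
  const lockHeld = arrivalMs >= turnStartMs && arrivalMs < turnEndMs;
  if (!lockHeld) return { outcome: "delivered", waitedMs: 0 };

  const waitNeededMs = turnEndMs - arrivalMs;
  if (waitNeededMs <= lockTimeoutMs) {
    // The turn ends before the sender's timeout expires.
    return { outcome: "delivered", waitedMs: waitNeededMs };
  }
  // The sender times out first; per the issue, the event is not retried.
  return { outcome: "dropped", waitedMs: lockTimeoutMs };
}

// A 30 s turn with events arriving 5 s, 25 s and 31 s after it starts.
const turn = { startMs: 0, endMs: 30000 };
for (const arrivalMs of [5000, 25000, 31000]) {
  console.log(arrivalMs, classifyDelivery(turn.startMs, turn.endMs, arrivalMs));
}
```

In this model the event at 5 s is dropped because it would need to wait 25 s, the event at 25 s is delivered after a 5 s wait, and the event at 31 s is delivered immediately. Only events that arrive in the last 10 s of a turn survive.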

Expected behaviour

Incoming system events should be queued during an active write-lock and delivered when the turn completes, or the lock timeout should be configurable to a longer value for multi-agent workflows.
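
The queuing half of this expectation could look roughly like the sketch below. The class SessionInbox and its methods are hypothetical names chosen for illustration, and the sketch assumes that events and the turn are handled in the same process. If senders run in a separate process, a durable queue would be needed instead.

```javascript
// Hypothetical in-process inbox: buffers events while a turn is active and
// flushes them in arrival order when the turn ends. Not OpenClaw's API.
class SessionInbox {
  constructor(appendToSession) {
    this.appendToSession = appendToSession; // callback that performs the real write
    this.turnActive = false;
    this.pending = [];
  }

  beginTurn() {
    this.turnActive = true;
  }

  deliver(event) {
    if (this.turnActive) {
      this.pending.push(event); // hold the event instead of failing on the lock
      return "queued";
    }
    this.appendToSession(event);
    return "delivered";
  }

  endTurn() {
    this.turnActive = false;
    const queued = this.pending;
    this.pending = [];
    for (const event of queued) this.appendToSession(event);
    return queued.length; // number of events flushed
  }
}

const sessionLog = [];
const inbox = new SessionInbox((event) => sessionLog.push(event));
inbox.beginTurn();
inbox.deliver("GATE-2 complete");
inbox.deliver("GATE-3 active");
inbox.endTurn();
console.log(sessionLog);
```

The design choice here is to move the wait from the sender to the receiver: the sender returns immediately with "queued", and ordering is preserved because the queue is flushed first-in, first-out.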

Workaround

None currently. Peer agents must poll or retry manually after the turn completes.
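
The manual retry could be automated on the sender side, for example with exponential backoff as sketched below. This assumes the sender can actually observe the failure (the issue does not confirm that it can), and that the caller supplies send, a hypothetical async function that performs one delivery attempt and rejects on failure. All names and default values are illustrative.

```javascript
// Sender-side retry sketch. `send` is a hypothetical async function supplied by
// the caller; nothing here is part of the openclaw CLI or API.

// Delay before each retry: baseMs, 2*baseMs, 4*baseMs, ... capped at capMs.
function backoffDelays(maxAttempts, baseMs = 2000, capMs = 60000) {
  const delays = [];
  for (let i = 0; i < maxAttempts - 1; i++) {
    delays.push(Math.min(capMs, baseMs * 2 ** i));
  }
  return delays;
}

// Matches on the error text seen in the logs; the real error class is not assumed.
function isLockTimeout(error) {
  return String(error && error.message).includes("SessionWriteLockTimeoutError");
}

async function deliverWithRetry(send, options = {}) {
  const { maxAttempts = 6, baseMs = 2000, capMs = 60000, sleep } = options;
  const wait = sleep || ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  const delays = backoffDelays(maxAttempts, baseMs, capMs);

  for (let attempt = 1; ; attempt++) {
    try {
      return await send();
    } catch (error) {
      const delay = delays[attempt - 1];
      // Rethrow errors that are not lock timeouts, or when attempts are exhausted.
      if (!isLockTimeout(error) || delay === undefined) throw error;
      await wait(delay);
    }
  }
}
```

With the defaults, the sender makes up to six attempts and waits 2, 4, 8, 16 and 32 seconds between them. A turn that outlasts the whole schedule still loses the update, so the final rejection should be surfaced to the operator rather than swallowed.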

Environment

  • OpenClaw: 2026.4.26 (be8c246)
  • Node: v24.14.0
  • OS: macOS Darwin 25.3.0 arm64
  • Channel: Telegram (group)
  • Model: anthropic/claude-sonnet-4-6

extent analysis

TL;DR

Queue incoming system events while the session write lock is held, or make the lock timeout configurable and raise it, so that deliveries are no longer dropped with SessionWriteLockTimeoutError.

Guidance

  • Investigate the feasibility of implementing a message queue to hold incoming system events until the write-lock is released, ensuring that no messages are lost.
  • Consider increasing the lock timeout value to accommodate multi-agent workflows, but be aware that senders then block for longer while they wait, and a turn that outlasts the new timeout still drops the event.
  • Review the current workflow and tool calls within the long-running agent turn to identify potential optimizations that could reduce the overall execution time.
  • Evaluate the use of a more robust locking mechanism that allows for concurrent access or provides a callback for pending requests.

Example

No specific code example can be provided without further details on the implementation. For a basic queuing mechanism in Node.js, a library such as bull (a Redis-backed job queue) could hold messages until the lock is released.

Notes

The current implementation lacks a queuing mechanism, leading to lost messages during long-running agent turns. Any solution should consider the trade-offs between message delivery guarantees, performance, and complexity.

Recommendation

Prefer a queuing mechanism for incoming system events over a longer lock timeout. A longer timeout may not be sufficient for all scenarios, since a turn can still exceed it, and it could lead to other issues. A queue ensures that messages are not lost and is the more robust way to handle concurrent access.
