openclaw - 💡(How to fix) Fix Session lane not released after LLM provider timeout, blocking all subsequent messages [1 comments, 2 participants]

openclaw · 2026-05-13 07:55:51

openclaw/openclaw#81335 • Fetched 2026-05-14 03:33:11

Author: Water1209

Participants: clawsweeper[bot], Water1209

When an LLM provider (OpenAI Codex / gpt-5.5) returns consecutive server_error timeouts during a group chat agent run, the session lane is not properly released. All subsequent messages to the same session are queued indefinitely behind the "ghost run", effectively freezing the session until a full gateway restart.

Error Message

15:21:37 WARN embedded_run_agent_end: isError=true, error="LLM error server_error", failoverReason="timeout", model="gpt-5.5", provider="openai-codex"
15:22:34 WARN embedded_run_agent_end: isError=true, error="LLM error server_error" (second timeout)
15:19:08 WARN lane wait exceeded: lane=session:agent:main:feishu:group:oc_d497bd2b11243f8c001af50d983504ba waitedMs=403570 queueAhead=1

Description

Environment

OpenClaw version: 2026.5.7
OS: Linux 6.17.0-20-generic (x64), Ubuntu
Node: v24.14.1
Channel: Feishu (group chat, 4 bot accounts: sanqi/taiyi/shengongbao/nezha)
Model: openai-codex/gpt-5.5
Session type: agent:main:feishu:group:<chat_id>

Steps to Reproduce

1. Configure a Feishu group chat with requireMention: true.
2. Have the agent perform a multi-step task in the group chat (involving tool calls like exec, edit).
3. While the agent is executing, the LLM provider (Codex) returns consecutive server_error / timeout errors.
4. The agent run ends with isError: true.
5. Send a new message to the group chat (with @mention).

Expected Behavior

The session lane should be released after the failed agent run, allowing new messages to trigger a fresh agent run.
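The expected behavior can be sketched as a per-session queue that releases the lane in a `finally` block, so a run that throws (for example on a provider timeout) still unblocks the next message. This is a minimal TypeScript sketch under assumptions, not OpenClaw's actual implementation: `SessionLanes`, `run`, `isBusy`, and `demo` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical sketch: none of these names come from the OpenClaw codebase.
class SessionLanes {
  // For each lane key, a promise that resolves once every queued run has released the lane.
  private tails = new Map<string, Promise<void>>();

  async run<T>(lane: string, task: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(lane) ?? Promise.resolve();
    let release!: () => void;
    const held = new Promise<void>((resolve) => {
      release = resolve;
    });
    // The new tail resolves only after the previous run AND this run have released.
    const tail: Promise<void> = previous.then(() => held);
    this.tails.set(lane, tail);

    await previous; // wait for the run ahead of us in the lane
    try {
      return await task(); // may throw, e.g. on a provider timeout
    } finally {
      release(); // always hand the lane to the next waiter
      if (this.tails.get(lane) === tail) this.tails.delete(lane);
    }
  }

  isBusy(lane: string): boolean {
    return this.tails.has(lane);
  }
}

async function demo(): Promise<string[]> {
  const lanes = new SessionLanes();
  const events: string[] = [];
  const lane = "session:agent:main:feishu:group:<chat_id>";

  // First run fails the way the logs describe.
  const failed = lanes
    .run(lane, async () => {
      throw new Error("LLM error server_error");
    })
    .catch((err: Error) => {
      events.push(`run 1 failed: ${err.message}`);
    });

  // Second run is queued behind the first and must still start.
  const next = lanes.run(lane, async () => {
    events.push("run 2 started");
  });

  await Promise.all([failed, next]);
  return events;
}

demo().then((events) => console.log(events.join("\n")));
```

In this sketch the second run starts even though the first one rejected, and the lane entry is removed once the queue drains. If the real lane code releases only on the success path, that would be consistent with the "ghost run" described in this issue, but the source does not confirm where the release is missed.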

Actual Behavior

The session lane remains locked
New messages are dispatched but queued (queueAhead=1)
Diagnostic log shows: lane wait exceeded: lane=session:agent:main:feishu:group:... waitedMs=403570 queueAhead=1
dispatch complete (queuedFinal=false, replies=0) — messages are accepted but never processed
Only a full openclaw gateway restart recovers the session

Relevant Logs

15:21:37 WARN embedded_run_agent_end: isError=true, error="LLM error server_error", failoverReason="timeout", model="gpt-5.5", provider="openai-codex"
15:22:34 WARN embedded_run_agent_end: isError=true, error="LLM error server_error" (second timeout)
15:19:08 WARN lane wait exceeded: lane=session:agent:main:feishu:group:oc_d497bd2b11243f8c001af50d983504ba waitedMs=403570 queueAhead=1
15:22:46 INFO feishu[sanqi]: dispatching to agent (session=...)
15:22:46 INFO feishu[sanqi]: dispatch complete (queuedFinal=false, replies=0)
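To make the `lane wait exceeded` line easier to read, the fields can be extracted and the wait converted to minutes and seconds. The following is a small TypeScript sketch that assumes the log format shown above; `LaneWait`, `parseLaneWait`, and `formatDuration` are hypothetical helper names, and other OpenClaw versions may format this line differently.

```typescript
// Hypothetical helpers for reading the warning line quoted in this issue.
interface LaneWait {
  lane: string;
  waitedMs: number;
  queueAhead: number;
}

// Returns the parsed fields, or null when the line is not a lane-wait warning.
function parseLaneWait(line: string): LaneWait | null {
  const match = /lane wait exceeded: lane=(\S+) waitedMs=(\d+) queueAhead=(\d+)/.exec(line);
  if (!match) return null;
  return {
    lane: match[1],
    waitedMs: Number(match[2]),
    queueAhead: Number(match[3]),
  };
}

// Formats milliseconds as whole minutes and whole seconds (fractions are dropped).
function formatDuration(ms: number): string {
  const minutes = Math.floor(ms / 60000);
  const seconds = Math.floor((ms % 60000) / 1000);
  return `${minutes}m ${seconds}s`;
}

const sample =
  "15:19:08 WARN lane wait exceeded: " +
  "lane=session:agent:main:feishu:group:oc_d497bd2b11243f8c001af50d983504ba " +
  "waitedMs=403570 queueAhead=1";

const parsed = parseLaneWait(sample);
if (parsed) {
  console.log(`waited ${formatDuration(parsed.waitedMs)} with ${parsed.queueAhead} run(s) ahead`);
}
```

For the logged value, 403570 ms is 6 minutes and 43 seconds (rounded down), so the queued message had already waited well over the 5 minutes proposed below as an example lane timeout.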

Impact

Group chat sessions become completely unresponsive after LLM provider errors
No CLI command or API endpoint exists to abort a stuck session lane (openclaw sessions only has cleanup, no abort)
Users must restart the entire gateway to recover, which affects all sessions

Suggested Fix

Error recovery: When an agent run ends with isError: true due to provider timeout, ensure the session lane is properly released
Lane timeout: Add a configurable lane lock timeout (e.g., 5 minutes) — if a run doesn't complete within the timeout, force-release the lane
CLI command: Add openclaw session abort <session-key> to allow manual recovery without full gateway restart
Diagnostic: Log a warning when a lane has been held for an unusually long time
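The lane timeout suggestion could look roughly like the following TypeScript sketch, which races a run against a timer and rejects with a named error when the timer wins. Everything here is an assumption for illustration: `withLaneTimeout`, `laneTimeoutError`, and `DEFAULT_LANE_TIMEOUT_MS` are hypothetical names, and the 5 minute default only mirrors the example value from the list above.

```typescript
// Hypothetical sketch of a lane lock timeout; not taken from the OpenClaw codebase.
const DEFAULT_LANE_TIMEOUT_MS = 5 * 60 * 1000; // example value from the suggestion above

function laneTimeoutError(lane: string, timeoutMs: number): Error {
  const err = new Error(`lane ${lane} force-released after ${timeoutMs}ms`);
  err.name = "LaneTimeoutError";
  return err;
}

// Resolves or rejects with the task's outcome, or rejects with a
// LaneTimeoutError if the task has not settled within timeoutMs.
function withLaneTimeout<T>(
  lane: string,
  task: () => Promise<T>,
  timeoutMs: number = DEFAULT_LANE_TIMEOUT_MS,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(laneTimeoutError(lane, timeoutMs)), timeoutMs);
  });
  return Promise.race([task(), timeout]).finally(() => {
    // Stop the timer so a finished run does not keep the process alive.
    if (timer !== undefined) clearTimeout(timer);
  });
}

// Usage: a run that never settles stands in for the "ghost run".
withLaneTimeout("session:agent:main:feishu:group:<chat_id>", () => new Promise<string>(() => {}), 50)
  .catch((err: Error) => console.warn(`${err.name}: ${err.message}`));
```

Note that `Promise.race` only stops waiting for the run; it does not cancel it. A real fix would probably also need to abort the underlying provider request (for example with an `AbortController`) and to release the lane in the rejection path, which this sketch leaves to the caller.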

Workaround

Restart the gateway: openclaw gateway restart or systemctl --user restart openclaw-gateway.service
