openclaw - [Bug]: Discord channel message marked completed while embedded run aborted, then session stalls

On OpenClaw 2026.5.12 (f066dd2), a Discord channel turn can be marked message.processed outcome=completed even though the embedded run ended aborted=true. After that, the same channel session can remain in stalled_agent_run with activeWorkKind=embedded_run, eventually surfacing the generic user-facing timeout text:

Request timed out before a response was generated. Please try again, or increase `agents.defaults.timeoutSeconds` in your config.

This does not appear to be a simple agents.defaults.timeoutSeconds issue. The observed run aborted after ~8s (durationMs=8376), the message was nevertheless marked completed, and the session was later reported as stalled.

Environment

  • OpenClaw CLI/Gateway: 2026.5.12 (f066dd2)
  • Runtime: macOS LaunchAgent, Node 22 via Homebrew
  • Channel: Discord channel session
  • Model: openai/gpt-5.5
  • Relevant defaults at time of mitigation:
    • agents.defaults.timeoutSeconds=1800
    • agents.defaults.contextTokens=120000
    • active-memory.allowedChatTypes=["direct"] (channel active-memory disabled after mitigation)

Evidence

Incident:

  • Discord message id: 1504785435535081615
  • Timestamp: 2026-05-15T18:00:05+08:00
  • Session key: agent:tino:discord:channel:1480145267398545594
  • Session id: 0b401b0a-47dc-4091-84b3-62eae77f7c63

Gateway log sequence:

2026-05-15T18:00:05.741+08:00 DEBUG message queued: sessionId=0b401b0a-47dc-4091-84b3-62eae77f7c63 sessionKey=agent:tino:discord:channel:1480145267398545594 source=dispatch queueDepth=1 sessionState=idle
2026-05-15T18:00:07.730+08:00 DEBUG run registered: sessionId=0b401b0a-47dc-4091-84b3-62eae77f7c63 totalActive=1
2026-05-15T18:00:15.059+08:00 DEBUG session state: sessionId=0b401b0a-47dc-4091-84b3-62eae77f7c63 sessionKey=agent:tino:discord:channel:1480145267398545594 prev=processing new=idle reason="run_completed" queueDepth=0
2026-05-15T18:00:15.060+08:00 DEBUG run cleared: sessionId=0b401b0a-47dc-4091-84b3-62eae77f7c63 totalActive=0
2026-05-15T18:00:15.066+08:00 DEBUG embedded run done: runId=d2a9e294-1826-4610-8add-fb7d52b57653 sessionId=0b401b0a-47dc-4091-84b3-62eae77f7c63 durationMs=8376 aborted=true
2026-05-15T18:00:16.035+08:00 DEBUG message processed: channel=discord chatId=channel:1480145267398545594 messageId=1504785435535081615 sessionId=unknown sessionKey=agent:tino:discord:channel:1480145267398545594 outcome=completed duration=10302ms
2026-05-15T18:03:13.813+08:00 WARN stalled session: sessionId=0b401b0a-47dc-4091-84b3-62eae77f7c63 sessionKey=agent:tino:discord:channel:1480145267398545594 state=processing age=145s queueDepth=1 reason=active_work_without_progress classification=stalled_agent_run activeWorkKind=embedded_run lastProgress=codex_app_server:notification:rawResponseItem/completed lastProgressAge=144s terminalProgressStale=true recovery=none
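The timing in this sequence can be checked with a little arithmetic. The sketch below assumes that age in the stalled-session line measures time spent in the current processing state; that reading is a guess, not something the logs confirm. Under that assumption the stalled processing period began around 18:00:48.8, roughly 34 seconds after the aborted run was logged as done and cleared.

```python
from datetime import datetime, timedelta

# Timestamps and ages copied from the gateway log sequence above.
abort_logged = datetime.fromisoformat("2026-05-15T18:00:15.066+08:00")
stall_logged = datetime.fromisoformat("2026-05-15T18:03:13.813+08:00")
state_age = timedelta(seconds=145)          # age=145s
last_progress_age = timedelta(seconds=144)  # lastProgressAge=144s

# Assumption: "age" counts how long the session has been in state=processing.
processing_started = stall_logged - state_age
last_progress = stall_logged - last_progress_age

# How long after the "embedded run done ... aborted=true" line did the
# stalled processing period begin?
gap_after_abort = processing_started - abort_logged

print("processing started:", processing_started.isoformat(timespec="milliseconds"))
print("last progress:     ", last_progress.isoformat(timespec="milliseconds"))
print("gap after abort (s):", gap_after_abort.total_seconds())
```

If that reading is right, the stall was reported against a processing period that started after the session had already been set back to idle, which is consistent with the session returning to a stalled state rather than simply never leaving it.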

Key inconsistency:

  • Embedded run ended with aborted=true
  • Channel message was still marked outcome=completed
  • Later diagnostics classified the same session as stalled_agent_run
  • Recovery remained none
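For anyone checking their own gateway logs for the same inconsistency, here is a minimal sketch of a scanner. It assumes the line shape seen in the excerpt above (timestamp, level, event name, then key=value fields); the function names are illustrative and not part of OpenClaw. Because the embedded-run line carries only a sessionId and the message-processed line reports sessionId=unknown, the sketch joins them through the sessionKey learned from earlier lines.

```python
import re

# Assumed line shape: "<timestamp> <LEVEL> <event name>: key=value key=value ..."
EVENT_RE = re.compile(r"^(\S+) (DEBUG|INFO|WARN|ERROR) ([^:]+): (.*)$")
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_line(line):
    """Return (event_name, fields) for one log line, or None if it does not match."""
    match = EVENT_RE.match(line.strip())
    if match is None:
        return None
    _ts, _level, event, rest = match.groups()
    fields = {key: value.strip('"') for key, value in FIELD_RE.findall(rest)}
    return event, fields


def find_completed_after_abort(lines):
    """Flag messages marked outcome=completed right after an aborted embedded run."""
    key_by_session = {}      # sessionId -> sessionKey, learned from lines that carry both
    aborted_run_by_key = {}  # sessionKey -> runId of the most recent aborted run
    findings = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        event, fields = parsed
        session_id = fields.get("sessionId")
        session_key = fields.get("sessionKey")
        if session_id and session_id != "unknown" and session_key:
            key_by_session[session_id] = session_key
        if event == "embedded run done" and fields.get("aborted") == "true":
            key = key_by_session.get(session_id)
            if key is not None:
                aborted_run_by_key[key] = fields.get("runId")
        elif event == "message processed":
            run_id = aborted_run_by_key.pop(session_key, None)
            if run_id is not None and fields.get("outcome") == "completed":
                findings.append({
                    "sessionKey": session_key,
                    "runId": run_id,
                    "messageId": fields.get("messageId"),
                })
    return findings


# Three lines taken from the incident log.
LOG = [
    "2026-05-15T18:00:05.741+08:00 DEBUG message queued: sessionId=0b401b0a-47dc-4091-84b3-62eae77f7c63 sessionKey=agent:tino:discord:channel:1480145267398545594 source=dispatch queueDepth=1 sessionState=idle",
    "2026-05-15T18:00:15.066+08:00 DEBUG embedded run done: runId=d2a9e294-1826-4610-8add-fb7d52b57653 sessionId=0b401b0a-47dc-4091-84b3-62eae77f7c63 durationMs=8376 aborted=true",
    "2026-05-15T18:00:16.035+08:00 DEBUG message processed: channel=discord chatId=channel:1480145267398545594 messageId=1504785435535081615 sessionId=unknown sessionKey=agent:tino:discord:channel:1480145267398545594 outcome=completed duration=10302ms",
]

for finding in find_completed_after_abort(LOG):
    print(finding)
```

The scanner only pairs the most recent aborted run with the next processed message for the same session key, which is enough for this incident but may miss interleaved runs.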

Expected behavior

One of these should happen:

  1. If the embedded run aborts, the Discord message should not be recorded as successfully completed unless a final user-visible response was actually sent.
  2. If a session is detected as stalled_agent_run activeWorkKind=embedded_run, the runtime should either abort/recover the run or clear the session state safely.
  3. The user-facing error should distinguish "wall-clock timeout" from "embedded run aborted / stale session state" so operators are not misled into only increasing agents.defaults.timeoutSeconds.
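Points 1 and 3 could be expressed as a small decision function. This is only a sketch of the intended behavior, not OpenClaw's implementation: RunResult, Outcome, classify and user_facing_text are hypothetical names, the aborted-run message text is invented for illustration, and final_response_sent=False is assumed for the incident because the logs do not say whether a reply was delivered.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    COMPLETED = "completed"
    ABORTED = "aborted"
    TIMED_OUT = "timed_out"


@dataclass(frozen=True)
class RunResult:
    """Hypothetical summary of one embedded run; not an OpenClaw type."""
    aborted: bool
    final_response_sent: bool
    duration_s: float
    timeout_s: float


def classify(result: RunResult) -> Outcome:
    """An aborted run counts as completed only if a final reply reached the user."""
    if result.aborted and not result.final_response_sent:
        if result.duration_s >= result.timeout_s:
            return Outcome.TIMED_OUT
        return Outcome.ABORTED
    return Outcome.COMPLETED


def user_facing_text(outcome: Outcome) -> Optional[str]:
    """Pick an error text per outcome so an abort is not reported as a timeout."""
    if outcome is Outcome.TIMED_OUT:
        # Existing text quoted in this report.
        return ("Request timed out before a response was generated. Please try again, "
                "or increase `agents.defaults.timeoutSeconds` in your config.")
    if outcome is Outcome.ABORTED:
        # Hypothetical wording, for illustration only.
        return "The run was aborted before a response was generated. Please try again."
    return None


# Incident values: durationMs=8376; timeoutSeconds=1800 as listed under Environment.
incident = RunResult(aborted=True, final_response_sent=False,
                     duration_s=8.376, timeout_s=1800.0)
print(classify(incident))
print(user_facing_text(classify(incident)))
```

With these inputs the incident is classified as aborted rather than completed or timed out, so the operator would not be pointed at agents.defaults.timeoutSeconds.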

Actual behavior

The channel turn was treated as completed, then the session remained/returned to a stalled embedded-run state and eventually produced the generic timeout text.

Local mitigation used

We mitigated locally by:

  • Rotating only the affected Discord channel session-key mapping after backing up the session store.
  • Preserving the transcript/session files.
  • Restarting Gateway to reload in-memory state.
  • Adding a watchdog that detects stalled_agent_run + activeWorkKind=embedded_run for Discord channel sessions and performs the same targeted rotation/restart.

This restored the channel, but it is a workaround rather than a native fix.
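The detection half of such a watchdog might look like the sketch below. It is not the script we run: the actual rotation and Gateway restart are left to a caller-supplied rotate callback (hypothetical), and the cooldown is an added assumption so that repeated warnings for the same session do not trigger repeated restarts.

```python
import re
import time

FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def stalled_discord_channel_key(line):
    """Return the sessionKey if the line reports a stalled embedded run
    on a Discord channel session, otherwise None."""
    if "stalled session:" not in line:
        return None
    fields = {key: value.strip('"') for key, value in FIELD_RE.findall(line)}
    session_key = fields.get("sessionKey", "")
    if (fields.get("classification") == "stalled_agent_run"
            and fields.get("activeWorkKind") == "embedded_run"
            and ":discord:channel:" in session_key):
        return session_key
    return None


def watch(lines, rotate, cooldown_s=600.0, clock=time.monotonic):
    """Call rotate(session_key) for each stalled session, at most once per cooldown."""
    last_rotated = {}
    for line in lines:
        key = stalled_discord_channel_key(line)
        if key is None:
            continue
        now = clock()
        if key in last_rotated and now - last_rotated[key] < cooldown_s:
            continue  # already handled recently; avoid a restart loop
        last_rotated[key] = now
        rotate(key)


STALL_LINE = (
    "2026-05-15T18:03:13.813+08:00 WARN stalled session: "
    "sessionId=0b401b0a-47dc-4091-84b3-62eae77f7c63 "
    "sessionKey=agent:tino:discord:channel:1480145267398545594 "
    "state=processing age=145s queueDepth=1 reason=active_work_without_progress "
    "classification=stalled_agent_run activeWorkKind=embedded_run "
    "lastProgress=codex_app_server:notification:rawResponseItem/completed "
    "lastProgressAge=144s terminalProgressStale=true recovery=none"
)

rotated = []
watch([STALL_LINE, STALL_LINE], rotated.append)
print(rotated)
```

In the usage above the same warning appears twice but the callback fires once, because the second occurrence falls inside the cooldown window.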

Diagnostics bundle

A redacted trajectory export was generated locally with:

openclaw sessions export-trajectory \
  --agent tino \
  --store /Users/betalpha/.openclaw/agents/tino/sessions/sessions.json.bak-timeoutfix-1778842350 \
  --session-key agent:tino:discord:channel:1480145267398545594 \
  --workspace /Users/betalpha/clawd/tino \
  --output discord-stall-2026-05-15-1504785435535081615 \
  --json

The export contains 465 total events: 199 runtime events and 266 transcript events. I can provide the redacted export if useful.

Related issues that look adjacent but not identical:

  • #71127
  • #73496
  • #74586
