openclaw - 💡(How to fix) Fix Sub-agent session resumes after gateway reload with 9-hour delay [1 participants]

GitHub stats
openclaw/openclaw#63388, fetched 2026-04-09 07:54:25
Comments: 0 · Participants: 1 · Timeline: 0 · Reactions: 0

Sub-agent tasks that are interrupted by a gateway reload do not fail cleanly. Instead, they appear to "pause" and resume hours later, completing their work and firing completion notifications at unexpected times.



Bug Report: Sub-agent Session Resumes After Gateway Reload — 9 Hour Delay

Summary

Sub-agent tasks that are interrupted by a gateway reload do not fail cleanly. Instead, they appear to "pause" and resume hours later, completing their work and firing completion notifications at unexpected times.

Environment

  • OpenClaw: 2026.4.8
  • Model: glm-5.1:cloud
  • Host: Ubuntu (MacBook Pro)
  • Channel: Telegram

Timeline

  • ~14:45 UTC — Sub-agent spawned with task to fix 14 bugs
  • ~21:37 UTC — Gateway reload occurs
  • ~23:37 UTC — Session receives message: "Your previous turn was interrupted by a gateway reload"
  • ~23:42 UTC — Task completes, PR created
  • ~23:47 UTC — Completion notification arrives via Telegram

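Total elapsed: ~9 hours from spawn to notification. The interruption itself (reload at ~21:37 UTC to resume at ~23:37 UTC) lasted ~2 hours.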
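The intervals in the timeline can be checked with a few lines of arithmetic. This is only a sketch: the times are the approximate UTC values listed above, assumed to fall on the same day, and the names `EVENTS`, `gaps`, and `total` are illustrative.

```python
from datetime import datetime, timedelta

# Approximate UTC times copied from the Timeline section (same day assumed).
EVENTS = [
    ("spawned", "14:45"),
    ("gateway reload", "21:37"),
    ("resumed", "23:37"),
    ("completed", "23:42"),
    ("notified", "23:47"),
]


def parse(hhmm: str) -> datetime:
    """Parse an HH:MM string; the date part is irrelevant for same-day gaps."""
    return datetime.strptime(hhmm, "%H:%M")


def gaps(events):
    """Return (from_name, to_name, timedelta) for each consecutive pair."""
    return [
        (name_a, name_b, parse(t_b) - parse(t_a))
        for (name_a, t_a), (name_b, t_b) in zip(events, events[1:])
    ]


def total(events) -> timedelta:
    """Elapsed time between the first and last event."""
    return parse(events[-1][1]) - parse(events[0][1])


if __name__ == "__main__":
    for a, b, delta in gaps(EVENTS):
        print(f"{a} -> {b}: {delta}")
    print(f"total: {total(EVENTS)}")
```

With these inputs the sub-agent ran for about 6 h 52 min before the reload, sat interrupted for 2 h, and the full span from spawn to notification is about 9 h 2 min.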

Observed Behavior

What actually happened:

  1. Sub-agent ran for about 7 hours (processing code, making fixes)
  2. Gateway reloaded at 21:37
  3. Session "paused" — not killed, not restarted
  4. At 23:37, session resumed with message: "Your previous turn was interrupted by a gateway reload"
  5. Sub-agent continued from where it left off
  6. Completed successfully at 23:42
  7. Completion notification fired on Telegram 5 minutes later

Evidence

Session log excerpt:

Session: Run:

Impact

  • Confusing UX — notifications hours after expected
  • Data staleness — if task fetches data, it's stale
  • No user control — cannot cancel paused sessions
  • Resource waste — continues consuming model quota while paused

Expected Behavior

  • Fail-fast on gateway reload, OR
  • Auto-restart task from beginning, OR
  • Configurable max interruption window
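The three alternatives above could be expressed as a single configurable policy. The sketch below is hypothetical: `InterruptionPolicy`, `decide`, and the 300-second default are invented for illustration and are not OpenClaw settings or APIs.

```python
from enum import Enum


class InterruptionPolicy(Enum):
    """Hypothetical policy names mirroring the three expected behaviors."""
    FAIL_FAST = "fail_fast"    # fail as soon as a reload interrupts the task
    RESTART = "restart"        # rerun the task from the beginning
    MAX_WINDOW = "max_window"  # resume only if the interruption was short


def decide(policy: InterruptionPolicy,
           interrupted_for_s: float,
           max_window_s: float = 300.0) -> str:
    """Return "fail", "restart", or "resume" for an interrupted session.

    max_window_s is an assumed default, not a documented value.
    """
    if policy is InterruptionPolicy.FAIL_FAST:
        return "fail"
    if policy is InterruptionPolicy.RESTART:
        return "restart"
    # MAX_WINDOW: resuming is acceptable only within the configured window.
    return "resume" if interrupted_for_s <= max_window_s else "fail"
```

Under a `MAX_WINDOW` policy with a 5-minute window, the 2-hour interruption reported here (`decide(InterruptionPolicy.MAX_WINDOW, 7200)`) would yield `"fail"` rather than a late resume.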

Suggested Fixes

  1. If a session has been interrupted for more than X seconds, mark it as failed
  2. On gateway reload, restart or fail all running sub-agent sessions
  3. Notify user when task is interrupted (not just on resume)
  4. Store session state on disk for proper interruption tracking
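Fixes 1 to 3 might be combined roughly as follows. This is a sketch under stated assumptions, not the gateway's actual code: the `Session` record, its status strings, and the `notify` callback are all hypothetical stand-ins for whatever the real session manager and Telegram channel expose.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Session:
    """Hypothetical in-memory record of a sub-agent session."""
    session_id: str
    status: str = "running"  # "running", "interrupted", "failed", "completed"
    interrupted_at: Optional[float] = None  # epoch seconds


def on_gateway_reload(sessions: List[Session], now: float,
                      notify: Callable[[str], None]) -> List[Session]:
    """Mark running sessions as interrupted and tell the user right away."""
    affected = []
    for s in sessions:
        if s.status == "running":
            s.status = "interrupted"
            s.interrupted_at = now
            notify(f"Task {s.session_id} was interrupted by a gateway reload")
            affected.append(s)
    return affected


def expire_stale(sessions: List[Session], now: float,
                 max_interruption_s: float,
                 notify: Callable[[str], None]) -> List[Session]:
    """Fail sessions that have been interrupted longer than the limit."""
    expired = []
    for s in sessions:
        if (s.status == "interrupted"
                and s.interrupted_at is not None
                and now - s.interrupted_at > max_interruption_s):
            s.status = "failed"
            notify(f"Task {s.session_id} failed: interrupted for too long")
            expired.append(s)
    return expired
```

Calling `expire_stale` periodically, or just before any resume attempt, would keep a session from silently resuming hours after the reload.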

Related

  • #20436, #30487, #9708

extent analysis

TL;DR

Implement a fail-fast mechanism or auto-restart for sub-agent sessions interrupted by gateway reloads to prevent unexpected resumptions and notifications.

Guidance

  • Investigate implementing a timeout (e.g., X seconds) after which interrupted sessions are marked as failed to prevent stale data and resource waste.
  • Consider restarting or failing all running sub-agent sessions on gateway reload to ensure a clean state and prevent unexpected resumptions.
  • Store session state on disk to track interruptions accurately and enable proper handling of paused sessions.
  • Evaluate notifying users when tasks are interrupted, not just when they resume, to improve UX and provide more control over paused sessions.

Example

Without access to the gateway's internals, no implementation-specific snippet can be given. In general, storing session state on disk means saving each session's progress and status to a database or file so that it survives a reload.
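As a generic illustration of file-based storage, the sketch below keeps session status in a JSON file and writes it atomically so that a reload mid-write cannot leave a corrupt file. The file layout and the helper names `save_state`, `load_state`, and `mark_interrupted` are assumptions made for this example, not part of OpenClaw.

```python
import json
import os
import tempfile
import time
from pathlib import Path
from typing import Optional


def save_state(path, state: dict) -> None:
    """Write state as JSON via a temp file plus os.replace (atomic swap)."""
    path = Path(path)
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())  # make sure the bytes reach the disk first
    os.replace(tmp, path)


def load_state(path) -> dict:
    """Read the state file; an absent file means no tracked sessions."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def mark_interrupted(path, session_id: str, now: Optional[float] = None) -> None:
    """Record that a session was interrupted and when (epoch seconds)."""
    state = load_state(path)
    state[session_id] = {
        "status": "interrupted",
        "interrupted_at": time.time() if now is None else now,
    }
    save_state(path, state)
```

After a restart, the gateway could call `load_state` and compare each `interrupted_at` value with the current time to decide whether a session should be failed, restarted, or resumed.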

Notes

The suggested fixes imply that the current implementation lacks a robust mechanism for handling session interruptions. Implementing these fixes may require significant changes to the sub-agent and gateway reload logic.

Recommendation

Implement a fail-fast mechanism or auto-restart for sub-agent sessions interrupted by gateway reloads. Either option addresses the immediate issue of unexpected resumptions and late notifications.
