openclaw - 💡(How to fix) Fix Feature request: openclaw agents restart <id> — per-agent restart capability [1 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#75812Fetched 2026-05-02 05:29:39
View on GitHub
Comments
1
Participants
2
Timeline
1
Reactions
2
Timeline (top)
commented ×1

Add a openclaw agents restart <agent-id> command (or equivalent under another subcommand surface) that restarts a single agent's session/provider without bouncing the entire gateway. There is currently no way to surgically recover a stuck or runaway specialist agent — the only available action is openclaw gateway restart, which terminates all agents on the host and incurs ~3 minutes of downtime + the orphaned-child-process risk documented in #75747 + the session-lock-leak risk documented in #75805.

Root Cause

Add a openclaw agents restart <agent-id> command (or equivalent under another subcommand surface) that restarts a single agent's session/provider without bouncing the entire gateway. There is currently no way to surgically recover a stuck or runaway specialist agent — the only available action is openclaw gateway restart, which terminates all agents on the host and incurs ~3 minutes of downtime + the orphaned-child-process risk documented in #75747 + the session-lock-leak risk documented in #75805.

Fix Action

Fix / Workaround

Workaround in our deployment

Code Example

$ openclaw gateway restart --help
Usage: openclaw gateway restart [options]
Restart the Gateway service (launchd/systemd/schtasks)
Options:
  -h, --help  Display help for command
  --json      Output JSON (default: false)

---

$ openclaw agents --help
Commands:
  add, bind, bindings, delete, list, set-identity, unbind
$ openclaw channels --help
Commands:
  add, capabilities, list, login, logout, logs, remove, resolve, status

---

openclaw agents restart <agent-id> [--soft|--hard]
RAW_BUFFERClick to expand / collapse

Summary

Add a openclaw agents restart <agent-id> command (or equivalent under another subcommand surface) that restarts a single agent's session/provider without bouncing the entire gateway. There is currently no way to surgically recover a stuck or runaway specialist agent — the only available action is openclaw gateway restart, which terminates all agents on the host and incurs ~3 minutes of downtime + the orphaned-child-process risk documented in #75747 + the session-lock-leak risk documented in #75805.

Verified gap (against 2026.4.24)

$ openclaw gateway restart --help
Usage: openclaw gateway restart [options]
Restart the Gateway service (launchd/systemd/schtasks)
Options:
  -h, --help  Display help for command
  --json      Output JSON (default: false)

No --agent, no --name, no --id. Confirmed adjacent surfaces also lack per-agent restart:

$ openclaw agents --help
Commands:
  add, bind, bindings, delete, list, set-identity, unbind
$ openclaw channels --help
Commands:
  add, capabilities, list, login, logout, logs, remove, resolve, status

Existing channels logout + channels login cycles a transport (e.g. Discord WebSocket), but does not reset the agent's reasoning context — different failure mode, different fix.

Use case

Production multi-agent setup with a dedicated security/monitoring agent (Arya) that watches the other ten specialists for runaway sessions, stuck conversations, and token-spend anomalies. Industry pattern (AutoGen is_termination_msg, LangGraph supervisor pattern, CrewAI task cancellation) is to give the supervisor agent the ability to terminate-and-restart a specific subordinate without affecting siblings.

Today, when the supervisor detects a runaway, the available remediations are:

  1. Wait for session.reset.idleMinutes — slow (15+ min), only resolves cases that were going to recover anyway.
  2. /reset in the agent's Discord channel via the agent's messaging tool — works for context-level resets, doesn't help if the agent's provider is wedged.
  3. Full gateway restart — bounces all 11 agents, ~3 min downtime, surfaces #75747 risks.

The first is too slow, the third is too disruptive. There's no "just this one" option.

Proposed API

openclaw agents restart <agent-id> [--soft|--hard]
  • --soft (default): tear down the agent's active session, preserve channel bindings + auth profiles + workspace state, restart the agent's provider, resume on next inbound message. Equivalent to what a hypothetical /reset would do at the gateway level rather than the chat level.
  • --hard: full provider reinit including channel re-handshake. Useful when transport is also wedged (Discord WebSocket close 1006 with no recovery, etc.).

Exit codes:

  • 0 — restart succeeded, agent ready for new turn
  • 1 — agent ID not found
  • 2 — agent restart attempted but health check failed
  • 3 — agent currently locked (session in flight, retry after lock release)

Cross-references

  • #75747 — orphaned child processes after gateway restart on Linux. Per-agent restart should ensure single-agent process tree is fully reaped.
  • #75805 — session lock files not released on shutdown. Per-agent restart should release/clean its own locks before completing.
  • (Optionally) the AutoGen is_termination_msg pattern: https://microsoft.github.io/autogen/ — for prior art on supervisor-driven agent termination in production multi-agent systems.

Workaround in our deployment

Documented at the supervisor agent's prompt level: tiered protocol (watch → surgical /reset in Discord → escalate to human for full gateway restart). Works but Tier 3 is a manual step that interrupts the human, defeating the autonomy goal.

Severity

Feature gap, not a bug. Without it, supervisor-pattern multi-agent setups can't fully self-heal — every runaway becomes a human-paged incident.

extent analysis

TL;DR

Implementing a openclaw agents restart <agent-id> command with --soft and --hard options can provide a targeted solution for restarting individual agents without affecting the entire gateway.

Guidance

  • Introduce a new command under the openclaw agents subcommand surface to restart a single agent's session/provider.
  • Define the command's behavior with --soft and --hard options to handle different restart scenarios, including preserving channel bindings and auth profiles.
  • Implement exit codes to indicate the success or failure of the restart operation, including cases where the agent ID is not found or the agent is currently locked.
  • Ensure the new command addresses the issues mentioned in #75747 and #75805, such as reaping the single-agent process tree and releasing session lock files.

Example

openclaw agents restart <agent-id> [--soft|--hard]

This command would allow for targeted restarts of individual agents, reducing downtime and minimizing the risk of orphaned child processes and session lock leaks.

Notes

The proposed solution requires careful consideration of the existing openclaw command structure and the potential impact on other agents and the gateway as a whole. Additionally, the implementation should ensure that the new command is properly documented and integrated with existing features, such as the channels logout and channels login commands.

Recommendation

Apply the proposed openclaw agents restart <agent-id> command with --soft and --hard options to provide a targeted solution for restarting individual agents, reducing downtime and minimizing risks. This approach addresses the feature gap and aligns with industry patterns for supervisor-driven agent termination in production multi-agent systems.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

openclaw - 💡(How to fix) Fix Feature request: openclaw agents restart <id> — per-agent restart capability [1 comments, 2 participants]