openclaw - 💡(How to fix) Fix gateway: hot reload restartChannel loop aborts on first failure, leaving remaining channels unrestarted

Summary

In src/gateway/server-reload-handlers.ts, restartChannel is invoked from a plain for...of loop that awaits each call, with no per-channel try/catch:

const restartChannel = async (name: ChannelKind) => {
  params.logChannels.info(`restarting ${name} channel`);
  if (!channelsStoppedBeforePluginReload.has(name)) {
    await params.stopChannel(name);
  }
  await params.startChannel(name);
};
for (const channel of channelsToRestart) {
  await restartChannel(channel);
}

If stopChannel or startChannel throws for any one channel, the rejection propagates out of the awaited call and the loop aborts. The remaining channels in channelsToRestart are never stopped or started, leaving the gateway in a partial-reload state. The [reload] config hot reload applied (...) line may or may not be logged, depending on the throw site. The channel-status surface becomes "some channels running, some channels wedged", with no per-channel error log other than the one for the failing channel.
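
To make the abort behaviour concrete, here is a minimal, self-contained sketch. The channel names and the stubbed stopChannel/startChannel are hypothetical stand-ins, not the gateway's real implementations; only the loop shape mirrors the snippet above.

```typescript
// Hypothetical stand-ins for the gateway's channel lifecycle hooks.
type ChannelKind = "telegram" | "imessage" | "feishu";

// Records which channels actually reached startChannel.
const restarted: ChannelKind[] = [];

const stopChannel = async (name: ChannelKind): Promise<void> => {
  // Simulate a stop failure on one channel only.
  if (name === "telegram") {
    throw new Error(`stop failed for ${name}`);
  }
};

const startChannel = async (name: ChannelKind): Promise<void> => {
  restarted.push(name);
};

const restartChannel = async (name: ChannelKind): Promise<void> => {
  await stopChannel(name);
  await startChannel(name);
};

// Same shape as the loop in the issue: sequential awaits, no try/catch.
const naiveReload = async (channels: ChannelKind[]): Promise<void> => {
  restarted.length = 0;
  for (const channel of channels) {
    // The first rejection propagates out of the loop, so later
    // iterations never run.
    await restartChannel(channel);
  }
};
```

Calling naiveReload(["telegram", "imessage", "feishu"]) rejects with the telegram error, and restarted stays empty: imessage and feishu are never touched even though nothing is wrong with them.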

Affected code

src/gateway/server-reload-handlers.ts:415-427.

Reproduction

  1. Configure gateway with multiple channels enabled (e.g. telegram, imessage, feishu).
  2. Make one channel's stopChannel or startChannel throw. The easiest synthetic repro: cause the first telegram channel's stopChannel to take longer than the 5 s CHANNEL_STOP_ABORT_TIMEOUT_MS, and have the underlying worker's error path escape the supervisor's .catch(err) (e.g. an unhandled rejection during stop; the issue tracked in #83008 made this concrete on hot reload).
  3. Change config keys that touch multiple channels (or trigger a plugin reload that requires restarting more than one channel).
  4. Observe: the failing channel is logged as exited, and subsequent channels in channelsToRestart are silently skipped. There are no "restarting <name> channel" log lines for them, and no error attributing the skip to the previous channel's failure.

Why this is a silent wedge

  • Failure of channel A silently blocks restart of B, C, …
  • No per-channel error surface — only the first failing channel logs anything.
  • Combined with #83008 (telegram-specific wedge), a single agent-driven config patch can take out the entire post-reload channel surface, not just telegram.

Proposed fix

Per-channel isolation in the reload loop:

   for (const channel of channelsToRestart) {
-    await restartChannel(channel);
+    try {
+      await restartChannel(channel);
+    } catch (err) {
+      params.logChannels.error(
+        `[${channel}] hot-reload restart failed; remaining channels will continue: ${formatErrorMessage(err)}`,
+      );
+    }
   }

Optional follow-up: also wrap the earlier stopChannelsBeforePluginReplace loop (lines 320-332) the same way, so that a stop failure during a plugin reload doesn't poison the entire plugin reload either.
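
Since both loops would need the same treatment, one option is a small shared helper. The following is only a sketch under assumptions: forEachChannelIsolated and describeError are hypothetical names that do not exist in the codebase, and describeError is a local stand-in for the formatErrorMessage helper used in the diff above.

```typescript
type ChannelKind = string;

// Local stand-in for formatErrorMessage (assumed, not the real helper).
const describeError = (err: unknown): string =>
  err instanceof Error ? err.message : String(err);

// Runs `action` for every channel in order. A failure is logged and
// recorded, and the loop moves on to the next channel instead of aborting.
const forEachChannelIsolated = async (
  channels: ChannelKind[],
  phase: string,
  action: (name: ChannelKind) => Promise<void>,
  logError: (message: string) => void,
): Promise<ChannelKind[]> => {
  const failed: ChannelKind[] = [];
  for (const channel of channels) {
    try {
      await action(channel);
    } catch (err) {
      failed.push(channel);
      logError(
        `[${channel}] ${phase} failed; remaining channels will continue: ${describeError(err)}`,
      );
    }
  }
  // Callers can use this list to report a partial reload.
  return failed;
};
```

Returning the list of failed channels is a design choice beyond the proposed diff: it would let the caller log a single summary line for a partial reload, rather than leaving operators to infer it from individual error lines.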

Tests

src/gateway/server-reload-handlers.test.ts (if it exists; if not, the existing reload integration tests) can be extended to assert that throwing on channel A still restarts channel B. Happy to do this in a PR.
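
As a rough idea of what that assertion could look like, here is a framework-agnostic sketch. Everything in it is hypothetical: reloadChannels mimics the proposed per-channel try/catch rather than calling the real reload handler, and the Hooks shape is invented for illustration. A real test would use the project's own test runner and fixtures.

```typescript
// Hypothetical hooks; the real handler receives these via `params`.
type Hooks = {
  stopChannel: (name: string) => Promise<void>;
  startChannel: (name: string) => Promise<void>;
  logError: (message: string) => void;
};

// Mimics the proposed fix: each channel is restarted inside its own try/catch.
const reloadChannels = async (channels: string[], hooks: Hooks): Promise<void> => {
  for (const channel of channels) {
    try {
      await hooks.stopChannel(channel);
      await hooks.startChannel(channel);
    } catch (err) {
      hooks.logError(`[${channel}] hot-reload restart failed: ${String(err)}`);
    }
  }
};

// Regression check: a throw on channel A must not prevent channel B's restart.
const checkFailureIsIsolated = async (): Promise<void> => {
  const started: string[] = [];
  const errors: string[] = [];
  await reloadChannels(["A", "B"], {
    stopChannel: async (name) => {
      if (name === "A") throw new Error("stop timed out");
    },
    startChannel: async (name) => {
      started.push(name);
    },
    logError: (message) => {
      errors.push(message);
    },
  });
  if (started.join(",") !== "B") {
    throw new Error(`expected only B to start, got [${started.join(",")}]`);
  }
  if (errors.length !== 1 || !errors.join("\n").startsWith("[A]")) {
    throw new Error(`expected one error attributed to A, got ${JSON.stringify(errors)}`);
  }
};
```

The second check matters as much as the first: it pins down that the failure is attributed to channel A in the log, which is the per-channel error surface the issue says is missing.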

Environment

  • Gateway version: v2026.5.16-beta.4
  • Reproduced on macOS, single-host launchd-managed gateway.
