openclaw - 💡(How to fix) Fix Gateway hangs on shutdown when Telegram API is unreachable [2 comments, 3 participants]

openclaw · 2026-04-03 08:47:50

openclaw/openclaw#60180•Fetched 2026-04-08 02:35:22

Problem

When the Telegram API is unreachable (e.g., network issues, UND_ERR_CONNECT_TIMEOUT), the gateway cannot shut down cleanly. It logs "shutdown timed out; exiting without full cleanup" and exits with code 1, leaving the port in a dirty state.

This causes a restart loop: the watchdog restarts the gateway, but the new process cannot bind the port, which is still held by the dying process, so it fails immediately. The cycle repeats until the network recovers.

Observed behavior

From gateway.err.log:

[telegram] fetch fallback: enabling sticky IPv4-only dispatcher (codes=UND_ERR_CONNECT_TIMEOUT)
[telegram] fetch fallback: enabling sticky IPv4-only dispatcher (codes=UND_ERR_CONNECT_TIMEOUT)
... (dozens of these)
[gateway] shutdown timed out; exiting without full cleanup

From watchdog.log — 10+ consecutive "STILL DOWN" restarts during a Telegram outage:

[2026-04-03 13:14:44] Restarting gateway — health endpoint failed or timed out
[2026-04-03 13:15:05] Gateway STILL DOWN after restart
[2026-04-03 13:17:12] Restarting gateway — port 18789 not listening
[2026-04-03 13:18:17] Gateway STILL DOWN after restart
... (repeats for 25+ minutes)

Root cause

Two issues in the shutdown path:

1. `stopAccount()` has no timeout

In the channel manager, stopChannel() calls await plugin.gateway.stopAccount(...) with no timeout. When the Telegram API is unreachable, this hangs indefinitely — blocking the entire shutdown sequence.

// channel-manager — no timeout, hangs when API is down
if (plugin?.gateway?.stopAccount) {
  await plugin.gateway.stopAccount({ ... });
}

2. Channels are stopped sequentially during server shutdown

// server close handler — sequential, one slow channel blocks all
for (const plugin of listChannelPlugins())
  await params.stopChannel(plugin.id);

The 25-second force-exit timer (SHUTDOWN_TIMEOUT_MS) eventually fires, but by then the process is in a bad state and the port isn't released cleanly.
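To make the interaction between the two issues concrete, here is a small worked calculation. It is only a sketch: the channel count of three is hypothetical, `worstCaseStopMs` is an illustrative name, and the 10-second per-channel value is the one proposed under "Suggested fix". It suggests that bounding each stop is not enough on its own while stops still run one after another.

```typescript
// Worst-case time spent stopping channels, assuming every stop runs into its timeout.
function worstCaseStopMs(
  channelCount: number,
  perChannelTimeoutMs: number,
  mode: "sequential" | "parallel",
): number {
  if (channelCount <= 0) return 0;
  // Sequential stops add up; parallel stops overlap, so the slowest one dominates.
  return mode === "sequential"
    ? channelCount * perChannelTimeoutMs
    : perChannelTimeoutMs;
}

const FORCE_EXIT_MS = 25_000; // SHUTDOWN_TIMEOUT_MS from the issue
const PER_CHANNEL_MS = 10_000; // per-channel timeout proposed in the issue
const CHANNELS = 3; // hypothetical channel count

const sequentialMs = worstCaseStopMs(CHANNELS, PER_CHANNEL_MS, "sequential");
const parallelMs = worstCaseStopMs(CHANNELS, PER_CHANNEL_MS, "parallel");

console.log(sequentialMs > FORCE_EXIT_MS); // true: 30000 ms overruns the 25000 ms timer
console.log(parallelMs < FORCE_EXIT_MS); // true: 10000 ms fits inside it
```

Under these assumptions, sequential shutdown can still hit the force-exit timer even with per-channel timeouts, which is why the issue proposes both fixes together.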

Suggested fix

Fix 1: Wrap `stopAccount()` with a timeout

const STOP_ACCOUNT_TIMEOUT_MS = 10_000;

if (plugin?.gateway?.stopAccount) {
  // Keep a handle on the timer so it can be cleared once the race settles.
  let timer: ReturnType<typeof setTimeout> | undefined;
  await Promise.race([
    plugin.gateway.stopAccount({
      cfg, accountId: id, account,
      runtime: channelRuntimeEnvs[channelId],
      abortSignal: abort?.signal ?? new AbortController().signal,
      log: channelLogs[channelId],
      getStatus: () => getRuntime(channelId, id),
      setStatus: (next) => setRuntime(channelId, id, next),
    }),
    new Promise<void>((_, reject) => {
      timer = setTimeout(() => reject(new Error(
        `stopAccount timed out for ${channelId}/${id} after ${STOP_ACCOUNT_TIMEOUT_MS}ms`
      )), STOP_ACCOUNT_TIMEOUT_MS);
    }),
  ])
    .catch((err) => {
      // Log and move on: a failed or timed-out stop should not block the rest of shutdown.
      channelLogs[channelId]?.warn?.(
        `[${channelId}] stopAccount failed: ${err.message}; continuing shutdown`
      );
    })
    // Without this, a fast stopAccount() leaves a pending 10 s timer behind.
    .finally(() => clearTimeout(timer));
}
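The timeout pattern in Fix 1 can also be factored into a small reusable helper. The following is a minimal, self-contained sketch under stated assumptions: `withTimeout`, `hungStop`, and `demoWithTimeout` are hypothetical names that do not come from openclaw, and the never-settling promise merely stands in for a `stopAccount()` call against an unreachable API. The helper clears its timer once the work settles so that a fast result does not leave a pending timeout on the event loop.

```typescript
// Hypothetical helper (not part of openclaw): settles with `work`, or rejects after `ms`.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms,
    );
    work.then(
      (value) => {
        clearTimeout(timer); // work won: drop the pending timer
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Stand-in for a stopAccount() call that never settles while the API is unreachable.
const hungStop = new Promise<void>(() => {});

async function demoWithTimeout(): Promise<string[]> {
  const outcomes: string[] = [];
  // A stop that finishes in time passes its value through unchanged.
  outcomes.push(await withTimeout(Promise.resolve("stopped"), 50, "fast"));
  try {
    await withTimeout(hungStop, 50, "telegram/default");
  } catch (err) {
    // The hung stop is abandoned after 50 ms and shutdown can continue.
    outcomes.push(err instanceof Error ? err.message : String(err));
  }
  return outcomes;
}
```

Note that a timeout of this kind only stops waiting; it does not cancel the underlying request. Whether the plugin honors the `abortSignal` it receives is outside what this sketch can show.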

Fix 2: Stop channels in parallel with an overall timeout

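const CHANNEL_SHUTDOWN_TIMEOUT_MS = 15_000;

let shutdownTimer: ReturnType<typeof setTimeout> | undefined;
await Promise.race([
  // allSettled never rejects, so one failing channel cannot abort the others.
  Promise.allSettled(
    listChannelPlugins().map((plugin) => params.stopChannel(plugin.id))
  ),
  new Promise<void>((resolve) => {
    shutdownTimer = setTimeout(() => {
      gatewayLog.warn('channel shutdown timed out; continuing');
      resolve();
    }, CHANNEL_SHUTDOWN_TIMEOUT_MS);
  }),
]);
// Clear the deadline so it does not fire (or keep the process alive) after a clean stop.
clearTimeout(shutdownTimer);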
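To illustrate the effect of Fix 2 in isolation, the sketch below runs the same pattern against three stub channels. Everything in it is hypothetical: `stopAllChannels`, `demoChannels`, and the channel ids `fast`, `hung`, and `failing` are illustrative stand-ins for whatever `listChannelPlugins()` and `params.stopChannel()` do in the real gateway.

```typescript
type StopFn = () => Promise<void>;

// Hypothetical stand-ins for channel plugins.
const demoChannels: Record<string, StopFn> = {
  fast: () => new Promise<void>((resolve) => setTimeout(resolve, 10)),
  hung: () => new Promise<void>(() => {}), // never settles, like an unreachable API
  failing: () => Promise.reject(new Error("stop failed")),
};

async function stopAllChannels(
  stops: Record<string, StopFn>,
  overallTimeoutMs: number,
): Promise<Record<string, string>> {
  const status: Record<string, string> = {};
  for (const id of Object.keys(stops)) status[id] = "pending";

  // Start every stop at once and record each outcome as it settles.
  const all = Promise.allSettled(
    Object.entries(stops).map(([id, stop]) =>
      stop().then(
        () => { status[id] = "stopped"; },
        () => { status[id] = "failed"; },
      ),
    ),
  );

  // Overall deadline: resolves rather than rejects, so shutdown simply continues.
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<void>((resolve) => {
    timer = setTimeout(resolve, overallTimeoutMs);
  });

  await Promise.race([all, deadline]);
  clearTimeout(timer);
  return { ...status };
}
```

Calling `stopAllChannels(demoChannels, 100)` returns after roughly 100 ms with `fast` stopped, `failing` failed, and `hung` still pending. With a sequential loop, the hung channel would have blocked any channel queued after it.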

Bonus: make timeouts configurable

Ideally SHUTDOWN_TIMEOUT_MS and STOP_ACCOUNT_TIMEOUT_MS could be set in openclaw.json:

{
  "gateway": {
    "shutdown": {
      "timeoutMs": 25000,
      "channelTimeoutMs": 10000
    }
  }
}
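One way such settings could be read is sketched below. This is an assumption-heavy illustration, not openclaw's config loader: `resolveShutdownTimeouts` and `DEFAULT_SHUTDOWN_TIMEOUTS` are hypothetical names, the key names are taken from the JSON above, and the defaults mirror the 25-second and 10-second values mentioned in this issue. Capping the per-channel timeout at the overall timeout is a design choice of the sketch, not something the issue specifies.

```typescript
interface ShutdownTimeouts {
  timeoutMs: number;
  channelTimeoutMs: number;
}

// Defaults mirror the values discussed in the issue.
const DEFAULT_SHUTDOWN_TIMEOUTS: ShutdownTimeouts = {
  timeoutMs: 25_000,
  channelTimeoutMs: 10_000,
};

// Hypothetical resolver: accepts the parsed contents of openclaw.json.
function resolveShutdownTimeouts(config: unknown): ShutdownTimeouts {
  type Shape = { gateway?: { shutdown?: Record<string, unknown> } };
  const shutdown = (config as Shape | null | undefined)?.gateway?.shutdown ?? {};

  // Fall back to the default when a value is missing or not a positive finite number.
  const pick = (key: keyof ShutdownTimeouts): number => {
    const value = shutdown[key];
    return typeof value === "number" && Number.isFinite(value) && value > 0
      ? value
      : DEFAULT_SHUTDOWN_TIMEOUTS[key];
  };

  const timeoutMs = pick("timeoutMs");
  // Never let a single channel wait longer than the overall force-exit timer.
  const channelTimeoutMs = Math.min(pick("channelTimeoutMs"), timeoutMs);
  return { timeoutMs, channelTimeoutMs };
}
```

With an empty config this returns the defaults. With `timeoutMs: 5000` and `channelTimeoutMs: 10000` it returns 5000 for both, so the per-channel wait cannot outlast the force-exit timer.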

Environment

openclaw version: $(node -e "console.log(require('/opt/homebrew/lib/node_modules/openclaw/package.json').version)" 2>/dev/null || echo "unknown")
macOS (darwin arm64)
Channel: Telegram (long-polling mode)
Triggered by: intermittent network issues causing UND_ERR_CONNECT_TIMEOUT to Telegram API

extent analysis

TL;DR

Implement timeouts for stopAccount() and parallelize channel shutdown to prevent the gateway from hanging indefinitely during Telegram API outages.

Guidance

Wrap stopAccount() with a timeout (e.g., 10 seconds) to prevent it from blocking the shutdown sequence indefinitely.
Stop channels in parallel with an overall timeout (e.g., 15 seconds) to prevent a single slow channel from blocking all others.
Consider making timeouts configurable via openclaw.json for easier tuning.
Review the provided code snippets for stopAccount() timeout and parallel channel shutdown to ensure they fit your specific use case.

Example

The suggested fix provides example code snippets for implementing timeouts:

const STOP_ACCOUNT_TIMEOUT_MS = 10_000;
await Promise.race([
  plugin.gateway.stopAccount({ /*... */ }),
  new Promise<void>((_, reject) =>
    setTimeout(() => reject(new Error(`stopAccount timed out`)), STOP_ACCOUNT_TIMEOUT_MS)
  ),
]);

And for parallel channel shutdown:

const CHANNEL_SHUTDOWN_TIMEOUT_MS = 15_000;
await Promise.race([
  Promise.allSettled(listChannelPlugins().map((plugin) => params.stopChannel(plugin.id))),
  new Promise<void>((resolve) =>
    setTimeout(() => {
      gatewayLog.warn('channel shutdown timed out; continuing');
      resolve();
    }, CHANNEL_SHUTDOWN_TIMEOUT_MS)
  ),
]);

Notes

The provided code snippets assume a TypeScript environment and may require adjustments for other languages or frameworks. Additionally, the choice of timeout values (e.g., 10 seconds, 15 seconds) may need to be tuned based on your specific use case and performance requirements.

Recommendation

Apply the suggested fixes to implement timeouts for stopAccount() and parallelize channel shutdown. This should help prevent the gateway from hanging indefinitely during Telegram API outages and reduce the likelihood of restart loops.
