openclaw - ✅(Solved) Fix [Bug]: Discord WebSocket code 1006 triggers uncaught exception and crashes gateway instead of reconnecting [1 pull requests, 6 comments, 4 participants]

GitHub stats
openclaw/openclaw#53644 · Fetched 2026-04-08 01:25:27
Comments: 6 · Participants: 4 · Timeline: 15 · Reactions: 0
Timeline (top): commented ×6, cross-referenced ×4, labeled ×2, closed ×1

When the Discord WebSocket drops with code 1006 (abnormal closure), the internal @buape/carbon GatewayPlugin throws an uncaught exception (Max reconnect attempts (0) reached after code 1006) that crashes the entire OpenClaw gateway process, requiring manual restart.

Error Message

Uncaught exception: Error: Max reconnect attempts (0) reached after code 1006
    at GatewayPlugin.handleReconnectionAttempt (.../GatewayPlugin.js:305:40)
    at GatewayPlugin.handleClose (.../GatewayPlugin.js:350:14)
    at WebSocket.<anonymous> (.../GatewayPlugin.js:294:18)
    at WebSocket.emit (node:events:508:28)
    at WebSocket.emitClose (.../ws/lib/websocket.js:263:12)
    at emitErrorAndClose (.../ws/lib/websocket.js:1047:13)
    at process.processTicksAndRejections (node:internal/process/task_queues:89:21)

Root Cause

The Carbon library's GatewayPlugin.handleReconnectionAttempt() throws an error once reconnect attempts are exhausted, and the limit in effect here is 0, so the first failed resume after code 1006 throws. Because the throw happens inside a WebSocket event callback, it bypasses OpenClaw's lifecycle error handling and surfaces as an uncaught exception that terminates the gateway process.

Fix Action

Fixed

PR fix notes

PR #53854: fix(discord): catch thrown errors from gateway WebSocket close handler

Description (problem / solution / changelog)

Summary

When the Discord WebSocket drops with code 1006 (abnormal closure), the internal @buape/carbon GatewayPlugin throws an exception (Max reconnect attempts (0) reached after code 1006) instead of emitting it as an error event. This thrown exception bypasses OpenClaw's lifecycle error handling and crashes the entire gateway process.

This PR wraps the WebSocket's emit method to catch errors thrown during the close event and converts them to emitted error events, allowing the existing error handling in provider.lifecycle.ts to process them gracefully.

Changes

  • Added wrapWebSocketWithErrorGuard() helper that intercepts the WebSocket's emit method and wraps close event handlers in try-catch
  • Modified SafeGatewayPlugin.createWebSocket() to apply the error guard to all WebSocket instances
  • Added comprehensive tests for the error wrapping behavior
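The emit-wrapping idea in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from PR #53854: `guardCloseEmit` is a hypothetical name, and a plain Node.js `EventEmitter` stands in for the `ws` WebSocket.

```javascript
const { EventEmitter } = require("node:events");

// Hypothetical helper modeled on the PR description; the real
// wrapWebSocketWithErrorGuard() may be implemented differently.
function guardCloseEmit(socket) {
  const originalEmit = socket.emit.bind(socket);
  socket.emit = (event, ...args) => {
    if (event !== "close") {
      return originalEmit(event, ...args);
    }
    try {
      return originalEmit(event, ...args);
    } catch (err) {
      // A close listener threw: surface it as an "error" event instead.
      const error = err instanceof Error ? err : new Error(String(err));
      return originalEmit("error", error);
    }
  };
  return socket;
}

// Demo: an EventEmitter stands in for the WebSocket.
const guardedSocket = guardCloseEmit(new EventEmitter());
const guardedErrors = [];
guardedSocket.on("error", (err) => guardedErrors.push(err.message));
guardedSocket.on("close", () => {
  throw new Error("Max reconnect attempts (0) reached after code 1006");
});
guardedSocket.emit("close", 1006); // no longer throws to the caller
console.log(guardedErrors);
```

One caveat of this design: an `EventEmitter` with no "error" listener throws when "error" is emitted, so the guard only helps if an error listener (such as the lifecycle handling mentioned in the PR) is attached.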

Root Cause

The Carbon library's GatewayPlugin.handleReconnectionAttempt() throws an error when reconnect attempts are exhausted. This error originates in a WebSocket event callback, so it escapes the normal Promise/async error handling and becomes an uncaught exception that Node.js cannot catch without a process-level handler.
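To illustrate why the throw escapes, the sketch below uses a plain `EventEmitter` in place of the WebSocket (an assumption made purely for demonstration). A `try`/`catch` around listener registration never sees the error; it surfaces at whichever call site invokes `emit`, which in the stack trace above is `ws` library code running from the process tick queue.

```javascript
const { EventEmitter } = require("node:events");

const fakeSocket = new EventEmitter(); // stand-in for the ws WebSocket

// Guarding the registration does nothing: the listener has not run yet.
let caughtAtRegistration = false;
try {
  fakeSocket.on("close", () => {
    throw new Error("Max reconnect attempts (0) reached after code 1006");
  });
} catch {
  caughtAtRegistration = true;
}

// The throw propagates out of emit(), i.e. to whoever emits the event.
let caughtAtEmit = null;
try {
  fakeSocket.emit("close", 1006);
} catch (err) {
  caughtAtEmit = err.message;
}

console.log({ caughtAtRegistration, caughtAtEmit });
```

In the real gateway the emitting call site belongs to the `ws` library, so application code has no surrounding `try`/`catch` to rely on.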

Test Plan

  • Added unit tests for wrapWebSocketWithErrorGuard
  • Verified existing provider.lifecycle.test.ts tests still pass
  • Verified all related gateway tests pass

Fixes #53644

Joel Nishanth · offlyn.AI

Changed files

  • extensions/discord/src/monitor/gateway-plugin.test.ts (added, +105/-0)
  • extensions/discord/src/monitor/gateway-plugin.ts (modified, +57/-5)


Bug type

Crash (process/app exits or hangs)

Summary

When the Discord WebSocket drops with code 1006 (abnormal closure), the internal @buape/carbon GatewayPlugin throws an uncaught exception (Max reconnect attempts (0) reached after code 1006) that crashes the entire OpenClaw gateway process, requiring manual restart.

Steps to reproduce

  1. Run OpenClaw gateway on Windows with Discord channel configured (3 bot accounts).
  2. Wait for a transient network interruption or Discord server hiccup that closes the WebSocket with code 1006 (abnormal closure).
  3. Observe: gateway attempts one resume with 1000ms backoff, fails, and throws an uncaught exception.
  4. Gateway process exits. No auto-recovery. Manual restart required.

This occurred at 00:19:41 MDT on 2026-03-24 after an unattended overnight run.

Expected behavior

Gateway should catch the Discord WebSocket drop, attempt reconnection with exponential backoff (as it does for other transient errors), and resume normal operation without crashing the process. A transient network blip should not take down the entire gateway.
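As a rough sketch of what exponential backoff means here, the helper below doubles the delay on each attempt and caps it. The function name and the 30-second cap are assumptions; only the 1000 ms starting delay comes from the log in this issue.

```javascript
// Hypothetical helper: delay before reconnect attempt `attempt` (0-based).
// baseMs matches the 1000ms backoff seen in the log; maxMs is an assumed cap.
function backoffDelayMs(attempt, baseMs = 1000, maxMs = 30000) {
  const delay = baseMs * 2 ** attempt;
  return Math.min(delay, maxMs);
}

const delays = [0, 1, 2, 3, 4, 5, 6].map((n) => backoffDelayMs(n));
console.log(delays);
```

Capping the delay keeps a long outage from pushing the next attempt arbitrarily far into the future.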

Actual behavior

Gateway process crashes with an uncaught exception. The error originates in GatewayPlugin.handleReconnectionAttempt inside @buape/carbon. Full stack trace from gateway log:

Uncaught exception: Error: Max reconnect attempts (0) reached after code 1006
    at GatewayPlugin.handleReconnectionAttempt (.../GatewayPlugin.js:305:40)
    at GatewayPlugin.handleClose (.../GatewayPlugin.js:350:14)
    at WebSocket.<anonymous> (.../GatewayPlugin.js:294:18)
    at WebSocket.emit (node:events:508:28)
    at WebSocket.emitClose (.../ws/lib/websocket.js:263:12)
    at emitErrorAndClose (.../ws/lib/websocket.js:1047:13)
    at process.processTicksAndRejections (node:internal/process/task_queues:89:21)

Gateway does not restart itself. No auto-recovery mechanism. Requires manual intervention.

OpenClaw version

2026.3.12

Operating system

Windows 10 (10.0.19045 x64) — LAPTOP-UJ92CE78, Node.js v24.13.1

Install method

npm global (npm install -g openclaw)

Model

anthropic/claude-sonnet-4-6

Provider / routing chain

openclaw -> anthropic (direct, no proxy)

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Relevant log sequence (from openclaw-2026-03-23.log, times in MDT):


00:19:25 discord gateway: WebSocket connection closed with code 1006
00:19:27 discord gateway: Attempting resume with backoff: 1000ms after code 1006
00:19:30 health-monitor: restarting (reason: disconnected) [discord:default]
00:19:32 health-monitor: restarting (reason: disconnected) [discord:petra]
00:19:41 [openclaw] Uncaught exception: Error: Max reconnect attempts (0) reached after code 1006
         at GatewayPlugin.handleReconnectionAttempt (GatewayPlugin.js:305:40)
         at GatewayPlugin.handleClose (GatewayPlugin.js:350:14)
         ... (full stack in actual log)


Log file ends here. Gateway is dead. Next log entry is the manual restart at 05:12 MDT (~5 hours later).

Impact and severity

  • Affected: Any OpenClaw deployment running Discord channel(s) on an unattended host
  • Severity: Critical — entire gateway process dies, all agents go offline
  • Frequency: Triggered by any transient Discord WS drop (code 1006); in this case occurred once overnight in ~3 weeks of uptime
  • Consequence: All agents unreachable, all cron jobs missed, no auto-recovery — requires manual restart. On a headless/unattended setup (e.g. home server, VPS), this can mean hours of downtime before someone notices.

Additional information

  • 3 Discord bot accounts configured (multi-account setup: default, petra, bean)
  • The channels.discord.retry config schema only covers outbound API call retries, not WebSocket reconnect behavior — there is no user-configurable setting to increase maxReconnectAttempts beyond 0
  • The health-monitor correctly detected the disconnection and attempted to restart, but the uncaught exception from @buape/carbon killed the process before it could recover
  • Code 1006 is a WebSocket abnormal closure (no clean close frame) — common for transient network drops, NAT timeouts, Discord server restarts. It should be treated as a recoverable error, not a fatal one.
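The recoverable-versus-fatal distinction in the last point could be expressed as a small classifier. This is a sketch under assumptions: the helper name is hypothetical, and the fatal set reflects the close codes that Discord's gateway documentation marks as non-reconnectable as far as I know; check it against the current documentation before relying on it.

```javascript
// Discord gateway close codes documented as "do not reconnect":
// 4004 authentication failed, 4010 invalid shard, 4011 sharding required,
// 4012 invalid API version, 4013 invalid intents, 4014 disallowed intents.
const FATAL_DISCORD_CLOSE_CODES = new Set([4004, 4010, 4011, 4012, 4013, 4014]);

// Hypothetical helper: everything outside the fatal set, including the
// WebSocket-level 1006 (abnormal closure), is treated as retryable.
function isRecoverableClose(code) {
  return !FATAL_DISCORD_CLOSE_CODES.has(code);
}

console.log(isRecoverableClose(1006), isRecoverableClose(4004));
```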

extent analysis

Fix Plan

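One way to address the issue at its source is to change the GatewayPlugin in @buape/carbon so that a code 1006 WebSocket closure is treated as a recoverable error and reconnection uses exponential backoff. Note that PR #53854 takes a different route: it leaves the library untouched and guards the WebSocket's emit method inside OpenClaw.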

Step-by-Step Solution:

  1. Increase max reconnect attempts: Modify the maxReconnectAttempts variable in GatewayPlugin to a reasonable value (e.g., 5) to allow for multiple reconnection attempts.
  2. Implement exponential backoff: Update the handleReconnectionAttempt method to use exponential backoff for reconnection attempts.
  3. Catch and handle exceptions: Wrap the reconnection logic in a try-catch block to prevent uncaught exceptions from crashing the process.

Example Code:

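// GatewayPlugin.js (modified) - illustrative sketch, not the library's actual source
const maxReconnectAttempts = 5; // increased from 0
const initialBackoff = 1000; // 1 second
const backoffMultiplier = 2;

// ...

handleReconnectionAttempt() {
  if (this.reconnectAttempts >= maxReconnectAttempts) {
    // Report exhaustion as an event instead of throwing or exiting, so the
    // host's lifecycle handling decides what to do. Assumes the plugin
    // exposes an emit() method; that is not confirmed by the issue.
    this.emit('error', new Error('Max reconnect attempts reached'));
    return;
  }

  // Delay doubles on every attempt: 1s, 2s, 4s, ...
  const backoff = initialBackoff * Math.pow(backoffMultiplier, this.reconnectAttempts);
  this.reconnectAttempts++; // count every attempt, not only the ones that throw
  setTimeout(() => {
    try {
      // attempt to reconnect; reset this.reconnectAttempts to 0 once connected
      this.reconnect();
    } catch (error) {
      console.error('Reconnection attempt failed:', error);
      this.handleReconnectionAttempt();
    }
  }, backoff);
}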

Verification

To verify the fix, reproduce the issue by simulating a WebSocket closure with code 1006 and observe the gateway's behavior. The gateway should now attempt to reconnect with exponential backoff and not crash due to an uncaught exception.
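Because real 1006 closures are hard to trigger on demand, one option is to check the retry policy in isolation first. The harness below is a hypothetical test double, not part of OpenClaw or @buape/carbon: `connect` is a stub that reports success or failure, and delays are recorded rather than slept so the check runs instantly.

```javascript
// Hypothetical test double: `connect(attempt)` returns true on success.
// Delays are collected instead of awaited to keep the check deterministic.
function simulateReconnect(connect, maxAttempts = 5, baseMs = 1000) {
  const delays = [];
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    delays.push(baseMs * 2 ** attempt);
    if (connect(attempt)) {
      return { connected: true, delays };
    }
  }
  // Exhaustion is reported as a value, never thrown.
  return { connected: false, delays };
}

// Simulated outage: the first two attempts fail, the third succeeds.
const recovered = simulateReconnect((attempt) => attempt >= 2);
// Permanent outage with the original limit of 0 attempts.
const exhausted = simulateReconnect(() => false, 0);
console.log(recovered, exhausted);
```

The second case mirrors the configuration from the issue: with a limit of 0 the policy gives up immediately, but it does so by returning a result instead of throwing.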

Extra Tips

  • Consider adding a configuration option to allow users to adjust the maxReconnectAttempts and backoff settings.
  • Implement logging and monitoring to detect and alert on repeated reconnection failures, indicating a potential underlying issue.
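The first tip could look something like the sketch below. All key names and default values are hypothetical; per the issue, OpenClaw does not currently offer a setting for WebSocket reconnect behavior.

```javascript
// Hypothetical settings and defaults; none of these keys exist in OpenClaw today.
const RECONNECT_DEFAULTS = {
  maxReconnectAttempts: 5,
  initialBackoffMs: 1000,
  maxBackoffMs: 30000,
};

// Merge user overrides onto the defaults and reject nonsensical values.
function normalizeReconnectConfig(userConfig = {}) {
  const merged = { ...RECONNECT_DEFAULTS, ...userConfig };
  for (const [key, value] of Object.entries(merged)) {
    if (!Number.isInteger(value) || value < 0) {
      throw new RangeError(`${key} must be a non-negative integer, got ${value}`);
    }
  }
  if (merged.maxBackoffMs < merged.initialBackoffMs) {
    throw new RangeError("maxBackoffMs must be >= initialBackoffMs");
  }
  return merged;
}

console.log(normalizeReconnectConfig({ maxReconnectAttempts: 10 }));
```

Validating at load time means a typo in the config fails loudly at startup rather than silently disabling reconnection.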
