openclaw - ✅(Solved) Fix [Bug]: Gateway crashes with uncaught exception on Discord WebSocket stale-socket reconnect (code 1005) [2 pull requests, 1 comment, 1 participant]

GitHub stats
openclaw/openclaw#57291 · Fetched 2026-04-08 01:51:30
Comments: 1 · Participants: 1 · Timeline: 10 · Reactions: 0
Timeline (top): cross-referenced ×4, referenced ×3, labeled ×2, commented ×1

The gateway process crashes with an uncaught exception repeatedly (~every 90 minutes) when the internal health monitor detects a stale Discord socket and attempts to restart it. Instead of recovering gracefully, the reconnect failure propagates as an uncaught exception and kills the entire Node.js process.

Error Message

[openclaw] Uncaught exception: Error: Max reconnect attempts (0) reached after code 1005

Fix Action

Fixed

PR fix notes

PR #57352: fix(discord): suppress reconnect-exhausted crash when maxAttempts=0

Description (problem / solution / changelog)

Summary

When the health monitor triggers a stale-socket restart, it sets gateway.options.reconnect.maxAttempts = 0 before disconnecting. The Discord gateway library then fires Max reconnect attempts (0) reached, which was propagated as an uncaught exception and crashed the entire Node process.

Root Cause

In extensions/discord/src/monitor/provider.lifecycle.ts, handleGatewayEvent only suppressed reconnect-exhausted errors when lifecycleStopping was already true. However, the health-monitor restart fires while the gateway is fully operational, so lifecycleStopping is false. The error then calls danger() and throws, crashing the process.

Fix

Added an explicit check for maxAttempts === 0 in handleGatewayEvent before the lifecycleStopping guard. When maxAttempts=0, reconnect-exhausted is always treated as a graceful intentional abort (logged at info, returns stop), regardless of lifecycleStopping state.
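
The guard order described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the PR: the `GatewayEvent`, `Decision`, and `LifecycleState` types and the `log` callback are hypothetical stand-ins, and the error-level log call stands in for `danger()`. Only the names `handleGatewayEvent`, `lifecycleStopping`, and `maxAttempts` come from the PR description.

```typescript
// Hypothetical stand-in types; the real OpenClaw and Carbon types may differ.
type GatewayEvent = { type: "reconnect-exhausted" | "other"; error: Error };
type Decision = "stop" | "continue";

interface LifecycleState {
  lifecycleStopping: boolean;
  // Mirrors gateway.options.reconnect.maxAttempts from the PR description.
  maxAttempts: number;
  // Hypothetical logger; the error level stands in for danger().
  log: (level: "info" | "error", message: string) => void;
}

function handleGatewayEvent(event: GatewayEvent, state: LifecycleState): Decision {
  if (event.type !== "reconnect-exhausted") {
    return "continue";
  }

  // New guard: maxAttempts === 0 means reconnects were disabled on purpose
  // (the health-monitor restart), so treat exhaustion as a graceful abort.
  if (state.maxAttempts === 0) {
    state.log("info", `intentional reconnect abort: ${event.error.message}`);
    return "stop";
  }

  // Pre-existing guard: suppress the error during a normal shutdown.
  if (state.lifecycleStopping) {
    return "stop";
  }

  // Anything else is a genuine failure and is still surfaced.
  state.log("error", event.error.message);
  throw event.error;
}
```

Placing the `maxAttempts === 0` check first means the outcome no longer depends on whether `lifecycleStopping` was set before the socket closed, which is the timing problem the root-cause section describes.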

Files Changed

  • extensions/discord/src/monitor/provider.lifecycle.ts
  • extensions/discord/src/monitor/provider.lifecycle.test.ts

Linked Issue

Fixes #57291

Changed files

  • extensions/discord/src/monitor/provider.lifecycle.test.ts (modified, +39/-0)
  • extensions/discord/src/monitor/provider.lifecycle.ts (modified, +6/-0)
  • src/agents/pi-tools.ts (modified, +5/-1)
  • src/agents/tool-policy.test.ts (modified, +24/-0)
  • src/agents/tool-policy.ts (modified, +19/-2)

PR #57458: fix(discord): prevent uncaught exception on WebSocket stale-socket reconnect

Description (problem / solution / changelog)

Summary

  • Fix #57291: Gateway crashes with uncaught exception when health monitor triggers stale-socket restart for Discord
  • Replace one-shot once("error") with persistent on("error") handler during teardown to absorb all error events from Carbon GatewayPlugin
  • Move handler removal to after async cleanup (voiceManager.destroy(), execApprovalsHandler.stop()) in finally block, preventing race condition where deferred close events fire during yielded event loop
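
The difference between the two handler styles can be sketched with Node's `EventEmitter`, which throws when an `error` event is emitted with no listener attached. This is an illustration only: `secondErrorSurvives` and `teardownAbsorbingErrors` are hypothetical helpers, and a plain `EventEmitter` stands in for the Carbon GatewayPlugin.

```typescript
import { EventEmitter } from "node:events";

// Emits two "error" events and reports whether the second one was absorbed.
function secondErrorSurvives(register: "once" | "on"): boolean {
  const gateway = new EventEmitter();
  const absorb = (): void => {};
  if (register === "once") {
    gateway.once("error", absorb); // removed after the first error
  } else {
    gateway.on("error", absorb); // stays attached until removed explicitly
  }
  try {
    gateway.emit("error", new Error("first close error"));
    // With once(), no listener is left here, so emit() throws the error.
    gateway.emit("error", new Error("second close error"));
    return true;
  } catch {
    return false;
  } finally {
    gateway.removeAllListeners("error");
  }
}

type Cleanup = () => Promise<void>;

// Keeps the handler attached across every awaited cleanup step and detaches
// it only in finally, so errors fired while the event loop is yielded are
// still absorbed. Returns how many errors were absorbed.
async function teardownAbsorbingErrors(
  gateway: EventEmitter,
  cleanups: Cleanup[],
): Promise<number> {
  let absorbed = 0;
  const absorb = (): void => {
    absorbed += 1;
  };
  gateway.on("error", absorb);
  try {
    for (const cleanup of cleanups) {
      await cleanup();
    }
  } finally {
    gateway.off("error", absorb);
  }
  return absorbed;
}
```

In this sketch the cleanup callbacks play the role of `voiceManager.destroy()` and `execApprovalsHandler.stop()`. Any error emitted while they are awaited reaches the still-attached handler instead of becoming an uncaught exception.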

Test plan

  • Discord extension tests pass
  • Manual: run gateway with Discord channel, wait for stale-socket detection, verify gateway survives reconnect

🦞 Generated with Claude Code

Changed files

  • extensions/discord/src/monitor/provider.lifecycle.ts (modified, +28/-3)

Bug type

Crash (process/app exits or hangs)

Beta release blocker

No

Summary

The gateway process crashes with an uncaught exception repeatedly (~every 90 minutes) when the internal health monitor detects a stale Discord socket and attempts to restart it. Instead of recovering gracefully, the reconnect failure propagates as an uncaught exception and kills the entire Node.js process.

Steps to reproduce

Run the gateway normally with a Discord channel connected. No special config is required; the crash happens on its own at regular intervals.

Log Sequence

  1. Health monitor fires: health-monitor: restarting (reason: stale-socket)
  2. Uncaught exception immediately follows and process exits:

```
[openclaw] Uncaught exception: Error: Max reconnect attempts (0) reached after code 1005
    at SafeGatewayPlugin.handleReconnectionAttempt (provider-CAlWEl41.js:3318:47)
    at SafeGatewayPlugin.handleClose (provider-CAlWEl41.js:3364:8)
    at WebSocket.<anonymous> (provider-CAlWEl41.js:3307:9)
    at WebSocket.emit (node:events:508:28)
    at WebSocket.emitClose (...ws/lib/websocket.js:273:10)
    at TLSSocket.socketOnClose (...ws/lib/websocket.js:1346:15)
```

Expected behavior

Health monitor restarts the Discord WebSocket connection and gateway continues running.

Actual behavior

Uncaught exception kills the gateway process entirely. Crashed 4+ times today.

OpenClaw version

2026.3.24

Operating system

Windows 11 x64 (Build 26200)

Install method

npm global

Model

anthropic/claude-sonnet-4-6 (primary), ollama/qwen2.5:7b (cron jobs at time of crash)

Provider / routing chain

openclaw -> anthropic (direct, no proxy or AI gateway)

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Impact and severity

No response

Additional information

No response

extent analysis

Fix Plan

To stop the gateway process from crashing when the health monitor restarts a stale Discord socket, this plan adds a bounded retry mechanism and handles reconnect failure without throwing. It differs from the linked PRs (#57352 and #57458), which keep maxAttempts = 0 and instead suppress or absorb the resulting error in provider.lifecycle.ts.

Step-by-Step Solution:

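  1. Modify the handleReconnectionAttempt function to include a retry counter and a maximum number of attempts. The stack trace locates it in provider-CAlWEl41.js, which appears to be a bundled build file, so the change likely belongs in the corresponding source.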
  2. Catch and handle the reconnect failure to prevent the uncaught exception from killing the process.
  3. Implement an exponential backoff strategy for retries to avoid overwhelming the Discord socket.

Example Code:

// provider-CAlWEl41.js
// Illustrative sketch: this hashed file name comes from the stack trace and
// looks like build output, so treat the edit location as an assumption.

const maxReconnectAttempts = 5;
const initialRetryDelay = 1000; // 1 second
const maxRetryDelay = 30000; // 30 seconds

class SafeGatewayPlugin {
  // ...

  // Placeholder for the real connect logic; assumed to return a promise
  // that rejects when the connection attempt fails.
  async reconnect() {}

  handleReconnectionAttempt(attempt = 0) {
    if (attempt >= maxReconnectAttempts) {
      // Log and return instead of throwing, so the process stays alive
      console.error('Max reconnect attempts reached');
      return;
    }

    // Attempt to reconnect
    this.reconnect().catch((error) => {
      // Handle reconnect failure
      console.error('Reconnect failed:', error);

      // Exponential backoff, capped at maxRetryDelay
      const retryDelay = Math.min(initialRetryDelay * 2 ** attempt, maxRetryDelay);
      setTimeout(() => this.handleReconnectionAttempt(attempt + 1), retryDelay);
    });
  }

  // ...
}
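
To make the backoff formula in the example concrete, the delays it produces can be computed directly. This is a small worked sketch that reuses the constants from the example; `retryDelay` and `retrySchedule` are hypothetical helper names introduced here for illustration.

```typescript
const initialRetryDelay = 1000; // 1 second
const maxRetryDelay = 30000; // 30 seconds

// Delay before the next attempt, doubling each time and capped at maxRetryDelay.
function retryDelay(attempt: number): number {
  return Math.min(initialRetryDelay * 2 ** attempt, maxRetryDelay);
}

// Delays for attempts 0 .. maxAttempts - 1.
function retrySchedule(maxAttempts: number): number[] {
  return Array.from({ length: maxAttempts }, (_, attempt) => retryDelay(attempt));
}

console.log(retrySchedule(5)); // [ 1000, 2000, 4000, 8000, 16000 ]
```

With five attempts the longest delay is 16 seconds, so the 30-second cap only takes effect if the attempt limit is raised to six or more.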

Verification

To verify the fix, run the gateway process and simulate a stale Discord socket to trigger the health monitor. The process should retry the reconnect up to the configured number of attempts and, if they all fail, log the failure instead of crashing.

Extra Tips

  • Monitor the retry attempts and adjust the maxReconnectAttempts and initialRetryDelay values as needed to balance between reconnecting quickly and avoiding overwhelming the Discord socket.
  • Consider implementing a circuit breaker pattern to detect and prevent further reconnect attempts when the Discord socket is consistently failing.
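
A circuit breaker for reconnects might look like the sketch below. It is a generic illustration of the pattern, not part of OpenClaw or the Discord gateway library: `ReconnectCircuitBreaker`, its thresholds, and the injectable clock are all assumptions made for the example.

```typescript
// Opens after `threshold` consecutive failures and blocks further attempts
// until `cooldownMs` has elapsed, after which one trial attempt is allowed.
class ReconnectCircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(threshold: number, cooldownMs: number, now: () => number = Date.now) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // True when the circuit is closed, or open but past its cooldown (half-open).
  canAttempt(): boolean {
    if (this.openedAt === null) {
      return true;
    }
    return this.now() - this.openedAt >= this.cooldownMs;
  }

  // A successful reconnect closes the circuit and resets the failure count.
  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  // Each failure counts toward the threshold; reaching it (re)opens the circuit.
  recordFailure(): void {
    this.failures += 1;
    if (this.failures >= this.threshold) {
      this.openedAt = this.now();
    }
  }
}
```

A reconnect loop would call `canAttempt()` before each attempt and report the result through `recordSuccess()` or `recordFailure()`. Passing the clock as a parameter keeps the cooldown logic testable without real timers.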
