openclaw - ✅(Solved) Fix [Bug]: Gateway crashes with unhandled promise rejection when Discord API returns 503 during health-monitor bot reconnect [4 pull requests, 3 comments, 3 participants]

GitHub stats
openclaw/openclaw#44529 · Fetched 2026-04-08 00:45:41
Comments: 3 · Participants: 3 · Timeline events: 15 · Reactions: 0
Timeline (top): cross-referenced ×4, referenced ×4, commented ×3, labeled ×2

When health-monitor restarts a stuck Discord bot connection and Discord API returns a 503 with non-JSON body, the gateway throws an unhandled promise rejection and the entire Node.js process exits.

Error Message

[openclaw] Unhandled promise rejection: Error: Failed to get gateway information from Discord: Unexpected token 'u', "upstream c"... is not valid JSON
    at GatewayPlugin.registerClient (file:///opt/homebrew/lib/node_modules/openclaw/dist/subsystem-nlluZawe.js)
    at processTicksAndRejections (node:internal/process/task_queues:105:5)

Root Cause

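ProxyGatewayPlugin.registerClient() calls response.json() on the Discord gateway-info response without first checking response.ok. When health-monitor restarts a stuck Discord bot connection and Discord returns a 503 with a non-JSON body, the JSON parse fails. Because Carbon calls registerClient() without await, the error surfaces as an unhandled promise rejection and the entire Node.js process exits.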
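
To see where the "is not valid JSON" part of the error comes from, the minimal sketch below feeds the 503 body text quoted in this report to JSON.parse. The helper name tryParseBody is hypothetical and exists only for illustration; it is not OpenClaw code.

```javascript
// Hypothetical helper: attempts to parse a response body as JSON and reports
// the outcome instead of throwing.
function tryParseBody(bodyText) {
  try {
    return { ok: true, value: JSON.parse(bodyText) };
  } catch (error) {
    // JSON.parse throws a SyntaxError for any text that is not valid JSON.
    return { ok: false, error };
  }
}

// Body text quoted in the issue's 503 response.
const plainText503 =
  "upstream connect error or disconnect/reset before headers. reset reason: overflow";

const parseResult = tryParseBody(plainText503);
console.log(parseResult.ok); // false
console.log(parseResult.error instanceof SyntaxError); // true
```

The exact wording of the SyntaxError message varies between JavaScript engine versions, so the sketch checks only the error type.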

Fix Action

Fix / Workaround

No known workaround other than relying on LaunchAgent/systemd KeepAlive for auto-restart.

PR fix notes

PR #44659: fix(discord): prevent gateway crash on non-JSON Discord API responses

Description (problem / solution / changelog)

Summary

Fixes a crash where a single Discord API 503 response kills the entire gateway process, taking all agents offline.

  • Root cause: ProxyGatewayPlugin.registerClient() calls response.json() without checking response.ok first. When Discord returns a 503 with a non-JSON body (e.g. upstream connect error or disconnect/reset before headers), JSON.parse() throws a SyntaxError that propagates as an unhandled promise rejection, crashing the Node.js process.

  • Fix: Add response.ok check before .json(), extract body safely with .text().catch(() => ""), and throw a descriptive error. This aligns with the existing pattern used throughout the Discord module (api.ts, probe.ts, pluralkit.ts).

  • Tests added: 3 new test cases covering 503, 502 (HTML body), and 500 (consumed body stream) scenarios. Existing happy-path test updated to include ok: true on the mock response.
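
The pattern described in the bullets above (check response.ok, read the body defensively, throw a descriptive error) might look roughly like the following. This is a sketch under assumptions, not the PR's actual diff: readGatewayInfo is a hypothetical helper name, and the 200-character body truncation is an illustrative choice.

```javascript
// Hypothetical helper illustrating the described pattern; not the PR's code.
async function readGatewayInfo(response) {
  if (!response.ok) {
    // The body may be plain text, HTML, or an already-consumed stream,
    // so a failed read falls back to an empty string.
    const body = await response.text().catch(() => "");
    const detail = body ? `: ${body.slice(0, 200)}` : "";
    throw new Error(
      `Failed to get gateway information from Discord: HTTP ${response.status}${detail}`,
    );
  }
  // Only a successful response is parsed as JSON.
  return response.json();
}
```

The helper still throws on a bad status, so the caller remains responsible for catching the error. What changes is that the failure carries the HTTP status rather than a JSON parse message.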

Impact

  • Before: One Discord 503 → entire gateway crashes → all agents (Discord, Slack, Telegram, etc.) go offline for 2-3 min until auto-restart
  • After: Error is caught and propagated cleanly through the existing error handling chain, allowing the health-monitor reconnect cycle to retry

Test plan

  • All 6 proxy gateway plugin tests pass (3 existing + 3 new)
  • Full Discord test suite passes (576 tests)
  • Linter passes (oxlint, 0 warnings)

Closes #44529

Changed files

  • src/discord/monitor/gateway-plugin.ts (modified, +6/-0)
  • src/discord/monitor/provider.proxy.test.ts (modified, +67/-0)

PR #2: OpenClaw urgent unfixed issues

Description (problem / solution / changelog)


Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: The Gateway crashes with an unhandled promise rejection when the Discord API returns a 503 error, as response.json() is called without checking response.ok.
  • Why it matters: This causes the entire gateway process to exit, leading to a complete service interruption for all connected Discord channels for several minutes.
  • What changed: Added a check for response.ok before attempting to parse the JSON response. Errors are now emitted via this.emitter.emit("error") and this.handleReconnectionAttempt() is called to trigger a graceful reconnection attempt, preventing a full process crash. Updated tests to cover the 503 error path.
  • What did NOT change (scope boundary): Other identified issues (#44544, #44611, #44562, #44549) were either already fixed, determined not to be code bugs, or required a dedicated test environment for safe implementation and are not part of this PR.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #44529
  • Related #

User-visible / Behavior Changes

The gateway will no longer crash when the Discord API returns a 503 error during client registration. Instead, it will log the error and attempt to reconnect gracefully.

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: Linux (typical Node.js environment)
  • Runtime/container: Node.js
  • Model/provider: N/A
  • Integration/channel (if any): Discord
  • Relevant config (redacted): N/A

Steps

  1. Configure OpenClaw Gateway with a Discord integration.
  2. During the GatewayPlugin.registerClient() call (e.g., when the health monitor triggers a Discord bot reconnection), simulate a Discord API returning a 503 HTTP status code with a non-JSON body.
  3. Observe the gateway process.

Expected

  • The gateway logs an error about the Discord 503 response.
  • The gateway attempts to reconnect to Discord without crashing the entire Node.js process.

Actual

  • The gateway process crashes with an unhandled promise rejection.

Evidence

  • Failing test/log before + passing after (a new test, "should handle 503 response gracefully without crashing", was added to provider.proxy.test.ts and now passes)
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: Added and passed a new unit test specifically for the 503 error case in provider.proxy.test.ts, ensuring the response.ok check and error handling logic are correctly implemented.
  • Edge cases checked: Specifically addressed the case where response.json() would throw an error on a non-2xx response.
  • What you did not verify: Did not verify in a live production environment with an actual Discord 503 error.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps:

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: Revert this PR.
  • Files/config to restore: N/A
  • Known bad symptoms reviewers should watch for: Gateway still crashing on Discord 503, or Discord reconnection issues.

Risks and Mitigations

  • Risk: The error handling or reconnection logic might not fully cover all edge cases of Discord API errors.
    • Mitigation: The current fix addresses the immediate crash by preventing response.json() on bad responses and uses existing reconnection mechanisms. Further monitoring of Discord integration health is always recommended.

Changed files

  • .detect-secrets.cfg (modified, +19/-4)
  • .github/FUNDING.yml (removed, +0/-1)
  • .github/ISSUE_TEMPLATE/bug_report.yml (modified, +31/-0)
  • .github/actions/ensure-base-commit/action.yml (added, +47/-0)
  • .github/actions/setup-node-env/action.yml (modified, +8/-4)
  • .github/actions/setup-pnpm-store-cache/action.yml (modified, +9/-7)
  • .github/codeql/codeql-javascript-typescript.yml (added, +18/-0)
  • .github/pull_request_template.md (modified, +7/-0)
  • .github/workflows/auto-response.yml (modified, +55/-0)
  • .github/workflows/ci.yml (modified, +85/-50)
  • .github/workflows/codeql.yml (added, +134/-0)
  • .github/workflows/docker-release.yml (modified, +77/-10)
  • .github/workflows/install-smoke.yml (modified, +63/-19)
  • .github/workflows/labeler.yml (modified, +256/-82)
  • .github/workflows/openclaw-npm-release.yml (added, +79/-0)
  • .github/workflows/sandbox-common-smoke.yml (modified, +1/-1)
  • .github/workflows/stale.yml (modified, +62/-3)
  • .gitignore (modified, +10/-0)
  • .npmignore (added, +1/-0)
  • .pi/prompts/reviewpr.md (modified, +37/-8)
  • .pre-commit-config.yaml (modified, +27/-1)
  • .secrets.baseline (modified, +229/-316)
  • .swiftformat (modified, +1/-1)
  • .swiftlint.yml (modified, +3/-1)
  • AGENTS.md (modified, +38/-0)
  • CHANGELOG.md (modified, +717/-137)
  • CONTRIBUTING.md (modified, +36/-4)
  • Dockerfile (modified, +165/-50)
  • Dockerfile.sandbox (modified, +6/-3)
  • Dockerfile.sandbox-browser (modified, +7/-5)
  • Dockerfile.sandbox-common (modified, +6/-4)
  • README.md (modified, +1/-1)
  • SECURITY.md (modified, +9/-0)
  • appcast.xml (modified, +499/-328)
  • apps/android/README.md (modified, +1/-1)
  • apps/android/app/build.gradle.kts (modified, +49/-4)
  • apps/android/app/proguard-rules.pro (modified, +1/-1)
  • apps/android/app/src/main/AndroidManifest.xml (modified, +1/-9)
  • apps/android/app/src/main/java/ai/openclaw/android/InstallResultReceiver.kt (removed, +0/-33)
  • apps/android/app/src/main/java/ai/openclaw/android/ScreenCaptureRequester.kt (removed, +0/-65)
  • apps/android/app/src/main/java/ai/openclaw/android/node/AppUpdateHandler.kt (removed, +0/-295)
  • apps/android/app/src/main/java/ai/openclaw/android/node/ScreenHandler.kt (removed, +0/-25)
  • apps/android/app/src/main/java/ai/openclaw/android/node/ScreenRecordManager.kt (removed, +0/-165)
  • apps/android/app/src/main/java/ai/openclaw/app/CameraHudState.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/DeviceNames.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/LocationMode.kt (renamed, +2/-2)
  • apps/android/app/src/main/java/ai/openclaw/app/MainActivity.kt (renamed, +3/-7)
  • apps/android/app/src/main/java/ai/openclaw/app/MainViewModel.kt (renamed, +11/-10)
  • apps/android/app/src/main/java/ai/openclaw/app/NodeApp.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/NodeForegroundService.kt (renamed, +6/-28)
  • apps/android/app/src/main/java/ai/openclaw/app/NodeRuntime.kt (renamed, +64/-44)
  • apps/android/app/src/main/java/ai/openclaw/app/PermissionRequester.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/SecurePrefs.kt (renamed, +45/-6)
  • apps/android/app/src/main/java/ai/openclaw/app/SessionKey.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/VoiceWakeMode.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/WakeWords.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/chat/ChatController.kt (renamed, +2/-2)
  • apps/android/app/src/main/java/ai/openclaw/app/chat/ChatModels.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/gateway/BonjourEscapes.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/gateway/DeviceAuthPayload.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/gateway/DeviceAuthStore.kt (renamed, +4/-3)
  • apps/android/app/src/main/java/ai/openclaw/app/gateway/DeviceIdentityStore.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/gateway/GatewayDiscovery.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/gateway/GatewayEndpoint.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/gateway/GatewayProtocol.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/gateway/GatewaySession.kt (renamed, +225/-20)
  • apps/android/app/src/main/java/ai/openclaw/app/gateway/GatewayTls.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/gateway/InvokeErrorParser.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/node/A2UIHandler.kt (renamed, +2/-2)
  • apps/android/app/src/main/java/ai/openclaw/app/node/CalendarHandler.kt (renamed, +2/-2)
  • apps/android/app/src/main/java/ai/openclaw/app/node/CameraCaptureManager.kt (renamed, +2/-2)
  • apps/android/app/src/main/java/ai/openclaw/app/node/CameraHandler.kt (renamed, +4/-4)
  • apps/android/app/src/main/java/ai/openclaw/app/node/CanvasController.kt (renamed, +2/-2)
  • apps/android/app/src/main/java/ai/openclaw/app/node/ConnectionManager.kt (renamed, +9/-9)
  • apps/android/app/src/main/java/ai/openclaw/app/node/ContactsHandler.kt (renamed, +2/-2)
  • apps/android/app/src/main/java/ai/openclaw/app/node/DebugHandler.kt (renamed, +4/-4)
  • apps/android/app/src/main/java/ai/openclaw/app/node/DeviceHandler.kt (renamed, +3/-18)
  • apps/android/app/src/main/java/ai/openclaw/app/node/DeviceNotificationListenerService.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/node/GatewayEventHandler.kt (renamed, +3/-3)
  • apps/android/app/src/main/java/ai/openclaw/app/node/InvokeCommandRegistry.kt (renamed, +14/-22)
  • apps/android/app/src/main/java/ai/openclaw/app/node/InvokeDispatcher.kt (renamed, +14/-24)
  • apps/android/app/src/main/java/ai/openclaw/app/node/JpegSizeLimiter.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/node/LocationCaptureManager.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/node/LocationHandler.kt (renamed, +4/-20)
  • apps/android/app/src/main/java/ai/openclaw/app/node/MotionHandler.kt (renamed, +2/-2)
  • apps/android/app/src/main/java/ai/openclaw/app/node/NodeUtils.kt (renamed, +2/-2)
  • apps/android/app/src/main/java/ai/openclaw/app/node/NotificationsHandler.kt (renamed, +2/-2)
  • apps/android/app/src/main/java/ai/openclaw/app/node/PhotosHandler.kt (renamed, +2/-2)
  • apps/android/app/src/main/java/ai/openclaw/app/node/SmsHandler.kt (renamed, +2/-2)
  • apps/android/app/src/main/java/ai/openclaw/app/node/SmsManager.kt (renamed, +2/-2)
  • apps/android/app/src/main/java/ai/openclaw/app/node/SystemHandler.kt (renamed, +2/-2)
  • apps/android/app/src/main/java/ai/openclaw/app/protocol/OpenClawCanvasA2UIAction.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/protocol/OpenClawProtocolConstants.kt (renamed, +1/-12)
  • apps/android/app/src/main/java/ai/openclaw/app/tools/ToolDisplay.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/ui/CameraHudOverlay.kt (renamed, +1/-1)
  • apps/android/app/src/main/java/ai/openclaw/app/ui/CanvasScreen.kt (renamed, +2/-2)
  • apps/android/app/src/main/java/ai/openclaw/app/ui/ChatSheet.kt (renamed, +3/-3)
  • apps/android/app/src/main/java/ai/openclaw/app/ui/ConnectTabScreen.kt (renamed, +5/-2)
  • apps/android/app/src/main/java/ai/openclaw/app/ui/GatewayConfigResolver.kt (renamed, +24/-6)
  • apps/android/app/src/main/java/ai/openclaw/app/ui/MobileUiTokens.kt (renamed, +2/-2)

PR #44669: fix(discord): prevent gateway crash when Discord API returns 503 during proxy reconnect, solve #44529

Description (problem / solution / changelog)

Summary

Fixes an unhandled promise rejection that crashes the entire Node.js gateway process when Discord's API returns a transient 503 (or any non-2xx status) during a proxy-mode bot reconnect attempt.

Root Cause

ProxyGatewayPlugin.registerClient() called response.json() on the API response without first checking response.ok. When Discord returns a 503 during a health-monitor-triggered reconnect, the response body is a non-JSON string like "upstream connect error or disconnect/reset before headers", causing a JSON parse error.

Critically, Carbon calls registerClient() without await, so any thrown error from this async method becomes an unhandled promise rejection — which crashes the entire Node.js gateway process, taking all agents offline.
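
Why the missing await matters can be shown in a few lines. In this minimal sketch, failingRegister, callWithoutAwait, and callWithCatch are hypothetical names, not Carbon or OpenClaw APIs. Since Node.js 15, a rejection that nothing observes terminates the process by default unless an unhandledRejection handler is installed.

```javascript
// Stand-in for an async method that fails, such as a registerClient() that throws.
async function failingRegister() {
  throw new Error("simulated failure");
}

// Fire-and-forget: the returned promise is dropped, so its rejection has no
// handler and Node.js reports it through the "unhandledRejection" event.
function callWithoutAwait() {
  failingRegister();
}

// Safer variant: a handler is attached, so the rejection is observed.
function callWithCatch(onError) {
  failingRegister().catch(onError);
}
```

A try/catch around callWithoutAwait() would not help either, because the function returns before the promise rejects.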

Fix

  • Add response.ok check before calling response.json(). On non-2xx responses, read the body as text and build a descriptive error message with the HTTP status code.
  • Instead of re-throwing the error (which becomes unhandled since Carbon doesn't await registerClient()), emit it on this.emitter — the gateway lifecycle's error listener picks it up gracefully.
  • Call this.handleReconnectionAttempt({}) to schedule a backoff retry via Carbon's existing reconnect mechanism, so the bot reconnects after the Discord 503 clears.

This same catch block handles both non-2xx HTTP responses and other network errors (e.g. DNS failure, connection refused).
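
The three fix steps above can be sketched as follows. This is an illustration under stated assumptions rather than the PR's code: SketchProxyGatewayPlugin, its injected fetchImpl and emitter arguments, and the counting handleReconnectionAttempt stand-in are all hypothetical, so that the flow runs without Discord or Carbon.

```javascript
// Sketch only: fetchImpl and emitter are injected for illustration. The real
// plugin builds on Carbon's GatewayPlugin, which is not reproduced here.
class SketchProxyGatewayPlugin {
  constructor(fetchImpl, emitter) {
    this.fetchImpl = fetchImpl;
    this.emitter = emitter;
    this.reconnectRequests = 0;
    this.gatewayInfo = null;
  }

  // Stand-in for Carbon's reconnect scheduler; here it only counts requests.
  handleReconnectionAttempt(_options) {
    this.reconnectRequests += 1;
  }

  async registerClient(url) {
    try {
      const response = await this.fetchImpl(url);
      if (!response.ok) {
        const body = await response.text().catch(() => "");
        const detail = body ? `: ${body}` : "";
        throw new Error(
          `Failed to get gateway information from Discord: HTTP ${response.status}${detail}`,
        );
      }
      this.gatewayInfo = await response.json();
    } catch (error) {
      // Not rethrown: the caller does not await this method, so a rethrow
      // would surface as an unhandled rejection.
      this.emitter.emit("error", error);
      this.handleReconnectionAttempt({});
    }
  }
}
```

Because the catch block wraps the fetch call as well as the status check, a rejected fetch (for example a refused connection) takes the same emit-and-reconnect path as a 503.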

Before / After

Before: Any transient Discord 503 during proxy reconnect → unhandled rejection → gateway process exits → all agents offline for 2–3 minutes until LaunchAgent/systemd restarts the process.

After: 503 is caught, logged via the emitter error event, and a backoff reconnect is scheduled. The gateway process stays alive. Other bots/agents are unaffected.

Change Type

  • Bug fix

Scope

  • Integrations (Discord)

Linked Issue

Fixes #44529

User-visible / Behavior Changes

Gateway no longer crashes when Discord returns a transient 503 during a proxy-mode health-monitor reconnect. The affected bot will retry with exponential backoff; other bots remain online.

Security Impact

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No (same Discord gateway endpoint, added error handling)
  • Command/tool execution surface changed? No

Repro + Verification

Unit test added in src/discord/monitor/provider.proxy.test.ts covering the 503 path:

  • Mocks undiciFetch to return { ok: false, status: 503, text: async () => "upstream connect error..." }
  • Verifies emitter.emit("error", ...) is called with an error containing "HTTP 503"
  • Verifies handleReconnectionAttempt({}) is called once
  • Verifies super.registerClient() is NOT called (no attempt to open WebSocket with bad gateway info)

Changed files

  • src/cli/daemon-cli/lifecycle.test.ts (modified, +11/-10)
  • src/discord/monitor/gateway-plugin.ts (modified, +21/-1)
  • src/discord/monitor/provider.proxy.test.ts (modified, +102/-0)

PR #183: 🦅 Scout: Critical Inherited Defect Report - 2026-03-20

Description (problem / solution / changelog)

🦅 Scout Intelligence Report

Identified Upstream Defect: #44529 "Gateway crashes with unhandled promise rejection when Discord API returns 503 during health-monitor bot reconnect"

Actions Taken:

  • Mined the parent repository for severe, core-breaking bugs resulting in gateway crashes.
  • Verified the flawed logic exists in our local codebase (src/discord/monitor/gateway-plugin.ts, lines 59-72).
  • Documented the defect pattern in .jules/scout.md for future review strategies.
  • Generated a formal bug report (scout_report.md) detailing Expected vs. Observed Behavior and Impact Severity.

Because the .json() parsing of the Discord API fetch response is not wrapped in try/catch, a parse failure becomes an unhandled promise rejection that kills the entire Node.js process instead of triggering a graceful reconnect.


PR created automatically by Jules for task 6440894518280869909 started by @MillionthOdin16


Summary by CodeRabbit

  • Documentation

    • Added internal documentation files for defect tracking and reporting.
  • Chores

    • Added registry files for issue tracking and management.

Changed files

  • .jules/scout.md (added, +4/-0)
  • issues_all.txt (added, +100/-0)
  • issues_high.txt (added, +50/-0)
  • scout_report.md (added, +7/-0)
  • test_output.log (added, +958/-0)


Bug type

Regression (worked before, now fails)

Summary

When health-monitor restarts a stuck Discord bot connection and Discord API returns a 503 with non-JSON body, the gateway throws an unhandled promise rejection and the entire Node.js process exits.

Steps to reproduce

  1. Run gateway with multiple Discord bot accounts (bot-a, bot-b, bot-c)
  2. One bot (bot-c) connection goes idle in a low-traffic channel
  3. health-monitor detects stuck connection and triggers reconnect (reason: stuck)
  4. During reconnect, Discord API returns HTTP 503 with non-JSON body: upstream connect error or disconnect/reset before headers. reset reason: overflow
  5. GatewayPlugin.registerClient() attempts to parse the body as JSON
  6. Unhandled promise rejection crashes the entire Node.js gateway process

Expected behavior

Gateway catches the 503 error gracefully, logs a warning, and retries the reconnect after backoff. The process should NOT exit.

Actual behavior

Gateway process exits with:

[openclaw] Unhandled promise rejection: Error: Failed to get gateway information from Discord: Unexpected token 'u', "upstream c"... is not valid JSON
    at GatewayPlugin.registerClient (file:///opt/homebrew/lib/node_modules/openclaw/dist/subsystem-nlluZawe.js)
    at processTicksAndRejections (node:internal/process/task_queues:105:5)

LaunchAgent KeepAlive auto-restarts the gateway (~2-3 min downtime for all agents).

OpenClaw version

2026.3.2 (85377a2)

Operating system

macOS Sequoia (Apple Silicon M4)

Install method

npm global (homebrew node@22)

Model

anthropic/claude-sonnet-4-6

Provider / routing chain

openclaw -> anthropic (OAuth)

Config file / key location

No response

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Log sequence (timestamps in MYT):


04:20 [gateway/health-monitor] [discord:my-bot] health-monitor: restarting (reason: stuck)
04:21 [gateway/channels/discord] discord gateway: WebSocket connection closed with code 1006
04:30 [gateway/health-monitor] [discord:my-bot] health-monitor: restarting (reason: stuck)
04:30 [gateway/channels/discord] discord channel resolve failed; Discord API /channels/... failed (503): upstream connect error or disconnect/reset before headers. reset reason: overflow
04:30 [openclaw] Unhandled promise rejection: Error: Failed to get gateway information from Discord: Unexpected token 'u', "upstream c"... is not valid JSON
          at GatewayPlugin.registerClient
          at processTicksAndRejections (node:internal/process/task_queues:105:5)


bot-c was in a low-traffic channel and was flagged as stuck twice before the fatal crash on the second reconnect attempt. Discord status page showed no incidents.

Impact and severity

  • Affected: ALL agents on the gateway (not just bot-c)
  • Severity: High — entire gateway process exits, all agents go offline
  • Frequency: Intermittent (triggered by Discord API transient 503 during bot reconnect)
  • Consequence: ~2-3 minute outage for all agents; in-flight cron jobs interrupted; requires LaunchAgent/systemd for auto-recovery

Side note: health-monitor flagging an idle-but-alive bot as 'stuck' may itself be worth reviewing — bot-c was in a low-traffic channel with no recent messages.

Additional information

The unhandled rejection originates in GatewayPlugin.registerClient() when it calls the Discord gateway info endpoint and receives a non-JSON 503 response. Wrapping this call in try/catch (or adding a .catch() handler) and treating non-2xx or non-JSON responses as retriable errors would fix this.

No known workaround other than relying on LaunchAgent/systemd KeepAlive for auto-restart.

extent analysis

Fix Plan

To address the issue, we need to modify the GatewayPlugin.registerClient() method to handle non-JSON responses from the Discord API. We can achieve this by wrapping the API call in a try-catch block and adding a .catch() handler to catch any promise rejections.

Step-by-Step Solution

  1. Modify the GatewayPlugin.registerClient() method:

```javascript
// Before: the body is parsed as JSON regardless of the HTTP status
async function registerClient() {
  const response = await fetch('https://discord.com/api/gateway');
  const jsonData = await response.json();
  // ...
}
```

```javascript
// After: check the status first and catch any failure
async function registerClient() {
  try {
    const response = await fetch('https://discord.com/api/gateway');
    if (!response.ok) {
      // Treat non-2xx responses as retriable errors
      throw new Error(`HTTP error! status: ${response.status}`);
    }
    const jsonData = await response.json();
    // ...
  } catch (error) {
    // Log the error and retry the request after a backoff
    console.warn('Error registering client:', error);
    // Implement retry logic with backoff here
  }
}
```

  2. Implement retry logic with backoff:

```javascript
// Calls fn up to `retries` times, doubling the delay after each failure.
const retry = async (fn, retries = 3, delay = 500) => {
  let attempt = 0;
  while (attempt < retries) {
    try {
      return await fn();
    } catch (error) {
      attempt++;
      if (attempt < retries) {
        console.log(`Retry ${attempt} after ${delay}ms`);
        await new Promise(resolve => setTimeout(resolve, delay));
        delay *= 2; // exponential backoff
      } else {
        // Out of attempts: surface the last error to the caller
        throw error;
      }
    }
  }
};

// Usage
async function registerClient() {
  await retry(async () => {
    // Original registerClient logic here
  });
}
```

Verification

To verify that the fix worked, you can simulate a non-JSON response from the Discord API and check that the gateway does not crash. You can also monitor the logs to ensure that the retry logic is working as expected.
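
One way to simulate the non-JSON response is a test double that stands in for fetch. The sketch below is an assumption-laden illustration: makeFlakyFetch and its callCount accessor are hypothetical names, and the returned objects mimic only the ok, status, text, and json members that the code under test would touch.

```javascript
// Hypothetical test double: returns a 503 with a plain-text body for the first
// `failures` calls, then a 200 with a JSON body.
function makeFlakyFetch(failures) {
  let calls = 0;
  const fakeFetch = async () => {
    calls += 1;
    if (calls <= failures) {
      return {
        ok: false,
        status: 503,
        text: async () => "upstream connect error or disconnect/reset before headers",
      };
    }
    return {
      ok: true,
      status: 200,
      json: async () => ({ url: "wss://gateway.example.invalid" }),
    };
  };
  // Lets a test assert how many attempts were made.
  fakeFetch.callCount = () => calls;
  return fakeFetch;
}
```

Passing such a double in place of the real fetch lets a test assert that the process stays alive after the first 503 and that callCount() grows as retries happen.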

Extra Tips

  • Make sure to handle errors properly and implement retry logic with backoff to avoid overwhelming the Discord API with requests.
  • Consider adding a circuit breaker pattern to detect when the Discord API is down and prevent further requests.
  • Review the health-monitor logic to prevent flagging idle-but-alive bots as 'stuck'.
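
The circuit breaker tip above could be sketched like this. It is a minimal illustration, not part of any of the linked PRs: the class name, the default threshold of 3 failures, the 30-second cooldown, and the injectable clock are all assumptions.

```javascript
// Minimal circuit breaker sketch. Thresholds and the injectable clock are
// assumptions chosen for illustration.
class CircuitBreaker {
  constructor({ failureThreshold = 3, cooldownMs = 30000, now = () => Date.now() } = {}) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
    this.failures = 0;
    this.openedAt = null;
  }

  // True while the breaker is open and the cooldown has not yet elapsed.
  isOpen() {
    return this.openedAt !== null && this.now() - this.openedAt < this.cooldownMs;
  }

  async call(fn) {
    if (this.isOpen()) {
      throw new Error("Circuit open: skipping request");
    }
    try {
      const value = await fn();
      // A success closes the breaker and resets the failure count.
      this.failures = 0;
      this.openedAt = null;
      return value;
    } catch (error) {
      this.failures += 1;
      if (this.failures >= this.failureThreshold) {
        // Open (or re-open) the breaker and restart the cooldown.
        this.openedAt = this.now();
      }
      throw error;
    }
  }
}
```

After the cooldown, the next call acts as a trial request: a success closes the breaker, and a failure re-opens it immediately because the failure count is still at the threshold.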
