openclaw: [Bug] Discord /users/@me probe self-starves event loop on startup, crashes gateway (Windows, v2026.5.22)

Related Issues

  • #75341 — v2026.4.29 regression: OAuth @me fetch-timeout + native slash command deploy PATCH hits 429 during startup
  • #79992 — Discord adapter wedges in start-account and self-starves event loop
  • #80227 — Discord plugin REST gateway metadata fetch bypasses account proxy

Environment

  • OS: Windows 11 (Windows_NT 10.0.26200 x64)
  • OpenClaw: v2026.5.22 (a374c3a)
  • Discord Plugin: @openclaw/[email protected]
  • Node: v24.13.1
  • Gateway mode: Embedded, loopback on 127.0.0.1:18789
  • Service: Windows Scheduled Task
  • Auth profiles: 4 (anthropic:api_key, openai-codex:oauth, google-gemini-cli:oauth, google:token)
  • Host: 32GB RAM, CPU 19%, idle. Network verified fast (Discord API 161ms, Anthropic 118ms via curl).

Problem

Discord integration is completely unusable on v2026.5.22. Every gateway restart with channels.discord.enabled=true results in:

  1. The /users/@me cold-boot probe times out (2.5s initial, 10s retry)
  2. Retries compound event loop starvation
  3. Auth pre-warming balloons from ~19s (without Discord) to 100-180s (with Discord), with eventLoopMax jumping from ~8s to 33-80s
  4. Discord WebSocket heartbeat ACK times out and closes (code 1006)
  5. Health monitor triggers auto-restart, compounding the starvation further
  6. Anthropic API calls also fail with ECONNRESET due to shared event loop starvation
  7. Gateway becomes completely unresponsive — agent cannot reply to any channel

Disabling channels.discord.enabled via hot-reload immediately recovers the event loop. Re-enabling deterministically reproduces the wedge.

Discord worked fine on v2026.4.10. The regression was introduced somewhere between v2026.4.10 and v2026.4.29 (per #75341) and persists through v2026.5.22.

Evidence (from openclaw-2026-05-24.log)

Network is NOT the cause

curl discord.com/api/v10/users/@me — 212ms (HTTP 401, expected)
curl api.anthropic.com — 118ms
curl gateway.discord.gg — 188ms
Node.js timer drift: 12ms (normal)

Without Discord (baseline)

http server listening (10 plugins; 5.3s)
provider auth state pre-warmed in 19389ms eventLoopMax=7755.3ms

With Discord enabled — death spiral

[discord] [default] starting provider
[fetch-timeout] timeoutMs=2500 elapsedMs=4739 timerDelayMs=2239 event-loop starvation url=discord.com/api/v10/users/@me
[discord] discord client initialized as 1477815626566205613; awaiting gateway readiness
[gateway] gateway ready
[fetch-timeout] timeoutMs=10000 elapsedMs=18982 timerDelayMs=8982 event-loop starvation url=discord.com/api/v10/users/@me
[fetch-timeout] timeoutMs=10000 elapsedMs=16686 timerDelayMs=6686 event-loop starvation url=discord.com/api/v10/users/@me
[discord] discord gateway: Gateway websocket closed: 1006
[fetch-timeout] timeoutMs=10000 elapsedMs=23485 timerDelayMs=13485 event-loop starvation url=discord.com/api/v10/users/@me
[gateway] provider auth state pre-warmed in 103802ms eventLoopMax=33252.4ms
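
As a cross-check on the log above, the fields in each [fetch-timeout] line are internally consistent: timerDelayMs equals elapsedMs minus timeoutMs, i.e. how late the timeout timer fired relative to its budget. The following is a minimal TypeScript sketch that parses those lines and confirms this. The names parseFetchTimeout and FetchTimeout are hypothetical, chosen for illustration and not taken from OpenClaw, and the key=value layout is assumed from the four lines shown.

```typescript
// Hypothetical helper for illustration; not part of the OpenClaw codebase.
interface FetchTimeout {
  timeoutMs: number;
  elapsedMs: number;
  timerDelayMs: number;
}

// Extract the numeric key=value fields from one [fetch-timeout] log line.
// Returns null for other log lines or when a field is missing.
function parseFetchTimeout(line: string): FetchTimeout | null {
  if (!line.startsWith("[fetch-timeout]")) return null;
  const field = (name: string): number | null => {
    const m = new RegExp(`\\b${name}=(\\d+)`).exec(line);
    return m ? Number(m[1]) : null;
  };
  const timeoutMs = field("timeoutMs");
  const elapsedMs = field("elapsedMs");
  const timerDelayMs = field("timerDelayMs");
  if (timeoutMs === null || elapsedMs === null || timerDelayMs === null) return null;
  return { timeoutMs, elapsedMs, timerDelayMs };
}

// Lines copied from the log excerpt above.
const logLines: string[] = [
  "[fetch-timeout] timeoutMs=2500 elapsedMs=4739 timerDelayMs=2239 event-loop starvation url=discord.com/api/v10/users/@me",
  "[fetch-timeout] timeoutMs=10000 elapsedMs=18982 timerDelayMs=8982 event-loop starvation url=discord.com/api/v10/users/@me",
  "[fetch-timeout] timeoutMs=10000 elapsedMs=16686 timerDelayMs=6686 event-loop starvation url=discord.com/api/v10/users/@me",
  "[discord] discord gateway: Gateway websocket closed: 1006",
  "[fetch-timeout] timeoutMs=10000 elapsedMs=23485 timerDelayMs=13485 event-loop starvation url=discord.com/api/v10/users/@me",
];

const parsed = logLines
  .map(parseFetchTimeout)
  .filter((e): e is FetchTimeout => e !== null);

// Milliseconds by which each timeout fired late.
const overshoot = parsed.map((e) => e.elapsedMs - e.timeoutMs);

// True when every logged timerDelayMs matches elapsedMs - timeoutMs.
const consistent = parsed.every((e) => e.elapsedMs - e.timeoutMs === e.timerDelayMs);

console.log(overshoot, consistent);
```

In every entry the timer fired between roughly 2.2s and 13.5s late, which points at a blocked event loop rather than a slow network response.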

Cascade kills Anthropic too

[provider-transport-fetch] error provider=anthropic model=claude-opus-4-6 elapsedMs=54164 causeCode=ECONNRESET message=fetch failed
[agent/embedded] embedded run agent end: isError=true error=LLM request failed: network connection error.
Embedded agent failed before reply: LLM request failed: network connection error.

Best attempt — Discord heartbeat started but died

On one attempt with applicationId set, slash commands disabled, pricing disabled, and 45 providers disabled, Discord actually got further than usual:

[discord] discord client initialized as 1477815626566205613; awaiting gateway readiness
[gateway] gateway ready
heartbeat: started
[fetch-timeout] url=discord.com/api/v10/users/@me — timeout
[discord] discord gateway: Gateway websocket closed: 1006

What we tried (all failed)

| Attempt | Result |
| --- | --- |
| Set channels.discord.applicationId (as string) | Skipped app lookup but /users/@me probe still fires and retries |
| Set commands.native: false + nativeSkills: false | Slash command deploy skipped, but /users/@me probe still starves loop |
| Extended all gateway timeouts to 120000ms | Discord WebSocket still dies during auth pre-warm |
| Disabled models.pricing | Pre-warm slightly faster but still kills Discord |
| Disabled 45 unused provider plugins | Startup faster (5.3s vs 9.4s) but auth pre-warm still blocks 33s+ |
| Disabled codex + acpx plugins | Codex still loaded (stock plugin?), Discord still failed |
| Fresh uninstall/reinstall of Discord plugin | Same behavior |
| Downgrade to @openclaw/[email protected] | Plugin too old, didn't load |
| Upgrade to @openclaw/[email protected] | Incompatible: api.registerMeetingNotesSourceProvider is not a function |
| Restart with webchat closed (no competing embedded runs) | Same behavior |
| Token via env SecretRef | SECRETS_REF_IGNORED_INACTIVE_SURFACE; token not picked up |
| Token as direct string | Token works, Discord initializes, but still dies to event loop starvation |

Root cause analysis

The /users/@me cold-boot probe in the Discord adapter has a 2.5s initial timeout that is too aggressive for a Node process performing concurrent auth pre-warming. When the probe times out:

  1. It retries with a 10s timeout
  2. Each retry adds async work to an already-loaded event loop
  3. Auth pre-warming (OAuth token refresh for google-gemini-cli, openai-codex) also needs the event loop
  4. The compounding load causes timer delays of 8-34 seconds
  5. Discord's WebSocket heartbeat ACK window (~41.25s) is exceeded
  6. WebSocket closes with code 1006
  7. Health monitor triggers restart, adding more probe retries
  8. Death spiral continues until Discord is disabled

The slash command deploy retry amplification was fixed in commit 2b9b133 ("avoid startup rest amplification"), which is already in @openclaw/[email protected]. But the /users/@me probe retry path was NOT addressed by that fix.

Suggested fixes

  1. Exponential backoff on /users/@me probe failures (1s, 2s, 4s, 8s... capped at 60s, with jitter)
  2. Bump cold-boot probe timeout to 30s — this is a one-shot identity check, not a hot path
  3. Circuit breaker on the inner retry loop — after N consecutive timeouts in start-account, pause retries and surface a clear error instead of self-starving
  4. Defer /users/@me probe behind setImmediate/idle queue so it yields to auth pre-warming and other startup phases instead of fighting them for the same tick
  5. Consider making the probe optional when applicationId is already configured — if we know the bot identity, the cold-boot probe is redundant
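
Fixes 1 and 3 could be combined roughly as follows. This is a minimal TypeScript sketch under stated assumptions, not the adapter's actual code: backoffDelayMs and ProbeCircuitBreaker are hypothetical names, the "equal jitter" variant (half fixed, half random) is only one common way to add jitter, and the threshold of 3 timeouts and the 30s cooldown are placeholder values.

```typescript
// Hypothetical names throughout; nothing here is taken from the OpenClaw codebase.

// Delay before retry number `attempt` (0-based). The ceiling doubles each time
// (1s, 2s, 4s, 8s, ...) and is capped at capMs. "Equal jitter": half of the
// ceiling is fixed and the other half is random, so retries spread out without
// ever collapsing to an immediate retry.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 1000,
  capMs: number = 60000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.round(ceiling / 2 + random() * (ceiling / 2));
}

// Stops probing after `maxFailures` consecutive timeouts and allows a single
// trial attempt once `cooldownMs` has passed. Times are passed in explicitly
// (in ms) so the class does not depend on timers.
class ProbeCircuitBreaker {
  private readonly maxFailures: number;
  private readonly cooldownMs: number;
  private consecutiveFailures = 0;
  private openedAt: number | null = null;

  constructor(maxFailures: number, cooldownMs: number) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
  }

  // True when a probe attempt is allowed at time `now`.
  canAttempt(now: number): boolean {
    if (this.openedAt === null) return true;
    if (now - this.openedAt < this.cooldownMs) return false;
    // Cooldown elapsed: permit one trial; one more timeout reopens the breaker.
    this.openedAt = null;
    this.consecutiveFailures = this.maxFailures - 1;
    return true;
  }

  recordSuccess(): void {
    this.consecutiveFailures = 0;
    this.openedAt = null;
  }

  recordTimeout(now: number): void {
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= this.maxFailures) this.openedAt = now;
  }
}

// Deterministic demo: fix the random source at 0.5 to show the schedule.
const delays = [0, 1, 2, 3, 10].map((a) => backoffDelayMs(a, 1000, 60000, () => 0.5));
console.log(delays); // [ 750, 1500, 3000, 6000, 45000 ]

const breaker = new ProbeCircuitBreaker(3, 30000);
breaker.recordTimeout(0);
breaker.recordTimeout(5000);
breaker.recordTimeout(10000); // third consecutive timeout opens the breaker
console.log(breaker.canAttempt(15000)); // false: still cooling down
console.log(breaker.canAttempt(40000)); // true: 30s cooldown has elapsed
```

While the breaker is open, the caller would surface an error to the operator instead of scheduling another probe, so that retries stop adding work to an event loop that is already starved.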

Workaround

Rolling back to OpenClaw v2026.4.10 where Discord worked. No config-level workaround was found for v2026.5.22.
