openclaw - 💡(How to fix) Fix [Bug]: Discord /gateway/bot metadata lookup times out at 10s starting ~2026-04-27 (raw HTTP latency is fine) [2 comments, 3 participants]

GitHub stats
openclaw/openclaw#73585 · Fetched 2026-04-29 06:17:53
Comments: 2 · Participants: 3 · Timeline events: 4 · Reactions: 0
Timeline (top): commented ×2, closed ×1, cross-referenced ×1

Starting around 2026-04-27, the Discord provider's /gateway/bot metadata fetch began timing out frequently (gateway metadata lookup failed transiently; using default gateway url ... gateway metadata timeout), even though raw HTTP latency to discord.com from the same host is healthy. The 10s hardcoded DISCORD_GATEWAY_INFO_TIMEOUT_MS is no longer a comfortable margin against whatever the bot's fetch() is doing.

Error Message

  • Discord API /gateway/bot timed out after ${timeoutMs}ms ... cause: new Error("gateway metadata timeout")

Environment

  • openclaw@2026.4.26 (npm-global install)
  • Node: system /usr/bin/node
  • Linux (Debian-family), ens18 + LAN-resolver DNS
  • Multiple Discord bots configured under one gateway process (chief, default, security, opencoder, reviewer, architect, notifier, researcher)

Evidence

gateway metadata lookup failed transiently occurrences per day (from journalctl --user -u openclaw-gateway.service):

Day                                   | Count
Apr 21                                | 2 (startup only)
Apr 22                                | 2
Apr 23                                | 2
Apr 27                                | 184
Apr 28 (partial, through 09:25 EDT)   | 178

Sharp inflection on Apr 27 with no version change on this host (gateway has been on 2026.4.26 since install).
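
The per-day counts above can be tallied with a small script. This is a minimal sketch, assuming journalctl's default short output in which each line begins with a "Mon DD HH:MM:SS" timestamp; countByDay is a hypothetical helper, not part of OpenClaw.

```javascript
// Sketch only: assumes each journal line starts with "Mon DD HH:MM:SS".
// countByDay is a hypothetical helper name.
const NEEDLE = "gateway metadata lookup failed transiently";

function countByDay(lines, needle = NEEDLE) {
  const counts = new Map();
  for (const line of lines) {
    if (!line.includes(needle)) continue;
    const match = /^([A-Z][a-z]{2})\s+(\d{1,2})\s/.exec(line);
    if (!match) continue; // skip lines without a leading date
    // Normalize "07" and " 7" to the same key.
    const day = `${match[1]} ${Number(match[2])}`;
    counts.set(day, (counts.get(day) ?? 0) + 1);
  }
  return counts;
}
```

Feeding it the lines produced by the journalctl command quoted above should yield one Map entry per day on which the warning appeared.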

Raw HTTP probe to the same endpoint from the same host while warnings are firing:

$ for i in 1 2 3; do curl -o /dev/null -s -w "attempt $i: %{http_code} dns=%{time_namelookup}s connect=%{time_connect}s tls=%{time_appconnect}s total=%{time_total}s\n" --max-time 10 https://discord.com/api/v10/gateway/bot; done
attempt 1: 401 dns=0.002s connect=0.007s tls=0.052s total=0.124s
attempt 2: 401 dns=0.002s connect=0.008s tls=0.053s total=0.117s
attempt 3: 401 dns=0.002s connect=0.007s tls=0.050s total=0.172s

401 is the expected unauthenticated response. The point is that total time is 117–172 ms per attempt, well under the 10s timeout, so the underlying network path and TLS handshake are fine.

Hypotheses

The mismatch (curl: ~120ms; bot's fetch: 10s timeout) suggests the issue is in how the bot's fetchImpl reuses connections, not the network:

  1. undici HTTP/2 session stuck. If the bot's fetch reuses a long-lived H2 session that's been half-closed by Cloudflare or stalled by a window-update issue, every subsequent request hangs until the per-request timeout, while a fresh connection (curl) succeeds instantly. The pattern — periodic, host-wide, unaffected by the network — fits this exactly.
  2. Per-bot-token slow path on Discord's side introduced ~Apr 27. Less likely given that other Discord traffic from these bots (gateway WS, REST messages) is healthy; only /gateway/bot chronically times out.

(1) is more consistent with the observation that the WS gateway connections themselves stay healthy on this host outside of an unrelated 14-minute upstream blip on Apr 28 09:11–09:25 EDT.

Code references

  • dist/extensions/discord/provider-*.js:
    • const DISCORD_GATEWAY_INFO_TIMEOUT_MS = 1e4; — hardcoded 10s constant
    • const timeoutMs = Math.max(1, params.timeoutMs ?? DISCORD_GATEWAY_INFO_TIMEOUT_MS);
    • Discord API /gateway/bot timed out after ${timeoutMs}ms ... cause: new Error("gateway metadata timeout")
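
The constant and clamp quoted above suggest a per-request deadline raced against the fetch. The following is a minimal sketch of that pattern, not the actual provider code: resolveTimeoutMs and withGatewayTimeout are hypothetical names, and only the constant, the clamp expression, and the error strings are taken from the references above (the error message is abbreviated the same way the issue abbreviates it).

```javascript
// Sketch only: helper names are hypothetical. The constant, the clamp, and
// the error strings mirror the code references quoted in the issue.
const DISCORD_GATEWAY_INFO_TIMEOUT_MS = 1e4; // 10s

function resolveTimeoutMs(params = {}) {
  // Clamp to at least 1ms so a zero or negative override still arms a timer.
  return Math.max(1, params.timeoutMs ?? DISCORD_GATEWAY_INFO_TIMEOUT_MS);
}

function withGatewayTimeout(work, params = {}) {
  const timeoutMs = resolveTimeoutMs(params);
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => {
      reject(
        new Error(`Discord API /gateway/bot timed out after ${timeoutMs}ms`, {
          cause: new Error("gateway metadata timeout"),
        })
      );
    }, timeoutMs);
  });
  // Whichever promise settles first wins; the timer is always cleared.
  return Promise.race([work, deadline]).finally(() => clearTimeout(timer));
}
```

A race like this only stops waiting; it does not cancel the underlying request. If the real code behaves similarly, a stalled connection could stay in the pool and be reused by the next attempt, which would be consistent with hypothesis (1).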

The fallback path (using default gateway url) is working as designed and the bots stay online — this is log noise rather than a hard outage. But it floods journald and masks real failures.

Suggestions

  1. Bump DISCORD_GATEWAY_INFO_TIMEOUT_MS default to ~30s, or expose it as an env var / config (OPENCLAW_DISCORD_GATEWAY_INFO_TIMEOUT_MS).
  2. Force a fresh dispatcher / disable H2 connection reuse for the /gateway/bot request specifically (undici.fetch(..., { dispatcher: new Agent({ pipelining: 0, allowH2: false }) })), since the response is small, infrequent, and benefits nothing from reuse.
  3. Demote the warning to debug-level once retries succeed, or rate-limit it (one log line per minute), so a sustained period of failures doesn't drown the log.
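
Suggestion 1 could look roughly like the sketch below. The environment variable name is the one proposed above and is not an existing OpenClaw setting; gatewayInfoTimeoutFromEnv is a hypothetical helper, and the 30s default is the suggested value rather than anything shipped.

```javascript
// Sketch only: OPENCLAW_DISCORD_GATEWAY_INFO_TIMEOUT_MS is the name proposed
// in this issue, not an existing setting. The helper name is hypothetical.
const DEFAULT_GATEWAY_INFO_TIMEOUT_MS = 30_000;

function gatewayInfoTimeoutFromEnv(env = process.env) {
  const raw = env.OPENCLAW_DISCORD_GATEWAY_INFO_TIMEOUT_MS;
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_GATEWAY_INFO_TIMEOUT_MS;
  }
  const parsed = Number(raw);
  // Fall back on NaN, Infinity, and non-positive values rather than
  // handing them to a timer.
  if (!Number.isFinite(parsed) || parsed <= 0) {
    return DEFAULT_GATEWAY_INFO_TIMEOUT_MS;
  }
  return Math.max(1, Math.floor(parsed));
}
```

Falling back to the default on malformed input keeps a typo in the environment from silently turning the timeout into 0 or NaN.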

Happy to share more journal excerpts or test a patched build if helpful.

extent analysis

TL;DR

Increase the DISCORD_GATEWAY_INFO_TIMEOUT_MS value or disable HTTP/2 connection reuse for the /gateway/bot request to mitigate the timeout issue.

Guidance

  • Verify the hypothesis of undici HTTP/2 session being stuck by checking the connection reuse behavior and its impact on the timeout.
  • Consider increasing the DISCORD_GATEWAY_INFO_TIMEOUT_MS value to a higher value (e.g., 30s) to provide a more comfortable margin against timeouts.
  • Test disabling HTTP/2 connection reuse for the /gateway/bot request using undici.fetch(..., { dispatcher: new Agent({ pipelining: 0, allowH2: false }) }) to see if it resolves the issue.
  • Demote the warning to debug-level or rate-limit it to prevent log flooding and make it easier to identify real failures.
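
The rate-limiting idea in the last bullet can be sketched as a small wrapper around whatever logger is in use. createRateLimitedWarn is a hypothetical helper, not an OpenClaw API; the one-minute interval follows the "one log line per minute" suggestion, and the clock is injectable only to make the behavior easy to check.

```javascript
// Sketch only: createRateLimitedWarn is a hypothetical helper. It lets one
// message through per interval and counts the ones it swallows.
function createRateLimitedWarn(log, intervalMs = 60_000, now = Date.now) {
  let lastEmit = -Infinity;
  let suppressed = 0;
  return function warn(message) {
    const t = now();
    if (t - lastEmit < intervalMs) {
      suppressed += 1; // swallowed, but reported on the next emitted line
      return false;
    }
    const suffix = suppressed > 0 ? ` (${suppressed} similar suppressed)` : "";
    log(message + suffix);
    lastEmit = t;
    suppressed = 0;
    return true;
  };
}
```

Reporting the suppressed count on the next emitted line keeps the failure rate visible in the journal without one line per failed lookup.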

Example

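const { Agent, fetch } = require('undici');

// Dedicated dispatcher for this one request: pipelining: 0 disables
// keep-alive reuse, and allowH2: false keeps the request on HTTP/1.1.
const fetchOptions = {
  dispatcher: new Agent({ pipelining: 0, allowH2: false })
};
fetch('https://discord.com/api/v10/gateway/bot', fetchOptions);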

Notes

The issue seems to be related to the undici HTTP/2 session being stuck, but further investigation is needed to confirm this hypothesis. Increasing the timeout value or disabling connection reuse may help mitigate the issue, but it's essential to monitor the behavior and adjust the solution accordingly.

Recommendation

Apply a workaround: raise DISCORD_GATEWAY_INFO_TIMEOUT_MS or disable HTTP/2 connection reuse for the /gateway/bot request. A stuck reused connection appears to be the most likely cause, though this is not yet confirmed.
