openclaw - 💡(How to fix) Fix [Bug]: undici HTTP/2 hang on Windows extends from Telegram polling into the LLM model dispatcher (related to #66885) [1 comments, 2 participants]

GitHub stats
openclaw/openclaw#73831 · Fetched 2026-04-29 06:14:31
Comments: 1 · Participants: 2 · Timeline: 5 · Reactions: 0
Timeline (top): mentioned ×2 · subscribed ×2 · commented ×1

On Windows running OpenClaw 2026.4.23 and 2026.4.26 with Node 24.13.0, all outbound fetch-based HTTP calls intermittently hang for 90–200 seconds before failing. This affects:

  1. Telegram getUpdates long-polling (already noted in #66885)
  2. Telegram sendMessage outbound (logged as Network request for 'sendMessage' failed!)
  3. Model dispatcher LLM calls (e.g. openai/claude-opus-4-7) — LLM request timed out after the configured 97s

The third one is new: #66885 only mentions Telegram polling and subagent announce, but the same undici socket pool hang is now blocking actual model invocations on the main agent. After we layered every reasonable client-side mitigation, Telegram bot-internal commands like /status work (no LLM call), but any real agent run on a long prompt times out.

Error Message

telegram sendMessage failed: Network request for 'sendMessage' failed!
telegram slash block reply failed: HttpError: Network request for 'sendMessage' failed!
telegram sendMessage failed: Network request for 'sendMessage' failed!
telegram slash final reply failed: HttpError: Network request for 'sendMessage' failed!

Root Cause

The current state on Windows is that:

  • Bot-internal commands work (no LLM call)
  • Cron jobs and any prompt to a main agent intermittently time out at the model call layer
  • The watchdog masks the issue for telegram (eventually retries) but not for model calls (one shot, 97s, fail)

Mac users on identical OpenClaw versions are entirely unaffected because their IPv6 stack is healthy enough to negotiate HTTP/2.

Fix Action

Fix / Workaround

Apply allowH2: false (and an explicit autoSelectFamily: false in the underlying Agent) to both affected clients:

  1. The Telegram channel's outbound HTTP client (covering both getUpdates and sendMessage)
  2. The model dispatcher used by agents/harness for provider calls

This is the reporter's suggested fix. None of the client-side mitigations tried so far (DNS result order, Defender exclusions, IPv6 adapter binding, reboots) fully resolves the hang.

Code Example

[telegram] Polling stall detected (active getUpdates stuck for 178.44s); forcing restart.
   [diag inFlight=1 outcome=started startedAt=1777406174509 finishedAt=1777406174509 durationMs=30356 offset=0]
[telegram][diag] polling cycle finished reason=polling stall detected
   error=Network request for 'getUpdates' failed!
Telegram polling runner stopped (polling stall detected); restarting in 3.78s.
[telegram][diag] rebuilding transport for next polling cycle

---

telegram sendMessage failed: Network request for 'sendMessage' failed!
telegram slash block reply failed: HttpError: Network request for 'sendMessage' failed!
telegram sendMessage failed: Network request for 'sendMessage' failed!
telegram slash final reply failed: HttpError: Network request for 'sendMessage' failed!

---

lane task error: lane=session:agent:main:main durationMs=96963 error="FailoverError: LLM request timed out."
lane task error: lane=main durationMs=7477 error="FailoverError: openrouter (openai/gpt-5.5) returned a billing error..."
Embedded agent failed before reply: All models failed (2):
   openai/claude-opus-4-7: LLM request timed out. (timeout)
   openrouter/openai/gpt-5.5: 402 This request requires more credits…

---

Invoke-RestMethod  https://api.telegram.org/bot$bot/getMe          → 404 ms ✅
Invoke-RestMethod  https://api.telegram.org/bot$bot/getUpdates?…  → 399 ms ✅ (returned 2 pending updates)

---

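const { Agent } = require('undici');

// HTTP/1.1 only, IPv4 only: no HTTP/2 negotiation, no Happy Eyeballs
const agent = new Agent({
  allowH2: false,
  connect: { autoSelectFamily: false, family: 4 },
});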

[Bug]: undici HTTP/2 hang on Windows extends from Telegram polling into the LLM model dispatcher (related to #66885 / #10795 / #4847)

Summary

On Windows running OpenClaw 2026.4.23 and 2026.4.26 with Node 24.13.0, all outbound fetch-based HTTP calls intermittently hang for 90–200 seconds before failing. This affects:

  1. Telegram getUpdates long-polling (already noted in #66885)
  2. Telegram sendMessage outbound (logged as Network request for 'sendMessage' failed!)
  3. Model dispatcher LLM calls (e.g. openai/claude-opus-4-7) — LLM request timed out after the configured 97s

The third one is new: #66885 only mentions Telegram polling and subagent announce, but the same undici socket pool hang is now blocking actual model invocations on the main agent. After we layered every reasonable client-side mitigation, Telegram bot-internal commands like /status work (no LLM call), but any real agent run on a long prompt times out.

Affected versions

  • 2026.4.26 (be8c246) — first observed today (2026-04-28). Telegram polling stalls every 10–15 min, sendMessage failures, model timeouts.
  • 2026.4.23 (a979721) — same behavior after rolling back. Bug is not version-specific within this range.

Environment

  • OS: Windows 10.0.26200 (x64)
  • Node: 24.13.0
  • OpenClaw user config: agents.defaults.model.primary = openai/claude-opus-4-7, channels.telegram.streaming.mode=partial
  • Network: Tailscale + LAN, behind Comcast NAT, IPv6 already disabled at adapter binding (Disable-NetAdapterBinding -Name Ethernet -ComponentID ms_tcpip6 shows Enabled: False)
  • Comparable Mac on identical 2026.4.26: zero stalls. Issue is Windows-specific.

Mitigations already applied (none fully resolve)

  1. channels.telegram.streaming.mode=partial, autoSelectFamily=true, dnsResultOrder=ipv4first (set by OpenClaw runtime — see [telegram/network] dnsResultOrder=ipv4first (default-node22) log line)
  2. Add-MpPreference -ExclusionPath for the openclaw npm node_modules path
  3. Add-MpPreference -ExclusionProcess "node.exe"
  4. ✅ Inserted set "NODE_OPTIONS=--dns-result-order=ipv4first" into gateway.cmd before the node.exe launch line (process-level, not just runtime hint)
  5. Disable-NetAdapterBinding -ComponentID ms_tcpip6 on the active Ethernet adapter (was already disabled)
  6. ✅ Hard reboot of the Windows host to flush stuck undici sockets
  7. ✅ Full gateway restart (multiple times)

After all of the above, /status and other bot-internal commands respond instantly. Long prompts to the main agent still time out at 97s on the Anthropic call.
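
To check whether mitigations 1 and 4 actually reached the Node process, a small diagnostic along these lines could be run with the same launch environment as the gateway. This is a sketch: `networkDefaults` is a hypothetical helper name, and it only reports Node's process-wide defaults. Per the issue, undici may not honor these defaults, so a correct value here does not prove the dispatcher behaves accordingly.

```javascript
// Diagnostic sketch: report the network defaults this Node process is using.
// Uses only Node built-ins; networkDefaults is a hypothetical helper name.
const dns = require('node:dns');
const net = require('node:net');

function networkDefaults() {
  return {
    node: process.versions.node,
    // One of 'ipv4first', 'ipv6first' or 'verbatim'
    dnsResultOrder: dns.getDefaultResultOrder(),
    // Process-wide Happy Eyeballs default used by net.connect()
    autoSelectFamily: net.getDefaultAutoSelectFamily(),
    // Shows whether --dns-result-order=ipv4first was passed via the environment
    nodeOptions: process.env.NODE_OPTIONS ?? '(unset)',
  };
}

console.log(networkDefaults());
```

If `dnsResultOrder` is not `ipv4first` when launched through gateway.cmd, the NODE_OPTIONS line is probably not being applied to the node.exe process.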

Logs

Telegram polling stall pattern (recurring all afternoon, ~every 10–15 min)

[telegram] Polling stall detected (active getUpdates stuck for 178.44s); forcing restart.
   [diag inFlight=1 outcome=started startedAt=1777406174509 finishedAt=1777406174509 durationMs=30356 offset=0]
[telegram][diag] polling cycle finished reason=polling stall detected
   error=Network request for 'getUpdates' failed!
Telegram polling runner stopped (polling stall detected); restarting in 3.78s.
[telegram][diag] rebuilding transport for next polling cycle

Telegram sendMessage failure pattern (slash command replies dropped)

telegram sendMessage failed: Network request for 'sendMessage' failed!
telegram slash block reply failed: HttpError: Network request for 'sendMessage' failed!
telegram sendMessage failed: Network request for 'sendMessage' failed!
telegram slash final reply failed: HttpError: Network request for 'sendMessage' failed!

(and intermittently, the same path succeeds: telegram sendMessage ok chat=… message=17754 2 seconds later.)

NEW: model dispatcher timeout (this is the part #66885 doesn't cover)

lane task error: lane=session:agent:main:main durationMs=96963 error="FailoverError: LLM request timed out."
lane task error: lane=main durationMs=7477 error="FailoverError: openrouter (openai/gpt-5.5) returned a billing error..."
Embedded agent failed before reply: All models failed (2):
   openai/claude-opus-4-7: LLM request timed out. (timeout)
   openrouter/openai/gpt-5.5: 402 This request requires more credits…

The 96963 ms duration matches the configured 97 s window between the "[default] starting provider" and "LLM request timed out" log lines: the same undici hang shape as the Telegram stalls, but on the model call.

Direct API tests bypassing OpenClaw (PowerShell, same machine, same network)

Invoke-RestMethod  https://api.telegram.org/bot$bot/getMe          → 404 ms ✅
Invoke-RestMethod  https://api.telegram.org/bot$bot/getUpdates?…  → 399 ms ✅ (returned 2 pending updates)

So api.telegram.org, DNS, TLS, the bot token, and Windows TCP all work. The hang is inside undici's connection pool when the same calls go through Node's built-in fetch.

Suggested root cause (from forum / prior issue references)

Per #66885 and #10795: Node 22+ undici implements Happy Eyeballs but ignores net.setDefaultAutoSelectFamily. When allowH2: true (default) and the host advertises HTTP/2 + IPv6, undici can keep an HTTP/2 stream half-open against an IPv6 path that Windows can't actually route. The dispatcher sits in inFlight until the watchdog kills it.

#66885 fixed this for web_fetch in 4.7 by setting allowH2: false on that dispatcher. The same fix appears not to be applied to:

  • The Telegram polling/sending dispatcher (4.12 onwards per #66885, still in 4.23/4.26)
  • The model dispatcher used for actual LLM requests (openai/claude-opus-4-7 etc.)

Suggested fix

Apply the same allowH2: false (and explicit autoSelectFamily: false in the underlying Agent) to:

  1. The Telegram channel's outbound HTTP client (covering both getUpdates and sendMessage)
  2. The model dispatcher used by agents/harness for provider calls

Both should use a shared undici Agent configured to:

new Agent({
  allowH2: false,
  connect: { autoSelectFamily: false, family: 4 },
})

…or accept a user-supplied dispatcher via env (UNDICI_HTTP1_ONLY=1 or similar) so Windows users without IPv6 routability can opt in without code changes.
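
The opt-in variant could be sketched as below. `UNDICI_HTTP1_ONLY` is the hypothetical variable floated above, not an existing OpenClaw or undici setting, and `agentOptionsFromEnv` is an illustrative name. The function only builds the options object, so it can be checked without undici installed.

```javascript
// Sketch only: UNDICI_HTTP1_ONLY is a hypothetical opt-in variable,
// not an existing OpenClaw or undici setting.
function agentOptionsFromEnv(env = process.env) {
  if (env.UNDICI_HTTP1_ONLY !== '1') {
    return {}; // leave undici's defaults untouched
  }
  return {
    allowH2: false, // never negotiate HTTP/2
    connect: { autoSelectFamily: false, family: 4 }, // IPv4 only, no Happy Eyeballs
  };
}

// Possible wiring (requires the undici package):
//   const { Agent, fetch } = require('undici');
//   const dispatcher = new Agent(agentOptionsFromEnv());
//   await fetch(url, { dispatcher });

console.log(agentOptionsFromEnv({ UNDICI_HTTP1_ONLY: '1' }));
console.log(agentOptionsFromEnv({}));
```

Returning an empty object when the variable is unset keeps behavior unchanged for hosts with working IPv6, which matches the intent that only affected Windows users opt in.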

Why this matters

The current state on Windows is that:

  • Bot-internal commands work (no LLM call)
  • Cron jobs and any prompt to a main agent intermittently time out at the model call layer
  • The watchdog masks the issue for telegram (eventually retries) but not for model calls (one shot, 97s, fail)

Mac users on identical OpenClaw versions are entirely unaffected because their IPv6 stack is healthy enough to negotiate HTTP/2.

Related issues

  • #66885 — Telegram polling stall on Windows (4.12), undici HTTP/2 root cause
  • #10795 — Node 22+ undici ignores net.setDefaultAutoSelectFamily
  • #4847 — Telegram sendMessage fails with 'Network request failed' while curl/standalone Node works
  • #25676 — 2026.2.23 outbound Telegram regression (Node 22 undici)
  • Blog post: https://blog.juchunko.com/en/openclaw-telegram-ipv6-fix/

cc @steipete — given you tracked #71325 to landing, this may be in your area.

extent analysis

TL;DR

The most likely fix is to apply the allowH2: false and autoSelectFamily: false configuration to the Telegram and model dispatchers' undici Agents.

Guidance

  1. Verify the undici version: Ensure that the undici version being used is compatible with the suggested fix.
  2. Apply the configuration: Update the Telegram and model dispatchers to use a shared undici Agent with allowH2: false and autoSelectFamily: false.
  3. Test the fix: Run tests to verify that the hang issue is resolved for both Telegram polling/sending and model dispatcher calls.
  4. Consider user-supplied dispatcher: Allow users to opt-in to the fix via an environment variable (e.g., UNDICI_HTTP1_ONLY=1) for easier configuration.
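
For step 1, the undici version bundled with the running Node binary can be read from `process.versions`. This is a small sketch with a hypothetical helper name. Note the assumption: if OpenClaw depends on its own copy of the undici package, that version may differ from the bundled one and would need to be checked in node_modules instead.

```javascript
// Sketch: report the Node version and the undici version bundled with it.
// runtimeVersions is a hypothetical helper name.
function runtimeVersions() {
  return {
    node: process.versions.node,
    // Bundled undici that backs the built-in fetch; may be absent on old Node builds
    undici: process.versions.undici ?? 'unknown',
  };
}

console.log(runtimeVersions());
```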

Example

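const { Agent } = require('undici');

// Shared Agent that forces HTTP/1.1 over IPv4
const agent = new Agent({
  allowH2: false, // do not negotiate HTTP/2
  connect: { autoSelectFamily: false, family: 4 }, // no Happy Eyeballs, IPv4 only
});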

Notes

The suggested fix may not apply to all environments, especially those with healthy IPv6 stacks. Mac users are reportedly unaffected.

Recommendation

Apply the workaround by setting allowH2: false and autoSelectFamily: false on the undici Agents used by the Telegram channel and the model dispatcher. Per the issue, setting allowH2: false on the web_fetch dispatcher fixed the same hang there (#66885).
