openclaw - Fix Telegram cascade on v2026.5.7: polling stall + outbound HTTP agent never rebuilt; multi-account start-account causes event-loop starvation

Summary

On v2026.5.7 (Windows 11, 8 Telegram bot accounts) the gateway entered a Telegram delivery cascade with two distinct failure modes back-to-back:

  1. Pre-restart: 148s getUpdates polling stall, followed by ~50 consecutive sendChatAction "Network request failed" errors firing every ~3s with no backoff. Network was healthy throughout: DNS, TCP, and TLS to api.telegram.org were all OK from the same host at the same time. The polling cycle's internal transport rebuild fired ("closing stale transport before rebuild", then "rebuilding transport for next polling cycle") but did not repair the sendChatAction HTTP agent; outbound failures continued unabated until a full gateway restart.
  2. Post-restart: During channels.telegram.start-account (8 bots starting in parallel) the gateway hit event-loop starvation (eventLoopDelayMaxMs=8384.4ms, timer delayed 5604ms, likely event-loop starvation), causing multiple getMe calls to time out at 10–15s.

This appears to be a combination of, or interaction between, several already-tracked issues:

  • #50040 — polling stalls causing silent outbound loss (still OPEN)
  • #56096 — sendChatAction infinite retry loop with no backoff (still OPEN)
  • #80362 — Network request for 'X' failed! regex too strict; drops outbound (still OPEN)
  • Closed but symptomatically identical: #76164, #76172, #76258

The new signal here is that both halves reproduce cleanly on v2026.5.7: the closed .4.24/.4.25 issues were not, in fact, fully resolved on .5.7, and the logs show that the inner polling rebuild does not heal the outbound HTTP agent.

Environment

  • OpenClaw: webchat v2026.5.7 (per [ws] handshake banner)
  • OS: Windows 11 Pro N (10.0.26200)
  • Node: bundled
  • Plugins active: browser, device-pair, file-transfer, google, memory-core, microsoft, phone-control, talk-voice, telegram
  • Telegram accounts: 8 (alex, tony, alex-demo, cllawway, alan, donald-clawmp, clawd, daniel-clawstley)
  • IPv4-only forced via --require force-ipv4.js (NODE_OPTIONS)

Pre-restart cascade (excerpt)

11:30:22.669 [telegram] Polling stall detected (no completed getUpdates for 148.89s); forcing restart.
              [diag inFlight=0 outcome=ok durationMs=7074 offset=0 apiElapsedMs=2487]
11:30:22.692 [telegram] sendChatAction failed: Network request for 'sendChatAction' failed!
11:30:25.710 [telegram] sendChatAction failed: Network request for 'sendChatAction' failed!
11:30:28.693 [telegram] sendChatAction failed: Network request for 'sendChatAction' failed!
...  (50+ consecutive failures at ~3s cadence)
11:30:37.676 [telegram] Polling runner stop timed out after 15s; forcing restart cycle.
11:30:54.756 [telegram] [diag] closing stale transport before rebuild
11:30:54.759 [telegram] [diag] rebuilding transport for next polling cycle
11:30:55.715 [telegram] sendChatAction failed: Network request for 'sendChatAction' failed!  ← rebuild did NOT fix outbound
... (failures continue past rebuild, only fixed by full gateway restart)
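The ~3s cadence in the excerpt above looks like a fixed-interval retry. As a minimal sketch of what an escalating alternative could look like, here is a capped exponential backoff with "equal jitter". The function name, option names, and default values are all assumptions for illustration; none of them come from the OpenClaw codebase.

```typescript
// Hypothetical helper: names and defaults are illustrative, not OpenClaw APIs.
interface BackoffOptions {
  baseMs: number; // cap for the first retry
  maxMs: number; // upper bound on the cap for any retry
  factor: number; // multiplier applied per consecutive failure
}

// Equal-jitter exponential backoff: half of the cap is a fixed floor and the
// other half is randomized, so retries escalate but never collapse to zero.
function backoffDelayMs(
  consecutiveFailures: number,
  opts: BackoffOptions = { baseMs: 3000, maxMs: 60000, factor: 2 },
  random: () => number = Math.random,
): number {
  const exponent = Math.max(0, consecutiveFailures - 1);
  const cap = Math.min(opts.maxMs, opts.baseMs * Math.pow(opts.factor, exponent));
  const half = cap / 2;
  return Math.floor(half + random() * half);
}

// Example: delays for the first five consecutive failures.
for (let failures = 1; failures <= 5; failures++) {
  console.log(failures, backoffDelayMs(failures));
}
```

The injectable `random` parameter is only there so the schedule can be checked deterministically; a real retry loop would also need a point at which it gives up or escalates to a transport rebuild.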

Network was healthy during the cascade

Run from the same host while the failures were still firing:

Test-NetConnection api.telegram.org:443 → True
Resolve-DnsName    api.telegram.org      → 149.154.166.110 (TTL 249)
curl https://api.telegram.org/           → http=302 dns=0.008s connect=0.115s total=0.398s

So the failures were not network-level. This strongly suggests a stale persistent HTTP keep-alive socket inside the sendChatAction HTTP agent that the polling-rebuild path doesn't touch.
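If the stale-socket hypothesis holds, one way to make a single rebuild cover both directions is to route polling and outbound calls through one shared holder, so neither side caches its own agent. The sketch below is deliberately library-agnostic: `TransportHolder` and `Closable` are hypothetical names, not OpenClaw, grammy, or undici APIs, and the real transport type is an assumption left to the caller.

```typescript
// Hypothetical sketch: not an OpenClaw, grammy, or undici API.
interface Closable {
  close(): void;
}

class TransportHolder<T extends Closable> {
  private current: T;
  private generation = 0;
  private readonly create: () => T;

  constructor(create: () => T) {
    this.create = create;
    this.current = create();
  }

  // Callers fetch the transport per request instead of caching it,
  // so they always see the most recently rebuilt instance.
  get(): T {
    return this.current;
  }

  getGeneration(): number {
    return this.generation;
  }

  // Swap in a fresh transport first, then close the stale one.
  rebuild(): void {
    const stale = this.current;
    this.current = this.create();
    this.generation += 1;
    stale.close();
  }
}
```

With this shape, the polling loop's stall handler would call `rebuild()` once, and the next sendChatAction or sendMessage call would pick up the fresh instance through `get()`.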

Post-restart event-loop starvation

11:33:47.768 [gateway] resolving authentication…
11:35:06.071 [gateway] http server listening (9 plugins; 78.2s)
11:35:16.715 [gateway] startup model warmup timed out after 5000ms; continuing without waiting
11:35:19.942 [telegram] [alex] starting provider (@alexclawdmozibot)
11:35:20.280 [telegram] [tony] starting provider (@TonyClawbinsBot)
11:35:20.283 [telegram] [alex-demo] starting provider (@alexclawdmozidemobot)
11:35:20.285 [telegram] [cllawway] starting provider (@cllawawaybot)
11:35:20.288 [telegram] [alan] starting provider (@alanclawttsbot)
11:35:20.291 [telegram] [donald-clawmp] starting provider (@donaldclawmpbot)
11:35:20.294 [telegram] [clawd] starting provider (@clawdminsterbot)
11:35:20.297 [telegram] [daniel-clawstley] starting provider (@danielclawdslibot)

11:38:10.220 [diagnostic] liveness warning: reasons=event_loop_delay
              eventLoopDelayP99Ms=23.8 eventLoopDelayMaxMs=8384.4
              eventLoopUtilization=0.672 cpuCoreRatio=0.639
              phase=channels.telegram.start-account
11:38:31.430 [fetch-timeout] fetch timeout after 10000ms (elapsed 10682ms) operation=fetchWithTimeout
              url=https://api.telegram.org/bot823795…/getMe
11:38:47.038 [fetch-timeout] fetch timeout after 10000ms (elapsed 15604ms) timer delayed 5604ms,
              likely event-loop starvation operation=fetchWithTimeout
              url=https://api.telegram.org/bot818818…/getMe
11:39:09.345 [fetch-timeout] fetch timeout after 10000ms (elapsed 10133ms) operation=fetchWithTimeout
              url=https://api.telegram.org/bot858735…/getMe

8 bots starting in parallel + first-call agent-model resolution + session-locks recovery + model warmup all bunched into the same tick window. The 5.6s timer delay confirms the loop was actually parked, not just slow.
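To illustrate what bounding that fanout could look like, here is a minimal concurrency limiter that starts at most `limit` accounts at a time. `runWithConcurrency` and the `startAccount` stub are hypothetical; the real start-account entry point and a sensible limit are assumptions, not taken from the OpenClaw source.

```typescript
// Hypothetical helper: the limit and the startAccount stub are illustrative.
async function runWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array<R>(items.length);
  let next = 0;

  // Each lane claims the next unclaimed index until the list is exhausted.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results; // same order as the input items
}

// Usage with the account names from this report and a stand-in worker.
const accounts = [
  "alex", "tony", "alex-demo", "cllawway",
  "alan", "donald-clawmp", "clawd", "daniel-clawstley",
];
async function startAccount(name: string): Promise<string> {
  await Promise.resolve(); // placeholder for getMe and provider startup
  return `${name}: started`;
}
void runWithConcurrency(accounts, 2, startAccount).then((lines) => {
  console.log(lines.join("\n"));
});
```

A limiter like this only caps how many starts are in flight; it does not by itself remove synchronous work from the event loop, so moving warmup off the start path may still be needed. If one worker rejects, the returned promise rejects while the other lanes keep draining the list.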

Reproduction

Run a gateway with ≥8 Telegram accounts on Windows. Leave it idle overnight, then send an inbound message. Most days nothing happens; on a stall day the pre-restart cascade above appears. Restart the gateway and you'll see the post-restart starvation pattern reliably during start-account fanout.

Proposed fixes (for triage)

  1. Outbound HTTP agent rebuild. When polling-rebuild fires, also dispose the outbound (sendChatAction/sendMessage) agent — or share a single Undici/grammy transport instance so a single rebuild fixes both directions. The current code path leaves outbound permanently broken until full restart.
  2. Backoff on sendChatAction failures. Per #56096 — the 3s/3s/3s repeat with no jitter or escalation is a hot loop.
  3. Bound multi-account start-account fanout. Throttle to N=2–3 concurrent start-account calls instead of all 8 at once, or move the first getMe off the hot path so model warmup + agent prep don't compound.
  4. Recognize bare grammy error string per #80362 so Network request for 'sendChatAction' failed! is classified recoverable and a retry-with-fresh-socket can fire instead of dropping the send.
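For fix 4, a classifier along the following lines would accept the bare string seen in the logs. This is a sketch under stated assumptions: the actual pattern OpenClaw uses is not shown in this report, and `classifyTelegramError` and its return shape are hypothetical. Only the message text itself is taken from the log excerpts.

```typescript
// Hypothetical classifier: the real pattern in OpenClaw is not shown in this report.
// Matches the bare message, with or without trailing detail after "failed!".
const BARE_NETWORK_FAILURE = /^Network request for '([A-Za-z]+)' failed!/;

interface Classification {
  recoverable: boolean;
  method: string | null; // Bot API method name extracted from the message
}

function classifyTelegramError(message: string): Classification {
  const match = BARE_NETWORK_FAILURE.exec(message.trim());
  if (match === null) {
    return { recoverable: false, method: null };
  }
  return { recoverable: true, method: match[1] ?? null };
}

console.log(classifyTelegramError("Network request for 'sendChatAction' failed!"));
```

Anchoring at the start of the message keeps the match narrow, while leaving the end unanchored tolerates any suffix a wrapper might append. Whether a recoverable classification should trigger a retry on a fresh socket is a separate policy decision.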

Related

  • #50040 (open) — polling stalls + silent outbound loss
  • #56096 (open) — sendChatAction retry loop, no backoff
  • #80362 (open) — strict regex drops outbound on bare grammy failures
  • #76164 / #76172 / #76258 (closed) — symptomatically identical for .4.24/.4.25; this report demonstrates the same patterns reproduce on v2026.5.7
  • #79380 (open) — Pi 4 CPU spin regression .4.23 → .4.25+ (may share start-account fanout root cause)
