[Bug]: Multi-account Telegram startup: all 16 bots time out simultaneously; 10s timeout blown to 41-82s with no circuit breaker

openclaw, 2026-05-06 08:06:55

Issue: openclaw/openclaw#78353 (fetched 2026-05-07 03:37:58)
Author: xiaobu1112
Participants: clawsweeper[bot], xiaobu1112
Timeline: commented ×1, cross-referenced ×1

When running OpenClaw with 16 Telegram bot accounts, every bot's getMe health check fails with fetch-timeout during startup. Because of event loop starvation, the 10-second timeout repeatedly fires only after 41-82 seconds have elapsed. The underlying problem is that the Telegram channel init pipeline has no circuit breaker or staggered retry logic for multi-account scenarios.


Description

Environment

OS: Windows 10.0.26200 (x64)
Node.js: v25.9.0
OpenClaw: 2026.5.x
Telegram bots: 16 concurrent bot accounts

Observed Behavior

All 16 Telegram bots fail getMe during startup:

| Bot token (suffix) | Timeout | Elapsed | Timer delay |
| --- | --- | --- | --- |
| `...5qhQ` | 10,000ms | 66,668ms | 56,668ms |
| `...8YYQ` | 10,000ms | 29,368ms | 19,368ms |
| `...ZAHE` | 10,000ms | 41,638ms | 31,638ms |
| `...yl6g` | 10,000ms | 13,245ms | 3,245ms |
| `...4A_Y` | 10,000ms | 10,852ms | 852ms |
| `...Pgpk` | 10,000ms | 10,506ms | 506ms |
| (+ 10 more) | 10,000ms | 10,000-82,000ms | varied |

Log Samples

[fetch-timeout] fetch timeout after 10000ms (elapsed 66668ms)
timer delayed 56668ms, likely event-loop starvation
operation=fetchWithTimeout url=https://api.telegram.org/bot856265...5qhQ/getMe

[fetch-timeout] fetch timeout after 10000ms (elapsed 82339ms)
timer delayed 72339ms, likely event-loop starvation
operation=fetchWithTimeout url=https://api.telegram.org/bot809652...ZAHE/getMe
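
In both samples the reported timer delay equals the elapsed time minus the configured timeout (66668 - 10000 = 56668). A minimal sketch of pulling those numbers out of a log line follows; `parseFetchTimeout` and `FetchTimeoutEntry` are hypothetical names for illustration, not OpenClaw APIs, and the regular expression assumes the exact line format shown above.

```typescript
// Hypothetical helper, not part of OpenClaw: parses the first line of a
// [fetch-timeout] entry in the format shown in the samples above.
interface FetchTimeoutEntry {
  timeoutMs: number;
  elapsedMs: number;
  timerDelayMs: number; // elapsed minus configured timeout
}

const FETCH_TIMEOUT_RE = /fetch timeout after (\d+)ms \(elapsed (\d+)ms\)/;

function parseFetchTimeout(line: string): FetchTimeoutEntry | null {
  const match = FETCH_TIMEOUT_RE.exec(line);
  if (match === null) return null; // line is not a fetch-timeout entry
  const timeoutMs = Number(match[1]);
  const elapsedMs = Number(match[2]);
  return { timeoutMs, elapsedMs, timerDelayMs: elapsedMs - timeoutMs };
}

const sample = parseFetchTimeout(
  "[fetch-timeout] fetch timeout after 10000ms (elapsed 66668ms)",
);
console.log(sample);
```

Running this over a full startup log would give the per-bot delay distribution summarized in the table under Observed Behavior.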

Chain Reaction

All 16 bots fire getMe nearly simultaneously during channels.telegram.start-account
Event loop is saturated by synchronous initialization work
Timer callbacks are delayed by 56-72 seconds
Fetch timeouts fire late (or not at all) because the timer itself was delayed
Each timeout generates a WARN log → more I/O → more event-loop pressure
Subsequent operations (session recovery, model prewarm) are further delayed
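
The second and third steps of the chain above can be reproduced in isolation. The sketch below is a generic Node.js demonstration, not OpenClaw code: `busyWait` and `measureTimerDelay` are hypothetical helpers, and the busy loop merely stands in for whatever synchronous initialization work saturates the event loop.

```typescript
// Minimal demonstration (not OpenClaw code): a timer scheduled for 10 ms
// cannot fire until synchronous work on the same thread has finished.
function busyWait(ms: number): void {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // Spin: simulates synchronous initialization work.
  }
}

function measureTimerDelay(timeoutMs: number, blockMs: number): Promise<number> {
  return new Promise((resolve) => {
    const scheduledAt = Date.now();
    setTimeout(() => {
      // Delay is the time that passed beyond the requested timeout.
      resolve(Date.now() - scheduledAt - timeoutMs);
    }, timeoutMs);
    busyWait(blockMs); // blocks the event loop before the timer can run
  });
}

measureTimerDelay(10, 200).then((delayMs) => {
  console.log(`timer delayed ${delayMs}ms`);
});
```

The timer callback only runs once the blocking call returns, so the measured delay tracks the length of the synchronous work rather than the requested timeout.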

Relationship to Existing Issues

#77900 covers a single bot hitting ENETUNREACH (network offline) with no circuit breaker.
This issue covers a multi-account startup cascade: all 16 accounts time out simultaneously even when the network is healthy. The root cause is serial initialization + no stagger + no batch-aware retry.
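
One possible reading of "batch-aware" is a breaker that looks at the failure ratio across all accounts in a time window rather than at each bot individually. The following is a hedged sketch under that assumption; `BatchCircuitBreaker`, `BreakerOptions`, and every option name are hypothetical and do not exist in OpenClaw.

```typescript
// Hypothetical sketch; class and option names are illustrative only.
interface BreakerOptions {
  windowMs: number;     // sliding window over which outcomes are counted
  failureRatio: number; // open when failures / samples exceeds this (e.g. 0.5)
  minSamples: number;   // do not judge the batch on too few results
  cooldownMs: number;   // how long to pause once the breaker opens
}

class BatchCircuitBreaker {
  private results: { at: number; ok: boolean }[] = [];
  private openUntil = 0;
  private readonly opts: BreakerOptions;

  constructor(opts: BreakerOptions) {
    this.opts = opts;
  }

  /** Record the outcome of one health check at time `now` (ms). */
  record(ok: boolean, now: number): void {
    this.results.push({ at: now, ok });
    // Drop samples that fell out of the sliding window.
    this.results = this.results.filter((r) => now - r.at <= this.opts.windowMs);
    const failures = this.results.filter((r) => !r.ok).length;
    const enough = this.results.length >= this.opts.minSamples;
    if (enough && failures / this.results.length > this.opts.failureRatio) {
      this.openUntil = now + this.opts.cooldownMs;
      this.results = []; // start fresh after the cooldown
    }
  }

  /** True while callers should pause instead of starting new checks. */
  isOpen(now: number): boolean {
    return now < this.openUntil;
  }
}

const breaker = new BatchCircuitBreaker({
  windowMs: 10_000,
  failureRatio: 0.5,
  minSamples: 4,
  cooldownMs: 30_000,
});
breaker.record(false, Date.now());
console.log(breaker.isOpen(Date.now())); // one sample is below minSamples
```

Passing `now` explicitly keeps the class deterministic and easy to test; a caller would check `isOpen` before launching the next account's health check.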

Impact

Startup time: 2-3+ minutes before all bots either succeed or give up
User perception: "Gateway started but Telegram not responding" for minutes
Cascading failure: Event loop starvation from this phase bleeds into all other subsystems
False alarms: Diagnostic liveness warnings trigger because event loop utilization (ELU) stays at 1.0 during startup

Suggested Fix Direction

Staggered startup: Add configurable delay between bot account initializations (e.g., 2-3s gap per account)
Batch-aware circuit breaker: If >50% of getMe calls fail within a window, pause and back off
Worker-thread offload: Move fetchWithTimeout for channel health checks off the main event loop
Adaptive timeout: During startup phase, increase timeout to 30s instead of flat 10s
Parallel-but-bounded: Limit concurrent getMe calls to N at a time (e.g., 4 concurrent) instead of all 16 simultaneously
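
The staggered and parallel-but-bounded ideas combine naturally. Below is a minimal sketch, assuming each account start can be wrapped as a function returning a promise; `runBounded`, `Outcome`, and `sleep` are hypothetical names, and nothing here reflects how OpenClaw actually schedules `channels.telegram.start-account`.

```typescript
// Hypothetical sketch: runs `tasks` with at most `limit` in flight and
// keeps consecutive starts at least `staggerMs` apart.
type Outcome<T> = { ok: true; value: T } | { ok: false; error: unknown };

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function runBounded<T>(
  tasks: Array<() => Promise<T>>,
  limit: number,
  staggerMs: number,
): Promise<Outcome<T>[]> {
  const results: Outcome<T>[] = [];
  const queue = tasks.map((task, index) => ({ task, index }));
  let nextStartAt = Date.now();

  async function worker(): Promise<void> {
    for (let item = queue.shift(); item !== undefined; item = queue.shift()) {
      // Reserve a start slot so starts are spaced by staggerMs.
      const startAt = Math.max(Date.now(), nextStartAt);
      nextStartAt = startAt + staggerMs;
      await sleep(startAt - Date.now());
      try {
        results[item.index] = { ok: true, value: await item.task() };
      } catch (error) {
        // A failed task is recorded and does not stop the other workers.
        results[item.index] = { ok: false, error };
      }
    }
  }

  // Each worker pulls from the shared queue, so at most `limit` tasks run at once.
  const workers = Array.from({ length: Math.max(1, limit) }, () => worker());
  await Promise.all(workers);
  return results;
}
```

With the numbers suggested above, a call such as `runBounded(accountStarters, 4, 2000)` would start the 16 accounts two seconds apart with no more than four getMe checks outstanding, where `accountStarters` is a placeholder for the per-account start functions. Results keep the input order, so failures can be mapped back to accounts for retry.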
