openclaw - ✅(Solved) Fix telegram polling can restart into not-started state after connectivity loss [3 pull requests, 1 participants]

openclaw · 2026-03-26 22:52:49

openclaw/openclaw#55406•Fetched 2026-04-08 01:39:55

Author

sinogello

After Telegram connectivity interruption and subsequent recovery, Telegram long polling can get stuck in a post-restart not-started state where the next polling cycle never actually begins its first getUpdates call.

In real deployment testing, the following sequence occurred repeatedly:

1. A getUpdates request fails or becomes stuck during a Telegram connectivity disruption.
2. The watchdog detects the polling failure and forces a restart.
3. The transport/bot is rebuilt successfully.
4. The new polling cycle enters before-race.
5. No new getUpdates request is ever started (inFlight=0, outcome=not-started).
6. Without additional guarding, the process waits for the full 90s stall watchdog and then repeats.

This creates a misleading "Telegram connectivity has recovered but polling is still dead" failure mode.

Error Message

2026-03-27T02:56:01.784+08:00 [telegram] [diag] polling cycle finished reason=polling stall detected inFlight=0 outcome=error startedAt=1774550909084 finishedAt=1774551361781 durationMs=452697 offset=661064690 error=Network request for 'getUpdates' failed!
[2026-03-27 06:25:19] #5335 HEALTH_FAIL elapsed=5.08s error=URLError desc=<urlopen error [Errno 54] Connection reset by peer>

Root Cause

The same dead-start pattern reproduces when only Telegram connectivity is blocked at the router level. This narrows the issue to Telegram connectivity disruption, not just "entire network disappeared" behavior.

PR fix notes

PR #55918: telegram: cache botInfo across polling cycles to eliminate getMe() init stall

Repository: openclaw/openclaw
Author: sinogello
State: closed | merged: False
Link: https://github.com/openclaw/openclaw/pull/55918

Description (problem / solution / changelog)

Problem

After a network interruption, Telegram polling recovery is very slow (from 30 seconds to several minutes). The root cause is that every polling cycle creates a new Bot instance without botInfo, forcing the grammY runner to call bot.init() → getMe() on each restart.

Inside grammY, getMe() uses withRetries() with exponential backoff up to 20 minutes, and the runner calls bot.init() without passing an AbortSignal, so this retry loop cannot be cancelled externally. This is the root cause of the not-started stall observed after network recovery.

Call chain

polling-session.ts: runUntilAbort() loop
  → #createPollingBot()
    → createTelegramBot({ token, ... })  // no botInfo passed
      → new Bot(token, { client })       // this.me = undefined
  → run(bot, runnerOptions)              // @grammyjs/runner
    → source.supply()
      → await bot.init()                 // no abort signal!
        → withRetries(() => api.getMe()) // exponential backoff up to 20min
        → ...hangs here...
      → fetchUpdates(...)                // never reached

Fix

Cache the UserFromGetMe result from the first successful getMe() call and inject it as botInfo into subsequent Bot constructor calls. When botInfo is provided, grammy's isInited() returns true immediately, making bot.init() a no-op — getUpdates starts without any network round-trip.
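The caching idea described above can be sketched without grammY itself. This is a minimal model under stated assumptions: `BotInfo`, `BotLike`, `createBot`, and `PollingSessionSketch` are hypothetical stand-ins, not the openclaw or grammY code. The only behavior it mirrors is the one the PR relies on: a bot constructed with botInfo treats init() as a no-op.

```typescript
// Hypothetical stand-ins for illustration; not the openclaw or grammY types.
interface BotInfo {
  id: number;
  username: string;
}

interface BotLike {
  botInfo?: BotInfo;
  init(): Promise<void>;
}

// Models "init() is a no-op when botInfo was supplied at construction".
function createBot(getMe: () => Promise<BotInfo>, botInfo?: BotInfo): BotLike {
  const bot: BotLike = {
    botInfo,
    async init() {
      if (bot.botInfo !== undefined) return; // already initialized, no network call
      bot.botInfo = await getMe();
    },
  };
  return bot;
}

class PollingSessionSketch {
  private cachedBotInfo?: BotInfo;
  private readonly getMe: () => Promise<BotInfo>;

  constructor(getMe: () => Promise<BotInfo>) {
    this.getMe = getMe;
  }

  // Builds a fresh bot for each polling cycle, reusing the cached identity if present.
  async startCycle(): Promise<BotLike> {
    const bot = createBot(this.getMe, this.cachedBotInfo);
    await bot.init();
    if (this.cachedBotInfo === undefined) {
      this.cachedBotInfo = bot.botInfo; // cache after the first successful init
    }
    return bot;
  }
}
```

With this shape, only the first cycle pays for a getMe() round-trip; every rebuilt bot after that starts with its identity already known.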

Changes

bot.ts: Accept optional botInfo in TelegramBotOptions and pass it to the Bot constructor
bot.runtime.ts: Export UserFromGetMe type from @grammyjs/types
polling-session.ts: Cache botInfo after first successful init; pass it to subsequent bot instances

Result

First cold start: normal getMe() call, result cached
All subsequent cycles: isInited() === true → bot.init() is no-op → getUpdates starts immediately
Network recovery: new cycle skips getMe() entirely → near-instant polling resume

Relationship to #55406 / #55407

The startup watchdog (PR #55407) remains valuable as defense-in-depth for edge cases where botInfo is not yet cached (e.g. first cold start with no network). This PR addresses the root cause; the watchdog addresses the symptom.

Changed files

docs/.generated/plugin-sdk-api-baseline.json (modified, +9/-0)
docs/.generated/plugin-sdk-api-baseline.jsonl (modified, +1/-0)
extensions/anthropic/test-api.ts (added, +1/-0)
extensions/bluebubbles/api.ts (modified, +4/-0)
extensions/discord/action-runtime-api.ts (added, +1/-0)
extensions/discord/src/actions/handle-action.guild-admin.ts (modified, +1/-1)
extensions/discord/src/actions/handle-action.ts (modified, +1/-1)
extensions/google/test-api.ts (added, +1/-0)
extensions/image-generation-core/api.ts (added, +1/-0)
extensions/image-generation-core/package.json (added, +7/-0)
extensions/image-generation-core/runtime-api.ts (added, +6/-0)
extensions/image-generation-core/src/runtime.ts (added, +183/-0)
extensions/matrix/src/config-schema.ts (modified, +4/-2)
extensions/mattermost/src/channel.ts (modified, +2/-3)
extensions/mattermost/src/config-schema-core.ts (added, +125/-0)
extensions/mattermost/src/config-schema.ts (modified, +1/-115)
extensions/mattermost/src/config-surface.ts (added, +4/-0)
extensions/media-understanding-core/package.json (added, +7/-0)
extensions/media-understanding-core/runtime-api.ts (added, +9/-0)
extensions/media-understanding-core/src/runtime.ts (added, +147/-0)
extensions/msteams/test-api.ts (added, +1/-0)
extensions/nextcloud-talk/src/config-schema.ts (modified, +23/-11)
extensions/nostr/src/config-schema.ts (modified, +6/-2)
extensions/nostr/test-api.ts (added, +1/-0)
extensions/openai/test-api.ts (modified, +1/-0)
extensions/shared/config-schema-helpers.ts (modified, +17/-1)
extensions/signal/reaction-runtime-api.ts (added, +6/-0)
extensions/signal/src/accounts.ts (modified, +1/-1)
extensions/signal/src/message-actions.ts (modified, +1/-1)
extensions/slack/test-api.ts (modified, +2/-0)
extensions/speech-core/api.ts (added, +1/-0)
extensions/speech-core/package.json (added, +7/-0)
extensions/speech-core/runtime-api.ts (added, +33/-0)
extensions/speech-core/src/tts.ts (added, +849/-0)
extensions/telegram/src/bot.runtime.ts (modified, +1/-0)
extensions/telegram/src/bot.ts (modified, +15/-2)
extensions/telegram/src/polling-session.ts (modified, +22/-1)
extensions/telegram/test-api.ts (modified, +2/-0)
extensions/tlon/test-api.ts (added, +1/-0)
extensions/twitch/src/config-schema.ts (modified, +1/-1)
extensions/whatsapp/test-api.ts (modified, +2/-0)
extensions/zalo/src/config-schema.ts (modified, +2/-2)
extensions/zalouser/src/config-schema.ts (modified, +3/-2)
package.json (modified, +8/-0)
scripts/lib/plugin-sdk-entrypoints.json (modified, +2/-0)
scripts/openclaw-npm-postpublish-verify.ts (modified, +2/-10)
src/agents/cli-runner.test-support.ts (modified, +3/-3)
src/agents/tools/tts-tool.test.ts (modified, +15/-15)
src/auto-reply/reply/commands-system-prompt.test.ts (modified, +10/-11)
src/cli/prompt.runtime.ts (added, +1/-0)
src/cli/update-cli.test.ts (modified, +1/-1)
src/commands/channel-test-helpers.ts (modified, +5/-5)
src/cron/isolated-agent.test-setup.ts (modified, +1/-1)
src/gateway/test-helpers.mocks.ts (modified, +2/-2)
src/image-generation/runtime.ts (modified, +6/-183)
src/infra/binaries.runtime.ts (added, +1/-0)
src/infra/env.ts (modified, +12/-2)
src/infra/heartbeat-runner.test-harness.ts (modified, +3/-3)
src/infra/heartbeat-runner.test-utils.ts (modified, +1/-1)
src/infra/outbound/message-action-runner.test-helpers.ts (modified, +2/-2)
src/infra/outbound/targets.shared-test.ts (modified, +1/-1)
src/infra/provider-usage.auth.plugin.test.ts (modified, +5/-1)
src/library.test.ts (modified, +9/-9)
src/library.ts (modified, +27/-35)
src/media-understanding/runtime.ts (modified, +9/-146)
src/plugin-sdk/account-resolution.ts (modified, +4/-10)
src/plugin-sdk/agent-config-primitives.ts (added, +3/-0)
src/plugin-sdk/bluebubbles.ts (modified, +2/-1)
src/plugin-sdk/channel-config-primitives.ts (added, +16/-0)
src/plugin-sdk/channel-import-guardrails.test.ts (modified, +1/-1)
src/plugin-sdk/channel-runtime.ts (modified, +1/-0)
src/plugin-sdk/compat.ts (modified, +1/-1)
src/plugin-sdk/config-runtime.ts (modified, +4/-0)
src/plugin-sdk/image-generation-core.ts (modified, +15/-0)
src/plugin-sdk/image-generation-runtime.ts (modified, +6/-1)
src/plugin-sdk/mattermost.ts (modified, +1/-0)
src/plugin-sdk/media-runtime.ts (modified, +1/-0)
src/plugin-sdk/media-understanding-runtime.ts (modified, +3/-1)
src/plugin-sdk/signal.ts (modified, +10/-11)
src/plugin-sdk/speech-core.ts (modified, +9/-0)
src/plugin-sdk/speech-runtime.ts (modified, +33/-1)
src/plugins/capability-provider-runtime.ts (modified, +1/-1)
src/plugins/contracts/registry.ts (modified, +20/-43)
src/plugins/public-artifacts.ts (modified, +7/-8)
src/plugins/runtime/runtime-matrix-boundary.ts (modified, +11/-9)
src/plugins/runtime/runtime-matrix-surface.ts (added, +22/-0)
src/plugins/runtime/runtime-whatsapp-boundary.ts (modified, +71/-64)
src/plugins/runtime/runtime-whatsapp-surface.ts (added, +249/-0)
src/test-utils/imessage-test-plugin.ts (modified, +1/-1)
src/tts/tts.ts (modified, +34/-855)

PR #55937: telegram: cache botInfo across polling cycles to eliminate getMe() init stall

Repository: openclaw/openclaw
Author: sinogello
State: closed | merged: False
Link: https://github.com/openclaw/openclaw/pull/55937

Description (problem / solution / changelog)

Problem

Call chain

polling-session.ts: runUntilAbort() loop
  → #createPollingBot()
    → createTelegramBot({ token, ... })  // no botInfo passed
      → new Bot(token, { client })       // this.me = undefined
  → run(bot, runnerOptions)              // @grammyjs/runner
    → source.supply()
      → await bot.init()                 // no abort signal!
        → withRetries(() => api.getMe()) // exponential backoff up to 20min
        → ...hangs here...
      → fetchUpdates(...)                // never reached

Fix

Cache botInfo: Store the UserFromGetMe result from the first successful getMe() call and inject it as botInfo into subsequent Bot constructor calls. When botInfo is provided, grammy's isInited() returns true immediately, making bot.init() a no-op.
Forward abortSignal during init: Wire the session's abortSignal to the fetch abort controller during the initial bot.init() call, so a shutdown request during first cold start with no network is not blocked by grammy's internal retry loop.
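To illustrate why a forwarded abort signal matters, here is a hedged sketch of a retry loop with exponential backoff that can be cancelled from outside. The names `sleep` and `retryUntilAborted` and the delay values are assumptions for illustration; this is not grammY's withRetries() nor the code in this PR.

```typescript
// Resolves after `ms`, or rejects early if the signal aborts.
function sleep(ms: number, signal: AbortSignal): Promise<void> {
  return new Promise((resolve, reject) => {
    if (signal.aborted) {
      reject(new Error("aborted"));
      return;
    }
    const onAbort = () => {
      clearTimeout(timer);
      reject(new Error("aborted"));
    };
    const timer = setTimeout(() => {
      signal.removeEventListener("abort", onAbort);
      resolve();
    }, ms);
    signal.addEventListener("abort", onAbort, { once: true });
  });
}

// Retries `fn` with exponential backoff until it succeeds or the signal aborts.
// `fn` receives the signal so it can cancel its own in-flight request.
async function retryUntilAborted<T>(
  fn: (signal: AbortSignal) => Promise<T>,
  signal: AbortSignal,
  baseDelayMs = 1_000, // illustrative values
  maxDelayMs = 60_000,
): Promise<T> {
  let delay = baseDelayMs;
  for (;;) {
    if (signal.aborted) throw new Error("aborted");
    try {
      return await fn(signal);
    } catch {
      await sleep(delay, signal); // cancellable wait between attempts
      delay = Math.min(delay * 2, maxDelayMs);
    }
  }
}
```

Without the signal checks, a caller that wants to shut down has to wait for the current backoff interval to elapse, which is the blocking behavior the fix is meant to avoid.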

Changes (3 files, ~50 lines)

bot.runtime.ts: Export UserFromGetMe type
bot.ts: Accept optional botInfo in TelegramBotOptions, pass to Bot constructor
polling-session.ts: Cache botInfo after first init; forward abortSignal during init

Result

First cold start: normal getMe() call, result cached
All subsequent cycles: isInited() === true → bot.init() is no-op → getUpdates starts immediately
Network recovery: new cycle skips getMe() entirely → near-instant polling resume
Shutdown during first init: properly cancelled via forwarded abort signal

Relationship to #55406 / #55407

The startup watchdog (PR #55407) remains valuable as defense-in-depth for edge cases where botInfo is not yet cached. This PR addresses the root cause.

Changed files

extensions/telegram/src/bot.runtime.ts (modified, +1/-0)
extensions/telegram/src/bot.ts (modified, +15/-2)
extensions/telegram/src/polling-session.test.ts (modified, +8/-1)
extensions/telegram/src/polling-session.ts (modified, +53/-1)

PR #58451: feat(telegram): add heartbeat supervisor for silent network outage detection

Repository: openclaw/openclaw
Author: V-Gutierrez
State: closed | merged: False
Link: https://github.com/openclaw/openclaw/pull/58451

Description (problem / solution / changelog)

Background

In production deployments, the existing polling watchdog detects stalls when getUpdates hangs, but it cannot detect silent TCP drops — when the network connection dies without any error signal. In these cases, the bot can sit idle for 20+ minutes before any recovery attempt.

Related issues

#54708 — Message Loss on Telegram Network Failure
#52116 — Telegram polling client gets permanently stuck after transient network failure
#54513 — Telegram polling has no stall detection (unlike Slack health-monitor)
#55406 — Telegram polling can restart into not-started state after connectivity loss
#47458 — Polling stall loop — getUpdates hangs, restart never recovers
#41704 — Telegram polling stalls indefinitely when proxy TCP connection drops silently
#42782 — [Feature Request] Add health-monitor auto-reconnect for Telegram polling
#44396 — Telegram polling stall (~95s) causes significant message delivery delay

What changed

New: `HeartbeatSupervisor` (`extensions/telegram/src/heartbeat.ts`)

A threshold-based heartbeat supervisor that runs periodic getMe probes using the existing probeTelegram() function from probe.ts:

Runs on a configurable interval (default: 30s)
Counts consecutive probe failures
Fires onOutageDetected after reaching the failure threshold (default: 3)
Fires onRecovered once when connectivity returns
Security: Reuses probeTelegram() which already handles transport safely. Error messages are logged without the bot token or full URL — only the method name, error description, and failure counter.
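The threshold logic listed above can be sketched as follows. This is an illustrative model, not the PR's HeartbeatSupervisor: the class name `HeartbeatSketch`, the `tick()` method, and the boolean `probe` callback are assumptions, and the interval timer is left to the caller so the logic stays easy to test.

```typescript
// Hypothetical option names; the PR's actual interface may differ.
interface HeartbeatOptions {
  probe: () => Promise<boolean>; // resolves true when the API is reachable
  failureThreshold: number; // consecutive failures before declaring an outage
  onOutageDetected: () => void;
  onRecovered: () => void;
}

class HeartbeatSketch {
  private consecutiveFailures = 0;
  private outageActive = false;
  private probing = false;
  private readonly options: HeartbeatOptions;

  constructor(options: HeartbeatOptions) {
    this.options = options;
  }

  // Runs one probe; a real supervisor would call this from an interval timer.
  async tick(): Promise<void> {
    if (this.probing) return; // skip overlapping probes
    this.probing = true;
    try {
      const ok = await this.options.probe().catch(() => false);
      if (ok) {
        this.consecutiveFailures = 0;
        if (this.outageActive) {
          this.outageActive = false;
          this.options.onRecovered(); // fires once per outage
        }
        return;
      }
      this.consecutiveFailures += 1;
      if (!this.outageActive && this.consecutiveFailures >= this.options.failureThreshold) {
        this.outageActive = true;
        this.options.onOutageDetected(); // fires once when the threshold is reached
      }
    } finally {
      this.probing = false;
    }
  }
}
```

Counting consecutive failures rather than reacting to a single failed probe keeps one dropped request from triggering a polling restart.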

Modified: `TelegramPollingSession` (`extensions/telegram/src/polling-session.ts`)

Integrates HeartbeatSupervisor in runUntilAbort() when apiBase is provided
onOutageDetected aborts the current polling cycle only via a cycle-scoped AbortController (not the global abort signal), so the outer loop can restart cleanly
onRecovered logs recovery; the polling loop restarts naturally
Supervisor starts before the polling loop, stops in finally
The existing watchdog is untouched — both mechanisms work independently
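The cycle-scoped abort described above can be sketched with the standard AbortController API. The helper name `createCycleController` is hypothetical and the sketch is not taken from the PR; it only shows one way to abort a single cycle while leaving the session-wide signal untouched.

```typescript
// Creates a controller for one polling cycle that also aborts when the
// session-wide signal aborts. Aborting the cycle does not touch the session signal.
function createCycleController(sessionSignal: AbortSignal): {
  controller: AbortController;
  dispose: () => void;
} {
  const controller = new AbortController();
  const forward = () => controller.abort();
  if (sessionSignal.aborted) {
    controller.abort();
  } else {
    sessionSignal.addEventListener("abort", forward, { once: true });
  }
  return {
    controller,
    // Remove the listener when the cycle ends so listeners do not accumulate.
    dispose: () => sessionSignal.removeEventListener("abort", forward),
  };
}
```

An outage handler would call `controller.abort()` on the current cycle only, and the outer loop would then create a new controller for the next cycle.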

Modified: `monitor.ts`

Passes apiBase (from resolveTelegramApiBase()) to TelegramPollingSession

Tests

14 new tests in heartbeat.test.ts: threshold behavior, recovery, abort signal, overlap prevention, token-never-in-logs assertion
1 new test in polling-session.test.ts: verifies HeartbeatSupervisor starts when apiBase is provided
All 9 existing polling-session.test.ts tests pass unchanged

Test Files  2 passed (2)
     Tests  24 passed (24)

How to test

pnpm vitest run extensions/telegram/src/heartbeat.test.ts extensions/telegram/src/polling-session.test.ts

For manual QA: deploy with a Telegram bot, kill network connectivity for 2+ minutes, observe logs for [telegram][heartbeat] probe failed → outage detection → recovery on reconnect.

Changed files

extensions/telegram/src/heartbeat.test.ts (added, +417/-0)
extensions/telegram/src/heartbeat.ts (added, +110/-0)
extensions/telegram/src/monitor.test.ts (modified, +2/-0)
extensions/telegram/src/monitor.ts (modified, +3/-1)
extensions/telegram/src/polling-session.test.ts (modified, +147/-0)
extensions/telegram/src/polling-session.ts (modified, +66/-19)

Key evidence

A. Active request can truly get stuck during outage

2026-03-27T02:56:01.773+08:00 [telegram] Polling stall detected (active getUpdates stuck for 452.69s); forcing restart. [diag inFlight=1 outcome=started startedAt=1774550909084 finishedAt=1774550909084 durationMs=30802 offset=661064690]
2026-03-27T02:56:01.784+08:00 [telegram] [diag] polling cycle finished reason=polling stall detected inFlight=0 outcome=error startedAt=1774550909084 finishedAt=1774551361781 durationMs=452697 offset=661064690 error=Network request for 'getUpdates' failed!

This confirms the initial failure is not a false positive: the request was truly stuck far beyond expected timeout behavior.

B. After rebuild, polling cycle can stall before first `getUpdates`

2026-03-27T02:56:04.114+08:00 [telegram] [diag] createPollingBot success rebuild=true hasTransport=true lastUpdateId=661064689
2026-03-27T02:56:09.222+08:00 [telegram] [diag] runPollingCycle before-race
2026-03-27T02:56:14.225+08:00 [telegram] [diag] polling startup stalled before first getUpdates; forcing restart
2026-03-27T02:56:29.234+08:00 [telegram] [diag] polling cycle finished reason=polling stall detected inFlight=0 outcome=not-started startedAt=n/a finishedAt=n/a durationMs=n/a offset=n/a

This demonstrates the critical bug: the rebuilt runner entered the cycle but never began the first getUpdates call.

C. Same behavior reproduces when only Telegram connectivity is blocked

A later test did not disconnect the whole network; it only interrupted Telegram connectivity at the router level. The same dead-start polling pattern still appeared:

2026-03-27T06:19:31.385+08:00 [telegram] [diag] createPollingBot success rebuild=true hasTransport=true lastUpdateId=661064701
2026-03-27T06:19:36.497+08:00 [telegram] [diag] runPollingCycle before-race
2026-03-27T06:19:41.501+08:00 [telegram] [diag] polling startup stalled before first getUpdates; forcing restart
2026-03-27T06:19:56.508+08:00 [telegram] [diag] polling cycle finished reason=polling stall detected inFlight=0 outcome=not-started startedAt=n/a finishedAt=n/a durationMs=n/a offset=n/a

This is useful because it narrows the issue to Telegram connectivity disruption, not just "entire network disappeared" behavior.

D. The `not-started` startup stall repeated across multiple restart cycles

06:20:20 before-race
06:20:25 polling startup stalled before first getUpdates

06:21:07 before-race
06:21:12 polling startup stalled before first getUpdates

06:22:01 before-race
06:22:06 polling startup stalled before first getUpdates

This repetition strongly suggests the issue is not random logging noise.

E. Health probe confirms recovery once Telegram connectivity is really back

[2026-03-27 06:25:19] #5335 HEALTH_FAIL elapsed=5.08s error=URLError desc=<urlopen error [Errno 54] Connection reset by peer>
[2026-03-27 06:29:10] #5336 HEALTH_OK elapsed=2.165s username=sinogello7799bot
[2026-03-27 06:29:22] #5337 HEALTH_OK elapsed=1.106s username=sinogello7799bot

The external probe shows Telegram connectivity really did recover. The main problem was how the polling runner behaved across restart cycles during the outage/recovery window.

Why current behavior is problematic

Without a dedicated startup watchdog, the system can enter repeated 90-second dead windows after restart because the next runner cycle appears alive but has not actually started long polling.

In practice this means:

Telegram connectivity may already be back
bot/transport recreation may already have succeeded
logs may show before-race
but Telegram delivery remains dead because the first getUpdates never starts

Proposed fix

Add an explicit startup watchdog for each new polling cycle:

- start a short timer when entering before-race
- if after that delay inFlightGetUpdates === 0 and lastGetUpdatesOutcome === "not-started":
  - treat the cycle as stuck before startup
  - discard transport for the next cycle
  - stop the runner and force restart

Implementation used in local testing:

POLL_STARTUP_TIMEOUT_MS = 10_000
condition: inFlightGetUpdates === 0 && lastGetUpdatesOutcome === "not-started"
action: log a dedicated startup-stall message, mark transport dirty, stop runner, force cycle restart

Additionally, a default timeout for non-getUpdates Telegram API calls can help prevent unrelated requests from hanging forever during degraded network conditions.
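A default timeout of this kind could look roughly like the sketch below. The wrapper name `withDefaultTimeout` and the 30-second default are assumptions for illustration; the issue does not specify a value or an implementation.

```typescript
// Rejects if `call` does not settle within `timeoutMs`. The signal passed to
// `call` is aborted at the deadline so the underlying request can be cancelled.
async function withDefaultTimeout<T>(
  call: (signal: AbortSignal) => Promise<T>,
  timeoutMs = 30_000, // illustrative default, not a value from the issue
): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      controller.abort();
      reject(new Error(`request timed out after ${timeoutMs}ms`));
    }, timeoutMs);
  });
  try {
    return await Promise.race([call(controller.signal), deadline]);
  } finally {
    clearTimeout(timer); // avoid a stray timer once the call has settled
  }
}
```

A caller would pass the signal on to its HTTP client, for example `withDefaultTimeout((signal) => fetch(url, { signal }))`, so the request is actually torn down rather than merely ignored.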

Test plan

Reproduction

1. run Telegram polling with diagnostics enabled
2. interrupt Telegram connectivity
   - either disconnect the network/router entirely, or
   - block Telegram connectivity specifically at the router level
3. wait long enough to trigger polling restart behavior
4. restore connectivity
5. inspect whether the next rebuilt cycle reaches a real getUpdates

Expected failure without fix

repeated before-race
no first getUpdates
outcome=not-started
recovery delayed by the full 90s stall watchdog

Expected behavior with fix

startup-stalled cycles are detected in ~10s
transport is rebuilt aggressively
repeated dead-start cycles no longer wait 90s each
once Telegram connectivity is truly back, the next successful cycle resumes quickly

Notes

This does not fully explain the lower-level root cause inside the runner/fetch stack, but it makes the failure mode observable and avoids extremely long false-alive windows after restart.

extent analysis

Fix Plan

To address the issue of Telegram long polling getting stuck in a not-started state after a restart, we will implement an explicit startup watchdog for each new polling cycle. Here are the steps:

Set a short timer (POLL_STARTUP_TIMEOUT_MS = 10_000) when entering the before-race state.
Check if after the delay, inFlightGetUpdates === 0 and lastGetUpdatesOutcome === "not-started".
If the condition is met, treat the cycle as stuck before startup, discard the transport for the next cycle, stop the runner, and force a restart.

Example code snippet:

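const POLL_STARTUP_TIMEOUT_MS = 10_000;

// ...

async function runPollingCycle() {
  // ...
  // Arm the startup watchdog when the cycle enters before-race.
  const startupTimer = setTimeout(() => {
    // Fire only if no getUpdates request has started in this cycle.
    if (inFlightGetUpdates === 0 && lastGetUpdatesOutcome === "not-started") {
      console.log("polling startup stalled before first getUpdates; forcing restart");
      // Mark the transport dirty so the next cycle rebuilds it, then stop the runner.
      transport.dirty = true;
      stopRunner();
      // Force restart
      restartPollingCycle();
    }
  }, POLL_STARTUP_TIMEOUT_MS);
  try {
    // ... run the cycle until it finishes or is stopped ...
  } finally {
    // Clear the timer so a stale watchdog cannot fire against a later cycle.
    clearTimeout(startupTimer);
  }
}

// ...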

Additionally, consider setting a default timeout for non-getUpdates Telegram API calls to prevent unrelated requests from hanging forever during degraded network conditions.

Verification

To verify the fix, follow these steps:

Run Telegram polling with diagnostics enabled.
Interrupt Telegram connectivity (either disconnect the network/router entirely or block Telegram connectivity specifically at the router level).
Wait long enough to trigger polling restart behavior.
Restore connectivity.
Inspect whether the next rebuilt cycle reaches a real getUpdates.

Expected behavior with the fix:

Startup-stalled cycles are detected in ~10s.
Transport is rebuilt aggressively.
Repeated dead-start cycles no longer wait 90s each.
Once Telegram connectivity is truly back, the next successful cycle resumes quickly.

Extra Tips

Monitor the polling cycle's behavior and adjust the POLL_STARTUP_TIMEOUT_MS value as needed to balance between detecting startup stalls and avoiding false positives.
Consider implementing additional logging and monitoring to help identify the lower-level root cause of the issue.
