openclaw - ✅(Solved) Fix Telegram polling: self-sustaining 409 getUpdates conflict from probe + health-monitor re-triggering transport [2 pull requests, 2 comments, 2 participants]

openclaw · 2026-03-18 23:02:42

openclaw/openclaw#50064 • Fetched 2026-04-08 00:59:37

Author

RIPRODUCTIONS

Participants

Hollychou924

RIPRODUCTIONS

The gateway enters a permanent 409 getUpdates conflict loop because the Telegram probe client and polling client create competing getUpdates connections. Once triggered, the loop is self-sustaining because the long-poll timeout (30s) equals the max retry interval (30s), so each retry's server-side connection overlaps with the next.

Root Cause

Dual client creation:

channel.ts startAccount() (line ~489) calls probeTelegram() → resolveTelegramTransport() — creates Client #1
monitor.ts monitorTelegramProvider() creates TelegramPollingSession → createTelegramBot() → resolveTelegramTransport() — creates Client #2

Each call to resolveTelegramTransport() in fetch.ts creates a new dispatcher with no caching. The probe's TCP connection lingers in the socket pool and can race with the polling client's getUpdates call.

Self-sustaining loop:

createTelegramRunnerOptions() sets fetch.timeout: 30 (30s long-poll)
The grammY runner's max retry interval is also 30s
When a 409 occurs, the retry fires a new getUpdates while the previous call's server-side connection is still alive (within the 30s window)
This creates a permanent overlap: each retry conflicts with the previous retry
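
The overlap argument above can be sketched as a small timing check. This is a deliberately simplified model, not Telegram's documented behavior: it assumes the server keeps a getUpdates long-poll open for up to the configured timeout, and that a retry issued while that window is still open is rejected with a 409. The function name is hypothetical.

```typescript
// Simplified model (assumption): a retry conflicts with the previous
// getUpdates call if it fires before that call's long-poll window closes.
function retryOverlapsPreviousPoll(
  longPollSec: number,
  retryDelaySec: number,
): boolean {
  // Equal values count as overlapping: the issue reports that 30s == 30s
  // leaves no gap for the old server-side connection to expire.
  return retryDelaySec <= longPollSec;
}

// Values from the issue: 30s long-poll against a 30s max retry interval.
const before = retryOverlapsPreviousPoll(30, 30);
// Values from the workaround: 10s long-poll against the same retry interval.
const after = retryOverlapsPreviousPoll(10, 30);

console.log({ before, after });
```

Under this model, the loop can only break once the retry delay exceeds the long-poll timeout, which is why shortening the timeout (or lengthening the retry interval) is enough to let the old connection expire first.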

Health-monitor re-trigger:

Even if the initial 409 self-resolves, the health-monitor (300s interval) re-probes via probeTelegram(), creating a fresh competing connection that re-triggers the loop

Fix Action

Workaround

Runtime patch reducing fetch.timeout from 30 to 10 in createTelegramRunnerOptions() resolves the self-sustaining loop. An initial 409 may still occur but recovers within one retry cycle.
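
As a rough illustration of the workaround, the sketch below builds runner options with the shorter timeout and refuses values that would recreate the 30s = 30s condition. Only the `fetch.timeout` field and the 30s/10s values come from the issue; the interface, the function name, and the guard are assumptions made for this example and are not the actual `createTelegramRunnerOptions()` code.

```typescript
// Hypothetical shape: only `fetch.timeout` (seconds) is taken from the issue.
interface RunnerOptionsSketch {
  fetch: { timeout: number };
}

// Max retry interval reported in the issue, in seconds.
const MAX_RETRY_INTERVAL_SEC = 30;

function createRunnerOptionsSketch(timeoutSec = 10): RunnerOptionsSketch {
  // Guard (an addition for illustration): keep the long-poll strictly
  // shorter than the max retry interval so retries cannot line up with it.
  if (timeoutSec <= 0 || timeoutSec >= MAX_RETRY_INTERVAL_SEC) {
    throw new RangeError(
      `fetch.timeout must be between 1 and ${MAX_RETRY_INTERVAL_SEC - 1} seconds`,
    );
  }
  return { fetch: { timeout: timeoutSec } };
}

const patched = createRunnerOptionsSketch();
console.log(patched.fetch.timeout);
```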

PR fix notes

PR #50505: fix(telegram): avoid self-sustaining polling 409 conflicts

Repository: openclaw/openclaw
Author: xinhuagu
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/50505

Description (problem / solution / changelog)

Summary

Fix a Telegram polling failure mode where getUpdates can fall into a self-sustaining 409 conflict loop.

This change tackles two parts of the issue:

skip the startup bot probe before polling begins, so polling owns getUpdates from the start
reduce grammY long-poll fetch timeout from 30s to 10s, so retry attempts do not overlap the previous server-side getUpdates window

Root cause

Issue #50064 describes a failure mode where Telegram probe and polling behavior can combine into repeated 409 Conflict: terminated by other getUpdates request errors.

The key pieces are:

startup probe runs before polling starts
polling retries can line up with the previous 30s long-poll window
once a 409 is triggered, the 30s fetch timeout can keep the overlap going

By removing the startup probe from the polling path and shortening the polling timeout, polling no longer competes with an immediate pre-start probe and retry cycles recover instead of re-triggering the same overlap.

What changed

in channel.ts, only run the startup probe for webhook accounts; polling accounts now go straight to monitorTelegramProvider(...)
in monitor.ts, reduce grammY polling fetch.timeout from 30 to 10
add a regression test that verifies polling startup skips the probe
update the runner-options test to lock the new timeout
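
The channel.ts change can be pictured roughly as follows. This is a minimal sketch under assumed names: `AccountSketch`, `shouldProbeOnStartup`, and `startAccountSketch` are hypothetical, and the probe and monitor are passed in as plain callbacks rather than the real probeTelegram() and monitorTelegramProvider() functions.

```typescript
type TelegramMode = "polling" | "webhook";

// Hypothetical account shape; the real config type is not shown in the PR notes.
interface AccountSketch {
  mode: TelegramMode;
}

// Only webhook accounts are probed at startup, so a polling account's
// first getUpdates call is never preceded by a competing probe connection.
function shouldProbeOnStartup(account: AccountSketch): boolean {
  return account.mode === "webhook";
}

async function startAccountSketch(
  account: AccountSketch,
  deps: { probe: () => Promise<void>; monitor: () => Promise<void> },
): Promise<void> {
  if (shouldProbeOnStartup(account)) {
    await deps.probe();
  }
  // Polling accounts reach the monitor without any prior probe.
  await deps.monitor();
}
```

Keeping the decision in a small pure function makes the regression test straightforward: it only has to assert that the probe callback is never invoked for a polling account.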

Why this is different from nearby Telegram PRs

This is specifically about the startup probe / polling ownership conflict and the 30s long-poll = 30s retry overlap described in #50064.

It is not the same as:

#49910: graceful stop timeout / shutdown cleanup race
#50368: startup persisted-offset confirmation timeout

Testing

pnpm test extensions/telegram/src/monitor.test.ts
added coverage in extensions/telegram/src/channel.test.ts

channel.test.ts currently trips an unrelated repo test-environment import problem in this checkout (fake-indexeddb/auto via Matrix runtime mocking), but the new assertion is narrow and the Telegram monitor suite passes locally.

Closes #50064

Changed files

extensions/telegram/src/channel.test.ts (modified, +30/-6)
extensions/telegram/src/channel.ts (modified, +17/-14)
extensions/telegram/src/monitor.test.ts (modified, +3/-0)
extensions/telegram/src/monitor.ts (modified, +3/-2)

PR #56324: fix(telegram): add per-token duplicate poller guard to prevent 409 conflicts

Repository: openclaw/openclaw
Author: Co-Messi
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/56324

Description (problem / solution / changelog)

Summary

Add a per-token active polling session registry in monitorTelegramProvider() that detects and waits for an existing session to release before starting a new one
Add a 500ms drain pause in the hot-reload channel restart handler between stopChannel and startChannel

Both changes prevent 409 Conflict errors from concurrent getUpdates calls on the same bot token.

Context

The gateway has no protection against duplicate polling sessions for the same bot token. Multiple scenarios can create overlapping pollers:

Hot-reload race: applyHotReload restarts channels via stopChannel then startChannel, but waitForGracefulStop has a 15-second timeout (POLL_STOP_GRACE_MS). If the grammY runner does not stop within that window, the new poller starts while the old one still holds a connection.
External scripts: Any process calling getUpdates on the same token (launchd agents, cron scripts, monitoring tools) creates a competing poller the gateway cannot detect.
Watchdog restart overlap: The 90-second POLL_STALL_THRESHOLD_MS triggers a polling cycle restart that can overlap with the existing session if graceful stop times out.

PR #20930 fixed the SIGUSR1 + config.patch race, but the file-watcher hot-reload path remains unguarded.

Implementation

extensions/telegram/src/monitor.ts (+68 lines) — Module-level Map<string, ActivePollerEntry> keyed by bot token. Before starting polling, monitorTelegramProvider checks the registry and waits up to 5 seconds for any existing session to signal completion via a done promise. The registry is cleaned up in the finally block.

src/gateway/server-reload-handlers.ts (+4 lines) — 500ms setTimeout between stopChannel and startChannel in the hot-reload channel restart path, giving the polling session graceful stop a buffer to fully release.
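
A per-token registry of this kind might look like the sketch below. It follows the description above (a module-level map keyed by bot token, a `done` promise, a bounded wait, cleanup in `finally`), but every name other than `ActivePollerEntry` is hypothetical and the PR's actual code may differ.

```typescript
interface ActivePollerEntry {
  done: Promise<void>; // resolves once the session has fully stopped
  release: () => void; // removes the entry and resolves `done`
}

// Module-level registry keyed by bot token.
const activePollers = new Map<string, ActivePollerEntry>();

function registerPoller(token: string): ActivePollerEntry {
  let resolveDone!: () => void;
  const done = new Promise<void>((resolve) => {
    resolveDone = resolve;
  });
  const entry: ActivePollerEntry = {
    done,
    release: () => {
      // Only delete our own entry; a newer session may have replaced it.
      if (activePollers.get(token) === entry) activePollers.delete(token);
      resolveDone();
    },
  };
  activePollers.set(token, entry);
  return entry;
}

// Resolves true if no session exists or it released in time, false on timeout.
async function waitForExistingPoller(token: string, maxWaitMs = 5000): Promise<boolean> {
  const existing = activePollers.get(token);
  if (!existing) return true;
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timedOut = new Promise<boolean>((resolve) => {
    timer = setTimeout(() => resolve(false), maxWaitMs);
  });
  const released = await Promise.race([existing.done.then(() => true), timedOut]);
  if (timer !== undefined) clearTimeout(timer);
  return released;
}

async function runPollingSession(token: string, poll: () => Promise<void>): Promise<void> {
  await waitForExistingPoller(token);
  const entry = registerPoller(token);
  try {
    await poll();
  } finally {
    entry.release(); // always clean up, even if polling throws
  }
}
```

Note that a guard like this only covers pollers inside the same process; an external script calling getUpdates on the same token remains invisible to it, as the Context section points out.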

Test plan

Existing telegram monitor tests pass (23/23)
Existing reload handler tests pass (12/12)
Verified on a 4-bot macOS setup (jarvis, atlas, forge, trader) — zero 409 errors after 10+ minutes of clean operation
Manual test: edit config while gateway is running, verify hot-reload restarts channels without 409s

Fixes #56230
Related: #20893, #43628, #50064, #49822, #33154

Changed files

extensions/telegram/src/monitor.ts (modified, +69/-0)
src/agents/pi-tools.params.ts (modified, +14/-4)
src/gateway/server-reload-handlers.ts (modified, +6/-0)

Summary

Reproduction

Start the gateway with Telegram polling enabled (channels.telegram.enabled: true)
Observe startup logs — two autoSelectFamily + dnsResultOrder log pairs appear (one from probe, one from polling)
Within 30-60s, getUpdates conflict: 409 errors begin
Errors continue indefinitely at ~30s intervals

Suggested Fixes

Fix A — Transport caching (primary): Add a cache to resolveTelegramTransport() in fetch.ts, similar to the existing probeFetcherCache in probe.ts. This ensures probe and polling share the same dispatcher/connection pool.

Fix B — Break the 30s=30s deadlock: Reduce fetch.timeout in createTelegramRunnerOptions() to a value less than the max retry interval (e.g., 10-15s). This ensures the previous server-side connection expires before the next retry fires.

Fix C — Skip probe before polling: In startAccount(), skip the probeTelegram() call when the provider is about to start polling immediately. The probe is useful for health checks but redundant right before monitorTelegramProvider().

Environment

OpenClaw latest (0537f3e59)
Docker on Windows 10 (gateway runs in Linux container)
Single bot token, single container, no webhook
Telegram plugin with polling mode

🤖 Generated with Claude Code

extent analysis

Fix Plan

To resolve the getUpdates conflict loop, we will implement Fix A — Transport caching. This involves adding a cache to resolveTelegramTransport() in fetch.ts to ensure the probe and polling clients share the same dispatcher/connection pool.

Code Changes

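// fetch.ts
// Cache the in-flight promise rather than the resolved transport, so that
// concurrent callers (probe and polling) cannot both miss the cache and
// each create their own dispatcher.
const transportCache = new Map();

function resolveTelegramTransport() {
  // Single shared entry; a real implementation would likely key on the
  // options that affect the dispatcher.
  const cacheKey = 'telegram-transport';
  let transport = transportCache.get(cacheKey);
  if (!transport) {
    // createTransport() stands in for the existing transport-creation logic.
    transport = Promise.resolve(createTransport());
    // Evict a failed attempt so the next call can retry.
    transport.catch(() => transportCache.delete(cacheKey));
    transportCache.set(cacheKey, transport);
  }
  return transport;
}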

Verification

After applying the fix, restart the gateway and verify that the getUpdates conflict: 409 errors no longer occur. Monitor the logs for the presence of a single autoSelectFamily and dnsResultOrder log pair, indicating that only one client is creating a connection.
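
To make the log check repeatable, a small helper along these lines could count how many transport initializations show up in captured startup logs. The exact log format is not given in the issue, so this sketch simply counts lines that mention `autoSelectFamily`; the function name and the sample lines are invented placeholders, not real gateway output.

```typescript
// Counts log lines that mention autoSelectFamily. One such line is assumed
// to correspond to one transport being created.
function countTransportInits(logText: string): number {
  return logText
    .split(/\r?\n/)
    .filter((line) => line.includes("autoSelectFamily")).length;
}

// Placeholder log text for illustration only.
const sampleLogs = [
  "telegram: autoSelectFamily=true dnsResultOrder=ipv4first",
  "telegram: polling started",
].join("\n");

console.log(countTransportInits(sampleLogs));
```

A count of one after the fix, compared with two before it, would match the expectation described above.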

Extra Tips

Consider implementing Fix B — Break the 30s=30s deadlock as an additional precaution to prevent similar issues in the future.
Review the probeTelegram() call in startAccount() and consider skipping it when the provider is about to start polling immediately, as suggested in Fix C — Skip probe before polling.
