openclaw - Feature: Telegram polling watchdog - auto-restart connection on stall without killing gateway

GitHub stats
openclaw/openclaw#59332 · Fetched 2026-04-08 02:25:51
Comments: 0 · Participants: 1 · Timeline: 0 · Reactions: 0


Description

When Telegram long polling stalls (due to network issues, DNS failures, or upstream Telegram API instability), the entire gateway can become unresponsive. Currently there is no mechanism to detect a stalled Telegram connection and restart just the polling loop — the only recovery is a full gateway restart, which itself triggers the update_id reprocessing bug (#59331).

Problem

  • Telegram polling hangs silently — no inbound messages are processed
  • The gateway process itself stays alive (health check passes), but the bot appears dead
  • The only recovery is openclaw gateway restart, which:
    • Interrupts all active sessions across all channels (Discord, WebChat, etc.)
    • Triggers message reprocessing (see #59331)
    • Loses in-flight completion state
  • Observed as a contributing factor in crashes on 3/22, 3/26, 3/31, 4/1

Proposed Solution

Add a configurable watchdog for the Telegram polling loop:

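{
  channels: {
    telegram: {
      watchdog: {
        enabled: true,
        stallTimeoutSeconds: 90, // restart polling after this long with no poll response
        maxRestarts: 5,          // consecutive restarts allowed before giving up
        cooldownSeconds: 30,     // pause between restart attempts
      }
    }
  }
}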

Behavior:

  1. Track the timestamp of the last successful poll response (even if it returned zero updates)
  2. If stallTimeoutSeconds elapses with no poll response, tear down and restart the grammY polling runner
  3. Log every watchdog recovery event at warn level
  4. After maxRestarts consecutive restarts without a successful stable connection, log at error and stop retrying
  5. Successful stable polling (e.g., 60s of normal operation) resets the restart counter
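
The five behavior steps above can be sketched as a small state machine. This is a minimal TypeScript sketch under stated assumptions, not OpenClaw's or grammY's actual code: PollingWatchdog, WatchdogOptions, recordPoll, check, and the restartPolling callback are hypothetical names; stableSeconds is an assumed extra option based on the "60s" example; and cooldownSeconds is assumed to mean the minimum gap between two restarts. Time is passed in explicitly (in milliseconds) so the logic can be exercised without real timers.

```typescript
// Hypothetical sketch: these names are illustrative, not OpenClaw or grammY APIs.
interface WatchdogOptions {
  stallTimeoutSeconds: number;
  maxRestarts: number;
  cooldownSeconds: number;
  stableSeconds: number; // assumed option, taken from the issue's "60s" example
}

type WatchdogAction = "ok" | "cooldown" | "restart" | "gave-up";
type LogFn = (level: "warn" | "error", message: string) => void;

class PollingWatchdog {
  private opts: WatchdogOptions;
  private restartPolling: () => void;
  private log: LogFn;
  private lastPollAt: number;
  private lastRestartAt: number | null = null;
  private stableSince: number | null = null;
  private restarts = 0;
  private gaveUp = false;

  constructor(opts: WatchdogOptions, restartPolling: () => void, log: LogFn, nowMs: number) {
    this.opts = opts;
    this.restartPolling = restartPolling;
    this.log = log;
    this.lastPollAt = nowMs;
  }

  // Step 1: call after every successful poll response, even one with zero updates.
  recordPoll(nowMs: number): void {
    this.lastPollAt = nowMs;
    if (this.stableSince === null) this.stableSince = nowMs;
    // Step 5: a long enough run of normal polling resets the restart counter.
    if (nowMs - this.stableSince >= this.opts.stableSeconds * 1000) this.restarts = 0;
  }

  // Call periodically; returns what the watchdog decided at time nowMs.
  check(nowMs: number): WatchdogAction {
    if (this.gaveUp) return "gave-up";
    if (nowMs - this.lastPollAt < this.opts.stallTimeoutSeconds * 1000) return "ok";

    // Stalled. Assumed cooldown semantics: never restart twice within cooldownSeconds.
    if (this.lastRestartAt !== null &&
        nowMs - this.lastRestartAt < this.opts.cooldownSeconds * 1000) {
      return "cooldown";
    }
    // Step 4: too many consecutive restarts without stable polling, so stop retrying.
    if (this.restarts >= this.opts.maxRestarts) {
      this.gaveUp = true;
      this.log("error", `polling still stalled after ${this.restarts} restarts; giving up`);
      return "gave-up";
    }
    // Steps 2 and 3: log at warn level, then restart only the polling loop.
    this.restarts += 1;
    this.lastRestartAt = nowMs;
    this.lastPollAt = nowMs; // give the restarted runner a full stall window
    this.stableSince = null; // stability must be re-established after a restart
    this.log("warn", `polling stalled; restart ${this.restarts}/${this.opts.maxRestarts}`);
    this.restartPolling();
    return "restart";
  }
}
```

In a real integration, recordPoll(Date.now()) would run after each getUpdates response and check(Date.now()) on a timer. How the grammY polling runner is actually torn down and recreated is left to the restartPolling callback, since the issue does not specify it.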

Why This Matters

  • Telegram is the only channel where a transport-level stall can silently kill the bot with no user-visible error
  • Discord has its own reconnect logic via discord.js
  • A polling watchdog is standard practice for long-polling consumers
  • This would eliminate the most common cause of "bot went dark" reports

Environment

  • OpenClaw 2026.4.1 (da64a97)
  • macOS (arm64), Node v22.22.1
  • Telegram: long polling mode (default)

extent analysis

TL;DR

Implement a configurable watchdog for the Telegram polling loop to detect and recover from stalled connections.

Guidance

  • Introduce a watchdog mechanism to track the last successful poll response and restart the polling loop if a stall is detected.
  • Configure the watchdog with a suitable stallTimeoutSeconds value (e.g., 90 seconds) and maxRestarts limit (e.g., 5) so that stalls are recovered quickly without restarting in an endless loop.
  • Log each watchdog recovery at warn level, and log at error level once maxRestarts consecutive restarts have failed, so that the bot's health can be monitored.
  • Consider implementing a cooldown period (cooldownSeconds) to prevent rapid restarts and allow the system to stabilize.

Example

{
  channels: {
    telegram: {
      watchdog: {
        enabled: true,
        stallTimeoutSeconds: 90,
        maxRestarts: 5,
        cooldownSeconds: 30,
      }
    }
  }
}
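
One way such a config block might be typed and validated is sketched below in TypeScript. The names TelegramWatchdogConfig, EXAMPLE_DEFAULTS, and resolveWatchdogConfig are hypothetical and do not come from OpenClaw. The issue does not state defaults, so the fallback values here simply reuse the example values above, and the validation rules are assumptions.

```typescript
// Hypothetical sketch: type and function names are illustrative only.
interface TelegramWatchdogConfig {
  enabled: boolean;
  stallTimeoutSeconds: number;
  maxRestarts: number;
  cooldownSeconds: number;
}

// Assumption: fallback values mirror the example config; the issue states no defaults.
const EXAMPLE_DEFAULTS: TelegramWatchdogConfig = {
  enabled: true,
  stallTimeoutSeconds: 90,
  maxRestarts: 5,
  cooldownSeconds: 30,
};

// Merge user-supplied fields over the fallbacks and reject unusable values.
function resolveWatchdogConfig(
  input: Partial<TelegramWatchdogConfig> = {},
): TelegramWatchdogConfig {
  const cfg: TelegramWatchdogConfig = { ...EXAMPLE_DEFAULTS, ...input };
  if (!Number.isFinite(cfg.stallTimeoutSeconds) || cfg.stallTimeoutSeconds <= 0) {
    throw new RangeError("stallTimeoutSeconds must be a positive number");
  }
  if (!Number.isFinite(cfg.cooldownSeconds) || cfg.cooldownSeconds < 0) {
    throw new RangeError("cooldownSeconds must be zero or a positive number");
  }
  if (!Number.isInteger(cfg.maxRestarts) || cfg.maxRestarts < 0) {
    throw new RangeError("maxRestarts must be a non-negative integer");
  }
  return cfg;
}
```

Validating at load time means a mistyped value such as a zero stall timeout fails loudly at startup instead of producing a watchdog that restarts polling continuously.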

Notes

The proposed solution assumes that the watchdog mechanism will be integrated into the existing OpenClaw gateway, and its behavior will be as described. The choice of stallTimeoutSeconds and maxRestarts values may require tuning based on the specific environment and usage patterns.

Recommendation

Implement the proposed watchdog to detect and recover from stalled Telegram connections. It does not remove the underlying causes of a stall (network issues, DNS failures, or upstream Telegram API instability), but it provides a far more targeted recovery mechanism than a full gateway restart.
