openclaw - 💡(How to fix) Fix Telegram delivery reliability: polling stalls can lead to silent outbound message loss [3 comments, 3 participants]

openclaw · 2026-03-18 22:08:32

GitHub stats

openclaw/openclaw#50040 • Fetched 2026-04-08 00:59:56

Author: dmitriiforpost-commits

Participants: dmitriiforpost-commits, Hollychou924, p3nchan

Timeline (top): commented ×3, cross-referenced ×1

On OpenClaw 2026.3.12, Telegram Bot API connectivity may remain generally healthy while the gateway's Telegram polling loop intermittently stalls/restarts. During those recovery windows, outbound sendMessage delivery can fail and the effective recovery path is not strong enough at runtime, leading to silent or operator-visible message loss.

This appears to be a gap between:

- polling restart/recovery behavior, and
- outbound delivery recovery behavior for non-idempotent Telegram sends.

Optional note

A local prototype patch implementing the runtime worker + failure classes + stateful delivery entries significantly improves the recovery model, if maintainers want a more concrete direction for upstreaming.

Telegram delivery reliability: polling stalls can lead to silent outbound message loss

Observed behavior

In production use, logs repeatedly showed patterns like:

- Polling stall detected
- sendChatAction failed
- sendMessage failed: Network request for 'sendMessage' failed!
- polling runner stop/restart cycles

At the same time:

- direct short HTTPS/Bot API probes to api.telegram.org succeeded,
- DNS and IPv4 routing looked healthy,
- the failure pattern was intermittent rather than a full Telegram outage.

This suggests the issue is not simply "Telegram unreachable", but rather that the long-poll / recovery path can degrade and outbound delivery is not fully protected when that happens.

Why this is harmful

A message can be prepared by the assistant but still fail to reach Telegram during a polling/recovery disruption. From the operator perspective, this looks like silent message loss or partial reply loss.

Suspected design gap

There is already a disk-backed outbound delivery queue and startup recovery, but runtime delivery recovery appears insufficient for this failure mode.

The practical gap seems to be:

- polling stalls or restarts,
- outbound Telegram send fails during that window,
- recovery is not strong enough as a continuous runtime mechanism,
- result: delivery may be stuck, dropped, or ambiguous from the operator perspective.

Proposed direction

A robust fix would combine:

1. Runtime outbound delivery recovery worker
   - periodically scan pending deliveries
   - retry only safe-to-retry entries
   - run without requiring gateway restart
   - trigger an immediate recovery pass after Telegram polling restart/recovery
2. Delivery failure classification
   - safe_to_retry
   - ambiguous
   - permanent
3. Stateful delivery entries
   - pending
   - retryable
   - ambiguous
   - delivered
   - failed

This would reduce silent message loss while avoiding blind retries for ambiguous non-idempotent send outcomes.
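
As a rough illustration of how the failure classes could map onto entry states, here is a minimal Python sketch. The enum names mirror the lists above; the `classify_send_failure` function, its inputs (`request_was_sent`, `http_status`), and the specific rules are assumptions made for illustration, not OpenClaw's actual implementation.

```python
from enum import Enum
from typing import Optional


class FailureClass(Enum):
    SAFE_TO_RETRY = "safe_to_retry"
    AMBIGUOUS = "ambiguous"
    PERMANENT = "permanent"


class DeliveryState(Enum):
    PENDING = "pending"
    RETRYABLE = "retryable"
    AMBIGUOUS = "ambiguous"
    DELIVERED = "delivered"
    FAILED = "failed"


# Assumed mapping: which state an entry moves to after a classified failure.
STATE_FOR_FAILURE = {
    FailureClass.SAFE_TO_RETRY: DeliveryState.RETRYABLE,
    FailureClass.AMBIGUOUS: DeliveryState.AMBIGUOUS,
    FailureClass.PERMANENT: DeliveryState.FAILED,
}


def classify_send_failure(request_was_sent: bool,
                          http_status: Optional[int]) -> FailureClass:
    """Hypothetical classifier; the rules below are illustrative assumptions."""
    if not request_was_sent:
        # Failed before the request left the process: nothing reached Telegram.
        return FailureClass.SAFE_TO_RETRY
    if http_status is None:
        # Sent, but no response arrived: the message may or may not exist.
        return FailureClass.AMBIGUOUS
    if http_status == 429:
        # Rate limited: the request was rejected, so a later resend is safe.
        return FailureClass.SAFE_TO_RETRY
    if http_status >= 500:
        # Server-side error: conservatively treated as an unknown outcome.
        return FailureClass.AMBIGUOUS
    return FailureClass.PERMANENT
```

The key design choice is that "sent but no response" is never folded into the retryable class, since resending a non-idempotent sendMessage could produce a duplicate.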

Expected behavior

When Telegram polling experiences a transient stall/restart:

- outbound deliveries should not be silently lost,
- safe transient failures should be retried automatically,
- ambiguous failures should be preserved/held rather than blindly retried,
- gateway restart should not be required to recover eligible deliveries.
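
The expected behavior can be sketched as a single recovery pass over the outbox. This is a simplified sketch under stated assumptions: entries are plain dicts with a `state` key, and `send` is a hypothetical callable that returns True only on confirmed delivery. A fuller version would classify each failure instead of leaving the entry retryable.

```python
from typing import Callable, Dict, List


def recovery_pass(entries: List[Dict], send: Callable[[Dict], bool]) -> Dict[str, int]:
    """Run one pass over the outbox and report what happened."""
    summary = {"retried": 0, "delivered": 0, "held": 0}
    for entry in entries:
        if entry["state"] == "ambiguous":
            # Outcome unknown: keep the entry for review, never resend blindly.
            summary["held"] += 1
            continue
        if entry["state"] not in ("pending", "retryable"):
            continue  # delivered or failed entries need no further work
        summary["retried"] += 1
        if send(entry):
            entry["state"] = "delivered"
            summary["delivered"] += 1
        else:
            entry["state"] = "retryable"
    return summary
```

Because nothing is removed from `entries`, a failed or held delivery stays visible to the operator rather than disappearing.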

Version notes

- Reproduced on: 2026.3.12
- A local comparison against 2026.3.13 did not reveal an obvious upstream runtime recovery worker / failure-classification / stateful-outbox implementation for this specific Telegram delivery gap.

extent analysis

Fix Plan

To address the issue of silent message loss due to polling stalls and insufficient runtime delivery recovery, we will implement a runtime outbound delivery recovery worker. This worker will periodically scan pending deliveries, retry safe-to-retry entries, and run without requiring a gateway restart.

Step-by-Step Solution:

1. **Implement Runtime Outbound Delivery Recovery Worker**:
   - Create a worker that periodically scans the pending deliveries queue.
   - Use a scheduling library (e.g., `schedule` in Python) to run the worker at regular intervals.
   - Example Python code snippet:
     ```python
     import time

     import schedule  # third-party package: pip install schedule


     def recovery_worker():
         # Scan pending deliveries and retry only the safe-to-retry entries.
         # get_pending_deliveries() is a placeholder for the queue lookup.
         for delivery in get_pending_deliveries():
             if delivery.get('failure_class') == 'safe_to_retry':
                 retry_delivery(delivery)


     schedule.every(1).minutes.do(recovery_worker)  # Run every 1 minute

     while True:
         schedule.run_pending()
         time.sleep(1)
     ```

2. **Implement Delivery Failure Classification**:
   - Introduce failure classes: `safe_to_retry`, `ambiguous`, and `permanent`.
   - Update the delivery entry with the corresponding failure class when a failure occurs.
   - Example Python code snippet:
     ```python
     def classify_failure(delivery, failure_reason):
         # Record the failure class on the delivery entry.
         if failure_reason == 'network_error':
             delivery['failure_class'] = 'safe_to_retry'
         elif failure_reason == 'ambiguous_error':
             delivery['failure_class'] = 'ambiguous'
         else:
             delivery['failure_class'] = 'permanent'
     ```

3. **Implement Stateful Delivery Entries**:
   - Introduce states: `pending`, `retryable`, `ambiguous`, `delivered`, and `failed`.
   - Update the delivery entry state based on the failure class and retry outcome.
   - Example Python code snippet:
     ```python
     def update_delivery_state(delivery, new_state):
         delivery['state'] = new_state


     def retry_delivery(delivery):
         # send_delivery() is a placeholder that returns True when the resend succeeds.
         retry_successful = send_delivery(delivery)
         if retry_successful:
             update_delivery_state(delivery, 'delivered')
         else:
             update_delivery_state(delivery, 'retryable')
     ```
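
A fixed-interval schedule does not by itself cover the proposed "immediate recovery pass after Telegram polling restart/recovery". One possible way to add that is a worker thread that sleeps on an event, sketched below with only the standard library. The class name, the `notify_polling_recovered` hook, and the `run_pass` callback are hypothetical names introduced for illustration.

```python
import threading
from typing import Callable


class RecoveryWorker:
    """Runs `run_pass` every `interval` seconds, or sooner when woken."""

    def __init__(self, run_pass: Callable[[], None], interval: float = 60.0):
        self._run_pass = run_pass
        self._interval = interval
        self._wake = threading.Event()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._loop, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def notify_polling_recovered(self) -> None:
        # Hypothetical hook: call this when the polling runner has restarted.
        self._wake.set()

    def stop(self) -> None:
        self._stop.set()
        self._wake.set()  # unblock the wait so the loop can exit promptly
        self._thread.join()

    def _loop(self) -> None:
        while not self._stop.is_set():
            # Returns early if notify_polling_recovered() or stop() was called.
            self._wake.wait(timeout=self._interval)
            self._wake.clear()
            if self._stop.is_set():
                break
            self._run_pass()
```

The worker runs inside the gateway process, so eligible deliveries can be recovered without a gateway restart. A notification that arrives while a pass is already running may be coalesced into the next wake-up, which is usually acceptable for a periodic scanner.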


### Verification
To verify that the fix worked, monitor the pending deliveries queue and the delivery failure rates. The number of silent message losses should decrease, and the recovery worker should retry safe-to-retry deliveries automatically.
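
One simple way to monitor the queue is to count entries per state and flag the ones that need a human. The sketch below assumes entries are dicts with a `state` key; the `needs_attention` rule and its threshold are illustrative assumptions.

```python
from collections import Counter
from typing import Dict, Iterable


def outbox_summary(entries: Iterable[Dict]) -> Counter:
    """Count delivery entries per state."""
    return Counter(entry["state"] for entry in entries)


def needs_attention(summary: Counter, max_ambiguous: int = 0) -> bool:
    # Hypothetical alert rule: ambiguous or failed entries warrant review.
    return summary["ambiguous"] > max_ambiguous or summary["failed"] > 0
```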

### Extra Tips
- Ensure the recovery worker is properly configured and running at regular intervals.
- Monitor the delivery failure rates and adjust the failure classification and retry logic as needed.
- Consider implementing a maximum retry limit to prevent infinite retries.
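
A retry limit is commonly paired with exponential backoff so a stalled connection is not hammered. The following is a minimal sketch; the `attempts` field, the default limit of 5, and the delay parameters are assumptions chosen for illustration.

```python
import random


def next_retry_delay(attempt: int, base: float = 2.0, cap: float = 300.0) -> float:
    """Exponential backoff with full jitter; `attempt` starts at 1."""
    return random.uniform(0, min(cap, base * 2 ** (attempt - 1)))


def should_retry(entry: dict, max_attempts: int = 5) -> bool:
    # Only retryable entries under the attempt limit are eligible.
    return entry["state"] == "retryable" and entry.get("attempts", 0) < max_attempts
```

An entry that exhausts its attempts would then be moved to `failed` so it stays visible instead of retrying forever.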
