hermes - ✅ (Solved) fix(gateway): _keep_typing task stuck indefinitely when post-delivery callback hangs [3 pull requests, 1 participant]

GitHub stats

NousResearch/hermes-agent#24971 · Fetched 2026-05-14 03:50:11
Comments: 0 · Participants: 1 · Timeline: 6 · Reactions: 0
Timeline (top): cross-referenced ×3, labeled ×3

Root Cause

In gateway/platforms/base.py, the finally block of _process_message_background executes the post-delivery callback before calling _stop_typing_task():

# current order (broken)
finally:
    ...
    if callable(_post_cb):
        try:
            _post_result = _post_cb()
            if inspect.isawaitable(_post_result):
                await _post_result   # ← can hang
        except Exception:
            pass
    # Stop typing indicator       ← never reached if callback hangs
    await _stop_typing_task()

If the callback coroutine hangs — for example, _send_goal_status_notice() making an HTTP call over a Telegram socket that entered CLOSE-WAIT state after a prior network error (httpx.ReadError) — _stop_typing_task() is never called. The _keep_typing task keeps refreshing sendChatAction(typing) every 2 seconds indefinitely.
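
The ordering problem described above can be reproduced without any gateway code. The following is a minimal sketch using only asyncio; `keep_typing`, `hanging_callback`, and `process_message_broken` are hypothetical stand-ins, not the functions in gateway/platforms/base.py, and the short sleep intervals replace the 2-second refresh only to keep the demo fast.

```python
import asyncio

async def keep_typing(ticks):
    # Stand-in for the typing refresh loop: one entry per refresh.
    while True:
        ticks.append("typing")
        await asyncio.sleep(0.01)

async def hanging_callback():
    # Stand-in for a callback stuck on a dead socket: never completes.
    await asyncio.Event().wait()

async def process_message_broken(typing_task):
    try:
        await asyncio.sleep(0)          # pretend the reply was delivered
    finally:
        await hanging_callback()        # blocks here
        typing_task.cancel()            # never reached while the callback hangs

async def demo_broken():
    ticks = []
    typing_task = asyncio.create_task(keep_typing(ticks))
    worker = asyncio.create_task(process_message_broken(typing_task))
    await asyncio.sleep(0.1)            # observe for a short window
    typing_still_running = not typing_task.done()
    # Tear the demo down explicitly so no task outlives it.
    worker.cancel()
    typing_task.cancel()
    await asyncio.gather(worker, typing_task, return_exceptions=True)
    return typing_still_running, len(ticks)

if __name__ == "__main__":
    print(asyncio.run(demo_broken()))
```

In this sketch the typing task is still alive at the end of the observation window and has kept refreshing the whole time, which mirrors the symptom reported in the issue.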

Fix Action

Two changes to gateway/platforms/base.py:

  1. Move _stop_typing_task() before the post-delivery callback so it always runs immediately after the response is sent.
  2. Add asyncio.wait_for(..., timeout=30.0) to the callback await so a slow/stuck callback cannot block the cleanup path.

# fixed order
finally:
    ...
    # Stop typing first — must not be delayed by callback work
    await _stop_typing_task()
    if callable(_post_cb):
        try:
            _post_result = _post_cb()
            if inspect.isawaitable(_post_result):
                await asyncio.wait_for(_post_result, timeout=30.0)
        except (asyncio.TimeoutError, Exception):
            pass

The same pattern exists in the in-band pending-message drain path (line ~3209) where _stop_typing_task() is already called before spawning the drain task — so the fix is consistent with the existing approach there.
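
To see the effect of the reordering and the timeout together, here is a small self-contained sketch. `finish_message` is a hypothetical helper that mirrors the shape of the fixed finally block; it is not the project's function, and the 0.05-second timeout in the demo replaces the 30-second value only to keep the run short.

```python
import asyncio
import inspect

async def finish_message(stop_typing, post_cb, timeout=30.0):
    # Hypothetical helper with the same ordering as the fixed block.
    await stop_typing()                 # typing cleanup runs first
    if callable(post_cb):
        try:
            result = post_cb()
            if inspect.isawaitable(result):
                await asyncio.wait_for(result, timeout=timeout)
        except Exception:               # also covers asyncio.TimeoutError
            pass

async def demo_fixed():
    events = []

    async def stop_typing():
        events.append("typing stopped")

    async def stuck_callback():
        events.append("callback started")
        await asyncio.Event().wait()    # simulates a hung network call

    await finish_message(stop_typing, stuck_callback, timeout=0.05)
    events.append("cleanup finished")
    return events

if __name__ == "__main__":
    print(asyncio.run(demo_fixed()))
```

The recorded order is "typing stopped", "callback started", "cleanup finished": the stuck callback is still attempted, but it can no longer delay the typing cleanup or block the rest of the cleanup path beyond the timeout.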

PR fix notes

PR #24983: fix(gateway): stop typing task before post-delivery callback to prevent indefinite typing indicator

Description (problem / solution / changelog)

Summary

Fixes the _keep_typing background task running indefinitely after the agent has already replied, leaving the bot stuck in the "typing" indicator state.

Root Cause

In gateway/platforms/base.py, _process_message_background's finally block executes the post-delivery callback before calling _stop_typing_task():

# previous order (broken)
finally:
    if callable(_post_cb):
        _post_result = _post_cb()
        if inspect.isawaitable(_post_result):
            await _post_result    # ← can hang
    await _stop_typing_task()     # ← never reached if callback hangs

If the callback coroutine hangs — for example, _send_goal_status_notice() making an HTTP call over a Telegram socket in CLOSE-WAIT state — _stop_typing_task() is never called. The _keep_typing task keeps refreshing sendChatAction(typing) every 2 seconds indefinitely.

Fix

Two changes in gateway/platforms/base.py:

  1. Reorder: Move await _stop_typing_task() before the post-delivery callback so typing cleanup always runs immediately after the response is sent.
  2. Timeout: Wrap the callback await with asyncio.wait_for(..., timeout=30.0) so a stuck callback cannot block cleanup indefinitely.

# fixed order
finally:
    await _stop_typing_task()           # always runs first
    if callable(_post_cb):
        try:
            _post_result = _post_cb()
            if inspect.isawaitable(_post_result):
                await asyncio.wait_for(     # bounded
                    _post_result, timeout=30.0)
        except (asyncio.TimeoutError, Exception):
            pass

Regression Coverage

  • test_typing_stopped_before_callback — verifies the function completes (doesn't hang) when the callback is stuck, and the callback is still attempted
  • test_fast_callback_still_runs — verifies a fast callback executes normally without being affected by the reordering
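
The test bodies are not shown in this report. As a rough illustration of what the two checks could look like, the sketch below uses plain asyncio; `run_cleanup` is a hypothetical stand-in for the fixed cleanup path, and the check functions are assumptions about the tests' intent rather than the code in tests/gateway/test_keep_typing_timeout.py.

```python
import asyncio
import inspect

async def run_cleanup(typing_task, post_cb, timeout):
    # Hypothetical stand-in for the fixed finally block.
    typing_task.cancel()
    await asyncio.gather(typing_task, return_exceptions=True)
    if callable(post_cb):
        try:
            result = post_cb()
            if inspect.isawaitable(result):
                await asyncio.wait_for(result, timeout=timeout)
        except Exception:
            pass

async def check_typing_stopped_before_callback():
    typing_task = asyncio.create_task(asyncio.sleep(3600))
    seen = {}

    async def stuck_cb():
        # Record whether typing had already stopped when the callback began.
        seen["typing_done_at_callback"] = typing_task.done()
        await asyncio.Event().wait()

    # The outer wait_for fails the check if cleanup itself hangs.
    await asyncio.wait_for(
        run_cleanup(typing_task, stuck_cb, timeout=0.05), timeout=1.0)
    return seen

async def check_fast_callback_still_runs():
    typing_task = asyncio.create_task(asyncio.sleep(3600))
    calls = []
    await run_cleanup(typing_task, lambda: calls.append("ran"), timeout=0.05)
    return calls
```

The first check covers both properties named above: cleanup completes despite the stuck callback, and the callback was attempted. The second confirms a quick synchronous callback is unaffected by the reordering.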

Testing

tests/gateway/test_keep_typing_timeout.py::TestPostDeliveryCallbackTimeout::test_typing_stopped_before_callback PASSED
tests/gateway/test_keep_typing_timeout.py::TestPostDeliveryCallbackTimeout::test_fast_callback_still_runs PASSED
tests/gateway/test_keep_typing_timeout.py::TestKeepTypingTimeoutPerTick (4 existing tests) PASSED
tests/gateway/test_ephemeral_reply.py (10 tests) PASSED
tests/gateway/test_base_topic_sessions.py (6 tests) PASSED

Fixes #24971 (fix(gateway): _keep_typing task stuck indefinitely when post-delivery callback hangs)

Changed files

  • gateway/platforms/base.py (modified, +5/-4)
  • tests/gateway/test_keep_typing_timeout.py (modified, +97/-0)

PR #25003: fix(gateway): stop typing indicator before post-delivery callback to prevent indefinite typing

Description (problem / solution / changelog)

Summary

Move _stop_typing_task() and stop_typing() calls before the post-delivery callback in _process_message_background's finally block. If the callback hangs (e.g. dead Telegram socket in CLOSE-WAIT state), the typing indicator task would previously run indefinitely since _stop_typing_task() was never reached.

Also wrap the awaitable callback result with asyncio.wait_for(timeout=30) so a slow/stuck callback cannot block the cleanup path indefinitely.

Changes

gateway/platforms/base.py

  • Reorder the finally block: call _stop_typing_task() + stop_typing() before executing the post-delivery callback
  • Add asyncio.wait_for(_post_result, timeout=30.0) to bound awaitable callbacks
  • Catch asyncio.TimeoutError alongside generic Exception
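
Two details of asyncio.wait_for are worth keeping in mind when reading these changes. On timeout it cancels the wrapped awaitable before raising, and asyncio.TimeoutError is a subclass of Exception, so listing it next to Exception documents intent rather than changing what is caught. The sketch below demonstrates both; `slow_callback` is a hypothetical callback, not one from the gateway.

```python
import asyncio

async def demo_timeout_cancels():
    state = {"cancelled": False, "timed_out": False}

    async def slow_callback():
        try:
            await asyncio.sleep(3600)       # simulates a stuck network call
        except asyncio.CancelledError:
            state["cancelled"] = True       # the callback can clean up here
            raise                           # re-raise so cancellation completes

    try:
        await asyncio.wait_for(slow_callback(), timeout=0.05)
    except asyncio.TimeoutError:
        state["timed_out"] = True
    return state

if __name__ == "__main__":
    print(asyncio.run(demo_timeout_cancels()))
    print(issubclass(asyncio.TimeoutError, Exception))
```

One caveat: wait_for waits for the cancellation to finish, so a callback that suppresses CancelledError could still hold up cleanup for longer than the timeout.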

Root Cause

The post-delivery callback (e.g. _send_goal_status_notice()) can hang indefinitely when making HTTP calls over a dead Telegram socket that entered CLOSE-WAIT state after a network error. Since _stop_typing_task() was placed after the callback await, it was never reached, causing _keep_typing to refresh sendChatAction(typing) every 2 seconds forever.

Testing

  • All 51 typing-related tests pass
  • All 23 post_delivery/stop_typing tests pass
  • Consistent with existing pattern at line ~3310 where _stop_typing_task() is already called before the drain task

Fixes #24971

Changed files

  • gateway/platforms/base.py (modified, +10/-8)

PR #25210: fix(gateway): stop typing refresh before first outbound delivery

Description (problem / solution / changelog)

What

Stops the _keep_typing() refresh loop immediately before the first outbound delivery (TTS / text / images / media / local files), rather than waiting for the final cleanup block of _process_message_background().

Why

Telegram's Bot API has no explicit "stop typing" call — the indicator clears only when the bot stops issuing sendChatAction refreshes. Before this change, _keep_typing() was only cancelled in the final cleanup block, which runs after outbound delivery (including the Telegram send_message round-trip).
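
Because the indicator only clears once refreshes stop, the lifetime of the refresh task is the only control the gateway has. A minimal sketch of such a loop follows, assuming a hypothetical `send_chat_action` coroutine in place of the real Telegram call; the actual _keep_typing implementation is not shown in this report.

```python
import asyncio

async def keep_typing(send_chat_action, interval=2.0):
    # Hypothetical refresh loop: re-send the typing action until cancelled.
    while True:
        await send_chat_action("typing")
        await asyncio.sleep(interval)

async def demo_refresh_loop():
    sent = []

    async def fake_send_chat_action(action):
        sent.append(action)                 # stands in for the network call

    task = asyncio.create_task(
        keep_typing(fake_send_chat_action, interval=0.01))
    await asyncio.sleep(0.05)               # let a few refreshes go out
    task.cancel()                           # the only way to "stop typing"
    await asyncio.gather(task, return_exceptions=True)
    count_at_stop = len(sent)
    await asyncio.sleep(0.03)               # no further refreshes expected
    return count_at_stop, len(sent)

if __name__ == "__main__":
    print(asyncio.run(demo_refresh_loop()))
```

The two counts returned by the demo are equal, showing that no refresh is issued once the task has been cancelled. Any refresh sent before cancellation, including one sent during delivery, is what the change in this PR aims to avoid.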

In practice this means the bot can issue one or more typing refreshes after the user has already seen the final reply, producing a "still typing" indicator that some clients do not reliably clear.

Lifecycle ownership: typing belongs to the "agent is still preparing the answer" phase. Once the response is ready and delivery starts, refreshing typing is wrong.

How

  • New idempotent helper _stop_typing_before_delivery() with a nonlocal typing_stopped_for_delivery coalescing flag.
  • Called immediately before the first delivery branch, after response preparation and optional TTS generation.
  • The existing finally-block _stop_typing_task() is retained as a safety net for empty-response and exception paths.
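
The three points above can be sketched as follows. This is an illustration under assumptions, not the PR's code: `process_message`, `deliveries`, and the injected `stop_typing_task` are hypothetical, and only the helper name and the nonlocal flag name are taken from the description.

```python
import asyncio

async def process_message(deliveries, stop_typing_task):
    typing_stopped_for_delivery = False

    async def _stop_typing_before_delivery():
        # Coalescing flag: only the first call does any work.
        nonlocal typing_stopped_for_delivery
        if typing_stopped_for_delivery:
            return
        typing_stopped_for_delivery = True
        await stop_typing_task()

    try:
        for deliver in deliveries:
            await _stop_typing_before_delivery()   # no-op after the first call
            await deliver()
    finally:
        # Safety net for empty-response and exception paths.
        await stop_typing_task()

async def demo_stop_before_delivery():
    events = []

    async def stop_typing_task():
        events.append("stop typing")

    async def send_text():
        events.append("send text")

    async def send_image():
        events.append("send image")

    await process_message([send_text, send_image], stop_typing_task)
    return events

if __name__ == "__main__":
    print(asyncio.run(demo_stop_before_delivery()))
```

In the demo the helper is invoked before each delivery but stops typing only once, ahead of the first send; the final "stop typing" entry comes from the retained finally-block call, which would need to be harmless when typing has already stopped.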

The session guard remains active across delivery — pending messages that arrive while delivery is in flight are still queued and drained afterwards. The change does not add a hard timeout (a timeout cannot prove the message was not delivered, and retrying could duplicate messages).

Tests

Two new regression tests in tests/gateway/test_base_topic_sessions.py:

  • test_process_message_background_stops_typing_before_first_delivery — proves send() is invoked only after the typing task has exited.
  • test_message_arriving_during_delivery_is_queued_after_typing_stops — proves a message arriving during delivery is enqueued and drained after the current delivery, with the active session entry cleared.

Verified locally: new tests fail before the change (regression captured); after the change all 9 tests in test_base_topic_sessions.py pass; live verification with a real Telegram bot confirms the typing indicator clears immediately after the reply is sent.

Risk

  • Implementation contained to _process_message_background internals; no public API changes.
  • Platforms with explicit stop_typing semantics (Discord, Slack) still hit their existing stop path via the retained finally-block cleanup; the only behavior change is timing — typing stops slightly earlier than before.

Changed files

  • gateway/platforms/base.py (modified, +12/-0)
  • tests/gateway/test_base_topic_sessions.py (modified, +93/-0)

Bug

_keep_typing (the background task that sends sendChatAction(typing) every 2s) can run indefinitely after the agent has already replied, leaving the bot stuck in the "typing" state.

Reproduction Scenario

  1. VPS experiences a transient network error → Telegram connections leave CLOSE-WAIT sockets
  2. User sends a message (or /reset) shortly after reconnect
  3. Gateway processes it, registers a _deliver() post-delivery callback
  4. _deliver() calls _send_goal_status_notice() over the dead socket → hangs
  5. _stop_typing_task() is never awaited → _keep_typing runs forever
  6. Bot shows "typing" indefinitely, even though the reply was sent

Confirmed with ss -tp | grep CLOSE-WAIT showing multiple dead Telegram connections (149.154.166.110:443) after the network errors.

Impact

  • Platform: Telegram (uses one-shot sendChatAction, no native stop)
  • Trigger: any post-delivery callback that makes a slow/blocking network call
  • Symptom: bot appears "typing" for hours after responding; new messages are still processed normally (session guard is released)
