hermes - ✅(Solved) Fix Bug: Telegram bot freezes when switching providers/models due to blocking HTTP call in asyncio event loop [2 pull requests, 1 participants]

GitHub stats
NousResearch/hermes-agent#20525 (fetched 2026-05-07 03:57:32)
Comments: 0 · Participants: 1 · Timeline events: 6 (labeled ×4, cross-referenced ×2) · Reactions: 0

When using the Telegram bot integration and attempting to switch providers or models (e.g., switching to OpenRouter), the bot freezes and loses connection. The Telegram bot becomes completely unresponsive until the blocking operation times out.

Root Cause

The root cause is in agent/models_dev.py inside the fetch_models_dev() function. It uses the synchronous requests library to make an HTTP GET call directly on the asyncio event loop thread:

# agent/models_dev.py
import requests

def fetch_models_dev():
    response = requests.get(MODELS_DEV_URL, timeout=15)  # BLOCKING — freezes the event loop
    ...

This blocking call (with a 15-second timeout) is invoked from gateway/run.py inside async functions:

  1. list_authenticated_providers() (called at ~line 4636 for model picker display)
  2. list_authenticated_providers() (called at ~line 4747 for fallback text provider list)
  3. _switch_model() (called at ~line 4665 inside the _on_model_selected closure)

Because these blocking calls run on the main asyncio event loop thread, the entire bot is frozen for up to 15 seconds every time a user tries to switch providers or models. While the loop is stalled the bot cannot process any other work, so Telegram treats it as unresponsive and the connection is dropped.
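The stall described above can be reproduced in isolation. The following is a minimal sketch, not code from the repository: blocking_fetch is a hypothetical stand-in that uses time.sleep in place of requests.get, and heartbeat is a hypothetical background task whose tick gaps show how long the loop was unable to run anything else.

```python
import asyncio
import time


def blocking_fetch(delay: float = 0.3) -> str:
    # Hypothetical stand-in for requests.get(...): time.sleep blocks the calling thread.
    time.sleep(delay)
    return "catalog"


async def heartbeat(ticks: list, interval: float = 0.05) -> None:
    # Records a timestamp on every wakeup; a large gap between ticks means the loop stalled.
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def measure_max_gap() -> float:
    ticks: list = []
    task = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0.1)   # let the heartbeat tick a few times
    blocking_fetch()           # synchronous call on the event loop thread
    await asyncio.sleep(0.1)   # let the heartbeat resume
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    # Largest interval between consecutive heartbeat ticks.
    return max(b - a for a, b in zip(ticks, ticks[1:]))


if __name__ == "__main__":
    print(f"longest heartbeat gap: {asyncio.run(measure_max_gap()):.2f}s")
```

The longest gap is at least as long as the blocking call, even though the heartbeat asks to wake every 50 ms. With a 15-second HTTP timeout in place of the 0.3-second sleep, the same effect would keep every other coroutine in the gateway from running.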

Fix

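Wrap the blocking calls in gateway/run.py with loop.run_in_executor() to offload them to a thread pool so they no longer block the event loop. The asyncio documentation prefers asyncio.get_running_loop() over asyncio.get_event_loop() inside coroutines and callbacks, so the example uses it:

import asyncio
import functools

# Before (blocking):
providers = list_authenticated_providers(...)

# After (non-blocking); passing None selects the loop's default thread pool executor:
loop = asyncio.get_running_loop()
providers = await loop.run_in_executor(
    None,
    functools.partial(list_authenticated_providers, ...)
)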

Apply the same pattern to the _switch_model() call in _on_model_selected.
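A self-contained sketch of the same pattern for the switch path follows. It assumes nothing about the real _switch_model() signature: switch_model_sync and on_model_selected are hypothetical names used only for illustration. It also shows why functools.partial appears in the fix: run_in_executor forwards positional arguments only, so keyword arguments have to be bound beforehand.

```python
import asyncio
import functools
import threading


def switch_model_sync(name: str, *, provider: str = "default") -> dict:
    # Hypothetical stand-in for a synchronous helper such as _switch_model().
    return {"model": name, "provider": provider, "thread": threading.get_ident()}


async def on_model_selected(name: str, provider: str) -> dict:
    loop = asyncio.get_running_loop()
    # run_in_executor accepts only positional arguments for the callable,
    # so the keyword argument is bound with functools.partial first.
    call = functools.partial(switch_model_sync, name, provider=provider)
    return await loop.run_in_executor(None, call)


if __name__ == "__main__":
    result = asyncio.run(on_model_selected("some-model", "openrouter"))
    # The helper ran on a worker thread, not on the event loop thread.
    print(result["thread"] != threading.get_ident())
```

While the worker thread runs the helper, the coroutine is suspended at the await and the loop remains free to handle other updates.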

Additionally, reduce the requests.get() timeout in agent/models_dev.py from timeout=15 to a lower value (e.g., timeout=8) to prevent excessive waits if the models.dev API is slow.

PR fix notes

PR #20530: fix(gateway): offload /model catalog work

Fixes #20525

Summary

  • run gateway /model provider listing through the existing context-preserving executor
  • run both picker callback model switches and text /model ... switches through the same executor
  • add regression coverage for text listing, text switching, and interactive picker switching

Why

list_authenticated_providers, list_picker_providers, and switch_model can touch synchronous model catalog/auth paths such as models.dev. Calling them directly inside the async gateway command handler can block Telegram/Discord event-loop progress during provider/model switching.
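The PR refers to an existing context-preserving executor in the gateway, whose name and implementation are not shown here. As a rough sketch of what such a helper might look like, the hypothetical run_with_context below copies the caller's contextvars context and runs the synchronous function inside it on a worker thread. This matters because loop.run_in_executor() does not itself copy the caller's context, whereas asyncio.to_thread() does. The session_id variable is likewise an assumption for illustration.

```python
import asyncio
import contextvars
import functools

# Hypothetical per-request state that synchronous helpers might read.
session_id = contextvars.ContextVar("session_id", default="unset")


def read_session() -> str:
    # Synchronous helper that depends on a context variable.
    return session_id.get()


async def run_with_context(func, *args, **kwargs):
    # Hypothetical helper: snapshot the caller's context, then execute
    # func inside that snapshot on the default thread pool executor.
    loop = asyncio.get_running_loop()
    ctx = contextvars.copy_context()
    call = functools.partial(ctx.run, func, *args, **kwargs)
    return await loop.run_in_executor(None, call)


async def demo() -> str:
    session_id.set("telegram-123")
    return await run_with_context(read_session)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Routing every offloaded call through one helper of this kind keeps per-session state visible to the synchronous catalog and auth code without each call site having to manage the context itself.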

Verification

  • scripts/run_tests.sh tests/gateway/test_model_command_custom_providers.py -> 4 passed
  • scripts/run_tests.sh tests/gateway/test_model_command_custom_providers.py tests/gateway/test_session_env.py tests/gateway/test_model_switch_persistence.py -> 25 passed, 2 existing deprecation warnings from tests/conftest.py:492
  • git diff --check

Non-goals

  • no changes to model catalog semantics
  • no timeout tuning in agent/models_dev.py
  • no platform-specific picker behavior changes

Changed files

  • gateway/run.py (modified, +42/-34)
  • tests/gateway/test_model_command_custom_providers.py (modified, +180/-0)

PR #20979: fix(gateway): offload model command switching

Summary

Fixes #20525.

The gateway /model path called synchronous model/provider helpers directly from async handlers. Those helpers can hit models.dev and other provider metadata paths using blocking HTTP, which can stall the Telegram gateway event loop while a user opens the picker or switches providers.

This PR keeps the change limited to the gateway command path:

  • offloads picker provider listing with asyncio.to_thread(list_picker_providers, ...)
  • offloads picker callback switching with asyncio.to_thread(switch_model, ...)
  • offloads fallback text provider listing with asyncio.to_thread(list_authenticated_providers, ...)
  • offloads direct /model <name> switching with asyncio.to_thread(switch_model, ...)
  • adds gateway regression tests proving the text list and picker callback paths route their synchronous work through to_thread
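The offloading and the style of regression check listed above might be sketched as follows. This is an illustration under assumptions, not the PR's test code: list_providers_sync and handle_model_command are hypothetical stand-ins for the gateway helpers and handler, and the check wraps asyncio.to_thread (available since Python 3.9) to record which function was routed through it.

```python
import asyncio
from unittest import mock


def list_providers_sync(user_id: str) -> list:
    # Hypothetical synchronous helper standing in for list_picker_providers.
    return [f"provider-for-{user_id}"]


async def handle_model_command(user_id: str) -> list:
    # Offload the synchronous helper so the event loop stays responsive.
    return await asyncio.to_thread(list_providers_sync, user_id)


async def check_routes_through_to_thread() -> bool:
    real_to_thread = asyncio.to_thread
    seen = []

    async def recording_to_thread(func, *args, **kwargs):
        # Record the callable, then delegate to the real implementation.
        seen.append(func)
        return await real_to_thread(func, *args, **kwargs)

    with mock.patch.object(asyncio, "to_thread", recording_to_thread):
        result = await handle_model_command("u1")
    return seen == [list_providers_sync] and result == ["provider-for-u1"]


if __name__ == "__main__":
    print(asyncio.run(check_routes_through_to_thread()))
```

A check like this fails if a later change calls the helper directly from the handler again, which is the regression the PR is guarding against.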

Verification

  • scripts/run_tests.sh tests/gateway/test_model_command_custom_providers.py -> 3 passed
  • git diff --check

Non-goals

  • Does not change model/provider resolution behavior.
  • Does not alter the models.dev timeout or cache policy.
  • Does not change adapter callback routing; it only moves blocking work off the event loop.

Changed files

  • gateway/run.py (modified, +8/-4)
  • tests/gateway/test_model_command_custom_providers.py (modified, +101/-0)

Steps to Reproduce

  1. Start the Hermes gateway with Telegram integration configured
  2. In Telegram, send the /model command to open the model picker
  3. Select a model from a different provider (e.g., OpenRouter)
  4. Observe: the bot freezes for 10–15 seconds and may drop connection

Expected Behavior

The bot should respond instantly or near-instantly when switching models/providers, with no event loop blocking.

Affected Files

  • agent/models_dev.py: uses synchronous requests.get() with a 15s timeout
  • gateway/run.py: calls list_authenticated_providers() and _switch_model() directly on the event loop thread

Environment

  • Platform: Telegram
  • Provider switching: Any provider change (e.g., switching to OpenRouter)
  • Impact: Complete bot freeze / connection loss during model selection
