hermes - Fix MCP HTTP transport: anyio RuntimeError during streamable_http_client cleanup causes reconnect failure loop

Error Message

RuntimeError: The current task is not holding this lock
  anyio/_core/_synchronization.py:166
  anyio/_backends/_asyncio.py:1854

Root Cause

In tools/mcp_tool.py, _run_http() wraps the session in async with streamable_http_client(...). When the MCP session is torn down for a reconnect, streamable_http_client.__aexit__ triggers anyio TaskGroup cleanup. With anyio >= 4.x on the asyncio backend, an anyio Lock acquired inside the TaskGroup cannot be released by the asyncio task that drives the cleanup, because the lock is owned by a different task:

RuntimeError: The current task is not holding this lock
  anyio/_core/_synchronization.py:166
  anyio/_backends/_asyncio.py:1854

The exception propagates to the except Exception handler in run(), where it is counted as a reconnect failure. Every retry hits the same error, so after 5 attempts the server is abandoned permanently.
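The ownership rule behind this error can be illustrated without anyio. The sketch below is a simplified stand-in, not anyio's actual implementation: OwnedLock, worker, and demo are hypothetical names, and the only thing borrowed from the report is the error message. It shows that a lock acquired in one asyncio task and released from another fails the ownership check.

```python
import asyncio


class OwnedLock:
    """Hypothetical task-owned lock; a stand-in, not anyio's code."""

    def __init__(self) -> None:
        self._owner = None  # the asyncio.Task holding the lock, if any

    def acquire_nowait(self) -> None:
        if self._owner is not None:
            raise RuntimeError("lock is already held")
        self._owner = asyncio.current_task()

    def release(self) -> None:
        # Only the task that acquired the lock may release it.
        if self._owner is not asyncio.current_task():
            raise RuntimeError("The current task is not holding this lock")
        self._owner = None


async def demo() -> str:
    lock = OwnedLock()

    async def worker() -> None:
        lock.acquire_nowait()  # acquired inside the worker task

    await asyncio.create_task(worker())
    try:
        lock.release()  # released from the outer task that drives cleanup
    except RuntimeError as exc:
        return str(exc)
    return "released"


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In the reported bug the two tasks are presumably the one running inside the TaskGroup and the one executing __aexit__; the sketch only models that relationship.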

Fix Action

Workaround

Skip retry counting for this known anyio cleanup error when the server was already connected:

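# Inside the except Exception handler of the reconnect loop in run():
if isinstance(exc, RuntimeError) and "not holding this lock" in str(exc):
    if self._ready.is_set():
        # The server was already connected, so this is the known anyio
        # cleanup error: retry immediately without counting a failure.
        continue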

Bug Description

With an HTTP/StreamableHTTP MCP server, the streamable_http_client context manager raises RuntimeError("The current task is not holding this lock") during cleanup when the session is torn down for a reconnect. The error is treated as a connection failure and consumes retry attempts, so the MCP server is permanently disconnected after 5 retries.

Steps to Reproduce

  1. Configure an HTTP MCP server in config.yaml
  2. Start Hermes Gateway
  3. Wait for keepalive to trigger reconnection (default 180s)
  4. Connection is lost after ~3 minutes, retries fail, server gives up

Environment

  • Hermes Agent: v0.14.0 (c0169496d)
  • Python: 3.11.14 / mcp SDK: 1.24.0 / anyio: 4.12.0
  • MCP server: gbrain v0.37.1.0 (HTTP/StreamableHTTP via ngrok)

Log Evidence

keepalive failed, triggering reconnect
reconnect requested — tearing down HTTP session
connection lost (attempt 1/5): unhandled errors in a TaskGroup
...
failed after 5 reconnection attempts, giving up

Suggested Fix

Handle streamable_http_client cleanup errors at the _run_http() boundary when exiting for a reconnect, or restructure the async context management so that the Lock is acquired and released by the same task.
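The first option could look roughly like the sketch below. It is only an illustration under stated assumptions: fake_http_client is a stand-in that raises the lock error on exit (it is not the MCP SDK's streamable_http_client), and run_http and reconnect_requested are hypothetical names. The idea is to swallow the known cleanup error only when the teardown was requested, and to re-raise it in every other case.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def fake_http_client():
    """Stand-in client whose exit raises the lock error (not the real SDK)."""
    yield "session"
    raise RuntimeError("The current task is not holding this lock")


async def run_http(reconnect_requested: asyncio.Event) -> str:
    try:
        async with fake_http_client():
            # The session body ends once a reconnect has been requested.
            await reconnect_requested.wait()
    except RuntimeError as exc:
        # Swallow the known cleanup error only for a requested teardown.
        if reconnect_requested.is_set() and "not holding this lock" in str(exc):
            return "clean-teardown"
        raise
    return "closed"


async def demo() -> str:
    reconnect_requested = asyncio.Event()
    reconnect_requested.set()  # simulate keepalive asking for a reconnect
    return await run_http(reconnect_requested)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Handling the error at this boundary keeps the retry loop in run() unchanged, since a requested teardown never surfaces as an exception there.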
