hermes: stale-stream handler tries to rebuild OpenAI client when provider is Anthropic



Bug: stale-stream handler tries to rebuild OpenAI client when provider is Anthropic

Repo

NousResearch/hermes-agent

Summary

When the stream-stale detector fires during an Anthropic Messages API call, the recovery path unconditionally calls agent._replace_primary_openai_client(...). That helper requires OPENAI_API_KEY, which is typically unset in Anthropic-only setups, so it raises and logs a misleading WARNING that suggests the recovery itself failed. In reality, the Anthropic client is rebuilt by a separate code path and the turn completes, but the spurious warning pollutes errors.log and confuses debugging.

Repro

  • model.provider = anthropic, model.default = claude-opus-4-7 (or any Anthropic model), OPENAI_API_KEY unset.
  • Send a tool-heavy turn that takes > HERMES_STREAM_STALE_TIMEOUT (180s default) to deliver its first stream chunk.
  • Stale-stream handler at agent/chat_completion_helpers.py:2033 fires.
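For readers unfamiliar with the trigger, the stale check can be pictured roughly as follows. This is only a sketch, not the hermes-agent implementation: StaleStreamWatch and stale_timeout_seconds are hypothetical names, and it assumes HERMES_STREAM_STALE_TIMEOUT holds a number of seconds, with the 180s default quoted above.

```python
import os
import time


def stale_timeout_seconds(env=os.environ):
    # Assumption: the variable holds seconds; 180 is the default quoted in the repro.
    return float(env.get("HERMES_STREAM_STALE_TIMEOUT", "180"))


class StaleStreamWatch:
    """Hypothetical helper: tracks the time elapsed since the last stream chunk."""

    def __init__(self, threshold, clock=time.monotonic):
        self.threshold = threshold
        self._clock = clock
        self._last_activity = clock()  # the request start counts as activity

    def mark_chunk(self):
        # Call whenever a chunk arrives to reset the stale timer.
        self._last_activity = self._clock()

    def is_stale(self):
        # Stale once no chunk has arrived for at least `threshold` seconds.
        return self._clock() - self._last_activity >= self.threshold
```

A turn whose first chunk takes longer than the threshold would make is_stale() return True before mark_chunk() is ever called, which matches the "no chunks received" wording in the observed log.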

Observed log

WARNING agent.chat_completion_helpers: Stream stale for 180s (threshold 180s)
  — no chunks received. model=claude-opus-4-7 context=~12,247 tokens.
  Killing connection.
WARNING run_agent: Failed to rebuild shared OpenAI client
  (stale_stream_pool_cleanup) thread=asyncio_1:6173863936
  provider=anthropic base_url=https://api.anthropic.com model=claude-opus-4-7
  error=The api_key client option must be set either by passing api_key to
  the client or by setting the OPENAI_API_KEY environment variable

The user-facing _emit_status banner ("⚠️ No response from provider for 180s … Reconnecting…") also reaches messaging platforms (Discord/Telegram), which is correct behaviour but worth knowing.

Expected

On the stale-stream recovery path:

  • If agent.api_mode == "anthropic_messages", rebuild the Anthropic client (agent._anthropic_client.close() + agent._rebuild_anthropic_client()) and skip the OpenAI pool rebuild.
  • Otherwise behave as today.

This mirrors the existing branching at line ~2066 (interrupt path) and at line ~242 in the non-stream stale handler, which already do the right thing.

Suggested patch

agent/chat_completion_helpers.py around line 2047–2056:

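try:
    if agent.api_mode == "anthropic_messages":
        # Anthropic provider: drop the stale client and build a fresh one,
        # skipping the OpenAI pool rebuild entirely.
        try:
            agent._anthropic_client.close()
        except Exception:
            pass  # best-effort close; the client is replaced regardless
        agent._rebuild_anthropic_client()
    else:
        # All other providers: behave as today.
        _close_request_client_once("stale_stream_kill")
        try:
            agent._replace_primary_openai_client(
                reason="stale_stream_pool_cleanup"
            )
        except Exception:
            pass
except Exception:
    pass  # recovery cleanup is best-effort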
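
One way to check the branching in isolation is a small stand-in for the agent. The sketch below is an illustration under stated assumptions, not code from the repository: FakeClient, FakeAgent, and recover_from_stale_stream are hypothetical names, "chat_completions" is a placeholder for any non-Anthropic api_mode, and the RuntimeError only mimics the missing-key failure seen in the log.

```python
class FakeClient:
    """Hypothetical stand-in for the Anthropic client."""

    def __init__(self, log):
        self._log = log

    def close(self):
        self._log.append("anthropic_close")


class FakeAgent:
    """Hypothetical stand-in that records which recovery helpers run."""

    def __init__(self, api_mode):
        self.api_mode = api_mode
        self.calls = []
        self._anthropic_client = FakeClient(self.calls)

    def _rebuild_anthropic_client(self):
        self.calls.append("anthropic_rebuild")

    def _replace_primary_openai_client(self, reason):
        self.calls.append("openai_rebuild:" + reason)
        # Mimic the failure reported when OPENAI_API_KEY is unset.
        raise RuntimeError("api_key client option must be set")


def recover_from_stale_stream(agent, close_request_client):
    # Same branching as the suggested patch; close_request_client stands in
    # for _close_request_client_once.
    try:
        if agent.api_mode == "anthropic_messages":
            try:
                agent._anthropic_client.close()
            except Exception:
                pass
            agent._rebuild_anthropic_client()
        else:
            close_request_client("stale_stream_kill")
            try:
                agent._replace_primary_openai_client(
                    reason="stale_stream_pool_cleanup"
                )
            except Exception:
                pass
    except Exception:
        pass


anthropic_agent = FakeAgent("anthropic_messages")
recover_from_stale_stream(anthropic_agent, anthropic_agent.calls.append)

other_agent = FakeAgent("chat_completions")  # placeholder non-Anthropic mode
recover_from_stale_stream(other_agent, other_agent.calls.append)
```

With this stand-in, the Anthropic agent records only the close and rebuild of its own client, so the OpenAI helper is never invoked. The other agent still goes through the OpenAI pool cleanup as before.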

Severity

Low (cosmetic / log noise). Recovery still works because the Anthropic client gets rebuilt on the next outer-retry iteration via the existing error-handling path.

Environment

  • hermes-agent commit: cc93053b4
  • macOS 26.4.1, Python 3.11.15
  • provider=anthropic, model=claude-opus-4-7
