hermes - 💡(How to fix) Fix [Bug]: Local claude-cli custom provider timeout is reported as Empty response and fallback loops [4 pull requests]

Error Message

{"request_id":"1dcdfd75-...","model":"claude-opus-4-7","latency_s":127.411,"status":"error","error_class":"RuntimeError","error":"claude CLI turn timed out"}
{"request_id":"e5743941-...","model":"claude-opus-4-7","latency_s":120.014,"status":"error","error_class":"RuntimeError","error":"claude CLI turn timed out"}

Instead of surfacing a clear provider timeout error with the provider/model name and elapsed timeout, Hermes wraps the local provider timeout as an empty response, retries several times, and can fall back to the same claude-cli provider path. This looks like Claude produced an empty answer, but the real error is RuntimeError: claude CLI turn timed out from the shim.

Preserve custom-provider timeout/errors as timeout/error classes rather than normalizing them into empty-content retries.

Fix Action

Fixed

Fixed by PR: fix(error_classifier,fallback): classify custom-provider timeouts correctly, prevent self-selection fallback loop (https://github.com/NousResearch/hermes-agent/pull/22664)
Fixed by PR: fix(fallback): skip chain entries matching current provider/model/base_url (https://github.com/NousResearch/hermes-agent/pull/22780)
Fixed by PR: fix(delegate): add do-not-use guidance to acp schema (carve-out of #22680) (https://github.com/NousResearch/hermes-agent/pull/22806)
Fixed by PR: fix(delegate): add explicit do-not-use guidance to acp_command/acp_args schema descriptions (https://github.com/NousResearch/hermes-agent/pull/22680)

Bug Description

When Hermes routes a selected claude-cli model through an OpenAI-compatible local shim/custom provider, long Claude CLI turns that exceed the shim's internal 120s per-turn timeout surface to the user as Empty response from model and trigger retry/fallback behavior. In the observed setup, fallback can select the same claude-cli path again, so retries loop against the same timeout surface rather than recovering.

This is not the same class as intentional thinking-only or group-silence empty responses. The underlying provider process is producing/continuing work, but the local OpenAI-compatible shim times out first.

Environment

Hermes model picker entry: claude-cli / claude-opus-4-7
Routing path: custom_providers.claude-cli -> OpenAI-compatible endpoint -> local shim at http://127.0.0.1:7891/v1
Shim process: persistent Claude CLI subprocess
Claude CLI args observed in live child process include:
- -p
- --output-format stream-json
- --input-format stream-json
- --include-partial-messages
- --verbose
- --permission-mode dontAsk
- --model claude-opus-4-7
- --resume <session-id>

Local evidence from the shim implementation

In the local shim, the hard timeout is fixed:

SUBPROCESS_TIMEOUT = 120
PROCESS_IDLE_TIMEOUT = 1800
...
content, usage, finish_reason, tool_calls = self._read_response(SUBPROCESS_TIMEOUT)
...
raise RuntimeError("claude CLI turn timed out")

Idle eviction after 1800 seconds forces cold/resumed spawns, which are more likely to exceed the 120s per-turn budget when context is heavy or MCP/tool state reconnects.
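The interaction between idle eviction and the per-turn budget can be pictured as a budget that depends on whether the child process was just spawned. The following is a minimal sketch, not the shim's actual code: COLD_TURN_TIMEOUT and turn_budget are hypothetical names, and the 300-second cold value is an arbitrary illustration. Only SUBPROCESS_TIMEOUT = 120 comes from the shim.

```python
SUBPROCESS_TIMEOUT = 120  # warm per-turn budget in seconds, as in the shim
COLD_TURN_TIMEOUT = 300   # hypothetical larger budget for the first turn after spawn/resume


def turn_budget(turns_since_spawn: int) -> int:
    """Return the per-turn timeout in seconds for a child process.

    turns_since_spawn is 0 for the first turn after a cold or resumed spawn.
    """
    if turns_since_spawn < 0:
        raise ValueError("turns_since_spawn must be non-negative")
    if turns_since_spawn == 0:
        return COLD_TURN_TIMEOUT
    return SUBPROCESS_TIMEOUT
```

Under this assumption, the shim would pass turn_budget(...) to its read call instead of the fixed constant, so only the first turn after an eviction gets the longer window.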

Observed Log Pattern

The shim log shows 120s/127s timeout failures followed immediately by successful turns, which indicates the model/CLI can continue but the wrapper budget is too short:

{"event":"spawn","model":"claude-opus-4-7","resume":true,"session_id":"38869d6c-...","has_system_prompt":true}
{"request_id":"1dcdfd75-...","model":"claude-opus-4-7","latency_s":127.411,"status":"error","error_class":"RuntimeError","error":"claude CLI turn timed out"}
{"request_id":"170feabf-...","model":"claude-opus-4-7","latency_s":3.515,"status":"ok","prompt_tokens":6,"completion_tokens":46,"has_tool_calls":false}
...
{"event":"idle_evict","session_id":"38869d6c-...","idle_s":1800}
{"event":"spawn","model":"claude-opus-4-7","resume":true,"session_id":"38869d6c-...","has_system_prompt":true}
{"request_id":"e5743941-...","model":"claude-opus-4-7","latency_s":120.014,"status":"error","error_class":"RuntimeError","error":"claude CLI turn timed out"}
{"request_id":"2d5bfaed-...","model":"claude-opus-4-7","latency_s":81.105,"status":"ok","prompt_tokens":14,"completion_tokens":2951,"has_tool_calls":true}
{"request_id":"8d56ffbf-...","model":"claude-opus-4-7","latency_s":52.101,"status":"ok","prompt_tokens":9,"completion_tokens":2916,"has_tool_calls":true}
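To pull the timeout entries out of a shim log like the one above, a small JSON-lines filter is enough. This is a sketch that assumes the log is one JSON object per line with the status, error, request_id, and latency_s fields seen in the excerpt; find_timeouts is a hypothetical helper name, and matching on the substring "timed out" is a heuristic.

```python
import json

# Lines copied from the log excerpt above; "..." marks elided entries.
SAMPLE_LOG = [
    '{"event":"spawn","model":"claude-opus-4-7","resume":true,"session_id":"38869d6c-...","has_system_prompt":true}',
    '{"request_id":"1dcdfd75-...","model":"claude-opus-4-7","latency_s":127.411,"status":"error","error_class":"RuntimeError","error":"claude CLI turn timed out"}',
    '{"request_id":"170feabf-...","model":"claude-opus-4-7","latency_s":3.515,"status":"ok","prompt_tokens":6,"completion_tokens":46,"has_tool_calls":false}',
    '...',
]


def find_timeouts(lines):
    """Return (request_id, latency_s) for every entry that failed with a timeout."""
    hits = []
    for raw in lines:
        raw = raw.strip()
        if not raw.startswith("{"):
            continue  # skip blank lines and elision markers
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed lines
        if entry.get("status") == "error" and "timed out" in str(entry.get("error", "")):
            hits.append((entry.get("request_id"), entry.get("latency_s")))
    return hits


for request_id, latency_s in find_timeouts(SAMPLE_LOG):
    print(f"{request_id}: timed out after {latency_s}s")
```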

Steps to Reproduce

1. Configure a local OpenAI-compatible custom provider that wraps Claude CLI with a 120s per-turn timeout.
2. Select that provider/model via Hermes model picker.
3. Let the shim idle long enough to evict the warm child process, or use a heavy resumed context/tool-call turn.
4. Send a message that causes the Claude CLI turn to take longer than 120 seconds.
5. Observe Hermes reporting Empty response from model / retrying, even though the underlying failure is a provider timeout.

Expected Behavior

Hermes should classify this as a provider timeout/failure, not as a model empty-content response. It should do one or more of the following:

- surface a clear provider timeout error with the provider/model name and elapsed timeout
- allow custom providers to advertise/request longer per-request timeout budgets
- avoid switching fallback to the same provider/model path that just timed out
- optionally mark the first turn after spawn/resume as eligible for a longer timeout budget
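A clear provider timeout error could carry the provider name, model name, and elapsed time as structured fields. The class below is a hypothetical illustration, not an existing Hermes type: the name ProviderTimeoutError and its fields are assumptions made for this sketch.

```python
class ProviderTimeoutError(TimeoutError):
    """Hypothetical error type that keeps timeout details instead of hiding them."""

    def __init__(self, provider: str, model: str, elapsed_s: float, budget_s: float):
        self.provider = provider
        self.model = model
        self.elapsed_s = elapsed_s
        self.budget_s = budget_s
        super().__init__(
            f"provider {provider!r} (model {model!r}) timed out after "
            f"{elapsed_s:.1f}s (budget {budget_s:.0f}s)"
        )


# Values taken from the second timeout entry in the shim log.
err = ProviderTimeoutError("claude-cli", "claude-opus-4-7", 120.014, 120)
print(err)
```

Subclassing the built-in TimeoutError means existing handlers that catch TimeoutError would still see it, while the message itself names the provider, the model, and the elapsed time.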

Actual Behavior

Hermes wraps the local provider timeout as an empty response, retries several times, and can fall back to the same claude-cli provider path. This looks like Claude produced an empty answer, but the real error is RuntimeError: claude CLI turn timed out from the shim.
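The distinction at stake can be sketched as a small classifier that checks for a provider error before it looks at content. This is an illustration only and does not reflect the actual Hermes error classifier: classify_turn and the returned labels are hypothetical, and matching on the text "timed out" is a heuristic for shims that raise a plain RuntimeError.

```python
from typing import Optional


def classify_turn(content: Optional[str], error: Optional[BaseException]) -> str:
    """Classify one provider turn, giving errors priority over empty content."""
    if error is not None:
        # A plain RuntimeError("... timed out") from a shim is still a timeout.
        if isinstance(error, TimeoutError) or "timed out" in str(error).lower():
            return "provider_timeout"
        return "provider_error"
    if not (content or "").strip():
        return "empty_response"  # reserved for genuinely empty model output
    return "ok"
```

With this ordering, RuntimeError("claude CLI turn timed out") is classified as provider_timeout and never reaches the empty-response branch.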

Proposed Fixes

- Preserve custom-provider timeout/errors as timeout/error classes rather than normalizing them into empty-content retries.
- Add/configure per-custom-provider request timeout metadata, for example timeout_s or request_timeout_ms, and thread it into the provider call path.
- Detect fallback self-selection: if current provider/model and fallback provider/model resolve to the same endpoint/model, skip or pick a different fallback.
- Consider first-turn-after-spawn / resumed-session timeout budgets separately from warm-turn budgets.
- Improve logs/user-visible errors so Empty response from model is reserved for genuinely empty model output, not transport/provider timeout.
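The fallback self-selection check can be sketched as a comparison on a normalized (endpoint, model) key. This is a minimal illustration under stated assumptions, not the code from the linked pull requests: Route and pick_fallback are hypothetical names, and treating two entries as identical when their base URL and model match (ignoring the provider label) is one possible reading of "resolve to the same endpoint/model".

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Route:
    """Hypothetical description of one provider path."""
    provider: str
    model: str
    base_url: str

    def key(self) -> Tuple[str, str]:
        # Normalize so a trailing slash or letter case does not hide a match.
        return (self.base_url.rstrip("/").lower(), self.model.lower())


def pick_fallback(current: Route, chain: Iterable[Route]) -> Optional[Route]:
    """Return the first chain entry that does not resolve to the current route."""
    for candidate in chain:
        if candidate.key() != current.key():
            return candidate
    return None  # no distinct fallback: report the original error instead of looping
```

Returning None when every entry resolves to the current route lets the caller surface the original timeout rather than retrying the same path.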

Different from #13248, which covers intentional empty responses in group-chat/slack addressing semantics.
