hermes - ✅(Solved) Fix [Feature]: adjustable provider reconnection attempt count [1 pull request, 1 comment, 2 participants]

GitHub stats

NousResearch/hermes-agent#11616 (fetched 2026-04-18 05:59:50)

  • Comments: 1
  • Participants: 2
  • Timeline: 2 (commented ×1, labeled ×1)
  • Reactions: 0

PR fix notes

PR #12013: feat(agent): make primary-provider API retry count configurable

Description (problem / solution / changelog)

What & why

The per-call API retry loop in `run_agent.py` uses a hardcoded `max_retries = 3`. Users with a configured `fallback_model` who would rather fail over sooner after an unresponsive primary had no way to shorten the wait — three attempts with exponential backoff stretch to 15+ minutes on a flapping upstream before the fallback kicks in.

Issue #11616 logs exactly this scenario on Qwen via OpenRouter:

```
[20:12] ⚠️ No response from provider for 180s (model: qwen/qwen3-coder-480b-…). Reconnecting…
[20:13] ⏳ Retrying in 2.0s (attempt 1/3)…
[20:17] ⏳ Retrying in 4.5s (attempt 2/3)…
[20:20] ⚠️ No response from provider for 180s. Reconnecting…   ← full retry budget burned
```
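The interaction between the retry budget and the fallback can be illustrated in miniature. This is a minimal sketch, not the actual loop in `run_agent.py`: the helper `call_with_failover`, its parameters, the 2-second base delay, and the convention that the budget counts retries after the first attempt are all assumptions made for illustration.

```python
import time
from typing import Callable, Optional


def call_with_failover(
    primary: Callable[[], str],
    fallback: Optional[Callable[[], str]],
    max_retries: int,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `primary` once plus up to `max_retries` retries, then `fallback`.

    Hypothetical helper: the counting convention and the backoff schedule
    are assumptions and may differ from the real agent loop.
    """
    last_error: Optional[Exception] = None
    for attempt in range(max_retries + 1):
        try:
            return primary()
        except TimeoutError as exc:
            last_error = exc
            if attempt < max_retries:
                # Exponential backoff between attempts: 2s, 4s, 8s, ...
                sleep(base_delay * (2 ** attempt))
    if fallback is not None:
        # Retry budget exhausted: hand the call to the fallback provider.
        return fallback()
    raise RuntimeError("primary provider exhausted its retry budget") from last_error
```

With `max_retries=0` the sketch makes a single call to the primary and goes straight to the fallback without sleeping, which is the fast-failover behaviour the issue asks for.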

Change

Read the retry budget from `HERMES_API_MAX_RETRIES` (default `3`, clamped to non-negative, falls back to `3` on malformed values). `0` disables retries entirely, so one failed call routes directly to the fallback provider.

```python
# before
max_retries = 3

# after
try:
    max_retries = max(0, int(os.getenv("HERMES_API_MAX_RETRIES", "3")))
except (TypeError, ValueError):
    max_retries = 3
```
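To make the clamping and fallback rules easy to check in isolation, the same expression can be wrapped in a small function. This is a sketch only: the name `read_max_retries` and its `env` parameter are hypothetical and do not appear in the PR.

```python
import os
from typing import Mapping, Optional


def read_max_retries(env: Optional[Mapping[str, str]] = None, default: int = 3) -> int:
    """Parse HERMES_API_MAX_RETRIES the way the PR describes.

    Hypothetical helper: `env` exists only so the rule can be exercised
    without touching the real process environment.
    """
    env = os.environ if env is None else env
    raw = env.get("HERMES_API_MAX_RETRIES", str(default))
    try:
        # Negative values are clamped to 0, which disables retries.
        return max(0, int(raw))
    except (TypeError, ValueError):
        # Malformed values fall back to the default.
        return default
```

For example, `read_max_retries({"HERMES_API_MAX_RETRIES": "-5"})` returns `0`, and `read_max_retries({"HERMES_API_MAX_RETRIES": "not-a-number"})` returns `3`.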

The env-var-based knob matches the existing `HERMES_API_TIMEOUT` / `HERMES_API_CALL_STALE_TIMEOUT` pattern in the same file and avoids opening a `config.yaml` schema discussion for a single integer. A nested `agent.max_api_retries` config knob (as the issue reporter proposed in ex-1) is a reasonable follow-up if self-hosters ask for it.

Also document the new variable alongside `HERMES_API_TIMEOUT` in `website/docs/reference/environment-variables.md`.

How to test

```bash
# Default behaviour unchanged
unset HERMES_API_MAX_RETRIES
python -c "import os; print(int(os.getenv('HERMES_API_MAX_RETRIES', '3')))"  # 3

# Lower for fast failover
export HERMES_API_MAX_RETRIES=1
python -c "import os; print(int(os.getenv('HERMES_API_MAX_RETRIES', '3')))"  # 1

# Zero disables retries entirely
export HERMES_API_MAX_RETRIES=0
python -c "import os; print(int(os.getenv('HERMES_API_MAX_RETRIES', '3')))"  # 0

# Malformed values fall back to 3
export HERMES_API_MAX_RETRIES=not-a-number
```
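The one-liners above call `int()` directly, so they would raise on a malformed value rather than exercise the fallback. A possible complement, sketched here under the assumption that the parsing expression is exactly the one shown in the PR, runs that expression in a fresh interpreter with the variable set or unset. The names `SNIPPET` and `effective_retries` are hypothetical.

```python
import os
import subprocess
import sys
from typing import Optional

# The parsing expression from the PR, run in a child interpreter so the
# variable is read from a real process environment.
SNIPPET = (
    "import os\n"
    "try:\n"
    "    n = max(0, int(os.getenv('HERMES_API_MAX_RETRIES', '3')))\n"
    "except (TypeError, ValueError):\n"
    "    n = 3\n"
    "print(n)\n"
)


def effective_retries(value: Optional[str]) -> int:
    """Return the retry budget a child process would compute for `value`.

    Passing None simulates `unset HERMES_API_MAX_RETRIES`.
    """
    env = {k: v for k, v in os.environ.items() if k != "HERMES_API_MAX_RETRIES"}
    if value is not None:
        env["HERMES_API_MAX_RETRIES"] = value
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())
```

Calling `effective_retries("not-a-number")` covers the malformed case, and `effective_retries("-2")` covers the clamp to zero.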

No existing tests pin the `max_retries = 3` literal; the default preserves today's behaviour byte-for-byte.

Platforms tested

  • macOS (Darwin 25.3.0), Python 3.11.13. Change is platform-agnostic.

Related

Closes #11616.

Changed files

  • run_agent.py (modified, +8/-1)
  • website/docs/reference/environment-variables.md (modified, +1/-0)

Problem or Use Case

An unstable provider causes the agent to reconnect again and again, which is a waste of time. Has anyone else run into the same issue?

[2026/4/17 20:12] agentHermes_bot: ⚠️ No response from provider for 180s (model: qwen/qwen3-coder-480b-a35b-instruct, context: ~36,418 tokens). Reconnecting...
[2026/4/17 20:13] agentHermes_bot: ⏳ Retrying in 2.0153926350855s (attempt 1/3)...
[2026/4/17 20:16] agentHermes_bot: ⚠️ No response from provider for 180s (model: qwen/qwen3-coder-480b-a35b-instruct, context: ~36,418 tokens). Reconnecting...
[2026/4/17 20:17] agentHermes_bot: ⏳ Retrying in 4.472134508713133s (attempt 2/3)...
[2026/4/17 20:19] agentHermes_bot: ⏳ Still working... (10 min elapsed — iteration 1/120, waiting for stream response (90s, no chunks yet))
[2026/4/17 20:20] agentHermes_bot: ⚠️ No response from provider for 180s (model: qwen/qwen3-coder-480b-a35b-instruct, context: ~36,418 tokens). Reconnecting...
[2026/4/17 20:24] agentHermes_bot: ⚠️ No response from provider for 180s (model: qwen/qwen3-coder-480b-a35b-instruct, context: ~36,418 tokens). Reconnecting...
[2026/4/17 20:25] agentHermes_bot: ⏳ Retrying in 2.182153623726161s (attempt 1/3)...
[2026/4/17 20:28] agentHermes_bot: ⚠️ No response from provider for 180s (model: qwen/qwen3-coder-480b-a35b-instruct, context: ~36,418 tokens). Reconnecting...
[2026/4/17 20:29] agentHermes_bot: ⏳ Still working... (20 min elapsed — iteration 1/120, waiting for stream response (30s, no chunks yet))
[2026/4/17 20:29] agentHermes_bot: ⏳ Retrying in 4.462992539814402s (attempt 2/3)...
[2026/4/17 20:30] agentHermes_bot: 💻 terminal: "cd /home/rsa-key-20260402/.hermes/wor..." 📖 read_file: "/home/rsa-key-20260402/.hermes/worksp..." 💻 terminal: "cd /home/rsa-key-20260402/.hermes/wor..." (×2)
[2026/4/17 20:39] agentHermes_bot: ⏳ Still working... (30 min elapsed — iteration 5/120, waiting for stream response (120s, no chunks yet))
[2026/4/17 20:40] agentHermes_bot: ⚠️ No response from provider for 180s (model: qwen/qwen3-coder-480b-a35b-instruct, context: ~10,726 tokens). Reconnecting...
[2026/4/17 20:41] agentHermes_bot: ⏳ Retrying in 2.953990269220814s (attempt 1/3)...
[2026/4/17 20:44] agentHermes_bot: ⚠️ No response from provider for 180s (model: qwen/qwen3-coder-480b-a35b-instruct, context: ~10,726 tokens). Reconnecting...
[2026/4/17 20:45] agentHermes_bot: ⏳ Retrying in 4.003901920026081s (attempt 2/3)...
[2026/4/17 20:48] agentHermes_bot: ⚠️ No response from provider for 180s (model: qwen/qwen3-coder-480b-a35b-instruct, context: ~10,726 tokens). Reconnecting...

Proposed Solution

Is there any way to adjust the maximum retry count so the agent can switch straight to the fallback provider, for example by adding it to the provider settings or to the agent section of config.toml? I'm submitting this because I haven't found any instructions in the docs or in other posts.

ex 1:

agent:
  max_turns: 120
  max_retry_before_failover: 1
  gateway_timeout: 3600
  restart_drain_timeout: 60
  service_tier: ''

ex 2: custom_providers:

Alternatives Considered

No response

Feature Type

Performance / reliability

Scope

Small (single file, < 50 lines)

Contribution

  • I'd like to implement this myself and submit a PR

Debug Report (optional)

extent analysis

TL;DR

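The max_retry_before_failover key is the reporter's proposal, not a setting that this page shows to exist. PR #12013 instead exposes the retry budget through the HERMES_API_MAX_RETRIES environment variable (default 3). In a build that includes the PR, lowering it (for example to 1, or to 0 to disable retries) should let the agent reach the fallback provider sooner.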

Guidance

  • In a build that includes PR #12013, set HERMES_API_MAX_RETRIES to a lower value (e.g., 1) in the environment the agent runs in; 0 disables retries entirely, and malformed values fall back to 3.
  • Do not rely on max_retry_before_failover in the config file taking effect. It was proposed in the issue, and the PR leaves a config-file knob as a possible follow-up.
  • Verify that the fallback provider (fallback_model) is properly configured and available.
  • Test the updated configuration to ensure the agent switches to the fallback provider as expected.

Example (configuration proposed in the issue, not an implemented schema)

agent:
  max_turns: 120
  max_retry_before_failover: 1
  gateway_timeout: 3600
  restart_drain_timeout: 60
  service_tier: ''

custom_providers:
- name: nvidia
  base_url: https://integrate..com/v1
  api_key: ${NDA_API_KEY}
  models:
    - z-ai/glm4.7:
        max_retry_before_failover: 1

Notes

The effectiveness of this solution depends on the specific configuration and environment. It's essential to test and verify the changes to ensure the desired behavior.

Recommendation

Lower HERMES_API_MAX_RETRIES, as introduced by PR #12013, rather than adding max_retry_before_failover to the config file. A smaller retry budget means the agent spends less time on an unresponsive primary provider before switching to the fallback.
