hermes - 💡(How to fix) Fix [Feature Request] Make `_fallback_sticky` configurable via `fallback_model.sticky` [1 comments, 2 participants]

GitHub stats
NousResearch/hermes-agent#23514 (fetched 2026-05-11 03:29:08)
Comments: 1 · Participants: 2 · Timeline events: 6 (labeled ×4, commented ×1, cross-referenced ×1) · Reactions: 0


Overview

When a fallback model is activated in Hermes Agent, the _fallback_sticky flag keeps the fallback active for the entire session. This is currently hardcoded behavior. This issue proposes exposing it as a user-configurable option.

Current Behavior

When the primary model triggers a fallback (e.g., qwen-local gets a TransportError), the system activates the _fallback_sticky mechanism. Once triggered, the fallback model persists for the entire session until /new or /reset is called.

The core logic resides in run_agent.py:

def _restore_primary_runtime(self):
    if getattr(self, "_fallback_sticky", False):
        return False
    # Attempt to restore primary model
    ...

The gateway layer mirrors this with _session_fallback_sticky, ensuring persistence even across agent instance recreations.
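
The behavior described above can be illustrated with a minimal sketch. Only _restore_primary_runtime and _fallback_sticky come from the issue; the AgentSketch class, the _activate_fallback and reset helpers, and the assumption that the flag is set to True at the moment fallback activates are hypothetical and are not taken from run_agent.py.

```python
class AgentSketch:
    """Hypothetical stand-in for the agent; not the real Hermes class."""

    def __init__(self, primary, fallback):
        self.primary = primary
        self.fallback = fallback
        self.active_model = primary
        self._fallback_sticky = False  # assumed to flip once fallback activates

    def _activate_fallback(self):
        # Assumption: activating the fallback also pins it for the session.
        self.active_model = self.fallback
        self._fallback_sticky = True

    def _restore_primary_runtime(self):
        # Same guard as the excerpt above: a sticky session never restores.
        if getattr(self, "_fallback_sticky", False):
            return False
        self.active_model = self.primary
        return True

    def reset(self):
        # Stands in for /new or /reset, which clear the sticky state.
        self._fallback_sticky = False
        self.active_model = self.primary
```

In this sketch, once _activate_fallback() has run, every call to _restore_primary_runtime() returns False until reset() is called, which matches the session-long persistence described in the issue.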

Why This Matters

  1. Performance optimization — Avoids a "3 retries → fallback" cycle on every turn
  2. Cost predictability — Prevents each message from re-triggering the full retry + fallback flow during brief primary model outages
  3. Predictability — Users know that once fallback activates, the session uses a single consistent model

Proposed Enhancement

Add a sticky option to the fallback_model configuration in config.yaml:

fallback_model:
  provider: deepseek
  model: deepseek-v4-flash
  base_url: https://api.deepseek.com/v1
  sticky: true  # Default: true, backward compatible
  • sticky: true — Current behavior: fallback persists for the entire session
  • sticky: false — Each turn attempts to restore the primary model (useful for networks with frequent fluctuations)
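
One possible way to read the proposed option is sketched here, assuming config.yaml has already been parsed into a plain dict. The function name resolve_fallback_sticky is hypothetical, and the strict type check is an illustrative choice rather than something the issue specifies.

```python
def resolve_fallback_sticky(config):
    """Return the sticky setting, defaulting to True when it is absent.

    `config` is assumed to be the already-parsed config.yaml as a dict.
    """
    fallback_cfg = config.get("fallback_model") or {}
    value = fallback_cfg.get("sticky", True)  # default keeps current behavior
    if not isinstance(value, bool):
        raise ValueError("fallback_model.sticky must be true or false")
    return value
```

Defaulting to True when the key or the whole fallback_model section is missing means existing configurations keep the current session-long fallback.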

Implementation Notes

  • Default to true for backward compatibility
  • Only requires changing the initial value of _fallback_sticky from a hardcoded False to reading config.fallback_model.sticky
  • The gateway layer's _session_fallback_sticky initialization logic must be updated in the same change so that both layers honor the setting
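
As a rough sketch of the gateway side, the per-session state could honor the same setting as shown below. Only the name _session_fallback_sticky comes from the issue; the GatewaySketch class, its methods, and the use of a dict keyed by session id are assumptions made for illustration and may not match gateway/run.py.

```python
class GatewaySketch:
    """Hypothetical gateway; only _session_fallback_sticky is from the issue."""

    def __init__(self, sticky_enabled=True):
        # sticky_enabled would come from fallback_model.sticky (default True).
        self.sticky_enabled = sticky_enabled
        self._session_fallback_sticky = {}

    def note_fallback(self, session_id):
        # Remember the fallback only when stickiness is enabled.
        if self.sticky_enabled:
            self._session_fallback_sticky[session_id] = True

    def should_start_on_fallback(self, session_id):
        # Consulted when an agent instance is recreated for a session.
        return self._session_fallback_sticky.get(session_id, False)

    def reset_session(self, session_id):
        # Stands in for /new or /reset.
        self._session_fallback_sticky.pop(session_id, None)
```

With sticky_enabled set to False, note_fallback() records nothing, so a recreated agent would start on the primary model again.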

References

  • run_agent.py: _restore_primary_runtime(), _fallback_sticky
  • gateway/run.py: _session_fallback_sticky
