hermes: Auxiliary fallback chain should reuse fallback_providers, not maintain a separate hardcoded list


The Design Issue

Hermes maintains two parallel fallback systems that don't know about each other:

User's fallback_providers (config.yaml):       Hardcoded aux fallback chain:
┌────────────────────────────────┐             ┌───────────────────────────────┐
│ 1. nvidia/ring-2.6-1t:free     │             │ 1. openrouter →               │
│ 2. nvidia/deepseek-v4-pro      │  ←ignored→  │    google/gemini-3-flash-      │
│ 3. nvidia/glm-5.1              │  each other │    preview (PAID)             │
│ 4. nvidia/minimax-m2.7         │             │ 2. nous → same paid model     │
│ ...                            │             │ 3. custom                     │
└────────────────────────────────┘             │ 4. api-key providers          │
         ↑                                     └───────────────────────────────┘
   Used by main agent                                    ↑
                                                Used by every aux task
                                                (compression, vision, title,
                                                 web_extract, curator, etc.)

When the main provider fails, the main agent walks the user's fallback_providers. But every auxiliary task (compression, title_generation, vision, web_extract, session_search, skills_hub, approval, mcp, triage_specifier, curator) walks a separate, hardcoded fallback chain with hardcoded default models, most of them paid (for example google/gemini-3-flash-preview, claude-haiku-4-5, glm-4.5-flash).

This is a violation of the single-source-of-truth principle: the user configured their fallback chain in one place, and most users assume that's the only place the agent will look.
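The disconnect can be pictured with a toy model. This is a sketch, not Hermes code: the function names main_agent_candidates and aux_candidates are hypothetical, and the lists are copied from the diagram above.

```python
from typing import List

# Hypothetical stand-ins; the values are taken from the diagram above.
USER_FALLBACK_PROVIDERS: List[str] = [
    "nvidia/ring-2.6-1t:free",
    "nvidia/deepseek-v4-pro",
    "nvidia/glm-5.1",
    "nvidia/minimax-m2.7",
]
HARDCODED_AUX_CHAIN: List[str] = ["openrouter", "nous", "local/custom", "api-key"]


def main_agent_candidates(user_fallbacks: List[str]) -> List[str]:
    """The main agent walks the list the user configured."""
    return list(user_fallbacks)


def aux_candidates(user_fallbacks: List[str]) -> List[str]:
    """Aux tasks walk the hardcoded chain; the user's list is never consulted."""
    del user_fallbacks  # deliberately unused, mirroring the reported behaviour
    return list(HARDCODED_AUX_CHAIN)
```

Whatever the user puts in fallback_providers, aux_candidates returns the same hardcoded chain, which is the behaviour this issue describes.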

Why This Matters

Free-tier users who explicitly configure only :free models in fallback_providers still get charged (or hit per-key spend limits) because aux tasks invisibly use a paid default. The user has no way to discover this short of reading the source code.

Even paid users are affected: a user who carefully picked a budget-friendly fallback chain will see aux tasks silently use a more expensive model they didn't choose.

Code Evidence

agent/auxiliary_client.py:1823-1841:

def _get_provider_chain() -> List[tuple]:
    return [
        ("openrouter", _try_openrouter),
        ("nous", _try_nous),
        ("local/custom", _try_custom_endpoint),
        ("api-key", _resolve_api_key_provider),
    ]

agent/auxiliary_client.py:391-392:

_OPENROUTER_MODEL = "google/gemini-3-flash-preview"   # paid
_NOUS_MODEL = "google/gemini-3-flash-preview"          # paid

plugins/model-providers/*/__init__.py: every provider's default_aux_model is a paid model.

Nowhere in this fallback path does the code read the user's fallback_providers from config.

Existing Comment Acknowledges Step 1 But Not Step 2

agent/auxiliary_client.py:2451-2457:

"auto" means "use my main chat model for side tasks as well" — no surprise switches to a cheap fallback model for side tasks.

The comment frames Step 1 (use the main chat model for aux tasks) as preventing surprise model switches, but the surprise paid-model switch in Step 2 (the hardcoded fallback chain used when Step 1 fails) goes unaddressed.

Proposed Fix

When Step 1 (the main provider) fails for an aux task, walk the user's fallback_providers list, in the same order and with the same models the user picked, instead of the hardcoded aux chain. The hardcoded chain remains only as a last-resort default for users with no fallback_providers configured.

def _resolve_fallback(failed_provider, task):
    # 1. Honor the user's fallback_providers first, in configured order.
    for entry in user_fallback_providers:
        if entry.provider == failed_provider:
            continue  # skip the provider that just failed
        client = try_build_client(entry.provider, entry.model)
        if client:
            return client, entry.model

    # 2. Use the hardcoded chain only if the user configured no fallbacks.
    if not user_fallback_providers:
        return _try_payment_fallback(failed_provider, task)

    # Fallbacks were configured but none could be built: give up rather than
    # silently switching to a model the user did not choose.
    return None

This makes fallback_providers the single source of truth for the entire agent (main + aux), and respects users who deliberately picked free-only models.
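For anyone who wants to experiment with the idea outside the Hermes codebase, here is a self-contained sketch of the same logic. Everything in it is an assumption made for illustration: FallbackEntry, resolve_aux_fallback, and the two injected callables (build_client, hardcoded_chain) are hypothetical names, and the (provider, model) entry shape is inferred from the pseudocode above rather than taken from the real config loader.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class FallbackEntry:
    """One fallback_providers entry (shape assumed from the pseudocode)."""
    provider: str
    model: str


# Hypothetical signatures: a builder returns a client object or None.
ClientBuilder = Callable[[str, str], Optional[object]]
HardcodedChain = Callable[[str, str], Optional[Tuple[object, str]]]


def resolve_aux_fallback(
    failed_provider: str,
    task: str,
    user_fallbacks: Sequence[FallbackEntry],
    build_client: ClientBuilder,
    hardcoded_chain: HardcodedChain,
) -> Optional[Tuple[object, str]]:
    """Return (client, model) for an aux task, preferring the user's choices."""
    for entry in user_fallbacks:
        if entry.provider == failed_provider:
            continue  # do not retry the provider that just failed
        client = build_client(entry.provider, entry.model)
        if client is not None:
            return client, entry.model

    if not user_fallbacks:
        # No user preference expressed: keep the existing default behaviour.
        return hardcoded_chain(failed_provider, task)

    # Fallbacks were configured but none are usable; do not switch silently.
    return None
```

Passing the builder and the hardcoded chain in as arguments keeps the sketch testable without real provider credentials. Returning None when every configured entry fails is a design choice: it surfaces the failure to the caller instead of quietly falling through to a paid default.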

Related

See #24029 for the specific symptom (free-only users getting billed via aux fallback). This issue addresses the underlying design.

Environment

  • Hermes Agent v0.13.0 (2026.5.7)
  • Affects all users with fallback_providers set + auxiliary.*.provider: auto
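
A quick way to see whether a given configuration matches the affected pattern is to inspect the parsed config. The helper below is a hypothetical sketch: affected_aux_tasks is not a Hermes function, and it assumes config.yaml has already been loaded into a plain dict with a top-level fallback_providers key and an auxiliary mapping of task name to settings.

```python
from typing import Any, Dict, List


def affected_aux_tasks(config: Dict[str, Any]) -> List[str]:
    """List aux tasks set to provider 'auto' while fallback_providers is set."""
    if not config.get("fallback_providers"):
        return []  # nothing configured, so nothing is being ignored
    auxiliary = config.get("auxiliary") or {}
    return sorted(
        task
        for task, settings in auxiliary.items()
        if isinstance(settings, dict) and settings.get("provider") == "auto"
    )
```

An empty result means no aux task in the config combines provider: auto with a configured fallback_providers list.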
