hermes: Per-auxiliary fallback_providers for independent task resilience

Feature Description

Each auxiliary task (vision, web_extract, compression, session_search, skills_hub, mcp, flush_memories) should accept an optional fallback_providers list, mirroring the existing model.fallback_providers pattern at the top level. When the primary provider:model fails, the auxiliary client would walk the fallback list in order before giving up.

Motivation

The auxiliary client currently offers two resolution modes, neither of which combines determinism with resilience:

  1. Explicit provider:model — deterministic model selection, zero fallback. If the chosen provider is down or rate-limited, the task fails silently. Users who need specific models for specific tasks (e.g., vision needs a multimodal model, compression wants a fast cheap model) have no resilience path.

  2. auto — delegates to a hardcoded resolution chain (_get_provider_chain() in agent/auxiliary_client.py): main provider -> OpenRouter -> Nous -> custom endpoint -> API-key providers. No user control over order, model selection, or which providers participate. Codex is deliberately excluded.

Users who want both deterministic model selection per-task AND cross-provider fallback have no way to achieve it today.

Proposed Solution

Extend the auxiliary task config blocks with an optional fallback_providers key:

auxiliary:
  vision:
    provider: opencode-go
    model: kimi-k2.6
    fallback_providers:
      - provider: openrouter
        model: google/gemini-3-flash
      - provider: opencode-zen
        model: minimax-m2.7
    timeout: 120

  web_extract:
    provider: opencode-zen
    model: minimax-m2.7
    fallback_providers:
      - provider: main
        model: ""
    timeout: 360

Each entry in fallback_providers mirrors the top-level model.fallback_providers format — provider (required), model (optional, defaults to provider default), and optionally base_url / api_key for custom endpoints.
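
The entry format described above could be normalized along these lines. This is a minimal sketch, not the project's code: FallbackEntry and parse_fallback_entry are hypothetical names, and treating an empty model string as "use the provider default" is an assumption drawn from the model: "" line in the example config.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class FallbackEntry:
    """One item of a fallback_providers list (hypothetical type)."""
    provider: str
    model: Optional[str] = None      # None means "use the provider default"
    base_url: Optional[str] = None   # only relevant for custom endpoints
    api_key: Optional[str] = None


def parse_fallback_entry(raw: Mapping[str, Any]) -> FallbackEntry:
    """Validate and normalize one raw config mapping."""
    provider = raw.get("provider")
    if not isinstance(provider, str) or not provider.strip():
        raise ValueError("fallback entry requires a non-empty 'provider'")

    def optional(key: str) -> Optional[str]:
        # Missing keys and empty strings both collapse to None (assumption).
        value = raw.get(key)
        if value is None or value == "":
            return None
        if not isinstance(value, str):
            raise ValueError(f"'{key}' must be a string")
        return value

    return FallbackEntry(
        provider=provider.strip(),
        model=optional("model"),
        base_url=optional("base_url"),
        api_key=optional("api_key"),
    )
```

Rejecting malformed entries when the config is loaded, rather than at call time, would keep a typo in a rarely used fallback from surfacing only during an outage.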

Fallback semantics:

  • Each auxiliary invocation tries its configured provider:model first
  • On failure (rate limits, server errors, auth failures — same triggers as main fallback), walk the fallback_providers list in order
  • If all entries are exhausted, fall through to the existing auto chain as a last resort
  • Per-call scope: each auxiliary invocation resets to the primary provider
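
The four rules above can be sketched as a single loop. This is an illustration under stated assumptions rather than the actual resolver: call_with_fallback, RetryableProviderError, and the invoke callback are hypothetical, and the real client would have to map rate-limit, server, and auth errors onto whatever exception types it already uses.

```python
from typing import Callable, Iterable, Optional, Sequence, Tuple

# (provider, model); a model of None means "provider default"
Target = Tuple[str, Optional[str]]


class RetryableProviderError(Exception):
    """Stand-in for rate-limit, server, and auth failures (hypothetical)."""


def call_with_fallback(
    primary: Target,
    fallbacks: Sequence[Target],
    invoke: Callable[[Target], str],
    auto_chain: Iterable[Target] = (),
) -> str:
    """Try the primary, then each fallback in order, then the auto chain."""
    errors = []
    # The candidate list is rebuilt on every call, so each invocation
    # starts from the primary again (per-call scope).
    for target in [primary, *fallbacks, *auto_chain]:
        try:
            return invoke(target)
        except RetryableProviderError as exc:
            errors.append(f"{target[0]}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Only RetryableProviderError triggers the next candidate; any other exception propagates immediately, so a bug in the request itself is not retried against every provider in the list.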

Implementation Notes

The change surface is relatively contained:

  1. agent/auxiliary_client.py: resolve_auxiliary_client() reads the per-task fallback_providers from config and passes them to a new resolver path. The existing _resolve_auto and _get_provider_chain remain untouched.
  2. hermes_cli/config.py: add fallback_providers as a recognized key under auxiliary.<task>, with schema validation (list of provider:model objects).
  3. hermes_cli/commands.py: optionally extend the hermes fallback interactive manager to cover auxiliary tasks.

Alternatives Considered

  1. Making the auto chain configurable: lets users reorder or redefine the auto providers globally. Rejected because it couples all auxiliaries to a single chain and can't express per-task differences (vision needs a multimodal model, compression doesn't).

  2. Simple fallback_model per auxiliary (single provider:model, no list): simpler but less flexible. The main model already evolved from fallback_model to fallback_providers for good reason (multiple ordered fallbacks). Repeating that evolution on auxiliaries would be wasteful.

  3. Status quo: explicit pinning with zero fallback. Viable for users who accept the single-provider risk, but suboptimal for production use.

Backward Compatibility

  • Absent fallback_providers -> current behaviour (single explicit provider, or auto chain)
  • Present fallback_providers + explicit provider -> try explicit first, then fallback list
  • Present fallback_providers + provider: auto -> try the main provider (the first step of the auto chain), then the fallback list, then the rest of the hardcoded chain

No existing configs break. It's a purely additive change.
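
The resulting attempt order for each case can be sketched as below. plan_attempts is a hypothetical helper, auto_chain stands for the existing hardcoded order with the main provider first, and the provider labels in the usage note are placeholders. Two details are assumptions on my part: appending the auto chain after an explicit provider's fallbacks (taken from the "last resort" rule under Fallback semantics) and skipping a provider that already appears earlier in the order.

```python
from typing import List, Optional, Sequence


def plan_attempts(
    provider: str,
    fallback_providers: Optional[Sequence[str]],
    auto_chain: Sequence[str],
) -> List[str]:
    """Return provider names in the order they would be tried."""
    fallbacks = list(fallback_providers or [])
    chain = list(auto_chain)
    if not fallbacks:
        # Key absent: current behaviour, either the auto chain or one provider.
        order = chain if provider == "auto" else [provider]
    elif provider == "auto":
        # Main provider, then the user's list, then the rest of the chain.
        order = chain[:1] + fallbacks + chain[1:]
    else:
        # Explicit provider, then the user's list, then auto as last resort.
        order = [provider] + fallbacks + chain
    # Drop repeats while keeping first-seen order (assumption).
    return list(dict.fromkeys(order))
```

With auto_chain = ["main", "openrouter", "nous"], a task pinned to opencode-go with no fallback_providers still resolves to ["opencode-go"] alone, which is the property that keeps existing configs unchanged.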


Filed by Aldous (AI agent on behalf of Magnus Hedemark)
