hermes: [Feature] Configurable fallback chains for auxiliary tasks (1 pull request)


Feature Description

Add configurable fallback chains for auxiliary tasks so that when the primary provider fails (quota exhausted, rate limit hit, or connection error), the system automatically tries alternative providers instead of failing silently.

Problem

Currently, auxiliary tasks such as vision, TTS, and compression each have a single provider. When that provider fails:

  • Vision: image analysis fails silently, and the agent loses visual context
  • Compression: context compaction fails, and conversation history is dropped without a summary
  • TTS: voice output fails, and the user gets no audio

See #26803 for a detailed analysis of the fallback chain gap in call_llm.

Proposed Config

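auxiliary:
  vision:
    # Primary provider, tried first
    provider: glm
    model: glm-4v-flash
    # Alternatives, tried in the order listed when the primary fails
    fallback_chain:
      - provider: openrouter
        model: google/gemini-3-flash-preview
      - provider: nous
        model: claude-sonnet-4
  compression:
    # No model is given for the primary in this proposal
    provider: openrouter
    fallback_chain:
      - provider: openai
        model: gpt-4o-mini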
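
To make the ordering in this config concrete, here is a minimal Python sketch of how one task's block could be flattened into an ordered list of attempts, primary first. It assumes the YAML has already been parsed into a plain dict. The helper name `build_attempts` and the `AUXILIARY` constant are hypothetical and do not come from the hermes codebase.

```python
from typing import Any, Optional

# Hypothetical helper, not part of hermes: turns one auxiliary task's config
# (already parsed from YAML into a dict) into an ordered list of
# (provider, model) attempts, with the primary provider first.
def build_attempts(task_cfg: dict[str, Any]) -> list[tuple[str, Optional[str]]]:
    attempts = [(task_cfg["provider"], task_cfg.get("model"))]
    # A missing or empty fallback_chain simply means "no fallbacks".
    for entry in task_cfg.get("fallback_chain") or []:
        attempts.append((entry["provider"], entry.get("model")))
    return attempts

# The proposed config from the issue, written as the dict a YAML parser
# would produce for the `auxiliary` key.
AUXILIARY = {
    "vision": {
        "provider": "glm",
        "model": "glm-4v-flash",
        "fallback_chain": [
            {"provider": "openrouter", "model": "google/gemini-3-flash-preview"},
            {"provider": "nous", "model": "claude-sonnet-4"},
        ],
    },
    "compression": {
        "provider": "openrouter",
        "fallback_chain": [
            {"provider": "openai", "model": "gpt-4o-mini"},
        ],
    },
}

if __name__ == "__main__":
    for task, cfg in AUXILIARY.items():
        print(task, build_attempts(cfg))
```

A model of `None` here only records that the config did not name one. How a provider picks its default model is left open by the proposal.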

Behavior

  1. Try primary provider first
  2. On fallback-worthy errors (HTTP 429 rate limit or quota exhaustion, HTTP 402 payment required, connection timeout, auth failure), try the next entry in fallback_chain
  3. If all entries fail, raise the last error
  4. Log which provider was actually used for observability
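
The four steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual call_llm code: `ProviderError`, `is_fallback_worthy`, `call_with_fallback`, and the choice of 401/403 to stand for "auth failure" are all hypothetical names and mappings introduced here.

```python
import logging
from typing import Callable, Optional, Sequence

logger = logging.getLogger("auxiliary")

class ProviderError(Exception):
    """Hypothetical error type carrying an HTTP-like status code."""
    def __init__(self, message: str, status: Optional[int] = None):
        super().__init__(message)
        self.status = status

# Assumed mapping of the issue's list: 429 (rate limit/quota), 402 (payment),
# and 401/403 standing in for "auth failure".
FALLBACK_STATUSES = {401, 402, 403, 429}

def is_fallback_worthy(exc: Exception) -> bool:
    # Built-in ConnectionError and TimeoutError stand in for connection problems.
    if isinstance(exc, (ConnectionError, TimeoutError)):
        return True
    return isinstance(exc, ProviderError) and exc.status in FALLBACK_STATUSES

def call_with_fallback(
    attempts: Sequence[tuple[str, Optional[str]]],
    call: Callable[[str, Optional[str]], str],
) -> tuple[str, str]:
    """Try each (provider, model) in order; return (result, provider_used)."""
    if not attempts:
        raise ValueError("at least one provider is required")
    last_error: Optional[Exception] = None
    for provider, model in attempts:
        try:
            result = call(provider, model)
        except Exception as exc:
            if not is_fallback_worthy(exc):
                raise  # other errors are not retried on another provider
            logger.warning("provider %s failed (%s); trying next", provider, exc)
            last_error = exc
            continue
        # Step 4: record which provider actually served the request.
        logger.info("auxiliary call served by %s (model=%s)", provider, model)
        return result, provider
    assert last_error is not None
    raise last_error  # Step 3: every entry failed, so surface the last error
```

Returning the provider alongside the result is one way to make the "which provider was used" signal available to callers as well as to the log. Errors outside the fallback-worthy set propagate immediately, so a malformed request is not replayed against every provider in the chain.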

This extends the existing fallback logic in call_llm (currently gated on is_auto) to work with explicitly configured providers.

Related

  • #26803 (fallback doesn't trigger on rate limits)
  • #26827 (GLM vision fails silently)
