hermes - [Feature]: Multi-provider credential pools for cross-provider failover and rotation

hermes 2026-04-17 18:43:23


NousResearch/hermes-agent#11737•Fetched 2026-04-18 05:59:07


Author

brimdor

Participants

brimdor


Problem or Use Case

Hermes credential pools are currently scoped to a single provider. Each pool key (e.g., "custom:local", "openai-codex", "copilot") maps to credentials that connect to one API endpoint. When all credentials in a pool are exhausted (rate-limited, billing-quota-hit, or auth-failed), the agent errors out even if other providers have available capacity with compatible models.

This is painful for users who:

Pay for multiple providers (OpenRouter, OpenAI, Anthropic, custom endpoints) and want to use them interchangeably for the same model family (e.g., GPT-5.4 via OpenRouter vs. OpenAI direct).
Run custom providers that serve the same model through different gateways (e.g., ollama.com/v1 and a local Ollama instance both serving glm-5.1) and want automatic failover between them.
Want cost optimization by routing to the cheapest available provider for a given model without manual config changes.
Need resilience — when one provider goes down or rate-limits, seamlessly fall through to the next provider serving a compatible model.

Currently, fallback_model/fallback_providers provides provider-level failover, but it requires the agent to fully fail before trying the next provider. It does not pool credentials across providers for smart selection within a single turn. The provider_routing config (only/ignore/order) restricts which providers OpenRouter uses — it does not bridge across different provider types.

What is needed is a way to define a multi-provider credential pool that groups credentials from different providers (potentially with different base_url, api_mode, and api_key values) into a single failover-capable pool that the CredentialPool class can rotate through.

Proposed Solution

Config-level: Add `pools` section to `config.yaml`

pools:
  openai-compatible:
    strategy: round_robin      # fill_first | round_robin | random | least_used
    entries:
      - provider: openai-codex
        label: "Codex Plus"
        # Uses pool entry from auth.json (openai-codex)

      - provider: custom
        base_url: https://api.openai.com/v1
        api_key_env: OPENAI_API_KEY_DIRECT
        label: "OpenAI Direct"
        api_mode: chat_completions

      - provider: custom
        base_url: https://ollama.com/v1
        pool: custom:local      # Reference existing custom:local pool from auth.json
        label: "Ollama Cloud"
        api_mode: chat_completions

  anthropic-compatible:
    strategy: fill_first
    entries:
      - provider: anthropic
        # Uses pool entry from auth.json (anthropic)

      - provider: custom
        base_url: https://opencode.ai/zen/go/v1
        pool: custom:opencode-go
        label: "OpenCode Go"
        api_mode: anthropic_messages
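
A minimal sketch of how a `pools` section like this might be validated once the YAML has been loaded into a dict. `PoolEntry`, `PoolDefinition`, and `parse_pools` are hypothetical names, not existing Hermes code, and the `fill_first` default is an assumption. Only the field names and strategy values come from the proposal.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

# Strategy names listed in the proposal's config comment.
VALID_STRATEGIES = {"fill_first", "round_robin", "random", "least_used"}


@dataclass
class PoolEntry:
    """One entry of a multi-provider pool (hypothetical structure)."""
    provider: str
    label: Optional[str] = None
    base_url: Optional[str] = None
    api_key_env: Optional[str] = None
    pool: Optional[str] = None       # reference to an existing auth.json pool key
    api_mode: Optional[str] = None


@dataclass
class PoolDefinition:
    name: str
    strategy: str
    entries: list[PoolEntry] = field(default_factory=list)


def parse_pools(config: dict[str, Any]) -> dict[str, PoolDefinition]:
    """Validate the `pools` section of an already-loaded config dict."""
    pools: dict[str, PoolDefinition] = {}
    for name, raw in (config.get("pools") or {}).items():
        strategy = raw.get("strategy", "fill_first")  # default is an assumption
        if strategy not in VALID_STRATEGIES:
            raise ValueError(f"pool {name!r}: unknown strategy {strategy!r}")
        # Unknown keys in an entry raise TypeError here, which surfaces typos early.
        entries = [PoolEntry(**entry) for entry in raw.get("entries", [])]
        if not entries:
            raise ValueError(f"pool {name!r}: at least one entry is required")
        pools[name] = PoolDefinition(name, strategy, entries)
    return pools


# Dict form of the anthropic-compatible pool from the proposal.
example_config = {
    "pools": {
        "anthropic-compatible": {
            "strategy": "fill_first",
            "entries": [
                {"provider": "anthropic"},
                {
                    "provider": "custom",
                    "base_url": "https://opencode.ai/zen/go/v1",
                    "pool": "custom:opencode-go",
                    "label": "OpenCode Go",
                    "api_mode": "anthropic_messages",
                },
            ],
        }
    }
}

parsed = parse_pools(example_config)
```

Rejecting unknown strategies and empty pools at load time would keep misconfiguration from showing up later as a runtime failover error.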

Model config integration

model:
  default: gpt-5.4
  provider: pool:openai-compatible   # Route through a multi-provider pool

  # Or, for single-model failover:
  fallback_providers:
    - provider: pool:openai-compatible
      model: gpt-5.4
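
For illustration, the `pool:<name>` provider string could be recognized with a small helper like the one below. `parse_pool_reference` and `POOL_PREFIX` are hypothetical names. The sketch assumes that existing keys such as `custom:local` must keep resolving as ordinary single-provider pools.

```python
from typing import Optional

POOL_PREFIX = "pool:"


def parse_pool_reference(provider: str) -> Optional[str]:
    """Return the pool name for 'pool:<name>' strings, else None."""
    if not provider.startswith(POOL_PREFIX):
        return None  # ordinary provider key, e.g. "openai-codex" or "custom:local"
    name = provider[len(POOL_PREFIX):].strip()
    if not name:
        raise ValueError("pool reference is missing a name")
    return name
```

With this shape, `parse_pool_reference("pool:openai-compatible")` gives the pool name, while any other provider string falls through to the existing resolution path.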

Runtime behavior

When provider resolves to a pool:<name> reference, resolve_runtime_provider() would:

Load the named pool definition from config
Iterate entries using the pool strategy (round_robin, least_used, etc.)
For each entry, resolve credentials (pool reference from auth.json, env var, or explicit key)
If the selected entry fails with an auth/rate-limit error, mark it exhausted and rotate to the next entry
Return the full runtime dict (provider, api_key, base_url, api_mode, credential_pool) just like a single-provider pool does today
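
The credential-resolution step above could look roughly like this. It is a sketch under stated assumptions: `resolve_api_key` is a hypothetical helper, `auth_pools` is a simplified stand-in for whatever auth.json actually stores (here a mapping from pool key to a list of keys), and the precedence order (explicit key, then env var, then pool reference) is a guess, since the proposal only lists the three sources.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    entry: Mapping[str, str],
    auth_pools: Mapping[str, list[str]],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Resolve an entry's key: explicit key, then env var, then pool reference.

    The precedence order is an assumption, not confirmed Hermes behavior.
    """
    env = os.environ if env is None else env
    if entry.get("api_key"):
        return entry["api_key"]
    env_name = entry.get("api_key_env")
    if env_name and env.get(env_name):
        return env[env_name]
    # Entries without an explicit `pool` fall back to the provider's own pool key.
    pool_key = entry.get("pool") or entry.get("provider")
    keys = auth_pools.get(pool_key, [])
    return keys[0] if keys else None
```

Returning None rather than raising lets the caller treat an unresolvable entry the same way as an exhausted one and move on to the next entry.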

Implementation approach

The core change is in hermes_cli/runtime_provider.py — when resolve_runtime_provider() receives a pool:<name> provider string:

Load the pool definition from config
Create a MultiProviderPool that wraps multiple CredentialPool instances (or inline entries)
MultiProviderPool.select() iterates entries using the configured strategy
On mark_exhausted_and_rotate(), advance to the next entry in the pool
Each entry contributes a fully resolved (provider, base_url, api_key, api_mode) tuple

The existing CredentialPool class stays unchanged — single-provider pools continue working as before. MultiProviderPool is additive.
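
One way the four strategy names could map to entry selection is sketched below. `pick_entry` is a hypothetical function, and the semantics given to each strategy are my reading of the names, not confirmed behavior of the existing CredentialPool.

```python
import random
from typing import Optional, Sequence


def pick_entry(
    strategy: str,
    available: Sequence[int],
    use_counts: Sequence[int],
    last_index: Optional[int] = None,
    rng: Optional[random.Random] = None,
) -> int:
    """Pick one index from `available` (sorted indices of non-exhausted entries)."""
    if not available:
        raise LookupError("all pool entries are exhausted")
    if strategy == "fill_first":
        # Stay on the earliest entry until it is exhausted.
        return available[0]
    if strategy == "round_robin":
        # Next available index after the last one used, wrapping around.
        if last_index is None:
            return available[0]
        later = [i for i in available if i > last_index]
        return later[0] if later else available[0]
    if strategy == "random":
        return (rng or random).choice(list(available))
    if strategy == "least_used":
        # Ties resolve to the earliest entry.
        return min(available, key=lambda i: use_counts[i])
    raise ValueError(f"unknown strategy: {strategy!r}")
```

Keeping selection as a pure function of the available indices and usage counts would make each strategy easy to unit test apart from any network code.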

Key challenges

Model compatibility: A pool entry should declare which models it serves, so the agent does not route a gpt-5.4 request to an Anthropic-only endpoint. This could reuse the existing custom_providers.models config pattern.
api_mode bridging: Entries in the same pool may need different api_mode values (e.g., chat_completions for OpenAI-compatible, anthropic_messages for Claude endpoints). The pool entry must carry its own api_mode.
Error classification: Not all errors should trigger pool rotation — only auth (401/403), rate-limit (429), and billing/quota (402) errors. Server errors (500/502/503) should use retry logic instead.
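
The error-classification rule above can be written as a small lookup. `ErrorAction` and `classify_status` are hypothetical names. Mapping every other status to `RAISE` is an assumption, since the proposal only covers the listed codes.

```python
from enum import Enum


class ErrorAction(Enum):
    ROTATE = "rotate"   # mark the entry exhausted and move to the next one
    RETRY = "retry"     # retry the same entry, e.g. with backoff
    RAISE = "raise"     # surface the error to the caller (assumed default)


# Auth (401/403), billing/quota (402), and rate-limit (429) errors.
ROTATE_STATUSES = frozenset({401, 402, 403, 429})
# Server errors that should use retry logic instead of rotation.
RETRY_STATUSES = frozenset({500, 502, 503})


def classify_status(status_code: int) -> ErrorAction:
    """Map an HTTP status code to the action the pool should take."""
    if status_code in ROTATE_STATUSES:
        return ErrorAction.ROTATE
    if status_code in RETRY_STATUSES:
        return ErrorAction.RETRY
    return ErrorAction.RAISE
```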

Alternatives Considered

Use fallback_providers for cross-provider failover — This already exists but requires a full agent failure before trying the next provider. It does not pre-emptively rotate credentials or do smart selection. It is also a separate config that does not integrate with pool strategies.
Multiple custom_providers entries pointing to different providers — The current custom_providers list only defines endpoint metadata and model lists. It does not create a failover pool. Each custom provider still resolves to one pool key in auth.json.
Write a custom hook that catches errors and re-resolves — The hook system (gateway/hooks.py) fires agent:end events but cannot modify the in-flight request or switch providers mid-turn.
Shell script wrapper — An external tool cannot inject credentials into a running Hermes agent mid-session.

Feature Type

Gateway / messaging improvement

Scope

Medium (2-3 files, ~300-500 lines)

Additional Context

I investigated the full credential pool and runtime provider resolution pipeline:

agent/credential_pool.py — CredentialPool class is provider-scoped (keyed by provider name like "openai-codex", "custom:local")
agent/credential_pool.py:346 — get_pool_strategy() reads credential_pool_strategies from config, accepts fill_first, round_robin, random, least_used
hermes_cli/runtime_provider.py:704 — resolve_runtime_provider() loads a single pool via load_pool(provider) and returns a single credential_pool reference
hermes_cli/runtime_provider.py:170-230 — _resolve_runtime_from_pool_entry() builds the runtime dict with provider, base_url, api_key, api_mode, credential_pool
run_agent.py:611-1088 — AIAgent.__init__ accepts fallback_model (single dict or list of provider dicts), but this is a separate failover chain, not pool-integrated
Existing provider_routing config only controls OpenRouter routing preferences, not cross-provider pooling

The custom:local pool in our deployment uses round_robin strategy across 4 Ollama API keys, which works great for same-provider key rotation. Extending this concept across providers is the natural next step.

extent analysis

TL;DR

Implement a MultiProviderPool class that allows credential pooling across different providers, enabling smart selection and failover within a single turn.

Guidance

Update hermes_cli/runtime_provider.py: Modify resolve_runtime_provider() to handle pool:<name> provider strings by loading the pool definition from config and creating a MultiProviderPool instance.
Create MultiProviderPool class: Design a class that wraps multiple CredentialPool instances and iterates entries using a configured strategy (e.g., round-robin, least-used).
Integrate with config.yaml: Add a pools section to config.yaml to define multi-provider credential pools, including strategies and entries with provider-specific details.
Implement error classification: Distinguish between errors that should trigger pool rotation (auth, rate-limit, billing/quota) and those that require retry logic (server errors).
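
To show how rotation and retry might interact in one request loop, here is a rough sketch. `call_with_failover` and the `send` callable (which returns an HTTP status code for an entry) are hypothetical, and moving to the next entry once retries run out is an assumption the issue does not settle.

```python
import time
from typing import Callable, Sequence

ROTATE = {401, 402, 403, 429}   # auth, billing/quota, rate-limit
RETRY = {500, 502, 503}         # transient server errors


def call_with_failover(
    entries: Sequence[str],
    send: Callable[[str], int],
    max_retries: int = 2,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return the first entry whose request succeeds.

    `send(entry)` is a hypothetical callable returning an HTTP status code.
    """
    for entry in entries:
        for attempt in range(max_retries + 1):
            status = send(entry)
            if 200 <= status < 300:
                return entry
            if status in RETRY and attempt < max_retries:
                sleep(base_delay * 2 ** attempt)  # exponential backoff, same entry
                continue
            if status in ROTATE or status in RETRY:
                break  # entry exhausted (or retries used up): try the next entry
            raise RuntimeError(f"unrecoverable status {status} from {entry}")
    raise RuntimeError("all pool entries failed")
```

Passing `sleep` as a parameter keeps the backoff testable without real delays.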

Example

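class MultiProviderPool:
    """Wraps entries from several providers and rotates between them."""

    def __init__(self, pool_name, strategy, entries):
        self.pool_name = pool_name
        self.strategy = strategy        # stored only; this sketch always scans in order
        self.entries = list(entries)    # each: provider, base_url, api_key, api_mode
        self.exhausted = set()          # indices of entries marked exhausted
        self.current = None             # index returned by the last select()

    def select(self):
        # Return the first non-exhausted entry as a full runtime dict.
        for index, entry in enumerate(self.entries):
            if index not in self.exhausted:
                self.current = index
                return {**entry, "credential_pool": self}
        raise RuntimeError(f"pool {self.pool_name!r} has no usable entries")

    def mark_exhausted_and_rotate(self):
        # Mark the current entry exhausted, then move on to the next one.
        if self.current is not None:
            self.exhausted.add(self.current)
        return self.select()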

Notes

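Each pool entry should declare which models it serves, so that a request is only routed to entries that support the requested model.
Each entry must carry its own api_mode, since one pool may mix chat_completions and anthropic_messages endpoints.
Only auth (401/403), rate-limit (429), and billing/quota (402) errors should trigger pool rotation; server errors (500/502/503) should be retried instead.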
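
As a sketch of the model-compatibility check, entries could be filtered by a declared model list before any strategy is applied. `entries_serving` and the per-entry `models` key are hypothetical. The issue only suggests reusing the custom_providers.models pattern, and skipping entries that declare nothing is an assumption.

```python
from typing import Any, Iterable, Mapping


def entries_serving(
    entries: Iterable[Mapping[str, Any]],
    model: str,
    allow_undeclared: bool = False,
) -> list[Mapping[str, Any]]:
    """Keep entries whose `models` list contains `model`.

    Entries without a `models` list are skipped unless `allow_undeclared` is set.
    """
    compatible = []
    for entry in entries:
        models = entry.get("models")
        if models is None:
            if allow_undeclared:
                compatible.append(entry)
        elif model in models:
            compatible.append(entry)
    return compatible
```

Filtering first means a gpt-5.4 request never reaches an entry that only lists other models, regardless of which strategy picks among the remaining entries.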

Recommendation

Apply the proposed solution by implementing the MultiProviderPool class and updating hermes_cli/runtime_provider.py to support multi-provider credential pools. This will enable smart selection and failover across different providers, improving resilience and cost optimization.
