hermes - ✅ (Solved) Fix single-key credential pool: rate-limit cooldown causes init-time RuntimeError, fallback chain never tried [1 pull request, 1 comment, 2 participants]

hermes • 2026-04-30 11:53:04

GitHub stats

NousResearch/hermes-agent#17929 • Fetched 2026-05-01 05:55:03

Author: leslieyeo
Participants: getrealhope, leslieyeo
Timeline (top): labeled ×3, cross-referenced ×2, commented ×1

When a provider's credential_pool contains only one entry and that entry is currently in 429-cooldown (marked exhausted by mark_exhausted_and_rotate), AIAgent.__init__ raises RuntimeError before the fallback chain is constructed. As a result, every fresh agent (cron jobs, new gateway sessions, etc.) fails with a misleading "no API key was found" message even though both:

the credential exists in auth.json (status exhausted, not deleted), and
a valid fallback_providers entry is configured and has working credentials.

The eager-fallback safety at run_agent.py:11880 only protects in-flight requests, not init.

Error Message

Init raises immediately. The user-facing error suggests setting an env var that — by design — is not where Hermes stores this provider's key, leading to misdiagnosis (the user opens .env, sees no ALIBABA_CODING_PLAN_API_KEY, and concludes the key is gone).

Root Cause

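resolve_provider_client returns None when the pool has entries but none is currently available, and AIAgent.__init__ treats that the same as a missing key: it raises before the fallback chain is tried, even though both:

the credential exists in auth.json (status exhausted, not deleted), and
a valid fallback_providers entry is configured and has working credentials.

The eager-fallback safety at run_agent.py:11880 only protects in-flight requests, not init.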
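
To make the "exhausted, not deleted" distinction concrete, here is a minimal sketch of a credential pool with a cooldown. All names in it (PoolEntry, CredentialPool, cooldown_until, select, diagnose) are hypothetical and are not taken from the Hermes source; it only illustrates how a pool can hold a key and still hand back nothing to the caller.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PoolEntry:
    """Hypothetical stand-in for one stored credential."""
    api_key: str
    cooldown_until: float = 0.0  # epoch seconds; 0 means never rate-limited

    def available(self, now: float) -> bool:
        return now >= self.cooldown_until


@dataclass
class CredentialPool:
    """Hypothetical pool; the real Hermes pool is not reproduced here."""
    entries: List[PoolEntry] = field(default_factory=list)

    def mark_exhausted(self, entry: PoolEntry, now: float, ttl: float) -> None:
        # A 429 puts the entry into cooldown; it is NOT removed from the pool.
        entry.cooldown_until = now + ttl

    def select(self, now: float) -> Optional[PoolEntry]:
        # Returns None both for an empty pool and for an all-exhausted pool.
        return next((e for e in self.entries if e.available(now)), None)

    def diagnose(self, now: float) -> str:
        if not self.entries:
            return "missing"
        if self.select(now) is None:
            return "exhausted"
        return "ok"


now = 1_000.0
pool = CredentialPool([PoolEntry(api_key="sk-example")])
pool.mark_exhausted(pool.entries[0], now=now, ttl=3 * 3600)

print(pool.select(now))               # None: looks like "no key" to the caller
print(pool.diagnose(now))             # exhausted
print(pool.diagnose(now + 4 * 3600))  # ok
```

A caller that only checks whether select returned None cannot tell "missing" from "exhausted", which is the ambiguity behind the misleading error message.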

Fix Action

Fix / Workaround

Patch outline (working locally): see the loop under Proposed fix below.

PR fix notes

PR #17958: fix(agent): try fallback providers at init when primary credential pool is exhausted

Repository: NousResearch/hermes-agent
Author: luyao618
State: open | merged: False
Link: https://github.com/NousResearch/hermes-agent/pull/17958

Description (problem / solution / changelog)

Summary

When a provider's credential_pool contains only one entry and that entry is in 429-cooldown, resolve_provider_client returns None and AIAgent.__init__ raises a misleading RuntimeError suggesting the API key is missing — even when valid fallback_providers are configured.

This patch makes __init__ iterate the fallback chain before raising, mirroring the existing in-flight fallback logic in the request loop. If a fallback resolves, the agent initializes against it and sets _fallback_activated=True so _restore_primary_runtime can pick the primary back up after cooldown.

Changes

run_agent.py (~1465): Before raising RuntimeError for missing credentials, iterate fallback_model entries and try resolve_provider_client on each. If one resolves, use it as the effective primary.
run_agent.py (~1560): Preserve _fallback_activated flag set during init-time fallback (was unconditionally reset to False).
run_agent.py (~1501): Guard the "No provider configured" raise to skip when fallback was activated.
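
The second and third changes are easy to get wrong, so a minimal sketch may help. It assumes a hypothetical ToyAgent whose __init__ has an early credential step and a later state-setup step; only the _fallback_activated name and the "No provider configured" wording come from the PR description, and the preserve_flag switch exists purely to show the before/after behavior.

```python
class ToyAgent:
    """Hypothetical agent; not the real AIAgent."""

    def __init__(self, primary_ok: bool, fallback_ok: bool, preserve_flag: bool = True):
        fallback_activated = False
        primary_client = "primary-client" if primary_ok else None
        self.client = primary_client

        # Early step: init-time fallback when the primary did not resolve.
        if primary_client is None and fallback_ok:
            self.client = "fallback-client"
            fallback_activated = True

        # Guarded raise: fail only when the primary is missing AND no
        # fallback took over.
        if primary_client is None and not fallback_activated:
            raise RuntimeError("No provider configured")

        # Later step: state setup. Resetting the flag unconditionally here
        # would erase the init-time fallback, so carry the earlier value forward.
        self._fallback_activated = fallback_activated if preserve_flag else False


agent = ToyAgent(primary_ok=False, fallback_ok=True)
print(agent.client, agent._fallback_activated)  # fallback-client True

lossy = ToyAgent(primary_ok=False, fallback_ok=True, preserve_flag=False)
print(lossy._fallback_activated)  # False: the fallback state is lost
```

With preserve_flag=False the agent still runs on the fallback client, but later code that consults the flag would have no record that a fallback is active.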

Test plan

Added tests/run_agent/test_init_fallback_on_exhausted_pool.py with two tests:
1. test_init_tries_fallback_when_primary_returns_none — verifies agent initializes with fallback provider
2. test_init_raises_when_no_fallback_configured — verifies original error is preserved when no fallback exists
All 1166 existing tests/run_agent/ tests pass

Closes #17929

Changed files

run_agent.py (modified, +42/-10)
tests/run_agent/test_init_fallback_on_exhausted_pool.py (added, +69/-0)

RAW_BUFFER

Affected versions

hermes-agent HEAD 2d137074a (refactor(config): add cfg_get() helper; migrate 20 nested-get call sites (#17304)). The misleading branch dates back to whenever the explicit "fail fast instead of silently routing through OpenRouter" behavior was added.

Repro

Configure a single-key pool for any provider with daily-quota rate limits, e.g. alibaba-coding-plan:

model:
  default: qwen3.6-plus
  provider: alibaba-coding-plan
  fallback_providers:
    - provider: tencent-token-plan
      model: kimi2.5

Drive enough traffic through cron to hit "HTTP 429: usage allocated quota exceeded".
After the 429, agent.credential_pool correctly logs "marking coding plan key exhausted (status=429), rotating" and "credential pool: no available entries (all exhausted or empty)".

Every subsequent fresh AIAgent(...) (the next cron firing, the next gateway message that builds a new agent) fails with:

RuntimeError: Provider 'alibaba-coding-plan' is set in config.yaml but no API key was found.
Set the ALIBABA_CODING_PLAN_API_KEY environment variable, or switch to a different provider with `hermes model`.

Real call site: run_agent.py:1441-1445. The else: branch executes when resolve_provider_client returns None, which happens cleanly when the pool has entries but none are currently available().

Expected behavior

When a fallback chain is configured, init-time credential exhaustion should not abort. The agent should construct a client against the first usable fallback entry and surface an _emit_status notification ("⚠️ Primary rate-limited, using fallback X/Y").

Observed behavior

Proposed fix

In run_agent.py:__init__, before raising the "no API key was found" RuntimeError, iterate the fallback_model argument and try resolve_provider_client for each entry. If any resolves, use it as the effective primary client and set self._fallback_activated = True so the existing _restore_primary_runtime machinery can pick the primary back up after cooldown.

Patch outline (working locally):

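# Before the existing `raise RuntimeError(...)` for missing creds:
# fallback_model may be a single dict or a list of dicts; normalize to a list.
for fb in (fallback_model if isinstance(fallback_model, list) else [fallback_model]):
    if not isinstance(fb, dict):
        continue  # skips None (no fallback configured) and malformed entries
    fb_client, fb_model = resolve_provider_client(
        fb["provider"], model=fb["model"], raw_codex=True,
        explicit_base_url=fb.get("base_url"), explicit_api_key=fb.get("api_key"),
    )
    if fb_client is None:
        continue  # this fallback has no usable credential either; try the next
    # The first fallback that resolves becomes the effective primary.
    self.provider = fb["provider"]
    self.model = fb_model or fb["model"]
    self._fallback_activated = True  # lets _restore_primary_runtime switch back later
    client_kwargs = {"api_key": fb_client.api_key, "base_url": str(fb_client.base_url)}
    break
else:
    # Loop finished without `break`: no fallback resolved, keep the original error.
    raise RuntimeError(...)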
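
The outline above depends on AIAgent internals, so it cannot be run in isolation. The following is a self-contained sketch of the same idea that can be run as-is, under stated assumptions: pick_init_provider, fake_resolve, the two-argument resolver signature, and the final error text are all hypothetical stand-ins, not the real resolve_provider_client API. It returns from inside the loop instead of using for/else, which has the same effect here.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple, Union

FallbackSpec = Union[None, Dict[str, Any], List[Any]]
# Hypothetical resolver signature: returns (client, model) or (None, None).
Resolver = Callable[[str, str], Tuple[Optional[object], Optional[str]]]


def pick_init_provider(
    primary: Dict[str, str],
    fallback_model: FallbackSpec,
    resolve: Resolver,
) -> Dict[str, Any]:
    """Try the primary, then each fallback in order; raise only if all fail."""
    client, model = resolve(primary["provider"], primary["model"])
    if client is not None:
        return {"provider": primary["provider"], "model": model or primary["model"],
                "client": client, "fallback_activated": False}

    # Accept a single dict, a list of dicts, or None, as in the patch outline.
    candidates = fallback_model if isinstance(fallback_model, list) else [fallback_model]
    for fb in candidates:
        if not isinstance(fb, dict) or "provider" not in fb or "model" not in fb:
            continue  # skip None and malformed entries
        fb_client, fb_model = resolve(fb["provider"], fb["model"])
        if fb_client is None:
            continue  # this fallback is unavailable too; try the next one
        return {"provider": fb["provider"], "model": fb_model or fb["model"],
                "client": fb_client, "fallback_activated": True}

    # Hypothetical wording; the real message is elided in the outline.
    raise RuntimeError(
        f"Provider {primary['provider']!r} has no usable credential "
        "and no fallback provider resolved."
    )


# Stub resolver: pretend only tencent-token-plan currently has a usable key.
def fake_resolve(provider: str, model: str):
    if provider == "tencent-token-plan":
        return object(), model
    return None, None


chosen = pick_init_provider(
    {"provider": "alibaba-coding-plan", "model": "qwen3.6-plus"},
    [{"provider": "tencent-token-plan", "model": "kimi2.5"}],
    fake_resolve,
)
print(chosen["provider"], chosen["model"], chosen["fallback_activated"])
# tencent-token-plan kimi2.5 True
```

Passing None or an empty list as the fallback spec reaches the final raise, which corresponds to the "no fallback configured" case that should keep failing.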

Secondary issue (will file separately if confirmed)

_pool_may_recover_from_rate_limit returns False for len(entries) == 1, which correctly triggers eager fallback in the request loop, but _recover_with_credential_pool for FailoverReason.rate_limit first burns one retry on the same exhausted credential (if not has_retried_429: return False, True). On a known single-credential pool this retry is wasted — quota won't reset within the retry window. Consider gating the "retry once on first 429" path on _pool_may_recover_from_rate_limit(pool).
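
One way the suggested gating could look is sketched below. Both functions are hypothetical simplifications: pool_may_recover_from_rate_limit only mirrors the single-entry behavior described above (treating an empty pool the same way is an assumption), and should_retry_same_key is an illustrative name that does not exist in the codebase.

```python
from typing import Sequence


def pool_may_recover_from_rate_limit(entries: Sequence[object]) -> bool:
    """Assumed behavior: a pool can only recover by rotating to another entry."""
    return len(entries) > 1


def should_retry_same_key(entries: Sequence[object], has_retried_429: bool) -> bool:
    """Spend the one-time 429 retry only when the pool could actually recover."""
    if has_retried_429:
        return False  # the single retry was already used
    return pool_may_recover_from_rate_limit(entries)


print(should_retry_same_key(["only-key"], False))        # False: fall back immediately
print(should_retry_same_key(["key-a", "key-b"], False))  # True
print(should_retry_same_key(["key-a", "key-b"], True))   # False
```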

Logs (anonymized)

14:35:59 INFO  root: Fallback activated: qwen3.6-plus → qwen3.6-plus (alibaba-coding-plan)
14:36:01 INFO  agent.credential_pool: marking coding plan key exhausted (status=429), rotating
14:36:01 INFO  agent.credential_pool: no available entries (all exhausted or empty)
14:36:12 ERROR root: API call failed after 3 retries. HTTP 429: usage allocated quota exceeded.
15:06:17 ERROR cron.scheduler: Job '...' failed: RuntimeError: Provider 'alibaba-coding-plan' is set
                in config.yaml but no API key was found.
[repeats every cron tick for 3+ hours until pool cooldown TTL elapses]

extent analysis

TL;DR

The proposed fix involves iterating through the fallback_model argument in AIAgent.__init__ to find a usable fallback entry when the primary credential is exhausted.

Guidance

Before raising the "no API key was found" RuntimeError, iterate through the fallback_model argument to try resolve_provider_client for each entry.
If a usable fallback entry is found, use it as the effective primary client and set self._fallback_activated = True.
Verify that the fallback chain is correctly configured and that the fallback_providers entry has working credentials.
Test the proposed fix by driving enough traffic through cron to hit the HTTP 429: usage allocated quota exceeded error and verifying that the agent constructs a client against the first usable fallback entry.

Notes

The proposed fix assumes that the fallback_providers entry is correctly configured and has working credentials. If no fallback entry resolves, init still raises the original RuntimeError.

Recommendation

Apply the proposed workaround by iterating through the fallback_model argument in AIAgent.__init__ to find a usable fallback entry when the primary credential is exhausted.
