hermes - Fix bug(auxiliary): pooled Codex compression can stay pinned to exhausted auth after rotation [2 pull requests]

agent/auxiliary_client.py caches pooled auxiliary clients by (provider, async_mode, base_url, api_key, api_mode, runtime_key, is_vision) but not by the active credential-pool entry id. For pooled OAuth providers (most visibly openai-codex on the compression, title, and session-search auxiliary tasks) this means a cached client can stay pinned to the credential that just hit 429 usage_limit_reached, even after the pool has rotated to a different credential.

This shows up most clearly when auxiliary.<task>.provider is explicitly set to openai-codex: provider fallback is intentionally disabled for explicit providers, so the missing same-provider pool rotation/cache invalidation becomes user-visible immediately.

Related but not identical to:

  • #22212 — broader platform retry / profile-rotation contract
  • #22415 — open PR for Codex auxiliary quota rotation

Error Message

  • Main agent requests can recover because the main loop has credential-pool-aware retry/rotation.
  • Auxiliary compression fails with repeated 429 usage_limit_reached.
  • Logs show auxiliary failures without clean same-provider recovery:
    • Failed to generate context summary: Error code: 429 ... usage_limit_reached
    • API call failed after 3 retries. HTTP 429: The usage limit has been reached | provider=openai-codex ...
  • Even when the pool rotates, the cached auxiliary client can remain pinned to the old, exhausted credential unless the cache entry is explicitly evicted or the cache key changes.

Root Cause

Two bugs compound here:

  1. Auxiliary path parity gap: call_llm() / async_call_llm() lacked same-provider credential-pool recovery for quota/rate-limit failures, unlike the main agent loop.
  2. Cache pinning: _client_cache_key() did not include any discriminator for the currently active pooled credential, so auxiliary cache reuse could outlive pool rotation.
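
The cache-pinning half of the problem can be illustrated with a toy cache. This is a minimal sketch, not the Hermes implementation: make_key, get_client, and the dict-based client are hypothetical stand-ins, and the key is reduced to three fields. It assumes that for pooled OAuth the api_key field of the key does not change on rotation (modeled here as None), which is what lets the stale entry survive.

```python
from typing import Dict, Optional, Tuple

# Simplified key: (provider, api_key, pool_entry_id). The real key has more
# fields; pool_entry_id is the assumed extra discriminator.
CacheKey = Tuple[str, Optional[str], Optional[str]]


def make_key(provider: str, api_key: Optional[str], entry_id: Optional[str]) -> CacheKey:
    return (provider, api_key, entry_id)


def get_client(cache: Dict[CacheKey, dict], provider: str,
               active_entry_id: str, keyed_by_entry: bool) -> dict:
    """Return a cached client, building one bound to the active entry on a miss."""
    key = make_key(provider, None, active_entry_id if keyed_by_entry else None)
    if key not in cache:
        # Hypothetical client object: it remembers which credential it was built with.
        cache[key] = {"provider": provider, "entry_id": active_entry_id}
    return cache[key]


def demo() -> Tuple[str, str]:
    # Key without the pool entry id: the second lookup hits the stale client.
    pinned_cache: Dict[CacheKey, dict] = {}
    get_client(pinned_cache, "openai-codex", "entry-a", keyed_by_entry=False)
    pinned = get_client(pinned_cache, "openai-codex", "entry-b", keyed_by_entry=False)

    # Key with the pool entry id: rotation changes the key, forcing a rebuild.
    fresh_cache: Dict[CacheKey, dict] = {}
    get_client(fresh_cache, "openai-codex", "entry-a", keyed_by_entry=True)
    fresh = get_client(fresh_cache, "openai-codex", "entry-b", keyed_by_entry=True)
    return pinned["entry_id"], fresh["entry_id"]
```

In this sketch demo() returns ("entry-a", "entry-b"): without the discriminator the lookup after rotation still yields the client built for entry-a, while with it the client is rebuilt for entry-b.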

Fix Action

Fixed

Environment

  • Hermes Agent on main
  • auxiliary.compression.provider: openai-codex
  • multiple pooled openai-codex OAuth credentials
  • credential_pool_strategies.openai-codex: fill_first

Reproduction

  1. Configure 2+ openai-codex pooled auth entries.
  2. Set auxiliary.compression.provider (or another auxiliary task provider) explicitly to openai-codex.
  3. Ensure the currently selected pool entry returns 429 usage_limit_reached.
  4. Trigger context compression / summary generation repeatedly.

Expected behavior

For pooled auxiliary providers:

  1. Optionally retry a transient 429 once on the same credential.
  2. Mark the failed pool entry exhausted and rotate to the next credential.
  3. Rebuild the auxiliary client against the new pool entry.
  4. Ensure cache lookup cannot reuse the stale client after rotation.
  5. Only after same-provider pool recovery fails should broader provider fallback be considered (and only for provider=auto).
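
The five steps above can be sketched as a single recovery loop. This is an illustration under stated assumptions, not the code in call_llm() or async_call_llm(): Pool, QuotaError, call_with_pool_recovery, and the send/fallback callables are all hypothetical names. Pool.current() returns the first non-exhausted entry, loosely modeling a fill_first strategy, and send(entry_id) is expected to build (or look up by entry id) a client bound to that credential, so a stale client is never reused after rotation.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


class QuotaError(Exception):
    """Hypothetical stand-in for an HTTP 429 usage_limit_reached failure."""


@dataclass
class Pool:
    """Toy credential pool; the real pool API is not shown in the issue."""
    entries: List[str]
    exhausted: Set[str] = field(default_factory=set)

    def current(self) -> Optional[str]:
        # First entry that has not been marked exhausted, or None.
        for entry_id in self.entries:
            if entry_id not in self.exhausted:
                return entry_id
        return None

    def mark_exhausted(self, entry_id: str) -> None:
        self.exhausted.add(entry_id)


def call_with_pool_recovery(pool: Pool,
                            send: Callable[[str], str],
                            configured_provider: str,
                            fallback: Optional[Callable[[], str]] = None,
                            same_credential_retries: int = 1) -> str:
    entry_id = pool.current()
    while entry_id is not None:
        # Step 1: initial attempt plus optional retries on the same credential.
        for _attempt in range(1 + same_credential_retries):
            try:
                # Steps 3-4: send() works against a client bound to entry_id.
                return send(entry_id)
            except QuotaError:
                continue
        # Step 2: mark the entry exhausted and rotate to the next one.
        pool.mark_exhausted(entry_id)
        entry_id = pool.current()
    # Step 5: cross-provider fallback only when the provider was not explicit.
    if configured_provider == "auto" and fallback is not None:
        return fallback()
    raise QuotaError("all pooled credentials exhausted")
```

With an explicit provider such as openai-codex, the sketch raises once every pooled credential is exhausted rather than switching providers, matching the statement that provider fallback is intentionally disabled for explicit providers.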

Suggested fix

  • Add same-provider pool recovery for auxiliary clients on 401/402/429 where appropriate.
  • Include a non-mutating pool-entry discriminator (e.g. current/peek credential id) in the auxiliary client cache key for pooled providers, including provider="auto" when the resolved main provider comes from a pool.
  • Add regression tests for:
    • explicit Codex auxiliary 429 -> rotate to next pooled auth without cross-provider fallback
    • async auxiliary parity
    • cached Codex auxiliary client rebuild when the active pool entry changes
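
Why the discriminator should be non-mutating can be shown with a small example. This sketch assumes a pool whose selection call has a side effect (a round-robin pool is used purely for illustration; the issue's environment uses fill_first). RoundRobinPool, peek, select, and cache_key_for are hypothetical names, not the Hermes API.

```python
from typing import List, Tuple


class RoundRobinPool:
    """Toy pool whose select() advances state; hypothetical, for illustration only."""

    def __init__(self, entry_ids: List[str]) -> None:
        self._entry_ids = list(entry_ids)
        self._index = 0

    def peek(self) -> str:
        """Report the active entry id without changing pool state."""
        return self._entry_ids[self._index]

    def select(self) -> str:
        """Return the active entry id, then advance to the next one (mutating)."""
        entry_id = self._entry_ids[self._index]
        self._index = (self._index + 1) % len(self._entry_ids)
        return entry_id


def cache_key_for(provider: str, pool: RoundRobinPool) -> Tuple[str, str]:
    """Build a cache key with peek() so key construction never rotates the pool."""
    return (provider, pool.peek())
```

Building the key with peek() keeps repeated cache lookups stable; if select() were used instead, every lookup would rotate the pool and produce a different key, so the cache would never hit.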

Impact

  • Fixes context compression/title/session-search failures where healthy pooled Codex auths exist.
  • Prevents auxiliary tasks from getting “stuck” on an exhausted cached OAuth client.
  • Brings auxiliary behavior closer to the main agent loop's credential-pool recovery semantics.
