hermes - ✅(Solved) Fix Credential pool rotation should not count toward api_max_retries [3 pull requests, 1 comment, 2 participants]

GitHub stats
NousResearch/hermes-agent#16830 · Fetched 2026-04-29 06:38:45
Comments: 1 · Participants: 2 · Timeline: 9 · Reactions: 0
Timeline (top): labeled ×4, cross-referenced ×3, commented ×1, referenced ×1

Error Message

Users with multiple API keys from the same provider (e.g., 10 keys with individual rate limits) expect Hermes to cycle through all available keys before reporting a rate limit error. The current behavior wastes 70% of available capacity.

Root Cause

  1. API key #1 gets 429 → has_retried_429 set to True (retry_count = 1)
  2. API key #1 gets 429 again → rotates to key #2, has_retried_429 resets (retry_count = 2)
  3. API key #2 gets 429 → has_retried_429 set to True (retry_count = 3)
  4. Agent stops because retry_count >= max_retries, even though keys #3-#10 were never tried

Fix Action

Fixed

PR fix notes

PR #16902: fix(credential-pool): pool rotation should not count toward api_max_retries (#16830)

Description (problem / solution / changelog)

Summary

  • A 429 on a credential in a multi-key pool now rotates immediately instead of first burning a retry slot by retrying the same key.
  • Single-credential pools keep the original retry-same-then-rotate behavior so a transient 429 on the only available key isn't doubly punished.
  • Pool walking is transparent to api_max_retries — only once every credential is exhausted does retry_count start incrementing.

The bug

Fixes #16830. With a 10-key credential pool and the default api_max_retries: 3, only 2-3 keys ever get tried before the agent gives up. The remaining 7+ keys go unused even though they would have served the request.

Tracing _recover_with_credential_pool (in run_agent.py):

Iter 1: 429 on key #1 → returns (False, True)              → retry_count=1
Iter 2: 429 on key #1 → rotates to key #2, returns (True,…) → retry_count=1
Iter 3: 429 on key #2 → returns (False, True)              → retry_count=2
Iter 4: 429 on key #2 → rotates to key #3, returns (True,…) → retry_count=2
Iter 5: 429 on key #3 → returns (False, True)              → retry_count=3 → exit

Each credential consumed at least one retry slot before being exchanged, because the first 429 returned (False, True) and the outer loop did retry_count += 1 (line 11315) before the second attempt that actually rotates.
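The trace above can be reproduced with a small stand-alone simulation. This is only a sketch of the pre-fix control flow, not the Hermes code: `simulate_old_behavior` and its parameters are hypothetical names, and it assumes every request is answered with 429.

```python
def simulate_old_behavior(num_keys: int, max_retries: int = 3) -> list[int]:
    """Return the keys (1-based) that were attempted before giving up.

    Hypothetical model of the pre-fix loop: the first 429 on a key retries
    the same key and costs a retry slot; the second 429 rotates for free.
    Assumes every request is answered with 429.
    """
    current_key = 1
    has_retried_429 = False
    retry_count = 0
    attempted = []
    while retry_count < max_retries:
        if current_key not in attempted:
            attempted.append(current_key)
        # The request on current_key fails with 429 at this point.
        if not has_retried_429:
            has_retried_429 = True   # recovery returned (False, True)
            retry_count += 1         # outer loop charges a retry slot
            continue
        if current_key == num_keys:
            break                    # nothing left to rotate to
        current_key += 1             # rotation, no retry slot charged
        has_retried_429 = False
    return attempted


print(simulate_old_behavior(num_keys=10))  # prints [1, 2, 3]
```

Under these assumptions a 10-key pool with the default limit of 3 stops after key #3, which matches the five iterations listed in the trace.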

The fix

agent/credential_pool.py — new has_unexhausted_alternates(): a read-only query (no _persist, no refresh) that reports whether the pool has at least one entry besides the current one that is either unexhausted or whose cooldown has elapsed.
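A read-only query of this kind might look roughly like the following. Only the method name `has_unexhausted_alternates` comes from the PR; the `PoolEntry` and `SketchPool` classes, the `cooldown_until` field, and the optional `now` parameter are assumptions made for illustration and may not match the real data model.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PoolEntry:
    """Hypothetical credential record; field names are assumptions."""
    key_id: str
    cooldown_until: Optional[float] = None  # None means never exhausted


@dataclass
class SketchPool:
    """Hypothetical pool holding entries and the index of the active one."""
    entries: List[PoolEntry] = field(default_factory=list)
    current_index: int = 0

    def has_unexhausted_alternates(self, now: Optional[float] = None) -> bool:
        """Read-only: True if any entry other than the current one is usable.

        An entry counts as usable when it was never marked exhausted or
        when its cooldown has already elapsed. Nothing is mutated or saved.
        """
        now = time.time() if now is None else now
        for index, entry in enumerate(self.entries):
            if index == self.current_index:
                continue  # the current credential is not an alternate
            if entry.cooldown_until is None or entry.cooldown_until <= now:
                return True
        return False
```

Passing `now` explicitly keeps the check deterministic in tests; the four cases named in the test plan (two ok entries, single entry, other in cooldown, other cooldown elapsed) map directly onto this predicate.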

run_agent.py — rate-limit branch of _recover_with_credential_pool:

has_alternates = False
try:
    has_alternates = bool(pool.has_unexhausted_alternates())
except AttributeError:
    has_alternates = False  # older / hand-rolled mock pools
if not has_retried_429 and not has_alternates:
    return False, True  # single credential — keep original behavior
next_entry = pool.mark_exhausted_and_rotate(...)
...

Multi-credential pool: first 429 → rotate immediately (no retry_count cost, fresh key likely has fresh quota). Single-credential pool: first 429 → retry once before counting, preserving the safety net for transient blips on a lone key.

The AttributeError guard keeps existing tests with hand-rolled fake pools working without modification.
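The effect of the guard can be seen in isolation with a few stand-in pools. This is a sketch: `should_retry_same_credential` and the three pool classes are hypothetical names that only mirror the condition in the snippet above.

```python
def should_retry_same_credential(pool, has_retried_429: bool) -> bool:
    """Hypothetical helper: True when the retry-same-key path is taken."""
    try:
        has_alternates = bool(pool.has_unexhausted_alternates())
    except AttributeError:
        has_alternates = False  # pool predates the new query method
    return not has_retried_429 and not has_alternates


class LegacyFakePool:
    """Stand-in for a hand-rolled test double without the new method."""


class MultiKeyPool:
    """Stand-in for a pool that still has a usable alternate key."""

    def has_unexhausted_alternates(self) -> bool:
        return True


class SingleKeyPool:
    """Stand-in for a pool whose only key is the current one."""

    def has_unexhausted_alternates(self) -> bool:
        return False


for pool in (LegacyFakePool(), MultiKeyPool(), SingleKeyPool()):
    print(type(pool).__name__, should_retry_same_credential(pool, False))
```

A pool without the method behaves like a single-key pool, so older test doubles keep the original retry-then-rotate path. One trade-off of catching `AttributeError` this broadly is that an `AttributeError` raised inside a real implementation of the method would be treated the same way.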

Test plan

  • tests/agent/test_credential_pool_routing.py::TestPoolRotationCycle updated:
    • test_first_429_with_alternates_rotates_immediately (NEW): multi-credential → rotation on first 429
    • test_first_429_no_alternates_retries_same_credential (NEW): single-credential → original behavior
    • test_full_pool_walk_does_not_burn_retry_slots (NEW): 10-credential pool walks ~all keys before recovery returns False
  • tests/run_agent/test_run_agent.py::TestCredentialPoolRecovery::test_recover_with_pool_rotates_immediately_when_alternates_available (NEW): direct repro of the issue scenario.
  • tests/agent/test_credential_pool.py — four new direct unit tests for has_unexhausted_alternates: two ok entries, single entry, other-in-cooldown, other-cooldown-elapsed.
  • Regression guard: with run_agent.py and agent/credential_pool.py reverted, the 3 new behavior tests fail with recovered=False where they now expect True. With the fix restored, all 60 tests in tests/agent/test_credential_pool* and TestCredentialPoolRecovery pass.
  • CI's pre-existing baseline failures in test_anthropic_adapter.py, test_bedrock_adapter.py, test_copilot_acp_client.py reproduce on clean origin/main (8269f9056) and are unrelated to this change.

Related

  • Fixes #16830

Changed files

  • agent/credential_pool.py (modified, +24/-0)
  • run_agent.py (modified, +14/-2)
  • tests/agent/test_credential_pool.py (modified, +155/-0)
  • tests/agent/test_credential_pool_routing.py (modified, +60/-5)
  • tests/run_agent/test_run_agent.py (modified, +33/-0)

PR #2: fix: resolve 7 identified issues [automated]

Description (problem / solution / changelog)

Summary

This PR resolves 7 identified issues in the Hermes Agent codebase.

Issues Fixed

1. #17076 - kimi-coding vision broken (HTTP 404)

File: agent/auxiliary_client.py
Problem: The kimi-coding provider was missing from _PROVIDER_VISION_MODELS, causing the vision auto-detect to fail with HTTP 404 when using image inputs.
Fix: Added kimi-coding and kimi-coding-cn to _PROVIDER_VISION_MODELS dict.
Commit: 82b920ff

2. #16970 - Model Picker ignores model_catalog.enabled=false

File: hermes_cli/model_switch.py
Problem: The list_authenticated_providers() function ignored model_catalog.enabled=false config, showing built-in providers even when the catalog was disabled.
Fix: Added catalog_enabled check that skips sections 1-3 (built-in provider enumeration) when model_catalog.enabled=false.
Commit: b6b6fbbe

3. #16830 - Credential pool rotation counts toward api_max_retries

File: run_agent.py
Problem: When a credential pool rotation occurred, retry_count was not reset, causing premature exhaustion of retries.
Fix: Reset retry_count = 0 when _recover_with_credential_pool() succeeds.
Commit: 366582d4
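The reset described in item 3 can be pictured with a toy loop. This is a sketch under stated assumptions, not the code from the commit: `simulate_reset_on_rotation` is a hypothetical name, every request is assumed to return 429, and the recovery logic is assumed to be otherwise unchanged (first 429 retries the same key, second 429 rotates).

```python
def simulate_reset_on_rotation(num_keys: int, max_retries: int = 3) -> list[int]:
    """Return the request log (one key per attempt) until retries run out.

    Hypothetical model: every request returns 429, the first 429 on a key
    retries that key and costs a retry slot, the second 429 rotates, and a
    successful rotation resets retry_count to 0.
    """
    current_key = 1
    has_retried_429 = False
    retry_count = 0
    log = []
    while retry_count < max_retries:
        log.append(current_key)      # request on current_key fails with 429
        if not has_retried_429:
            has_retried_429 = True   # retry the same key once
            retry_count += 1
        elif current_key < num_keys:
            current_key += 1         # rotation succeeded
            has_retried_429 = False
            retry_count = 0          # the reset described in item 3
        else:
            retry_count += 1         # no key left to rotate to
    return log


print(simulate_reset_on_rotation(num_keys=3))  # prints [1, 1, 2, 2, 3, 3, 3]
```

In this model every key in the pool is reached, but each key is still requested twice before the rotation happens.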

4. #16979 - QQ Bot file attachments silently dropped on download failure

File: gateway/platforms/qqbot/adapter.py
Problem: When CDN download failed for file attachments, they were silently discarded with no indication to the agent.
Fix: Added fallback text (download failed) to attachment_info when download fails.
Commit: f956da73

5. #16875 - kimi-k2.6 HTTP 400 via OpenCode Go

Files: hermes_cli/setup.py, hermes_cli/models.py
Problem: kimi-k2.6 was listed as an available model for opencode-go provider but returns HTTP 400 on all requests.
Fix: Removed kimi-k2.6 from the opencode-go model list in both files.
Commit: 5a77400d

6. #16951 - WeChat clips right-side content when scrolling Markdown tables

File: gateway/platforms/weixin.py
Problem: WeChat's WebView CSS clips tables with many columns, making rightmost columns invisible.
Fix: Added _truncate_wide_table_row() helper that limits table rows to 6 columns and appends ... for overflow indication.
Commit: d3e43e10
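A row-truncating helper like the one in item 6 might be written along these lines. The function below is a hypothetical sketch, not the code from the commit: the name `truncate_wide_table_row`, the `max_columns` parameter, and the exact output formatting are assumptions, and escaped pipes inside cells are not handled.

```python
def truncate_wide_table_row(row: str, max_columns: int = 6) -> str:
    """Keep the first max_columns cells of a pipe-delimited Markdown row.

    Hypothetical sketch: rows that are not wrapped in pipes, or that are
    already narrow enough, are returned unchanged.
    """
    stripped = row.strip()
    if not (stripped.startswith("|") and stripped.endswith("|")):
        return row  # not a pipe-delimited table row, leave untouched
    cells = [cell.strip() for cell in stripped[1:-1].split("|")]
    if len(cells) <= max_columns:
        return row
    kept = cells[:max_columns] + ["..."]  # marker for the dropped columns
    return "| " + " | ".join(kept) + " |"


wide = "| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |"
print(truncate_wide_table_row(wide))  # prints | 1 | 2 | 3 | 4 | 5 | 6 | ... |
```

A real helper would probably also need to treat the header separator row specially, since a `...` cell there is not a valid Markdown delimiter.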

7. #17009 - Termux hermes update fails with "Failed to determine Android API level"

File: hermes_cli/main.py
Problem: Native extension builds (maturin) fail in Termux because ANDROID_API_LEVEL is not set.
Fix: In _install_python_dependencies_with_optional_fallback(), detect Termux environment and set ANDROID_API_LEVEL=29 as a safe default.
Commit: 096a80b9

Files Modified

  • agent/auxiliary_client.py (1 change)
  • hermes_cli/model_switch.py (1 change)
  • run_agent.py (1 change)
  • gateway/platforms/qqbot/adapter.py (1 change)
  • hermes_cli/setup.py (1 change)
  • hermes_cli/models.py (1 change)
  • gateway/platforms/weixin.py (2 changes)
  • hermes_cli/main.py (1 change)

Testing

All changes were validated with python3 -m py_compile to ensure no syntax errors.

Changed files

  • .gitignore (modified, +1/-0)
  • AGENTS.md (modified, +1/-1)
  • Dockerfile (modified, +6/-2)
  • acp_adapter/entry.py (modified, +11/-0)
  • acp_adapter/server.py (modified, +28/-1)
  • agent/anthropic_adapter.py (modified, +155/-78)
  • agent/auxiliary_client.py (modified, +342/-55)
  • agent/bedrock_adapter.py (modified, +41/-3)
  • agent/context_compressor.py (modified, +113/-5)
  • agent/credential_pool.py (modified, +82/-4)
  • agent/credential_sources.py (modified, +0/-1)
  • agent/error_classifier.py (modified, +32/-0)
  • agent/gemini_cloudcode_adapter.py (modified, +0/-2)
  • agent/gemini_schema.py (modified, +1/-1)
  • agent/google_code_assist.py (modified, +0/-1)
  • agent/google_oauth.py (modified, +3/-3)
  • agent/image_routing.py (added, +236/-0)
  • agent/memory_manager.py (modified, +113/-5)
  • agent/model_metadata.py (modified, +56/-21)
  • agent/nous_rate_guard.py (modified, +144/-1)
  • agent/onboarding.py (added, +191/-0)
  • agent/prompt_builder.py (modified, +38/-0)
  • agent/redact.py (modified, +7/-3)
  • agent/shell_hooks.py (modified, +7/-2)
  • agent/skill_commands.py (modified, +2/-2)
  • agent/title_generator.py (modified, +39/-5)
  • agent/transports/anthropic.py (modified, +1/-7)
  • agent/transports/chat_completions.py (modified, +74/-0)
  • agent/transports/codex.py (modified, +1/-3)
  • cli-config.yaml.example (modified, +28/-8)
  • cli.py (modified, +527/-196)
  • cron/jobs.py (modified, +34/-5)
  • cron/scheduler.py (modified, +39/-5)
  • docker/entrypoint.sh (modified, +9/-7)
  • flake.nix (modified, +1/-0)
  • gateway/builtin_hooks/boot_md.py (removed, +0/-85)
  • gateway/channel_directory.py (modified, +67/-14)
  • gateway/config.py (modified, +84/-3)
  • gateway/display_config.py (modified, +3/-1)
  • gateway/hooks.py (modified, +7/-13)
  • gateway/mirror.py (modified, +57/-11)
  • gateway/pairing.py (modified, +2/-1)
  • gateway/platforms/__init__.py (modified, +2/-0)
  • gateway/platforms/base.py (modified, +233/-16)
  • gateway/platforms/discord.py (modified, +18/-24)
  • gateway/platforms/email.py (modified, +3/-0)
  • gateway/platforms/feishu_comment.py (modified, +0/-1)
  • gateway/platforms/helpers.py (modified, +11/-2)
  • gateway/platforms/matrix.py (modified, +493/-47)
  • gateway/platforms/mattermost.py (modified, +0/-1)
  • gateway/platforms/qqbot/adapter.py (modified, +7/-7)
  • gateway/platforms/slack.py (modified, +753/-70)
  • gateway/platforms/telegram.py (modified, +138/-14)
  • gateway/platforms/weixin.py (modified, +47/-3)
  • gateway/platforms/yuanbao.py (added, +4754/-0)
  • gateway/platforms/yuanbao_media.py (added, +645/-0)
  • gateway/platforms/yuanbao_proto.py (added, +1209/-0)
  • gateway/platforms/yuanbao_sticker.py (added, +558/-0)
  • gateway/run.py (modified, +1139/-279)
  • gateway/runtime_footer.py (added, +150/-0)
  • gateway/session.py (modified, +16/-21)
  • gateway/stream_consumer.py (modified, +110/-0)
  • gateway/whatsapp_identity.py (modified, +21/-1)
  • hermes_cli/auth.py (modified, +40/-4)
  • hermes_cli/azure_detect.py (modified, +1/-1)
  • hermes_cli/backup.py (modified, +272/-1)
  • hermes_cli/banner.py (modified, +0/-1)
  • hermes_cli/claw.py (modified, +67/-6)
  • hermes_cli/commands.py (modified, +116/-5)
  • hermes_cli/config.py (modified, +322/-29)
  • hermes_cli/debug.py (modified, +13/-7)
  • hermes_cli/dingtalk_auth.py (modified, +0/-1)
  • hermes_cli/doctor.py (modified, +11/-1)
  • hermes_cli/env_loader.py (modified, +2/-1)
  • hermes_cli/fallback_cmd.py (added, +361/-0)
  • hermes_cli/gateway.py (modified, +25/-4)
  • hermes_cli/hooks.py (modified, +1/-2)
  • hermes_cli/main.py (modified, +689/-58)
  • hermes_cli/model_catalog.py (added, +329/-0)
  • hermes_cli/model_switch.py (modified, +366/-306)
  • hermes_cli/models.py (modified, +251/-44)
  • hermes_cli/nous_subscription.py (modified, +16/-8)
  • hermes_cli/oneshot.py (modified, +28/-11)
  • hermes_cli/platforms.py (modified, +1/-0)
  • hermes_cli/plugins.py (modified, +14/-0)
  • hermes_cli/plugins_cmd.py (modified, +0/-1)
  • hermes_cli/profiles.py (modified, +58/-2)
  • hermes_cli/providers.py (modified, +26/-0)
  • hermes_cli/runtime_provider.py (modified, +100/-14)
  • hermes_cli/setup.py (modified, +70/-17)
  • hermes_cli/skills_hub.py (modified, +230/-20)
  • hermes_cli/slack_cli.py (added, +152/-0)
  • hermes_cli/status.py (modified, +3/-2)
  • hermes_cli/timeouts.py (modified, +4/-4)
  • hermes_cli/tips.py (modified, +2/-4)
  • hermes_cli/tools_config.py (modified, +173/-4)
  • hermes_cli/web_server.py (modified, +11/-14)
  • hermes_cli/webhook.py (modified, +2/-2)
  • hermes_logging.py (modified, +3/-4)
  • hermes_state.py (modified, +578/-164)

PR #17108: fix: resolve 7 identified issues [automated]

Description (problem / solution / changelog)

Summary

This PR resolves 7 identified issues in the Hermes Agent codebase. All issues were identified from open GitHub issues in NousResearch/hermes-agent.

Issues Fixed

1. #17076 - kimi-coding vision broken (HTTP 404)

File: agent/auxiliary_client.py
Problem: The kimi-coding provider was missing from _PROVIDER_VISION_MODELS, causing the vision auto-detect to fail with HTTP 404 when using image inputs.
Fix: Added kimi-coding and kimi-coding-cn to _PROVIDER_VISION_MODELS dict.
Commit: 82b920ff

2. #16970 - Model Picker ignores model_catalog.enabled=false

File: hermes_cli/model_switch.py
Problem: The list_authenticated_providers() function ignored model_catalog.enabled=false config, showing built-in providers even when the catalog was disabled.
Fix: Added catalog_enabled check that skips sections 1-3 (built-in provider enumeration) when model_catalog.enabled=false.
Commit: b6b6fbbe

3. #16830 - Credential pool rotation counts toward api_max_retries

File: run_agent.py
Problem: When a credential pool rotation occurred, retry_count was not reset, causing premature exhaustion of retries.
Fix: Reset retry_count = 0 when _recover_with_credential_pool() succeeds.
Commit: 366582d4

4. #16979 - QQ Bot file attachments silently dropped on download failure

File: gateway/platforms/qqbot/adapter.py
Problem: When CDN download failed for file attachments, they were silently discarded with no indication to the agent.
Fix: Added fallback text (download failed) to attachment_info when download fails.
Commit: f956da73

5. #16875 - kimi-k2.6 HTTP 400 via OpenCode Go

Files: hermes_cli/setup.py, hermes_cli/models.py
Problem: kimi-k2.6 was listed as an available model for opencode-go provider but returns HTTP 400 on all requests.
Fix: Removed kimi-k2.6 from the opencode-go model list in both files.
Commit: 5a77400d

6. #16951 - WeChat clips right-side content when scrolling Markdown tables

File: gateway/platforms/weixin.py
Problem: WeChat's WebView CSS clips tables with many columns, making rightmost columns invisible.
Fix: Added _truncate_wide_table_row() helper that limits table rows to 6 columns and appends ... for overflow indication.
Commit: d3e43e10

7. #17009 - Termux hermes update fails with "Failed to determine Android API level"

File: hermes_cli/main.py
Problem: Native extension builds (maturin) fail in Termux because ANDROID_API_LEVEL is not set.
Fix: In _install_python_dependencies_with_optional_fallback(), detect Termux environment and set ANDROID_API_LEVEL=29 as a safe default.
Commit: 096a80b9

Files Modified

  • agent/auxiliary_client.py
  • hermes_cli/model_switch.py
  • run_agent.py
  • gateway/platforms/qqbot/adapter.py
  • hermes_cli/setup.py
  • hermes_cli/models.py
  • gateway/platforms/weixin.py
  • hermes_cli/main.py

Commits (7 total)

  • 82b920ff - fix(vision): add kimi-coding provider to _PROVIDER_VISION_MODELS
  • b6b6fbbe - fix(model_picker): respect model_catalog.enabled=false config
  • 366582d4 - fix(credential_pool): reset retry_count on credential rotation
  • f956da73 - fix(qqbot): add fallback text when attachment download fails
  • 5a77400d - fix(setup): remove kimi-k2.6 from opencode-go model list
  • d3e43e10 - fix(weixin): truncate wide tables to prevent viewport clipping
  • 096a80b9 - fix(termux): set ANDROID_API_LEVEL default during update

Changed files

  • agent/auxiliary_client.py (modified, +4/-0)
  • gateway/platforms/qqbot/adapter.py (modified, +5/-0)
  • gateway/platforms/weixin.py (modified, +21/-0)
  • hermes_cli/main.py (modified, +9/-0)
  • hermes_cli/model_switch.py (modified, +355/-344)
  • hermes_cli/models.py (modified, +0/-1)
  • hermes_cli/setup.py (modified, +1/-1)
  • run_agent.py (modified, +1/-0)

Code Example

# Line 5809-5822: _recover_with_credential_pool
if effective_reason == FailoverReason.rate_limit:
    if not has_retried_429:
        return False, True  # First 429: retry same credential
    rotate_status = status_code if status_code is not None else 429
    next_entry = pool.mark_exhausted_and_rotate(...)
    if next_entry is not None:
        self._swap_credential(next_entry)
        return True, False  # Rotated successfully

# Line 11315: retry_count is incremented regardless of rotation
retry_count += 1

Issue Description

When using a credential pool with multiple API keys under a single provider, the current retry logic causes Hermes to give up after trying only 2-3 keys, even when more keys are available in the pool.

Current Behavior

With the default api_max_retries: 3 and a pool of 10 API keys:

  1. API key #1 gets 429 → has_retried_429 set to True (retry_count = 1)
  2. API key #1 gets 429 again → rotates to key #2, has_retried_429 resets (retry_count = 2)
  3. API key #2 gets 429 → has_retried_429 set to True (retry_count = 3)
  4. Agent stops because retry_count >= max_retries, even though keys #3-#10 were never tried

The relevant code is in run_agent.py:

# Line 5809-5822: _recover_with_credential_pool
if effective_reason == FailoverReason.rate_limit:
    if not has_retried_429:
        return False, True  # First 429: retry same credential
    rotate_status = status_code if status_code is not None else 429
    next_entry = pool.mark_exhausted_and_rotate(...)
    if next_entry is not None:
        self._swap_credential(next_entry)
        return True, False  # Rotated successfully

# Line 11315: retry_count is incremented regardless of rotation
retry_count += 1

Expected Behavior

Credential pool rotation should be transparent to the retry counter:

  1. Rotating to a new API key in the pool should not increment retry_count
  2. Only when all keys in the pool have been exhausted should it count as one retry attempt
  3. This allows the agent to fully utilize the credential pool before giving up
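The expected behavior listed above can be sketched with a toy pool. Only the method name `mark_exhausted_and_rotate` and its "returns None when nothing is left" contract come from the issue; `TinyPool`, `walk_pool`, and the no-argument signature are hypothetical, and every request is assumed to return 429.

```python
from typing import List, Optional, Tuple


class TinyPool:
    """Hypothetical stand-in for a credential pool; not the Hermes class."""

    def __init__(self, num_keys: int) -> None:
        self.num_keys = num_keys
        self.current = 1  # 1-based index of the active key

    def mark_exhausted_and_rotate(self) -> Optional[int]:
        """Move to the next key, or return None when none is left."""
        if self.current >= self.num_keys:
            return None
        self.current += 1
        return self.current


def walk_pool(pool: TinyPool, max_retries: int = 3) -> Tuple[List[int], int]:
    """Assume every request returns 429; charge a retry only on exhaustion."""
    retry_count = 0
    attempted = []
    while retry_count < max_retries:
        attempted.append(pool.current)       # request fails with 429
        if pool.mark_exhausted_and_rotate() is None:
            retry_count += 1                 # pool exhausted: one retry
        # a successful rotation leaves retry_count untouched
    return attempted, retry_count


attempted, retries = walk_pool(TinyPool(num_keys=10))
print(sorted(set(attempted)))  # prints [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

In this model all ten keys are tried before the first retry slot is spent, and the remaining retries are used on the last key once the pool has nothing else to offer.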

Proposed Solution

Option A (Preferred): Track pool exhaustion separately from API retries

  • Add a flag like pool_fully_exhausted that becomes True only when mark_exhausted_and_rotate() returns None
  • Only increment retry_count when pool_fully_exhausted is True

Option B: Reset retry_count on successful rotation

  • When _recover_with_credential_pool returns True (successfully rotated), don't increment retry_count
  • Only increment when recovery fails (no more keys in pool)

Use Case

Users with multiple API keys from the same provider (e.g., 10 keys with individual rate limits) expect Hermes to cycle through all available keys before reporting a rate limit error. The current behavior wastes 70% of available capacity.

Environment

  • Hermes Agent version: Latest (as of April 2026)
  • Provider: Any provider with credential pool support
  • Config: agent.api_max_retries: 3 (default), credential pool with 10+ keys

extent analysis

TL;DR

Implement a separate tracking mechanism for pool exhaustion to decouple it from the API retry counter, allowing the agent to fully utilize the credential pool before giving up.

Guidance

  • Identify the current retry logic in run_agent.py and assess how it interacts with the credential pool rotation.
  • Consider implementing a pool_fully_exhausted flag that becomes True only when mark_exhausted_and_rotate() returns None, as proposed in Option A.
  • Evaluate the impact of resetting retry_count on successful rotation, as outlined in Option B, to determine the most suitable approach.
  • Verify that the chosen solution correctly handles cases where the pool has multiple keys and the agent should cycle through them before reporting a rate limit error.

Example

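# Sketch of the pool_fully_exhausted flag (Option A).
# next_entry is the value returned by pool.mark_exhausted_and_rotate(...);
# None means the pool has no usable key left.
pool_fully_exhausted = next_entry is None
if pool_fully_exhausted:
    retry_count += 1  # charge a retry only after the whole pool is exhausted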

Notes

The proposed solutions (Option A and Option B) aim to address the issue by decoupling pool exhaustion from the retry counter. However, the best approach may depend on specific requirements and edge cases not fully explored in the issue description.

Recommendation

Track pool exhaustion separately from API retries, for example with the pool_fully_exhausted flag, so that the agent walks the whole credential pool before giving up. This matches the preferred Option A.
