hermes - ✅ (Solved) Fix [Feature Request] Configurable credential pool TTL (cooldown duration) [1 pull request, 1 comment, 2 participants]

GitHub stats
NousResearch/hermes-agent#13619 · Fetched 2026-04-22 08:05:17
Comments: 1
Participants: 2
Timeline: 7
Reactions: 0
Timeline (top): labeled ×3, referenced ×2, commented ×1, cross-referenced ×1


Fix Action

Fixed

PR fix notes

PR #13680: feat(agent): configurable exhausted cooldown for multi-credential pool

Description (problem / solution / changelog)

Summary

Adds operator-configurable cooldown for credentials in exhausted state when no provider-supplied reset_at applies. Defaults remain one hour; shorter windows reduce fallback cascades after burst rate limits (see #13619).

Behavior

  • HERMES_CREDENTIAL_TTL_SECONDS — primary override (integer seconds, minimum 60).
  • HERMES_CREDENTIAL_COOLDOWN_SECONDS — used when TTL is unset.
  • If both are set to valid integers, TTL wins.
  • Invalid or empty values are ignored for that key only.
  • Legacy module attributes EXHAUSTED_TTL_429_SECONDS / EXHAUSTED_TTL_DEFAULT_SECONDS delegate to the same resolver for backward compatibility.
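
The precedence rules above can be sketched as a small resolver. This is a minimal illustration, not the code from PR #13680: the names _parse_seconds and resolve_base_cooldown and the environ parameter are hypothetical. Only the two environment variable names, the 60-second minimum, and the one-hour default come from the PR description. Clamping negative integers up to 60 is an assumption.

```python
import os
from typing import Mapping, Optional

DEFAULT_COOLDOWN_SECONDS = 60 * 60  # one-hour default, per the PR description
MIN_COOLDOWN_SECONDS = 60           # minimum clamp, per the PR description


def _parse_seconds(raw: Optional[str]) -> Optional[int]:
    """Return a clamped integer, or None if the value is unset, empty, or invalid."""
    if raw is None or not raw.strip():
        return None
    try:
        # Assumption: any integer below the minimum (including negatives) is clamped up.
        return max(MIN_COOLDOWN_SECONDS, int(raw.strip()))
    except ValueError:
        return None


def resolve_base_cooldown(environ: Mapping[str, str] = os.environ) -> int:
    """Hypothetical resolver: TTL wins, COOLDOWN is the fallback, then the default."""
    for key in ("HERMES_CREDENTIAL_TTL_SECONDS", "HERMES_CREDENTIAL_COOLDOWN_SECONDS"):
        value = _parse_seconds(environ.get(key))
        if value is not None:
            return value  # an invalid or empty key is skipped, so the next key is tried
    return DEFAULT_COOLDOWN_SECONDS
```

Passing a plain dict instead of os.environ keeps the sketch easy to exercise in tests without touching the real environment.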

Changes

  • agent/credential_pool.py — env-based resolver; _exhausted_ttl uses the shared base cooldown.
  • tests/agent/test_credential_pool.py — precedence, minimum clamp, legacy attrs, short-TTL re-selection.
  • .env.example — documents the env keys.
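
How a shared base cooldown might interact with a provider-supplied reset time can be sketched as follows. The function names exhausted_until and is_available and their parameters are hypothetical; the PR description only states that the configurable cooldown applies when no provider-supplied reset_at applies.

```python
from typing import Optional


def exhausted_until(marked_at: float, base_cooldown: int,
                    reset_at: Optional[float] = None) -> float:
    """Return the timestamp at which an exhausted credential may be retried.

    Assumption: a provider-supplied reset_at, when present, takes priority
    over the configurable base cooldown.
    """
    if reset_at is not None:
        return reset_at
    return marked_at + base_cooldown


def is_available(now: float, marked_at: float, base_cooldown: int,
                 reset_at: Optional[float] = None) -> bool:
    """True once the cooldown (or the provider's reset time) has passed."""
    return now >= exhausted_until(marked_at, base_cooldown, reset_at)
```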

Testing

pytest tests/agent/test_credential_pool.py -q -o addopts=

Related: #13619

Changed files

  • .env.example (modified, +4/-0)
  • agent/credential_pool.py (modified, +35/-9)
  • tests/agent/test_credential_pool.py (modified, +66/-0)


Problem

The credential pool exhaustion TTL (cooldown duration) is currently hardcoded to 1 hour in agent/credential_pool.py:

EXHAUSTED_TTL_429_SECONDS = 60 * 60          # 1 hour
EXHAUSTED_TTL_DEFAULT_SECONDS = 60 * 60      # 1 hour

This causes issues when:

  1. Burst traffic — Multiple concurrent requests trigger rate limits on all providers
  2. Fallback cascade — All providers go into 1-hour cooldown simultaneously
  3. Quota leak — Fallback to unintended providers (e.g., Ollama Cloud) burns quota unexpectedly
  4. Long wait time — Users must wait 1 hour for credentials to reset, or manually reset via hermes auth remove

Proposed Solution

Add environment variable configuration for TTL:

import os
_DEFAULT_TTL = 60 * 60  # 1 hour default
_ENV_TTL = os.getenv("HERMES_CREDENTIAL_TTL_SECONDS", "").strip()
if _ENV_TTL:
    try:
        _DEFAULT_TTL = max(60, int(_ENV_TTL))  # minimum 60 seconds
    except ValueError:
        pass
EXHAUSTED_TTL_429_SECONDS = _DEFAULT_TTL
EXHAUSTED_TTL_DEFAULT_SECONDS = _DEFAULT_TTL

Alternative: Add to config.yaml

credential_pool:
  exhausted_ttl_seconds: 300  # 5 minutes
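
If the config.yaml alternative were adopted, the value could be read from the parsed configuration along these lines. This is a sketch only: the cooldown_from_config function is hypothetical, the config is assumed to be already loaded into a plain dict, and the 60-second minimum and one-hour default are borrowed from the environment-variable proposal.

```python
from collections.abc import Mapping
from typing import Any

DEFAULT_TTL_SECONDS = 60 * 60  # one-hour default from the proposal
MIN_TTL_SECONDS = 60           # recommended minimum from the proposal


def cooldown_from_config(config: Mapping[str, Any]) -> int:
    """Read credential_pool.exhausted_ttl_seconds from an already-parsed config."""
    section = config.get("credential_pool")
    if not isinstance(section, Mapping):
        return DEFAULT_TTL_SECONDS
    raw = section.get("exhausted_ttl_seconds")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(raw, bool) or not isinstance(raw, int):
        return DEFAULT_TTL_SECONDS
    return max(MIN_TTL_SECONDS, raw)
```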

Benefits

Before | After
1-hour cooldown (hardcoded) | Configurable (env var or config.yaml)
Manual reset required | Auto-reset after shorter TTL
Fallback cascade risk | Faster recovery reduces cascade risk

Use Case

I encountered this when running Hermes as a webhook receiver for a trading bot. A burst of 6 concurrent webhooks caused all providers (z.ai, bailian) to hit rate limits simultaneously. With the 1-hour TTL, all credentials were marked exhausted, Hermes cascaded to the Ollama Cloud fallback, and the Ollama quota was burned unexpectedly.

With a configurable TTL (e.g., 5 minutes), credentials would reset sooner, reducing the chance of a fallback cascade.
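
The effect of a shorter TTL in this scenario can be illustrated with a toy model. It is not Hermes code: the first_usable_provider function, the "ollama-cloud" label, and the timestamps are hypothetical, and it assumes every primary provider is marked exhausted at the same instant and that the fallback is used whenever all primaries are still cooling down.

```python
from typing import Dict, List


def first_usable_provider(now: float, exhausted_at: Dict[str, float],
                          ttl_seconds: int, primaries: List[str],
                          fallback: str) -> str:
    """Pick the first primary whose cooldown has elapsed, else the fallback."""
    for name in primaries:
        marked = exhausted_at.get(name)
        if marked is None or now >= marked + ttl_seconds:
            return name
    return fallback


# Assumption: both primaries hit rate limits at t=0 during the burst.
exhausted_at = {"z.ai": 0.0, "bailian": 0.0}
primaries = ["z.ai", "bailian"]

# A new request arrives ten minutes after the burst.
ten_minutes = 10 * 60
with_one_hour = first_usable_provider(ten_minutes, exhausted_at, 60 * 60, primaries, "ollama-cloud")
with_five_min = first_usable_provider(ten_minutes, exhausted_at, 5 * 60, primaries, "ollama-cloud")
```

Under these assumptions, the request ten minutes after the burst still falls through to the fallback with the one-hour TTL, but returns to the first primary with a five-minute TTL.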

Environment Variable Name

Suggestion: HERMES_CREDENTIAL_TTL_SECONDS or HERMES_CREDENTIAL_COOLDOWN_SECONDS

Minimum Value

Recommend minimum 60 seconds to prevent rapid retry loops.


Related: credential_pool_strategies in config.yaml already supports selection strategy (fill_first, round_robin) but not TTL.

Extended analysis

TL;DR

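The credential pool exhaustion TTL was hardcoded to 1 hour, which made fallback cascades more likely after burst rate limits. PR #13680 makes it configurable through the HERMES_CREDENTIAL_TTL_SECONDS and HERMES_CREDENTIAL_COOLDOWN_SECONDS environment variables; the config.yaml option was proposed in the issue as an alternative and does not appear among the PR's listed changes.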

Guidance

  • Set HERMES_CREDENTIAL_TTL_SECONDS to override the default 1-hour TTL; per PR #13680, HERMES_CREDENTIAL_COOLDOWN_SECONDS is used when the TTL variable is unset.
  • The issue also proposed a credential_pool section in config.yaml with an exhausted_ttl_seconds key; the PR's listed changes do not include it.
  • The minimum is 60 seconds, which guards against rapid retry loops against a provider that is still rate limited.
  • Test the new configuration under burst traffic to verify that the shorter TTL actually reduces fallback cascades.

Example

import os
_DEFAULT_TTL = 60 * 60  # 1 hour default
_ENV_TTL = os.getenv("HERMES_CREDENTIAL_TTL_SECONDS", "").strip()
if _ENV_TTL:
    try:
        _DEFAULT_TTL = max(60, int(_ENV_TTL))  # minimum 60 seconds
    except ValueError:
        pass
EXHAUSTED_TTL_429_SECONDS = _DEFAULT_TTL
EXHAUSTED_TTL_DEFAULT_SECONDS = _DEFAULT_TTL

Notes

The change only makes the credential pool exhaustion TTL configurable; it does not pick a value for you. The right TTL depends on the use case and traffic pattern: it must be short enough that credentials recover before requests cascade to a fallback provider, yet long enough that Hermes does not keep retrying a provider that is still rate limited. Expect to monitor and adjust it.

Recommendation

On a build that includes PR #13680, set HERMES_CREDENTIAL_TTL_SECONDS (or HERMES_CREDENTIAL_COOLDOWN_SECONDS) to a cooldown that matches your traffic, for example 300 for the 5-minute window suggested in the issue.
