hermes - 💡(How to fix) Fix auxiliary_client _client_cache not cleared on .env hot-reload, causes stale 401 errors

Bug Description

When a gateway is long-running and API keys in .env are updated (e.g., key rotation), the gateway correctly calls load_dotenv(override=True) to refresh os.environ. However, the auxiliary client cache (_client_cache in agent/auxiliary_client.py) retains OpenAI client instances with the old API key baked in.

This causes persistent 401 "令牌已过期或验证不正确" ("token expired or verification failed") errors on auxiliary operations (compression, summary, vision) until the gateway is manually restarted.

Root Cause

  1. gateway/run.py reloads .env before each message (~line 13208):

    load_dotenv(_env_path, override=True, encoding="utf-8")
  2. _get_cached_client() in agent/auxiliary_client.py (~line 3130) caches OpenAI clients by (provider, async_mode, base_url, api_key, api_mode, ...).

  3. When auxiliary.compression has provider: zai with an explicit base_url but no api_key, the resolved api_key parameter is None. The cache key becomes ("custom", False, "https://...", "", None, (), False) — this never changes even after .env is reloaded with a new key.

  4. The cached client holds the old key internally. Subsequent calls hit the cache and reuse the stale client → 401.
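The failure mode can be reproduced in isolation. The following is a minimal sketch, not the Hermes code: `FakeClient`, `resolve_api_key`, and the simplified `get_cached_client` are hypothetical stand-ins, the URL is a placeholder, and it assumes the key is read from `GLM_API_KEY` when the client is constructed, as the steps above describe.

```python
import os
import threading

_client_cache = {}
_client_cache_lock = threading.Lock()


class FakeClient:
    """Hypothetical stand-in for an OpenAI client: stores the key at construction."""

    def __init__(self, base_url, api_key):
        self.base_url = base_url
        self.api_key = api_key


def resolve_api_key(api_key):
    # Assumption: with no explicit api_key, the key comes from the environment.
    return api_key if api_key is not None else os.environ.get("GLM_API_KEY", "")


def get_cached_client(provider, base_url, api_key=None):
    # The cache key uses the api_key *parameter* (None here), not the resolved key.
    key = (provider, base_url, api_key)
    with _client_cache_lock:
        client = _client_cache.get(key)
        if client is None:
            client = FakeClient(base_url, resolve_api_key(api_key))
            _client_cache[key] = client
        return client


os.environ["GLM_API_KEY"] = "old-key"
first = get_cached_client("zai", "https://example.invalid/v4")

# Simulates load_dotenv(..., override=True) after a key rotation.
os.environ["GLM_API_KEY"] = "new-key"
second = get_cached_client("zai", "https://example.invalid/v4")
print(second is first, second.api_key)  # True old-key

# Clearing the cache under its lock forces a rebuild with the current key.
with _client_cache_lock:
    _client_cache.clear()
third = get_cached_client("zai", "https://example.invalid/v4")
print(third.api_key)  # new-key
```

Because the cache key is identical before and after the rotation, the second lookup returns the first client unchanged. Only an explicit clear, or a restart, produces a client that holds the new key.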

Reproduction

  1. Configure auxiliary.compression with a provider that uses .env API keys:
    auxiliary:
      compression:
        provider: zai
        model: glm-5-turbo
        base_url: https://open.bigmodel.cn/api/coding/paas/v4
  2. Start gateway, let it run for a while (compression client gets cached)
  3. Rotate the API key in .env (e.g., update GLM_API_KEY)
  4. Observe: compression calls still use old key → 401 errors
  5. Restart gateway → works again (fresh cache)

Suggested Fix

Clear _client_cache after .env hot-reload in gateway/run.py:

# Existing hot-reload call in gateway/run.py:
load_dotenv(_env_path, override=True, encoding="utf-8")

# Invalidate the auxiliary client cache so rotated keys take effect
try:
    from agent.auxiliary_client import _client_cache, _client_cache_lock
    with _client_cache_lock:
        _client_cache.clear()
except Exception:
    # Best-effort: a failed invalidation must not break message handling
    pass

Alternatively, make the cache key include a hash of the actual resolved API key (not just the api_key parameter).
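A rough sketch of that alternative, under stated assumptions: `key_fingerprint` and `make_cache_key` are hypothetical helpers, the real cache key has more fields than shown here, and `env_var` stands in for however the provider's key is actually looked up.

```python
import hashlib
import os


def key_fingerprint(resolved_key):
    """Short SHA-256 digest so the raw secret is not stored in the cache key."""
    return hashlib.sha256(resolved_key.encode("utf-8")).hexdigest()[:16]


def make_cache_key(provider, base_url, api_key=None, env_var="GLM_API_KEY"):
    # Assumption: env_var names the variable the provider's key is read from.
    resolved = api_key if api_key is not None else os.environ.get(env_var, "")
    return (provider, base_url, key_fingerprint(resolved))


os.environ["GLM_API_KEY"] = "old-key"
before = make_cache_key("zai", "https://example.invalid/v4")

# Simulates a key rotation picked up by the .env hot-reload.
os.environ["GLM_API_KEY"] = "new-key"
after = make_cache_key("zai", "https://example.invalid/v4")

print(before != after)  # True
```

With this approach a rotated key yields a different cache key, so the next lookup misses and builds a fresh client. The old client stays in the cache under its old key until something evicts it, so clearing the cache on reload may still be worthwhile.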

Workaround

Configuring auxiliary.compression to use the main provider (e.g., kimi-coding) instead of a separate provider with base_url avoids the worst case, but the underlying cache staleness issue remains for any named provider.

Environment

  • Hermes Agent: latest main branch
  • macOS, launchd-managed gateway
  • 4 concurrent gateways (main + 3 TL profiles)
