hermes - 💡(How to fix) Fix auxiliary_client _client_cache not cleared on .env hot-reload, causes stale 401 errors

StepCodex · 2026-05-07T03:39:02Z



Error Message

401 `"令牌已过期或验证不正确"` ("The token has expired or failed verification"), returned on auxiliary operations (compression, summary, vision) until the gateway is restarted.


Fix Action

Workaround

Configuring `auxiliary.compression` to use the main provider (e.g., `kimi-coding`) instead of a separate provider with `base_url` avoids the worst case, but the underlying cache staleness issue remains for any named provider.



Bug Description

When a gateway is long-running and API keys in `.env` are updated (e.g., key rotation), the gateway correctly calls `load_dotenv(override=True)` to refresh `os.environ`. However, the auxiliary client cache (`_client_cache` in `agent/auxiliary_client.py`) retains OpenAI client instances with the old API key baked in.

This causes persistent 401 `"令牌已过期或验证不正确"` ("The token has expired or failed verification") errors on auxiliary operations (compression, summary, vision) until the gateway is manually restarted.

Root Cause

1. `gateway/run.py` reloads `.env` before each message (~line 13208):
   ```python
   load_dotenv(_env_path, override=True, encoding="utf-8")
   ```
2. `_get_cached_client()` in `agent/auxiliary_client.py` (~line 3130) caches OpenAI clients by `(provider, async_mode, base_url, api_key, api_mode, ...)`.
3. When `auxiliary.compression` has `provider: zai` with an explicit `base_url` but no `api_key`, the resolved `api_key` parameter is `None`. The cache key becomes `("custom", False, "https://...", "", None, (), False)`; this **never changes**, even after `.env` is reloaded with a new key.
4. The cached client holds the old key internally. Subsequent calls hit the cache and reuse the stale client → 401.

Reproduction

1. Configure `auxiliary.compression` with a provider that uses `.env` API keys:
   ```yaml
   auxiliary:
     compression:
       provider: zai
       model: glm-5-turbo
       base_url: https://open.bigmodel.cn/api/coding/paas/v4
   ```
2. Start the gateway and let it run for a while (the compression client gets cached).
3. Rotate the API key in `.env` (e.g., update `GLM_API_KEY`).
4. Observe: compression calls still use the old key → 401 errors.
5. Restart the gateway → works again (fresh cache).

Suggested Fix

Clear `_client_cache` after the `.env` hot-reload in `gateway/run.py`:

```python
# After load_dotenv succeeds:
load_dotenv(_env_path, override=True, encoding="utf-8")

# Invalidate auxiliary client cache so new keys take effect
try:
    from agent.auxiliary_client import _client_cache, _client_cache_lock
    with _client_cache_lock:
        _client_cache.clear()
except Exception:
    pass
```

Alternatively, make the cache key include a hash of the actual resolved API key (not just the `api_key` parameter).
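The alternative could look roughly like the sketch below. It is an illustration under assumptions, not the project's code: `FakeClient`, `get_cached_client`, `_fingerprint`, and the three-element cache key are hypothetical, and the real `_get_cached_client()` takes more parameters than shown here.

```python
import hashlib
import os
import threading


class FakeClient:
    """Hypothetical stand-in for an OpenAI-style client."""

    def __init__(self, base_url, api_key):
        self.base_url = base_url
        self.api_key = api_key


_client_cache = {}
_client_cache_lock = threading.Lock()


def _fingerprint(secret):
    # Hash the key so the raw secret is not stored in the cache key.
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()[:16]


def get_cached_client(provider, base_url, api_key=None, env_var="GLM_API_KEY"):
    # Resolve the key *before* building the cache key, so a rotated key
    # in the environment produces a different cache entry.
    resolved = api_key or os.environ.get(env_var, "")
    cache_key = (provider, base_url, _fingerprint(resolved))
    with _client_cache_lock:
        client = _client_cache.get(cache_key)
        if client is None:
            client = FakeClient(base_url, resolved)
            _client_cache[cache_key] = client
        return client


os.environ["GLM_API_KEY"] = "old-key"
before = get_cached_client("custom", "https://example.invalid/v4")

os.environ["GLM_API_KEY"] = "new-key"  # simulated .env hot-reload
after = get_cached_client("custom", "https://example.invalid/v4")

print(after is before)  # False: a new client was built
print(after.api_key)    # new-key
```

One trade-off worth weighing: since `.env` is reloaded before each message, clearing the whole cache after every reload would likely rebuild clients on every message, whereas a key fingerprint only causes a rebuild when the key actually changes. In this sketch the entry for the old key stays in the cache until it is evicted some other way, so a real implementation would probably also want to drop superseded entries.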

Environment

- Hermes Agent: latest main branch
- macOS, launchd-managed gateway
- 4 concurrent gateways (main + 3 TL profiles)
