hermes - ✅(Solved) Fix Auxiliary compression timeout can poison cached sync client, causing later auxiliary calls to fail [1 pull request, 2 comments, 2 participants]

hermes · 2026-05-10 22:10:55

GitHub stats

NousResearch/hermes-agent#23432•Fetched 2026-05-11 03:29:29

Author

yepyhun

Participants

teknium1

yepyhun

Timeline (top)

cross-referenced ×3, labeled ×3, commented ×2, closed ×1

This affected context compression and memory/background auxiliary tasks in a long-running Discord gateway session. The main model route continued to work, so this does not look like a global network/auth outage.

Error Message

When preflight context compression uses the main openai-codex auxiliary route and the Responses stream exceeds the configured auxiliary timeout, the timeout path closes the underlying auxiliary client. The sync auxiliary client can remain in the cache afterward, so later auxiliary calls reuse the closed (poisoned) client and fail quickly with Connection error. The observed messages were:

Failed to generate context summary: Connection error.. Further summary attempts paused for 30 seconds.
Brainstack explicit capture validation extractor failed: Connection error.

Root Cause

The timeout handler in the Codex auxiliary Responses stream closes the real client on timeout. The sync cache path can still return the cached client later, because sync cache hits do not validate whether the cached client was previously closed/poisoned.
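
The failure mode described above can be reproduced in miniature. The sketch below is only an illustration: `FakeClient`, `_cache`, `get_cached_client`, and `simulate_timeout` are hypothetical stand-ins, not the Hermes or OpenAI SDK code. It shows how a cache that returns hits without a liveness check hands back a client that was closed by an earlier timeout.

```python
class FakeClient:
    """Hypothetical stand-in for an SDK client (not the real OpenAI client)."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def complete(self, prompt):
        # A closed transport can no longer send requests.
        if self.closed:
            raise ConnectionError("Connection error.")
        return f"summary of {prompt!r}"


_cache = {}  # hypothetical cache: key -> client


def get_cached_client(key):
    # Mirrors the reported sync behaviour: a hit is returned as-is,
    # with no check that the client is still usable.
    if key not in _cache:
        _cache[key] = FakeClient()
    return _cache[key]


def simulate_timeout(key):
    # The timeout path closes the client but leaves the cache entry in place.
    get_cached_client(key).close()


first = get_cached_client("main")
simulate_timeout("main")
second = get_cached_client("main")  # same object as `first`, now closed

try:
    second.complete("history")
    outcome = "ok"
except ConnectionError as exc:
    outcome = str(exc)
```

Every later caller that shares the `"main"` key receives the same closed object, which matches the reported pattern of unrelated auxiliary tasks failing after a single timeout.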

Fix Action

Fixed

Fixed by PR: fix(auxiliary): evict cached client on timeout/connection error (https://github.com/NousResearch/hermes-agent/pull/23482)

PR fix notes

PR #23482: fix(auxiliary): evict cached client on timeout/connection error

Repository: NousResearch/hermes-agent
Author: teknium1
State: closed | merged: True
Link: https://github.com/NousResearch/hermes-agent/pull/23482

Description (problem / solution / changelog)

Closes #23432.

Summary

After an auxiliary Codex timeout, later provider: main aux calls (memory flush, background review, next compaction) stop failing with stale Connection error. Compression no longer drops the user into the static fallback marker after a single timeout in a long-running gateway session.

Root cause

_CodexCompletionsAdapter._close_client_on_timeout closes the inner OpenAI client to unstick a hung Responses stream. The cached wrapper still pointed at that now-dead transport. Sync _get_cached_client has no liveness check (async does, via loop identity), and call_llm's connection-error fallback only fires when is_auto — so an explicit provider (including auxiliary.compression.provider: main → openai-codex) never evicts.

Changes

agent/auxiliary_client.py: new _evict_cached_client_instance(target) walks _client_cache and drops entries whose stored client is target (or wraps it via _real_client for CodexAuxiliaryClient).
agent/auxiliary_client.py: _close_client_on_timeout evicts the wrapper after closing the inner client.
agent/auxiliary_client.py: call_llm + async_call_llm evict on _is_connection_error(first_err) before re-raising, independent of is_auto.

Validation

| Scenario | Before | After |
| --- | --- | --- |
| Codex aux timeout | inner client closed, cache entry retained | inner client closed AND wrapper evicted |
| Next `call_llm` (explicit provider) | reuses dead client → `Connection error` | rebuilds via `resolve_provider_client` → succeeds |
| Non-connection error (e.g. 400) | cache retained | cache retained (no thrash) |

6 new tests in TestAuxiliaryClientPoisonedCacheEviction covering helper semantics, the timeout closer eviction, and the explicit-provider call_llm/async_call_llm paths. All pass.
Full tests/agent/test_auxiliary_client.py (147) green.
E2E with real imports: timeout fires → inner client close() called once → cache entry gone → next call_llm resolves a fresh client and returns the response. Non-connection error path verified to not evict.
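
The two error paths in the validation notes (connection errors evict, other errors keep the cache entry) can be sketched as follows. This is a simplified model, not the code of `call_llm`: `call_with_eviction`, `is_connection_error`, and the dict-based cache are hypothetical, and the real `_is_connection_error` may classify exceptions differently.

```python
def is_connection_error(exc):
    # Assumption: only these built-in types count as connection failures here.
    return isinstance(exc, (ConnectionError, TimeoutError))


def call_with_eviction(cache, key, build_client, send):
    """Send one request through a cached client, evicting it on connection failure."""
    client = cache.get(key)
    if client is None:
        client = cache[key] = build_client()
    try:
        return send(client)
    except Exception as exc:
        if is_connection_error(exc):
            # Drop the entry so the next call rebuilds a fresh client.
            cache.pop(key, None)
        # Always re-raise: eviction is a side effect, not a recovery.
        raise


def failing_transport(_client):
    raise ConnectionError("Connection error.")


def bad_request(_client):
    raise ValueError("400 Bad Request")


cache = {}
try:
    call_with_eviction(cache, "main", object, failing_transport)
except ConnectionError:
    evicted_after_connection_error = "main" not in cache

try:
    call_with_eviction(cache, "main", object, bad_request)
except ValueError:
    retained_after_bad_request = "main" in cache
```

Keeping the entry on non-connection errors avoids rebuilding a healthy client every time a request is rejected for unrelated reasons.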

Changed files

agent/auxiliary_client.py (modified, +60/-0)
tests/agent/test_auxiliary_client.py (modified, +185/-0)


Summary

Observed Behavior

Live log sequence from a long-running gateway session:

Preflight compression: ~234,406 tokens >= 217,600 threshold (model gpt-5.5, ctx 272,000)
context compression started: session=20260510_171239_804511 messages=255 tokens=~234,406 model=gpt-5.5 focus=None
Auxiliary compression: using main (gpt-5.5) at https://chatgpt.com/backend-api/codex/
Failed to generate context summary: Connection error.. Further summary attempts paused for 30 seconds.
context compression done: session=20260510_213719_f58dae messages=255->8 tokens=~27,120

Earlier in the same run there was a timeout on the same auxiliary path:

Auxiliary compression: using main (gpt-5.5) at https://chatgpt.com/backend-api/codex/
Failed to generate context summary: Codex auxiliary Responses stream exceeded 20.0s total timeout. Further summary attempts paused for 30 seconds.

After that timeout, repeated downstream auxiliary tasks using provider: main started failing with:

Auxiliary flush_memories: using main (gpt-5.5) at https://chatgpt.com/backend-api/codex/
Brainstack explicit capture validation extractor failed: Connection error.

Meanwhile normal main-agent calls to the same provider/model were succeeding in the same time window, including large requests around 168k-193k input tokens. That makes a stale/closed auxiliary client more likely than a global provider outage.

Config Shape

Relevant config:

model:
  default: gpt-5.5
  provider: openai-codex
  base_url: https://chatgpt.com/backend-api/codex
auxiliary:
  compression:
    provider: main
    model: ""
    timeout: 20

No stale stepfun/step-3.5-flash route was involved in this observed failure. The compression task inherited the main model route as intended.

Suspected Cause

Relevant live-source areas inspected:

agent/auxiliary_client.py: _CodexCompletionsAdapter timeout path calls client close on timeout.
agent/auxiliary_client.py: _get_cached_client(...) returns sync cached clients without a liveness check.
agent/context_compressor.py: _generate_summary(...) calls call_llm(task="compression", main_runtime=...) and then records _last_summary_error when the auxiliary call fails.
run_agent.py: preflight compression emits the user-facing fallback marker when _last_summary_error is set.

Expected Behavior

After an auxiliary timeout or connection error:

the poisoned/closed cached client should be evicted;
the next auxiliary call should build a fresh client;
compression should optionally retry once with a fresh client before inserting a fallback context marker;
downstream auxiliary memory/background tasks should not inherit the broken cached client state.
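
The optional "retry once with a fresh client" behaviour could look roughly like the sketch below. The issue proposes it and the PR notes do not confirm that it was implemented, so treat this purely as an illustration; `summarize_with_one_retry`, `StubClient`, and the callbacks are hypothetical names.

```python
class StubClient:
    """Hypothetical client that is either healthy or permanently broken."""

    def __init__(self, healthy):
        self.healthy = healthy


def summarize_with_one_retry(get_client, evict, summarize):
    """Try once, then once more with a fresh client; return None if both fail."""
    for _attempt in range(2):
        client = get_client()
        try:
            return summarize(client)
        except ConnectionError:
            # Discard the broken client so the next attempt gets a new one.
            evict(client)
    return None  # the caller would fall back to the static marker here


pool = [StubClient(healthy=False), StubClient(healthy=True)]
current = []


def get_client():
    if not current:
        current.append(pool.pop(0))
    return current[0]


def evict(_client):
    current.clear()


def summarize(client):
    if not client.healthy:
        raise ConnectionError("Connection error.")
    return "summary"


result = summarize_with_one_retry(get_client, evict, summarize)
```

Bounding the loop at two attempts keeps a genuinely unreachable provider from delaying the fallback marker indefinitely.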

Suggested Fix

At minimum:

On timeout/connection failure from a cached sync auxiliary client, evict that cache entry before raising.
Add a liveness/closed-state guard for sync cached clients, similar in spirit to the existing async loop validation.
Add a regression test where:
- auxiliary compression times out and closes the wrapped client;
- a later provider: main auxiliary call is made;
- the later call must create/use a fresh client rather than reusing the closed one.
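
A closed-state guard and the regression test outlined above might be sketched like this. It assumes the cached client exposes an `is_closed` attribute or method; `FakeClient`, `GuardedCache`, and `looks_closed` are hypothetical and do not reflect the actual Hermes test suite.

```python
import unittest


class FakeClient:
    """Hypothetical client with an explicit closed flag."""

    def __init__(self):
        self._closed = False

    def close(self):
        self._closed = True

    def is_closed(self):
        return self._closed


def looks_closed(client):
    # Accept either an `is_closed()` method or an `is_closed` attribute.
    probe = getattr(client, "is_closed", None)
    if probe is None:
        return False  # no way to tell; assume usable
    return bool(probe() if callable(probe) else probe)


class GuardedCache:
    """Cache that rebuilds an entry when the cached client looks closed."""

    def __init__(self, factory):
        self._factory = factory
        self._entries = {}

    def get(self, key):
        client = self._entries.get(key)
        if client is None or looks_closed(client):
            client = self._entries[key] = self._factory()
        return client


class PoisonedCacheRegression(unittest.TestCase):
    def test_closed_client_is_not_reused(self):
        cache = GuardedCache(FakeClient)
        first = cache.get("main")
        first.close()  # simulates the timeout path closing the client
        second = cache.get("main")
        self.assertIsNot(second, first)
        self.assertFalse(looks_closed(second))
```

Saved as a module, the test case can be run with `python -m unittest`. A guard like this complements eviction: it also catches clients closed by code paths that never call the eviction helper.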

Impact

This can cause context compression to drop middle turns into a static fallback marker even though the main model route is still healthy. It can also make memory/background auxiliary tasks appear broken after a single auxiliary timeout.

Notes

I am reporting this as a Hermes auxiliary-client/runtime issue, not as a memory-provider storage issue. The memory provider was only the downstream consumer that made the poisoned cached auxiliary route visible.


#api #chain error #conversation history #tool integration #LLM response
