hermes - ✅(Solved) Fix Iteration-limit summary crashes with CodexAuxiliaryClient missing responses for codex_responses custom provider [1 pull request, 1 participant]

GitHub stats
NousResearch/hermes-agent#20111 (fetched 2026-05-06 06:38:43)
Comments: 0 · Participants: 1 · Timeline: 4 · Reactions: 0
Timeline (top): labeled ×3, cross-referenced ×1

Error Message

I reached the maximum iterations (60) but couldn't summarize. Error: 'CodexAuxiliaryClient' object has no attribute 'responses'

Root Cause

Likely root cause

Fix Action

Fixed

PR fix notes

PR #20122: fix(agent): expose .responses on CodexAuxiliaryClient for iteration-limit summary

Description (problem / solution / changelog)

Summary

Fixes the crash 'CodexAuxiliaryClient' object has no attribute 'responses' when an agent turn hits agent.max_turns while using a named custom provider with api_mode: codex_responses.

Root cause

When a named custom provider declares api_mode: codex_responses, resolve_provider_client() wraps the plain OpenAI client in CodexAuxiliaryClient (via _wrap_if_needed). During fallback activation, self.client is set to this wrapper.

When the iteration-limit summary path runs, it calls _run_codex_stream(codex_kwargs) without passing an explicit client. _run_codex_stream falls back to self._ensure_primary_openai_client() which returns the CodexAuxiliaryClient wrapper. It then tries active_client.responses.stream(...) which crashes because CodexAuxiliaryClient only exposed .chat, .api_key, .base_url, and .close().

The main agent loop avoids this by always creating a fresh request client via _create_request_openai_client() (which produces a plain OpenAI client), but the summary path did not follow the same pattern.
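
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the Hermes code: FakeOpenAIClient, ChatOnlyWrapper, and run_codex_stream are hypothetical stand-ins that only mirror the shape described in this PR (a wrapper exposing .chat, .api_key, .base_url, and .close(), and a caller that reaches for .responses.stream).

```python
from types import SimpleNamespace


class FakeOpenAIClient:
    """Hypothetical stand-in for a plain OpenAI client (not the real SDK)."""

    def __init__(self):
        self.api_key = "sk-example"
        self.base_url = "https://example.invalid/openai/v1"
        # Only the attribute shape matters here: client.responses.stream(...)
        self.responses = SimpleNamespace(stream=lambda **kwargs: ("stream", kwargs))


class ChatOnlyWrapper:
    """Hypothetical pre-fix wrapper: exposes .chat, .api_key, .base_url, .close()."""

    def __init__(self, real_client, model):
        self._real_client = real_client
        self._model = model
        self.chat = SimpleNamespace(
            completions=SimpleNamespace(create=lambda **kwargs: None)
        )
        self.api_key = real_client.api_key
        self.base_url = real_client.base_url

    def close(self):
        pass


def run_codex_stream(active_client, codex_kwargs):
    # Same access pattern as described above: active_client.responses.stream(...)
    return active_client.responses.stream(**codex_kwargs)


wrapped = ChatOnlyWrapper(FakeOpenAIClient(), "gpt-5.5")
try:
    run_codex_stream(wrapped, {"model": "gpt-5.5"})
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

With a plain client the same call succeeds, which matches the observation that the main agent loop is unaffected because it always builds a fresh request client.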

Fix

Delegate .responses to the underlying real OpenAI client in CodexAuxiliaryClient.__init__, making it a proper drop-in wrapper for all OpenAI client interfaces.
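
A rough sketch of what this kind of delegation could look like follows. It is an illustration under assumptions, not the actual diff: DelegatingWrapper and the fake client are hypothetical names, and the real CodexAuxiliaryClient constructor may differ in its other attributes.

```python
from types import SimpleNamespace


class DelegatingWrapper:
    """Hypothetical wrapper that forwards .responses to the wrapped client."""

    def __init__(self, real_client, model):
        self._real_client = real_client
        self._model = model
        self.api_key = real_client.api_key
        self.base_url = real_client.base_url
        # The idea described above: expose the real client's Responses interface
        # so callers that expect a plain client keep working.
        self.responses = real_client.responses

    def close(self):
        # Nothing to release in this sketch.
        pass


# Stand-in for a plain client; only the attribute shape is modelled.
fake_client = SimpleNamespace(
    api_key="sk-example",
    base_url="https://example.invalid/openai/v1",
    responses=SimpleNamespace(stream=lambda **kwargs: ("stream", kwargs)),
)

wrapper = DelegatingWrapper(fake_client, "gpt-5.5")
result = wrapper.responses.stream(model="gpt-5.5")
print(result)
```

Binding the attribute once in __init__ keeps the wrapper explicit about which interfaces it forwards, at the cost of having to add each new interface by hand.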

Files changed: agent/auxiliary_client.py (+4/-0 effective), tests/agent/test_auxiliary_client.py (+40)

Regression coverage

  • TestCodexAuxiliaryClientResponsesAttribute::test_exposes_responses_from_real_client — verifies .responses delegates correctly
  • TestCodexAuxiliaryClientResponsesAttribute::test_responses_stream_usable — verifies .responses.stream() is callable (the exact interface _run_codex_stream uses)
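
The two regression tests listed above might look roughly like the sketch below. This is not the test code from the PR: WrapperUnderTest is a hypothetical stand-in for the patched wrapper, and only the test names are taken from the list. It uses unittest.mock.MagicMock, whose child attributes are created once and then reused, so identity checks on real.responses are stable.

```python
import unittest
from unittest.mock import MagicMock


class WrapperUnderTest:
    """Hypothetical stand-in for the wrapper after the fix."""

    def __init__(self, real_client, model):
        self.api_key = real_client.api_key
        self.base_url = real_client.base_url
        self.responses = real_client.responses


class TestResponsesAttribute(unittest.TestCase):
    def test_exposes_responses_from_real_client(self):
        real = MagicMock()
        wrapper = WrapperUnderTest(real, "gpt-5.5")
        # The wrapper should hand back the very same object, not a copy.
        self.assertIs(wrapper.responses, real.responses)

    def test_responses_stream_usable(self):
        real = MagicMock()
        wrapper = WrapperUnderTest(real, "gpt-5.5")
        wrapper.responses.stream(model="gpt-5.5")
        # The call must reach the underlying client's stream method.
        real.responses.stream.assert_called_once_with(model="gpt-5.5")


def run_suite():
    """Run both tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestResponsesAttribute)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_suite() executes both cases and returns a result whose wasSuccessful() method reports whether they passed.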

Testing

168 related tests pass via scripts/run_tests.sh:

  • tests/agent/test_auxiliary_client.py (88 tests)
  • tests/agent/test_auxiliary_named_custom_providers.py (26 tests)
  • tests/run_agent/test_run_agent_codex_responses.py (47 tests)
  • tests/run_agent/test_flush_memories_codex.py (7 tests)

Fixes #20111 (Iteration-limit summary crashes with CodexAuxiliaryClient missing responses for codex_responses custom provider)

Changed files

  • agent/auxiliary_client.py (modified, +6/-1)
  • gateway/platforms/whatsapp.py (modified, +4/-1)
  • tests/agent/test_auxiliary_client.py (modified, +40/-0)
  • tests/tools/test_file_tools.py (modified, +37/-0)
  • tools/file_tools.py (modified, +6/-6)


Bug description

When an agent turn reaches agent.max_turns while running a codex_responses model through a named custom/fallback provider, Hermes tries to generate the iteration-limit summary but crashes with:

I reached the maximum iterations (60) but couldn't summarize. Error: 'CodexAuxiliaryClient' object has no attribute 'responses'

The original agent turn correctly hit the iteration limit, but the fallback summary path should still produce a summary. Instead, the summary path receives a CodexAuxiliaryClient wrapper where it expects a raw OpenAI client exposing .responses.

Environment

  • Hermes Agent: v0.12.0 (2026.4.30)
  • Commit: 75e1339d4
  • Python: 3.11.15
  • OpenAI SDK: 2.32.0
  • OS: macOS-26.4.1-arm64-arm-64bit
  • Runtime: Discord gateway session

Relevant configuration shape

Sanitized config excerpt:

model:
  default: gpt-5.5
  provider: custom-or-named-provider
  api_mode: codex_responses
  context_length: 256000

custom_providers:
  - name: named_custom_responses_provider
    base_url: https://example.invalid/openai/v1
    key_env: EXAMPLE_API_KEY
    api_mode: codex_responses
    model: gpt-5.5

fallback_providers:
  - provider: named_custom_responses_provider
    model: gpt-5.5
    base_url: https://example.invalid/openai/v1
    key_env: EXAMPLE_API_KEY
    api_mode: codex_responses

agent:
  max_turns: 60

Steps to reproduce

  1. Configure a named custom provider with api_mode: codex_responses.
  2. Configure it as a fallback provider, or otherwise route a long-running gateway/CLI turn through that named custom provider.
  3. Run a task that exhausts agent.max_turns.
  4. Hermes enters the iteration-limit summary path.
  5. Summary generation fails with CodexAuxiliaryClient missing .responses.

Actual behavior

Gateway logs show the provider fallback and then summary failure:

INFO root: Fallback activated: gpt-5.5 → gpt-5.5 (named_custom_responses_provider)
WARNING root: Failed to get summary response: 'CodexAuxiliaryClient' object has no attribute 'responses'
INFO gateway.run: response ready: ... api_calls=60 response=127 chars

User-visible response:

I reached the maximum iterations (60) but couldn't summarize. Error: 'CodexAuxiliaryClient' object has no attribute 'responses'

Expected behavior

When a codex_responses turn reaches the iteration limit, Hermes should generate a summary using a client that supports the Responses API (client.responses.create/stream) or otherwise route through the correct transport adapter. It should not surface an internal wrapper attribute error to the user.
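
One way to read this expectation is as a capability check before the summary call. The sketch below is an assumption-laden illustration, not Hermes code: select_responses_client and make_raw_client are hypothetical names, and the stand-in objects only model attribute shapes.

```python
from types import SimpleNamespace


def select_responses_client(candidate, make_raw_client):
    """Return candidate if it exposes .responses, otherwise build a raw client.

    Both parameters are hypothetical; make_raw_client is any zero-argument
    factory that returns a client supporting the Responses API.
    """
    if hasattr(candidate, "responses"):
        return candidate
    return make_raw_client()


# Stand-ins for illustration only.
chat_only = SimpleNamespace(chat=object())
raw = SimpleNamespace(responses=SimpleNamespace(stream=lambda **kwargs: kwargs))

chosen = select_responses_client(chat_only, lambda: raw)
print(chosen is raw)
```

A check like this keeps the wrapper's missing attribute from surfacing to the user, although it does not address why the wrapped client reached the summary path in the first place.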

Likely root cause

In agent/auxiliary_client.py, the named custom_providers branch appears to wrap the OpenAI client with CodexAuxiliaryClient whenever the custom entry declares api_mode: codex_responses:

if entry_api_mode == "codex_responses" and not isinstance(client, CodexAuxiliaryClient):
    client = CodexAuxiliaryClient(client, final_model)

However, some main-agent paths call resolve_provider_client(..., raw_codex=True) because they need direct client.responses.* access. CodexAuxiliaryClient only exposes client.chat.completions.create(...) compatibility and does not expose .responses, so the main-agent Responses path crashes when it receives the wrapped client.

This differs from the _needs_codex_wrap() helper, which already respects raw_codex:

if raw_codex:
    return False

Suggested fix

In the named custom provider branch, respect raw_codex before wrapping:

if (
    entry_api_mode == "codex_responses"
    and not raw_codex
    and not isinstance(client, CodexAuxiliaryClient)
):
    client = CodexAuxiliaryClient(client, final_model)
else:
    client = _wrap_if_needed(client, final_model, raw_base_for_wrap, custom_key)

or route this branch through _wrap_if_needed() consistently so the existing raw_codex guard is honored.
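
The guard can be exercised on its own with a small sketch. Everything here is hypothetical: CodexStyleWrapper stands in for CodexAuxiliaryClient, maybe_wrap is an illustrative helper, and the else branch is simplified to return the client unchanged instead of calling _wrap_if_needed().

```python
class CodexStyleWrapper:
    """Hypothetical stand-in for CodexAuxiliaryClient."""

    def __init__(self, client, model):
        self.client = client
        self.model = model


def maybe_wrap(client, model, api_mode, raw_codex=False):
    """Wrap the client unless the caller asked for raw Responses access."""
    if raw_codex:
        # Callers that need client.responses.* get the unwrapped client.
        return client
    if api_mode == "codex_responses" and not isinstance(client, CodexStyleWrapper):
        return CodexStyleWrapper(client, model)
    # Already wrapped, or a mode that needs no wrapper in this sketch.
    return client


plain = object()
raw_result = maybe_wrap(plain, "gpt-5.5", "codex_responses", raw_codex=True)
wrapped_result = maybe_wrap(plain, "gpt-5.5", "codex_responses")
print(raw_result is plain, isinstance(wrapped_result, CodexStyleWrapper))
```

Passing an already wrapped client back through maybe_wrap returns it unchanged, so the helper does not double-wrap.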

Additional note

Raising agent.max_turns only delays the first condition. It does not fix the .responses attribute error in the iteration-limit summary path.

Extent analysis

TL;DR

One way to fix the issue is to make the named custom provider branch in agent/auxiliary_client.py respect the raw_codex parameter before wrapping the OpenAI client in CodexAuxiliaryClient.

Guidance

  • In agent/auxiliary_client.py, verify that the named custom provider branch checks raw_codex before wrapping the OpenAI client in CodexAuxiliaryClient.
  • Route the named custom provider branch through _wrap_if_needed() consistently so that the existing raw_codex guard is honored.
  • Run a task that exhausts agent.max_turns and verify that the iteration-limit summary path generates a summary without errors.

Example

The suggested fix involves modifying the code as follows:

if (
    entry_api_mode == "codex_responses"
    and not raw_codex
    and not isinstance(client, CodexAuxiliaryClient)
):
    client = CodexAuxiliaryClient(client, final_model)
else:
    client = _wrap_if_needed(client, final_model, raw_base_for_wrap, custom_key)

This change ensures that the raw_codex parameter is respected when wrapping the OpenAI client.

Notes

The issue is specific to the codex_responses API mode and the named custom provider branch. The fix should not affect other API modes or provider configurations.

Recommendation

Either apply the suggested raw_codex guard in agent/auxiliary_client.py, or take the approach in PR #20122, which exposes .responses on CodexAuxiliaryClient instead. In both cases, confirm that the iteration-limit summary path generates a summary without errors.
