hermes - [Bug]: All openai-codex / gpt-5.5 primary calls hang silently for full stale timeout

Bug Description

When the primary model is openai-codex / gpt-5.5 (base_url: https://chatgpt.com/backend-api/codex), the agent hangs silently on every turn for the full non-streaming stale timeout (≈300 s) before Hermes's stale-call detector kills the connection. Only then does the fallback chain activate. The intervening 5 minutes give zero feedback — no tokens, no error, no spinner update — making the agent appear completely frozen.

The same OAuth profile and base URL works immediately with gpt-5.4-codex as the primary model. This strongly points to a backend-specific incompatibility with how Hermes builds the gpt-5.5 payload for the ChatGPT Codex endpoint.

Note: This issue is not about the Nous Portal fallback (separately tracked: my local checkout still lacks the May 4 fix/nous-gpt5-fallback-chat-completions patch, which is unrelated). The problem here is the primary Codex path itself.

Steps to Reproduce

  1. Configure primary model as openai-codex / gpt-5.5:
    model:
      default: gpt-5.5
      provider: openai-codex
      base_url: https://chatgpt.com/backend-api/codex
    fallback_providers:
      - provider: nous
        model: openai/gpt-5.5
        base_url: https://inference-api.nousresearch.com/v1
      - provider: nous
        model: z-ai/glm-5.1
  2. Authenticate: hermes auth add openai-codex (ChatGPT Plus, OAuth device-code)
  3. Run hermes chat and send any message
  4. Observe: spinner spins for ~300 s, then:
    Fallback activated: gpt-5.5 → openai/gpt-5.5 (nous)
    (which itself fails after ~10 s due to #nous-gpt5-routing, then falls through to GLM-5.1)

Expected Behavior

gpt-5.5 via openai-codex should respond the same way as gpt-5.4-codex on the same OAuth profile.

Actual Behavior

  • Agent freezes silently for ~300 s
  • agent.log shows only:
    2026-05-07 05:09:51,158 INFO [session_id] root: Fallback activated: gpt-5.5 → openai/gpt-5.5 (nous)
  • errors.log is empty for the session — no timeout traceback, no 4xx/5xx, no APITimeoutError
  • The 300 s delay exactly matches the default stale_timeout_seconds (300.0) for non-streaming calls (run_agent.py:2897)

Affected Component

Agent Core (conversation loop, context compression, memory)

Messaging Platform (if gateway-related)

No response

Debug Report

Debug report uploaded:
  Report       https://paste.rs/wXppy
  agent.log    https://paste.rs/wW161
  gateway.log  https://dpaste.com/5KP8EUDHE

Operating System

Ubuntu 24.04

Python Version

3.11.14

Hermes Version

v0.13.0 (2026.5.7)

Additional Logs / Traceback (optional)

Root Cause Analysis (optional)

Hypothesis

The ChatGPT Codex backend (chatgpt.com/backend-api/codex) is stricter than the public OpenAI API for gpt-5.5. When Hermes sends the Responses API payload built by ResponsesApiTransport.build_kwargs() in agent/transports/codex.py, certain fields cause the backend to silently drop the request rather than return a structured error.

Prime suspects (based on Codex backend behavior and upstream evidence):

  1. reasoning: {effort: "xhigh", summary: "auto"} - gpt-5.5 via the ChatGPT Codex endpoint may not accept reasoning configuration the same way gpt-5.4 did. Newer model, newer backend gate. The Codex Aux adapter (#17004) translates extra_body.reasoning, but the main transport unconditionally injects it.

  2. include: ["reasoning.encrypted_content"] - paired with reasoning, this may cause the backend to enter a "thinking" mode that times out internally.

  3. store: false - gpt-5.5 on Codex may behave differently with store=false vs store=true (which the official Codex CLI uses). Related to #10217, which stripped reasoning item IDs when store=false.

  4. Tool schema sanitization - gpt-5.5 may hit a stricter schema validator than gpt-5.4. The May 7 commit 3924cb408 ("strip Codex-hostile top-level schema combinators") confirms the Codex backend rejects schemas that the public API accepts.
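
One way to narrow down which suspect matters is to send payload variants that each alter a single field and see which ones stop hanging. The sketch below only generates the variants; the baseline dict and the helper name payload_variants are hypothetical and are not taken from ResponsesApiTransport.build_kwargs(). The field values mirror the suspects listed above, not a captured Hermes request.

```python
from copy import deepcopy

# Hypothetical baseline payload. Field names follow the suspects above;
# the values are illustrative and were not captured from a real request.
BASELINE = {
    "model": "gpt-5.5",
    "reasoning": {"effort": "xhigh", "summary": "auto"},
    "include": ["reasoning.encrypted_content"],
    "store": False,
    "tools": [],
}


def payload_variants(baseline):
    """Yield (label, payload) pairs, each altering one suspect from the list."""
    yield "baseline", deepcopy(baseline)

    low_effort = deepcopy(baseline)
    low_effort["reasoning"]["effort"] = "low"
    yield "low-effort", low_effort

    no_include = deepcopy(baseline)
    no_include.pop("include", None)
    yield "no-include", no_include

    # Dropping reasoning also drops include, since include refers to
    # reasoning content in this payload.
    no_reasoning = deepcopy(baseline)
    no_reasoning.pop("reasoning", None)
    no_reasoning.pop("include", None)
    yield "no-reasoning", no_reasoning

    store_true = deepcopy(baseline)
    store_true["store"] = True
    yield "store-true", store_true

    no_tools = deepcopy(baseline)
    no_tools.pop("tools", None)
    yield "no-tools", no_tools


if __name__ == "__main__":
    for label, payload in payload_variants(BASELINE):
        print(label, sorted(payload))
```

Each variant would then be sent with a short client-side timeout, so a hang shows up in seconds instead of after the full 300 s stale timeout.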

Why the error is "silent" from Hermes's perspective

The Codex backend does reject gpt-5.5 for some ChatGPT Plus accounts. In tool-level code (Codex CLI, curl) this materializes as a clear 400 with body {"detail":"The 'gpt-5.5' model is not supported when using Codex with a ChatGPT account."} (documented in openai/codex#19654 and multiple Reddit threads). However, when Hermes calls the same endpoint through the streaming responses.stream() path and the rejection happens before any stream events are emitted, the SDK never raises an exception — the connection simply sits idle. Without the stale-call detector (_compute_non_stream_stale_timeout), the hang would be indefinite. The 300 s timeout is Hermes protecting itself from a silent backend, not the backend surfacing an error.
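
If the hang really happens before the first stream event, a first-event deadline would surface it much sooner than the 300 s stale timeout. The following is a minimal sketch of that idea using only the standard library. FirstEventTimeout and iter_with_first_event_timeout are hypothetical names, not Hermes or OpenAI SDK APIs, and `events` stands in for whatever iterable the streaming call returns.

```python
import queue
import threading


class FirstEventTimeout(Exception):
    """Raised when a stream produces nothing within the allowed window."""


def iter_with_first_event_timeout(events, first_event_timeout):
    """Re-yield items from `events`, failing fast if the first one is late.

    Only the first item is deadline-bound; later items may take as long as
    they need. The worker thread is not cancelled on timeout, so a real
    integration would also have to close the underlying connection.
    """
    q = queue.Queue()

    def pump():
        # Drain the blocking iterator on a worker thread.
        try:
            for item in events:
                q.put(("item", item))
            q.put(("done", None))
        except Exception as exc:  # forward stream errors to the consumer
            q.put(("error", exc))

    threading.Thread(target=pump, daemon=True).start()

    timeout = first_event_timeout
    while True:
        try:
            kind, value = q.get(timeout=timeout)
        except queue.Empty:
            raise FirstEventTimeout(
                f"no stream event within {first_event_timeout} s"
            ) from None
        if kind == "done":
            return
        if kind == "error":
            raise value
        yield value
        timeout = None  # the deadline applies to the first event only
```

With a deadline of a few seconds, a silently rejected request would trigger the fallback chain almost immediately and leave a traceback in errors.log.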

Why gpt-5.5 specifically?

  • A/B test in the same codebase: commit facea8455 (Apr 25, landed in v0.12.0) explicitly states: "the same backend can accept temperature for some models and reject it for others (e.g. gpt-5.4 accepts but gpt-5.5 rejects on the same OpenAI endpoint)". This shows gpt-5.5 is already gated differently by the ChatGPT/Codex backend.
  • Upstream reports: openai/codex#19654, reddit r/codex threads (May 2026) — ChatGPT Plus accounts hitting The 'gpt-5.5' model is not supported when using Codex with a ChatGPT account.
  • OpenClaw ecosystem hit an identical bug: openclaw/openclaw#72966 - openai-codex/gpt-5.5 hung silently because unsupported native ChatGPT payload fields caused silent rejection.
  • Auxiliary compression also times out on Codex: #20250 - same provider/model/base_url, same "read operation timed out" symptom.
  • May 7 commit 3924cb408 already fixed one Codex payload issue (schema combinators) for the same backend.
  • May 7 commit 5533ad764 added Codex Responses stream timeout enforcement for the aux path, acknowledging Codex can hang silently.

Proposed Fix (optional)

Add a gpt-5.5 / ChatGPT Codex backend payload sanitizer similar to OpenClaw's approach:

  1. In agent/transports/codex.py, when is_codex_backend=True and model starts with gpt-5.5, strip or adjust:

    • reasoning → omit or reduce to minimal {effort: "low"} or disable entirely
    • include → omit when reasoning is off
    • store → test whether true behaves differently
    • Any other fields the ChatGPT endpoint doesn't accept for this model
  2. Alternatively, extend _preflight_codex_api_kwargs (the hook already exists; see preflight_kwargs) so that it sanitizes conditionally based on backend and model.

  3. Add explicit logging when sanitization occurs so users know reasoning was disabled for Codex compatibility.
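
The proposal above could look roughly like the sketch below. It takes the "omit" option for reasoning and include, leaves store untouched because its effect still needs testing, and logs whenever it changes the payload. The function name sanitize_codex_kwargs, its parameters, and the logger name are all hypothetical; the real hook in agent/transports/codex.py may be shaped differently.

```python
import logging

logger = logging.getLogger("codex_sanitizer")  # hypothetical logger name

# Fields this sketch drops for gpt-5.5 on the ChatGPT Codex backend.
# Which fields actually need removing is still a hypothesis.
SUSPECT_FIELDS = ("reasoning", "include")


def sanitize_codex_kwargs(kwargs, *, model, is_codex_backend):
    """Return a copy of kwargs with suspect fields removed where applicable.

    The input dict is never mutated. Payloads for other backends or other
    models are returned as an unchanged shallow copy.
    """
    cleaned = dict(kwargs)
    if not (is_codex_backend and model.startswith("gpt-5.5")):
        return cleaned

    removed = [field for field in SUSPECT_FIELDS if field in cleaned]
    for field in removed:
        del cleaned[field]

    if removed:
        # Tell the user that reasoning was disabled for compatibility.
        logger.warning(
            "Codex compatibility: dropped %s for model %s",
            ", ".join(removed),
            model,
        )
    return cleaned
```

Returning a copy keeps the caller's original kwargs available, which matters if the same payload is later reused for a fallback provider that does accept reasoning.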

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR
