[Feature]: Hermes codex_responses stream fails when codex.rate_limits arrives before response.created

GitHub stats
NousResearch/hermes-agent#14634 (fetched 2026-04-24 06:15:51)
Comments: 0
Participants: 1
Timeline: 3 (labeled ×3)
Reactions: 0

Error Message

Observed error:

No response received: Expected to have received response.created before codex.rate_limits

---

Expected to have received `response.created` before `codex.rate_limits`

---

Expected to have received `response.created` before ...


Problem or Use Case

When Hermes is used with provider=custom and api_mode=codex_responses against a codex-lb /v1/responses backend, the streaming path can fail if the backend emits codex.rate_limits before response.created.

Observed error:

No response received: Expected to have received response.created before codex.rate_limits

In regression testing, the core RuntimeError is:

Expected to have received `response.created` before `codex.rate_limits`

Hermes already handles one Responses streaming edge case by falling back when response.completed is missing. However, it does not currently handle another real-world compatibility case: a provider-specific prelude event arriving before response.created.

This makes the current stream event-order assumption too strict for some OpenAI-compatible backends such as codex-lb, causing the whole request to fail even though a valid final response could still be obtained through a fallback path.
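To make the failure mode concrete, the sketch below mimics a strict event-order check in plain Python. It is a toy stand-in, not the SDK's or Hermes' actual code: consume_strict and prelude_stream are hypothetical names, and only the error wording is taken from the report above.

```python
from typing import Iterable, List


def consume_strict(event_types: Iterable[str]) -> List[str]:
    """Toy stand-in for a strict stream accumulator (not the real SDK code)."""
    seen: List[str] = []
    created = False
    for event_type in event_types:
        if event_type == "response.created":
            created = True
        elif not created:
            # Same wording as the RuntimeError quoted in this report.
            raise RuntimeError(
                f"Expected to have received `response.created` before `{event_type}`"
            )
        seen.append(event_type)
    return seen


# A stream that starts with a provider-specific prelude event.
prelude_stream = ["codex.rate_limits", "response.created", "response.completed"]

try:
    consume_strict(prelude_stream)
except RuntimeError as exc:
    error_message = str(exc)
    print(error_message)
```

The same events in the conventional order (response.created first) pass through this check unchanged, which is why the failure only shows up against backends that emit a prelude.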

Proposed Solution

Add a fallback branch in _run_codex_stream(...) for errors matching:

Expected to have received `response.created` before ...

Instead of aborting the conversation, Hermes should fall back to a non-streaming responses.create(...) call without stream=True.

Suggested behavior:

  • keep the existing fallback for missing response.completed
  • add detection for response.created / prelude ordering mismatches
  • fall back to non-stream responses.create(...) for this class of stream-protocol mismatch
  • add a regression test covering the codex.rate_limits prelude case

This approach is low-risk, preserves existing behavior, and improves compatibility with OpenAI-compatible backends that emit provider-specific prelude events before response.created.
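The suggested behavior could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Hermes implementation: run_with_fallback, is_created_order_mismatch, stream_call, and create_call are hypothetical names, and the existing response.completed fallback is deliberately left out because its detection logic is not shown in this issue.

```python
from typing import Any, Callable

# Substring taken from the error text quoted in this issue.
CREATED_ORDER_MARKER = "Expected to have received `response.created` before"


def is_created_order_mismatch(exc: BaseException) -> bool:
    """True when a RuntimeError reports an event arriving before response.created."""
    return isinstance(exc, RuntimeError) and CREATED_ORDER_MARKER in str(exc)


def run_with_fallback(
    stream_call: Callable[[], Any],
    create_call: Callable[[], Any],
) -> Any:
    """Try the streaming path; on an ordering mismatch, use the non-stream call.

    Both callables are hypothetical stand-ins: `stream_call` for the existing
    streaming logic, `create_call` for responses.create(...) without stream=True.
    """
    try:
        return stream_call()
    except RuntimeError as exc:
        if is_created_order_mismatch(exc):
            return create_call()
        raise  # any other error keeps its current handling
```

Matching on the message prefix rather than the full string means the branch also covers prelude events other than codex.rate_limits, which is what the "before ..." pattern in the proposal implies.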

Alternatives Considered

No response

Feature Type

Performance / reliability

Scope

None

Contribution

  • I'd like to implement this myself and submit a PR

Debug Report (optional)

extent analysis

TL;DR

Implement a fallback branch in _run_codex_stream(...) to handle errors where codex.rate_limits is received before response.created by switching to a non-streaming responses.create(...) call.

Guidance

  • Locate _run_codex_stream(...) and add an error-handling branch for the RuntimeError stating that response.created was expected before codex.rate_limits.
  • When that error is detected, fall back to a non-streaming responses.create(...) call (without stream=True) so that OpenAI-compatible backends emitting prelude events still return a response.
  • Leave the existing fallback for a missing response.completed intact so that both edge cases are handled.
  • Add a regression test that covers the codex.rate_limits prelude case.

Example

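# Fragment meant to sit inside _run_codex_stream(...)
try:
    ...  # existing streaming logic
except RuntimeError as e:
    if "Expected to have received `response.created` before" in str(e):
        # fall back to a non-streaming responses.create call
        return responses.create(..., stream=False)
    raise  # re-raise unrelated errors instead of swallowing them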
Notes

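This solution improves compatibility with backends that emit prelude events by recovering from the strict stream event-order check rather than loosening the check itself. Because the fallback issues a second, non-streaming request, it should be tested to confirm that it does not change behavior for streams that already succeed.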

Recommendation

Apply workaround: Implement the fallback branch as described to improve compatibility with OpenAI-compatible backends that emit provider-specific prelude events before response.created.
