hermes - [Feature]: Hermes codex_responses stream fails when codex.rate_limits arrives before response.created [1 participant]

hermes · 2026-04-23 16:26:38

GitHub stats

NousResearch/hermes-agent#14634 • Fetched 2026-04-24 06:15:51

Author

w-o0

Participants

w-o0

Timeline (top)

labeled ×3

Error Message

Observed error:

No response received: Expected to have received response.created before codex.rate_limits

Core RuntimeError seen in regression testing:

Expected to have received `response.created` before `codex.rate_limits`

Error pattern to match:

Expected to have received `response.created` before ...

Problem or Use Case

When Hermes is used with provider=custom and api_mode=codex_responses against a codex-lb /v1/responses backend, the streaming path can fail if the backend emits codex.rate_limits before response.created.

Observed error:

No response received: Expected to have received response.created before codex.rate_limits

In regression testing, the core RuntimeError is:

Expected to have received `response.created` before `codex.rate_limits`

Hermes already handles one Responses streaming edge case by falling back when response.completed is missing. However, it does not currently handle another real-world compatibility case: a provider-specific prelude event arriving before response.created.

This makes the current stream event-order assumption too strict for some OpenAI-compatible backends such as codex-lb, causing the whole request to fail even though a valid final response could still be obtained through a fallback path.
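
The ordering problem described above can be reproduced in isolation. The sketch below is not Hermes or SDK code: `strict_consume` and the event dictionaries are hypothetical stand-ins, assumed only to mimic a consumer that rejects any event arriving before response.created.

```python
from typing import Iterable, Iterator


def strict_consume(events: Iterable[dict]) -> Iterator[dict]:
    """Hypothetical strict consumer: rejects events seen before `response.created`."""
    created = False
    for event in events:
        kind = event["type"]
        if kind == "response.created":
            created = True
        elif not created:
            # Same wording as the RuntimeError quoted in the issue.
            raise RuntimeError(
                f"Expected to have received `response.created` before `{kind}`"
            )
        yield event


# Event order as described in the issue: a provider-specific prelude comes first.
codex_lb_stream = [
    {"type": "codex.rate_limits"},
    {"type": "response.created"},
    {"type": "response.completed"},
]

error_message = None
try:
    list(strict_consume(codex_lb_stream))
except RuntimeError as exc:
    error_message = str(exc)

print(error_message)
```

With the prelude event removed from `codex_lb_stream`, the same consumer yields both remaining events without raising, which is why the failure only shows up against backends that emit such a prelude.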

Proposed Solution

Add a fallback branch in _run_codex_stream(...) for errors matching:

Expected to have received `response.created` before ...

Instead of aborting the conversation, Hermes should fall back to a non-streaming responses.create(...) call without stream=True.

Suggested behavior:

- keep the existing fallback for missing response.completed
- add detection for response.created / prelude ordering mismatches
- fall back to non-stream responses.create(...) for this class of stream-protocol mismatch
- add a regression test covering the codex.rate_limits prelude case

This approach is low-risk, preserves existing behavior, and improves compatibility with OpenAI-compatible backends that emit provider-specific prelude events before response.created.
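
A minimal sketch of the proposed detection and fallback, under stated assumptions: `run_with_fallback`, `is_prelude_order_error`, `fake_stream`, and `fake_create` are hypothetical names used for illustration, and the real _run_codex_stream(...) signature is not shown in this issue. Only the matched error prefix comes from the report.

```python
PRELUDE_ERROR_PREFIX = "Expected to have received `response.created` before"


def is_prelude_order_error(exc: Exception) -> bool:
    """True for the stream-protocol mismatch described in this issue."""
    return isinstance(exc, RuntimeError) and PRELUDE_ERROR_PREFIX in str(exc)


def run_with_fallback(stream_call, create_call, **kwargs):
    """Try the streaming path first; retry without streaming on a prelude mismatch."""
    try:
        return stream_call(**kwargs)
    except RuntimeError as exc:
        if not is_prelude_order_error(exc):
            raise  # unrelated errors keep their current behavior
        return create_call(**kwargs)


# Hypothetical stand-ins for the streaming and non-streaming backend calls.
def fake_stream(**kwargs):
    raise RuntimeError(
        "Expected to have received `response.created` before `codex.rate_limits`"
    )


def fake_create(**kwargs):
    return {"status": "completed", "stream": kwargs.get("stream", False)}


result = run_with_fallback(fake_stream, fake_create, model="example-model", input="hi")
print(result)
```

Matching on the message prefix rather than the full string keeps the branch valid for other prelude event names, which is what the `before ...` pattern in the proposal suggests.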

Alternatives Considered

No response

Feature Type

Performance / reliability

Scope

None

Contribution

I'd like to implement this myself and submit a PR

Debug Report (optional)

extent analysis

TL;DR

Implement a fallback branch in _run_codex_stream(...) to handle errors where codex.rate_limits is received before response.created by switching to a non-streaming responses.create(...) call.

Guidance

- Locate _run_codex_stream(...) and add an error-handling branch for the message indicating that response.created was expected before codex.rate_limits.
- On that error, fall back to a non-streaming responses.create(...) call without stream=True to preserve compatibility with OpenAI-compatible backends.
- Keep the existing fallback for missing response.completed intact so that both edge cases are handled.
- Add a regression test that covers the codex.rate_limits prelude case.

Example

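# Inside _run_codex_stream(...)
try:
    ...  # existing streaming logic
except RuntimeError as e:
    if "Expected to have received `response.created` before" in str(e):
        # Stream-protocol mismatch: fall back to a non-streaming responses.create(...) call
        return responses.create(..., stream=False)
    raise  # other errors: keep existing handling (shown here as a re-raise)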
Notes

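This solution improves compatibility with specific backends by falling back when the stream event order is unexpected, rather than by relaxing the event-order check itself. The fallback path should be tested to confirm that normal streaming and the existing fallback for missing response.completed behave as before.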

Recommendation

Apply workaround: Implement the fallback branch as described to improve compatibility with OpenAI-compatible backends that emit provider-specific prelude events before response.created.
