hermes - Expose response metadata and SSE comments to context engine plugins

GitHub stats
NousResearch/hermes-agent#15506 (fetched 2026-04-26 05:27:00)
Comments: 0, Participants: 1, Timeline events: 4 (labeled ×4), Reactions: 0


Feature request

Please expose provider response metadata to Hermes plugins/context engines, especially response headers and final SSE metadata comments.

Motivation

Some OpenAI-compatible routing proxies can switch the actual upstream model at runtime through fallback, routing, or model-combo strategies. A Hermes context-engine plugin needs to know the actual served model so it can keep the effective context window and compression threshold accurate after routing/model switching.

Concrete example: OmniRoute exposes the actual routed model as response metadata:

  • Non-streaming/streaming response header: X-OmniRoute-Model: <actual model>
  • Streaming: final SSE metadata comments (comment lines begin with a colon), such as ": x-omniroute-model=<actual model>"
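
As a rough sketch of how the streaming case could be captured, the snippet below collects SSE comment lines of the form ": key=value" into a dict. The helper name parse_sse_comments and the sample stream are assumptions made for illustration; they are not part of Hermes or OmniRoute.

```python
def parse_sse_comments(lines):
    """Collect ': key=value' SSE comment lines into a dict (hypothetical helper)."""
    metadata = {}
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith(":"):
            continue  # data:, event:, and blank lines are not comments
        key, sep, value = line[1:].strip().partition("=")
        if sep and key.strip():
            # Lower-case the key so lookups do not depend on the router's casing
            metadata[key.strip().lower()] = value.strip()
    return metadata


# Illustrative stream; a comment without '=' (such as a keep-alive) is ignored
stream = [
    'data: {"choices": []}',
    "",
    ": keep-alive",
    ": x-omniroute-model=claude-sonnet-4.5",
    "data: [DONE]",
]
sse_metadata = parse_sse_comments(stream)
```

A dict like this could then be passed along as the sse_metadata or sse_comments field proposed in this request.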

A context engine such as RouterCtx can already resolve model-specific context windows when it receives a truthful served model. However, Hermes currently appears to call the context engine with token usage only, e.g. prompt/completion/total tokens, so plugins cannot see response headers, response.model, or SSE comments.

Requested capability

Add an upgrade-safe plugin/context-engine API path that passes response metadata into context engines, for example:

usage_dict = {
    "prompt_tokens": prompt_tokens,
    "completion_tokens": completion_tokens,
    "total_tokens": total_tokens,
    "response_model": getattr(response, "model", None),
    "response_headers": dict(response_headers or {}),
    "sse_metadata": sse_metadata or {},
}
context_engine.update_from_response(usage_dict)

Or preferably a structured object/event such as:

context_engine.update_from_response({
    "usage": usage,
    "model": requested_model,
    "provider": provider,
    "response_model": response_model,
    "headers": response_headers,
    "sse_comments": parsed_sse_comments,
})

Desired behavior

  • For non-streaming responses, expose response headers and response.model if available.
  • For streaming responses, expose response headers and any final/metadata SSE comments observed during stream processing.
  • Preserve backward compatibility with existing context engines that only expect usage token fields.
  • Avoid requiring modifications to Hermes core for router-specific integrations.
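
The backward-compatibility point can be illustrated with the flat usage_dict shape from above: an engine that only reads token fields keeps working because it never looks at the extra keys. This is a minimal sketch, not Hermes code. Both engine classes, the sample values, and the lookup order (header, then SSE metadata, then response_model) are assumptions.

```python
class LegacyEngine:
    """Existing-style engine that only reads token counts."""

    def __init__(self):
        self.total_tokens = 0

    def update_from_response(self, usage_dict):
        self.total_tokens = usage_dict.get("total_tokens", 0)


class MetadataAwareEngine(LegacyEngine):
    """Hypothetical engine that also reads the proposed metadata keys."""

    def __init__(self):
        super().__init__()
        self.served_model = None

    def update_from_response(self, usage_dict):
        super().update_from_response(usage_dict)
        raw_headers = usage_dict.get("response_headers") or {}
        # HTTP header names are case-insensitive, so normalize before lookup
        headers = {str(k).lower(): v for k, v in raw_headers.items()}
        sse = usage_dict.get("sse_metadata") or {}
        # Assumed precedence: header, then SSE comment, then response.model
        self.served_model = (
            headers.get("x-omniroute-model")
            or sse.get("x-omniroute-model")
            or usage_dict.get("response_model")
        )


usage_dict = {
    "prompt_tokens": 120,
    "completion_tokens": 30,
    "total_tokens": 150,
    "response_model": None,
    "response_headers": {"X-OmniRoute-Model": "claude-sonnet-4.5"},
    "sse_metadata": {},
}
legacy, aware = LegacyEngine(), MetadataAwareEngine()
for engine in (legacy, aware):
    engine.update_from_response(usage_dict)
```

Note that the nested shape (with token counts under a "usage" key) would not be read correctly by an engine like LegacyEngine unless Hermes flattened it first, which is one argument for keeping the token fields at the top level.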

Use case

When Hermes requests gpt-5.5 through a local router, the router may fall back to claude-sonnet-4.5. If the router exposes X-OmniRoute-Model: claude-sonnet-4.5, a context-engine plugin can update the effective context window from the requested model's context to the actual served model's context.

Without this metadata handoff, plugins can only see the requested model and token counts, which makes silent server-side fallback impossible to handle reliably.
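
A plugin-side sketch of that update might look like the following. The function names, the default window, and the 80% compression ratio are hypothetical, and the context sizes in the table are placeholder numbers, not real model limits.

```python
# Placeholder context sizes for illustration only; NOT real model limits.
CONTEXT_WINDOWS = {"gpt-5.5": 100_000, "claude-sonnet-4.5": 50_000}


def effective_window(requested_model, served_model, windows, default=8_000):
    """Prefer the served model's window; fall back to the requested model."""
    model = served_model or requested_model
    return windows.get(model, default)


def compression_threshold(window, percent=80):
    """Token count at which compression starts (hypothetical 80% policy)."""
    return window * percent // 100


# Without metadata the plugin only knows the requested model
assumed = effective_window("gpt-5.5", None, CONTEXT_WINDOWS)

# With X-OmniRoute-Model passed through, it can switch to the served model
actual = effective_window("gpt-5.5", "claude-sonnet-4.5", CONTEXT_WINDOWS)
threshold = compression_threshold(actual)
```

With these placeholder numbers the plugin would otherwise plan around the larger window, so compression would trigger too late after a silent fallback.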

Related integrations

  • OmniRoute: exposes X-OmniRoute-Model and x-omniroute-model SSE metadata comments.
  • RouterCtx: context-engine plugin that can consume served-model metadata if Hermes passes it through.

Thanks!

extent analysis

TL;DR

To address the feature request, modify the Hermes plugin/context-engine API to pass response metadata, including response headers and final SSE metadata comments, to context engines.

Guidance

  • Identify the specific response metadata that needs to be exposed, such as X-OmniRoute-Model headers and x-omniroute-model SSE comments.
  • Design a structured object or event to pass this metadata to context engines, ensuring backward compatibility with existing engines.
  • Consider implementing a flexible API path that allows for easy extension or modification of the metadata passed to context engines.
  • Verify that the modified API can handle both non-streaming and streaming responses correctly.
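
One possible shape for such a structured event is sketched below, with an opt-in dispatch that leaves existing engines untouched. ResponseEvent, update_from_event, notify, and both example engines are hypothetical names chosen for illustration; only update_from_response appears in the request.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ResponseEvent:
    """Hypothetical event mirroring the structured dict in the request."""
    usage: dict
    model: str
    provider: Optional[str] = None
    response_model: Optional[str] = None
    headers: dict = field(default_factory=dict)
    sse_comments: dict = field(default_factory=dict)

    def to_legacy_usage(self):
        """Flatten to the token-only dict that existing engines expect."""
        keys = ("prompt_tokens", "completion_tokens", "total_tokens")
        return {k: self.usage.get(k, 0) for k in keys}


def notify(engine, event):
    """Send the full event to engines that opt in, else the legacy dict."""
    handler = getattr(engine, "update_from_event", None)
    if callable(handler):
        handler(event)
    else:
        engine.update_from_response(event.to_legacy_usage())


class OldEngine:
    def update_from_response(self, usage_dict):
        self.seen = usage_dict


class NewEngine:
    def update_from_event(self, event):
        self.served = event.headers.get("X-OmniRoute-Model") or event.response_model


event = ResponseEvent(
    usage={"prompt_tokens": 10, "completion_tokens": 5, "total_tokens": 15},
    model="gpt-5.5",
    headers={"X-OmniRoute-Model": "claude-sonnet-4.5"},
)
old, new = OldEngine(), NewEngine()
notify(old, event)
notify(new, event)
```

Dispatching on an optional method is only one way to keep the change upgrade-safe; passing extra keys in the existing dict is another.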

Example

context_engine.update_from_response({
    "usage": usage,
    "model": requested_model,
    "provider": provider,
    "response_model": response_model,
    "headers": response_headers,
    "sse_comments": parsed_sse_comments,
})

Notes

The proposed solution requires careful consideration of backward compatibility and the potential impact on existing context engines. Additionally, the implementation should avoid requiring modifications to the Hermes core for router-specific integrations.

Recommendation

Implement the request by extending the Hermes plugin/context-engine API to pass response metadata to context engines. This is a change to the generic API rather than a workaround, and it lets plugins handle different routing scenarios without router-specific code in Hermes core.
