hermes - 💡 (How to fix) Expose response metadata and SSE comments to context engine plugins [1 participant]

hermes · 2026-04-25 04:44:48

GitHub stats

NousResearch/hermes-agent#15506•Fetched 2026-04-26 05:27:00

Author

Marco9113

Participants

Marco9113

Timeline (top)

labeled ×4

Feature request

Please expose provider response metadata to Hermes plugins/context engines, especially response headers and final SSE metadata comments.

Motivation

Some OpenAI-compatible routing proxies can switch the actual upstream model at runtime through fallback, routing, or model-combo strategies. A Hermes context-engine plugin needs to know the actual served model so it can keep the effective context window and compression threshold accurate after routing/model switching.

Concrete example: OmniRoute exposes the actual routed model as response metadata:

Non-streaming and streaming responses: the response header X-OmniRoute-Model: <actual model>
Streaming responses only: final SSE metadata comment lines such as ": x-omniroute-model=<actual model>" (SSE comment lines begin with a colon)
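
To illustrate the streaming case, here is a minimal sketch of how comment lines could be pulled out of a raw SSE stream. It assumes the comments use the key=value form shown above; parse_sse_comments is a hypothetical helper name, not an existing Hermes or OmniRoute function.

```python
from typing import Dict, Iterable


def parse_sse_comments(lines: Iterable[str]) -> Dict[str, str]:
    """Collect key=value pairs from SSE comment lines (lines starting with ':').

    Hypothetical helper: the key=value layout is an assumption based on the
    ": x-omniroute-model=<actual model>" example in the issue.
    """
    metadata: Dict[str, str] = {}
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith(":"):
            continue  # data:, event:, id:, retry: and blank lines are not comments
        comment = line[1:].strip()
        key, sep, value = comment.partition("=")
        if sep and key.strip():
            # Lowercase keys so lookups do not depend on the router's casing.
            metadata[key.strip().lower()] = value.strip()
    return metadata


stream = [
    'data: {"choices": []}',
    "",
    ": keep-alive",  # a comment without key=value is ignored
    ": x-omniroute-model=claude-sonnet-4.5",
    "data: [DONE]",
]
print(parse_sse_comments(stream))
```

Comments that are plain keep-alives are skipped, so only metadata-style comments reach the plugin.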

A context engine such as RouterCtx can already resolve model-specific context windows when it receives a truthful served model. However, Hermes currently appears to call the context engine with token usage only, e.g. prompt/completion/total tokens, so plugins cannot see response headers, response.model, or SSE comments.

Requested capability

Add an upgrade-safe plugin/context-engine API path that passes response metadata into context engines, for example:

usage_dict = {
    "prompt_tokens": prompt_tokens,
    "completion_tokens": completion_tokens,
    "total_tokens": total_tokens,
    "response_model": getattr(response, "model", None),
    "response_headers": dict(response_headers or {}),
    "sse_metadata": sse_metadata or {},
}
context_engine.update_from_response(usage_dict)

Or preferably a structured object/event such as:

context_engine.update_from_response({
    "usage": usage,
    "model": requested_model,
    "provider": provider,
    "response_model": response_model,
    "headers": response_headers,
    "sse_comments": parsed_sse_comments,
})

Desired behavior

For non-streaming responses, expose response headers and response.model if available.
For streaming responses, expose response headers and any final/metadata SSE comments observed during stream processing.
Preserve backward compatibility with existing context engines that only expect usage token fields.
Avoid requiring modifications to Hermes core for router-specific integrations.
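
One way the backward-compatibility requirement might be met is to keep the existing token fields at the top level and add the new metadata as extra keys, so engines that only read token counts are unaffected. The sketch below assumes the flat usage_dict shape from the request; build_usage_payload, LegacyEngine, and MetadataAwareEngine are hypothetical names used only for illustration.

```python
from typing import Any, Dict, Mapping, Optional


def build_usage_payload(
    prompt_tokens: int,
    completion_tokens: int,
    response_model: Optional[str] = None,
    response_headers: Optional[Mapping[str, str]] = None,
    sse_metadata: Optional[Mapping[str, str]] = None,
) -> Dict[str, Any]:
    """Hypothetical Hermes-side builder: legacy token keys plus new metadata keys."""
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
        "response_model": response_model,
        # HTTP header names are case-insensitive, so normalise them once here.
        "response_headers": {k.lower(): v for k, v in (response_headers or {}).items()},
        "sse_metadata": dict(sse_metadata or {}),
    }


class LegacyEngine:
    """Stands in for an existing engine that only reads token counts."""

    def __init__(self) -> None:
        self.last_total = 0

    def update_from_response(self, usage: Dict[str, Any]) -> None:
        self.last_total = usage["total_tokens"]


class MetadataAwareEngine(LegacyEngine):
    """Stands in for an engine that also resolves the served model."""

    def __init__(self) -> None:
        super().__init__()
        self.served_model: Optional[str] = None

    def update_from_response(self, usage: Dict[str, Any]) -> None:
        super().update_from_response(usage)
        headers = usage.get("response_headers", {})
        sse = usage.get("sse_metadata", {})
        # Assumed precedence: header, then SSE comment, then response.model.
        self.served_model = (
            headers.get("x-omniroute-model")
            or sse.get("x-omniroute-model")
            or usage.get("response_model")
        )


payload = build_usage_payload(
    1200, 300,
    response_model="gpt-5.5",
    response_headers={"X-OmniRoute-Model": "claude-sonnet-4.5"},
)
legacy, aware = LegacyEngine(), MetadataAwareEngine()
for engine in (legacy, aware):
    engine.update_from_response(payload)
print(legacy.last_total, aware.served_model)
```

Because the router-specific key lookup lives in the plugin, the Hermes side only forwards what it observed.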

Use case

When Hermes requests gpt-5.5 through a local router, the router may fall back to claude-sonnet-4.5. If the router exposes X-OmniRoute-Model: claude-sonnet-4.5, a context-engine plugin can update the effective context window from the requested model's context to the actual served model's context.

Without this metadata handoff, plugins can only see the requested model and token counts, which makes silent server-side fallback impossible to handle reliably.
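
The effect on the context budget can be sketched as follows. This is an illustration under stated assumptions, not RouterCtx's actual logic: ContextBudget is a hypothetical class, the 80% compression threshold is an assumed default, and the window sizes are placeholders rather than real model limits.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Placeholder window sizes for illustration only; NOT real model limits.
ILLUSTRATIVE_WINDOWS: Dict[str, int] = {
    "gpt-5.5": 400_000,
    "claude-sonnet-4.5": 200_000,
}


@dataclass
class ContextBudget:
    """Hypothetical budget tracker keyed on the model that actually served."""

    requested_model: str
    windows: Dict[str, int]
    compression_percent: int = 80  # assumed threshold, as a percent of the window
    served_model: Optional[str] = None

    def effective_model(self) -> str:
        # Prefer the served model when its window is known; otherwise fall
        # back to the requested model.
        if self.served_model in self.windows:
            return self.served_model
        return self.requested_model

    def context_window(self) -> int:
        return self.windows[self.effective_model()]

    def compression_threshold(self) -> int:
        # Integer arithmetic avoids float rounding at the boundary.
        return self.context_window() * self.compression_percent // 100

    def should_compress(self, prompt_tokens: int) -> bool:
        return prompt_tokens >= self.compression_threshold()


budget = ContextBudget("gpt-5.5", ILLUSTRATIVE_WINDOWS)
before = budget.should_compress(250_000)  # judged against the requested model

# The router silently fell back; the plugin learned this from response metadata.
budget.served_model = "claude-sonnet-4.5"
after = budget.should_compress(250_000)  # judged against the served model
print(before, after)
```

With these placeholder numbers the same prompt size is under the threshold for the requested model but over it for the served one, which is the mismatch the metadata handoff is meant to catch.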

Related integrations

OmniRoute: exposes X-OmniRoute-Model and x-omniroute-model SSE metadata comments.
RouterCtx: context-engine plugin that can consume served-model metadata if Hermes passes it through.

Thanks!

extent analysis

TL;DR

To address the feature request, modify the Hermes plugin/context-engine API to pass response metadata, including response headers and final SSE metadata comments, to context engines.

Guidance

Identify the specific response metadata that needs to be exposed, such as X-OmniRoute-Model headers and x-omniroute-model SSE comments.
Design a structured object or event to pass this metadata to context engines, ensuring backward compatibility with existing engines.
Consider implementing a flexible API path that allows for easy extension or modification of the metadata passed to context engines.
Verify that the modified API can handle both non-streaming and streaming responses correctly.
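
As a rough sketch of the structured-event option, Hermes could offer an opt-in hook and fall back to the existing call for engines that do not define it. The names ResponseEvent, notify_engine, and on_response_event are hypothetical; only update_from_response and the field names come from the request.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class ResponseEvent:
    """Hypothetical structured event mirroring the fields proposed in the issue."""

    usage: Dict[str, int]
    model: str  # requested model
    provider: str
    response_model: Optional[str] = None
    headers: Dict[str, str] = field(default_factory=dict)
    sse_comments: List[str] = field(default_factory=list)


def notify_engine(engine: Any, event: ResponseEvent) -> None:
    """Call the opt-in hook if the engine defines it, else the legacy method."""
    hook = getattr(engine, "on_response_event", None)  # hypothetical hook name
    if callable(hook):
        hook(event)
    else:
        # Legacy engines keep receiving only the token-usage dict.
        engine.update_from_response(dict(event.usage))


class OldEngine:
    def __init__(self) -> None:
        self.usage: Dict[str, int] = {}

    def update_from_response(self, usage: Dict[str, int]) -> None:
        self.usage = usage


class NewEngine(OldEngine):
    def __init__(self) -> None:
        super().__init__()
        self.last_event: Optional[ResponseEvent] = None

    def on_response_event(self, event: ResponseEvent) -> None:
        self.last_event = event
        self.update_from_response(dict(event.usage))


event = ResponseEvent(
    usage={"prompt_tokens": 1200, "completion_tokens": 300, "total_tokens": 1500},
    model="gpt-5.5",
    provider="omniroute",
    headers={"x-omniroute-model": "claude-sonnet-4.5"},
    sse_comments=["x-omniroute-model=claude-sonnet-4.5"],
)
old, new = OldEngine(), NewEngine()
notify_engine(old, event)
notify_engine(new, event)
print(old.usage["total_tokens"], new.last_event.headers["x-omniroute-model"])
```

The same dispatch path serves streaming and non-streaming responses; only the contents of headers and sse_comments differ.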

Example

context_engine.update_from_response({
    "usage": usage,
    "model": requested_model,
    "provider": provider,
    "response_model": response_model,
    "headers": response_headers,
    "sse_comments": parsed_sse_comments,
})

Notes

The proposed solution requires careful consideration of backward compatibility and the potential impact on existing context engines. Additionally, the implementation should avoid requiring modifications to the Hermes core for router-specific integrations.

Recommendation

Extend the Hermes plugin/context-engine API once, in a router-agnostic way, so that it passes response metadata (response headers, response.model, and SSE comments) to context engines. Router-specific handling, such as reading X-OmniRoute-Model, can then live entirely in plugins.
