hermes - Fix api_server: reasoning.available event carries final response text instead of reasoning_content



Bug: reasoning.available event carries the final assistant text, not reasoning content

Symptom

When consuming the structured event stream via gateway/platforms/api_server.py (/v1/runs/{run_id}/events), each run emits a reasoning.available event followed by run.completed — and the text field on reasoning.available is identical to the output field on run.completed. This causes external UIs (e.g. a "thinking" pane separate from the "response" bubble) to display the final assistant text twice — once labeled as reasoning, once as the actual reply.

Reproduction

  1. Run hermes with api_server enabled (e.g. 127.0.0.1:8642).
  2. Start a run:
    curl -sS -X POST http://127.0.0.1:8642/v1/runs \
      -H 'content-type: application/json' \
      -d '{"input":"say hi"}'
  3. Consume the SSE stream from the returned run_id:
    curl -sS -N http://127.0.0.1:8642/v1/runs/<run_id>/events
  4. Observe that the text payload on reasoning.available equals the output payload on run.completed.
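The check in step 4 can be scripted. The sketch below is a minimal illustration, not part of hermes: it parses a captured SSE stream (for example, the saved output of the curl command in step 3) and reports whether a reasoning.available text matches the run.completed output. It assumes each SSE data line carries a JSON object with an "event" key plus "text" or "output"; the issue names those fields but does not show the wire format, so adjust the keys to what your stream actually contains. The function names are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per SSE message (messages end at a blank line)."""
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            # The SSE format drops a single optional space after the colon.
            value = line[5:]
            data_parts.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data_parts:
            yield json.loads("\n".join(data_parts))
            data_parts = []
    if data_parts:  # stream ended without a trailing blank line
        yield json.loads("\n".join(data_parts))


def reasoning_duplicates_output(events: Iterable[dict]) -> bool:
    """Return True when some reasoning.available text equals the run.completed output.

    ASSUMPTION: the "event", "text" and "output" keys are illustrative names.
    """
    reasoning_texts: list[str] = []
    output = None
    for ev in events:
        if ev.get("event") == "reasoning.available":
            reasoning_texts.append((ev.get("text") or "").strip())
        elif ev.get("event") == "run.completed":
            output = (ev.get("output") or "").strip()
    return bool(output) and output in reasoning_texts


if __name__ == "__main__":
    captured = [
        'data: {"event": "reasoning.available", "text": "Hi there!"}\n',
        "\n",
        'data: {"event": "run.completed", "output": "Hi there!"}\n',
        "\n",
    ]
    print(reasoning_duplicates_output(iter_sse_json(captured)))
```

With a captured stream saved to a file, passing the open file object to iter_sse_json works the same way, since it only needs an iterable of lines.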

Mechanism (verified)

run_agent.py lines 14388–14406 emit the reasoning.available event by reading assistant_message.content (the final assistant text), stripping XML reasoning tags, and passing it to tool_progress_callback("reasoning.available", "_thinking", _think_text[:500], None).

The variable is even named _think_text but its source is .content, not .reasoning_content. Meanwhile, the actual reasoning content lives in a separate field that's already extracted elsewhere (e.g. line 9805: raw_reasoning_content = getattr(assistant_message, "reasoning_content", None)).

So the event payload has the wrong source. Two distinct fields exist on the message — the bug is that this emission path pulls from the wrong one.

Expected behavior

reasoning.available should fire from reasoning_content (the actual model reasoning block, when present), not from content (the visible assistant reply). For models that don't emit separate reasoning_content, the event should either not fire or fire with empty text — never with the final reply text.
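The expected behavior can be pinned down as a small pure function with three cases. This is a sketch under assumptions, not the hermes implementation: reasoning_event_text is a hypothetical helper, the messages are stand-in objects built with SimpleNamespace, and it takes the "do not fire" option (returning None) for models without reasoning_content. The 500-character cap mirrors the [:500] slice in the issue's snippet.

```python
from types import SimpleNamespace
from typing import Optional

MAX_REASONING_CHARS = 500  # mirrors the [:500] slice shown in the issue


def reasoning_event_text(assistant_message) -> Optional[str]:
    """Return the reasoning.available payload, or None when the event should not fire.

    Hypothetical helper: it reads reasoning_content only and never falls back
    to content, so the visible reply cannot leak into the event.
    """
    raw = getattr(assistant_message, "reasoning_content", None)
    if not isinstance(raw, str):
        return None
    text = raw.strip()
    return text[:MAX_REASONING_CHARS] if text else None


# Stand-in messages covering the three cases described above.
with_reasoning = SimpleNamespace(content="Hi!", reasoning_content="  The user greeted me.  ")
without_field = SimpleNamespace(content="Hi!")
blank_reasoning = SimpleNamespace(content="Hi!", reasoning_content="   ")

for message in (with_reasoning, without_field, blank_reasoning):
    print(repr(reasoning_event_text(message)))
```

Returning None rather than an empty string lets the caller skip the callback entirely, which keeps consumers from rendering an empty thinking pane.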

Impact

Any external consumer of /v1/runs/{run_id}/events that renders reasoning.available and run.completed in distinct UI surfaces will show duplicate content. The richer the streaming UI, the more visible the bug. In chat-style integrations (the janus-ui v17 chat surface is where this was found), the user sees the same response in both the thinking pane and the message bubble.

Proposed fix

In run_agent.py around line 14388, gate the reasoning.available emission on reasoning_content (the structured field), not content. Specifically:

# WRONG (current):
if (assistant_message.content and self.tool_progress_callback):
    _think_text = assistant_message.content.strip()
    ...
    elif _think_text:
        self.tool_progress_callback("reasoning.available", "_thinking", _think_text[:500], None)

# RIGHT (proposed):
reasoning_text = getattr(assistant_message, "reasoning_content", None) or ""
if reasoning_text.strip() and self.tool_progress_callback:
    self.tool_progress_callback("reasoning.available", "_thinking", reasoning_text.strip()[:500], None)

The subagent-delegation path (lines 14399–14403, which relays the first line of _think_text to the parent display) is a separate feature and may need its own treatment depending on whether _thinking (without the reasoning. prefix) is also being misrouted, but that's a follow-on diagnosis.

Workaround for downstream consumers

Filter out reasoning.available events whose text equals the eventual run.completed.output. This is symptom-suppression, not a fix, but unblocks UIs in the interim.
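A consumer-side filter along those lines might look like the sketch below. It is an illustration under assumptions: events are already decoded into dicts with "event", "text" and "output" keys (illustrative names), and the function names are hypothetical. Because reasoning.available arrives before run.completed, the consumer has to buffer reasoning events until the run finishes. The snippet in the proposed fix also shows the payload sliced to 500 characters, so a long reply would reach the consumer truncated and fail a strict equality test; the sketch adds a prefix comparison for that case as a heuristic.

```python
from typing import Iterable

TRUNCATION_LIMIT = 500  # the issue's snippet slices the payload with [:500]


def is_echo(reasoning_text: str, final_output: str) -> bool:
    """Heuristically decide whether a reasoning payload is just the final reply."""
    text = reasoning_text.strip()
    output = final_output.strip()
    if not text or not output:
        return False
    if text == output:
        return True
    # A long reply arrives truncated, so compare it as a prefix instead.
    return len(reasoning_text) >= TRUNCATION_LIMIT and output.startswith(text)


def drop_echoed_reasoning(events: Iterable[dict]) -> list[dict]:
    """Remove reasoning.available events that merely repeat run.completed output.

    ASSUMPTION: the "event", "text" and "output" keys are illustrative names.
    """
    buffered = list(events)
    final_output = next(
        (ev.get("output") or "" for ev in buffered if ev.get("event") == "run.completed"),
        "",
    )
    return [
        ev
        for ev in buffered
        if not (
            ev.get("event") == "reasoning.available"
            and is_echo(ev.get("text") or "", final_output)
        )
    ]


if __name__ == "__main__":
    run_events = [
        {"event": "reasoning.available", "text": "Hi there!"},
        {"event": "run.completed", "output": "Hi there!"},
    ]
    print(drop_echoed_reasoning(run_events))
```

The filter leaves genuine reasoning untouched as long as it differs from the reply, but it cannot distinguish a model whose reasoning happens to match its answer word for word, which is one more reason to treat it as a stopgap.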
