hermes - 💡(How to fix) Fix API server /v1/responses drops agent reasoning from output_items

Problem

The api_server platform's /v1/responses endpoint (both SSE streaming and batch paths) emits function_call, function_call_output, and message output items, but never emits the agent's reasoning as a reasoning output_item — even though run_agent.py already produces reasoning text and fires reasoning_callback during streaming.

This breaks downstream consumers that persist output[] from the Responses API and expect the OpenAI Responses spec shape {type:'reasoning', summary, content}. CLI/Telegram/Slack get a preview via _emit_reasoning_preview, but Responses-API clients lose the chain-of-thought entirely.

Reproduction

  1. Run hermes-agent exposing /v1/responses.
  2. Use a model/profile that emits reasoning (e.g. a thinking-enabled OpenAI/Anthropic model).
  3. POST to /v1/responses with stream:true (or stream:false).
  4. Inspect response.completed.output[] (or the GET /v1/responses/{id} snapshot).

Observed: output[] contains only function_call, function_call_output, message. No reasoning item.

Expected: A reasoning item (per OpenAI Responses spec) carrying summary[].summary_text and content[].reasoning_text, ordered before the final message.
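
To make the expected shape concrete, here is a minimal sketch of a reasoning item and a check a client could run against output[]. The field layout follows the description above (summary[] entries of type summary_text, content[] entries of type reasoning_text). The id value, the sample text, and the helper names find_reasoning_items and reasoning_precedes_message are illustrative assumptions, not part of hermes-agent.

```python
from typing import Any

# Illustrative reasoning item, shaped as described above.
# The "id" value and the text are placeholder assumptions.
EXPECTED_REASONING_ITEM: dict[str, Any] = {
    "type": "reasoning",
    "id": "rs_example",
    "summary": [{"type": "summary_text", "text": "Checked the config first."}],
    "content": [{"type": "reasoning_text", "text": "The user asked about X, so ..."}],
}


def find_reasoning_items(output: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return every item in output[] whose type is 'reasoning'."""
    return [item for item in output if item.get("type") == "reasoning"]


def reasoning_precedes_message(output: list[dict[str, Any]]) -> bool:
    """True if a reasoning item appears before the last message item."""
    types = [item.get("type") for item in output]
    if "reasoning" not in types or "message" not in types:
        return False
    last_message = len(types) - 1 - types[::-1].index("message")
    return types.index("reasoning") < last_message
```

Running find_reasoning_items on the output[] from step 4 currently returns an empty list. With the fix it should return at least one item, and reasoning_precedes_message should hold.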

Root cause

In gateway/platforms/api_server.py:

  • _create_agent (api_server.py:830) builds AIAgent without passing reasoning_callback.
  • _dispatch (api_server.py:1695) only routes __tool_started__ / __tool_completed__ tags and plain text deltas; it has no __reasoning__ branch.
  • _extract_output_items (api_server.py:2563, the batch path) only emits function_call / function_call_output / message.

AIAgent._fire_reasoning_delta exists and is invoked during streaming in run_agent.py, but its target callback is never installed in the api_server platform.
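
The failure mode can be illustrated with a toy model. ToyAgent below is a hypothetical stand-in, not the real AIAgent. It assumes only what is stated above: reasoning deltas are delivered through an optional callback, and they are dropped when no callback is installed.

```python
from typing import Callable, Optional


class ToyAgent:
    """Hypothetical stand-in for AIAgent; only the callback wiring is modeled."""

    def __init__(self, reasoning_callback: Optional[Callable[[str], None]] = None):
        self.reasoning_callback = reasoning_callback

    def _fire_reasoning_delta(self, text: str) -> None:
        # With no callback installed, the delta has nowhere to go.
        if self.reasoning_callback is not None:
            self.reasoning_callback(text)

    def run(self, deltas: list[str]) -> None:
        for delta in deltas:
            self._fire_reasoning_delta(delta)


# A platform that forgets the callback: reasoning is silently lost.
received: list[str] = []
ToyAgent().run(["think ", "harder"])
lost = list(received)  # snapshot taken before any callback is installed

# A platform that installs it: every delta arrives.
ToyAgent(reasoning_callback=received.append).run(["think ", "harder"])
```

In this sketch the first run leaves lost empty, which mirrors what the api_server platform does today. The second run corresponds to the platforms that do pass a callback.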

Proposed fix

Wire reasoning_callback end-to-end:

  1. _create_agent / _run_agent accept a reasoning_callback kwarg and forward it to AIAgent.
  2. _handle_responses installs an _on_reasoning callback that pushes ("__reasoning__", text) onto the SSE queue.
  3. _dispatch buffers reasoning deltas in reasoning_parts.
  4. A new _flush_reasoning() helper emits a single response.output_item.added/done pair (type:'reasoning', with summary + content text parts) before the assistant message item closes, preserving the canonical order function_calls → reasoning → message.
  5. The incomplete-snapshot path (early disconnect) also appends the buffered reasoning, so GET /v1/responses/{id} still surfaces the thinking.
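
Steps 2 to 4 can be sketched as a small, self-contained simulation. Everything here is a simplified assumption: the queue is a plain queue.Queue, events are dicts rather than SSE frames, the message item is reduced to a single text field, and the names dispatch and flush_reasoning only mirror the helpers named above, not their real signatures. The summary reuses the content text purely for illustration.

```python
import queue
from typing import Any

REASONING_TAG = "__reasoning__"  # tag named in the proposal
DONE = object()                  # hypothetical end-of-stream sentinel


def flush_reasoning(parts: list[str], events: list[dict[str, Any]]) -> None:
    """Emit one added/done pair for the buffered reasoning, then clear it."""
    if not parts:
        return
    text = "".join(parts)
    item = {
        "type": "reasoning",
        "summary": [{"type": "summary_text", "text": text}],
        "content": [{"type": "reasoning_text", "text": text}],
    }
    events.append({"type": "response.output_item.added", "item": item})
    events.append({"type": "response.output_item.done", "item": item})
    parts.clear()


def dispatch(q: "queue.Queue[Any]") -> list[dict[str, Any]]:
    """Drain the queue, buffering reasoning and emitting it before the message."""
    events: list[dict[str, Any]] = []
    reasoning_parts: list[str] = []
    message_parts: list[str] = []
    while (entry := q.get()) is not DONE:
        if isinstance(entry, tuple) and entry[0] == REASONING_TAG:
            reasoning_parts.append(entry[1])   # step 3: buffer deltas
        else:
            message_parts.append(entry)        # plain text delta
    flush_reasoning(reasoning_parts, events)   # step 4: reasoning first
    # Simplified message item; a real one carries structured content parts.
    message = {"type": "message", "text": "".join(message_parts)}
    events.append({"type": "response.output_item.done", "item": message})
    return events


q: "queue.Queue[Any]" = queue.Queue()
on_reasoning = lambda text: q.put((REASONING_TAG, text))  # step 2
on_reasoning("compare ")
q.put("Answer: ")
on_reasoning("options")
q.put("42")
q.put(DONE)
events = dispatch(q)
```

Reasoning deltas that arrive interleaved with text deltas still end up in a single reasoning item placed before the message. Step 5 is not modeled here; under the same assumptions it would amount to calling flush_reasoning on the disconnect path before the snapshot is stored.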

Happy to send a PR (already have a working patch + 2 new tests in tests/gateway/test_api_server.py, 140 passed).
