hermes - Fix SSE streaming: response.completed never sent due to race condition with agent_task.done()



Bug

When streaming via /v1/responses or /v1/chat/completions, the SSE loop detects completion by checking agent_task.done() whenever a queue read times out with the queue empty. Because _on_delta filters out None (the agent's native EOS signal), no end-of-stream marker ever reaches the queue, so agent_task.done() is the loop's only exit. That check can race with the queue-timeout check; when it does, the loop keeps sending keepalive messages until the client disconnects.

Reproduction

  1. Start the gateway
  2. curl -sN http://127.0.0.1:8642/v1/responses -H "Content-Type: application/json" -d '{"input":"say hello","stream":true}'
  3. Observe that response.completed is sometimes never sent; the stream just sends : keepalive every 30 seconds
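
Because the failure is intermittent, it can help to check each run mechanically rather than by eye. The following is a minimal sketch of such a check, not part of the gateway: saw_completed is a hypothetical helper, and it assumes the completion shows up either as an SSE "event: response.completed" field or as a JSON "data:" payload whose "type" is "response.completed". Lines starting with ":" (such as ": keepalive") are SSE comments and are skipped.

```python
import json
import sys


def saw_completed(lines):
    """Return True if the SSE transcript contains a response.completed event.

    Hypothetical helper; the exact event framing used by the gateway is an
    assumption (event field or JSON data payload with a matching "type").
    """
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line or line.startswith(":"):
            continue  # blank separator or SSE comment such as ": keepalive"
        field, _, value = line.partition(":")
        value = value.lstrip(" ")
        if field == "event" and value == "response.completed":
            return True
        if field == "data":
            try:
                payload = json.loads(value)
            except ValueError:
                continue  # non-JSON data line; ignore it
            if isinstance(payload, dict) and payload.get("type") == "response.completed":
                return True
    return False


if __name__ == "__main__":
    # Exit status 0 if the stream completed, 1 if it ended without completion.
    sys.exit(0 if saw_completed(sys.stdin) else 1)
```

One way to use it is to pipe the curl command from step 2 into the script. Since a stuck stream never closes on its own, adding curl's --max-time option bounds each attempt so the script can report a missing completion.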

Fix

In gateway/platforms/api_server.py, after creating agent_task, add a done callback that pushes a None sentinel to the queue:

agent_task = asyncio.ensure_future(self._run_agent(...))
agent_task.add_done_callback(lambda _: _stream_q.put(None))

Apply at both streaming paths:

  • /v1/responses path (~line 2188)
  • /v1/chat/completions path (~line 1160)

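The None sentinel is safe for two reasons. First, _on_delta filters None from agent callbacks, so the only None that can reach the queue is the one pushed by the done callback. Second, the done callback fires only after the task completes, by which point all agent callbacks have returned, so the sentinel is always the last item queued.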
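
The sentinel pattern described in this fix can be tried in isolation. The sketch below is a standalone illustration, not the code in api_server.py: run_agent, on_delta, and stream_events are hypothetical stand-ins, and it assumes the stream queue is a thread-safe queue.Queue, whose put() is an ordinary synchronous call. If the queue were an asyncio.Queue instead, put() would return a coroutine and the callback would need put_nowait(None).

```python
import asyncio
import queue


async def run_agent(on_delta):
    # Hypothetical stand-in for self._run_agent(...): emits two deltas,
    # then its native end-of-stream signal (None).
    for chunk in ("hel", "lo"):
        on_delta(chunk)
        await asyncio.sleep(0)
    on_delta(None)


async def stream_events():
    stream_q = queue.Queue()  # assumption: thread-safe queue, synchronous put()

    def on_delta(delta):
        # Mirrors the filtering described above: None from the agent is dropped.
        if delta is not None:
            stream_q.put(delta)

    agent_task = asyncio.ensure_future(run_agent(on_delta))
    # The fix: once the task is done, push a sentinel so the loop can exit
    # from the queue read itself instead of polling agent_task.done().
    agent_task.add_done_callback(lambda _: stream_q.put(None))

    loop = asyncio.get_running_loop()
    events = []
    while True:
        try:
            # Blocking get runs in a worker thread so the event loop stays free.
            item = await loop.run_in_executor(
                None, lambda: stream_q.get(timeout=0.5)
            )
        except queue.Empty:
            events.append(": keepalive")
            continue
        if item is None:
            events.append("response.completed")
            break
        events.append(item)
    return events


if __name__ == "__main__":
    print(asyncio.run(stream_events()))
```

With the done callback in place, the loop ends as soon as the sentinel is read. The timeout branch is then only needed for keepalives, not for detecting completion.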
