hermes - ✅ (Solved) Fix: pending response lost when session split occurs at response boundary [2 pull requests, 1 comment, 2 participants]

GitHub stats

NousResearch/hermes-agent#14238 (fetched 2026-04-23 07:45:57)

  • Comments: 1
  • Participants: 2
  • Timeline: 7
  • Reactions: 0
  • Timeline (top): labeled ×4, cross-referenced ×2, commented ×1

Root Cause

When Session split detected fires, the gateway creates a new child session lineage. If the agent's run_conversation() returns its final response after the split event has swapped the active session reference but before the platform adapter's send call executes, the send logic appears to target the wrong (or already-closed) session context, causing the delivery to be silently dropped.

This is likely a race condition in the session handoff path during compression.

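Fix / Workaround

Ensure the platform adapter holds a reference to the target chat/channel ID independently of the active session object, so that delivery still targets the correct platform destination even if a session split occurs mid-response.

Alternatively, delay the session split until after the current turn's response has been fully dispatched.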

PR fix notes

PR #3: fix: preserve tool-turn context in chat/completions (#14270)

Description (problem / solution / changelog)

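Summary

Fixed the issue where tool-call context was lost on the following turn with Open WebUI-style clients (#14270), after reproducing it with tests on the chat/completions path.

Cause

  • The /v1/chat/completions parser included only user/assistant messages in conversation_history and discarded the tool role.
  • It also unconditionally treated the last message as user_message, so input extraction was unreliable when some clients sent trailing tool/assistant turns.
  • When an assistant tool-call turn arrived as content: "" + tool_calls, the hint that a tool had been called in the previous turn was lost entirely.

Changes

  1. Include tool role messages in conversation_history
  2. When an assistant turn has empty content + tool_calls, preserve a tool-name-based marker in the history
    • Example: [assistant issued tool calls: read_file]
  3. Extract user_message from the "last valid user turn" rather than the "last message"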
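
The three changes listed above can be illustrated with a minimal sketch. This is not the code from the PR: the function name split_history_and_user_message is hypothetical, message content is assumed to be a plain string, and the choice to keep trailing tool messages in the history is an assumption made for illustration.

```python
from typing import Any


def split_history_and_user_message(
    messages: list[dict[str, Any]],
) -> tuple[list[dict[str, str]], str]:
    """Hypothetical helper: return (conversation_history, user_message)."""
    # Change 3: pick the last user turn with non-empty text,
    # instead of blindly taking the last message.
    last_user_idx = None
    for i in range(len(messages) - 1, -1, -1):
        content = messages[i].get("content")
        if (
            messages[i].get("role") == "user"
            and isinstance(content, str)
            and content.strip()
        ):
            last_user_idx = i
            break
    if last_user_idx is None:
        raise ValueError("no usable user message found")

    history: list[dict[str, str]] = []
    for i, msg in enumerate(messages):
        if i == last_user_idx:
            continue
        role = msg.get("role")
        content = msg.get("content")
        if not isinstance(content, str):
            content = ""
        # Change 2: an assistant turn with empty content but tool_calls
        # is kept as a marker naming the tools that were called.
        if role == "assistant" and not content and msg.get("tool_calls"):
            names = [
                call.get("function", {}).get("name", "unknown")
                for call in msg["tool_calls"]
            ]
            content = f"[assistant issued tool calls: {', '.join(names)}]"
        # Change 1: tool messages are kept alongside user/assistant ones.
        if role in ("user", "assistant", "tool") and content:
            history.append({"role": role, "content": content})
    return history, messages[last_user_idx]["content"]
```

With a request whose last entry is a tool message, this sketch still returns the most recent user text as user_message, and the tool output stays in the history.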

Tests

  • test_tool_messages_and_assistant_tool_calls_are_preserved
  • test_last_user_message_selected_when_trailing_tool_message_exists
  • Includes the existing test_conversation_history_passed regression test

Run:

  • scripts/run_tests.sh tests/gateway/test_api_server.py::TestChatCompletionsEndpoint

Result:

  • 17 passed

Related issues

  • Closes #14270
  • #14238 and #14210 will follow in separate patches (they are race-condition / FD-leak issues and larger in scope)

Changed files

  • agent/error_classifier.py (modified, +24/-13)
  • gateway/platforms/api_server.py (modified, +37/-5)
  • run_agent.py (modified, +1/-1)
  • tests/agent/test_error_classifier.py (modified, +23/-0)
  • tests/gateway/test_api_server.py (modified, +69/-0)
  • tests/run_agent/test_run_agent.py (modified, +20/-0)
  • tests/tools/test_registry.py (modified, +24/-0)
  • tests/tools/test_todo_tool.py (modified, +23/-0)
  • tools/registry.py (modified, +43/-1)
  • tools/todo_tool.py (modified, +16/-2)

PR #14391: fix(gateway): do not drop final reply after session split

Description (problem / solution / changelog)

Fixes #14238.

Root cause

Gateway delivery suppression treated response_previewed=True as proof that the final reply had reached the user. During compression/session handoff, unrelated interim commentary could be delivered while the actual final response still needed the normal platform send path. That left the response persisted in the child session but not sent to chat.

Fix

  • Track exact commentary text delivered by GatewayStreamConsumer.
  • Only suppress the normal final send when the stream consumer confirms final delivery, or when the exact final response text was already delivered as a preview.
  • Apply the same guard before processing queued follow-up messages so the first response is not skipped at a session boundary.
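
The guard described in the bullets above might look roughly like the following sketch. The names StreamDeliveryState, record_commentary, and should_suppress_final_send are hypothetical and do not come from gateway/run.py or gateway/stream_consumer.py; only the response_previewed flag and the decision rule are taken from the PR description.

```python
from dataclasses import dataclass, field


@dataclass
class StreamDeliveryState:
    """Hypothetical record of what the stream consumer actually sent to chat."""

    final_delivered: bool = False
    delivered_texts: set[str] = field(default_factory=set)

    def record_commentary(self, text: str) -> None:
        # Remember the exact text of each interim message that was delivered.
        self.delivered_texts.add(text.strip())


def should_suppress_final_send(
    state: StreamDeliveryState,
    final_response: str,
    response_previewed: bool,
) -> bool:
    """Return True only when the final reply has provably reached the user."""
    if state.final_delivered:
        # The stream consumer itself confirmed final delivery.
        return True
    # A preview alone is not proof: suppress only if the previewed text
    # is exactly the final response, not unrelated interim commentary.
    return bool(response_previewed and final_response.strip() in state.delivered_texts)
```

Under this rule, interim commentary delivered during a compression handoff sets response_previewed but does not match the final text, so the normal platform send still runs.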

Tests

  • python -m pytest tests/gateway/test_run_progress_topics.py::test_run_agent_previewed_final_marks_already_sent tests/gateway/test_run_progress_topics.py::test_run_agent_previewed_split_keeps_final_delivery_pending tests/gateway/test_run_progress_topics.py::test_run_agent_queued_message_does_not_treat_commentary_as_final tests/gateway/test_run_progress_topics.py::test_run_agent_streaming_does_not_enable_completed_interim_commentary tests/gateway/test_stream_consumer.py::TestInterimCommentaryMessages -q --tb=short
  • python -m pytest tests/gateway/test_run_progress_topics.py tests/gateway/test_stream_consumer.py -q --tb=short
  • git diff --check

Note: the current full-suite baseline on main is covered by #13352, which is green and mergeable. If this PR shows the same unrelated full-suite failures before #13352 lands, the targeted regression coverage above is the relevant signal for this patch.

Changed files

  • gateway/run.py (modified, +34/-9)
  • gateway/stream_consumer.py (modified, +13/-0)
  • tests/gateway/test_run_progress_topics.py (modified, +34/-0)

Code Example

... Session split detected: <parent_session_id> → <child_session_id> (compression)
... response ready: platform=feishu chat=<chat_id> time=<duration>s api_calls=<n> response=<n> chars

# ❌ MISSING: "Sending response" log line
# Next log entry proceeds to unrelated routine work, no platform delivery occurred

---

... response ready: ... response=<n> chars
... gateway.platforms.base: [Feishu] Sending response (<n> chars) to ...

Bug Description

When a long-running gateway session triggers automatic context compression (Session split detected), if the split occurs at the exact moment the agent returns its final response, the response is persisted to the new child session file but never sent to the messaging platform (Feishu/Discord/Telegram/WhatsApp/etc).

This creates a confusing UX where:

  • The session transcript (JSON) shows a complete assistant response
  • The user on the platform receives nothing
  • The agent appears to have "done the work but gone silent"

Steps to Reproduce

  1. Start Hermes gateway with a platform adapter (observed on Feishu, likely affects all platforms)
  2. Have a conversation long enough to approach the compression threshold
  3. Send a message that requires significant processing (many tool calls / long generation time)
  4. If the session happens to cross the compression threshold during that turn, the bug triggers

Evidence Pattern

From agent.log in an affected session, the following log pattern is observed:

... Session split detected: <parent_session_id> → <child_session_id> (compression)
... response ready: platform=feishu chat=<chat_id> time=<duration>s api_calls=<n> response=<n> chars

# ❌ MISSING: "Sending response" log line
# Next log entry proceeds to unrelated routine work, no platform delivery occurred

The response is saved in the child session JSON file (as an assistant message), but the platform adapter never sends it.

For comparison, a normal successful flow always shows:

... response ready: ... response=<n> chars
... gateway.platforms.base: [Feishu] Sending response (<n> chars) to ...
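
To look for this pattern in an existing agent.log, a small scanner along the following lines may help. It is a heuristic sketch: the function name find_undelivered_responses is hypothetical, and it assumes the substrings "response ready:" and "Sending response" appear as in the excerpts above and that turns from different chats are not interleaved.

```python
from typing import Iterable


def find_undelivered_responses(lines: Iterable[str]) -> list[int]:
    """Return 1-based line numbers of 'response ready' entries that have no
    'Sending response' entry before the next 'response ready' (or end of log)."""
    undelivered: list[int] = []
    pending = None  # line number of a 'response ready' still awaiting a send
    for lineno, line in enumerate(lines, start=1):
        if "response ready:" in line:
            if pending is not None:
                # The previous response never got a matching send line.
                undelivered.append(pending)
            pending = lineno
        elif "Sending response" in line:
            pending = None
    if pending is not None:
        undelivered.append(pending)
    return undelivered
```

Passing an open file object for agent.log as the lines argument yields the line numbers worth inspecting; each hit should then be checked by hand against the surrounding "Session split detected" entries.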

Root Cause Analysis

When Session split detected fires, the gateway creates a new child session lineage. If the agent's run_conversation() returns its final response after the split event has swapped the active session reference but before the platform adapter's send call executes, the send logic appears to target the wrong (or already-closed) session context, causing the delivery to be silently dropped.

This is likely a race condition in the session handoff path during compression.

Expected Behavior

Every response ready event must have a corresponding platform delivery attempt. Session split/compression should not drop in-flight responses.

Actual Behavior

Responses generated at the compression boundary are persisted but not delivered.

Related Issues

  • #12029 — "sessions still leak open rows and ghost stubs across gateway/cli/cron" (proposed fix: "Make session split/compression explicitly finalize or transition the prior session")
  • #12026 — Compression trigger includes reasoning tokens, causing premature session splits

Environment

  • OS: macOS (darwin)
  • Python: 3.11.15
  • Hermes Version: latest (2026.4.x)
  • Platform: Feishu (Lark)
  • Profile: independent profile with custom gateway

Proposed Fix

Ensure the platform adapter holds a reference to the target chat/channel ID independently of the active session object, so that even if a session split occurs mid-response, the delivery still targets the correct platform destination.

Alternatively, delay the session split until after the current turn's response has been fully dispatched.
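
The first proposal can be sketched as a toy model in which the delivery target is captured before the turn runs, so a split during the turn cannot change where the reply goes. Every class and method here (DeliveryTarget, Session, Gateway, split_session, handle_turn) is hypothetical and greatly simplified; none of it reflects the actual Hermes gateway code.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class DeliveryTarget:
    """Immutable destination, independent of any session object."""
    platform: str
    chat_id: str


class Session:
    def __init__(self, session_id: str, platform: str, chat_id: str) -> None:
        self.session_id = session_id
        self.platform = platform
        self.chat_id = chat_id
        self.closed = False


class Gateway:
    def __init__(self, session: Session) -> None:
        self.active_session = session
        self.sent: list[tuple[DeliveryTarget, str]] = []

    def split_session(self, child_id: str) -> None:
        # Simulates compression: the active session reference is swapped
        # and the parent session is closed.
        parent = self.active_session
        self.active_session = Session(child_id, parent.platform, parent.chat_id)
        parent.closed = True

    def handle_turn(self, run_conversation: Callable[["Gateway"], str]) -> str:
        # Capture the destination BEFORE the turn, so a mid-turn split
        # cannot redirect or drop the delivery.
        session = self.active_session
        target = DeliveryTarget(session.platform, session.chat_id)
        response = run_conversation(self)  # may trigger split_session()
        self.sent.append((target, response))  # stand-in for the platform send
        return response
```

In this model the send depends only on the captured DeliveryTarget, so it does not matter which session object is active, or whether the parent has been closed, by the time the response is ready.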

extent analysis

TL;DR

Modify the platform adapter to hold a reference to the target chat/channel ID independently of the active session object to prevent dropped responses during session compression.

Guidance

  • Review the run_conversation() function to ensure it returns the final response after the session split event has completed, or modify the session handoff path to delay the split until after the current turn's response has been dispatched.
  • Verify the fix by checking the agent.log for the presence of the "Sending response" log line after a session split, and confirm that the response is delivered to the platform.
  • Consider implementing a temporary workaround to delay the session split until after the current turn's response has been fully dispatched, to mitigate the issue until a permanent fix is implemented.
  • Investigate related issues #12029 and #12026 to ensure that the proposed fix does not introduce new problems or interact with existing bugs in unexpected ways.

Example

No code snippet is provided, as the issue requires a high-level understanding of the session management and platform adapter logic.

Notes

The proposed fix assumes that the issue is caused by a race condition in the session handoff path during compression. However, the root cause may be more complex, and additional debugging may be necessary to fully resolve the issue.

Recommendation

Apply the proposed fix to modify the platform adapter to hold a reference to the target chat/channel ID independently of the active session object, as this approach is more likely to prevent dropped responses during session compression.
