hermes - ✅ (Solved) Gateway auto-continue note can be persisted and amplified by interrupt-triggered preflight compression [1 pull request, 1 participant]

GitHub stats
NousResearch/hermes-agent#25242 (fetched 2026-05-14 03:47:55)
Comments: 0 · Participants: 1 · Timeline events: 5 (labeled ×4, cross-referenced ×1) · Reactions: 0

Gateway interrupt + tool-tail auto-continue + preflight compression can turn a one-time recovery hint into durable session poison.

When a gateway turn is interrupted after tool output has been appended, the next user turn sees a trailing role="tool" message and the gateway prepends:

[System note: Your previous turn was interrupted before you could process the last tool result(s)...]

If that next turn immediately triggers preflight compression, the synthetic note and stale tool-tail content can be serialized/compacted into the child session. Later turns then keep seeing the old task/tool output as if it were fresh context.

Related but not identical: #23975 covers compression being interrupted and falling back to a weak marker. This issue covers auto-continue note persistence/replay plus missing consumed-state for inferred tool-tail recovery.

Fix / Workaround

This is not fixed by switching to busy_input_mode: queue; interrupt mode is a valid required mode. The bug is that interrupt recovery has no one-shot acknowledgement for inferred tool tails, and its synthetic instruction can become durable context during compression.

PR fix notes

PR #25278: fix(gateway): prevent auto-continue note from poisoning session context

Description (problem / solution / changelog)

Fix for #25242

Problem

Gateway interrupt + tool-tail auto-continue + preflight compression can turn a one-time recovery hint into durable session poison. The synthetic auto-continue note gets persisted into the transcript and can replay across session splits.

Root Cause

At gateway/run.py:15298-15314, the auto-continue note is prepended directly into the user message string, which then gets persisted to the transcript via run_conversation(). There is:

  1. No mechanism to prevent the note from being stored
  2. No ack to prevent the same stale tool tail from triggering the note again
  3. No separation between API-only context and persisted user content
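To make the third gap concrete, here is a minimal toy model, not the gateway's actual code. `ToyAgent`, its `transcript` and `sent_to_model` lists, and the abbreviated `RECOVERY_NOTE` are hypothetical stand-ins; only the `persist_user_message` keyword name comes from the PR description. It sketches how a note prepended into the message string ends up in stored history unless the persisted text is supplied separately.

```python
from typing import List, Optional

# Abbreviated placeholder, not the gateway's full note text.
RECOVERY_NOTE = "[System note: Your previous turn was interrupted...]"


class ToyAgent:
    """Hypothetical stand-in for an agent that persists user turns."""

    def __init__(self) -> None:
        self.transcript: List[dict] = []    # what is stored and later compacted
        self.sent_to_model: List[str] = []  # what the model actually sees

    def run_conversation(
        self, message: str, persist_user_message: Optional[str] = None
    ) -> None:
        self.sent_to_model.append(message)
        # Without an explicit persisted text, the full message is stored.
        stored = message if persist_user_message is None else persist_user_message
        self.transcript.append({"role": "user", "content": stored})


user_text = "what's the status?"
with_note = f"{RECOVERY_NOTE}\n\n{user_text}"

buggy = ToyAgent()
buggy.run_conversation(with_note)  # note leaks into the transcript

fixed = ToyAgent()
fixed.run_conversation(with_note, persist_user_message=user_text)
```

In the `fixed` case the model still receives the note, but only the clean user text is stored, so a later compression pass has nothing synthetic to carry forward.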

Fix (3 parts)

1. persist_user_message: Save the original user message before prepending the synthetic note. Pass it via persist_user_message to run_conversation() so only the clean message gets stored. The note is sent to the model as API-only context.

2. One-shot tool-tail ack: Compute a signature from the trailing tool batch's tool_call_ids. After delivering the recovery note once, mark the key as consumed. Subsequent user messages skip the note for that exact tool tail.

3. Fresh key per interruption: Each new tool-tail interruption gets a distinct key, so genuinely new interruptions still get their recovery note.
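Parts 2 and 3 can be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: `tool_tail_signature`, `ToolTailAck`, the SHA-256 digest, and keeping the ids in transcript order are all hypothetical choices, and the ack set here lives in memory only.

```python
import hashlib
from typing import Optional


def tool_tail_signature(history: list) -> Optional[str]:
    """Signature of the trailing run of role="tool" messages, or None."""
    ids = []
    for msg in reversed(history):
        if msg.get("role") != "tool":
            break
        call_id = msg.get("tool_call_id")
        if call_id:
            ids.append(call_id)
    if not ids:
        return None  # no tool tail, or a tail without usable ids
    ids.reverse()  # assumption: keep transcript order rather than sorting
    return hashlib.sha256("\x1f".join(ids).encode("utf-8")).hexdigest()[:16]


class ToolTailAck:
    """In-memory one-shot ack (hypothetical; a gateway would persist this)."""

    def __init__(self) -> None:
        self._consumed: set = set()

    def needs_note(self, history: list) -> bool:
        key = tool_tail_signature(history)
        return key is not None and key not in self._consumed

    def mark_delivered(self, history: list) -> None:
        key = tool_tail_signature(history)
        if key is not None:
            self._consumed.add(key)


history = [
    {"role": "assistant", "tool_calls": [{"id": "call_1"}]},
    {"role": "tool", "tool_call_id": "call_1", "content": "...tool output..."},
]
ack = ToolTailAck()
first = ack.needs_note(history)   # the first turn after the interruption
ack.mark_delivered(history)
second = ack.needs_note(history)  # the same tail on a later turn
```

A later interruption that leaves a different tool batch at the tail produces a different signature, so it still qualifies for its own recovery note.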

Tests

13 new tests in tests/gateway/test_auto_continue_note_persistence.py:

  • Tool-tail key computation (empty, non-tool, single, multiple, ordering, no-ids)
  • Ack lifecycle (first trigger, second skip, different key, full lifecycle)
  • persist_user_message logic (different messages, same messages, skills reload)

2439 existing gateway tests pass (1 pre-existing matrix failure).

Impact

  • Prevents stale tool output from being repeatedly summarized across turns
  • Prevents synthetic notes from appearing in compacted/transferred sessions
  • One-shot delivery: each tool tail gets exactly one recovery attempt

Changed files

  • gateway/run.py (modified, +50/-1)
  • tests/gateway/test_auto_continue_note_persistence.py (added, +164/-0)

Issue details (#25242)

Affected code paths

  • gateway/run.py

    • _has_fresh_tool_tail is inferred from transcript shape:

      _has_fresh_tool_tail = bool(
          agent_history
          and agent_history[-1].get("role") == "tool"
          and _interruption_is_fresh
      )
    • The recovery note is prepended directly into message.

    • agent.run_conversation(...) is called without persist_user_message=..., so the synthetic note can become persisted user content.

    • Session split handling happens after final-response handling, so interrupted/no-response compression splits are fragile.

  • run_agent.py

    • Preflight compression runs before the main tool loop checks _interrupt_requested.
    • _compress_context() can rotate self.session_id even if the turn later exits interrupted_by_user with api_calls=0.
  • gateway/session.py

    • resume_pending has durable state and cleanup.
    • Inferred tool-tail auto-continue has no equivalent "this exact tool tail was already delivered" ack.

Reproduction steps

No personal data is required. This can be reproduced with any gateway platform that supports interrupting an active turn.

  1. Configure gateway input handling to interrupt active work:

    display:
      busy_input_mode: interrupt
    compression:
      enabled: true
  2. Use a long-running gateway session whose transcript is near the preflight compression threshold.

    For a deterministic test, lower the compression threshold or use a fixture history large enough that adding one recovery turn crosses the threshold.

  3. Start a turn that calls at least one tool and then requires a follow-up model call to summarize/process the tool result.

    The important transcript tail shape is:

    [
        {"role": "assistant", "tool_calls": [{"id": "call_1", ...}]},
        {"role": "tool", "tool_call_id": "call_1", "content": "...tool output..."},
    ]
  4. While the agent is in the next model/API call after that tool output, send a new gateway message in the same session.

  5. Observe that the first turn exits with an interrupted tool tail, e.g. (anonymized log shape):

    T+00.000 Turn ended: reason=interrupted_during_api_call ... last_msg_role=tool ... response_len=65
  6. The pending/new user message is processed immediately. Because history ends in role="tool", gateway prepends the auto-continue note:

    T+00.075 conversation turn: history=N msg='[System note: Your previous turn was interrupted before you could proc...'
  7. Because the transcript is near threshold, preflight compression runs before the turn reaches the normal interrupt check:

    T+00.082 Preflight compression: ~218,403 tokens >= 217,600 threshold
    T+00.082 context compression started: messages=173
    T+178.130 context compression done: messages=173->10
    T+178.200 Turn ended: reason=interrupted_by_user api_calls=0 last_msg_role=user response_len=0
  8. Continue sending messages in the same session.
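For the deterministic variant mentioned in step 2, a fixture history could be built along these lines. This is a sketch under stated assumptions: `estimate_tokens` uses a crude 4-characters-per-token heuristic rather than the agent's real estimator, and `build_fixture_history`, the single padding message, and `demo_tool` are hypothetical.

```python
def estimate_tokens(messages: list) -> int:
    """Crude heuristic (~4 chars per token); not the agent's real estimator."""
    return sum(len(str(m.get("content") or "")) for m in messages) // 4


def build_fixture_history(threshold_tokens: int) -> list:
    """History one token under the threshold that ends in a tool tail."""
    tail = [
        {
            "role": "assistant",
            "content": None,
            "tool_calls": [
                {
                    "id": "call_1",
                    "type": "function",
                    "function": {"name": "demo_tool", "arguments": "{}"},
                }
            ],
        },
        {"role": "tool", "tool_call_id": "call_1", "content": "...tool output..."},
    ]
    # Pad so the estimate lands exactly one token below the threshold.
    target_chars = (threshold_tokens - 1) * 4
    tail_chars = sum(len(str(m.get("content") or "")) for m in tail)
    padding = "x" * max(target_chars - tail_chars, 0)
    return [{"role": "user", "content": padding}] + tail


history = build_fixture_history(1000)
next_turn = history + [
    {"role": "user", "content": "[System note: ...] new gateway message"}
]
```

With a lowered threshold, `history` stays just under it while `next_turn` crosses it, which is the shape that makes the recovery turn trigger preflight compression in step 7.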

Actual behavior

  • The synthetic auto-continue note appears in persisted/compacted user-message history.
  • The stale tool output remains semantically active and can be repeatedly summarized or obeyed.
  • The same old task/tool result can reappear across later session splits, even after the model has already responded to it.
  • Since tool-tail recovery is inferred only from transcript shape, there is no durable marker saying "this tool tail was already handed back to the model".

Expected behavior

  • A trailing tool result may trigger at most one recovery attempt.
  • The recovery instruction should be API-only context, not user-authored transcript text.
  • Preflight compression should not commit a session split while a turn is already interrupted, unless the split is safely propagated and the synthetic recovery note is not serialized.
  • Once a specific trailing tool batch is delivered to the model, subsequent user turns should not receive the same auto-continue note for that same batch.
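The third expectation (no session split committed for an already-interrupted turn) might look like this in outline. Everything here is hypothetical: `ToySession`, `preflight_compress`, and the callables passed in are illustrative stand-ins, not the structure of run_agent.py.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToySession:
    session_id: str
    messages: List[dict] = field(default_factory=list)
    interrupt_requested: bool = False


def preflight_compress(
    session: ToySession,
    compress: Callable[[List[dict]], List[dict]],
    over_threshold: Callable[[List[dict]], bool],
) -> bool:
    """Return True only if a compressed child session was committed."""
    if not over_threshold(session.messages):
        return False
    if session.interrupt_requested:  # check before starting compression
        return False
    compacted = compress(session.messages)  # slow; an interrupt may arrive here
    if session.interrupt_requested:  # re-check before committing the split
        return False  # discard the result; id and messages stay untouched
    session.messages = compacted
    session.session_id = uuid.uuid4().hex  # rotate only on a committed split
    return True


interrupted = ToySession("parent", [{"role": "user", "content": "a"}] * 5)


def slow_compress(msgs: List[dict]) -> List[dict]:
    interrupted.interrupt_requested = True  # simulate a mid-compression interrupt
    return msgs[-1:]


committed = preflight_compress(interrupted, slow_compress, lambda m: len(m) > 3)
```

Discarding a finished compression wastes the work already done, so deferring interrupts during the critical section is a reasonable alternative; the sketch only shows the check and re-check variant.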

Proposed fix direction

  1. Use the existing AIAgent.run_conversation(..., persist_user_message=...) support for the gateway auto-continue prefix:

    agent.run_conversation(
        run_message_with_recovery_note,
        conversation_history=agent_history,
        task_id=session_id,
        persist_user_message=clean_user_message,
    )
  2. Add durable consumed state for inferred tool-tail recovery, similar in spirit to resume_pending:

    auto_continue_tool_tail_key: Optional[str]
    auto_continue_tool_tail_ack_at: Optional[datetime]

    Compute the key from the trailing consecutive tool messages, e.g. tool_call_id, tool_name, and a content hash.

  3. Gate _has_fresh_tool_tail on that key not already being acknowledged.

  4. Mark the key acknowledged once the recovery turn has actually reached the model.

  5. Make preflight compression interrupt-aware:

    • check _interrupt_requested before starting compression;
    • re-check before committing a session split;
    • or defer gateway interrupts while compression is in a critical section.
  6. Ensure compression-induced agent.session_id changes are propagated back to gateway/session store even for interrupted/no-final-response turns, or avoid committing the split in that path.
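Steps 2 through 4 could be sketched as below. The two field names come from the proposal above; `SessionState`, `tool_tail_key`, `has_fresh_tool_tail`, `ack_tool_tail`, the use of a `name` field for the tool name, and the SHA-256 hashing scheme are assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


def tool_tail_key(history: list) -> Optional[str]:
    """Key over the trailing consecutive tool messages, or None if absent."""
    tail = []
    for msg in reversed(history):
        if msg.get("role") != "tool":
            break
        tail.append(msg)
    if not tail:
        return None
    digest = hashlib.sha256()
    for msg in reversed(tail):  # back to transcript order
        content = str(msg.get("content", "")).encode("utf-8")
        parts = (
            str(msg.get("tool_call_id", "")),
            str(msg.get("name", "")),  # assumed location of the tool name
            hashlib.sha256(content).hexdigest(),
        )
        digest.update("\x1f".join(parts).encode("utf-8"))
        digest.update(b"\x1e")  # record separator between messages
    return digest.hexdigest()


@dataclass
class SessionState:
    auto_continue_tool_tail_key: Optional[str] = None
    auto_continue_tool_tail_ack_at: Optional[datetime] = None


def has_fresh_tool_tail(
    state: SessionState, history: list, interruption_is_fresh: bool
) -> bool:
    key = tool_tail_key(history)
    already_acked = key == state.auto_continue_tool_tail_key
    return bool(interruption_is_fresh and key is not None and not already_acked)


def ack_tool_tail(state: SessionState, history: list) -> None:
    """Call once the recovery turn has actually reached the model."""
    state.auto_continue_tool_tail_key = tool_tail_key(history)
    state.auto_continue_tool_tail_ack_at = datetime.now(timezone.utc)
```

Storing only the most recent key keeps the state small; it relies on the assumption that an older tool tail cannot reappear at the end of the transcript once newer messages follow it.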

