hermes - ✅(Solved) Fix [Bug]: VS Code ACP prompt can remain in-flight indefinitely after repeated compression timeout [2 pull requests, 2 comments, 3 participants]

GitHub stats
NousResearch/hermes-agent#20250 · Fetched 2026-05-06 06:37:51
Comments: 2
Participants: 3
Timeline: 9
Reactions: 0
Timeline (top): labeled ×4, cross-referenced ×3, commented ×2

Error Message

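A long VS Code ACP session had already compressed several times. The latest prompt (session/prompt #10) started but never finished: no [acp] <-- #10 OK, [session] prompt done, or [ui] prompt finished was logged.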

Root Cause

In VS Code the UX looks like a spinner/stale session, but the backend process is not dead. Because the log mtime/size continues to advance with heartbeat-like session/update events, a simple "is the log growing?" check can incorrectly classify the turn as active even though the prompt is not completing and no messages are being persisted.

The practical recovery was to reload VS Code / kill only the VS Code-spawned hermes acp process and reconnect, then start a fresh ACP session because the old one had compressed 6 times.

Fix Action

Fixed

PR fix notes

PR #20356: fix(acp): drop shadowing import json in build_tool_start (#20250)

Description (problem / solution / changelog)

Summary

  • The generic-fallback branch of acp_adapter.tools.build_tool_start had a function-scope import json even though json is imported at module scope. Python treats it as a local binding for the entire function body, which made the earlier json.dumps call in the _POLISHED_TOOLS branch raise UnboundLocalError for any polished tool without a dedicated branch.
  • Affected tools include discord, discord_admin, kanban_*, ha_*, browser_* (the ones not given dedicated rendering branches), feishu_*, yb_*, vision_analyze, image_generate, text_to_speech, cronjob, send_message, mixture_of_agents, clarify, etc.
  • This matches the secondary error reported in #20250:
    UnboundLocalError: cannot access local variable 'json' where it is not associated with a value
      File ".../acp_adapter/server.py", line 622, in _replay_session_history
        if not await _send(build_tool_start(tool_call_id, tool_name, args)):
      File ".../acp_adapter/tools.py", line 1128, in build_tool_start
  • Removed the redundant import, kept a comment explaining the foot-gun, and added a parametrized regression test covering 20 affected polished tool names.
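
The scoping rule behind this bug can be reproduced in isolation. The sketch below is not the code in acp_adapter/tools.py; `render_buggy`, `render_fixed`, `POLISHED`, and `raises_unbound_local` are hypothetical stand-ins that only mirror the shape described above: a module-level `import json`, an early branch that calls `json.dumps`, and a later branch that re-imports `json` inside the function.

```python
import json

# Hypothetical stand-in for the set of polished tool names.
POLISHED = {"discord", "cronjob"}


def render_buggy(tool_name, args):
    if tool_name in POLISHED:
        # `json` is a local name here (because of the import below)
        # but has not been bound yet, so this line raises.
        return json.dumps(args, sort_keys=True)
    # A function-scope import makes `json` local for the WHOLE function body.
    import json
    return json.dumps({"tool": tool_name, "args": args}, sort_keys=True)


def render_fixed(tool_name, args):
    # No local import: both branches use the module-level `json`.
    if tool_name in POLISHED:
        return json.dumps(args, sort_keys=True)
    return json.dumps({"tool": tool_name, "args": args}, sort_keys=True)


def raises_unbound_local(fn, *call_args):
    """Return True if calling fn(*call_args) raises UnboundLocalError."""
    try:
        fn(*call_args)
    except UnboundLocalError:
        return True
    return False
```

In the buggy version only the polished branch fails; the fallback branch executes the import first and works, which is consistent with the error appearing for some tool names and not others.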

Refs #20250 (this is the secondary build_tool_start error; the primary lifecycle issue is not addressed here).

Testing

  • scripts/run_tests.sh tests/acp/test_tools.py::TestBuildToolStart -q
  • scripts/run_tests.sh tests/acp/test_tools.py -q (full file)
  • Manual repro before fix: build_tool_start("tc", "discord", {}) raises UnboundLocalError. After fix: returns a ToolCallStart.

Test output

▶ running pytest with 4 workers, hermetic env, in /tmp/hermes-20250a-fix
  (TZ=UTC LANG=C.UTF-8 PYTHONHASHSEED=0; all credential env vars unset)
bringing up nodes...
bringing up nodes...

...............................                                          [100%]
31 passed in 2.48s

Full file:

........................................................................ [ 96%]
...                                                                      [100%]
75 passed in 2.70s

Changed files

  • acp_adapter/tools.py (modified, +8/-1)
  • tests/acp/test_tools.py (modified, +40/-0)

PR #20357: fix(acp): accept allow_permanent kwarg in approval callback (#20250)

Description (problem / solution / changelog)

Summary

  • acp_adapter.permissions.make_approval_callback returned a callback whose signature was (command, description) only.
  • tools.approval.prompt_dangerous_approval always invokes its callback with allow_permanent= (see tools/approval.py line 596), so the ACP callback raised:
    TypeError: make_approval_callback.<locals>._callback() got an unexpected keyword argument 'allow_permanent'
  • The error was swallowed inside prompt_dangerous_approval and converted to a hard "deny", leaving VS Code/Zed ACP users unable to approve any dangerous command and only this log line to point at the cause:
    Approval callback failed: ... unexpected keyword argument 'allow_permanent'
  • Updates the inner callback to accept *, allow_permanent=True and to drop the allow_always option from the ACP options list when allow_permanent is False (mirrors the CLI's [a]lways suppression behaviour for tirith-flagged commands).
  • Adds 6 regression tests in tests/acp/test_permissions.py.
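
The signature mismatch and the fix can be sketched as follows. This is an illustration under assumptions, not the code in acp_adapter/permissions.py: `make_callback_old`, `make_callback_new`, `prompt_with_fallback`, and `echo_options` are hypothetical names, and the option labels `allow_once` and `deny` are invented for the example (only `allow_always` and `allow_permanent` come from the PR description).

```python
def make_callback_old(ask):
    # Old shape: positional (command, description) only.
    def _callback(command, description):
        return ask(command, description, ["allow_once", "allow_always", "deny"])
    return _callback


def make_callback_new(ask):
    # New shape: also accepts the keyword-only allow_permanent flag.
    def _callback(command, description, *, allow_permanent=True):
        options = ["allow_once", "allow_always", "deny"]
        if not allow_permanent:
            # Do not offer a permanent approval when the caller forbids it.
            options.remove("allow_always")
        return ask(command, description, options)
    return _callback


def prompt_with_fallback(callback, command, description, allow_permanent):
    """Mimic the described caller: any callback failure becomes a hard deny."""
    try:
        return callback(command, description, allow_permanent=allow_permanent)
    except Exception:
        return "deny"


def echo_options(command, description, options):
    """Fake client that simply reports which options it was offered."""
    return list(options)
```

With the old callback every call collapses to "deny" because the TypeError is swallowed; with the new one the client sees the option list, minus `allow_always` when `allow_permanent` is False.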

Refs #20250 (this is the second of two secondary errors reported alongside the prompt-lifecycle issue; the first was the UnboundLocalError in build_tool_start, addressed separately in #20356).

Testing

  • scripts/run_tests.sh tests/acp/test_permissions.py -q
  • scripts/run_tests.sh tests/acp/test_permissions.py::TestApprovalCallbackSignature -q
  • Verified that the new tests fail on origin/main (TypeError on the kwarg), and pass after the patch.

Test output

▶ running pytest with 4 workers, hermetic env, in /tmp/hermes-20250b-fix
  (TZ=UTC LANG=C.UTF-8 PYTHONHASHSEED=0; all credential env vars unset)
bringing up nodes...
bringing up nodes...

...........                                                              [100%]
11 passed in 2.46s

Changed files

  • acp_adapter/permissions.py (modified, +18/-6)
  • tests/acp/test_permissions.py (modified, +113/-0)


Bug description

A VS Code-spawned Hermes ACP session can remain in-flight indefinitely after repeated context compression failures. The backend process stays alive and the VS Code output log keeps growing with repeated session/update events, but the active session/prompt #N never completes and no new assistant/tool messages are persisted to state.db.

This looks different from a simple UI-only stale spinner: the Hermes ACP prompt lifecycle itself is left without a matching #N OK, [session] prompt done, or [ui] prompt finished.

Environment

  • Hermes Agent: v0.12.0 (2026.4.30)
  • Hermes source commit: b2dd73455
  • VS Code: 1.117.0 arm64 on macOS
  • VS Code extension: [email protected]
  • ACP command: hermes acp
  • Provider/model: openai-codex / gpt-5.5
  • Session source: ACP / VS Code

Observed behavior

A long VS Code ACP session had already compressed several times. The latest prompt started but did not finish:

[acp] <-- #9 OK
[session] prompt done b47219a4-...-704d used=52825213 cached=396647936
[ui] prompt finished
[ui] send (27 chars)
[ui] run prompt (27 chars)
[ui] attached IDE context (437 chars)
[session] reusing b47219a4-...-704d
[session] prompt b47219a4-...-704d (464 chars)
[acp] --> session/prompt #10
2026-05-05 14:43:29 [INFO] acp_adapter.server: Prompt on session b47219a4-...-704d: [Active file: _plans/android-watch-integration/03-findings-ledger.json]
2026-05-05 14:43:29 [INFO] agent.auxiliary_client: Auxiliary compression: using auto (gpt-5.5) at https://chatgpt.com/backend-api/codex/
2026-05-05 14:54:18 [INFO] agent.auxiliary_client: Auxiliary compression: connection error on auto (The read operation timed out), trying fallback
2026-05-05 14:54:19 [INFO] agent.auxiliary_client: Auxiliary compression: connection error on auto — falling back to api-key (gemini-3-flash-preview)
2026-05-05 14:56:19 [WARNING] root: Failed to generate context summary: The read operation timed out. Further summary attempts paused for 60 seconds.
⚠️  Session compressed 6 times — accuracy may degrade. Consider /new to start fresh.

After this point:

  • The VS Code Hermes log kept appending repeated events like:
[acp] <-- session/update
[acp] <-- session/update
[acp] <-- session/update
...
  • There was still no matching completion for prompt #10:
    • no [acp] <-- #10 OK
    • no [session] prompt done ...
    • no [ui] prompt finished
  • The VS Code-spawned ACP process remained alive but mostly idle/sleeping.
  • ~/.hermes/state.db had no newer persisted messages for the ACP session after the prior completed response at 14:40:53.

The last completed assistant response before the stuck prompt was persisted normally and the session/database did not show later progress for prompt #10.

Expected behavior

If compression fails or repeatedly times out during an ACP prompt, Hermes should still settle the ACP prompt lifecycle. Any of these outcomes would be better than leaving a prompt in-flight indefinitely:

  1. Continue with a safe fallback context marker and eventually emit #N OK / prompt-finished events.
  2. Return a structured error/failure update to the ACP client and mark the prompt complete.
  3. Trip a bounded watchdog/timeout that cancels the turn and tells the user to start a fresh session.
  4. Stop sending heartbeat-only session/update events once the backend has no real progress to report, or include an explicit stalled/compression-failed state so clients can surface recovery guidance.
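
Outcomes 2 and 3 can be sketched with a wrapper that always returns a terminal result for a prompt. This is a minimal sketch, assuming the turn is an awaitable coroutine; `settle_prompt`, the result dictionary shape, and the three sample turns are hypothetical and do not reflect Hermes internals or the ACP schema.

```python
import asyncio


async def settle_prompt(run_turn, timeout_s):
    """Run one turn and always return a terminal result for the prompt."""
    try:
        result = await asyncio.wait_for(run_turn(), timeout=timeout_s)
        return {"status": "ok", "result": result}
    except asyncio.TimeoutError:
        # Watchdog tripped: the turn was cancelled, report a terminal error.
        return {"status": "error", "reason": "timeout",
                "hint": "start a fresh session"}
    except Exception as exc:
        # Any other failure is also reported instead of leaving the prompt open.
        return {"status": "error", "reason": f"{type(exc).__name__}: {exc}"}


async def quick_turn():
    return "done"


async def hung_turn():
    await asyncio.sleep(3600)  # stands in for a turn that never completes


async def failing_turn():
    raise RuntimeError("compression failed")
```

Note that `asyncio.wait_for` can only cancel awaitable work. If the slow step is a blocking call running in a worker thread, the wrapper would still return on timeout, but the underlying call would keep running until it finishes on its own.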

Why this matters

In VS Code the UX looks like a spinner/stale session, but the backend process is not dead. Because the log mtime/size continues to advance with heartbeat-like session/update events, a simple "is the log growing?" check can incorrectly classify the turn as active even though the prompt is not completing and no messages are being persisted.

The practical recovery was to reload VS Code / kill only the VS Code-spawned hermes acp process and reconnect, then start a fresh ACP session because the old one had compressed 6 times.
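
A check based on prompt completion rather than log growth could look like the sketch below. It assumes the VS Code output log keeps the line shapes quoted in this report (`[acp] --> session/prompt #N` and `[acp] <-- #N OK`); `unmatched_prompts` and `SAMPLE` are hypothetical names, and the sample lines are copied from the excerpt above.

```python
import re

PROMPT_SENT = re.compile(r"^\[acp\] --> session/prompt #(\d+)")
REQUEST_OK = re.compile(r"^\[acp\] <-- #(\d+) OK")


def unmatched_prompts(lines):
    """Return the ids of session/prompt requests with no matching '#N OK'."""
    pending = set()
    for line in lines:
        sent = PROMPT_SENT.match(line)
        if sent:
            pending.add(int(sent.group(1)))
            continue
        done = REQUEST_OK.match(line)
        if done:
            # discard() ignores ids whose request was sent before the excerpt.
            pending.discard(int(done.group(1)))
    return sorted(pending)


SAMPLE = [
    "[acp] <-- #9 OK",
    "[ui] prompt finished",
    "[acp] --> session/prompt #10",
    "[acp] <-- session/update",
    "[acp] <-- session/update",
    "[acp] <-- session/update",
]
```

Here the repeated session/update lines do not change the result: prompt #10 stays listed as unmatched until an `[acp] <-- #10 OK` line appears, no matter how much the log grows.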

Related issues

This appears related but not identical to:

  • #18458 — compression summary failures/timeouts, but this issue is specifically about the ACP prompt lifecycle remaining in-flight afterward.
  • #15618 — exposing context/compression metadata would help UIs, but this bug still needs backend lifecycle/error handling.
  • joaompfp/hermes-vscode#3 — VS Code ACP sessions going quiet; this report adds backend evidence of an unmatched ACP prompt after repeated compression timeout.

Additional secondary errors seen in same ACP process

These may deserve separate issues if they are not already tracked, but I am including them because they appeared in the same VS Code ACP run:

UnboundLocalError: cannot access local variable 'json' where it is not associated with a value
  File ".../acp_adapter/server.py", line 622, in _replay_session_history
    if not await _send(build_tool_start(tool_call_id, tool_name, args)):
  File ".../acp_adapter/tools.py", line 1128, in build_tool_start

and:

Approval callback failed: make_approval_callback.<locals>._callback() got an unexpected keyword argument 'allow_permanent'
  File ".../tools/approval.py", line 596, in prompt_dangerous_approval
    return approval_callback(command, description, ...)

Those may be separate ACP regressions, but the main issue here is the unmatched session/prompt #10 after compression timeout.

extent analysis

TL;DR

A Hermes ACP prompt can remain in-flight indefinitely after repeated context compression timeouts. The backend process stays alive and the VS Code output log keeps growing with session/update events, but the prompt never completes and no new messages are persisted to state.db.

Guidance

  • Investigate the compression timeout issue and consider implementing a bounded watchdog/timeout that cancels the turn and notifies the user to start a fresh session.
  • Review the error handling for compression failures and consider returning a structured error/failure update to the ACP client and marking the prompt complete.
  • Check the ACP prompt lifecycle handling to ensure it settles the prompt lifecycle even if compression fails or times out.
  • Consider adding an explicit stalled/compression-failed state to the session/update events to allow clients to surface recovery guidance.
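
The last point can be illustrated with a small tracker that separates "the process is alive" from "the turn is making progress". This is a sketch under assumptions: `ProgressTracker`, `FakeClock`, and the `state` / `idle_s` fields are hypothetical and are not part of the ACP session/update schema.

```python
import time


class ProgressTracker:
    """Track the time since the last real progress (message, tool call, ...)."""

    def __init__(self, stall_after_s, clock=time.monotonic):
        self._stall_after_s = stall_after_s
        self._clock = clock
        self._last_progress = clock()

    def mark_progress(self):
        # Call this only for real work, never for heartbeats.
        self._last_progress = self._clock()

    def heartbeat(self):
        """Build a heartbeat payload that states whether the turn is stalled."""
        idle_s = self._clock() - self._last_progress
        state = "stalled" if idle_s >= self._stall_after_s else "working"
        return {"state": state, "idle_s": idle_s}


class FakeClock:
    """Manually advanced clock so the example is deterministic."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now
```

A client receiving such a payload could keep showing a spinner while the state is "working" and switch to recovery guidance once it reads "stalled", instead of inferring liveness from log growth.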

Example

No code snippet is provided as the issue is more related to the overall architecture and error handling of the Hermes ACP session.

Notes

The logs point at the handling of compression failures and timeouts inside an ACP prompt: after the auxiliary compression call times out on both the primary and the fallback provider, the prompt never reaches a terminal state while session/update events keep arriving. The components visible in the logs are acp_adapter.server and agent.auxiliary_client; the report does not identify the exact code path that fails to settle the prompt.

Recommendation

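Neither linked PR addresses the primary lifecycle issue; #20356 and #20357 fix the two secondary errors. Until it is fixed, the recovery reported in the issue is to reload VS Code or kill only the VS Code-spawned hermes acp process, reconnect, and start a fresh ACP session. On the backend side, a bounded watchdog/timeout that cancels the turn and tells the user to start a fresh session would prevent a prompt from staying in-flight indefinitely.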


