hermes: Fix concurrent checkpoint preflight side effects in tool_executor

Concurrent Checkpoint Preflight Side Effects In Tool Executor

Bug

Hermes can mutate checkpoint state for a concurrent tool call before Hermes knows whether that tool call will actually be allowed to execute.

Affected concurrent tool families:

  • write_file
  • patch
  • destructive terminal commands

Observed bad outcomes:

  • a blocked tool can still create a checkpoint
  • a blocked tool can consume the per-turn checkpoint dedup slot for a directory
  • a later allowed tool in the same directory and same turn can lose its real pre-change checkpoint because the blocked sibling already consumed it

This is a checkpoint-ordering bug in Hermes core. The blocked tool itself never executes; only its checkpoint side effects leak.

System Details

Observed environment for this report:

  • OS runtime string: Microsoft Windows NT 10.0.26200.0
  • OS architecture: 64-bit
  • shell: PowerShell 7.6.0
  • Hermes version: 0.15.1
  • repo branch: master
  • repo commit: 94a70b439ecd514a90420de6f86f43e7e411a1d9
  • working directory: C:\Users\innad\AppData\Local\hermes
  • report timestamp: 2026-05-29 20:42:29 +01:00

Where The Bug Is

Primary location:

  • hermes-agent/agent/tool_executor.py

Failing concurrent path:

  • concurrent tool parsing and checkpoint preflight in hermes-agent/agent/tool_executor.py

Relevant checkpoint manager state:

  • CheckpointManager.ensure_checkpoint() in hermes-agent/tools/checkpoint_manager.py

Correct reference path for comparison:

  • sequential tool execution flow in hermes-agent/agent/tool_executor.py

Call-order difference:

  1. the concurrent path performs checkpoint preflight for write_file, patch, and destructive terminal commands
  2. only after that does Hermes evaluate pre-tool blocking and tool guardrails
  3. the sequential path does the opposite and only checkpoints when execution is not blocked

What Actually Fails

The bug is the order of operations in the concurrent executor.

The concurrent path assumes checkpoint preflight is harmless before execution eligibility is known. That assumption is false because ensure_checkpoint() is stateful. It updates per-turn checkpoint dedup state and can create a real checkpoint before the tool has cleared later block checks.

That creates two failure modes:

  • ghost checkpointing: a blocked tool leaves checkpoint side effects even though it never executed
  • checkpoint slot stealing: a blocked tool consumes the per-turn checkpoint slot for a directory, causing a later allowed tool in the same directory to skip its real pre-change checkpoint
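Both failure modes can be seen in a toy model of the buggy ordering. This is only a sketch: the module-level set and list, the `run_buggy` helper, and the dictionary-shaped calls are all hypothetical stand-ins, not Hermes code.

```python
# Toy model only: every name here is an assumption, not the Hermes implementation.
checkpointed_dirs = set()   # per-turn dedup slots, one per directory
checkpoints = []            # (directory, tool) pairs that were snapshotted


def ensure_checkpoint(directory, tool):
    # Stateful: the first caller in a turn consumes the slot for the directory.
    if directory in checkpointed_dirs:
        return False
    checkpointed_dirs.add(directory)
    checkpoints.append((directory, tool))
    return True


def run_buggy(call):
    # Buggy ordering: checkpoint first, evaluate blocking afterwards.
    ensure_checkpoint(call["dir"], call["tool"])
    if call["blocked"]:
        return "blocked"
    return "executed"


results = [
    run_buggy({"tool": "write_file", "dir": "/repo", "blocked": True}),
    run_buggy({"tool": "patch", "dir": "/repo", "blocked": False}),
]
```

In this model the only recorded checkpoint belongs to the blocked write_file call (ghost checkpointing), and the allowed patch call takes no checkpoint of its own because the slot for the directory is already consumed (slot stealing).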

Evidence Collected

Case 1: Concurrent write_file and patch checkpoint before block evaluation

Observed in hermes-agent/agent/tool_executor.py:

  • the concurrent path unwraps tool calls
  • it then checkpoints write_file and patch
  • only after that does it compute block_result and blocked_by_guardrail

Interpretation:

  • a concurrent file-mutating tool can touch checkpoint state before Hermes decides whether execution is allowed

Case 2: Concurrent destructive terminal checkpoint before block evaluation

Observed in hermes-agent/agent/tool_executor.py:

  • the concurrent path checks destructive terminal commands
  • it calls ensure_checkpoint(...)
  • only after that does it compute block_result and blocked_by_guardrail

Interpretation:

  • the same ordering bug applies to destructive terminal checkpointing, not only file tools

Case 3: Checkpoint dedup state mutates before the snapshot attempt returns

Observed in hermes-agent/tools/checkpoint_manager.py:

  • new_turn() clears _checkpointed_dirs
  • ensure_checkpoint() returns early if the directory is already present
  • otherwise it adds the normalized directory to _checkpointed_dirs
  • only then does it call _take(...)

Interpretation:

  • checkpoint preflight is not read-only
  • once a blocked concurrent call reaches ensure_checkpoint(), the per-turn dedup state can already be consumed even if the tool never runs
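The dedup behavior described in Case 3 can be approximated as follows. This is a minimal sketch that assumes the real manager follows the order listed above; the class name `ToyCheckpointManager`, the `snapshots` list, and the boolean return values are illustrative assumptions rather than the actual checkpoint_manager.py code.

```python
import os


class ToyCheckpointManager:
    """Hypothetical stand-in that mirrors the order of operations in Case 3."""

    def __init__(self):
        self._checkpointed_dirs = set()  # per-turn dedup state
        self.snapshots = []              # directories that were snapshotted

    def new_turn(self):
        # A new turn clears the per-turn dedup set.
        self._checkpointed_dirs.clear()

    def ensure_checkpoint(self, working_dir):
        key = os.path.normpath(working_dir)
        if key in self._checkpointed_dirs:
            return False  # slot already consumed this turn
        # The dedup state is mutated before the snapshot attempt returns.
        self._checkpointed_dirs.add(key)
        return self._take(key)

    def _take(self, key):
        # Placeholder for the real snapshot logic.
        self.snapshots.append(key)
        return True


manager = ToyCheckpointManager()
manager.new_turn()
first = manager.ensure_checkpoint("/repo/")
second = manager.ensure_checkpoint("/repo")  # same directory after normalization
```

Under these assumptions the second call returns early without a snapshot, which is why any caller that reaches ensure_checkpoint() first, blocked or not, decides who owns the slot for that directory in the turn.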

Case 4: Sequential path already uses the correct order

Observed in hermes-agent/agent/tool_executor.py:

  • the sequential path wraps checkpoint creation in if not _execution_blocked

Interpretation:

  • Hermes already has the correct behavior in the non-concurrent path
  • the bug is a concurrent-path ordering drift, not a missing product concept

How To Reproduce

Reproduction Shape

The bug reproduces when all of these are true:

  1. Hermes receives concurrent tool calls in one batch.
  2. At least one call targets write_file, patch, or a destructive terminal command.
  3. At least one of those calls is blocked after parsing but before execution.
  4. Either another call in the same turn targets the same working directory and actually runs, or checkpoint history is inspected after the blocked call.

Concrete Reproduction

  1. Prepare a repo or working directory where Hermes checkpointing is enabled.
  2. Trigger a concurrent batch with two writes against the same working directory.
  3. Ensure the first write is blocked after parsing but before execution.
  4. Ensure the second write is allowed and executes.
  5. Inspect checkpoint history for that directory.

Actual bad results that this shape allows:

  • a checkpoint exists even though the blocked tool never executed
  • or the allowed second write has no fresh pre-change checkpoint because the blocked first write already consumed the per-turn checkpoint slot

Terminal variant:

  1. Trigger a concurrent batch containing a destructive terminal command and another tool for the same working directory.
  2. Ensure the destructive terminal call is blocked after parsing but before execution.
  3. Inspect checkpoint history.

Actual bad result:

  • checkpoint side effects can still exist for the blocked destructive terminal call

Non-Bug Clarifications

These are not the root cause:

  • the blocked tool actually executing
  • any single permission policy implementation
  • checkpoint manager dedup by itself

The root cause is the concurrent executor calling checkpoint preflight before Hermes finishes deciding whether the tool call is allowed to run.

Fix

Recommended Fix

Make the concurrent executor use the same gating rule the sequential executor already uses: checkpoint only after Hermes has decided the tool call will execute.

Implementation rule:

  1. Parse and normalize concurrent tool calls.
  2. Compute all block outcomes first.
  3. Only for calls with no block outcome, perform checkpoint creation for write_file, patch, and destructive terminal commands.
  4. Leave the stateful behavior of CheckpointManager.ensure_checkpoint() unchanged unless a separate checkpoint-manager issue is identified.

Exact Scope

Files to change:

  • hermes-agent/agent/tool_executor.py
  • add or update concurrent executor tests under hermes-agent/tests/run_agent/

Exact Behavior

Concurrent execution should do this:

  1. parse the tool call
  2. resolve any tool-search indirection
  3. evaluate pre-tool blocking and guardrails
  4. if blocked:
    • do not call ensure_checkpoint(...)
    • return the blocked result
  5. if allowed:
    • call ensure_checkpoint(...) for write_file, patch, or a destructive terminal command
    • execute the tool normally
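The ordering listed above might look roughly like this in code. It is a sketch under stated assumptions: the `Call` dataclass, the `blocked` flag standing in for the block and guardrail evaluation, the `terminal` tool name, and the inline dedup set are all hypothetical, and tool-search resolution is omitted.

```python
from dataclasses import dataclass


@dataclass
class Call:
    tool: str
    working_dir: str
    blocked: bool = False      # stand-in for the block / guardrail outcome
    destructive: bool = False  # only meaningful for terminal calls


def needs_checkpoint(call):
    # Tool families named in the report; "terminal" is an assumed tool name.
    if call.tool in ("write_file", "patch"):
        return True
    return call.tool == "terminal" and call.destructive


def run_concurrent_call(call, checkpointed_dirs, checkpoints):
    # Blocking is decided first; a blocked call leaves checkpoint state untouched.
    if call.blocked:
        return "blocked"
    # Only calls that will execute may consume the per-turn slot.
    if needs_checkpoint(call) and call.working_dir not in checkpointed_dirs:
        checkpointed_dirs.add(call.working_dir)
        checkpoints.append((call.working_dir, call.tool))
    return "executed"  # placeholder for the real tool execution


checkpointed_dirs, checkpoints = set(), []
batch = [
    Call("write_file", "/repo", blocked=True),
    Call("patch", "/repo"),
]
outcomes = [run_concurrent_call(c, checkpointed_dirs, checkpoints) for c in batch]
```

With the block check ahead of the checkpoint, the blocked write_file call records nothing and the allowed patch call is the one that consumes the slot for the directory.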

Why This Fix Is Correct

It restores the same safety invariant Hermes already enforces in the sequential path:

  • blocked tools do not mutate checkpoint state
  • only tools that will actually execute can consume the per-turn checkpoint slot or create a checkpoint

It also fixes both visible failure modes with one ordering correction:

  • no ghost checkpoints for blocked tools
  • no checkpoint slot stealing from later allowed tools in the same directory and turn

Why Not Leave It As-Is

Without this fix, checkpoint history is not a reliable representation of real execution in concurrent runs.

That causes two product problems:

  • users can see checkpoint side effects tied to tools that never ran
  • recovery quality degrades because a real allowed mutation can lose its pre-change checkpoint if a blocked sibling consumed the slot first

Suggested Tests

Add tests that cover:

  1. concurrent write_file blocked before execution does not create or consume a checkpoint
  2. concurrent patch blocked before execution does not create or consume a checkpoint
  3. concurrent destructive terminal command blocked before execution does not create or consume a checkpoint
  4. concurrent blocked write followed by concurrent allowed write in the same directory still creates a checkpoint for the allowed write
  5. sequential blocked write remains unchanged and still does not checkpoint
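One way to express the first four tests is to pass a mock checkpoint manager and assert on its calls. The sketch below runs against a hypothetical `run_concurrent_call` helper with the corrected order, not against the real executor; the call dictionaries and the `terminal` tool name are assumptions, and real tests would need to drive the actual concurrent path in tool_executor.py.

```python
import unittest
from unittest.mock import Mock


def run_concurrent_call(call, checkpoint_manager):
    """Hypothetical executor step using the corrected order."""
    if call["blocked"]:
        return "blocked"
    if call["needs_checkpoint"]:
        checkpoint_manager.ensure_checkpoint(call["dir"])
    return "executed"


def make_call(tool, blocked):
    # Helper for building toy calls; the shape is illustrative only.
    return {"tool": tool, "dir": "/repo", "blocked": blocked, "needs_checkpoint": True}


class ConcurrentCheckpointOrderTests(unittest.TestCase):
    def test_blocked_calls_never_touch_checkpoint_manager(self):
        for tool in ("write_file", "patch", "terminal"):
            with self.subTest(tool=tool):
                manager = Mock()
                outcome = run_concurrent_call(make_call(tool, blocked=True), manager)
                self.assertEqual(outcome, "blocked")
                manager.ensure_checkpoint.assert_not_called()

    def test_allowed_sibling_still_checkpoints(self):
        manager = Mock()
        run_concurrent_call(make_call("write_file", blocked=True), manager)
        run_concurrent_call(make_call("write_file", blocked=False), manager)
        # Only the allowed call reaches the checkpoint manager.
        manager.ensure_checkpoint.assert_called_once_with("/repo")
```

Saved to a file, these can be run with `python -m unittest`. Asserting on the mock keeps the tests focused on call order rather than on snapshot contents.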
