openclaw - ✅(Solved) Fix Stronger run interruptibility: unified generation invalidation and stale-output fencing [1 pull request, 1 comment, 1 participant]

GitHub stats
openclaw/openclaw#70319 (fetched 2026-04-23 07:26:22)
Comments: 1 · Participants: 1 · Timeline events: 2 (commented ×1, cross-referenced ×1) · Reactions: 0

Error Message

  • /stop cuts some parts of the system but not all
  • Child processes may keep running after chat is aborted
  • A tool batch may continue with subsequent tools after partial cancellation
  • Stale progress/typing/messages keep arriving after stop
  • New user messages may not immediately supersede the active run

Fix Action

Fixed

PR fix notes

PR #70363: feat(auto-reply): run-generation fence for stronger interruptibility (refs #70319)

Description (problem / solution / changelog)

Closes (partially) openclaw/openclaw#70319.

Summary

Introduces a per-session run-generation counter that layers on top of the existing abort primitives (replyRunRegistry, abortEmbeddedPiRun, chat.abort, cancelScope) to provide a unified invalidation signal downstream code can consult before producing side effects.

Goal from the issue: after /stop or a new user message, the superseded run must not produce more output — tool calls, deltas, typing, final replies.

This PR ships the foundation (Pieces A + partial C from the issue's implementation sketch) and wires one real emission site so the fence is live.

What's in this PR

Piece A — Run generation registry (new)

src/auto-reply/reply/run-generation.ts: getCurrentGeneration, incrementGeneration, isCurrentGeneration, forgetGeneration. Global-singleton backed.
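
A minimal sketch of what such a registry might look like. The four function names come from the PR description; the Map storage, the `__exampleRunGenerations` global property, and the signatures are assumptions for illustration, not the actual contents of run-generation.ts.

```typescript
// Hypothetical sketch: per-session generation counters behind a process-wide singleton.
// The global property name and the Map storage are assumptions, not the PR's code.
type GenerationStore = Map<string, number>;

function generationStore(): GenerationStore {
  const holder = globalThis as unknown as { __exampleRunGenerations?: GenerationStore };
  if (!holder.__exampleRunGenerations) {
    holder.__exampleRunGenerations = new Map<string, number>();
  }
  return holder.__exampleRunGenerations;
}

// Current generation for a session; a session that was never seen starts at 0.
function getCurrentGeneration(sessionKey: string): number {
  return generationStore().get(sessionKey) ?? 0;
}

// Bumps the counter, which invalidates every run that captured the old value.
function incrementGeneration(sessionKey: string): number {
  const next = getCurrentGeneration(sessionKey) + 1;
  generationStore().set(sessionKey, next);
  return next;
}

// True only while the captured value still matches the session's counter.
function isCurrentGeneration(sessionKey: string, captured: number): boolean {
  return getCurrentGeneration(sessionKey) === captured;
}

// Drops the session's counter, for example when the session is torn down.
function forgetGeneration(sessionKey: string): void {
  generationStore().delete(sessionKey);
}
```

One caveat with this particular sketch: forgetting a session resets its counter to 0, so a very old run that captured 0 would look current again. A real implementation would need to make sure no run outlives the forget call.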

Piece A wiring — ReplyOperation carries the generation

  • ReplyOperation.runGeneration captured at createReplyOperation.
  • ReplyOperation.isCurrent() convenience wrapper over isCurrentGeneration(key, runGeneration).
  • abortWithReason in the registry bumps the generation, so any abortByUser / abortForRestart path invalidates the captured value.
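
The wiring above can be sketched roughly as follows. Only the names `runGeneration`, `isCurrent`, `createReplyOperation`, and `abortWithReason` come from the PR description; the field layout, the `AbortReason` type, and the local counter map are hypothetical stand-ins.

```typescript
// Hypothetical stand-in for the generation registry.
const generations = new Map<string, number>();

function currentGeneration(sessionKey: string): number {
  return generations.get(sessionKey) ?? 0;
}

type AbortReason = "user" | "restart";

interface ReplyOperation {
  readonly sessionKey: string;
  readonly runGeneration: number; // captured once, when the operation is created
  abortReason: AbortReason | undefined;
  isCurrent(): boolean;
}

function createReplyOperation(sessionKey: string): ReplyOperation {
  const runGeneration = currentGeneration(sessionKey);
  return {
    sessionKey,
    runGeneration,
    abortReason: undefined,
    // Convenience wrapper: compares the captured value with the live counter.
    isCurrent: () => currentGeneration(sessionKey) === runGeneration,
  };
}

// Any abort path bumps the counter, so the captured value goes stale.
function abortWithReason(operation: ReplyOperation, reason: AbortReason): void {
  operation.abortReason = reason;
  generations.set(operation.sessionKey, currentGeneration(operation.sessionKey) + 1);
}
```

In this sketch an aborted operation reports `isCurrent() === false`, while the next operation created for the same session captures the bumped value and is current again.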

Fast-abort integration

tryFastAbortFromMessage in abort.ts also bumps generation up-front so /stop fences late output even when the registry lookup misses (race between end-of-run and the stop message).
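
To illustrate why the bump happens up-front, here is a hedged sketch of the race. The function name `tryFastAbortFromMessage` is from the PR; the `activeRuns` map, the return shape, and the `/stop` matching are assumptions.

```typescript
// Hypothetical stand-ins for the generation registry and the reply run registry.
const generations = new Map<string, number>();
const activeRuns = new Map<string, { abort(): void }>();

function currentGeneration(sessionKey: string): number {
  return generations.get(sessionKey) ?? 0;
}

interface FastAbortResult {
  handled: boolean;
  abortedActiveRun: boolean;
}

function tryFastAbortFromMessage(sessionKey: string, text: string): FastAbortResult {
  if (text.trim().toLowerCase() !== "/stop") {
    return { handled: false, abortedActiveRun: false };
  }
  // Bump first. If the run has just finished and already left the registry,
  // any output it still has in flight is fenced by the new generation.
  generations.set(sessionKey, currentGeneration(sessionKey) + 1);

  const run = activeRuns.get(sessionKey);
  if (!run) {
    return { handled: true, abortedActiveRun: false }; // registry lookup missed
  }
  run.abort();
  activeRuns.delete(sessionKey);
  return { handled: true, abortedActiveRun: true };
}
```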

Piece C — Stale-output fence (partial, opt-in)

createBlockReplyDeliveryHandler accepts an optional isRunCurrent?: () => boolean parameter. When the callback is provided and returns false, the block reply is silently dropped before it reaches the channel. Callers pass () => replyOperation.isCurrent().
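
A simplified, synchronous sketch of the opt-in shape. The handler name and the `isRunCurrent` parameter are from the PR; the `BlockReply` type, the `send` sink, and the boolean return value are hypothetical, and the real handler likely takes more options.

```typescript
interface BlockReply {
  text: string;
}

interface BlockReplyDeliveryOptions {
  send: (reply: BlockReply) => void; // hypothetical channel sink
  isRunCurrent?: () => boolean; // optional fence; omitted means "always deliver"
}

// Returns a handler that reports whether the reply was actually delivered.
function createBlockReplyDeliveryHandler(
  options: BlockReplyDeliveryOptions,
): (reply: BlockReply) => boolean {
  return (reply) => {
    if (options.isRunCurrent && !options.isRunCurrent()) {
      return false; // stale run: drop silently before touching the channel
    }
    options.send(reply);
    return true;
  };
}
```

Because the parameter is optional, existing callers that do not pass a fence keep their current behavior, which matches the "layers on top" constraint stated in this PR.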

Live wiring

agent-runner-execution.ts passes the fence callback so post-abort block replies are actually dropped in the real runtime.

Piece D — SIGTERM→SIGKILL escalation

Already covered by src/process/kill-tree.test.ts. No change needed.

What's explicitly deferred

  • Piece B (pre-tool gate): wiring into pi-embedded-runner's tool dispatch. Pattern demonstrated in tests; wiring-only follow-up.
  • Piece E (new-message takeover): dispatch.ts / queue.ts integration. The primitive (incrementGeneration) is in place.
  • More emission sites: typing.ts, reply-delivery for non-block path, followup-delivery.ts. Same opt-in shape.

Tests

src/auto-reply/reply/run-generation.test.ts — 17 tests covering:

  • Registry primitives.
  • ReplyOperation generation wiring (capture at begin, flip on abort, next run captures new gen).
  • Stale-output fence pattern (Tests 2, 3, 5 from the issue).
  • Integration with createBlockReplyDeliveryHandler.

Verification output:

pnpm check:changed  → exit 0
  typecheck core    → ok
  typecheck tests   → ok
  lint core         → ok
  import cycles     → ok
  guards            → ok
  auto-reply suite  → 102 files / 1174 tests passed

pnpm build also passes clean.

Constraints honored

  • No change to gateway protocol.
  • No modification to AGENTS.md / CONTRIBUTING.md / CLAUDE.md.
  • Existing abort behavior unchanged — the new system layers on top.
  • TypeScript ESM, strict types, no any.
  • American English in code/comments.

Happy to break into smaller commits or reshape if the approach lands differently than maintainers prefer.

Changed files

  • src/auto-reply/reply/abort.ts (modified, +6/-0)
  • src/auto-reply/reply/agent-runner-execution.test.ts (modified, +2/-0)
  • src/auto-reply/reply/agent-runner-execution.ts (modified, +5/-0)
  • src/auto-reply/reply/reply-delivery.ts (modified, +12/-0)
  • src/auto-reply/reply/reply-run-registry.ts (modified, +26/-0)
  • src/auto-reply/reply/run-generation.test.ts (added, +274/-0)
  • src/auto-reply/reply/run-generation.ts (added, +102/-0)

Problem

When a user sends /stop or a new message while the agent is mid-run (executing tools, streaming, running subprocesses), the current run may continue producing side effects: launching more tools, emitting stale progress updates, and finishing subprocess chains.

This creates a real UX and safety issue: the user loses confidence in their ability to regain control.

Observed behavior

  • /stop cuts some parts of the system but not all
  • Child processes may keep running after chat is aborted
  • A tool batch may continue with subsequent tools after partial cancellation
  • Stale progress/typing/messages keep arriving after stop
  • New user messages may not immediately supersede the active run

Expected behavior

If the user sends a new message or /stop, the previous run should stop producing effects immediately. The new message should become the dominant instruction.

Existing primitives (they're good!)

OpenClaw already has solid abort primitives scattered across subsystems:

  • chat.abort RPC handler
  • abortEmbeddedPiRun() for embedded agent runs
  • clearSessionQueues() for queue cleanup
  • managedRun.cancel("manual-cancel") for exec processes
  • cancel(runId) / cancelScope(scopeKey) in the process supervisor
  • replyRunRegistry.abort() for reply run tracking
  • abortedLastRun flag in session store
  • handlerGeneration invalidation pattern in heartbeat-wake

These are all good building blocks. The issue is not missing cancellation, but missing coordination between them.

What's missing: unified run invalidation

The gap is a single coherent guarantee that:

  1. A new user message or /stop invalidates the active run
  2. The invalidated run cannot produce new side effects (messages, tool calls, progress, typing)
  3. Subprocesses owned by the invalidated run are cancelled
  4. Pending tool calls in the invalidated run are skipped
  5. The new user message becomes the dominant instruction immediately

Proposed approach

Introduce stronger run-scoped interruption semantics, inspired by patterns from Hermes (which implements a well-tested version of this):

1. Run generation counter per session

A simple incrementing counter. When an abort or a new message arrives, increment the generation. Every downstream check then validates that its captured generation is still current.

2. Pre-tool gate

Before each tool execution, check if the run's generation is still current. If not, skip the tool and return a cancelled result. (Hermes tests this explicitly in test_all_tools_skipped_when_interrupted.)
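
A minimal sketch of such a gate, assuming a synchronous tool batch for brevity. `ToolCall`, `ToolResult`, and `runToolBatch` are hypothetical names; the real dispatch in OpenClaw is asynchronous and is not shown in this issue.

```typescript
type ToolResult =
  | { tool: string; status: "ok"; output: string }
  | { tool: string; status: "cancelled" };

interface ToolCall {
  name: string;
  run: () => string;
}

// Checks the fence before every tool, not just once per batch, so a stop that
// arrives mid-batch skips all remaining tools.
function runToolBatch(calls: readonly ToolCall[], isRunCurrent: () => boolean): ToolResult[] {
  return calls.map((call): ToolResult => {
    if (!isRunCurrent()) {
      return { tool: call.name, status: "cancelled" };
    }
    return { tool: call.name, status: "ok", output: call.run() };
  });
}
```

Returning an explicit cancelled result, rather than throwing, keeps the batch result the same length as the input, so callers can still pair each result with its tool call.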

3. Stale output fence

Prevent stale runs from emitting visible effects. Before emitting streaming deltas, typing indicators, progress updates, or final messages: check generation. The pattern already exists in heartbeat-wake.ts — apply it to the reply pipeline.
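
One way to apply the same check to several emission sites is a small wrapper that captures the generation once and consults it on every emit. This is a sketch under assumptions: `fence`, `invalidate`, and the counter map are hypothetical, and it does not reproduce the code in heartbeat-wake.ts.

```typescript
// Hypothetical stand-in for the per-session generation counter.
const generations = new Map<string, number>();

function currentGeneration(sessionKey: string): number {
  return generations.get(sessionKey) ?? 0;
}

function invalidate(sessionKey: string): void {
  generations.set(sessionKey, currentGeneration(sessionKey) + 1);
}

// Wraps any emitter (delta, typing, progress, final message). The generation is
// captured when the wrapper is created, i.e. when the run starts.
function fence<T>(sessionKey: string, emit: (value: T) => void): (value: T) => boolean {
  const captured = currentGeneration(sessionKey);
  return (value) => {
    if (currentGeneration(sessionKey) !== captured) {
      return false; // the run was superseded: suppress the visible effect
    }
    emit(value);
    return true;
  };
}
```

A run would create its fenced emitters once at start, for example `const emitDelta = fence(sessionKey, sendDelta)`, where `sendDelta` stands for whatever function currently writes to the channel.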

4. Stronger subprocess cancellation

Wire exec/supervisor processes to session run scope. On generation change, cancel associated processes.
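
A sketch of run-scoped process ownership, assuming each process is registered with the generation of the run that launched it. The class `RunScopedProcesses` and the "run-superseded" reason are hypothetical; in OpenClaw the cancel callback would presumably delegate to existing primitives such as managedRun.cancel or cancelScope, whose signatures are not shown here.

```typescript
type Cancel = (reason: string) => void;

interface OwnedProcess {
  generation: number;
  cancel: Cancel;
}

class RunScopedProcesses {
  private readonly bySession = new Map<string, Set<OwnedProcess>>();

  // Registers a process under the run that launched it; returns an unregister function
  // to call when the process exits on its own.
  register(sessionKey: string, generation: number, cancel: Cancel): () => void {
    const entry: OwnedProcess = { generation, cancel };
    let owned = this.bySession.get(sessionKey);
    if (!owned) {
      owned = new Set<OwnedProcess>();
      this.bySession.set(sessionKey, owned);
    }
    owned.add(entry);
    const target = owned;
    return () => {
      target.delete(entry);
    };
  }

  // Cancels every process whose owning generation is no longer current.
  cancelStale(sessionKey: string, currentGeneration: number): number {
    const owned = this.bySession.get(sessionKey);
    if (!owned) return 0;
    let cancelled = 0;
    owned.forEach((entry) => {
      if (entry.generation === currentGeneration) return;
      owned.delete(entry);
      entry.cancel("run-superseded");
      cancelled += 1;
    });
    return cancelled;
  }
}
```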

5. New message takeover

When a new user message arrives during an active run: increment generation → cancel active processes → abort embedded run → clear queues → new message becomes next input.
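
The sequence above can be sketched as a single function over injected hooks. The hook names are placeholders; in OpenClaw they would presumably map onto incrementGeneration, cancelScope, abortEmbeddedPiRun, and clearSessionQueues, whose actual signatures are not given in this issue.

```typescript
interface TakeoverHooks {
  incrementGeneration(sessionKey: string): number;
  cancelProcesses(sessionKey: string): void;
  abortEmbeddedRun(sessionKey: string): void;
  clearQueues(sessionKey: string): void;
  enqueue(sessionKey: string, message: string): void;
}

// Runs the takeover steps in order and returns the new generation.
function takeOver(sessionKey: string, message: string, hooks: TakeoverHooks): number {
  // 1. Fence first, so anything the old run emits from here on is already stale.
  const generation = hooks.incrementGeneration(sessionKey);
  // 2-4. Tear down what the old run owns.
  hooks.cancelProcesses(sessionKey);
  hooks.abortEmbeddedRun(sessionKey);
  hooks.clearQueues(sessionKey);
  // 5. Only now does the new message become the next input.
  hooks.enqueue(sessionKey, message);
  return generation;
}
```

Bumping the generation before the teardown steps matters: the teardown calls may take time, and the fence covers any output the old run produces while they are in progress. This sketch does not handle a hook that throws partway through.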

Prior art

Hermes agent demonstrates these patterns with good test coverage:

  • Thread-scoped interrupt signaling (tools/interrupt.py)
  • Pre-tool interrupt checks with test coverage
  • Gateway run generation invalidation for stale outputs
  • SIGTERM→SIGKILL escalation for resistant processes
  • Pending message queue drain and combination

The goal is not a line-by-line port, but adapting these concepts to OpenClaw's async architecture.

Benefits

  • Safer production behavior (tool chains stop reliably)
  • Stronger user control and trust
  • More predictable /stop semantics
  • Fewer stale messages after abort
  • Foundation for safer autonomous operation

I'm willing to contribute a PR

I have a prototype implementation plan and would be happy to contribute a PR if the maintainers are interested. The implementation is designed to layer on top of existing primitives without breaking current behavior.

This would be AI-assisted (Claude Code) with testing.

Extended analysis

TL;DR

Introduce a unified run invalidation mechanism to ensure that a new user message or /stop command immediately stops the active run and prevents further side effects.

Guidance

  • Implement a run generation counter per session to track the current run and validate its continuity before executing tools or emitting outputs.
  • Add a pre-tool gate to skip tools and return a cancelled result if the run's generation is no longer current.
  • Establish a stale output fence to prevent stale runs from emitting visible effects, such as streaming deltas, typing indicators, or progress updates.
  • Enhance subprocess cancellation by wiring exec/supervisor processes to the session run scope and cancelling associated processes on generation change.
  • Develop a new message takeover mechanism to increment the generation, cancel active processes, abort the embedded run, clear queues, and prioritize the new message as the next input.

Example

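One possible shape is a RunManager class that owns the run generation counter and validates captured generations; tool skipping and subprocess cancellation would build on top of it. For instance:

class RunManager {
  // Monotonic counter; every abort or new user message bumps it.
  private generation = 0;

  // Value a run captures when it starts.
  get current(): number {
    return this.generation;
  }

  // Invalidates every run that captured the previous value; returns the new one.
  incrementGeneration(): number {
    this.generation += 1;
    return this.generation;
  }

  // True only while the captured generation is still the active one.
  isValidRun(generation: number): boolean {
    return generation === this.generation;
  }

  // ... the pre-tool gate and subprocess cancellation would hook in here
}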

Notes

The proposed approach is inspired by the Hermes agent, which has demonstrated these patterns with good test coverage. However, the implementation should be adapted to OpenClaw's async architecture.

Recommendation

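Adopt the unified run invalidation mechanism: it addresses the coordination gap directly and gives safer production behavior, stronger user control, and more predictable /stop semantics. PR #70363 ships the foundation (the generation registry and a partial, opt-in stale-output fence); the pre-tool gate, new-message takeover, and additional emission sites are deferred to follow-ups.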
