openclaw - ✅ (Solved) Stronger run interruptibility: unified generation invalidation and stale-output fencing [1 pull request, 1 comment, 1 participant]

Q: Expected behavior

If the user sends a new message or `/stop`, the previous run should stop producing effects immediately. The new message should become the dominant instruction.

darconadalabarga · 2026-04-22T19:38:38Z


openclaw · 2026-04-22 19:38:38


GitHub stats

openclaw/openclaw#70319•Fetched 2026-04-23 07:26:22


Error Message

/stop cuts some parts of the system but not all
Child processes may keep running after chat is aborted
A tool batch may continue with subsequent tools after partial cancellation
Stale progress/typing/messages keep arriving after stop
New user messages may not immediately supersede the active run

Fix Action

Fixed

Fixed by PR: feat(auto-reply): run-generation fence for stronger interruptibility (refs #70319) (https://github.com/openclaw/openclaw/pull/70363)

PR fix notes

PR #70363: feat(auto-reply): run-generation fence for stronger interruptibility (refs #70319)

Repository: openclaw/openclaw
Author: darconadalabarga
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/70363

Description (problem / solution / changelog)

Closes (partially) openclaw/openclaw#70319.

Summary

Introduces a per-session run-generation counter that layers on top of the existing abort primitives (replyRunRegistry, abortEmbeddedPiRun, chat.abort, cancelScope) to provide a unified invalidation signal downstream code can consult before producing side effects.

Goal from the issue: after /stop or a new user message, the superseded run must not produce more output — tool calls, deltas, typing, final replies.

This PR ships the foundation (Pieces A + partial C from the issue's implementation sketch) and wires one real emission site so the fence is live.

What's in this PR

Piece A — Run generation registry (new)

src/auto-reply/reply/run-generation.ts: getCurrentGeneration, incrementGeneration, isCurrentGeneration, forgetGeneration. Global-singleton backed.
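A minimal sketch of what such a registry could look like. Only the four function names come from the PR description; the signatures, the `Map`-based storage, and the string session key are assumptions, not the contents of `run-generation.ts`.

```typescript
// Hypothetical sketch: only the four function names come from the PR text.
type SessionKey = string;

// Module-level singleton: one monotonically increasing counter per session.
const generations = new Map<SessionKey, number>();

function getCurrentGeneration(key: SessionKey): number {
  // Sessions that were never bumped are treated as generation 0.
  return generations.get(key) ?? 0;
}

function incrementGeneration(key: SessionKey): number {
  const next = getCurrentGeneration(key) + 1;
  generations.set(key, next);
  return next;
}

function isCurrentGeneration(key: SessionKey, captured: number): boolean {
  return getCurrentGeneration(key) === captured;
}

function forgetGeneration(key: SessionKey): void {
  // Drops the entry so finished sessions do not accumulate in memory.
  generations.delete(key);
}
```

One caveat of this particular shape: after `forgetGeneration` the counter restarts at 0, so a run that captured 0 earlier would look current again. Whether the real module guards against that is not stated in the PR.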

Piece A wiring — `ReplyOperation` carries the generation

ReplyOperation.runGeneration captured at createReplyOperation.
ReplyOperation.isCurrent() convenience wrapper over isCurrentGeneration(key, runGeneration).
abortWithReason in the registry bumps the generation, so any abortByUser / abortForRestart path invalidates the captured value.
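The capture-then-compare idea can be sketched as follows. The names `runGeneration`, `isCurrent`, `createReplyOperation`, and `abortWithReason` follow the PR text; everything else is assumed. In particular, `abortWithReason` is placed on the operation here for brevity, whereas the PR describes it as living in the registry, and the reason values are illustrative.

```typescript
// Hypothetical sketch: names follow the PR text, bodies are assumed.
const generations = new Map<string, number>();
const currentGen = (key: string): number => generations.get(key) ?? 0;
const bumpGen = (key: string): number => {
  const next = currentGen(key) + 1;
  generations.set(key, next);
  return next;
};

interface ReplyOperation {
  readonly sessionKey: string;
  readonly runGeneration: number; // captured once, at creation
  isCurrent(): boolean;
  abortWithReason(reason: "user" | "restart"): void;
}

function createReplyOperation(sessionKey: string): ReplyOperation {
  const runGeneration = currentGen(sessionKey);
  return {
    sessionKey,
    runGeneration,
    // Compares the captured value with the live counter on every call.
    isCurrent: () => currentGen(sessionKey) === runGeneration,
    // Any abort path bumps the counter, which invalidates this operation.
    abortWithReason: () => {
      bumpGen(sessionKey);
    },
  };
}
```

Under these assumptions, an operation created after an abort captures the new generation, so it reports `isCurrent()` as true while the aborted one reports false.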

Fast-abort integration

tryFastAbortFromMessage in abort.ts also bumps generation up-front so /stop fences late output even when the registry lookup misses (race between end-of-run and the stop message).
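The ordering described above, sketched under assumptions: the generation is bumped before the registry lookup, so a miss in the lookup still fences late output. The function name comes from the PR; the signature, the `activeRuns` map, and the literal `/stop` match are hypothetical.

```typescript
// Hypothetical sketch of the ordering only; the real function body is not shown in the PR text.
const generations = new Map<string, number>();
const activeRuns = new Map<string, { abort: () => void }>();

function bumpGeneration(sessionKey: string): number {
  const next = (generations.get(sessionKey) ?? 0) + 1;
  generations.set(sessionKey, next);
  return next;
}

function tryFastAbortFromMessage(sessionKey: string, text: string): boolean {
  // Assumed stop detection; the real matcher may accept more forms.
  if (text.trim() !== "/stop") return false;

  // Bump first: a run that is leaving the registry right now is still fenced.
  bumpGeneration(sessionKey);

  // A miss here is the race from the PR: the run already unregistered.
  const run = activeRuns.get(sessionKey);
  if (run) run.abort();
  return true;
}
```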

Piece C — Stale-output fence (partial, opt-in)

createBlockReplyDeliveryHandler accepts an optional isRunCurrent?: () => boolean parameter. When the callback is provided and returns false, the block reply is silently dropped before it reaches the channel. Callers pass () => replyOperation.isCurrent().
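A sketch of the opt-in fence, assuming a much simpler handler than the real one. Only the handler name and the `isRunCurrent` option come from the PR; the `send` sink, the options object, and the boolean return value are hypothetical.

```typescript
// Hypothetical sketch: only the handler name and isRunCurrent come from the PR text.
interface BlockReplyHandlerOptions {
  send: (text: string) => void; // assumed channel sink
  isRunCurrent?: () => boolean; // optional fence callback
}

function createBlockReplyDeliveryHandler(opts: BlockReplyHandlerOptions) {
  return (text: string): boolean => {
    // Opt-in: without a fence callback the handler behaves as before.
    if (opts.isRunCurrent && !opts.isRunCurrent()) {
      return false; // stale run: drop silently, nothing reaches the channel
    }
    opts.send(text);
    return true;
  };
}
```

Making the callback optional keeps existing call sites unchanged, which matches the PR's stated constraint that current abort behavior is not modified.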

Live wiring

agent-runner-execution.ts passes the fence callback so post-abort block replies are actually dropped in the real runtime.

Piece D — SIGTERM→SIGKILL escalation

Already covered by src/process/kill-tree.test.ts. No change needed.
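For readers unfamiliar with the pattern, SIGTERM→SIGKILL escalation can be sketched as below. This is not the code behind `kill-tree.test.ts`; the `KillDeps` interface and `terminateWithEscalation` are hypothetical, with the signal sender, liveness check, and timer injected so the logic stays independent of any process API.

```typescript
// Hypothetical sketch of the escalation pattern, not the project's kill-tree code.
type Signal = "SIGTERM" | "SIGKILL";

interface KillDeps {
  kill: (pid: number, signal: Signal) => void;
  isAlive: (pid: number) => boolean;
  schedule: (fn: () => void, delayMs: number) => void; // e.g. a setTimeout wrapper
}

function terminateWithEscalation(pid: number, graceMs: number, deps: KillDeps): void {
  // Ask politely first so the process can clean up.
  deps.kill(pid, "SIGTERM");

  // After the grace period, force-kill only if it is still running.
  deps.schedule(() => {
    if (deps.isAlive(pid)) deps.kill(pid, "SIGKILL");
  }, graceMs);
}
```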

What's explicitly deferred

Piece B (pre-tool gate): wiring into pi-embedded-runner's tool dispatch. Pattern demonstrated in tests; wiring-only follow-up.
Piece E (new-message takeover): dispatch.ts / queue.ts integration. The primitive (incrementGeneration) is in place.
More emission sites: typing.ts, reply-delivery for non-block path, followup-delivery.ts. Same opt-in shape.

Tests

src/auto-reply/reply/run-generation.test.ts — 17 tests covering:

Registry primitives.
ReplyOperation generation wiring (capture at begin, flip on abort, next run captures new gen).
Stale-output fence pattern (Tests 2, 3, 5 from the issue).
Integration with createBlockReplyDeliveryHandler.

pnpm check:changed  → exit 0
  typecheck core    → ok
  typecheck tests   → ok
  lint core         → ok
  import cycles     → ok
  guards            → ok
  auto-reply suite  → 102 files / 1174 tests passed

pnpm build also passes clean.

Constraints honored

No change to gateway protocol.
No modification to AGENTS.md / CONTRIBUTING.md / CLAUDE.md.
Existing abort behavior unchanged — the new system layers on top.
TypeScript ESM, strict types, no any.
American English in code/comments.

Happy to break into smaller commits or reshape if the approach lands differently than maintainers prefer.

Changed files

src/auto-reply/reply/abort.ts (modified, +6/-0)
src/auto-reply/reply/agent-runner-execution.test.ts (modified, +2/-0)
src/auto-reply/reply/agent-runner-execution.ts (modified, +5/-0)
src/auto-reply/reply/reply-delivery.ts (modified, +12/-0)
src/auto-reply/reply/reply-run-registry.ts (modified, +26/-0)
src/auto-reply/reply/run-generation.test.ts (added, +274/-0)
src/auto-reply/reply/run-generation.ts (added, +102/-0)


Problem

When a user sends /stop or a new message while the agent is mid-run (executing tools, streaming, running subprocesses), the current run may continue producing side effects: launching more tools, emitting stale progress updates, and finishing subprocess chains.

This creates a real UX and safety issue: the user loses confidence in their ability to regain control.

Observed behavior

/stop cuts some parts of the system but not all
Child processes may keep running after chat is aborted
A tool batch may continue with subsequent tools after partial cancellation
Stale progress/typing/messages keep arriving after stop
New user messages may not immediately supersede the active run

Expected behavior

If the user sends a new message or /stop, the previous run should stop producing effects immediately. The new message should become the dominant instruction.

Existing primitives (they're good!)

OpenClaw already has solid abort primitives scattered across subsystems:

chat.abort RPC handler
abortEmbeddedPiRun() for embedded agent runs
clearSessionQueues() for queue cleanup
managedRun.cancel("manual-cancel") for exec processes
cancel(runId) / cancelScope(scopeKey) in the process supervisor
replyRunRegistry.abort() for reply run tracking
abortedLastRun flag in session store
handlerGeneration invalidation pattern in heartbeat-wake

These are all good building blocks. The issue is not missing cancellation, but missing coordination between them.

What's missing: unified run invalidation

The gap is a single coherent guarantee that:

A new user message or /stop invalidates the active run
The invalidated run cannot produce new side effects (messages, tool calls, progress, typing)
Subprocesses owned by the invalidated run are cancelled
Pending tool calls in the invalidated run are skipped
The new user message becomes the dominant instruction immediately

Proposed approach

Introduce stronger run-scoped interruption semantics, inspired by patterns from Hermes (which implements a well-tested version of this):

1. Run generation counter per session

A simple incrementing counter. When abort or new message arrives, increment generation. All downstream checks validate their captured generation is still current.

2. Pre-tool gate

Before each tool execution, check if the run's generation is still current. If not, skip the tool and return a cancelled result. (Hermes tests this explicitly in test_all_tools_skipped_when_interrupted.)
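A sketch of the gate, assuming a single module-level counter and synchronous tools to keep it short; real tool dispatch would be async and per session. `Tool`, `ToolResult`, and `runToolBatch` are illustrative names, not OpenClaw or Hermes APIs.

```typescript
// Hypothetical sketch: tools are synchronous here for brevity.
type ToolResult = { name: string; status: "ok" | "cancelled"; output?: string };
type Tool = { name: string; run: () => string };

let generation = 0; // stand-in for the per-session counter

function runToolBatch(tools: Tool[], captured: number): ToolResult[] {
  return tools.map((tool): ToolResult => {
    // Gate: checked before every tool, not once per batch.
    if (captured !== generation) {
      return { name: tool.name, status: "cancelled" };
    }
    return { name: tool.name, status: "ok", output: tool.run() };
  });
}
```

Because the check runs per tool, a stop that arrives while one tool is executing lets that tool finish but cancels every tool after it in the same batch.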

3. Stale output fence

Prevent stale runs from emitting visible effects. Before emitting streaming deltas, typing indicators, progress updates, or final messages: check generation. The pattern already exists in heartbeat-wake.ts — apply it to the reply pipeline.
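One way to apply the same check to every emission site is a small wrapper. The sketch below is an assumption about shape, not the `heartbeat-wake.ts` code; `fence` is a hypothetical helper name.

```typescript
// Hypothetical helper: wraps any emitter so it becomes a no-op once the run is stale.
function fence<A extends unknown[]>(
  isCurrent: () => boolean,
  emit: (...args: A) => void,
): (...args: A) => void {
  return (...args: A): void => {
    // Checked at call time, so a run invalidated mid-stream stops emitting.
    if (isCurrent()) emit(...args);
  };
}
```

A caller would wrap each sink once, for example `const emitDelta = fence(() => op.isCurrent(), sendDelta)`, where `op` and `sendDelta` stand for whatever the call site already has.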

4. Stronger subprocess cancellation

Wire exec/supervisor processes to session run scope. On generation change, cancel associated processes.
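A sketch of tying process cancellation to the session scope, assuming each process registers a cancel callback when it starts. `registerProcess` and `invalidateRun` are hypothetical names; the real supervisor exposes `cancel(runId)` and `cancelScope(scopeKey)`, whose internals are not shown in the issue.

```typescript
// Hypothetical sketch: cancel callbacks are grouped by session key.
type Cancel = (reason: string) => void;

const generations = new Map<string, number>();
const scopedCancels = new Map<string, Set<Cancel>>();

function registerProcess(sessionKey: string, cancel: Cancel): () => void {
  const owned = scopedCancels.get(sessionKey) ?? new Set<Cancel>();
  owned.add(cancel);
  scopedCancels.set(sessionKey, owned);
  // Returned function unregisters the process once it exits on its own.
  return () => {
    owned.delete(cancel);
  };
}

function invalidateRun(sessionKey: string, reason: string): number {
  // Bump first so output from dying processes is already fenced.
  const next = (generations.get(sessionKey) ?? 0) + 1;
  generations.set(sessionKey, next);

  // Then cancel everything the superseded run still owns.
  const owned = scopedCancels.get(sessionKey);
  scopedCancels.delete(sessionKey);
  owned?.forEach((cancel) => cancel(reason));
  return next;
}
```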

5. New message takeover

When a new user message arrives during an active run: increment generation → cancel active processes → abort embedded run → clear queues → new message becomes next input.
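The sequence above can be sketched as a single function with its collaborators injected. `TakeoverDeps` and `takeOverWithNewMessage` are hypothetical; the individual steps stand in for the existing primitives named earlier in the issue.

```typescript
// Hypothetical sketch: each dependency stands in for an existing primitive.
interface TakeoverDeps {
  incrementGeneration: (sessionKey: string) => number;
  cancelProcesses: (sessionKey: string) => void;
  abortEmbeddedRun: (sessionKey: string) => void;
  clearQueues: (sessionKey: string) => void;
  enqueueInput: (sessionKey: string, text: string) => void;
}

function takeOverWithNewMessage(
  sessionKey: string,
  text: string,
  deps: TakeoverDeps,
): number {
  // 1. Fence first: anything the old run emits from here on is stale.
  const generation = deps.incrementGeneration(sessionKey);
  // 2-4. Tear down what the old run still owns.
  deps.cancelProcesses(sessionKey);
  deps.abortEmbeddedRun(sessionKey);
  deps.clearQueues(sessionKey);
  // 5. Only now does the new message become the next input.
  deps.enqueueInput(sessionKey, text);
  return generation;
}
```

Incrementing before the teardown steps means that output racing with cancellation is already fenced by the time it tries to emit.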

Prior art

Hermes agent demonstrates these patterns with good test coverage:

Thread-scoped interrupt signaling (tools/interrupt.py)
Pre-tool interrupt checks with test coverage
Gateway run generation invalidation for stale outputs
SIGTERM→SIGKILL escalation for resistant processes
Pending message queue drain and combination

The goal is not a line-by-line port, but adapting these concepts to OpenClaw's async architecture.

Benefits

Safer production behavior (tool chains stop reliably)
Stronger user control and trust
More predictable /stop semantics
Fewer stale messages after abort
Foundation for safer autonomous operation

I'm willing to contribute a PR

I have a prototype implementation plan and would be happy to contribute a PR if the maintainers are interested. The implementation is designed to layer on top of existing primitives without breaking current behavior.

This would be AI-assisted (Claude Code) with testing.

extent analysis

TL;DR

Introduce a unified run invalidation mechanism to ensure that a new user message or /stop command immediately stops the active run and prevents further side effects.

Guidance

Implement a per-session run generation counter so that each run can verify it is still the current one before executing tools or emitting output.
Add a pre-tool gate to skip tools and return a cancelled result if the run's generation is no longer current.
Establish a stale output fence to prevent stale runs from emitting visible effects, such as streaming deltas, typing indicators, or progress updates.
Enhance subprocess cancellation by wiring exec/supervisor processes to the session run scope and cancelling associated processes on generation change.
Develop a new message takeover mechanism to increment the generation, cancel active processes, abort the embedded run, clear queues, and prioritize the new message as the next input.

Example

A potential implementation could involve creating a RunManager class that maintains the run generation counter and provides methods for validating the current run, skipping tools, and cancelling subprocesses. For instance:

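class RunManager {
  // Monotonic counter; bumped whenever /stop or a new user message arrives.
  private generation = 0;

  // Value a new run should capture when it starts.
  currentGeneration(): number {
    return this.generation;
  }

  // Invalidates every run that captured an earlier generation.
  incrementGeneration(): number {
    this.generation += 1;
    return this.generation;
  }

  // True only while the captured generation is still the live one.
  isValidRun(generation: number): boolean {
    return generation === this.generation;
  }

  // ... tool-skip and subprocess-cancel helpers omitted
}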

Notes

The proposed approach is inspired by the Hermes agent, which has demonstrated these patterns with good test coverage. However, the implementation should be adapted to OpenClaw's async architecture.

Recommendation

Adopt the unified run invalidation mechanism. It addresses the coordination gap between the existing abort primitives and aims for safer production behavior, stronger user control, and more predictable /stop semantics.


