openclaw - 💡(How to fix) Fix [Bug]: Agent / sessions_spawn tool_use can emit with no matching tool_result and no parent-visible signal — silent subagent drop (~3.8% historical rate)

StepCodex · 2026-06-01T01:32:13Z

[openclaw] Environment - OpenClaw : 2026.5.22 - Agent harness : claude-cli direct Anthropic , model claude-opus-4-7 - Host : single-VPS Linux deployment, no co… ## Fix / Workaround ## Current local workaround The hook is a workaround, not a fix: **This is an observed-symptom proposal, not a claim about where the bug lives.** I haven't traced the subagent dispatch path in the upstream source; the right fix could be in the harness's subagent-completion handler, in the runtime's `tool_result` writer, or in the dispatch resolver. All three are plausible homes for the signal. ## Environment - **OpenClaw**: 2026.5.22 - **Agent harness**: `claude-cli` (direct Anthropic), model `claude-opus-4-7` - **Host**: single-VPS Linux deployment, no container - **Transcript scope**: workspace project under `~/.claude/projects/ /` ## TL;DR Across the local historical transcript corpus, **23 of 602 (3.82%) `Agent` / `mcp__openclaw__sessions_spawn` `tool_use` blocks emitted by the assistant have no matching `tool_result` block anywhere later in the same transcript**. The subagent either died, was killed, or the runtime dropped the result before the parent resumed — but no signal of that failure reaches the parent's context. The parent transcript shows the spawn `tool_use` and nothing after it, leaving the model to either narrate forward as if the spawn completed (worst case), wait indefinitely, or fabricate the result. The ask is for first-class harness signal on a dropped / killed / disconnected subagent: e.g. an injected `tool_result` with `is_error=true` and an `error_type` field, or a harness-side notification surfaced into the parent context that the model can read on the next turn. ## Repro shape This is an observed-rate report against an existing on-disk transcript corpus, not a synthetic single-shot repro. The detection is mechanical: ``` spawn_ids = jq '.type=="assistant" → .message.content[] → select(.type=="tool_use" AND (.name=="Agent" OR .name=="mcp__openclaw__sessions_spawn")) → .id' result_ids = jq '.type=="user" → .message.content[] → select(.type=="tool_result") → .tool_use_id' orphans = spawn_ids − result_ids ``` A spawn id that appears in `spawn_ids` and never in `result_ids` is an orphan. Backtest result on this host's transcript corpus: | Metric | Value | |---|---| | Transcripts scanned | 13,505 | | Total spawn `tool_use`s | 602 | | Orphans | 23 | | Orphan rate | 3.82% | ## Why this matters Three downstream behaviors the parent can take when the result never arrives, all of them bad: 1. **Silent narrate-forward.** The model, with no visible failure signal, continues as if the subagent returned cleanly. Whatever the parent was going to do with the result (summarize, decide, act) gets executed against an empty / fabricated stand-in. 2. **Indefinite stall.** The model waits for the result, with no way to know it isn't coming. The user sees a non-responding session. 3. **Fabricated synthesis.** The model hallucinates plausible output for the missing subagent and downstream work proceeds against fiction. All three are operator-visible only by inspecting the transcript after the fact. A first-class signal (synthetic error `tool_result`, or harness-emitted system reminder) would let the model see the failure on the next turn and decide whether to re-fire, refine the prompt, or explicitly drop the work. ## Current local workaround A `Stop` hook that diffs `spawn_ids − result_ids` per transcript and surfaces new orphans via `exit 2` stderr as a next-turn system reminder. The model reads the reminder on the next turn and chooses re-fire / refine / drop. ```bash # stderr message shape ORPHANED SUBAGENT(S) DETECTED: N Agent/sessions_spawn tool_use(s) emitted with no tool_result returned. The subagent(s) likely died, were killed, or the runtime dropped the result before the parent resumed. - (+ M older orphan(s) silenced; see ) Decide: re-fire with same or refined prompt, or explicitly drop the work. Do not silently narrate forward as if it completed. ``` State (already-surfaced ids) is kept in an append-only ledger so the same orphan doesn't re-fire on every subsequent `Stop` event. The hook is detection-only — it does not retry the spawn or mutate the transcript. The hook is a workaround, not a fix: - It runs on every `Stop`, so the first signal arrives at next-turn boundary — not at the moment the subagent dropped. - It surfaces stderr text the model has to parse, rather than the structured `tool_result` shape it already knows how to consume. - Every downstream user has to install and maintain it independently. First-class harness signal would obviate that. ## Proposed change (observed-symptom framing, not an upstream-code claim) Two viable shapes for first-class signal: 1. **Synthetic `tool_result` injection.** When the harness detects a subagent has died / been killed / disconnected without returning a result, inject a `tool_result` bloc

openclaw2026-06-01 01:32:13

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

Error Message

All three are operator-visible only by inspecting the transcript after the fact. A first-class signal (synthetic error tool_result, or harness-emitted system reminder) would let the model see the failure on the next turn and decide whether to re-fire, refine the prompt, or explicitly drop the work. A complementary observation: if there's an existing failure path in the harness that already knows the subagent dropped (e.g. a logged error), the missing link is just the surface to the parent context — not a new detection layer. The 3.82% rate suggests the failure is detected somewhere; it just isn't being signalled upstream to the parent.

Root Cause

Three downstream behaviors the parent can take when the result never arrives, all of them bad:

Silent narrate-forward. The model, with no visible failure signal, continues as if the subagent returned cleanly. Whatever the parent was going to do with the result (summarize, decide, act) gets executed against an empty / fabricated stand-in.
Indefinite stall. The model waits for the result, with no way to know it isn't coming. The user sees a non-responding session.
Fabricated synthesis. The model hallucinates plausible output for the missing subagent and downstream work proceeds against fiction.

Fix Action

Fix / Workaround

Current local workaround

The hook is a workaround, not a fix:

This is an observed-symptom proposal, not a claim about where the bug lives. I haven't traced the subagent dispatch path in the upstream source; the right fix could be in the harness's subagent-completion handler, in the runtime's tool_result writer, or in the dispatch resolver. All three are plausible homes for the signal.

Code Example

spawn_ids   = jq '.type=="assistant" → .message.content[] → select(.type=="tool_use" AND (.name=="Agent" OR .name=="mcp__openclaw__sessions_spawn")) → .id'
result_ids  = jq '.type=="user" → .message.content[] → select(.type=="tool_result") → .tool_use_id'
orphans     = spawn_ids − result_ids

---

# stderr message shape
ORPHANED SUBAGENT(S) DETECTED: N Agent/sessions_spawn tool_use(s) emitted with
no tool_result returned. The subagent(s) likely died, were killed, or the
runtime dropped the result before the parent resumed.

  - <tool_use_id>  <timestamp>  <description | subagent_type | name>
  (+ M older orphan(s) silenced; see <state_file>)

Decide: re-fire with same or refined prompt, or explicitly drop the work.
Do not silently narrate forward as if it completed.

RAW_BUFFERClick to expand / collapse

Environment

OpenClaw: 2026.5.22
Agent harness: claude-cli (direct Anthropic), model claude-opus-4-7
Host: single-VPS Linux deployment, no container
Transcript scope: workspace project under ~/.claude/projects/<slug>/

TL;DR

Across the local historical transcript corpus, 23 of 602 (3.82%) Agent / mcp__openclaw__sessions_spawn tool_use blocks emitted by the assistant have no matching tool_result block anywhere later in the same transcript. The subagent either died, was killed, or the runtime dropped the result before the parent resumed — but no signal of that failure reaches the parent's context. The parent transcript shows the spawn tool_use and nothing after it, leaving the model to either narrate forward as if the spawn completed (worst case), wait indefinitely, or fabricate the result.

The ask is for first-class harness signal on a dropped / killed / disconnected subagent: e.g. an injected tool_result with is_error=true and an error_type field, or a harness-side notification surfaced into the parent context that the model can read on the next turn.

Repro shape

This is an observed-rate report against an existing on-disk transcript corpus, not a synthetic single-shot repro. The detection is mechanical:

spawn_ids   = jq '.type=="assistant" → .message.content[] → select(.type=="tool_use" AND (.name=="Agent" OR .name=="mcp__openclaw__sessions_spawn")) → .id'
result_ids  = jq '.type=="user" → .message.content[] → select(.type=="tool_result") → .tool_use_id'
orphans     = spawn_ids − result_ids

A spawn id that appears in spawn_ids and never in result_ids is an orphan. Backtest result on this host's transcript corpus:

Metric	Value
Transcripts scanned	13,505
Total spawn `tool_use`s	602
Orphans	23
Orphan rate	3.82%

Why this matters

Three downstream behaviors the parent can take when the result never arrives, all of them bad:

Silent narrate-forward. The model, with no visible failure signal, continues as if the subagent returned cleanly. Whatever the parent was going to do with the result (summarize, decide, act) gets executed against an empty / fabricated stand-in.
Indefinite stall. The model waits for the result, with no way to know it isn't coming. The user sees a non-responding session.
Fabricated synthesis. The model hallucinates plausible output for the missing subagent and downstream work proceeds against fiction.

Current local workaround

A Stop hook that diffs spawn_ids − result_ids per transcript and surfaces new orphans via exit 2 stderr as a next-turn system reminder. The model reads the reminder on the next turn and chooses re-fire / refine / drop.

# stderr message shape
ORPHANED SUBAGENT(S) DETECTED: N Agent/sessions_spawn tool_use(s) emitted with
no tool_result returned. The subagent(s) likely died, were killed, or the
runtime dropped the result before the parent resumed.

  - <tool_use_id>  <timestamp>  <description | subagent_type | name>
  (+ M older orphan(s) silenced; see <state_file>)

Decide: re-fire with same or refined prompt, or explicitly drop the work.
Do not silently narrate forward as if it completed.

State (already-surfaced ids) is kept in an append-only ledger so the same orphan doesn't re-fire on every subsequent Stop event. The hook is detection-only — it does not retry the spawn or mutate the transcript.

The hook is a workaround, not a fix:

It runs on every Stop, so the first signal arrives at next-turn boundary — not at the moment the subagent dropped.
It surfaces stderr text the model has to parse, rather than the structured tool_result shape it already knows how to consume.
Every downstream user has to install and maintain it independently. First-class harness signal would obviate that.

Proposed change (observed-symptom framing, not an upstream-code claim)

Two viable shapes for first-class signal:

Synthetic tool_result injection. When the harness detects a subagent has died / been killed / disconnected without returning a result, inject a tool_result block into the parent transcript with tool_use_id=<the dropped spawn>, is_error=true, and a content payload describing the failure mode (subagent died, subagent killed, runtime dropped result, etc.). The model already knows how to read tool_result and will see the failure on the next turn naturally.
Harness system reminder. Emit a runtime-side system reminder into the parent's next-turn context with the same information. Less structurally clean (the model has to parse text rather than a typed result) but lower-impact change to the harness.

Either form removes the silent class of failure. Form (1) is preferable because it slots into the existing tool_use/tool_result contract and is unambiguous to the model.

A complementary observation: if there's an existing failure path in the harness that already knows the subagent dropped (e.g. a logged error), the missing link is just the surface to the parent context — not a new detection layer. The 3.82% rate suggests the failure is detected somewhere; it just isn't being signalled upstream to the parent.

Cross-refs

#86227 — claude-cli harness lazy registration silent drop (different failure surface; same family of "harness drops a thing and the parent has no signal").
#86239 — MissingAgentHarnessError on inbound dispatch (also silent until forensics).
#87327 — runtime-plugins phase stall pre-execution (also no parent-visible signal).
Local memory: <workspace>/memory/openclaw-architecture.md → Behavioral Conformance Hooks (2026-06-01) documents the hook + state file + retire condition.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

openclaw - 💡(How to fix) Fix [Bug]: Agent / sessions_spawn tool_use can emit with no matching tool_result and no parent-visible signal — silent subagent drop (~3.8% historical rate)

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

Fix Action

Fix / Workaround

Current local workaround

Code Example

Environment

TL;DR

Repro shape

Why this matters

Current local workaround

Proposed change (observed-symptom framing, not an upstream-code claim)

Cross-refs

Still need to ship something?

TRENDING