openclaw - 💡(How to fix) Fix Aborted subagent with empty content silently marked `done`, never auto-announces — parent black-holes [1 comments, 2 participants]

openclaw2026-04-26 17:35:38


GitHub stats

openclaw/openclaw#72293•Fetched 2026-04-27 05:31:58

Author

joeykrug

Participants

clawsweeper[bot]

joeykrug

Timeline (top)

commented ×1

When a subagent's underlying inference call gets aborted mid-response (Codex AbortError, OpenAI Responses API timeout, network drop, etc.), the subagent emits an empty assistant message (content: []) with stopReason: "aborted" and errorMessage: "This operation was aborted". The gateway then marks the run as status: done with abortedLastRun: false, and the auto-announce queue silently drops the empty result. The parent agent waits indefinitely for an announcement that never comes.

Workaround

Parent must poll sessions_list with kind=subagent and check for stale "done" entries with empty output, instead of trusting that auto-announce will fire. We're adding a heartbeat-time watchdog on our side, but this defeats the purpose of the announce mechanism.
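The polling workaround above could look roughly like the following. This is a minimal sketch only: the `SubagentSession` shape, its field names (`announced`, `outputLength`, `finishedAtMs`), and the grace period are assumptions made for illustration, not the actual schema returned by sessions_list.

```typescript
// Hypothetical shape of one sessions_list entry (kind=subagent).
// All field names here are assumptions, not OpenClaw's real schema.
interface SubagentSession {
  id: string;
  status: "running" | "done" | "aborted" | "failed";
  announced: boolean; // assumed: has the parent received a completion event?
  outputLength: number; // assumed: size of the final assistant content
  finishedAtMs?: number; // assumed: completion time in epoch milliseconds
}

// Return sessions that are marked done, produced no output, were never
// announced, and finished at least `graceMs` ago. These are the runs the
// parent would otherwise wait on forever.
function findBlackHoledSessions(
  sessions: SubagentSession[],
  nowMs: number,
  graceMs: number = 30000,
): SubagentSession[] {
  return sessions.filter(
    (s) =>
      s.status === "done" &&
      s.outputLength === 0 &&
      !s.announced &&
      s.finishedAtMs !== undefined &&
      nowMs - s.finishedAtMs >= graceMs,
  );
}
```

A heartbeat handler would call this with the latest session list and treat each returned entry as a failed subagent. The grace period keeps the watchdog from racing a legitimate announcement that is still in flight.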

Reproduction

1. Spawn a long-running subagent with mode: "run" and runtime: "subagent" that does heavy reasoning over many tool calls (in our case, Codex GPT-5.5 doing 16 minutes of upstream-source archaeology).
2. Wait for the underlying inference call to hit its provider-internal timeout (Codex's per-response timeout, distinct from runTimeoutSeconds).
3. Observe the subagent's transcript: the final entry is an openclaw:prompt-error custom event followed by an assistant message with empty content: [].
4. The gateway flags status: done, but no completion event is delivered to the parent.

Forensic timeline (real example, 2026-04-25)

20:28:01 AST  Spawn subagent (Codex GPT-5.5, mode=run)
20:44:21 AST  Last successful tool call (exec)
20:44:22.570  Codex emits openclaw:prompt-error
              error: "This operation was aborted | This operation was aborted"
20:44:22.614  Final assistant message:
              {
                content: [],            // <-- empty
                stopReason: "aborted",
                errorMessage: "This operation was aborted",
                usage: { totalTokens: 0 }
              }
20:44:24.876  Gateway: status=done, abortedLastRun=false
              Auto-announce: silently swallowed (no content to relay)
              Parent: waits forever, no completion event arrives

Why it's a bug

Two distinct gateway invariants are violated:

1. abortedLastRun: false is wrong when the assistant message has stopReason: "aborted" and errorMessage: "This operation was aborted". The assistant message itself reports the abort; the gateway should propagate that into abortedLastRun: true (or a richer status like aborted).
2. Silent announce drop: the auto-announce queue should not produce a no-op for an aborted-with-empty-content terminal state. It should either deliver a failure announcement ("subagent X aborted at step Y, no result available") or surface the abort as a parent-visible error.

Either fix would prevent the parent from black-holing.
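The two invariants can be written down as a small check, which may be useful as a regression test. This is a sketch under assumptions: `GatewayRecord` and its `announced` flag are hypothetical names, and only the message fields visible in the timeline above are modeled.

```typescript
// Fields taken from the final assistant message shown in the timeline.
interface FinalMessage {
  content: unknown[];
  stopReason: string;
  errorMessage?: string;
}

// Hypothetical view of what the gateway recorded for the run.
// `announced` is an assumed flag meaning "parent received a completion event".
interface GatewayRecord {
  status: string;
  abortedLastRun: boolean;
  announced: boolean;
}

// Return a human-readable description of each violated invariant.
function findInvariantViolations(
  msg: FinalMessage,
  rec: GatewayRecord,
): string[] {
  const violations: string[] = [];
  const aborted = msg.stopReason === "aborted";

  // Invariant 1: an aborted final message must be reflected in abortedLastRun.
  if (aborted && !rec.abortedLastRun) {
    violations.push("abortedLastRun is false but stopReason is 'aborted'");
  }
  // Invariant 2: an aborted run with no content must still reach the parent.
  if (aborted && msg.content.length === 0 && !rec.announced) {
    violations.push("aborted run with empty content was never announced");
  }
  return violations;
}
```

Fed the values from the forensic timeline (status done, abortedLastRun false, nothing announced), this check reports both violations.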

Suggested fix sketch

In the subagent runtime path (something like subagent-announce.ts / pi-embedded-runner/run/attempt.ts), when the final assistant message has both stopReason: "aborted" and zero output tokens:

1. Set runRecord.status = "aborted" (not "done").
2. Set abortedLastRun = true.
3. Either:
   - (a) Auto-retry once with a fresh inference call, or
   - (b) Synthesize a completion event with status=failed and the error message, so the parent can act on it.
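One way to combine options (a) and (b) is a pure decision function that the runtime consults after each attempt. The sketch below is illustrative only: `RunRecordPatch`, `CompletionEvent`, `NextStep`, and `decideNextStep` are hypothetical names, not identifiers from the OpenClaw codebase, and aborts that still carry partial content are deliberately left out of scope.

```typescript
interface FinalMessage {
  content: unknown[];
  stopReason: string;
  errorMessage?: string;
  usage: { totalTokens: number };
}

// Hypothetical subset of the run record that the fix would update.
interface RunRecordPatch {
  status: "done" | "aborted";
  abortedLastRun: boolean;
}

type CompletionEvent =
  | { status: "ok"; content: unknown[] }
  | { status: "failed"; error: string };

type NextStep =
  | { action: "retry"; record: RunRecordPatch }
  | { action: "announce"; record: RunRecordPatch; event: CompletionEvent };

// Decide what to do after an attempt finishes. An aborted message with zero
// tokens is retried while retries remain; after that a failure event is
// synthesized, so the parent always hears about the outcome.
function decideNextStep(
  msg: FinalMessage,
  retriesUsed: number,
  maxRetries: number = 1,
): NextStep {
  const emptyAbort =
    msg.stopReason === "aborted" && msg.usage.totalTokens === 0;

  if (!emptyAbort) {
    return {
      action: "announce",
      record: { status: "done", abortedLastRun: false },
      event: { status: "ok", content: msg.content },
    };
  }

  const record: RunRecordPatch = { status: "aborted", abortedLastRun: true };
  if (retriesUsed < maxRetries) {
    return { action: "retry", record };
  }
  return {
    action: "announce",
    record,
    event: { status: "failed", error: msg.errorMessage ?? "aborted" },
  };
}
```

Keeping the decision separate from the inference call makes the retry budget explicit and lets the branch that previously dropped the result be unit-tested without a live provider.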

Environment

OpenClaw 2026.4.24
Subagent: Codex openai-codex/gpt-5.5 (free via Pro subscription)
Mode: run, runtime: subagent
runTimeoutSeconds: 18000 global default (so OpenClaw timeout was NOT the trigger — the abort came from inside the inference layer)

extent analysis

TL;DR

The most likely fix involves updating the subagent runtime to correctly handle aborted inference calls by setting runRecord.status to "aborted" and abortedLastRun to true, and either auto-retrying or synthesizing a completion event with a failure status.

Guidance

Review the subagent runtime code (e.g., subagent-announce.ts or pi-embedded-runner/run/attempt.ts) to ensure it correctly handles cases where the final assistant message has stopReason: "aborted" and zero output tokens.
Consider implementing a retry mechanism or synthesizing a completion event with a failure status to notify the parent agent of the abort.
Verify that the gateway's abortedLastRun flag is being set correctly based on the subagent's final message.
Evaluate the current workaround of polling sessions_list with kind=subagent and checking for stale "done" entries, and consider replacing it with a more reliable solution.

Example

if (finalMessage.stopReason === "aborted" && finalMessage.content.length === 0) {
  runRecord.status = "aborted";
  abortedLastRun = true;
  // Either auto-retry or synthesize a completion event with failure status
  // ...
}

Notes

The fix sketch and example code assume that the subagent runtime has access to the variables and functions needed to update runRecord and handle the abort. Additional changes may be required to integrate this fix into the existing codebase.

Recommendation

Apply the suggested fix to update the subagent runtime to correctly handle aborted inference calls, as it addresses the root cause of the issue and provides a more reliable solution than the current workaround.
