openclaw - ✅(Solved) Fix [Bug]: Subagent announce can deliver stale output and subagent sessions may inherit unrelated history [4 pull requests, 2 comments, 2 participants]

GitHub stats

openclaw/openclaw#78055 (fetched 2026-05-06 06:17:26)
Comments: 2 · Participants: 2 · Timeline: 6 · Reactions: 2
Timeline (top): cross-referenced ×4, commented ×2

Subagent completion delivery appears to have two related failure modes in OpenClaw 2026.5.4:

  1. stale subagent completion announcements are delivered into the requester session later and converted into user-visible replies, even when they are no longer part of the live conversation; and
  2. new subagent sessions can contain unrelated prior transcript turns, so a debugging subagent can answer/correct an older task instead of only the task it was just spawned to perform.

This is not just noisy output. It produced an unexpected Telegram message to the user around 03:23 BKK with stale Operations-account verification text that was not part of the current conversation.

Fix Action

Fixed

PR fix notes

PR #78060: fix(subagents): keep thread-bound spawns isolated by default

Description (problem / solution / changelog)

Refs openclaw/openclaw#78055.

Summary

  • Change the implicit thread-bound native sessions_spawn context from fork to isolated.
  • Preserve explicit context: "fork" and explicit threadBindings.defaultSpawnContext: "fork" behavior.
  • Update docs/help/UI hints so the documented default matches runtime behavior.

Why

The issue evidence showed subagent sessions unexpectedly carrying unrelated requester transcript history (forkedFromParent: true). Thread-bound subagent spawns were the remaining path where OpenClaw silently forked requester history even though the tool prompt says children start isolated unless context: "fork" is requested. That implicit fork can make a new worker answer old requester context and then auto-announce the wrong result.
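The default change described above can be sketched as a small resolver. This is a minimal illustration, not OpenClaw's actual code: `resolveSpawnContext`, `SpawnRequest`, and `ThreadBindingsConfig` are hypothetical names, and the precedence between the per-call value and the config value is an assumption (the PR only states that both explicit forms keep working).

```typescript
// Hypothetical names throughout; this is not OpenClaw's actual source.
type SpawnContext = "isolated" | "fork";

interface SpawnRequest {
  // Explicit per-call choice, e.g. context: "fork" on sessions_spawn.
  context?: SpawnContext;
}

interface ThreadBindingsConfig {
  // Explicit config value, mirroring threadBindings.defaultSpawnContext.
  defaultSpawnContext?: SpawnContext;
}

// Assumed precedence: explicit call argument, then explicit config,
// then the isolated default described in the PR summary.
function resolveSpawnContext(
  request: SpawnRequest,
  config: ThreadBindingsConfig = {},
): SpawnContext {
  return request.context ?? config.defaultSpawnContext ?? "isolated";
}

console.log(resolveSpawnContext({})); // isolated
console.log(resolveSpawnContext({ context: "fork" })); // fork
console.log(resolveSpawnContext({}, { defaultSpawnContext: "fork" })); // fork
```

The point of the sketch is that a fork only happens when something asked for it; with no explicit value anywhere, the child starts without requester history.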

Testing

  • node scripts/test-projects.mjs src/channels/thread-bindings-policy.test.ts src/agents/subagent-spawn.context.test.ts
  • pnpm tsgo:test:src

Notes

This is the isolated-session root cause/fix. It does not try to redesign completion announce delivery policy; stale announce freshness after explicit forks may still need a follow-up if reviewers want stricter delivery validation.

Changed files

  • docs/channels/discord.md (modified, +2/-2)
  • docs/concepts/session-tool.md (modified, +3/-2)
  • docs/gateway/config-agents.md (modified, +1/-1)
  • docs/gateway/config-channels.md (modified, +2/-2)
  • docs/tools/subagents.md (modified, +5/-5)
  • extensions/discord/src/config-ui-hints.ts (modified, +1/-1)
  • extensions/telegram/src/config-ui-hints.ts (modified, +1/-1)
  • src/agents/subagent-spawn.context.test.ts (modified, +2/-6)
  • src/channels/thread-bindings-policy.test.ts (modified, +1/-1)
  • src/channels/thread-bindings-policy.ts (modified, +1/-1)
  • src/config/schema.help.ts (modified, +1/-1)

PR #78142: fix: reset websocket lineage after final answers

Description (problem / solution / changelog)

Summary

  • Prevent OpenAI WebSocket incremental lineage reuse when the previous response ended with a final_answer and the next suffix starts with a new user message.
  • Preserve incremental WebSocket sends for normal tool-result continuations and strict unphased response-chain suffixes.
  • Add a planner regression covering the stale-final-answer replay shape from #78055.

Why

#78055 shows a fresh second final answer replaying an already-completed task after a newer user request and unrelated tool calls. That points at stale previous_response_id lineage being reused across completed final-answer boundaries. This surgical guard forces a full-context send for the first request after a phased final answer, then allows incremental behavior to resume inside the new turn.
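A rough sketch of the guard described above, under stated assumptions: `planSend`, `PreviousResponse`, `SuffixItem`, and `SendPlan` are illustrative names, and the item shapes are simplified stand-ins rather than the real request types in src/agents/openai-ws-request.ts.

```typescript
// Illustrative types only; the real planner works on richer request items.
interface PreviousResponse {
  id: string;
  // Phase of the last output item, when one was reported.
  lastPhase?: string;
}

type SuffixItem =
  | { kind: "message"; role: "user" | "assistant" }
  | { kind: "tool_result" };

type SendPlan =
  | { mode: "full" }
  | { mode: "incremental"; previousResponseId: string };

function planSend(
  previous: PreviousResponse | undefined,
  suffix: SuffixItem[],
): SendPlan {
  // Nothing to chain from: send the full context.
  if (!previous) return { mode: "full" };

  const first = suffix[0];
  const startsWithUser =
    first !== undefined && first.kind === "message" && first.role === "user";

  // A completed final answer followed by a new user message is a turn
  // boundary, so the old lineage is not reused for this first request.
  if (previous.lastPhase === "final_answer" && startsWithUser) {
    return { mode: "full" };
  }

  // Tool-result continuations and unphased chains stay incremental.
  return { mode: "incremental", previousResponseId: previous.id };
}
```

In this sketch the full send happens once per boundary: the response it produces becomes the new `previous`, and later tool-result suffixes inside the same turn chain incrementally again.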

Related

  • Ties to #78055
  • Related context: #76905, #76888, #77642, #78060, #76990, #77445

Tests

  • git diff --check
  • node scripts/run-vitest.mjs run --config test/vitest/vitest.unit.config.ts src/agents/openai-ws-stream.test.ts --maxWorkers=1 ⚠️ blocked: new worktree had no node_modules / vitest
  • With temporary node_modules symlink from sibling checkout: node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts src/agents/openai-ws-stream.test.ts --maxWorkers=1 ⚠️ blocked by host disk full (ENOSPC: no space left on device, write; root volume had ~120MiB available)
  • Commit used --no-verify because the local hook tried to run missing oxfmt in this worktree.

Changed files

  • src/agents/openai-ws-request.ts (modified, +40/-8)
  • src/agents/openai-ws-stream.test.ts (modified, +33/-0)

PR #78146: fix: trace OpenAI WebSocket response lineage

Description (problem / solution / changelog)

Summary

  • Add redacted debug lineage to OpenAI WebSocket request planning: mode, previous_response_id, baseline/full/suffix lengths, and suffix item summaries without prompt/tool-result text.
  • Log per-request lineage before send, including context tail role/message id/parent id and latest user id when those ids are present.
  • Log completion lineage when a response.completed is accepted into the transcript, tying the generated request id to the accepted response id and replay item count.

Why

This provides safe trajectory evidence for duplicate/stale final-answer replay investigations where the failure appears to involve the OpenAI-Codex WebSocket incremental / previous_response_id path.
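To make "redacted lineage" concrete, here is a small sketch of a summary that records the shape of a request but none of its text. The names (`summarizeLineage`, `InputItem`, `LineageSummary`) and field layout are assumptions for illustration; the actual log format in the PR may differ.

```typescript
// Simplified, hypothetical item shape; `text` stands in for prompt or
// tool-result content that must never reach the debug log.
interface InputItem {
  type: string;
  role?: string;
  text?: string;
}

interface LineageSummary {
  mode: "full" | "incremental";
  previousResponseId: string | null;
  suffixLength: number;
  suffixItems: string[];
}

function summarizeLineage(
  mode: "full" | "incremental",
  previousResponseId: string | null,
  suffix: InputItem[],
): LineageSummary {
  return {
    mode,
    previousResponseId,
    suffixLength: suffix.length,
    // Only the item type and role are copied; content is dropped.
    suffixItems: suffix.map((item) =>
      item.role ? `${item.type}:${item.role}` : item.type,
    ),
  };
}

const summary = summarizeLineage("incremental", "resp_1", [
  { type: "message", role: "user", text: "private prompt text" },
  { type: "function_call_output", text: "private tool output" },
]);
console.log(JSON.stringify(summary));
```

Because the summary is built by copying selected fields rather than by deleting sensitive ones, a new content-bearing field added to the items later would not leak into the log by default.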

References/ties: openclaw/openclaw#78055, openclaw/openclaw#76905, openclaw/openclaw#76888, openclaw/openclaw#77642, openclaw/openclaw#78060, openclaw/openclaw#76990, openclaw/openclaw#77445, openclaw/openclaw#67777, openclaw/openclaw#39032.

Tests

  • node scripts/test-projects.mjs src/agents/openai-ws-stream.test.ts
  • pnpm exec oxfmt --check src/agents/openai-ws-request.ts src/agents/openai-ws-stream.ts src/agents/openai-ws-stream.test.ts
  • git diff --check

Notes

  • pnpm tsgo:test:src was attempted, but the process was SIGKILLed in this local worktree before emitting diagnostics.

Changed files

  • src/agents/openai-ws-request.ts (modified, +106/-2)
  • src/agents/openai-ws-stream.test.ts (modified, +77/-0)
  • src/agents/openai-ws-stream.ts (modified, +97/-0)

PR #78147: test: guard websocket stale final turn lineage

Description (problem / solution / changelog)

Summary

  • add a regression/protective test for the #78055 stale-final replay shape: final answer A completes, user asks B, WS sends an incremental B suffix, then a stale response.completed carrying turn A metadata arrives before the real turn B completion
  • add lineage validation for OpenAI WS terminal events when native OpenAI turn metadata is echoed, ignoring stale response.completed / response.failed events instead of finalizing the active turn
  • type ResponseObject.metadata so tests and runtime can inspect echoed Responses metadata

Why

This ties directly to #78055 and the duplicate/stale final-answer after tool-chain report. It also overlaps the incremental / previous_response_id surfaces discussed around #76905, #76888, #77642, #78060, and the broader runtime/finalization issues in #76990 / #77445.
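The lineage validation in the summary above can be pictured as a check on the metadata echoed back with a terminal event. This is a sketch under assumptions: the `turn_id` metadata key, `TerminalEvent`, and `shouldAcceptTerminalEvent` are hypothetical, and falling back to acceptance when no metadata is echoed is a guess at the intended behavior.

```typescript
interface TerminalEvent {
  type: "response.completed" | "response.failed";
  response: { id: string; metadata?: Record<string, string> };
}

// Hypothetical metadata key used to tag each request with its turn.
const TURN_KEY = "turn_id";

function shouldAcceptTerminalEvent(
  event: TerminalEvent,
  activeTurnId: string,
): boolean {
  const echoed = event.response.metadata?.[TURN_KEY];
  // Nothing echoed: there is no lineage to validate, so accept as before.
  if (echoed === undefined) return true;
  // Echoed metadata from another turn marks the event as stale.
  return echoed === activeTurnId;
}

const staleEvent: TerminalEvent = {
  type: "response.completed",
  response: { id: "resp_A", metadata: { turn_id: "turn-A" } },
};
console.log(shouldAcceptTerminalEvent(staleEvent, "turn-B")); // false
```

A stale event is ignored rather than treated as an error, so the active turn keeps waiting for its own completion.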

Verification

  • ./node_modules/.bin/tsc --noEmit --pretty false --project tsconfig.core.projects.json
  • vitest run src/agents/openai-ws-stream.test.ts -t "stale completed" --maxWorkers=1 --no-fileParallelism --reporter=dot started and the targeted test emitted a passing dot, but the local process was SIGKILLed afterward on this Mac, with /System/Volumes/Data at 100% (~121MiB free). Treat the full test run as not completed locally.

Fixes #78055. Refs #76905, #76888, #77642, #78060, #76990, #77445.

Changed files

  • src/agents/openai-ws-connection.ts (modified, +1/-0)
  • src/agents/openai-ws-stream.test.ts (modified, +85/-0)
  • src/agents/openai-ws-stream.ts (modified, +37/-0)

Raw issue text

Summary

Subagent completion delivery appears to have two related failure modes in OpenClaw 2026.5.4:

  1. stale subagent completion announcements are delivered into the requester session later and converted into user-visible replies, even when they are no longer part of the live conversation; and
  2. new subagent sessions can contain unrelated prior transcript turns, so a debugging subagent can answer/correct an older task instead of only the task it was just spawned to perform.

This is not just noisy output. It produced an unexpected Telegram message to the user around 03:23 BKK with stale Operations-account verification text that was not part of the current conversation.

Environment

  • OpenClaw: 2026.5.4 (325df3e)
  • Runtime package: /opt/homebrew/lib/node_modules/openclaw
  • Channel: Telegram direct
  • Main session: agent:main:telegram:default:direct:8455538490
  • Model: openai-codex/gpt-5.5 with fallbacks gpt-5.4, gpt-5.4-mini
  • Context engine: lossless-claw enabled

User-visible symptom

The user received this stale message around 03:23 BKK, 1–2 hours after the relevant Operations setup thread:

- Gateway health: ✅ live
- Operations Telegram account: ✅ configured
- Operations allowlist includes `6814512991`: ❌ no

It was not part of the live conversation at that time.

Evidence 1: subagent announce delivered stale output to main session

Main session file:

/Users/lume/.openclaw/agents/main/sessions/ce453bcd-1055-4dab-be1c-237056fdb504.jsonl

At line 630 / 2026-05-05T20:23:02.686Z (03:23 BKK), OpenClaw injected an inter-session message:

[Inter-session message]
sourceSession=agent:main:subagent:8a21843e-95c7-4952-a521-6b09b5c45ed1
sourceChannel=webchat
sourceTool=subagent_announce
isUser=false
...
[Internal task completion event]
source: subagent
session_key: agent:main:subagent:8a21843e-95c7-4952-a521-6b09b5c45ed1
session_id: 7e982b1a-244f-4642-823d-2c543b4e8c8e
type: subagent task
task: runtime-replay-source-trace
status: completed successfully
...
Result:
Correcting my earlier answer: the direct live-config check now shows the subagents were right.

- Gateway health: ✅ live
- Operations Telegram account: ✅ configured
- Operations allowlist includes `6814512991`: ❌ no

At line 631 / 2026-05-05T20:23:06.123Z, the main session sent a normal user-facing answer:

Confirmed.

- Gateway health: ✅ live
- Operations Telegram account: ✅ configured
- Operations allowlist includes `6814512991`: ❌ no

`allowFrom` exists but is currently empty...

That confirms a stale subagent announce was converted into an outbound reply.

Evidence 2: source subagent session had unrelated prior history

Inspecting agent:main:subagent:8a21843e-95c7-4952-a521-6b09b5c45ed1 showed the session did not begin cleanly with the assigned runtime-replay-source-trace task.

It contained older unrelated turns first:

User: Verify health and confirm Operations Telegram account allowlist includes 6814512991 without exposing secrets.
Assistant: Verified. ... Does it include `6814512991`? ✅ yes

Only later did the subagent receive:

[Subagent Context] You are running as a subagent (depth 1/1)...
Begin. Your assigned task is in the system prompt under Your Role...

Then it answered the old Operations verification thread:

Correcting my earlier answer: the direct live-config check now shows the subagents were right.
...
Operations allowlist includes `6814512991`: ❌ no

This response is unrelated to the assigned debugging task. A second investigator subagent (agent:main:subagent:88edc2fb-6cc1-472c-8919-1bc1f094e5ca) showed the same contamination pattern.
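The contamination pattern in Evidence 2 could be detected mechanically by looking for turns that precede the subagent's own start marker. The sketch below assumes a simplified `Turn` shape and uses the `[Subagent Context]` prefix quoted above as the marker; it does not parse the real session JSONL format, which is not shown in this report.

```typescript
// Simplified, hypothetical transcript shape.
interface Turn {
  role: "user" | "assistant";
  text: string;
}

const SUBAGENT_MARKER = "[Subagent Context]";

// Returns the turns recorded before the subagent start marker, or null
// when the transcript contains no marker at all.
function turnsBeforeTaskStart(turns: Turn[]): Turn[] | null {
  let start = -1;
  turns.forEach((turn, index) => {
    if (start === -1 && turn.text.indexOf(SUBAGENT_MARKER) === 0) {
      start = index;
    }
  });
  return start === -1 ? null : turns.slice(0, start);
}

const transcript: Turn[] = [
  { role: "user", text: "Verify health and confirm the allowlist..." },
  { role: "assistant", text: "Verified. ..." },
  { role: "user", text: "[Subagent Context] You are running as a subagent (depth 1/1)..." },
];
const inherited = turnsBeforeTaskStart(transcript);
console.log(inherited ? inherited.length : "no marker"); // 2
```

A clean isolated session would yield an empty array here; a non-empty result means the child carried history that predates its task.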

Evidence 3: LCM recorded the bad state, but likely did not originate delivery

LCM search finds the stale phrase after the event, but as recorded messages:

  • msg#536690 assistant at 03:22 BKK
  • msg#536691 inter-session/user at 03:23 BKK
  • msg#536692 assistant at 03:23 BKK

LCM-specific investigation found duplicated heartbeat text stored as separate assistant messages in LCM conversation 1872, but no synthetic LCM repair rows (message_parts.is_synthetic=1 was 0). Recent compaction summaries did not cover the duplicate window.

Current read: LCM is recording already-corrupted/replayed transcript state; the primary bug appears to be upstream session/subagent delivery/history stitching.

Existing related issues

This is related to but not fully covered by:

  • #39032 — subagent completion output leaks internal tool-failure reasoning
  • #67777 — subagent completion delivery can be lost on timeout/drain/orphan prune
  • #58443 — gateway duplicates inbound messages
  • #64810 — async system events can interrupt/swallow in-progress replies

The distinct piece here is: completion is delivered stale, and the subagent session itself appears to have inherited or reused unrelated prior transcript turns.

Expected behavior

  • A newly spawned subagent should start with only its assigned task/context, not unrelated earlier user/assistant turns.
  • subagent_announce should not be injected into the requester session as a user-like message if stale or no longer contextually valid.
  • Internal task completion events should not be auto-converted into user-visible replies without freshness/session validation.
  • If a completion is stale, it should be suppressible, spooled for explicit review, or at least clearly marked as non-current.
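One way to read the expectations above is as a gate in front of announce delivery. The following is a hypothetical sketch, not an existing OpenClaw API: `classifyAnnounce`, its input shapes, and the age threshold are all assumptions chosen to illustrate a freshness/session check.

```typescript
// All names and fields here are illustrative assumptions.
interface CompletionAnnounce {
  sourceSession: string;
  completedAtMs: number;
}

interface RequesterState {
  // Subagent sessions the requester is still waiting on.
  awaitedSessions: string[];
  nowMs: number;
}

type Delivery = "deliver" | "spool_for_review";

function classifyAnnounce(
  announce: CompletionAnnounce,
  state: RequesterState,
  maxAgeMs: number = 600000, // 10 minutes, an arbitrary example value
): Delivery {
  const awaited = state.awaitedSessions.indexOf(announce.sourceSession) !== -1;
  const ageMs = state.nowMs - announce.completedAtMs;
  // Deliver only when the requester still expects this result and it is
  // recent; anything else is held back instead of becoming a reply.
  return awaited && ageMs <= maxAgeMs ? "deliver" : "spool_for_review";
}

console.log(
  classifyAnnounce(
    { sourceSession: "agent:main:subagent:example", completedAtMs: 0 },
    { awaitedSessions: [], nowMs: 1000 },
  ),
); // spool_for_review
```

Spooling rather than dropping keeps the completion available for explicit review, which matches the last bullet's preference for suppressible or clearly non-current handling.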

Suggested investigation surfaces

  • sessions_spawn session allocation/reuse/session key generation
  • subagent transcript initialization and restored session lookup
  • subagent_announce delivery freshness checks
  • inter-session message injection into main as role=user
  • requester conversation advancement after sessions_yield
  • lossless-claw ingestion of inter-session events (secondary)

Local evidence artifact

Full local evidence note:

/Users/lume/.openclaw/workspace-eva/docs/reports/openclaw-subagent-announce-replay-bug-2026-05-06.md

extent analysis

TL;DR

The most likely fix involves modifying the subagent completion delivery mechanism to include freshness checks and prevent stale announcements from being injected into the requester session.

Guidance

  1. Investigate session allocation and reuse: Review the sessions_spawn process to ensure that new subagents start with a clean slate and do not inherit unrelated prior transcript turns.
  2. Implement freshness checks for subagent announcements: Modify the subagent_announce delivery mechanism to include checks for staleness and prevent outdated announcements from being converted into user-visible replies.
  3. Improve inter-session message injection: Update the inter-session message injection logic to prevent stale or non-contextually valid messages from being injected into the main session as user-like messages.
  4. Review lossless-claw ingestion of inter-session events: Investigate how lossless-claw handles inter-session events and ensure that it does not contribute to the stale subagent completion delivery issue.

Example

No code snippet is provided as the issue does not contain sufficient technical details to generate a specific example.

Notes

The provided information suggests that the issue is related to the subagent completion delivery mechanism and the way inter-session messages are handled. However, without more technical details, it is difficult to provide a comprehensive solution.

Recommendation

Apply a workaround by modifying the subagent completion delivery mechanism to include freshness checks and prevent stale announcements from being injected into the requester session. This will likely require updates to the subagent_announce delivery logic and the inter-session message injection mechanism.
