openclaw - ✅(Solved) Fix Commentary text can leak into final assistant replies, and duplicate visible replies can occur after tool sends [5 pull requests, 7 comments, 6 participants]

GitHub stats
openclaw/openclaw#59150 · Fetched 2026-04-08 02:28:07
Comments: 7 · Participants: 6 · Timeline events: 22 · Reactions: 0
Timeline (top): cross-referenced ×8, commented ×7, mentioned ×3, subscribed ×3

We observed two user-facing reply integrity issues during Telegram channel conversations using OpenClaw with Codex / OpenAI Responses-style flows:

  1. commentary-like internal text can appear in final assistant-visible replies
  2. duplicate visible replies can occur when a turn already performed a user-visible tool send

This report is based on:

  • transcript evidence
  • inspection of compiled dist code paths in the installed OpenClaw runtime


PR fix notes

PR #59643: fix(agents): preserve commentary/final_answer phase separation

Description (problem / solution / changelog)

Summary

  • Problem: assistant turns that contain both commentary and final_answer text can be flattened into one visible output, which leaks commentary into user-facing replies and can produce duplicate or malformed final delivery.
  • Why it matters: this breaks the expected final-only user experience, causes duplicate replies after tool/send paths, and corrupts replay/context because mixed-phase text is persisted and replayed ambiguously.
  • What changed: preserved phase separation end-to-end across stored-message conversion, replay/input-item rebuilding, WebSocket partial phase propagation, and visible extraction/delivery so user-visible output prefers final_answer while still falling back safely when no final text exists.
  • What did NOT change (scope boundary): this does not globally redefine every text extractor in the repo, does not change tool-call semantics, and does not attempt a broader phase-aware audit outside the main OpenAI WS -> embedded subscribe -> visible delivery path.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #59150
  • Related #56198
  • Related #58892
  • Related #52084
  • Related #25592
  • Related #44467
  • Related PR #30479
  • Related PR #57484
  • This PR fixes a bug or regression

Root Cause / Regression History (if applicable)

  • Root cause: assistant text phase truth already existed at block level, but several layers still flattened mixed-phase text. Stored assistant messages could carry both commentary and final-answer blocks under one misleading top-level phase; replay collapsed them back together; stream partials could lose item-phase attribution; and visible extraction/delivery then consumed flattened text instead of phase-aware text.
  • Missing detection / guardrail: there was no focused regression coverage for mixed-phase stored messages, phase-aware replay splitting, signature-only phased partials, or text_end / message_end delivery interactions where commentary previews must be suppressed/replaced by final output.
  • Prior context (git blame, prior PR, issue, or refactor if known): this bug family overlaps longstanding leakage/duplication reports in #59150, #56198, #58892, #52084, #25592, and #44467. Adjacent prior art includes #30479 (stripping raw user-facing protocol leakage) and #57484 (commentary-delivery semantics on a channel path), but those did not fix mixed-phase persistence/replay/delivery end-to-end.
  • Why this regressed now: the issue is not a single recent regression; it is an accumulated phase-separation gap that became more visible once commentary, block replies, tool sends, and final-answer delivery all coexisted on the same assistant turn path.
  • If unknown, what was ruled out: ruled out “display-only” root cause. Investigation confirmed the problem begins upstream in stored-message conversion/replay semantics, not only in final rendering.
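
The flattening described in this root cause can be contrasted with a phase-aware split. The following is a minimal sketch, not the code from the PR: the `TextBlock` and `ReplayItem` shapes and the `flatten` / `splitByPhase` helpers are hypothetical names chosen for illustration, and only the per-block `textSignature.phase` field is taken from the PR descriptions on this page.

```typescript
// Hypothetical shapes for illustration; the real OpenClaw types are not shown on this page.
type Phase = "commentary" | "final_answer";

interface TextBlock {
  text: string;
  // Per-block phase marker; undefined models legacy/unphased text.
  textSignature?: { phase?: Phase };
}

interface ReplayItem {
  phase: Phase | undefined;
  text: string;
}

// Phase-blind flattening: the behaviour the root cause describes.
function flatten(blocks: TextBlock[]): string {
  return blocks.map((b) => b.text).join("");
}

// Phase-aware alternative: consecutive blocks with the same phase are merged,
// and a new replay item starts whenever the phase changes.
function splitByPhase(blocks: TextBlock[]): ReplayItem[] {
  const items: ReplayItem[] = [];
  for (const block of blocks) {
    const phase = block.textSignature?.phase;
    const last = items[items.length - 1];
    if (last !== undefined && last.phase === phase) {
      last.text += block.text;
    } else {
      items.push({ phase, text: block.text });
    }
  }
  return items;
}
```

With a commentary block followed by a final_answer block, `flatten` returns one string containing both, while `splitByPhase` returns two items that a later layer can treat differently.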

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file:
    • src/agents/openai-ws-stream.test.ts
    • src/agents/pi-embedded-utils.test.ts
    • src/agents/pi-embedded-subscribe.handlers.messages.test.ts
    • src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.emits-block-replies-text-end-does-not.test.ts
    • src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.does-not-append-text-end-content-is.test.ts
    • src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.does-not-duplicate-text-end-repeats-full.test.ts
  • Scenario the test should lock in: mixed commentary/final stored messages stay phase-separated on replay; stream partials preserve phase; visible extraction prefers final_answer -> commentary -> legacy/unphased; commentary text_end block replies are suppressed until final delivery; final replacement at message_end works; and duplicate/prefix-extension regressions remain fixed.
  • Why this is the smallest reliable guardrail: the bug spans stored-message conversion, stream partial attribution, and delivery seams. Unit-only coverage at one layer would miss the cross-layer collapse that caused the visible duplication/leak.
  • Existing test that already covers this (if any): none before this branch for the mixed-phase end-to-end path.
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

  • Visible assistant output now prefers final_answer text when both commentary and final-answer phases exist in one turn.
  • Commentary-only previews are no longer allowed to leak through as the final visible reply in the main embedded delivery path.
  • When commentary streamed first and final text arrives later, the final visible reply replaces the preview instead of duplicating it.
  • If no final-answer text exists, commentary/unphased fallback still works instead of producing an empty reply.
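
The precedence described above (final_answer, then commentary, then legacy/unphased) can be sketched as follows. This is an illustrative sketch under assumptions: `pickVisibleText`, `joinPhase`, and the flat `phase` field are hypothetical and are not the repository's own helpers.

```typescript
// Hypothetical block shape; the real message types are not shown on this page.
type Phase = "commentary" | "final_answer";

interface TextBlock {
  text: string;
  phase?: Phase; // undefined models legacy/unphased text
}

// Concatenate the text of all blocks that carry exactly the given phase.
function joinPhase(blocks: TextBlock[], phase: Phase | undefined): string {
  return blocks
    .filter((b) => b.phase === phase)
    .map((b) => b.text)
    .join("")
    .trim();
}

// Prefer final_answer text; fall back to commentary, then to unphased text,
// so a turn without final text does not produce an empty reply.
function pickVisibleText(blocks: TextBlock[]): string {
  const finalText = joinPhase(blocks, "final_answer");
  if (finalText !== "") return finalText;
  const commentary = joinPhase(blocks, "commentary");
  if (commentary !== "") return commentary;
  return joinPhase(blocks, undefined);
}
```

For a turn holding both phases, only the final_answer text is returned; a commentary-only or unphased-only turn still yields its text.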

Diagram (if applicable)

Before:
[mixed commentary + final_answer blocks]
  -> [stored/replayed as flattened assistant text]
  -> [visible extractor concatenates all text]
  -> [commentary leak and/or duplicate final reply]

After:
[mixed commentary + final_answer blocks]
  -> [phase preserved in storage/replay/partials]
  -> [visible extractor prefers final_answer]
  -> [commentary preview suppressed/replaced]
  -> [single intended final visible reply]

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: Ubuntu (local dev host)
  • Runtime/container: local Node/pnpm repo checkout
  • Model/provider: OpenAI WS / embedded subscribe path
  • Integration/channel (if any): embedded delivery path with Telegram/Discord-adjacent visible-output semantics
  • Relevant config (redacted): default OpenAI WS / embedded subscribe test harnesses; no special secrets required

Steps

  1. Produce or replay an assistant turn containing both commentary and final_answer text blocks.
  2. Observe stored/replayed assistant content and the user-visible delivery path.
  3. Confirm whether visible output leaks commentary, duplicates final delivery, or collapses mixed phases.

Expected

  • Stored and replayed assistant content preserves phase boundaries.
  • User-visible extraction prefers final_answer when present.
  • Commentary previews are suppressed or replaced rather than duplicated.

Actual

  • Before this fix: mixed-phase turns could flatten into one visible assistant reply, leak commentary, or produce duplicate final delivery.
  • On this branch: phase separation is preserved through replay and delivery, and the targeted duplicate/leak regressions are covered by tests.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)
  • local transcript evidence showed assistant turns with both commentary and final_answer blocks in a single message
  • targeted regression slice passed: 8 suites / 164 tests
  • fresh independent review loop passed for storage/replay, stream-phase, visible-delivery, and holistic closure before branch submission

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios:
    • inspected mixed-phase transcript evidence and confirmed commentary + final-answer coexistence in one assistant turn
    • verified stored-message/replay, stream partial propagation, and visible delivery changes in the touched files
    • ran the targeted regression slice and confirmed 8 suites / 164 tests passed
    • confirmed the branch diff remains scoped to the intended 10 files
  • Edge cases checked:
    • commentary leaking through text_end block replies
    • final replacement at message_end after commentary streamed first
    • legacy/unphased + phased replay collapse
    • signature-only phased partials without top-level partial.phase
    • prefix-extension and duplicate text_end regressions
  • What you did not verify:
    • full-repo tsc --noEmit on this host (prior typecheck attempts were memory-constrained)
    • a broader follow-up audit of other phase-blind helper paths such as src/agents/tools/sessions-helpers.ts

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: other assistant-text consumers outside the main embedded delivery path may still use phase-blind flattening and show adjacent inconsistencies.
    • Mitigation: this PR keeps scope tight to the verified main issue path and leaves sessions-helpers parity as explicit follow-up watchpoint rather than silently broadening behavior.
  • Risk: replay/delivery edge cases could regress around partial/final transitions.
    • Mitigation: regression coverage now locks in mixed-phase replay splitting, phase-aware partial attribution, commentary suppression, final replacement, and duplicate/text-end edge cases.

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/agents/openai-ws-message-conversion.ts (modified, +35/-13)
  • src/agents/openai-ws-stream.test.ts (modified, +490/-30)
  • src/agents/openai-ws-stream.ts (modified, +145/-12)
  • src/agents/pi-embedded-subscribe.handlers.messages.test.ts (modified, +169/-0)
  • src/agents/pi-embedded-subscribe.handlers.messages.ts (modified, +84/-32)
  • src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.does-not-append-text-end-content-is.test.ts (modified, +1/-0)
  • src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.does-not-duplicate-text-end-repeats-full.test.ts (modified, +4/-1)
  • src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.emits-block-replies-text-end-does-not.test.ts (modified, +468/-1)
  • src/agents/pi-embedded-utils.test.ts (modified, +76/-1)
  • src/agents/pi-embedded-utils.ts (modified, +108/-6)

PR #61457: fix(agents): suppress commentary partial visibility and buffer unphased early deltas

Description (problem / solution / changelog)

Summary

Addresses the two specific blockers from the maintainer review on #59643, building on the architectural direction of that PR while fixing the behaviors that blocked its merge.

Blocker 1: Commentary partials must stay suppressed

The merged fix from #61282 already suppresses commentary-phase output at the embedded subscribe delivery boundary on current main. This PR preserves that behavior and does not reintroduce the commentary-preview-then-replace pattern that was rejected in #59643.

Blocker 2: Late-map unphased deltas must be buffered

In src/agents/openai-ws-stream.ts, when response.output_text.delta arrives before response.output_item.added, the delta is now buffered instead of emitted as an unphased visible partial. When item metadata arrives:

  • final_answer buffered text is emitted with correct phase/signature
  • commentary buffered text stays suppressed
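
The buffering rule for Blocker 2 can be sketched as a small state machine. This is a simplified sketch, not the code in `openai-ws-stream.ts`: the event shapes are reduced to the fields needed here, the `PhaseBufferingStream` class and `VisiblePartial` type are hypothetical, and it assumes every item eventually announces a phase (legacy unphased items are not modelled).

```typescript
type Phase = "commentary" | "final_answer";

// Simplified event shapes (assumption): only the fields this sketch needs.
type StreamEvent =
  | { type: "response.output_item.added"; itemId: string; phase: Phase }
  | { type: "response.output_text.delta"; itemId: string; delta: string };

interface VisiblePartial {
  itemId: string;
  phase: Phase;
  text: string;
}

class PhaseBufferingStream {
  private readonly phases = new Map<string, Phase>();
  private readonly pending = new Map<string, string>();

  // Returns the partials that may be shown to the user for this event.
  handle(event: StreamEvent): VisiblePartial[] {
    if (event.type === "response.output_item.added") {
      this.phases.set(event.itemId, event.phase);
      const buffered = this.pending.get(event.itemId) ?? "";
      this.pending.delete(event.itemId);
      return this.emit(event.itemId, event.phase, buffered);
    }
    const phase = this.phases.get(event.itemId);
    if (phase === undefined) {
      // Delta arrived before item metadata: hold it back instead of
      // emitting it as an unphased visible partial.
      const soFar = this.pending.get(event.itemId) ?? "";
      this.pending.set(event.itemId, soFar + event.delta);
      return [];
    }
    return this.emit(event.itemId, phase, event.delta);
  }

  private emit(itemId: string, phase: Phase, text: string): VisiblePartial[] {
    // Only final_answer text is surfaced; commentary stays suppressed.
    if (phase !== "final_answer" || text === "") return [];
    return [{ itemId, phase, text }];
  }
}
```

Early deltas for an item are released in one partial once the item resolves to final_answer, and are dropped from the visible stream if it resolves to commentary.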

Architectural improvements (from #59643 direction)

  • Phase-aware replay conversion in openai-ws-message-conversion.ts: stored/replayed assistant text respects per-block textSignature.phase and splits by phase instead of flattening
  • Phase attribution on WS partials via textSignature
  • Top-level assistant phase only set when response is consistently single-phase
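
The last point, setting a top-level phase only for consistently single-phase responses, can be illustrated with a short sketch. The `resolveTopLevelPhase` helper and the `TextBlock` shape are hypothetical; only the `textSignature.phase` field name comes from the description above.

```typescript
type Phase = "commentary" | "final_answer";

interface TextBlock {
  text: string;
  textSignature?: { phase?: Phase };
}

// Return a top-level phase only when every block agrees on the same phase.
// Mixed-phase, unphased, and empty messages get no top-level phase, so the
// per-block signatures remain the only source of truth.
function resolveTopLevelPhase(blocks: TextBlock[]): Phase | undefined {
  if (blocks.length === 0) return undefined;
  const first = blocks[0]?.textSignature?.phase;
  return blocks.every((b) => b.textSignature?.phase === first) ? first : undefined;
}
```

This avoids the "misleading top-level phase" situation in which a stored message labelled with one phase also carries blocks from the other.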

What this does NOT cover (known follow-up scope)

  • src/agents/tools/sessions-helpers.ts still uses phase-blind extractAssistantText() — follow-up PR recommended
  • TUI render path (tui-formatters.ts, tui-stream-assembler.ts) still phase-blind — follow-up PR recommended
  • Webchat/dashboard history consumers may still render generic assistant text — follow-up PR recommended

These are explicitly out of scope per the maintainer direction to make the core WS → embedded subscribe path correct first.

Change Type

  • Bug fix

Scope

  • Gateway / orchestration
  • Integrations

Linked Issues

  • Closes #59150
  • Related #59643
  • Related #61282

Tests

Tests added for:

  • Phase-aware replay splitting
  • Mixed-phase stored message reconstruction
  • WS partial emission only after item-phase mapping exists
  • Suppression of buffered commentary deltas when late phase resolves to commentary

Credits

Architectural direction from @ringlochid in #59643. Review feedback from @steipete that identified the two specific blockers addressed here.

Changed files

  • src/agents/openai-ws-message-conversion.ts (modified, +34/-13)
  • src/agents/openai-ws-stream.test.ts (modified, +543/-0)
  • src/agents/openai-ws-stream.ts (modified, +173/-12)
  • src/agents/pi-embedded-subscribe.handlers.messages.test.ts (modified, +180/-0)
  • src/agents/pi-embedded-subscribe.handlers.messages.ts (modified, +14/-0)

PR #61463: fix(agents,gateway): phase-aware assistant text extraction — suppress OpenAI commentary leaks in sessions-helpers, TUI, and REST history

Description (problem / solution / changelog)

What this fixes (plain English)

Several places in the codebase that display assistant text were not aware of the phase system (commentary vs. final answer). This meant internal "thinking out loud" commentary could leak into user-visible surfaces like the TUI and HTTP/SSE session history endpoints. This PR makes those surfaces phase-aware, and hardens heartbeat session resolution against subagent key leaks.

Technical details

Follow-up to #59643 and #59150, which fixed phase separation in the core WS path. Post-merge audit found adjacent surfaces still using phase-blind extraction.

Surfaces fixed:

  • src/tui/tui-formatters.ts — assistant text extraction now uses extractAssistantVisibleText() to prefer final_answer over commentary phase blocks
  • src/infra/heartbeat-runner.ts — heartbeat session resolution now rejects subagent session keys, falling back to the main session key instead of leaking into subagent scopes
  • src/gateway/sessions-history-http.test.ts — regression test: REST history applies chat.history sanitization (strips NO_REPLY messages, preserves phase blocks)
  • src/tui/tui-formatters.test.ts — regression test: mixed commentary + final_answer blocks extract only the visible final answer

Explicitly deferred: extractAssistantTextForSilentCheck and buffered delta/final rendering — lower-confidence, more behaviorally sensitive.
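
The heartbeat guard described above can be sketched as a small resolver. This is a hedged illustration only: the PR text does not show the real subagent key format, so `looksLikeSubagentKey` (matching a `:subagent:` segment) and `resolveHeartbeatSessionKey` are hypothetical stand-ins, and the predicate is injectable so a real check could be substituted.

```typescript
// Hypothetical predicate: the real subagent session key format is not given on this page.
const looksLikeSubagentKey = (key: string): boolean => key.includes(":subagent:");

// Use the candidate session key unless it is missing, blank, or belongs to a
// subagent; in those cases fall back to the main session key.
function resolveHeartbeatSessionKey(
  candidate: string | undefined,
  mainSessionKey: string,
  isSubagentKey: (key: string) => boolean = looksLikeSubagentKey,
): string {
  if (candidate === undefined || candidate.trim() === "") return mainSessionKey;
  return isSubagentKey(candidate) ? mainSessionKey : candidate;
}
```

The fallback keeps heartbeat output in the main session instead of leaking it into a subagent scope.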

Related

  • Follow-up to #59643 and #59150
  • Follow-up issues: #61474, #61475, #61476, #61477, #61478
  • Companion PRs: #61529, #61528, #61527
  • Related PRs: #61855, #61816

Test plan

  • 27/27 TUI formatter tests pass
  • Regression test for mixed commentary + final_answer in TUI extraction
  • HTTP history regression test (REST path shares chat.history sanitization, strips NO_REPLY messages)
  • SSE seq validation for NO_REPLY message stripping
  • sessions-history-http gateway integration tests have 2 pre-existing infra failures (also fail on upstream/main)
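
As a rough illustration of the NO_REPLY stripping that the history regression tests refer to, the sketch below filters sentinel-only assistant messages out of a history list. The message shape, the `stripNoReply` helper, and the assumption that the sentinel is the literal text `NO_REPLY` are all hypothetical; the actual chat.history sanitization is not shown on this page.

```typescript
// Hypothetical history message shape for illustration.
interface HistoryMessage {
  role: "user" | "assistant";
  text: string;
}

// Assumed sentinel value, inferred from the NO_REPLY name in the test plan.
const NO_REPLY = "NO_REPLY";

// Drop assistant messages whose whole text is the sentinel; keep everything else.
function stripNoReply(messages: HistoryMessage[]): HistoryMessage[] {
  return messages.filter(
    (m) => !(m.role === "assistant" && m.text.trim() === NO_REPLY),
  );
}
```

A user message that happens to contain the same text is left untouched in this sketch, since only assistant output is treated as a sentinel.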

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/infra/heartbeat-runner.subagent-session-guard.test.ts (added, +72/-0)
  • src/infra/heartbeat-runner.ts (modified, +24/-17)
  • src/tui/tui-formatters.test.ts (modified, +20/-0)
  • src/tui/tui-formatters.ts (modified, +25/-0)

PR #61481: fix(agents): harden OpenAI phase-aware visible text — suppress commentary partials, prevent empty final_answer fallback leak

Description (problem / solution / changelog)

Summary

  • fix phase-aware visible text extraction so an explicit final_answer block never falls back to commentary or legacy unphased text when it sanitizes to empty
  • suppress all commentary-phase partial streaming output regardless of whether extracted visible text is non-empty
  • keep session-history HTTP/SSE sanitization aligned with the hardened chat history path
  • add regression tests covering both leak paths and the session-history follow-through
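
The two hardening rules can be sketched as follows. This is an illustrative sketch, not the PR's code: `sanitize` is a placeholder (the real sanitization rules are not described here), `extractVisibleTextHardened` and `isPartialVisible` are hypothetical names, and the fallback used when no final_answer block exists is assumed to stay commentary, then unphased.

```typescript
type Phase = "commentary" | "final_answer";

interface TextBlock {
  text: string;
  phase?: Phase; // undefined models legacy/unphased text
}

// Placeholder sanitizer (assumption): stands in for whatever cleanup the real code applies.
const sanitize = (text: string): string => text.trim();

function joinPhase(blocks: TextBlock[], phase: Phase | undefined): string {
  return sanitize(
    blocks
      .filter((b) => b.phase === phase)
      .map((b) => b.text)
      .join(""),
  );
}

// If any explicit final_answer block exists, its sanitized text is the result,
// even when that text is empty. Fallback applies only when no such block exists.
function extractVisibleTextHardened(blocks: TextBlock[]): string {
  const hasExplicitFinal = blocks.some((b) => b.phase === "final_answer");
  if (hasExplicitFinal) return joinPhase(blocks, "final_answer");
  return joinPhase(blocks, "commentary") || joinPhase(blocks, undefined);
}

// Streaming partials: commentary is suppressed regardless of its content.
function isPartialVisible(phase: Phase | undefined): boolean {
  return phase !== "commentary";
}
```

The difference from a plain final_answer -> commentary -> unphased precedence shows up when the final_answer block sanitizes to empty: the hardened version returns an empty string instead of surfacing the commentary.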

Context

This hardens the merged #59643 behavior against two P1 leaks:

  • fixes #61474
  • fixes #61475

Related issues / bug family

  • related to #25592
  • related to #59536
  • related to #59918
  • related to #44213
  • related to #49438
  • related to #53960

Parent / sibling PRs

  • parent: #59643 — core phase-separation fix (merged)
  • sibling: #61463 — phase-aware extraction in sessions-helpers, TUI, and history paths

Remaining follow-ups from the same adversarial review

  • #61476 — replay splitting corrupts phase on mixed messages
  • #61477 — late-map buffering gates on key existence, not phase validity
  • #61478 — function-call replay silently loses malformed arguments

Related open PRs

  • #59920 — prefer terminal reply fields in CLI JSONL parser
  • #61151 — drop partialJson streaming artifacts from session history
  • #61337 — disable OpenAI tool-use pairing repair

Testing

  • npm exec -- node --no-maglev ./node_modules/vitest/vitest.mjs run --config vitest.config.ts src/agents/pi-embedded-utils.test.ts src/agents/pi-embedded-subscribe.handlers.messages.test.ts
  • npm exec -- node --no-maglev ./node_modules/vitest/vitest.mjs run --config vitest.config.ts src/gateway/sessions-history-http.test.ts

Changed files

  • .agents/skills/openclaw-parallels-smoke/SKILL.md (modified, +13/-0)
  • .agents/skills/openclaw-qa-testing/SKILL.md (added, +86/-0)
  • .agents/skills/openclaw-qa-testing/agents/openai.yaml (added, +4/-0)
  • .github/labeler.yml (modified, +4/-0)
  • .github/workflows/ci.yml (modified, +7/-1)
  • .github/workflows/control-ui-locale-refresh.yml (modified, +2/-2)
  • .github/workflows/openclaw-npm-release.yml (modified, +1/-1)
  • CHANGELOG.md (modified, +40/-12)
  • appcast.xml (modified, +248/-116)
  • apps/android/app/build.gradle.kts (modified, +2/-2)
  • apps/ios/Config/Version.xcconfig (modified, +3/-3)
  • apps/macos/Sources/OpenClaw/Resources/Info.plist (modified, +2/-2)
  • apps/macos/Sources/OpenClawProtocol/GatewayModels.swift (modified, +14/-0)
  • apps/shared/OpenClawKit/Sources/OpenClawKit/Resources/tool-display.json (modified, +23/-0)
  • apps/shared/OpenClawKit/Sources/OpenClawProtocol/GatewayModels.swift (modified, +14/-0)
  • docs/.generated/config-baseline.sha256 (modified, +4/-4)
  • docs/.generated/plugin-sdk-api-baseline.sha256 (modified, +2/-2)
  • docs/automation/tasks.md (modified, +5/-0)
  • docs/channels/discord.md (modified, +1/-1)
  • docs/channels/matrix.md (modified, +29/-5)
  • docs/cli/memory.md (modified, +43/-15)
  • docs/cli/update.md (modified, +3/-1)
  • docs/concepts/dreaming.md (modified, +121/-194)
  • docs/concepts/memory-qmd.md (modified, +17/-1)
  • docs/concepts/memory-search.md (modified, +9/-8)
  • docs/concepts/memory.md (modified, +12/-8)
  • docs/concepts/model-providers.md (modified, +2/-0)
  • docs/concepts/models.md (modified, +2/-0)
  • docs/docs.json (modified, +8/-1)
  • docs/gateway/configuration-reference.md (modified, +31/-12)
  • docs/help/faq.md (modified, +36/-0)
  • docs/help/testing.md (modified, +22/-0)
  • docs/install/updating.md (modified, +1/-0)
  • docs/plugins/architecture.md (modified, +1/-0)
  • docs/plugins/building-plugins.md (modified, +1/-0)
  • docs/plugins/manifest.md (modified, +76/-30)
  • docs/plugins/sdk-migration.md (modified, +11/-1)
  • docs/plugins/sdk-overview.md (modified, +22/-9)
  • docs/providers/bedrock-mantle.md (modified, +20/-7)
  • docs/providers/bedrock.md (modified, +29/-0)
  • docs/providers/comfy.md (added, +201/-0)
  • docs/providers/fal.md (modified, +2/-1)
  • docs/providers/google.md (modified, +30/-0)
  • docs/providers/index.md (modified, +4/-0)
  • docs/providers/minimax.md (modified, +29/-0)
  • docs/providers/models.md (modified, +4/-0)
  • docs/providers/openai.md (modified, +10/-2)
  • docs/providers/runway.md (added, +63/-0)
  • docs/providers/vydra.md (added, +123/-0)
  • docs/reference/memory-config.md (modified, +117/-98)
  • docs/tools/image-generation.md (modified, +21/-17)
  • docs/tools/index.md (modified, +14/-7)
  • docs/tools/lobster.md (modified, +11/-9)
  • docs/tools/music-generation.md (added, +208/-0)
  • docs/tools/plugin.md (modified, +1/-0)
  • docs/tools/slash-commands.md (modified, +1/-1)
  • docs/tools/video-generation.md (modified, +147/-84)
  • docs/web/control-ui.md (modified, +4/-1)
  • docs/web/dashboard.md (modified, +2/-0)
  • dream-diary-preview-v2.html (added, +399/-0)
  • dream-diary-preview-v3.html (added, +323/-0)
  • extensions/amazon-bedrock-mantle/api.ts (modified, +2/-0)
  • extensions/amazon-bedrock-mantle/bedrock-token-generator.d.ts (added, +6/-0)
  • extensions/amazon-bedrock-mantle/discovery.test.ts (modified, +101/-3)
  • extensions/amazon-bedrock-mantle/discovery.ts (modified, +64/-13)
  • extensions/amazon-bedrock-mantle/package.json (modified, +3/-0)
  • extensions/bluebubbles/src/accounts.ts (modified, +5/-1)
  • extensions/bluebubbles/src/monitor.ts (modified, +1/-1)
  • extensions/browser/src/browser/chrome.default-browser.test.ts (modified, +2/-6)
  • extensions/browser/src/browser/client-fetch.loopback-auth.test.ts (modified, +2/-6)
  • extensions/browser/src/browser/control-service.plugin-disabled.test.ts (modified, +2/-6)
  • extensions/browser/src/browser/profiles-service.test.ts (modified, +5/-8)
  • extensions/browser/src/browser/pw-tools-core.clamps-timeoutms-scrollintoview.test.ts (modified, +2/-6)
  • extensions/browser/src/browser/pw-tools-core.interactions.batch.test.ts (modified, +2/-6)
  • extensions/browser/src/browser/pw-tools-core.interactions.evaluate.abort.test.ts (modified, +2/-6)
  • extensions/browser/src/browser/pw-tools-core.interactions.set-input-files.test.ts (modified, +2/-4)
  • extensions/browser/src/browser/pw-tools-core.last-file-chooser-arm-wins.test.ts (modified, +2/-6)
  • extensions/browser/src/browser/pw-tools-core.screenshots-element-selector.test.ts (modified, +2/-6)
  • extensions/browser/src/browser/routes/agent.existing-session.test.ts (modified, +3/-8)
  • extensions/browser/src/browser/routes/basic.existing-session.test.ts (modified, +3/-8)
  • extensions/browser/src/browser/server-context.existing-session.test.ts (modified, +3/-8)
  • extensions/browser/src/browser/server-context.hot-reload-profiles.test.ts (modified, +6/-12)
  • extensions/browser/src/browser/server-context.remote-profile-tab-ops.fallback.test.ts (modified, +2/-6)
  • extensions/browser/src/browser/server-context.remote-profile-tab-ops.playwright.test.ts (modified, +2/-6)
  • extensions/browser/src/browser/server-lifecycle.test.ts (modified, +3/-8)
  • extensions/browser/src/browser/server.control-server.test-harness.ts (modified, +2/-1)
  • extensions/browser/src/browser/server.evaluate-disabled-does-not-block-storage.test.ts (modified, +3/-8)
  • extensions/browser/src/cli/browser-cli.test-support.ts (modified, +1/-1)
  • extensions/browser/src/cli/command-format.ts (modified, +1/-1)
  • extensions/browser/src/config/config.ts (modified, +1/-1)
  • extensions/browser/src/core-api.ts (modified, +25/-20)
  • extensions/browser/src/doctor-browser.ts (modified, +1/-1)
  • extensions/browser/src/gateway/auth.ts (modified, +1/-1)
  • extensions/browser/src/gateway/startup-auth.ts (modified, +1/-1)
  • extensions/browser/src/infra/errors.ts (modified, +1/-1)
  • extensions/browser/src/infra/fs-safe.ts (modified, +1/-1)
  • extensions/browser/src/infra/net/proxy-env.ts (modified, +1/-1)
  • extensions/browser/src/infra/net/ssrf.ts (modified, +1/-1)
  • extensions/browser/src/infra/path-guards.ts (modified, +1/-1)
  • extensions/browser/src/infra/ports.ts (modified, +1/-1)

PR #61506: fix: prefer final-answer text across web chat surfaces

Description (problem / solution / changelog)

Summary

  • prefer assistant final_answer text over commentary in web chat and previews
  • suppress commentary text when completed WS replies are persisted
  • share assistant phase helpers and patch the remaining chat.history phase-blind path

Verification

  • pnpm test src/shared/chat-message-content.test.ts src/agents/pi-embedded-utils.test.ts src/agents/pi-embedded-subscribe.handlers.messages.test.ts src/agents/openai-ws-stream.test.ts src/gateway/session-utils.fs.test.ts ui/src/ui/chat/message-extract.test.ts
  • pnpm build

Notes

  • pnpm check on fresh origin/main is currently red from unrelated typing drift outside this branch's touched surface.
  • follow-up to #59150 / #59643

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/agents/openai-ws-message-conversion.ts (modified, +38/-48)
  • src/agents/openai-ws-stream.test.ts (modified, +46/-17)
  • src/agents/openai-ws-stream.ts (modified, +4/-15)
  • src/agents/pi-embedded-subscribe.handlers.messages.ts (modified, +3/-43)
  • src/agents/pi-embedded-utils.ts (modified, +5/-31)
  • src/gateway/server-methods/chat.ts (modified, +2/-29)
  • src/gateway/server.chat.gateway-server-chat.test.ts (modified, +23/-0)
  • src/gateway/session-utils.fs.test.ts (modified, +50/-0)
  • src/gateway/session-utils.fs.ts (modified, +10/-0)
  • src/shared/chat-message-content.test.ts (modified, +111/-1)
  • src/shared/chat-message-content.ts (modified, +146/-0)
  • ui/src/ui/chat/message-extract.test.ts (modified, +35/-0)
  • ui/src/ui/chat/message-extract.ts (modified, +3/-1)
Raw issue body

Symptoms

Commentary leak

Users can receive text that looks like internal drafting/commentary, for example:

  • Need answer yes can reinstall maybe already exists.

Duplicate replies

A single turn can produce two visible replies instead of one:

  • one normal assistant reply
  • one additional reply associated with delivery-mirror

This appears to be two separate outbound messages, not one duplicated string.

Evidence

Confirmed by transcript

Session transcript:

  • /root/.openclaw/agents/main/sessions/2607bbec-bdf2-4920-ac61-fc8ef586c4cc.jsonl

Findings:

  1. commentary-phase text appeared inside assistant-visible content
  2. duplicate replies were recorded as two separate assistant messages
  3. the second message often appeared as:
    • provider=openclaw
    • model=delivery-mirror
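
A transcript check like the one behind these findings could be scripted. The sketch below scans JSONL text for records tagged as delivery-mirror; it assumes, hypothetically, that each line is a JSON object with top-level `provider` and `model` fields, which the issue does not confirm. `findDeliveryMirrorLines` is an illustrative helper, not part of OpenClaw.

```typescript
// Assumed record shape: top-level provider/model fields are a guess for illustration.
interface TranscriptRecord {
  provider?: string;
  model?: string;
}

// Return the 1-based line numbers of records that look like delivery-mirror messages.
function findDeliveryMirrorLines(jsonl: string): number[] {
  const hits: number[] = [];
  jsonl.split("\n").forEach((line, index) => {
    if (line.trim() === "") return;
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      return; // skip malformed lines rather than aborting the scan
    }
    if (typeof parsed !== "object" || parsed === null) return;
    const record = parsed as TranscriptRecord;
    if (record.provider === "openclaw" && record.model === "delivery-mirror") {
      hits.push(index + 1);
    }
  });
  return hits;
}
```

Reading the session file into a string and passing it to this function would list candidate duplicate-reply records for manual review.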

Confirmed by code inspection

In compiled dist code:

Commentary path

  • file: dist/auth-profiles-B5ypC5S-.js
  • function: buildAssistantMessageFromResponse(response, modelInfo)

item.phase is parsed, but commentary-phase output_text still appears to be collected into final assistant content.

Delivery-mirror path

  • file: dist/auth-profiles-B5ypC5S-.js
  • functions:
    • appendAssistantMessageToSessionTranscript(params)
    • deliverOutboundPayloads(...)
    • executeSendAction(params)
    • sendMessage$1(params)

delivery-mirror is explicitly generated as an assistant-side transcript message after outbound send success.

Root cause assessment

Confirmed / strongly supported

  1. commentary leakage appears to happen in final response assembly
    • buildAssistantMessageFromResponse(...) parses item.phase but still appears to include commentary-phase text in final assistant content
  2. duplicate reply behavior involves a distinct delivery-mirror assistant message path
    • this is not just a Telegram rendering artifact

High-probability inference

The duplicate visible reply issue is likely caused by missing turn-level mutual exclusion between:

  • user-visible tool send
  • delivery-mirror append
  • normal final assistant reply delivery

We have not yet fully proven the exact final outbound dispatch function in every case.

Suggested fix directions

1. Commentary leak

In buildAssistantMessageFromResponse(...):

  • exclude items where phase === "commentary" from final assistant-visible content

2. Duplicate replies

Introduce turn-level suppression logic:

  • if a turn already performed a user-visible tool send
  • suppress the normal final assistant reply for that same turn

This would avoid double visible delivery without necessarily removing transcript mirror support.
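
The turn-level suppression idea can be sketched as a small guard object. This is a minimal sketch under assumptions: `TurnDeliveryGuard`, its method names, and the use of a string turn identifier are hypothetical and do not correspond to any function named in the dist code above.

```typescript
// Tracks which turns have already produced a user-visible tool send.
class TurnDeliveryGuard {
  private readonly turnsWithVisibleSend = new Set<string>();

  // Call when a tool send has visibly delivered a message for this turn.
  recordToolSend(turnId: string): void {
    this.turnsWithVisibleSend.add(turnId);
  }

  // The normal final assistant reply is delivered only if no visible
  // tool send already happened in the same turn.
  shouldDeliverFinalReply(turnId: string): boolean {
    return !this.turnsWithVisibleSend.has(turnId);
  }

  // Forget the turn once it is finished so state does not accumulate.
  endTurn(turnId: string): void {
    this.turnsWithVisibleSend.delete(turnId);
  }
}
```

Because the guard only gates visible delivery, a transcript mirror entry could still be appended for the same turn, which matches the stated goal of keeping transcript mirror support.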

Notes / impact

This investigation is intentionally conservative:

  • transcript evidence is direct
  • code-path assessment is based on compiled dist artifacts
  • some final-delivery details still need deeper tracing

User impact is significant because these issues affect:

  • reply integrity
  • trust
  • transcript cleanliness
  • duplicate visible messages

Extent analysis

TL;DR

The most likely fix involves modifying the buildAssistantMessageFromResponse function to exclude commentary-phase text and introducing turn-level suppression logic to prevent duplicate replies.

Guidance

  • Review buildAssistantMessageFromResponse and confirm that it checks item.phase and excludes items where phase === "commentary" from the final assistant-visible content.
  • Introduce turn-level suppression: if a turn has already performed a user-visible tool send, suppress the normal final assistant reply for that turn.
  • Verify the change by inspecting the session transcript for commentary leakage and for duplicate replies, including delivery-mirror messages.
  • Consider adding logging around the outbound dispatch path, since the exact final dispatch function has not yet been traced in every case.

Example

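function buildAssistantMessageFromResponse(response, modelInfo) {
  // Keep only items that are not commentary-phase, so internal drafting text
  // never reaches the assistant-visible content.
  // Note: `response.items` follows this example; the field name is not confirmed.
  const visibleItems = (response.items ?? []).filter((item) => item.phase !== "commentary");
  // ... build the assistant message from visibleItems
}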

Notes

The suggested fix directions are based on the provided evidence and code inspection, but further investigation may be necessary to fully resolve the issues. The user impact is significant, and addressing these issues will help maintain reply integrity, trust, and transcript cleanliness.

Recommendation

Apply the suggested workaround: modify buildAssistantMessageFromResponse to exclude commentary-phase text, and introduce turn-level suppression logic. Together these are likely to fix both the commentary leak and the duplicate replies.
