openclaw - ✅(Solved) Fix Commentary text can leak into final assistant replies, and duplicate visible replies can occur after tool sends [5 pull requests, 7 comments, 6 participants]

elliclee · 2026-04-01T16:30:44Z



GitHub stats

openclaw/openclaw#59150•Fetched 2026-04-08 02:28:07

Timeline (top): cross-referenced ×8, commented ×7, mentioned ×3, subscribed ×3

We observed two user-facing reply integrity issues during Telegram channel conversations using OpenClaw with Codex / OpenAI Responses-style flows:

1. commentary-like internal text can appear in final assistant-visible replies
2. duplicate visible replies can occur when a turn already performed a user-visible tool send

This report is based on:

- transcript evidence
- inspection of compiled dist code paths in the installed OpenClaw runtime

Fix / Workaround

We have not yet fully proven the exact final outbound dispatch function in every case.

PR fix notes

PR #59643: fix(agents): preserve commentary/final_answer phase separation

Repository: openclaw/openclaw
Author: ringlochid
State: closed | merged: True
Link: https://github.com/openclaw/openclaw/pull/59643

Description (problem / solution / changelog)

Summary

Problem: assistant turns that contain both commentary and final_answer text can be flattened into one visible output, which leaks commentary into user-facing replies and can produce duplicate or malformed final delivery.
Why it matters: this breaks the expected final-only user experience, causes duplicate replies after tool/send paths, and corrupts replay/context because mixed-phase text is persisted and replayed ambiguously.
What changed: preserved phase separation end-to-end across stored-message conversion, replay/input-item rebuilding, WebSocket partial phase propagation, and visible extraction/delivery so user-visible output prefers final_answer while still falling back safely when no final text exists.
What did NOT change (scope boundary): this does not globally redefine every text extractor in the repo, does not change tool-call semantics, and does not attempt a broader phase-aware audit outside the main OpenAI WS -> embedded subscribe -> visible delivery path.

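Change Type (select all)

Bug fix
Refactor required for the fix

Scope (select all touched areas)

Gateway / orchestration
Memory / storage
Integrations
API / contracts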

Linked Issue/PR

Closes #59150
Related #56198
Related #58892
Related #52084
Related #25592
Related #44467
Related PR #30479
Related PR #57484
This PR fixes a bug or regression

Root Cause / Regression History (if applicable)

Root cause: assistant text phase truth already existed at block level, but several layers still flattened mixed-phase text. Stored assistant messages could carry both commentary and final-answer blocks under one misleading top-level phase; replay collapsed them back together; stream partials could lose item-phase attribution; and visible extraction/delivery then consumed flattened text instead of phase-aware text.
Missing detection / guardrail: there was no focused regression coverage for mixed-phase stored messages, phase-aware replay splitting, signature-only phased partials, or text_end / message_end delivery interactions where commentary previews must be suppressed/replaced by final output.
Prior context (git blame, prior PR, issue, or refactor if known): this bug family overlaps longstanding leakage/duplication reports in #59150, #56198, #58892, #52084, #25592, and #44467. Adjacent prior art includes #30479 (stripping raw user-facing protocol leakage) and #57484 (commentary-delivery semantics on a channel path), but those did not fix mixed-phase persistence/replay/delivery end-to-end.
Why this regressed now: the issue is not a single recent regression; it is an accumulated phase-separation gap that became more visible once commentary, block replies, tool sends, and final-answer delivery all coexisted on the same assistant turn path.
If unknown, what was ruled out: ruled out “display-only” root cause. Investigation confirmed the problem begins upstream in stored-message conversion/replay semantics, not only in final rendering.
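
The replay-side part of this root cause can be illustrated with a small sketch. It assumes a simplified block shape (`TextBlock`, `ReplayItem`) and a helper name (`splitByPhase`) that are hypothetical and not taken from the OpenClaw source; the point is only that consecutive blocks are grouped by phase instead of being flattened into one string.

```typescript
// Hypothetical shapes for illustration; the real OpenClaw types may differ.
type Phase = "commentary" | "final_answer";

interface TextBlock {
  text: string;
  phase?: Phase; // undefined models legacy/unphased text
}

interface ReplayItem {
  phase: Phase | undefined;
  text: string;
}

// Group consecutive blocks that share a phase so that replay never merges
// commentary and final-answer text into a single item.
function splitByPhase(blocks: TextBlock[]): ReplayItem[] {
  const items: ReplayItem[] = [];
  for (const block of blocks) {
    const last = items[items.length - 1];
    if (last !== undefined && last.phase === block.phase) {
      last.text += block.text; // same phase: extend the current item
    } else {
      items.push({ phase: block.phase, text: block.text }); // phase changed: start a new item
    }
  }
  return items;
}

// One stored assistant message carrying both phases.
const replayed = splitByPhase([
  { text: "Checking the install state. ", phase: "commentary" },
  { text: "Still checking.", phase: "commentary" },
  { text: "Yes, you can reinstall.", phase: "final_answer" },
]);
```

In this sketch `replayed` holds two items, one per phase, so a downstream consumer can still tell commentary from the final answer.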

Regression Test Plan (if applicable)

Coverage level that should have caught this:
- [x] Unit test
- [x] Seam / integration test
- [ ] End-to-end test
- [ ] Existing coverage already sufficient
Target test or file:
- src/agents/openai-ws-stream.test.ts
- src/agents/pi-embedded-utils.test.ts
- src/agents/pi-embedded-subscribe.handlers.messages.test.ts
- src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.emits-block-replies-text-end-does-not.test.ts
- src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.does-not-append-text-end-content-is.test.ts
- src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.does-not-duplicate-text-end-repeats-full.test.ts
Scenario the test should lock in: mixed commentary/final stored messages stay phase-separated on replay; stream partials preserve phase; visible extraction prefers final_answer -> commentary -> legacy/unphased; commentary text_end block replies are suppressed until final delivery; final replacement at message_end works; and duplicate/prefix-extension regressions remain fixed.
Why this is the smallest reliable guardrail: the bug spans stored-message conversion, stream partial attribution, and delivery seams. Unit-only coverage at one layer would miss the cross-layer collapse that caused the visible duplication/leak.
Existing test that already covers this (if any): none before this branch for the mixed-phase end-to-end path.
If no new test is added, why not: N/A

User-visible / Behavior Changes

Visible assistant output now prefers final_answer text when both commentary and final-answer phases exist in one turn.
Commentary-only previews are no longer allowed to leak through as the final visible reply in the main embedded delivery path.
When commentary streamed first and final text arrives later, the final visible reply replaces the preview instead of duplicating it.
If no final-answer text exists, commentary/unphased fallback still works instead of producing an empty reply.
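
The preference order described above (final_answer, then commentary, then legacy/unphased) can be sketched as follows. This is a minimal illustration under assumed types; `pickVisibleText` and `TextBlock` are hypothetical names, not the helper the PR actually changed.

```typescript
// Hypothetical shapes for illustration; the real OpenClaw types may differ.
type Phase = "commentary" | "final_answer";

interface TextBlock {
  text: string;
  phase?: Phase; // undefined models legacy/unphased text
}

// Return the text a user should see for one assistant turn:
// final_answer if present, otherwise commentary, otherwise unphased text.
function pickVisibleText(blocks: TextBlock[]): string {
  const textFor = (phase: Phase | undefined): string =>
    blocks
      .filter((b) => b.phase === phase)
      .map((b) => b.text)
      .join("")
      .trim();

  const finalText = textFor("final_answer");
  if (finalText !== "") return finalText;

  const commentary = textFor("commentary");
  if (commentary !== "") return commentary;

  return textFor(undefined);
}

// Mixed turn: only the final answer is visible.
const mixed = pickVisibleText([
  { text: "Need answer yes can reinstall maybe already exists.", phase: "commentary" },
  { text: "Yes, you can reinstall it.", phase: "final_answer" },
]);

// Commentary-only turn: fallback avoids an empty reply.
const commentaryOnly = pickVisibleText([{ text: "Working on it.", phase: "commentary" }]);

// Legacy turn without phase metadata.
const legacyOnly = pickVisibleText([{ text: "Plain legacy reply." }]);
```

The fallback chain is what keeps older, unphased transcripts rendering as before while mixed-phase turns stop leaking commentary.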

Diagram (if applicable)

Before:
[mixed commentary + final_answer blocks]
  -> [stored/replayed as flattened assistant text]
  -> [visible extractor concatenates all text]
  -> [commentary leak and/or duplicate final reply]

After:
[mixed commentary + final_answer blocks]
  -> [phase preserved in storage/replay/partials]
  -> [visible extractor prefers final_answer]
  -> [commentary preview suppressed/replaced]
  -> [single intended final visible reply]

Security Impact (required)

New permissions/capabilities? No
Secrets/tokens handling changed? No
New/changed network calls? No
Command/tool execution surface changed? No
Data access scope changed? No
If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

OS: Ubuntu (local dev host)
Runtime/container: local Node/pnpm repo checkout
Model/provider: OpenAI WS / embedded subscribe path
Integration/channel (if any): embedded delivery path with Telegram/Discord-adjacent visible-output semantics
Relevant config (redacted): default OpenAI WS / embedded subscribe test harnesses; no special secrets required

Steps

Produce or replay an assistant turn containing both commentary and final_answer text blocks.
Observe stored/replayed assistant content and the user-visible delivery path.
Confirm whether visible output leaks commentary, duplicates final delivery, or collapses mixed phases.

Expected

Stored and replayed assistant content preserves phase boundaries.
User-visible extraction prefers final_answer when present.
Commentary previews are suppressed or replaced rather than duplicated.

Actual

Before this fix: mixed-phase turns could flatten into one visible assistant reply, leak commentary, or produce duplicate final delivery.
On this branch: phase separation is preserved through replay and delivery, and the targeted duplicate/leak regressions are covered by tests.

Evidence

Attach at least one:

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)
local transcript evidence showed assistant turns with both commentary and final_answer blocks in a single message
targeted regression slice passed: 8 suites / 164 tests
fresh independent review loop passed for storage/replay, stream-phase, visible-delivery, and holistic closure before branch submission

Human Verification (required)

What you personally verified (not just CI), and how:

Verified scenarios:
- inspected mixed-phase transcript evidence and confirmed commentary + final-answer coexistence in one assistant turn
- verified stored-message/replay, stream partial propagation, and visible delivery changes in the touched files
- ran the targeted regression slice and confirmed 8 suites / 164 tests passed
- confirmed the branch diff remains scoped to the intended 10 files
Edge cases checked:
- commentary leaking through text_end block replies
- final replacement at message_end after commentary streamed first
- legacy/unphased + phased replay collapse
- signature-only phased partials without top-level partial.phase
- prefix-extension and duplicate text_end regressions
What you did not verify:
- full-repo tsc --noEmit on this host (prior typecheck attempts were memory-constrained)
- a broader follow-up audit of other phase-blind helper paths such as src/agents/tools/sessions-helpers.ts

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

Backward compatible? Yes
Config/env changes? No
Migration needed? No
If yes, exact upgrade steps: N/A

Risks and Mitigations

Risk: other assistant-text consumers outside the main embedded delivery path may still use phase-blind flattening and show adjacent inconsistencies.
- Mitigation: this PR keeps scope tight to the verified main issue path and leaves sessions-helpers parity as explicit follow-up watchpoint rather than silently broadening behavior.
Risk: replay/delivery edge cases could regress around partial/final transitions.
- Mitigation: regression coverage now locks in mixed-phase replay splitting, phase-aware partial attribution, commentary suppression, final replacement, and duplicate/text-end edge cases.

Changed files

CHANGELOG.md (modified, +1/-0)
src/agents/openai-ws-message-conversion.ts (modified, +35/-13)
src/agents/openai-ws-stream.test.ts (modified, +490/-30)
src/agents/openai-ws-stream.ts (modified, +145/-12)
src/agents/pi-embedded-subscribe.handlers.messages.test.ts (modified, +169/-0)
src/agents/pi-embedded-subscribe.handlers.messages.ts (modified, +84/-32)
src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.does-not-append-text-end-content-is.test.ts (modified, +1/-0)
src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.does-not-duplicate-text-end-repeats-full.test.ts (modified, +4/-1)
src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.emits-block-replies-text-end-does-not.test.ts (modified, +468/-1)
src/agents/pi-embedded-utils.test.ts (modified, +76/-1)
src/agents/pi-embedded-utils.ts (modified, +108/-6)

PR #61457: fix(agents): suppress commentary partial visibility and buffer unphased early deltas

Repository: openclaw/openclaw
Author: 100yenadmin
State: closed | merged: False
Link: https://github.com/openclaw/openclaw/pull/61457

Description (problem / solution / changelog)

Summary

Addresses the two specific blockers from the maintainer review on #59643, building on the architectural direction of that PR while fixing the behaviors that blocked its merge.

Blocker 1: Commentary partials must stay suppressed

The merged fix from #61282 already suppresses commentary-phase output at the embedded subscribe delivery boundary on current main. This PR preserves that behavior and does not reintroduce the commentary-preview-then-replace pattern that was rejected in #59643.

Blocker 2: Late-map unphased deltas must be buffered

In src/agents/openai-ws-stream.ts, when response.output_text.delta arrives before response.output_item.added, the delta is now buffered instead of emitted as an unphased visible partial. When item metadata arrives:

final_answer buffered text is emitted with correct phase/signature
commentary buffered text stays suppressed
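
A rough sketch of this buffering behavior is shown below. The class name `PhaseGatedDeltas` and its methods are hypothetical; `onDelta` stands in for handling `response.output_text.delta` and `onItemAdded` for `response.output_item.added`, and the sketch assumes every item eventually reports one of the two phases.

```typescript
type Phase = "commentary" | "final_answer";

// Holds back text deltas until the phase of their output item is known.
class PhaseGatedDeltas {
  private readonly phases = new Map<string, Phase>();
  private readonly pending = new Map<string, string>();

  // Handle a text delta; returns the text that is safe to show right now.
  onDelta(itemId: string, delta: string): string {
    const phase = this.phases.get(itemId);
    if (phase === undefined) {
      // Phase not known yet: buffer instead of emitting an unphased partial.
      this.pending.set(itemId, (this.pending.get(itemId) ?? "") + delta);
      return "";
    }
    return phase === "final_answer" ? delta : "";
  }

  // Handle item metadata; flushes buffered text only for final_answer items.
  onItemAdded(itemId: string, phase: Phase): string {
    this.phases.set(itemId, phase);
    const buffered = this.pending.get(itemId) ?? "";
    this.pending.delete(itemId);
    return phase === "final_answer" ? buffered : "";
  }
}

const gate = new PhaseGatedDeltas();
// Deltas arrive before the item metadata, so nothing is visible yet.
const early = gate.onDelta("item_1", "Hel") + gate.onDelta("item_1", "lo");
// Metadata resolves to final_answer: the buffered text is released.
const flushed = gate.onItemAdded("item_1", "final_answer");
// Later deltas for the same item pass straight through.
const live = gate.onDelta("item_1", "!");
// A second item resolves to commentary: its buffered text stays hidden.
const hiddenDraft = gate.onDelta("item_2", "draft") + gate.onItemAdded("item_2", "commentary");
```

Keeping the buffer keyed by item id means one late-mapped item does not delay output for other items in the same response.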

Architectural improvements (from #59643 direction)

Phase-aware replay conversion in openai-ws-message-conversion.ts: stored/replayed assistant text respects per-block textSignature.phase and splits by phase instead of flattening
Phase attribution on WS partials via textSignature
Top-level assistant phase only set when response is consistently single-phase

What this does NOT cover (known follow-up scope)

src/agents/tools/sessions-helpers.ts still uses phase-blind extractAssistantText() — follow-up PR recommended
TUI render path (tui-formatters.ts, tui-stream-assembler.ts) still phase-blind — follow-up PR recommended
Webchat/dashboard history consumers may still render generic assistant text — follow-up PR recommended

These are explicitly out of scope per the maintainer direction to make the core WS → embedded subscribe path correct first.

Change Type

Bug fix

Scope

Gateway / orchestration
Integrations

Linked Issues

Closes #59150
Related #59643
Related #61282

Tests

Tests added for:

Phase-aware replay splitting
Mixed-phase stored message reconstruction
WS partial emission only after item-phase mapping exists
Suppression of buffered commentary deltas when late phase resolves to commentary

Credits

Architectural direction from @ringlochid in #59643. Review feedback from @steipete that identified the two specific blockers addressed here.

Changed files

src/agents/openai-ws-message-conversion.ts (modified, +34/-13)
src/agents/openai-ws-stream.test.ts (modified, +543/-0)
src/agents/openai-ws-stream.ts (modified, +173/-12)
src/agents/pi-embedded-subscribe.handlers.messages.test.ts (modified, +180/-0)
src/agents/pi-embedded-subscribe.handlers.messages.ts (modified, +14/-0)

PR #61463: fix(agents,gateway): phase-aware assistant text extraction — suppress OpenAI commentary leaks in sessions-helpers, TUI, and REST history

Repository: openclaw/openclaw
Author: 100yenadmin
State: closed | merged: True
Link: https://github.com/openclaw/openclaw/pull/61463

Description (problem / solution / changelog)

What this fixes (plain English)

Several places in the codebase that display assistant text were not aware of the phase system (commentary vs. final answer). This meant internal "thinking out loud" commentary could leak into user-visible surfaces like the TUI and HTTP/SSE session history endpoints. This PR makes those surfaces phase-aware, and hardens heartbeat session resolution against subagent key leaks.

Technical details

Follow-up to #59643 and #59150, which fixed phase separation in the core WS path. Post-merge audit found adjacent surfaces still using phase-blind extraction.

Surfaces fixed:

src/tui/tui-formatters.ts — assistant text extraction now uses extractAssistantVisibleText() to prefer final_answer over commentary phase blocks
src/infra/heartbeat-runner.ts — heartbeat session resolution now rejects subagent session keys, falling back to the main session key instead of leaking into subagent scopes
src/gateway/sessions-history-http.test.ts — regression test: REST history applies chat.history sanitization (strips NO_REPLY messages, preserves phase blocks)
src/tui/tui-formatters.test.ts — regression test: mixed commentary + final_answer blocks extract only the visible final answer

Explicitly deferred: extractAssistantTextForSilentCheck and buffered delta/final rendering — lower-confidence, more behaviorally sensitive.

Follow-up to #59643 and #59150
Follow-up issues: #61474, #61475, #61476, #61477, #61478
Companion PRs: #61529, #61528, #61527
Related PRs: #61855, #61816

Test plan

27/27 TUI formatter tests pass
Regression test for mixed commentary + final_answer in TUI extraction
HTTP history regression test (REST path shares chat.history sanitization, strips NO_REPLY messages)
SSE seq validation for NO_REPLY message stripping
sessions-history-http gateway integration tests have 2 pre-existing infra failures (also fail on upstream/main)

Changed files

CHANGELOG.md (modified, +1/-0)
src/infra/heartbeat-runner.subagent-session-guard.test.ts (added, +72/-0)
src/infra/heartbeat-runner.ts (modified, +24/-17)
src/tui/tui-formatters.test.ts (modified, +20/-0)
src/tui/tui-formatters.ts (modified, +25/-0)

PR #61481: fix(agents): harden OpenAI phase-aware visible text — suppress commentary partials, prevent empty final_answer fallback leak

Repository: openclaw/openclaw
Author: 100yenadmin
State: closed | merged: False
Link: https://github.com/openclaw/openclaw/pull/61481

Description (problem / solution / changelog)

Summary

fix phase-aware visible text extraction so an explicit final_answer block never falls back to commentary or legacy unphased text when it sanitizes to empty
suppress all commentary-phase partial streaming output regardless of whether extracted visible text is non-empty
keep session-history HTTP/SSE sanitization aligned with the hardened chat history path
add regression tests covering both leak paths and the session-history follow-through
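
The first point can be sketched as a stricter variant of visible-text selection: once a final_answer block exists, its sanitized text is returned even when empty. The names `pickVisibleTextStrict` and `sanitize` are hypothetical, the sanitizer here only trims whitespace as a stand-in for the real sanitization, and the fallback used when no final_answer block exists is an assumption carried over from #59643.

```typescript
// Hypothetical shapes for illustration; the real OpenClaw types may differ.
type Phase = "commentary" | "final_answer";

interface TextBlock {
  text: string;
  phase?: Phase; // undefined models legacy/unphased text
}

// Stand-in sanitizer (assumption): the real one likely strips more than whitespace.
function sanitize(text: string): string {
  return text.trim();
}

function pickVisibleTextStrict(blocks: TextBlock[]): string {
  const textFor = (phase: Phase | undefined): string =>
    sanitize(
      blocks
        .filter((b) => b.phase === phase)
        .map((b) => b.text)
        .join(""),
    );

  // An explicit final_answer block wins even if it sanitizes to "".
  const hasFinal = blocks.some((b) => b.phase === "final_answer");
  if (hasFinal) return textFor("final_answer");

  // No final_answer block at all: keep the earlier fallback order.
  const commentary = textFor("commentary");
  return commentary !== "" ? commentary : textFor(undefined);
}

// The final_answer block is whitespace only, so the visible text is empty
// and the commentary does not leak through as a fallback.
const emptyFinal = pickVisibleTextStrict([
  { text: "internal draft notes", phase: "commentary" },
  { text: "   ", phase: "final_answer" },
]);

// No final_answer block: commentary fallback still applies.
const noFinal = pickVisibleTextStrict([{ text: "Working on it.", phase: "commentary" }]);
```

The difference from a plain preference chain is the `hasFinal` check, which tests for the presence of the block rather than for non-empty text.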

Context

This hardens the merged #59643 behavior against two P1 leaks:

fixes #61474
fixes #61475

Related issues / bug family

related to #25592
related to #59536
related to #59918
related to #44213
related to #49438
related to #53960

Parent / sibling PRs

parent: #59643 — core phase-separation fix (merged)
sibling: #61463 — phase-aware extraction in sessions-helpers, TUI, and history paths

Remaining follow-ups from the same adversarial review

#61476 — replay splitting corrupts phase on mixed messages
#61477 — late-map buffering gates on key existence, not phase validity
#61478 — function-call replay silently loses malformed arguments

Related open PRs

#59920 — prefer terminal reply fields in CLI JSONL parser
#61151 — drop partialJson streaming artifacts from session history
#61337 — disable OpenAI tool-use pairing repair

Testing

npm exec -- node --no-maglev ./node_modules/vitest/vitest.mjs run --config vitest.config.ts src/agents/pi-embedded-utils.test.ts src/agents/pi-embedded-subscribe.handlers.messages.test.ts
npm exec -- node --no-maglev ./node_modules/vitest/vitest.mjs run --config vitest.config.ts src/gateway/sessions-history-http.test.ts

Changed files

.agents/skills/openclaw-parallels-smoke/SKILL.md (modified, +13/-0)
.agents/skills/openclaw-qa-testing/SKILL.md (added, +86/-0)
.agents/skills/openclaw-qa-testing/agents/openai.yaml (added, +4/-0)
.github/labeler.yml (modified, +4/-0)
.github/workflows/ci.yml (modified, +7/-1)
.github/workflows/control-ui-locale-refresh.yml (modified, +2/-2)
.github/workflows/openclaw-npm-release.yml (modified, +1/-1)
CHANGELOG.md (modified, +40/-12)
appcast.xml (modified, +248/-116)
apps/android/app/build.gradle.kts (modified, +2/-2)
apps/ios/Config/Version.xcconfig (modified, +3/-3)
apps/macos/Sources/OpenClaw/Resources/Info.plist (modified, +2/-2)
apps/macos/Sources/OpenClawProtocol/GatewayModels.swift (modified, +14/-0)
apps/shared/OpenClawKit/Sources/OpenClawKit/Resources/tool-display.json (modified, +23/-0)
apps/shared/OpenClawKit/Sources/OpenClawProtocol/GatewayModels.swift (modified, +14/-0)
docs/.generated/config-baseline.sha256 (modified, +4/-4)
docs/.generated/plugin-sdk-api-baseline.sha256 (modified, +2/-2)
docs/automation/tasks.md (modified, +5/-0)
docs/channels/discord.md (modified, +1/-1)
docs/channels/matrix.md (modified, +29/-5)
docs/cli/memory.md (modified, +43/-15)
docs/cli/update.md (modified, +3/-1)
docs/concepts/dreaming.md (modified, +121/-194)
docs/concepts/memory-qmd.md (modified, +17/-1)
docs/concepts/memory-search.md (modified, +9/-8)
docs/concepts/memory.md (modified, +12/-8)
docs/concepts/model-providers.md (modified, +2/-0)
docs/concepts/models.md (modified, +2/-0)
docs/docs.json (modified, +8/-1)
docs/gateway/configuration-reference.md (modified, +31/-12)
docs/help/faq.md (modified, +36/-0)
docs/help/testing.md (modified, +22/-0)
docs/install/updating.md (modified, +1/-0)
docs/plugins/architecture.md (modified, +1/-0)
docs/plugins/building-plugins.md (modified, +1/-0)
docs/plugins/manifest.md (modified, +76/-30)
docs/plugins/sdk-migration.md (modified, +11/-1)
docs/plugins/sdk-overview.md (modified, +22/-9)
docs/providers/bedrock-mantle.md (modified, +20/-7)
docs/providers/bedrock.md (modified, +29/-0)
docs/providers/comfy.md (added, +201/-0)
docs/providers/fal.md (modified, +2/-1)
docs/providers/google.md (modified, +30/-0)
docs/providers/index.md (modified, +4/-0)
docs/providers/minimax.md (modified, +29/-0)
docs/providers/models.md (modified, +4/-0)
docs/providers/openai.md (modified, +10/-2)
docs/providers/runway.md (added, +63/-0)
docs/providers/vydra.md (added, +123/-0)
docs/reference/memory-config.md (modified, +117/-98)
docs/tools/image-generation.md (modified, +21/-17)
docs/tools/index.md (modified, +14/-7)
docs/tools/lobster.md (modified, +11/-9)
docs/tools/music-generation.md (added, +208/-0)
docs/tools/plugin.md (modified, +1/-0)
docs/tools/slash-commands.md (modified, +1/-1)
docs/tools/video-generation.md (modified, +147/-84)
docs/web/control-ui.md (modified, +4/-1)
docs/web/dashboard.md (modified, +2/-0)
dream-diary-preview-v2.html (added, +399/-0)
dream-diary-preview-v3.html (added, +323/-0)
extensions/amazon-bedrock-mantle/api.ts (modified, +2/-0)
extensions/amazon-bedrock-mantle/bedrock-token-generator.d.ts (added, +6/-0)
extensions/amazon-bedrock-mantle/discovery.test.ts (modified, +101/-3)
extensions/amazon-bedrock-mantle/discovery.ts (modified, +64/-13)
extensions/amazon-bedrock-mantle/package.json (modified, +3/-0)
extensions/bluebubbles/src/accounts.ts (modified, +5/-1)
extensions/bluebubbles/src/monitor.ts (modified, +1/-1)
extensions/browser/src/browser/chrome.default-browser.test.ts (modified, +2/-6)
extensions/browser/src/browser/client-fetch.loopback-auth.test.ts (modified, +2/-6)
extensions/browser/src/browser/control-service.plugin-disabled.test.ts (modified, +2/-6)
extensions/browser/src/browser/profiles-service.test.ts (modified, +5/-8)
extensions/browser/src/browser/pw-tools-core.clamps-timeoutms-scrollintoview.test.ts (modified, +2/-6)
extensions/browser/src/browser/pw-tools-core.interactions.batch.test.ts (modified, +2/-6)
extensions/browser/src/browser/pw-tools-core.interactions.evaluate.abort.test.ts (modified, +2/-6)
extensions/browser/src/browser/pw-tools-core.interactions.set-input-files.test.ts (modified, +2/-4)
extensions/browser/src/browser/pw-tools-core.last-file-chooser-arm-wins.test.ts (modified, +2/-6)
extensions/browser/src/browser/pw-tools-core.screenshots-element-selector.test.ts (modified, +2/-6)
extensions/browser/src/browser/routes/agent.existing-session.test.ts (modified, +3/-8)
extensions/browser/src/browser/routes/basic.existing-session.test.ts (modified, +3/-8)
extensions/browser/src/browser/server-context.existing-session.test.ts (modified, +3/-8)
extensions/browser/src/browser/server-context.hot-reload-profiles.test.ts (modified, +6/-12)
extensions/browser/src/browser/server-context.remote-profile-tab-ops.fallback.test.ts (modified, +2/-6)
extensions/browser/src/browser/server-context.remote-profile-tab-ops.playwright.test.ts (modified, +2/-6)
extensions/browser/src/browser/server-lifecycle.test.ts (modified, +3/-8)
extensions/browser/src/browser/server.control-server.test-harness.ts (modified, +2/-1)
extensions/browser/src/browser/server.evaluate-disabled-does-not-block-storage.test.ts (modified, +3/-8)
extensions/browser/src/cli/browser-cli.test-support.ts (modified, +1/-1)
extensions/browser/src/cli/command-format.ts (modified, +1/-1)
extensions/browser/src/config/config.ts (modified, +1/-1)
extensions/browser/src/core-api.ts (modified, +25/-20)
extensions/browser/src/doctor-browser.ts (modified, +1/-1)
extensions/browser/src/gateway/auth.ts (modified, +1/-1)
extensions/browser/src/gateway/startup-auth.ts (modified, +1/-1)
extensions/browser/src/infra/errors.ts (modified, +1/-1)
extensions/browser/src/infra/fs-safe.ts (modified, +1/-1)
extensions/browser/src/infra/net/proxy-env.ts (modified, +1/-1)
extensions/browser/src/infra/net/ssrf.ts (modified, +1/-1)
extensions/browser/src/infra/path-guards.ts (modified, +1/-1)
extensions/browser/src/infra/ports.ts (modified, +1/-1)

PR #61506: fix: prefer final-answer text across web chat surfaces

Repository: openclaw/openclaw
Author: steipete
State: closed | merged: True
Link: https://github.com/openclaw/openclaw/pull/61506

Description (problem / solution / changelog)

Summary

prefer assistant final_answer text over commentary in web chat and previews
suppress commentary text when completed WS replies are persisted
share assistant phase helpers and patch the remaining chat.history phase-blind path

Verification

pnpm test src/shared/chat-message-content.test.ts src/agents/pi-embedded-utils.test.ts src/agents/pi-embedded-subscribe.handlers.messages.test.ts src/agents/openai-ws-stream.test.ts src/gateway/session-utils.fs.test.ts ui/src/ui/chat/message-extract.test.ts
pnpm build

Notes

pnpm check on fresh origin/main is currently red from unrelated typing drift outside this branch's touched surface.
follow-up to #59150 / #59643

Changed files

CHANGELOG.md (modified, +1/-0)
src/agents/openai-ws-message-conversion.ts (modified, +38/-48)
src/agents/openai-ws-stream.test.ts (modified, +46/-17)
src/agents/openai-ws-stream.ts (modified, +4/-15)
src/agents/pi-embedded-subscribe.handlers.messages.ts (modified, +3/-43)
src/agents/pi-embedded-utils.ts (modified, +5/-31)
src/gateway/server-methods/chat.ts (modified, +2/-29)
src/gateway/server.chat.gateway-server-chat.test.ts (modified, +23/-0)
src/gateway/session-utils.fs.test.ts (modified, +50/-0)
src/gateway/session-utils.fs.ts (modified, +10/-0)
src/shared/chat-message-content.test.ts (modified, +111/-1)
src/shared/chat-message-content.ts (modified, +146/-0)
ui/src/ui/chat/message-extract.test.ts (modified, +35/-0)
ui/src/ui/chat/message-extract.ts (modified, +3/-1)

RAW_BUFFER

Symptoms

Commentary leak

Users can receive text that looks like internal drafting/commentary, for example:

Need answer yes can reinstall maybe already exists.

Duplicate replies

A single turn can produce two visible replies instead of one:

one normal assistant reply
one additional reply associated with delivery-mirror

This appears to be two separate outbound messages, not one duplicated string.

Evidence

Confirmed by transcript

Session transcript:

/root/.openclaw/agents/main/sessions/2607bbec-bdf2-4920-ac61-fc8ef586c4cc.jsonl

Findings:

commentary-phase text appeared inside assistant-visible content
duplicate replies were recorded as two separate assistant messages
the second message often appeared as:
- provider=openclaw
- model=delivery-mirror

Confirmed by code inspection

In compiled dist code:

Commentary path

file: dist/auth-profiles-B5ypC5S-.js
function: buildAssistantMessageFromResponse(response, modelInfo)

item.phase is parsed, but commentary-phase output_text still appears to be collected into final assistant content.

Delivery-mirror path

file: dist/auth-profiles-B5ypC5S-.js
functions:
- appendAssistantMessageToSessionTranscript(params)
- deliverOutboundPayloads(...)
- executeSendAction(params)
- sendMessage$1(params)

delivery-mirror is explicitly generated as an assistant-side transcript message after outbound send success.

Root cause assessment

Confirmed / strongly supported

- commentary leakage appears to happen in final response assembly
  - buildAssistantMessageFromResponse(...) parses item.phase but still appears to include commentary-phase text in final assistant content
- duplicate reply behavior involves a distinct delivery-mirror assistant message path
  - this is not just a Telegram rendering artifact

High-probability inference

The duplicate visible reply issue is likely caused by missing turn-level mutual exclusion between:

user-visible tool send
delivery-mirror append
normal final assistant reply delivery

We have not yet fully proven the exact final outbound dispatch function in every case.

Suggested fix directions

1. Commentary leak

In buildAssistantMessageFromResponse(...):

exclude items where phase === "commentary" from final assistant-visible content

2. Duplicate replies

Introduce turn-level suppression logic:

if a turn already performed a user-visible tool send
suppress the normal final assistant reply for that same turn

This would avoid double visible delivery without necessarily removing transcript mirror support.
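
The suppression rule above could be sketched roughly as follows. This is a hypothetical illustration, not OpenClaw's implementation: `TurnDeliveryState`, `recordToolSend`, and `decideFinalReply` are invented names, and the sketch assumes one state object is created per turn and shared by the tool-send and final-reply paths.

```typescript
// Hypothetical per-turn state; names are illustrative, not OpenClaw's API.
interface TurnDeliveryState {
  toolSendDelivered: boolean; // a tool already sent a user-visible message
  finalReplyDelivered: boolean; // the normal final reply already went out
}

type DeliveryDecision = "deliver" | "suppress";

function createTurnDeliveryState(): TurnDeliveryState {
  return { toolSendDelivered: false, finalReplyDelivered: false };
}

/** Record that a tool performed a user-visible send during this turn. */
function recordToolSend(state: TurnDeliveryState): void {
  state.toolSendDelivered = true;
}

/**
 * Decide whether the normal final assistant reply should be sent.
 * The reply is suppressed if a tool send or an earlier final reply has
 * already produced a visible message in this turn.
 */
function decideFinalReply(state: TurnDeliveryState): DeliveryDecision {
  if (state.toolSendDelivered || state.finalReplyDelivered) {
    return "suppress";
  }
  state.finalReplyDelivered = true;
  return "deliver";
}
```

The sketch only gates outbound delivery. Whether a delivery-mirror entry is still appended to the transcript is a separate decision, so transcript mirror support can stay in place.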

Notes / impact

This investigation is intentionally conservative:

transcript evidence is direct
code-path assessment is based on compiled dist artifacts
some final-delivery details still need deeper tracing

User impact is significant because these issues affect:

reply integrity
trust
transcript cleanliness
duplicate visible messages

extent analysis

TL;DR

The most likely fix involves modifying the buildAssistantMessageFromResponse function to exclude commentary-phase text and introducing turn-level suppression logic to prevent duplicate replies.

Guidance

Review the buildAssistantMessageFromResponse function to ensure it correctly filters out commentary-phase text by checking the item.phase property and excluding items where phase === "commentary".
Introduce turn-level suppression logic to prevent duplicate replies by checking if a turn has already performed a user-visible tool send and suppressing the normal final assistant reply for that turn if necessary.
Verify the changes by inspecting the session transcript and checking for commentary leakage and duplicate replies.
Consider adding logging around final response assembly and outbound dispatch to help trace any remaining cases.
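
As one way to carry out the transcript check from the guidance above, the sketch below scans JSONL text for assistant entries recorded with provider=openclaw and model=delivery-mirror. The entry shape (`role`, `provider`, `model` as top-level fields) is an assumption; the real session transcript schema may differ, and `findDeliveryMirrorLines` is a hypothetical helper name.

```typescript
// Hypothetical transcript entry shape; the real JSONL schema may differ.
interface TranscriptEntry {
  role?: string;
  provider?: string;
  model?: string;
}

/**
 * Returns the 1-based line numbers of assistant entries recorded with
 * provider=openclaw and model=delivery-mirror.
 */
function findDeliveryMirrorLines(jsonl: string): number[] {
  const hits: number[] = [];
  jsonl.split("\n").forEach((line, index) => {
    const trimmed = line.trim();
    if (trimmed.length === 0) {
      return; // ignore blank lines
    }
    let parsed: unknown;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      return; // skip lines that are not valid JSON
    }
    if (typeof parsed !== "object" || parsed === null) {
      return; // skip JSON values that are not objects
    }
    const entry = parsed as TranscriptEntry;
    if (
      entry.role === "assistant" &&
      entry.provider === "openclaw" &&
      entry.model === "delivery-mirror"
    ) {
      hits.push(index + 1);
    }
  });
  return hits;
}
```

Running this over a transcript before and after a fix gives a quick signal of whether delivery-mirror messages still accompany normal assistant replies in the same turn.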

Example

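function buildAssistantMessageFromResponse(response, modelInfo) {
  // Keep only items that are not commentary-phase, so internal drafting
  // text is never collected into the assistant-visible content.
  // (response.items is illustrative; the actual field name was not confirmed.)
  const visibleItems = response.items.filter((item) => item.phase !== "commentary");
  // ... build the assistant message from visibleItems and modelInfo
}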

Notes

The suggested fix directions are based on the provided evidence and code inspection, but further investigation may be necessary to fully resolve the issues. The user impact is significant, and addressing these issues will help maintain reply integrity, trust, and transcript cleanliness.

Recommendation

Modify buildAssistantMessageFromResponse to exclude commentary-phase text and introduce turn-level suppression logic; together these changes are likely to fix both the commentary leak and the duplicate replies.
