openclaw - ✅ (Solved) Fix [Bug]: Control UI duplicates assistant replies via assistant-media delivery path [3 pull requests, 1 participant]

GitHub stats
openclaw/openclaw#73956 · Fetched 2026-04-30 06:30:20
Comments: 0 · Participants: 1 · Timeline events: 5 · Reactions: 0
Timeline (top): cross-referenced ×4, labeled ×1

Control UI/WebChat is showing duplicate assistant replies. Backend session history shows a normal final assistant message followed by a second gateway-injected assistant message with idempotencyKey ending in :assistant-media, repeating the same text.

Root Cause

The agent-run assistant-media transcript path gated on the presence of mediaUrl/mediaUrls and then fell back to visible text when those references were stale or could not be embedded. That stored a second gateway-injected assistant message, with an idempotencyKey ending in :assistant-media, repeating the text of the model's final answer (see the PR fix notes).

Fix / Workaround

Attempted mitigation: set messages.tts.auto = "off" and avoid TTS directives. This did not stop the duplicate assistant-media messages in the observed session, so the fix has to come from the gateway (see the PR fix notes).

PR fix notes

PR #73962: fix(gateway): skip text-only assistant media transcript

Description (problem / solution / changelog)

Summary

  • Fixes #73956.
  • Root cause: the webchat assistant-media transcript path treated any media-bearing reply payload as appendable, then fell back to visible text when stale or unavailable media could not be embedded. That could store a second gateway-injected assistant message with the same text as the model answer.
  • The fix only appends an assistant-media transcript message after the gateway has built persisted non-text media content, so text-only fallbacks are left to the normal model transcript.
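The guard described in this summary can be sketched roughly as follows. This is not the code from chat.ts, which the page does not show. `ContentBlock`, `TranscriptMessage`, `hasNonTextContent`, and `maybeAppendAssistantMedia` are hypothetical names, and using a run id as the key prefix is an assumption (the report only shows `<uuid>:assistant-media`).

```typescript
// Hypothetical content-block shape; the real gateway types are not shown in the source.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "audio"; url: string }
  | { type: "image"; url: string };

interface TranscriptMessage {
  role: "assistant";
  idempotencyKey: string;
  content: ContentBlock[];
}

// True only when at least one block carries something other than text.
function hasNonTextContent(blocks: ContentBlock[]): boolean {
  return blocks.some((block) => block.type !== "text");
}

// Append the supplemental message only when persisted non-text media exists.
// Returns the appended message, or null when the append is skipped.
function maybeAppendAssistantMedia(
  transcript: TranscriptMessage[],
  runId: string, // assumption: the key prefix is some per-run identifier
  builtContent: ContentBlock[],
): TranscriptMessage | null {
  if (!hasNonTextContent(builtContent)) {
    // Text-only fallback: the normal model transcript already holds this text.
    return null;
  }
  const message: TranscriptMessage = {
    role: "assistant",
    idempotencyKey: `${runId}:assistant-media`,
    content: builtContent,
  };
  transcript.push(message);
  return message;
}
```

With this shape, a payload whose media failed to embed yields only text blocks and is skipped, while a payload with a resolved audio or image block is still appended.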

Why This Is Safe

  • Successful audio/image assistant-media replies still append when non-text content is present.
  • Non-agent command TTS behavior is unchanged; this guard is only in the agent-run assistant-media supplement path.
  • Security and runtime controls are unchanged: local media access policy, trusted media checks, sensitive-media filtering, idempotency keys, auth scopes, and transcript emission boundaries remain the same.

Tests

  • git diff --check
  • pnpm test src/gateway/server-methods/chat.directive-tags.test.ts -- --reporter=verbose
  • pnpm check:changed

Out Of Scope

  • No changes to TTS enablement/config defaults.
  • No changes to assistant-media HTTP serving or Control UI playback.
  • No transcript migration or cleanup for already-stored duplicate messages.

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/gateway/server-methods/chat.directive-tags.test.ts (modified, +48/-0)
  • src/gateway/server-methods/chat.ts (modified, +4/-4)

PR #74501: Fix duplicate Control UI assistant-media replies

Description (problem / solution / changelog)

Summary

  • Problem: Control UI/WebChat could persist a second gateway-injected assistant message from the :assistant-media path when a media-bearing agent payload no longer resolved to playable media.
  • Why it matters: stale or non-playable TTS/media references could repeat the final assistant text in the transcript, making the assistant appear to reply twice.
  • What changed: the agent-run assistant-media transcript append now requires resolved media content before writing the supplemental transcript message.
  • What did NOT change (scope boundary): command/non-agent media replies still preserve visible text when no model transcript exists.
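The scope boundary in the last bullet can be illustrated with a small decision function. This is a simplified sketch under stated assumptions, not the gateway's implementation: `SupplementInput`, `SupplementPlan`, and `planSupplement` are hypothetical names, and the real path may also attach text alongside media.

```typescript
// Hypothetical inputs; field names are illustrative, not the gateway's real API.
interface SupplementInput {
  isAgentRun: boolean; // true when a model transcript already records the text
  visibleText: string;
  resolvedMediaUrls: string[]; // media references that resolved to playable content
}

type SupplementPlan =
  | { kind: "skip" }
  | { kind: "media"; mediaUrls: string[] }
  | { kind: "text"; text: string };

function planSupplement(input: SupplementInput): SupplementPlan {
  if (input.resolvedMediaUrls.length > 0) {
    // Resolved media is worth a supplemental message on either path.
    return { kind: "media", mediaUrls: input.resolvedMediaUrls };
  }
  if (input.isAgentRun) {
    // Agent run: the final text is already in the model transcript, so skip.
    return { kind: "skip" };
  }
  // Command / non-agent reply: no model transcript exists, keep the visible text.
  return input.visibleText.length > 0
    ? { kind: "text", text: input.visibleText }
    : { kind: "skip" };
}
```

The asymmetry is the point: skipping is safe only where another transcript entry already carries the text.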

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #73956
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: the agent-run assistant-media append path gated on the presence of mediaUrl/mediaUrls, then fell back to text when those media references failed to resolve to embeddable content.
  • Missing detection / guardrail: there was coverage for resolved TTS audio not duplicating text, but not for stale/non-playable media references with visible text.
  • Contributing context: the supplemental assistant-media message is only needed when it can carry media content that the normal model transcript does not already contain.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/gateway/server-methods/chat.directive-tags.test.ts
  • Scenario the test should lock in: an agent-run final payload with text and a stale local audio path must not append a second assistant transcript message.
  • Why this is the smallest reliable guardrail: the test exercises the WebChat chat.send transcript append seam where the duplicate was created.
  • Existing test that already covers this (if any): resolved auto-TTS media is covered by the adjacent audio-only transcript test.
  • If no new test is added, why not: N/A
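The scenario in this test plan can be approximated outside the repository with a stand-in resolver. The sketch below is not the test added to chat.directive-tags.test.ts, which the page does not show. `resolvePlayableMedia` and `supplementalMessages` are hypothetical helpers, the stale path is made up for illustration, and "resolves" is reduced here to "the local file exists".

```typescript
import { existsSync } from "node:fs";

interface FinalPayload {
  text: string;
  mediaUrls: string[];
}

// Stand-in for the gateway's media resolution: keep only local paths that exist.
function resolvePlayableMedia(mediaUrls: string[]): string[] {
  return mediaUrls.filter((path) => existsSync(path));
}

// Returns the supplemental transcript entries an agent run would append:
// one entry listing the resolved media, or nothing when no media resolves.
function supplementalMessages(payload: FinalPayload): string[][] {
  const resolved = resolvePlayableMedia(payload.mediaUrls);
  return resolved.length > 0 ? [resolved] : [];
}

// Agent-run final payload with visible text plus a stale local audio path
// (the path is illustrative and assumed not to exist on the machine).
const stalePayload: FinalPayload = {
  text: "Text-only test: one clean reply, no TTS, no media, no tool narration.",
  mediaUrls: ["/nonexistent/openclaw-example/stale-audio.mp3"],
};

const appended = supplementalMessages(stalePayload);
if (appended.length !== 0) {
  throw new Error("stale media must not append a second assistant message");
}
console.log("supplemental messages appended:", appended.length);
```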

User-visible / Behavior Changes

Control UI/WebChat should no longer show duplicate text-only assistant replies from stale or non-playable assistant-media/TTS supplements.

Diagram (if applicable)

Before:
agent final text + stale media ref -> assistant-media fallback -> duplicate text transcript message

After:
agent final text + stale media ref -> no resolved media -> no supplemental assistant-media transcript message

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: Windows local dev environment
  • Runtime/container: Node/pnpm local repo checkout
  • Model/provider: Not provider-specific
  • Integration/channel (if any): Control UI / WebChat
  • Relevant config (redacted): agent-run payload with text plus stale local audio media path

Steps

  1. Run WebChat chat.send through the gateway seam with an agent run started.
  2. Emit a final reply payload containing visible text and a stale local audio mediaUrl/mediaUrls.
  3. Inspect emitted assistant transcript updates.

Expected

  • No supplemental :assistant-media assistant transcript message is appended when no playable media resolves.

Actual

  • Before this change, the assistant-media path could append a text-only duplicate of the normal final answer.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What I personally verified:

  • pnpm test src/gateway/server-methods/chat.directive-tags.test.ts -- --reporter=verbose
  • pnpm test src/gateway/server-methods/chat.directive-tags.test.ts -t "does not persist agent media supplements" -- --reporter=verbose
  • pnpm check:changed

Edge cases checked:

  • Resolved auto-TTS media still appends audio-only assistant media content via the existing adjacent test.

What I did not verify:

  • Full pnpm test suite.
  • Browser-driven Control UI manual session.

Note:

  • pnpm test:changed currently fails on unrelated Windows path expectations in src/crestodian/operations.test.ts and src/crestodian/rescue-message.test.ts (/tmp/work vs C:\tmp\work).

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: a malformed media payload might no longer create a text-only supplemental transcript entry.
    • Mitigation: agent final text is already represented by the normal model transcript; command/non-agent media fallback behavior is unchanged.

Superseded by #74502 from the non-codex branch requested by the author.

Changed files

  • src/gateway/server-methods/chat.directive-tags.test.ts (modified, +42/-0)
  • src/gateway/server-methods/chat.ts (modified, +3/-0)

PR #74502: Fix duplicate Control UI assistant-media replies

Supersedes #74501, which used the previous codex-prefixed branch name. The description, test plan, and verification notes are the same as for #74501 above.

Changed files

  • src/gateway/server-methods/chat.directive-tags.test.ts (modified, +42/-0)
  • src/gateway/server-methods/chat.ts (modified, +3/-0)

Code Example

Observed in Control UI / WebChat.

Reproduction sequence:
1. Enable TTS / test assistant audio.
2. Notice duplicate assistant replies.
3. Set `messages.tts.auto = "off"`.
4. Send a text-only reply with no media, no TTS directive, and no tool narration.
5. Duplicate still appears.

Backend session history showed:

Normal final answer:
- api: openai-codex-responses
- provider: openai-codex
- model: gpt-5.5
- textSignature phase: final_answer

Duplicate message:
- api: openai-responses
- provider: openclaw
- model: gateway-injected
- idempotencyKey: <uuid>:assistant-media
- same text repeated
- appears roughly 6–7 seconds after the original reply

Example duplicate text:
“Text-only test: one clean reply, no TTS, no media, no tool narration.”

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

Control UI/WebChat is showing duplicate assistant replies. Backend session history shows a normal final assistant message followed by a second gateway-injected assistant message with idempotencyKey ending in :assistant-media, repeating the same text.

Steps to reproduce

After enabling and then disabling TTS, assistant replies continue to appear twice in Control UI.

This reproduces even with:

messages.tts.auto = "off"

Backend session history shows the duplicate is not just a UI-only visual replay. A second assistant message is stored after the normal final answer.

Observed pattern:

  1. Normal assistant final answer is stored.
  2. Roughly 6–7 seconds later, a second assistant message is stored with:
    • api: openai-responses
    • provider: openclaw
    • model: gateway-injected
    • idempotencyKey ending in :assistant-media
  3. The duplicate repeats the same text as the final answer.
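The observed pattern can be checked mechanically against an exported session history. The sketch below is a hypothetical helper, not part of OpenClaw: `StoredMessage` uses only field names mentioned in this report, and the actual stored schema may differ.

```typescript
// Hypothetical shape of a stored session message (assumption, not the real schema).
interface StoredMessage {
  role: "user" | "assistant";
  text: string;
  model?: string;
  idempotencyKey?: string;
}

// Returns the indexes of assistant-media messages whose text repeats the
// nearest preceding assistant message in the same turn.
function findAssistantMediaDuplicates(history: StoredMessage[]): number[] {
  const duplicates: number[] = [];
  let lastAssistantText: string | null = null;
  for (let index = 0; index < history.length; index++) {
    const message = history[index];
    if (message.role !== "assistant") {
      lastAssistantText = null; // a new user turn resets the comparison
      continue;
    }
    const injected = message.idempotencyKey?.endsWith(":assistant-media") ?? false;
    if (injected && lastAssistantText !== null && message.text === lastAssistantText) {
      duplicates.push(index);
      continue;
    }
    lastAssistantText = message.text;
  }
  return duplicates;
}

const sampleHistory: StoredMessage[] = [
  { role: "user", text: "Send a text-only reply." },
  { role: "assistant", text: "One clean reply.", model: "gpt-5.5" },
  {
    role: "assistant",
    text: "One clean reply.",
    model: "gateway-injected",
    idempotencyKey: "example-uuid:assistant-media",
  },
];

const duplicateIndexes = findAssistantMediaDuplicates(sampleHistory); // [2]
console.log(duplicateIndexes);
```

A non-empty result indicates that the duplicate is stored in the backend rather than being a UI-only replay.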

This seems related to the assistant-media / TTS delivery path, but continues even after TTS is disabled.

Expected behavior

Each assistant reply should appear once. The assistant-media delivery path should not replay or fallback-send duplicate text when there is no playable media to deliver, especially when TTS is disabled.

Actual behavior

  • Affected: Control UI / WebChat conversations
  • Severity: Medium annoyance / usability bug
  • Frequency: Every assistant reply once triggered
  • Consequence: Chat becomes noisy and confusing; it looks like the assistant is repeating itself.

OpenClaw version

OpenClaw 2026.4.26 (be8c246)

Operating system

Ubuntu 26

Install method

npm global

Model

openai-codex/gpt-5.5

Provider / routing chain

openclaw -> openai-codex via oauth

Additional provider/model setup details

Related but distinct from #73898, which requests auto-play support for assistant TTS audio.

This bug is about duplicate text replies being created through the assistant-media path even after TTS is disabled.

Current mitigation: set messages.tts.auto = "off" and avoid TTS directives, but this did not stop the duplicate assistant-media messages in the observed session.

Logs, screenshots, and evidence

Observed in Control UI / WebChat.

Reproduction sequence:
1. Enable TTS / test assistant audio.
2. Notice duplicate assistant replies.
3. Set `messages.tts.auto = "off"`.
4. Send a text-only reply with no media, no TTS directive, and no tool narration.
5. Duplicate still appears.

Backend session history showed:

Normal final answer:
- api: openai-codex-responses
- provider: openai-codex
- model: gpt-5.5
- textSignature phase: final_answer

Duplicate message:
- api: openai-responses
- provider: openclaw
- model: gateway-injected
- idempotencyKey: <uuid>:assistant-media
- same text repeated
- appears roughly 6–7 seconds after the original reply

Example duplicate text:
“Text-only test: one clean reply, no TTS, no media, no tool narration.”

Impact and severity

  • Affected: Control UI / WebChat conversations
  • Severity: Medium annoyance / usability bug
  • Frequency: Every assistant reply once triggered
  • Consequence: Chat becomes noisy and confusing; it looks like the assistant is repeating itself.

Additional information

No response

Extent analysis

TL;DR

The duplicate is stored by the gateway's agent-run assistant-media transcript path, which fell back to visible text when a media reference could not be resolved. The linked PRs guard that append so it happens only when resolved non-text media is present. Setting messages.tts.auto = "off" did not stop the duplicates in the observed session.

Guidance

  • Check the backend session history for an assistant message with model gateway-injected and an idempotencyKey ending in :assistant-media that repeats the preceding final answer. This confirms the duplicate is stored, not a UI-only replay.
  • Do not rely on disabling TTS as a fix: the report shows duplicates continuing with messages.tts.auto = "off".
  • Look for a build that includes one of the linked fixes (PR #73962, or PR #74502, which supersedes #74501). The issue was reported against OpenClaw 2026.4.26 (be8c246).
  • Duplicates already stored in a transcript are not cleaned up by PR #73962, which lists transcript migration and cleanup as out of scope.

Example

The page does not include the diff. According to the PR file lists, the code change is confined to src/gateway/server-methods/chat.ts, with a regression test in src/gateway/server-methods/chat.directive-tags.test.ts.

Notes

This is a logic bug in the gateway rather than a configuration problem: the append was gated on the presence of mediaUrl/mediaUrls, not on whether those references resolved to embeddable content.

Recommendation

Prefer the gateway-side fix from the linked PRs over configuration workarounds, since the only reported workaround (disabling TTS) did not stop the duplicates.
