openclaw - ✅ (Solved) Fix [Bug]: /tts audio succeeds but no audio is shown in Control UI / TUI [2 pull requests, 1 comment, 2 participants]

GitHub stats
openclaw/openclaw#61564 (fetched 2026-04-08 02:57:17)
Comments: 1
Participants: 2
Timeline events: 6
Reactions: 0
Timeline (top): cross-referenced ×2, labeled ×2, closed ×1, commented ×1

/tts audio appears to execute successfully, but in the Control UI / TUI the command disappears and no audio player, attachment, or download is shown.

The same command works correctly from WhatsApp, where the audio file/voice note is created and delivered.

This suggests the TTS backend/provider path is working, but the Control UI / TUI presentation/rendering path is broken or incomplete.

Error Message

  • No visible error is shown in the UI

Root Cause

Per the linked PRs, the chat.send non-agent path built combinedReply from payload.text only. /tts audio replies are media-only, so the Control UI received a chat final event with no assistant message and rendered nothing.

External channel delivery (e.g. WhatsApp) was unaffected, which is why the same command works there.

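Fix Action

Fix / Workaround

The issue report narrows the fault to the presentation path:

  • TTS backend works
  • OpenAI provider works
  • slash command dispatch works
  • WhatsApp delivery works
  • Control UI / TUI rendering/presentation does not

The linked PRs (#61593, #61598) embed readable local audio files from final reply payloads as base64 content blocks, persist the same structured content via transcript inject, and render <audio controls> in the Control UI chat bubble.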

PR fix notes

PR #61593: fix(gateway): show /tts audio in Control UI webchat

Description (problem / solution / changelog)

Summary

  • Problem: /tts audio returned a media-only final reply; webchat completion only aggregated text, so Control UI received a chat final event with no assistant message and nothing rendered.
  • Why it matters: TTS worked on external channels (e.g. WhatsApp) but webchat/TUI showed an empty turn.
  • What changed: Embed readable local audio files from final reply payloads as base64 content blocks, persist the same structured content via transcript inject, and render <audio controls> in the Control UI chat bubble.
  • What did NOT change: External channel delivery, TTS providers, or slash command parsing.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #61564
  • Related #
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: chat.send non-agent path built combinedReply from payload.text only; /tts audio replies are media-only.
  • Missing detection / guardrail: No handling for final payloads with local mediaUrl when the agent run never started.
  • Contributing context (if known): Regression-style gap between external channel delivery and internal webchat broadcast.
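
The root cause above can be illustrated with a minimal sketch. The field names text and mediaUrl come from the PR notes; the ReplyPayload shape and both function names are hypothetical and are not taken from the gateway source. It shows why a text-only aggregation yields an empty reply for a media-only final, and what a simple guardrail might look like.

```typescript
// Hypothetical payload shape; only the field names text and mediaUrl
// appear in the PR notes.
interface ReplyPayload {
  text?: string;
  mediaUrl?: string;
}

// Text-only aggregation, as the root cause describes it:
// payloads without text contribute nothing.
function combineTextOnly(payloads: ReplyPayload[]): string {
  return payloads
    .map((p) => p.text?.trim() ?? "")
    .filter((t) => t.length > 0)
    .join("\n\n");
}

// Illustrative guardrail: detect finals that carry media but no text.
function isMediaOnlyFinal(payloads: ReplyPayload[]): boolean {
  return (
    combineTextOnly(payloads) === "" &&
    payloads.some((p) => Boolean(p.mediaUrl))
  );
}
```

With a single payload such as { mediaUrl: "/tmp/tts.mp3" }, combineTextOnly returns an empty string, which matches the "no chat message" outcome described above, while isMediaOnlyFinal flags the case so it can be handled separately.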

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/gateway/server-methods/chat-webchat-media.test.ts
  • Scenario the test should lock in: Local mediaUrl on a reply payload produces an embedded audio content block with a data: URL.
  • Why this is the smallest reliable guardrail: Exercises the new embedding helper without a full gateway WS harness.
  • Existing test that already covers this (if any): None for this path.
  • If no new test is added, why not: N/A (test added).

User-visible / Behavior Changes

  • Control UI webchat shows a short “TTS audio” label (when there is no text) plus an HTML5 audio player for /tts audio and similar media-only finals.
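
As a rough illustration of this behavior, the sketch below builds the bubble markup as a string. The fallback label "TTS audio" and the use of an audio element with controls come from the PR notes; the function names, the chat-bubble class, and the string-based rendering are assumptions made for the example, and the real change lives in ui/src/ui/chat/grouped-render.ts.

```typescript
// Escape text for safe use in HTML text nodes and attribute values.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: returns the markup for an assistant message that
// carries audio. The class name "chat-bubble" is illustrative only.
function renderAudioBubble(text: string, audioSrc: string): string {
  // Fall back to the short label when the reply has no text.
  const label = text.trim().length > 0 ? text.trim() : "TTS audio";
  return (
    `<div class="chat-bubble">` +
    `<p>${escapeHtml(label)}</p>` +
    `<audio controls src="${escapeHtml(audioSrc)}"></audio>` +
    `</div>`
  );
}
```

Escaping both the label and the src value keeps reply text from being interpreted as markup.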

Diagram (if applicable)

Before:
/tts audio -> final payloads with mediaUrl -> combinedReply "" -> no chat message

After:
/tts audio -> embed local file in assistant content -> chat final + transcript -> audio controls
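
The "embed local file in assistant content" step can be sketched as follows, assuming the file bytes have already been read. The AudioContentBlock shape and the function names are hypothetical; only the idea of a base64 data: URL comes from the PR notes. The encoder uses the global btoa so that the sketch has no imports.

```typescript
// Hypothetical content-block shape, for illustration only.
interface AudioContentBlock {
  type: "audio";
  mimeType: string;
  dataUrl: string;
}

// Encode raw bytes as base64 via a binary string and the global btoa.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Wrap the bytes in a data: URL so the UI can play them without a
// second request to the gateway.
function buildAudioBlock(bytes: Uint8Array, mimeType: string): AudioContentBlock {
  return {
    type: "audio",
    mimeType,
    dataUrl: `data:${mimeType};base64,${toBase64(bytes)}`,
  };
}
```

For example, buildAudioBlock(new Uint8Array([1, 2, 3]), "audio/mpeg").dataUrl evaluates to "data:audio/mpeg;base64,AQID".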

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: any (gateway host)
  • Runtime/container: local gateway
  • Model/provider: n/a for slash TTS path
  • Integration/channel (if any): Control UI webchat

Steps

  1. Open Control UI chat for a session.
  2. Run /tts audio hello (with TTS configured).

Expected

  • Assistant message appears with playable audio.

Actual (before fix)

  • Turn completed with no visible assistant content.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: pnpm check, pnpm build, pnpm test src/gateway/server-methods/chat-webchat-media.test.ts, pnpm test src/gateway/server-methods/chat.inject.parentid.test.ts
  • Edge cases checked: remote https:// media URLs skipped; oversized files skipped (15MB cap).
  • What I did not verify: Live browser session against a running gateway (please spot-check).
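
The two edge cases above (remote URLs skipped, oversized files skipped) amount to a small decision function. The sketch below is one possible shape for it, not the PR's implementation: the names are hypothetical, the http(s) check is an assumed definition of "remote", and 15MB is interpreted as 15 MiB because the PR notes do not say which unit is meant.

```typescript
// Assumption: "15MB" is read as 15 MiB; the PR notes do not specify.
const MAX_EMBED_BYTES = 15 * 1024 * 1024;

type EmbedDecision = "embed" | "skip-remote" | "skip-too-large";

// Illustrative rule: anything with an http(s) scheme counts as remote.
function isRemoteUrl(mediaUrl: string): boolean {
  return /^https?:\/\//i.test(mediaUrl);
}

// Decide whether a media reference should be embedded in the chat message.
function decideEmbed(mediaUrl: string, sizeBytes: number): EmbedDecision {
  if (isRemoteUrl(mediaUrl)) return "skip-remote";
  if (sizeBytes > MAX_EMBED_BYTES) return "skip-too-large";
  return "embed";
}
```

Returning a reason rather than a boolean makes it easier to log why a given reply produced no audio player.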

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps:

Risks and Mitigations

  • Large audio could bloat WebSocket payloads; mitigated with a 15MB read cap and skipping non-local URLs.
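
To put the payload-size risk in numbers: base64 encodes every 3 bytes as 4 characters, so an embedded file grows by about one third. The short calculation below assumes the cap means 15 MiB, which the PR notes do not confirm; the function name is illustrative.

```typescript
// Length in characters of the padded base64 encoding of `byteLength` bytes.
function base64EncodedLength(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

// Assumption: the 15MB cap is 15 MiB.
const capBytes = 15 * 1024 * 1024; // 15728640 bytes
const worstCaseChars = base64EncodedLength(capBytes); // 20971520 characters
```

Under that assumption, a file right at the cap adds roughly 20 MiB of base64 text to a single WebSocket message, before any JSON framing.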

Additional notes

  • Includes narrow as never casts in extensions/diffs/src/language-hints.test.ts so pnpm check (tsgo) accepts invalid language literals used intentionally by those tests.

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • extensions/diffs/src/language-hints.test.ts (modified, +3/-3)
  • src/gateway/server-methods/chat-transcript-inject.ts (modified, +36/-2)
  • src/gateway/server-methods/chat-webchat-media.test.ts (added, +89/-0)
  • src/gateway/server-methods/chat-webchat-media.ts (added, +117/-0)
  • src/gateway/server-methods/chat.ts (modified, +22/-5)
  • ui/src/styles/chat/layout.css (modified, +14/-0)
  • ui/src/ui/chat/grouped-render.ts (modified, +54/-3)

PR #61598: fix(gateway): show /tts audio in Control UI webchat

Description (problem / solution / changelog)

Identical to the description of PR #61593 above; only the line counts of the two new files differ.

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • extensions/diffs/src/language-hints.test.ts (modified, +3/-3)
  • src/gateway/server-methods/chat-transcript-inject.ts (modified, +36/-2)
  • src/gateway/server-methods/chat-webchat-media.test.ts (added, +102/-0)
  • src/gateway/server-methods/chat-webchat-media.ts (added, +121/-0)
  • src/gateway/server-methods/chat.ts (modified, +22/-5)
  • ui/src/styles/chat/layout.css (modified, +14/-0)
  • ui/src/ui/chat/grouped-render.ts (modified, +54/-3)

Original issue report

Bug type

Regression (worked before, now fails)

Beta release blocker

No

Summary

/tts audio appears to execute successfully, but in the Control UI / TUI the command disappears and no audio player, attachment, or download is shown.

The same command works correctly from WhatsApp, where the audio file/voice note is created and delivered.

This suggests the TTS backend/provider path is working, but the Control UI / TUI presentation/rendering path is broken or incomplete.

Environment

  • OpenClaw version: 2026.4.2
  • OS: Windows 10.0.26200 (x64)
  • Node: 24.14.0
  • Gateway: local ws://127.0.0.1:18789
  • Gateway service: running
  • Surface(s) with issue: Control UI / TUI / WebChat-style UI
  • Working comparison surface: WhatsApp
  • TTS provider: OpenAI

Steps to Reproduce

  1. Open the Control UI / TUI
  2. Run /tts status
  3. Confirm it works normally
  4. Run /tts audio hello ace
  5. Observe the command appears briefly, then disappears after a few seconds

Expected Behavior

/tts audio ... should generate a one-off audio reply and render/expose it in the Control UI / TUI.

Actual Behavior

  • The slash command briefly appears, then disappears
  • No audio player appears
  • No attachment appears
  • No browser download starts
  • No new chat/thread appears
  • No visible error is shown in the UI

Additional Evidence

  • /tts status works normally
  • Recent TTS attempts show success
  • TUI/status output shows the last attempt succeeded with provider success and latency
  • Example status indicators:
    • State: enabled
    • Provider: openai
    • Last attempt: success
    • Attempt details: openai:success(ok)

Comparison Test

Running /tts audio ... from WhatsApp successfully creates/delivers the audio.

So:

  • TTS backend works
  • OpenAI provider works
  • slash command dispatch works
  • WhatsApp delivery works
  • Control UI / TUI rendering/presentation does not

Conclusion

This appears to be a Control UI / TUI / WebChat media rendering or delivery/presentation bug for /tts audio, rather than a configuration or provider issue.

Actual behavior

The /tts audio command just disappears.

OpenClaw version

2026.4.2

Operating system

Windows 10.0.26200 (x64)

Install method

npm global

Model

OpenAI

Provider / routing chain

openclaw // local gateway

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Impact and severity

No response

Additional information

No response

extent analysis

TL;DR

The fault is in the Control UI / TUI presentation path for media-only replies, not in TTS generation. PRs #61593 and #61598 address it by embedding the local audio file in the assistant message and rendering an audio player.

Guidance

  • Confirm that the TTS backend generated the audio: /tts status should report "Last attempt: success".
  • Check how the Control UI / TUI handles a final reply that carries a mediaUrl but no text; per the linked PRs, the webchat completion path aggregated text only.
  • Compare the WhatsApp delivery path with the webchat broadcast path to find where the media is dropped.
  • Run /tts audio with different input text or another provider to rule out a provider-specific failure.

Example

No code snippet is provided as the issue does not contain enough information to create a specific example.

Notes

The issue appears to be specific to the Control UI / TUI and does not affect WhatsApp, suggesting that the problem is with the rendering or presentation of the audio file rather than the TTS backend/provider.

Recommendation

Follow the linked PRs (#61593, #61598) rather than changing configuration: the TTS backend/provider already works, and the missing piece is rendering media-only replies in the Control UI / TUI.
