openclaw - [Bug]: Dynamic TTS auto-delivery is suppressed in message-tool-only channel contexts


Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

On 2026.5.16-beta.7, dynamic tts tool calls in Discord and Telegram report success, synthesize valid Opus audio, and stage the generated audio under OpenClaw outbound media, but no channel provider send is invoked.

Steps to reproduce

  1. Run OpenClaw 2026.5.16-beta.7 with TTS and Discord or Telegram configured.
  2. In Discord or Telegram, ask the agent to send a TTS message.
  3. Let the agent invoke the dynamic tts tool.
  4. Observe that the tts tool reports success.
  5. Inspect generated media, media/outbound, gateway logs, and channel history.

Expected behavior

A successful dynamic tts tool result should be delivered to the originating channel as audio/media.

If the tool reports success and OpenClaw stages a trusted local audio file under media/outbound, the relevant channel delivery path should receive a send request for that staged media.

This should hold even when normal final replies are private in message_tool_only channel contexts, because the tts tool contract says audio is auto-delivered from the tool result.

Actual behavior

The observed failure boundary is the same in both fresh traces:

tts tool succeeds -> valid local Opus is created -> media/outbound copy is staged -> no channel provider send is invoked

This is not an Opus encoding failure, not a TTS provider failure, and not a Discord/Telegram API rejection in the observed runs. The provider send path is not reached.

Discord evidence

Fresh observed Discord run:

Discord user message timestamp: 2026-05-18T13:40:09.091Z
Run ID: 999bd8e8-9393-4308-bf12-16b6619cc1f6
Session ID: b1fe8945-936e-4257-81d4-a4885c718857
TTS tool call timestamp: 2026-05-18T13:40:20.139Z
TTS tool result timestamp: 2026-05-18T13:40:22.330Z

The agent called tts with:

He never went out without a book under his arm, and he often came back with two.

The tool result was successful and returned:

(spoken) He never went out without a book under his arm, and he often came back with two.

Generated audio:

/tmp/openclaw/tts-Td3qRX/voice-1779111622307.opus
31841 bytes
OggS / OpusHead header

Staged outbound media:

/home/node/.openclaw/media/outbound/voice-1779111622307---7da79eac-7bc5-40d7-9159-9e452cd7ecba.ogg
31841 bytes
OggS / OpusHead header
mtime: 2026-05-18 23:40:24.859 +1000

Discord channel history after the request contained no bot reply and no attachment for this TTS run. The gateway log around the run showed the agent run starting, but no `message.action channel=discord` entry and no Discord media send for the staged audio.

Telegram evidence

Fresh observed Telegram run:

Telegram inbound timestamp: 2026-05-18T13:53:53.895Z
Run ID: ba4a27c8-0d1a-43ba-81ef-ced81c9d6239
Session ID: f707824d-37ca-421a-bb9b-a6cd66343d91
TTS tool call timestamp: 2026-05-18T13:54:08.803Z
TTS tool result timestamp: 2026-05-18T13:54:12.170Z

The user requested an iconic line from Fiddler on the Roof. The agent called tts with:

For the OpenClaw TTS test: Tradition! (laughs) Short, iconic, and safely through the voice path.

The tool result was successful and returned:

(spoken) For the OpenClaw TTS test: Tradition! (laughs) Short, iconic, and safely through the voice path.

Generated audio:

/tmp/openclaw/tts-7TFPlU/voice-1779112452148.opus
54615 bytes
OggS / OpusHead header

Staged outbound media:

/home/node/.openclaw/media/outbound/voice-1779112452148---06ebe4c7-ee9d-4b41-8cd9-2efa6acdf79b.ogg
54615 bytes
OggS / OpusHead header
mtime: 2026-05-18 23:54:15.076 +1000

Gateway log shows Telegram inbound and agent run start, but no Telegram send after the TTS run. Earlier successful Telegram text replies in the same log emit lines such as:

telegram outbound send ok accountId=default chatId=6479169830 messageId=1237 operation=sendMessage deliveryKind=text chunkCount=1

No equivalent `telegram outbound send ok`, `telegram sendMessage ok`, `sendVoice`, `sendAudio`, or `sendDocument` entry appears for the fresh TTS run.
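
The log check described above can be scripted. The following is a minimal sketch, assuming gateway lines follow the `key=value` layout of the single sample line quoted in this report. The helper names (`parseOutboundSend`, `mediaSends`) are hypothetical, and whether a real media send would be logged as `operation=sendVoice`, `operation=sendAudio`, or `operation=sendDocument` is an assumption drawn from the operation names listed above.

```typescript
// Parsed form of a "<channel> outbound send ok key=value ..." gateway line.
interface OutboundSend {
  channel: string;
  fields: Record<string, string>;
}

// Assumption: media sends would be logged with one of these operation names.
const MEDIA_OPERATIONS = new Set(["sendVoice", "sendAudio", "sendDocument"]);

// Returns null for lines that are not "outbound send ok" entries.
function parseOutboundSend(line: string): OutboundSend | null {
  const match = /(\w+) outbound send ok\b(.*)$/.exec(line.trim());
  if (match === null) return null;
  const fields: Record<string, string> = {};
  for (const token of (match[2] ?? "").trim().split(/\s+/)) {
    const eq = token.indexOf("=");
    if (eq > 0) fields[token.slice(0, eq)] = token.slice(eq + 1);
  }
  return { channel: match[1] ?? "", fields };
}

// Collects successful sends for one channel whose operation looks like media.
function mediaSends(lines: string[], channel: string): OutboundSend[] {
  return lines
    .map((line) => parseOutboundSend(line))
    .filter(
      (entry): entry is OutboundSend =>
        entry !== null &&
        entry.channel === channel &&
        MEDIA_OPERATIONS.has(entry.fields["operation"] ?? ""),
    );
}
```

Running `mediaSends` over the log window between the TTS tool result timestamp and the end of the run should return an empty array if the failure described here is present.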

OpenClaw version

OpenClaw 2026.5.16-beta.7 (tag: v2026.5.16-beta.7, commit: fff4532d69d77fe1a8ca3baeaea4b7306cc40456)

Operating system

Linux container / OpenShift deployment.

Install method

docker

Model

codex/gpt-5.5

Provider / routing chain

TTS provider configuration is working in these runs. Synthesis produced valid local Opus files and OpenClaw staged them under media/outbound. The observed failure is after synthesis and outbound media staging.


Logs, screenshots, and evidence

Source pointers from `v2026.5.16-beta.7`:

- `src/agents/tools/tts-tool.ts` returns successful TTS output with `details.media.mediaUrl`, `trustedLocalMedia: true`, and optional `audioAsVoice`.
- `src/agents/tools/tts-tool.ts` describes the tool as: `Audio auto-delivered from tool result`.
- `src/auto-reply/reply/groups.ts` tells Discord/Telegram channel agents that normal final replies are private in `message_tool_only` contexts and visible output must use the message tool.
- `src/auto-reply/reply/dispatch-from-config.ts` suppresses source delivery when `sourceReplyDeliveryMode === "message_tool_only"` unless reply metadata has `deliverDespiteSourceReplySuppression`.
- `src/agents/pi-embedded-subscribe.handlers.tools.ts` can extract structured tool media and queue it.
- `src/agents/pi-embedded-runner/run/tool-media-payloads.ts` merges attempt tool media into reply payloads.
- `src/agents/pi-embedded-runner/run/tool-media-payloads.ts` explicitly avoids attaching tool media to `sourceReplyTranscriptMirror` payloads in `message_tool_only` mode.
- Discord and Telegram both have channel delivery paths capable of sending media-bearing payloads.

Runtime evidence from both fresh runs:

- The `tts` tool result succeeded.
- The generated temp audio file was valid Ogg/Opus.
- The staged outbound media file was valid Ogg/Opus.
- The final visible assistant mirror contained only text such as "Sent the TTS test..." and no media.
- Gateway logs show no channel send invocation for the staged audio.
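
The "valid Ogg/Opus" observation can be reproduced with a small header check. The sketch below relies only on the container layout: an Ogg page starts with the capture pattern `OggS`, has a 27-byte fixed header whose last byte is the segment count, and is followed by the segment table; the first page of an Opus stream then carries a packet beginning with `OpusHead`. The function names are hypothetical, and this checks the header only, not that the whole file decodes.

```typescript
// Converts an ASCII string to its byte values.
const ascii = (s: string): number[] => Array.from(s, (c) => c.charCodeAt(0));

// True if `magic` appears in `bytes` exactly at `offset`.
function hasMagicAt(bytes: Uint8Array, offset: number, magic: string): boolean {
  const expected = ascii(magic);
  if (offset + expected.length > bytes.length) return false;
  return expected.every((b, i) => bytes[offset + i] === b);
}

// Header-level check: first Ogg page present and its payload starts with OpusHead.
function looksLikeOggOpus(bytes: Uint8Array): boolean {
  if (bytes.length < 27 || !hasMagicAt(bytes, 0, "OggS")) return false;
  const segmentCount = bytes[26] ?? 0; // byte 26 holds the number of segment-table entries
  return hasMagicAt(bytes, 27 + segmentCount, "OpusHead");
}
```

In Node.js, the bytes returned by `fs.readFileSync(path)` can be passed in directly, since a `Buffer` is a `Uint8Array`. Both the temp `.opus` file and the staged `.ogg` copy would be expected to pass this check in the runs described here.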

Impact and severity

Affected: Discord and Telegram channel/direct TTS workflows observed on 2026.5.16-beta.7.

Severity: High for channel TTS workflows.

Frequency: 2/2 fresh observed attempts across two channel providers (Discord and Telegram), both with the same failure boundary.

Consequence: The agent reports that TTS was sent, but the user receives no audio. This makes successful dynamic TTS misleading and effectively unusable on the observed channel paths.

Additional information

This should be treated as a shared dynamic-TTS channel-delivery bridge issue rather than separate Discord and Telegram adapter bugs.

Best current theory:

Dynamic tts media is generated and normalized, but it is not represented as a deliverable internal source reply in message_tool_only channel contexts. It therefore follows the ordinary final/tool output path and is suppressed by source-reply policy before it can invoke Discord or Telegram delivery.

The relevant contract mismatch appears to be:

  • The tts dynamic tool advertises: Audio auto-delivered from tool result.
  • Discord/Telegram channel contexts use sourceReplyDeliveryMode: "message_tool_only".
  • In that mode, ordinary final/source delivery is suppressed unless reply metadata explicitly allows delivery despite source-reply suppression.
  • Message-tool source replies get that internal delivery metadata.
  • Dynamic tts media does not appear to get equivalent treatment.
  • Result: source-reply suppression wins, so the staged media never reaches Discord/Telegram delivery.
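
The contract mismatch above can be reduced to a small predicate. This is a sketch of the theory, not the code in `dispatch-from-config.ts`: the names `sourceReplyDeliveryMode`, `message_tool_only`, and `deliverDespiteSourceReplySuppression` come from this report, while the `ReplyPayload` shape, the `"other"` mode placeholder, and modelling the flag as a top-level field are assumptions.

```typescript
// "message_tool_only" is named in the report; "other" is a placeholder for any other mode.
type SourceReplyDeliveryMode = "message_tool_only" | "other";

interface ReplyPayload {
  text?: string;
  mediaUrl?: string;
  // Named in the report as reply metadata; its exact location is an assumption.
  deliverDespiteSourceReplySuppression?: boolean;
}

// True when source-reply policy would drop the payload before provider delivery.
function isSuppressed(mode: SourceReplyDeliveryMode, payload: ReplyPayload): boolean {
  if (mode !== "message_tool_only") return false;
  return payload.deliverDespiteSourceReplySuppression !== true;
}

// A message-tool reply carries the override, so it is delivered.
const messageToolReply: ReplyPayload = {
  text: "hello channel",
  deliverDespiteSourceReplySuppression: true,
};

// Dynamic tts media, under the current theory, arrives without the override.
const ttsMediaReply: ReplyPayload = { mediaUrl: "media/outbound/voice.ogg" };

const messageToolSuppressed = isSuppressed("message_tool_only", messageToolReply); // false
const ttsMediaSuppressed = isSuppressed("message_tool_only", ttsMediaReply); // true
```

If the theory holds, the staged audio behaves like `ttsMediaReply`: nothing distinguishes it from ordinary private final output, so it is dropped before Discord or Telegram delivery is attempted.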

Suggested triage path:

  1. Trace the reply payload produced after a successful dynamic tts tool result.
  2. Confirm whether that payload receives deliverDespiteSourceReplySuppression or equivalent internal-source-reply metadata.
  3. Confirm whether the payload is skipped by sourceReplyDeliveryMode: "message_tool_only" suppression before provider delivery.
  4. Decide whether dynamic TTS should emit/mark an internal source media reply, similar to message-tool source replies, so the auto-delivery contract is honored without making ordinary final text public.
  5. Add a regression test that asserts a successful dynamic tts call in a message_tool_only Discord/Telegram-style context invokes channel delivery with the staged trusted local audio while keeping ordinary final text private.
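
Steps 4 and 5 can be sketched together as a self-contained check with a recording fake in place of the channel provider. Everything here is hypothetical scaffolding: `markToolMediaDeliverable` illustrates one possible answer to step 4, `dispatchMessageToolOnly` stands in for the real dispatcher, and the `Reply` shape is an assumption. A real regression test would drive the actual dispatch code and the project's test runner.

```typescript
interface Reply {
  text?: string;
  mediaUrl?: string;
  trustedLocalMedia?: boolean;
  deliverDespiteSourceReplySuppression?: boolean;
}

type Send = (reply: Reply) => void;

// One option for step 4: mark trusted tool media as deliverable, leave text untouched.
function markToolMediaDeliverable(reply: Reply): Reply {
  return reply.mediaUrl !== undefined && reply.trustedLocalMedia === true
    ? { ...reply, deliverDespiteSourceReplySuppression: true }
    : reply;
}

// Stand-in for message_tool_only dispatch: only explicitly marked replies are sent.
function dispatchMessageToolOnly(replies: Reply[], send: Send): void {
  for (const reply of replies) {
    if (reply.deliverDespiteSourceReplySuppression === true) send(reply);
  }
}

// Returns what the fake provider received for a run with final text plus tts media.
function runScenario(applyMarking: boolean): Reply[] {
  const sent: Reply[] = [];
  const replies: Reply[] = [
    { text: "Sent the TTS test..." },
    { mediaUrl: "media/outbound/voice.ogg", trustedLocalMedia: true },
  ];
  const prepared = applyMarking ? replies.map(markToolMediaDeliverable) : replies;
  dispatchMessageToolOnly(prepared, (reply) => {
    sent.push(reply);
  });
  return sent;
}
```

With `applyMarking` set to `false` the scenario reproduces the reported behavior (nothing is sent). With it set to `true`, the fake provider receives exactly one media reply while the final text stays private, which is the property the regression test in step 5 is meant to pin down.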
