TTS directives should only trigger when they appear as **active** markup, not when they occur inside:

- inline code spans (`` `[[tts:text]]` ``)
- fenced code blocks (```` ``` ... ``` ````)
- indented code blocks (4-space)
- optionally: table cells containing code spans (or at least when inside code spans)

In other words, `parseTtsDirectives` should walk the text with basic markdown code-context awareness, or a pre-pass should strip/mask code spans before the directive regex runs.

openclaw - ✅(Solved) Fix TTS `parseTtsDirectives` is markdown-blind: `[[tts:xxx]]` inside code spans / code blocks triggers auto TTS in `tagged` mode [1 pull requests, 1 participants]

richardmqq · 2026-04-19T02:03:11Z

`parseTtsDirectives` in `provider-error-utils-CyJAWFR1.js` uses the regex `/\[\[tts:([^\]]+)\]\]/gi`, which matches **any literal `[[tts:xxx]]` occurrence** regardless of surrounding markdown context. An assistant writing *about* TTS directives (for example in a troubleshooting reply, a `| TTS 链路诊断 | 查 \`[[tts:text]]\` ... |` table cell, or an inline/fenced code block) unintentionally triggers `hasDirective=true`, and in `auto: "tagged"` mode the TTS pipeline then generates and delivers a voice message that the assistant never intended to produce.

# PR #68806: [AI-assisted] fix(tts): ignore literal directives inside markdown code

- Repository: openclaw/openclaw
- Author: lawrence3699
- State: closed | merged: False
- Link: https://github.com/openclaw/openclaw/pull/68806

## Description (problem / solution / changelog)

## Summary

- Problem: `parseTtsDirectives` treated literal `[[tts:...]]` tokens inside inline code spans and fenced code blocks as real directives.
- Why it matters: in `messages.tts.auto="tagged"`, troubleshooting or documentation replies could trigger unintended TTS synthesis and strip visible example text.
- What changed: skip TTS directive parsing when the directive tag itself starts inside a markdown code region, and add regression tests for inline code, fenced code, mixed literal+real directives, and real text blocks that contain inline code.
- What did NOT change (scope boundary): this does not address other TTS parser issues such as orphaned closing-tag leakage in #68553.

## Change Type (select all)

- [x] Bug fix
- [ ] Feature
- [ ] Refactor required for the fix
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra

## Scope (select all touched areas)

- [ ] Gateway / orchestration
- [ ] Skills / tool execution
- [ ] Auth / tokens
- [ ] Memory / storage
- [x] Integrations
- [ ] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra

## Linked Issue/PR

- Closes #68769
- Related #68553
- [x] This PR fixes a bug or regression

## Root Cause (if applicable)

- Root cause: the TTS directive regexes were markdown-blind and stripped any literal `[[tts:...]]` token even when that token lived inside inline or fenced code.
- Missing detection / guardrail: no code-region check existed before applying the directive regexes.
- Contributing context (if known): `[[tts:text]]...[[/tts:text]]` blocks can legitimately contain inline code, so the guardrail has to ignore directive tags inside code without suppressing real text blocks that merely contain code.

## Regression Test Plan (if applicable)

- Coverage level that should have caught this:
  - [x] Unit test
  - [ ] Seam / integration test
  - [ ] End-to-end test
  - [ ] Existing coverage already sufficient
- Target test or file: `src/tts/directives.test.ts`
- Scenario the test should lock in: literal directive examples inside inline/fenced markdown code stay visible text and do not trigger TTS; real directives outside code still parse.
- Why this is the smallest reliable guardrail: the bug lives entirely inside `parseTtsDirectives`, so a focused parser unit test exercises the broken branch directly without needing live provider/channel setup.
- Existing test that already covers this (if any): none
- If no new test is added, why not: N/A

## User-visible / Behavior Changes

- Literal TTS directive examples inside inline code or fenced code stay visible text and no longer trigger tagged-mode TTS.
- Real TTS directives outside markdown code still behave as before, including `[[tts:text]]...[[/tts:text]]` blocks that contain inline code.

## Diagram (if applicable)

N/A

## Security Impact (required)

- New permissions/capabilities? (`No`)
- Secrets/tokens handling changed? (`No`)
- New/changed network calls? (`No`)
- Command/tool execution surface changed? (`No`)
- Data access scope changed? (`No`)
- If any `Yes`, explain risk + mitigation:

## Repro + Verification

### Environment

- OS: macOS
- Runtime/container: Node v24.15.0 / pnpm v10.33.0
- Model/provider: N/A
- Integration/channel (if any): tagged TTS parser (`messages.tts.auto="tagged"`)
- Relevant config (redacted): `messages.tts.auto="tagged"`

### Steps

1. Put a literal TTS token inside markdown code, for example `` `[[tts:text]]` `` or a fenced block containing `[[tts:text]]...[[/tts:text]]`.
2. Run the message through `parseTtsDirectives`.
3. Observe whether the parser marks `hasDirective=true` and strips the literal text.

### Expected

- Literal examples inside markdown code remain untouched and do not trigger TTS.
- Real directives outside code still parse.

### Actual

- Before this patch, literal examples inside inline/fenced code set `hasDirective=true` and were stripped; mixed content also lost the inline-code literal token.
- After this patch, only directives whose tag starts outside code are parsed.

## Evidence

Attach at least one:

- [x] Failing test/log
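The rule the PR describes (ignore a directive when its tag *starts* inside a markdown code region) can be sketched in Python, in the same spirit as the issue's Python reproducer. This is not the PR's actual code: `parse_tts_directives_sketch`, `code_spans`, and `starts_in_code` are hypothetical names, and the code-region pattern is an assumption that only covers triple-backtick fences and single-backtick inline spans.

```python
import re

# Assumption: only ``` fences and single-backtick inline spans count as code.
CODE_REGION_RE = re.compile(r"```[^\n]*\n[\s\S]*?\n```|`[^`\n]+`")
TEXT_BLOCK_RE = re.compile(r"\[\[tts:text\]\]([\s\S]*?)\[\[/tts:text\]\]", re.IGNORECASE)
SINGLE_RE = re.compile(r"\[\[tts:([^\]]+)\]\]", re.IGNORECASE)


def code_spans(text):
    """Return (start, end) offsets of every markdown code region in text."""
    return [m.span() for m in CODE_REGION_RE.finditer(text)]


def starts_in_code(pos, spans):
    """True when offset pos lies inside one of the code regions."""
    return any(start <= pos < end for start, end in spans)


def parse_tts_directives_sketch(text):
    """Hypothetical stand-in for parseTtsDirectives.

    Returns (cleaned_text, has_directive, spoken_blocks).
    """
    state = {"has_directive": False, "spoken": []}

    def strip(pattern, current, capture_spoken):
        # Offsets shift after each pass, so recompute the code regions.
        spans = code_spans(current)

        def repl(match):
            if starts_in_code(match.start(), spans):
                return match.group(0)  # literal example: keep it visible
            state["has_directive"] = True
            if capture_spoken:
                state["spoken"].append(match.group(1))
            return ""

        return pattern.sub(repl, current)

    cleaned = strip(TEXT_BLOCK_RE, text, capture_spoken=True)
    cleaned = strip(SINGLE_RE, cleaned, capture_spoken=False)
    return cleaned, state["has_directive"], state["spoken"]


if __name__ == "__main__":
    print(parse_tts_directives_sketch("See `[[tts:text]]` docs. [[tts:xxx]]Hi"))
    print(parse_tts_directives_sketch("[[tts:text]]Run `npm test` now[[/tts:text]] done"))
```

Checking only where the tag starts, rather than masking all code first, is what lets a real `[[tts:text]]...[[/tts:text]]` block keep the inline code it contains.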

openclaw/openclaw#68769•Fetched 2026-04-19 15:07:50

Summary

Environment

OpenClaw 2026.4.15 (041266a)
Node 25.7.0, macOS 26.4.1 (Tahoe)
Config: messages.tts.auto = "tagged", provider elevenlabs
Assistant running as system-architect agent, Telegram channel, Anthropic claude-opus-4-7

Reproduction

With messages.tts.auto: "tagged" and a working ElevenLabs / MiniMax provider, have the assistant reply containing the literal substring [[tts:text]] inside a code span or table cell without any closing [[/tts:text]] block, e.g.:

| 3 | TTS 链路诊断 | 查 `[[tts:text]]` 为什么 gateway log 完全没处理痕迹 |

or:

`messages.tts.auto = "tagged"` → 需要 `[[tts:text]]` tag 才触发

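(In English, the table cell reads roughly "TTS pipeline diagnosis | check why `[[tts:text]]` leaves no processing trace at all in the gateway log", and the second example reads "`messages.tts.auto = "tagged"` → requires a `[[tts:text]]` tag to trigger".)

Observed: hasDirective=true → TTS synthesis runs → voice note MP3 is delivered to the user.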

Minimal Python reproducer (mirroring the JS regex):

import re
single_re = re.compile(r'\[\[tts:([^\]]+)\]\]', re.IGNORECASE)
msg = '`messages.tts.auto = "tagged"` → 需要 `[[tts:text]]` tag 才触发'
print(single_re.findall(msg))  # -> ['text']  => hasDirective=True

Expected behavior

TTS directives should only trigger when they appear as active markup, not when they occur inside:

inline code spans (`[[tts:text]]`)
fenced code blocks (``` ... ```)
indented code blocks (4-space)
optionally: table cells containing code spans (or at least when inside code spans)

In other words, parseTtsDirectives should walk the text with basic markdown code-context awareness, or a pre-pass should strip/mask code spans before the directive regex runs.
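A minimal sketch of such a code-context-aware walk, covering all three code forms listed above, might look like the following. It is an illustration only: `mask_code` and `has_tts_directive` are hypothetical helpers, and the markdown handling is deliberately simplified (an indented line counts as code only after a blank line or another indented code line, and a fence closes only on a line equal to its opening marker), so it is not a full CommonMark parser.

```python
import re

DIRECTIVE_RE = re.compile(r"\[\[tts:([^\]]+)\]\]", re.IGNORECASE)
INLINE_CODE_RE = re.compile(r"`[^`\n]+`")
FENCE_RE = re.compile(r"^ {0,3}(```|~~~)")
INDENTED_RE = re.compile(r"^(?: {4}|\t)")


def mask_code(text):
    """Return text with code regions blanked out (for detection only)."""
    out = []
    in_fence = False
    fence_marker = ""
    prev_blank = True      # start of text behaves like a blank-line boundary
    prev_indented = False
    for line in text.split("\n"):
        if in_fence:
            out.append("")  # drop everything inside the fence
            if line.strip() == fence_marker:
                in_fence = False
            prev_blank, prev_indented = False, False
            continue
        fence = FENCE_RE.match(line)
        if fence:
            in_fence, fence_marker = True, fence.group(1)
            out.append("")
            prev_blank, prev_indented = False, False
            continue
        is_blank = line.strip() == ""
        if not is_blank and INDENTED_RE.match(line) and (prev_blank or prev_indented):
            out.append("")  # indented code block line
            prev_blank, prev_indented = False, True
            continue
        out.append(INLINE_CODE_RE.sub("", line))  # drop inline code spans
        prev_blank, prev_indented = is_blank, False
    return "\n".join(out)


def has_tts_directive(text):
    """True only when a directive appears outside markdown code."""
    return DIRECTIVE_RE.search(mask_code(text)) is not None


if __name__ == "__main__":
    print(has_tts_directive("Hello [[tts:xxx]] world"))
    print(has_tts_directive("Example:\n\n    [[tts:text]]\n\nDone"))
```

Because the masked copy is used only to decide whether a directive is present, the visible reply text is never altered by the masking step itself.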

Actual behavior

All [[tts:xxx]] substrings in the reply text are treated as directives. An assistant authoring a documentation/debug/explanation reply that contains these tags as examples ends up emitting a voice note. In tagged mode this is effectively unavoidable without the assistant knowing to avoid the literal token forms entirely.

Impact

Unintended TTS spend (ElevenLabs / MiniMax usage).
Surprise voice notes in the chat, confusing users; the assistant in my case denied sending them for three turns (the path also has no Telegram log line — see sibling issue on Telegram plugin logging).
Self-reinforcing loop: explaining the bug in a reply triggers the bug.

Source references

dist/provider-error-utils-CyJAWFR1.js (v2026.4.15):

cleanedText = cleanedText.replace(
  /\[\[tts:text\]\]([\s\S]*?)\[\[\/tts:text\]\]/gi,
  (_match, inner) => { hasDirective = true; /* ... */ return ""; }
);
cleanedText = cleanedText.replace(
  /\[\[tts:([^\]]+)\]\]/gi,                // ← greedy, markdown-blind
  (_match, body) => { hasDirective = true; /* ... */ return ""; }
);

Caller maybeApplyTtsToPayload in dist/extensions/speech-core/runtime-api.js:

if (autoMode === "tagged" && !directives.hasDirective) return nextPayload;
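To make the consequence of that gate concrete, here is a small Python sketch of how a false positive from the markdown-blind regex flows through a tagged-mode check. `maybe_apply_tts_sketch` and `naive_has_directive` are hypothetical simplifications, not the real `maybeApplyTtsToPayload`; the `synthesize` callback stands in for the provider call and everything other than the one gate line is an assumption.

```python
import re

SINGLE_RE = re.compile(r"\[\[tts:([^\]]+)\]\]", re.IGNORECASE)


def naive_has_directive(text):
    """Mirror of the markdown-blind check: any literal tag counts."""
    return SINGLE_RE.search(text) is not None


def maybe_apply_tts_sketch(text, auto_mode, synthesize):
    """Hypothetical gate: in tagged mode, synthesize only when a directive was seen."""
    if auto_mode == "tagged" and not naive_has_directive(text):
        return text  # no tag: payload passes through untouched
    synthesize(text)  # assumption: stands in for the TTS provider call
    return text


if __name__ == "__main__":
    calls = []
    maybe_apply_tts_sketch("plain reply, no tags", "tagged", calls.append)
    maybe_apply_tts_sketch("需要 `[[tts:text]]` tag 才触发", "tagged", calls.append)
    print(len(calls))  # the documentation-style reply alone triggered synthesis
```

The gate itself is sound; the unintended voice note comes entirely from the parser reporting a directive that was only a quoted example.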

Proposed fix (sketch)

Before running the directive regexes, mask/strip markdown code regions:

Fenced blocks: /```[^\n]*\n[\s\S]*?\n```/g
Inline spans: /`+[^`\n]+`+/g and /`[^`\n]+`/g
(Optionally) indented code blocks.

Replace matches with placeholders before applying the directive regexes, then re-insert at the end (or simply keep them in cleanedText and only mask for directive detection).

Alternatively, require fenced code blocks to be a hard boundary: directives inside fenced blocks are treated as literal text, directives outside behave as today.
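The "hard boundary" alternative can be sketched by splitting the reply on fenced blocks and running the directive regexes only on the segments between them. This is a hedged illustration: `strip_outside_fences` is a hypothetical name, the fence pattern assumes triple-backtick fences only, and inline code spans are intentionally left unprotected, so this variant alone does not fix the inline-span case from the reproduction.

```python
import re

# The capture group makes re.split keep the fenced blocks in the result.
FENCE_SPLIT_RE = re.compile(r"(```[^\n]*\n[\s\S]*?\n```)")
TEXT_BLOCK_RE = re.compile(r"\[\[tts:text\]\]([\s\S]*?)\[\[/tts:text\]\]", re.IGNORECASE)
SINGLE_RE = re.compile(r"\[\[tts:([^\]]+)\]\]", re.IGNORECASE)


def strip_outside_fences(text):
    """Strip directives outside fenced blocks; return (cleaned_text, has_directive)."""
    parts = FENCE_SPLIT_RE.split(text)
    has_directive = False
    out = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            out.append(part)  # odd entries are the captured fenced blocks: literal
            continue
        part, block_hits = TEXT_BLOCK_RE.subn("", part)
        part, single_hits = SINGLE_RE.subn("", part)
        has_directive = has_directive or (block_hits + single_hits) > 0
        out.append(part)
    return "".join(out), has_directive


if __name__ == "__main__":
    reply = "[[tts:xxx]]Hi\n```\n[[tts:text]]\n```\n"
    print(strip_outside_fences(reply))
```

Since each segment is processed on its own, a `[[tts:text]]...[[/tts:text]]` pair can never match across a fence, which is what makes the fence a hard boundary.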

Happy to open a PR if the team agrees on the scope.

Workaround used locally

Added a rule to the assistant's TOOLS.md forbidding literal [[tts:xxx]] in reply text when discussing TTS; the assistant uses alternative forms instead, such as "`tts:text` tag" or "double-square-bracket tts directive" (originally "双中括号 tts 指令"). This works but is fragile and agent-specific.

extent analysis

TL;DR

Mask markdown code regions before applying the TTS directive regex to prevent unintended voice note generation.

Guidance

Identify and mask fenced code blocks, inline code spans, and optionally indented code blocks in the text before running the TTS directive regex.
Replace masked code regions with placeholders to prevent them from being treated as directives.
Consider requiring fenced code blocks to be a hard boundary for directive detection.
Verify the fix by testing with examples that previously triggered unintended TTS synthesis, such as the provided Python reproducer.

Example

// Mask fenced code blocks and inline code spans in a single pass.
// The placeholder must NOT embed the original text, otherwise the
// directive regex below would still match the literal inside it.
const masked = [];
const codeRegionRegex = /```[^\n]*\n[\s\S]*?\n```|`[^`\n]+`/g;
cleanedText = cleanedText.replace(codeRegionRegex, (match) => `\u0000${masked.push(match) - 1}\u0000`);

// Apply TTS directive regex (it now only sees text outside code)
const ttsDirectiveRegex = /\[\[tts:([^\]]+)\]\]/gi;
cleanedText = cleanedText.replace(ttsDirectiveRegex, (_match, body) => { hasDirective = true; /* ... */ return ""; });

// Restore the masked code regions verbatim
cleanedText = cleanedText.replace(/\u0000(\d+)\u0000/g, (_match, index) => masked[Number(index)]);

Notes

The proposed fix requires careful consideration of edge cases, such as nested code blocks or directives within code spans. The workaround used locally, adding a rule to the assistant's TOOLS.md, is fragile and may not be effective in all scenarios.

Recommendation

Mask markdown code regions before running the TTS directive regex. Unlike the local TOOLS.md workaround, which depends on each agent avoiding the literal token, this addresses the root cause in the parser and prevents unintended TTS synthesis for every caller.
