openclaw - ✅(Solved) Fix [Bug]: TTS tool-generated audio blocked by reply media normalizer: `/tmp/openclaw/` not in `isAllowedAbsoluteReplyMediaPath` [3 pull requests, 1 comment, 2 participants]

GitHub stats
openclaw/openclaw#64529 · Fetched 2026-04-11 06:14:34
Comments: 1 · Participants: 2 · Timeline events: 8 · Reactions: 0
Timeline (top): cross-referenced ×3, labeled ×2, referenced ×2, commented ×1

The TTS tool generates audio to /tmp/openclaw/tts-*/voice-*.mp3, but the reply media path normalizer (createReplyMediaPathNormalizer in agent-runner.runtime) drops it with: dropping blocked reply media /tmp/openclaw/tts-w1q7or/voice-1775861103004.mp3: Error: Absolute host-local MEDIA paths are blocked in normal replies. Use a safe relative path or the message tool.

Error Message

[debug] embedded run tool start: tool=tts
[debug] TTS: starting with provider openai, fallbacks: elevenlabs, microsoft, minimax, vydra
[debug] embedded run tool end: tool=tts
[debug] dropping blocked reply media /tmp/openclaw/tts-w1q7or/voice-1775861103004.mp3:
Error: Absolute host-local MEDIA paths are blocked in normal replies.
Use a safe relative path or the message tool.

Root Cause

isAllowedAbsoluteReplyMediaPath, the check behind the reply media path normalizer (createReplyMediaPathNormalizer in agent-runner.runtime), does not treat /tmp/openclaw/ as an allowed root, although buildMediaLocalRoots already includes it via resolvePreferredOpenClawTmpDir(). The TTS tool writes its audio to /tmp/openclaw/tts-*/voice-*.mp3, so the file is generated successfully and then dropped from the reply with: Absolute host-local MEDIA paths are blocked in normal replies.

Fix Action

Fixed

PR fix notes

PR #14: fix: Zod 4 manifest schema generation + SSRF-compatible media fetching

Description (problem / solution / changelog)

Summary

Two fixes for OpenClaw 2026.4.x compatibility, plus improved upload error detail:

1. Fix empty configSchema in manifest (Zod 4 migration)

The openclaw peer dependency now ships Zod 4, which changed its internal schema representation. The third-party zod-to-json-schema library silently produces empty definitions with Zod 4, resulting in openclaw.plugin.json shipping a configSchema with no type information. This breaks manifest-level config validation.

Replaced with Zod 4's native z.toJSONSchema() method and removed the zod-to-json-schema dev dependency.

2. Use requestInit instead of fetchImpl for inbound media

OpenClaw 2026.4.x added SSRF guard dispatchers to fetchRemoteMedia that reject custom fetchImpl wrappers with invalid onRequestStart method, silently breaking all inbound media (images, audio, documents).

Passes auth headers via the requestInit parameter instead of wrapping fetch, matching the FetchMediaOptions contract. Also adds mediaLocalRoots to the deliver callback for outbound media support (pending openclaw/openclaw#64529 to fully unblock TTS delivery).

Elevates media download failure logging from debug to info so fetch errors are visible without debug-level logging.
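
The requestInit approach described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: MediaFetchOptionsSketch and buildInboundMediaFetchOptions are hypothetical names, and the only field taken from the PR text is requestInit. The X-Auth-Token and X-User-Id headers are the standard Rocket.Chat REST authentication headers.

```typescript
// Hypothetical credential shape for a Rocket.Chat bot account.
interface RocketChatCredentials {
  authToken: string;
  userId: string;
}

// Hypothetical stand-in for the relevant part of FetchMediaOptions.
// Only `requestInit` is named in the PR description; `url` is an assumption.
interface MediaFetchOptionsSketch {
  url: string;
  requestInit?: { headers?: Record<string, string> };
}

// Build fetch options that carry auth as plain request headers,
// rather than wrapping fetch in a custom fetchImpl.
function buildInboundMediaFetchOptions(
  url: string,
  creds: RocketChatCredentials,
): MediaFetchOptionsSketch {
  return {
    url,
    requestInit: {
      headers: {
        "X-Auth-Token": creds.authToken,
        "X-User-Id": creds.userId,
      },
    },
  };
}
```

Because the options object contains only data, the host's own dispatcher stays in charge of the request, which is presumably what the SSRF guard expects.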

3. Improved upload error detail

uploadFile error now includes HTTP status and response body for easier debugging of outbound media failures.
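
An error of this kind might be assembled as in the sketch below. The function name, the message wording, and the body truncation are assumptions made for illustration; the PR text only states that the HTTP status and response body are included.

```typescript
// Hypothetical helper: format an upload failure so the HTTP status and
// response body are visible in the error message.
// The truncation limit is an assumption, added to keep log lines bounded.
function formatUploadError(
  status: number,
  statusText: string,
  body: string,
  maxBodyChars = 500,
): Error {
  const shownBody =
    body.length > maxBodyChars ? `${body.slice(0, maxBodyChars)}...` : body;
  return new Error(`uploadFile failed: HTTP ${status} ${statusText}: ${shownBody}`);
}
```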

Testing

All existing tests pass. Inbound image vision confirmed working on Rocket.Chat 8.3.1 + OpenClaw 2026.4.9.

Changed files

  • package.json (modified, +1/-2)
  • scripts/generate-manifest-schema.ts (modified, +2/-9)
  • src/rocketchat/client.ts (modified, +1/-1)
  • src/rocketchat/monitor.ts (modified, +10/-9)

PR #64544: fix(reply): allow TTS and tool-generated media under managed tmp dir

Description (problem / solution / changelog)

Summary

Fixes #64529 — TTS-generated audio is silently dropped because /tmp/openclaw/ is not in the reply media path allowlist.

Root Cause

Two security allowlists in OpenClaw are inconsistent:

Allowlist                                             | Includes /tmp/openclaw/?
buildMediaLocalRoots() (media loading)                | ✅ Yes, via preferredTmpDir
isAllowedAbsoluteReplyMediaPath() (reply normalizer)  | ❌ No

TTS tools write audio files to /tmp/openclaw/tts-*/voice-*.mp3. The file is generated successfully, but the reply pipeline blocks it:

dropping blocked reply media /tmp/openclaw/tts-w1q7or/voice-1775861103004.mp3:
Error: Absolute host-local MEDIA paths are blocked in normal replies.

Fix

Add resolvePreferredOpenClawTmpDir() as an allowed root in isAllowedAbsoluteReplyMediaPath(), making it consistent with buildMediaLocalRoots(). The call is wrapped in try/catch for graceful degradation in environments where tmp dir resolution fails.
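
The shape of this fix can be sketched as below. This is not the code from reply-media-paths.ts: the function names ending in Sketch, the existingRoots parameter, and the injected resolver are assumptions for illustration, and the stand-in resolver skips the permission and ownership checks that the real resolvePreferredOpenClawTmpDir() is described as performing.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// True when `candidate` is `root` itself or lies beneath it after normalization.
function isPathInside(root: string, candidate: string): boolean {
  const rel = path.relative(path.resolve(root), path.resolve(candidate));
  if (rel === "") return true;
  return rel !== ".." && !rel.startsWith(".." + path.sep) && !path.isAbsolute(rel);
}

// Hypothetical stand-in for resolvePreferredOpenClawTmpDir().
function resolveTmpDirSketch(): string {
  return path.join(os.tmpdir(), "openclaw");
}

// Sketch of the allowlist check with the managed tmp dir added as a root.
function isAllowedAbsoluteReplyMediaPathSketch(
  candidate: string,
  existingRoots: string[],
  resolveTmpDir: () => string = resolveTmpDirSketch,
): boolean {
  const roots = [...existingRoots];
  try {
    roots.push(resolveTmpDir());
  } catch {
    // Graceful degradation: if tmp dir resolution fails, fall back to the
    // roots that were already allowed.
  }
  return roots.some((root) => isPathInside(root, candidate));
}
```

Normalizing before the containment check matters here: a path such as /tmp/openclaw/../etc/passwd resolves outside the root and is rejected, and a sibling directory such as /tmp/openclaw-evil does not match by prefix.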

Impact

  • TTS audio delivery works again on all channels
  • Any tool writing media to /tmp/openclaw/ benefits from this fix
  • No security regression: the tmp dir is already managed/secured by OpenClaw (0o700 perms, ownership checks)

Changed files

  • src/auto-reply/reply/reply-media-paths.ts (modified, +13/-0)

PR #64573: fix(browser): inject DISPLAY=:0 fallback when spawning Chrome on Linux

Description (problem / solution / changelog)

Summary

  • Problem: Non-headless Chrome spawned by the browser extension fails to start on Linux/WSL2 with Missing X server or $DISPLAY, because the launch context (WSL2 shell, systemd user unit, launchd agent) did not propagate DISPLAY into the spawned process.
  • Why it matters: WSL2 users with a non-snap Chrome cannot use non-headless browser control at all — the Chrome bootstrap dies before CDP comes up, and the user sees GatewayClientRequestError: Failed to start Chrome CDP on port 18800.
  • What changed: Extracted the Chrome spawn environment into a pure buildChromeSpawnEnv helper and have it inject DISPLAY=:0 only on Linux, only in non-headless mode, and only when the caller has not already set one.
  • What did NOT change (scope boundary): macOS, Windows, headless Chrome, and any launch where DISPLAY is already set in the parent env. No launch args changed. No behavior change for buildOpenClawChromeLaunchArgs.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #64464
  • Related #
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: launchOpenClawChrome in extensions/browser/src/browser/chrome.ts builds a spawn env that starts from process.env and explicitly overrides HOME. When the parent process has no DISPLAY (common in WSL2 invocations that did not inherit WSLg's env, systemd user units, or the gateway running as a daemon), the spawned Chrome has no X display and the ozone/X11 platform initialization aborts before CDP is reachable.
  • Missing detection / guardrail: Headless mode masked this because --headless=new does not touch X11, so the bug only surfaces for users who explicitly set headless: false. There was no unit test covering the spawn env — only the launch args were tested.
  • Contributing context (if known): Reporter noted the exact fix in the issue, including the file and line (extensions/browser/src/browser/chrome.ts inside the spawnOnce closure of launchOpenClawChrome).

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: extensions/browser/src/browser/chrome.spawn-env.test.ts (new)
  • Scenario the test should lock in: buildChromeSpawnEnv must inject DISPLAY=:0 on Linux in non-headless mode when the base env has none, and must leave DISPLAY unchanged in every other combination (Linux+headless, Linux+existing DISPLAY, macOS, Windows).
  • Why this is the smallest reliable guardrail: The spawn env is pure data assembly — a per-platform truth table is the cheapest test that fully covers the rule set, and it cannot regress silently because it does not depend on a real Chrome binary.
  • Existing test that already covers this (if any): None. chrome.launch-args.test.ts covered buildOpenClawChromeLaunchArgs but not the env bag.
  • If no new test is added, why not: N/A — a new test file is added.

User-visible / Behavior Changes

WSL2 and Linux users launching non-headless Chrome without DISPLAY in their env will now succeed (previously failed). No change for users who already had DISPLAY set, nor for macOS/Windows users, nor for headless Chrome.

Diagram (if applicable)

Before (Linux non-headless, no DISPLAY in parent env):
  spawn(chrome, { env: { ...process.env, HOME } })
    -> chrome stderr: "Missing X server or $DISPLAY"
    -> CDP never comes up -> "Failed to start Chrome CDP on port 18800"

After:
  spawn(chrome, { env: buildChromeSpawnEnv({ base: process.env, platform, headless, home }) })
    -> env includes DISPLAY=":0" (only injected on Linux + non-headless + unset)
    -> chrome initializes X11 ozone platform -> CDP ready
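
The rule in the diagram above can be sketched as a pure function. The input field names (base, platform, headless, home) follow the call shown in the diagram; the function body, the types, and the Sketch suffix are assumptions for illustration, not the code added to chrome.ts.

```typescript
type SpawnEnv = Record<string, string | undefined>;

interface ChromeSpawnEnvInput {
  base: SpawnEnv;    // typically process.env
  platform: string;  // a process.platform value such as "linux", "darwin", "win32"
  headless: boolean;
  home: string;
}

// Assemble the env for the Chrome child process without mutating `base`.
function buildChromeSpawnEnvSketch(input: ChromeSpawnEnvInput): SpawnEnv {
  // HOME is always pinned, on every platform.
  const env: SpawnEnv = { ...input.base, HOME: input.home };

  // An empty DISPLAY is not a valid X display, so it counts as unset.
  const hasDisplay = typeof env.DISPLAY === "string" && env.DISPLAY.length > 0;

  // Inject the fallback only on Linux, only when non-headless,
  // and only when the caller has not provided a DISPLAY.
  if (input.platform === "linux" && !input.headless && !hasDisplay) {
    env.DISPLAY = ":0";
  }
  return env;
}
```

Keeping the helper free of side effects is what makes the per-platform truth table in the regression test plan cheap to write: each case is an input object and an expected DISPLAY value.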

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No — same spawn(exe.path, args, ...) call, same args, only the env bag is assembled via a helper.
  • Data access scope changed? No

Repro + Verification

Environment

  • OS: WSL2 Ubuntu (per reporter); tests run on Windows 11 host, Node 22
  • Runtime/container: Native host Chrome (non-snap)
  • Model/provider: N/A — this is browser-subprocess plumbing
  • Integration/channel: browser extension (extensions/browser)
  • Relevant config: browser.headless = false, no DISPLAY in parent env

Steps

  1. On WSL2 with non-snap Chrome installed, start the browser extension with headless: false.
  2. Observe Chrome fails to come up, stderr contains Missing X server or $DISPLAY.
  3. Apply this patch; repeat — Chrome starts, CDP becomes reachable on 18800.

Expected

Chrome starts; CDP handshake succeeds.

Actual (before the fix)

The gateway reports GatewayClientRequestError: Failed to start Chrome CDP on port 18800 for profile "openclaw". Chrome's stderr contains Missing X server or $DISPLAY.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

New test file results:

 RUN  v4.1.4
 Test Files  2 passed (2)
      Tests  7 passed (7)

Tests run:

  • chrome.spawn-env.test.ts: 6 cases (Linux non-headless injects, existing DISPLAY preserved, Linux headless no-op, macOS no-op, Windows no-op, HOME pinned on all platforms).
  • chrome.launch-args.test.ts: existing smoke test still green.

oxlint on the touched files: Found 0 warnings and 0 errors.

Human Verification (required)

  • Verified scenarios:
    • Ran vitest against the new and existing browser test files locally — all 7 pass.
    • Ran oxlint against chrome.ts and the new test file — clean.
    • Walked the spawnOnce closure to confirm resolved.headless is in scope at the call site.
    • Confirmed buildOpenClawChromeLaunchArgs already gates --headless=new on resolved.headless, so headless branches cannot observe the DISPLAY injection.
  • Edge cases checked:
    • DISPLAY="" (empty string) in parent env — treated as unset and the :0 fallback is applied. Empty DISPLAY is not a valid X display, so this is the intended outcome.
    • User-set DISPLAY=:1 (remote X) — preserved, never clobbered.
  • What I did NOT verify: I did not spin up a real WSL2 + Chrome session end-to-end — the reporter's trace already demonstrates the failure mode and the fix matches their suggested remediation. I relied on unit tests for the env-assembly logic.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: Injected DISPLAY=:0 on a Linux host that has no X server at all (e.g. a headless Linux box misconfigured with headless: false) would not suddenly make Chrome work — it would still fail.
    • Mitigation: The failure mode is identical to today (Chrome exits with an X error), so this is a strict no-regression. Headless users (the correct path for fully headless boxes) are untouched.
  • Risk: A user who relies on DISPLAY being absent for some downstream behavior would see a surprise.
    • Mitigation: The injection is scoped to the Chrome spawn env only; it does not mutate process.env of the parent, and it only activates when DISPLAY is already absent from the base env.

Changed files

  • extensions/browser/src/browser/chrome.spawn-env.test.ts (added, +83/-0)
  • extensions/browser/src/browser/chrome.ts (modified, +39/-5)

Code Example

[debug] embedded run tool start: tool=tts
[debug] TTS: starting with provider openai, fallbacks: elevenlabs, microsoft, minimax, vydra
[debug] embedded run tool end: tool=tts
[debug] dropping blocked reply media /tmp/openclaw/tts-w1q7or/voice-1775861103004.mp3:
Error: Absolute host-local MEDIA paths are blocked in normal replies.
Use a safe relative path or the message tool.

Bug type

Regression (worked before, now fails)

Beta release blocker

No

Summary

The TTS tool generates audio to /tmp/openclaw/tts-*/voice-*.mp3, but the reply media path normalizer (createReplyMediaPathNormalizer in agent-runner.runtime) drops it with: dropping blocked reply media /tmp/openclaw/tts-w1q7or/voice-1775861103004.mp3: Error: Absolute host-local MEDIA paths are blocked in normal replies. Use a safe relative path or the message tool.

Steps to reproduce

  1. Configure TTS with any provider (OpenAI-compatible endpoint, ElevenLabs, Edge, etc.)
  2. Connect any channel (Telegram, etc.)
  3. Send a message requesting TTS (e.g. :speaker: say hello)
  4. TTS tool runs, generates MP3 to /tmp/openclaw/tts-*/voice-*.mp3
  5. Reply pipeline drops the media path silently (only visible at debug log level)
  6. Channel receives no reply
  7. Typing indicator runs for 60s then times out

Expected behavior

TTS-generated audio under /tmp/openclaw/ should be treated as managed generated media and delivered through the channel's sendMedia path, consistent with how buildMediaLocalRoots already includes preferredTmpDir (/tmp/openclaw/) as an allowed media root.

Actual behavior

isAllowedAbsoluteReplyMediaPath checks two sources:

  1. isManagedGlobalReplyMediaPath — allows ~/.openclaw/media/outbound and ~/.openclaw/media/tool-*
  2. workspaceDir/.openclaw/media and sandboxRoot/.openclaw/media

Neither includes /tmp/openclaw/, which is where the TTS tool writes its output.

However, buildMediaLocalRoots (in local-roots) does include /tmp/openclaw/ via resolvePreferredOpenClawTmpDir(). This means the media loading allowlist recognizes the path, but the reply normalizer security check does not. The two allowlists are inconsistent.
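
One way to catch this kind of drift is a small guardrail that compares the two root lists. The sketch below is an assumption-laden simplification: it treats both allowlists as flat lists of directory roots and compares them by exact normalized path, whereas the real checks also involve patterns such as tool-* and workspace-relative roots. The function name is hypothetical.

```typescript
import * as path from "node:path";

// Return the media-loading roots that the reply allowlist does not cover.
// Simplification: roots are compared by exact normalized path only.
function findRootsMissingFromReplyAllowlist(
  loaderRoots: string[],
  replyRoots: string[],
): string[] {
  const normalize = (p: string): string => path.resolve(p);
  const reply = new Set(replyRoots.map(normalize));
  return loaderRoots.map(normalize).filter((root) => !reply.has(root));
}
```

A unit test asserting that this list is empty for the production roots would fail as soon as a root is added to one allowlist and not the other.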

OpenClaw version

OpenClaw 2026.4.9

Operating system

Ubuntu 24.04

Install method

docker

Model

grok-4-1-fast

Provider / routing chain

local docker host tunnel

Additional provider/model setup details

No response

Logs, screenshots, and evidence

[debug] embedded run tool start: tool=tts
[debug] TTS: starting with provider openai, fallbacks: elevenlabs, microsoft, minimax, vydra
[debug] embedded run tool end: tool=tts
[debug] dropping blocked reply media /tmp/openclaw/tts-w1q7or/voice-1775861103004.mp3:
Error: Absolute host-local MEDIA paths are blocked in normal replies.
Use a safe relative path or the message tool.

Impact and severity

  • Affected: All channels using TTS tool-generated audio (channel-agnostic; likely affects Telegram and others)
  • Severity: High (blocks TTS audio delivery entirely; audio is generated but never reaches the channel)
  • Frequency: 100%, every TTS tool invocation on OpenClaw 2026.4.9
  • Consequence: TTS audio never delivered to users; typing indicator hangs for 60s then times out; no error visible without debug logging

Additional information

first known bad version: 2026.4.9

extent analysis

TL;DR

The TTS tool's generated audio files are being dropped due to an inconsistent allowlist in the reply normalizer security check, which can be fixed by updating the isAllowedAbsoluteReplyMediaPath function to include the /tmp/openclaw/ directory.

Guidance

  • Review the isAllowedAbsoluteReplyMediaPath function to ensure it includes the /tmp/openclaw/ directory, which is already recognized by the buildMediaLocalRoots function.
  • Verify that the createReplyMediaPathNormalizer function is correctly configured to allow relative paths or use the message tool.
  • Check the debug logs to confirm that the issue is indeed caused by the reply normalizer security check.
  • Consider updating the isManagedGlobalReplyMediaPath function to include the /tmp/openclaw/ directory, to maintain consistency with the buildMediaLocalRoots function.

Example

No code snippet accompanies this analysis. The actual code change is in PR #64544, which modifies src/auto-reply/reply/reply-media-paths.ts.

Notes

The first known bad version is OpenClaw 2026.4.9. The fix is for isAllowedAbsoluteReplyMediaPath to accept the /tmp/openclaw/ directory; PR #64544 does this by adding resolvePreferredOpenClawTmpDir() as an allowed root.

Recommendation

Update isAllowedAbsoluteReplyMediaPath to accept the /tmp/openclaw/ directory, consistent with the existing buildMediaLocalRoots function. PR #64544 implements this change.
