openclaw - ✅(Solved) Fix [Bug]: TTS elevenlabs provider generates audio but OpenClaw plays OpenAI voice instead [2 pull requests, 4 comments, 5 participants]

GitHub stats
openclaw/openclaw#52186 · Fetched 2026-04-08 01:14:32
Comments: 4
Participants: 5
Timeline: 10
Reactions: 1
Timeline (top): commented ×4, cross-referenced ×2, labeled ×2, mentioned ×1

OpenAI voice plays despite ElevenLabs processing the request successfully.

PR fix notes

PR #57953: fix(tts): restore 3.28 schema compatibility and fallback observability

Description (problem / solution / changelog)

Summary

  • Problem: v2026.3.28 introduced a schema/validation mismatch where legacy messages.tts.<provider> and plugins.entries.voice-call.config.tts.<provider> shapes (notably elevenlabs) were rejected even though runtime/docs still treated them as valid.
  • Why it matters: openclaw status, openclaw doctor, and voice-call plugin registration could fail against previously working configurations.
  • What changed:
    • Restored legacy TTS provider-shape compatibility in speech-core config migration/validation.
    • Aligned voice-call plugin TTS schema/docs with tts.providers.<id> plus legacy compatibility handling.
    • Added fallback observability: provider attempt chain and fallback metadata are persisted and surfaced in /tts status, plus telephony fallback warning logs.
    • Added regression tests for legacy config migration/validation and command/status fallback reporting.
  • What did NOT change (scope boundary): this PR does not add per-agent Discord voice assignment behavior or force-provider lock semantics; it focuses on 3.28 schema regression + fallback observability.
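
The compatibility handling listed above can be pictured as a small normalization step. The following is a minimal sketch, assuming the legacy shape places provider settings directly under tts.<provider> and the newer shape nests them under tts.providers.<id>, as the summary suggests. The function name, the KNOWN_PROVIDERS list, and the voiceId/modelId fields are hypothetical and are not taken from the OpenClaw source.

```typescript
// Hypothetical types: the real OpenClaw config schema may differ.
type ProviderSettings = Record<string, unknown>;

interface TtsConfig {
  provider?: string;
  providers?: Record<string, ProviderSettings>;
  [key: string]: unknown;
}

// Assumed list of provider ids that may appear as legacy direct keys.
const KNOWN_PROVIDERS = ["elevenlabs", "openai", "microsoft"] as const;

function isPlainObject(value: unknown): value is ProviderSettings {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Moves legacy `tts.<provider>` blocks under `tts.providers.<provider>`.
// Settings already present under `providers` win over the legacy block.
function migrateLegacyTtsProviders(input: TtsConfig): TtsConfig {
  const providers: Record<string, ProviderSettings> = { ...(input.providers ?? {}) };
  const output: TtsConfig = { ...input, providers };
  for (const id of KNOWN_PROVIDERS) {
    const legacy = input[id];
    if (!isPlainObject(legacy)) continue;
    providers[id] = { ...legacy, ...(providers[id] ?? {}) };
    delete output[id];
  }
  return output;
}

const migrated = migrateLegacyTtsProviders({
  provider: "elevenlabs",
  elevenlabs: { voiceId: "voice-123", modelId: "model-abc" },
});
console.log(JSON.stringify(migrated, null, 2));
```

Letting the newer providers block win over the legacy key is a design choice made for this sketch only; the PR text does not say which side takes precedence.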

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #57314
  • Closes #56983
  • Related #46015
  • Related #52186
  • This PR fixes a bug or regression

Cluster Item Coverage

  • #57314 (messages.tts.elevenlabs rejected by CLI validator): Completely addressed.
    • Legacy provider key shape is accepted/migrated again and no longer treated as unrecognized.
  • #56983 (voice-call tts.elevenlabs rejected as unrecognized): Completely addressed.
    • Voice-call TTS config parsing now accepts the documented/legacy shape and aligns with schema behavior.
  • #46015 (silent fallback to Edge in Discord + no visibility): Partially addressed.
    • Addressed: fallback path is now visible via /tts status attempt chain/details and telephony warning logs.
    • Not addressed here: per-agent Discord voice support and hard prevention of provider fallback behavior.
  • #52186 (ElevenLabs call observed but OpenAI voice playback): Partially addressed.
    • Addressed: fallback attempts/selected provider are now surfaced for diagnosis.
    • Not addressed here: enforcing provider stickiness or changing provider-selection policy.

Root Cause / Regression History (if applicable)

  • Root cause: 3.28 schema path drift between accepted config surfaces and runtime/documented shapes for TTS provider config, especially legacy direct provider keys.
  • Missing detection / guardrail: insufficient regression coverage for legacy shape acceptance across CLI validation + plugin validation surfaces.
  • Prior context (git blame, prior PR, issue, or refactor if known): surfaced after the 3.24 -> 3.28 upgrade reports in #57314 and #56983.
  • Why this regressed now: schema tightening/normalization was not mirrored by compatibility migration at all validation entry points.
  • If unknown, what was ruled out: ruled out runtime provider registration itself as sole cause (runtime synthesis still worked in affected reports).
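
One way to read the "not mirrored at all validation entry points" point is that each entry point called a strict validator directly. The sketch below is an assumption-laden illustration of routing every entry point through one shared wrapper that migrates first and validates second. The names withCompat, strictValidate, and foldElevenlabs are hypothetical, and the allowed-key set is invented for the example.

```typescript
type Config = Record<string, unknown>;
type Migrate = (config: Config) => Config;
type Validate = (config: Config) => string[];

// Composes a migration step with a validator so callers cannot skip the migration.
function withCompat(migrate: Migrate, validate: Validate): Validate {
  return (config) => validate(migrate(config));
}

// Hypothetical strict validator: any key outside the allowed set is an error.
const strictValidate: Validate = (config) =>
  Object.keys(config)
    .filter((key) => key !== "provider" && key !== "providers")
    .map((key) => `Unrecognized key: "${key}"`);

// Hypothetical migration: folds a legacy `elevenlabs` key into `providers`.
const foldElevenlabs: Migrate = (config) => {
  const { elevenlabs, ...rest } = config;
  if (elevenlabs === undefined) return config;
  const existing = (rest["providers"] as Config | undefined) ?? {};
  return { ...rest, providers: { ...existing, elevenlabs } };
};

// CLI status, doctor, and plugin registration would all share this one function.
const validateTts = withCompat(foldElevenlabs, strictValidate);

const legacyTts: Config = { provider: "elevenlabs", elevenlabs: { voiceId: "voice-123" } };
console.log(strictValidate(legacyTts)); // the legacy key is reported as unrecognized
console.log(validateTts(legacyTts));    // no errors once the migration runs first
```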

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file:
    • src/config/legacy-migrate.test.ts
    • src/config/config-misc.test.ts
    • src/config/config.plugin-validation.test.ts
    • src/config/schema.test.ts
    • src/auto-reply/reply/commands-tts.test.ts
    • extensions/voice-call/src/config.test.ts
    • extensions/voice-call/src/telephony-tts.test.ts
  • Scenario the test should lock in: legacy provider-key configs remain accepted/migrated; fallback attempt/final provider visibility remains available in status/logs.
  • Why this is the smallest reliable guardrail: it covers both config-surface compatibility and user-visible fallback diagnostics where regressions were reported.
  • Existing test that already covers this (if any): expanded from existing config + voice-call test suites.
  • If no new test is added, why not: N/A.

User-visible / Behavior Changes

  • openclaw status / validation no longer reject legacy messages.tts.<provider> shapes like messages.tts.elevenlabs.
  • Voice-call plugin no longer rejects legacy plugins.entries.voice-call.config.tts.<provider> keys like tts.elevenlabs.
  • /tts status now shows fallback chain/details from the latest attempt.
  • Voice-call telephony emits explicit warning logs when fallback provider is used.
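
The last bullet can be illustrated with a small helper that produces a warning line only when the provider that produced audio differs from the configured one. This is a sketch under stated assumptions: the TtsResult shape, its field names, and the message format are hypothetical, since the PR text does not show the actual log output.

```typescript
// Hypothetical result shape; the actual OpenClaw TTS result type is not shown in the PR text.
interface TtsResult {
  configuredProvider: string;   // provider requested in config
  selectedProvider: string;     // provider that actually produced the audio
  attemptedProviders: string[]; // providers tried, in order
}

// Returns a warning line when a fallback provider was used, otherwise undefined.
function fallbackWarning(result: TtsResult): string | undefined {
  if (result.selectedProvider === result.configuredProvider) return undefined;
  const chain = result.attemptedProviders.join(" -> ");
  return `TTS fallback: configured=${result.configuredProvider} used=${result.selectedProvider} chain=${chain}`;
}

const warning = fallbackWarning({
  configuredProvider: "elevenlabs",
  selectedProvider: "openai",
  attemptedProviders: ["elevenlabs", "openai"],
});
if (warning !== undefined) console.warn(warning);
```

Returning undefined on the non-fallback path keeps the log quiet in the common case, which matches the "only when fallback occurs" mitigation stated under Risks and Mitigations.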

Diagram (if applicable)

Before:
legacy tts provider keys -> schema rejects key -> status/doctor/plugin fail

After:
legacy tts provider keys -> compatibility migration/acceptance -> validation passes -> runtime + diagnostics available

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: Node 22+/pnpm workspace
  • Model/provider: ElevenLabs/OpenAI/Microsoft TTS paths
  • Integration/channel (if any): CLI + voice-call
  • Relevant config (redacted): messages.tts.* and plugins.entries.voice-call.config.tts.*

Steps

  1. Configure legacy direct provider keys under messages.tts and voice-call plugin tts.
  2. Run validation/status flows and voice-call plugin registration.
  3. Trigger fallback path and inspect /tts status and voice-call logs.

Expected

  • Legacy shapes accepted.
  • Voice-call plugin registers.
  • Fallback chain/details visible.

Actual

  • Matched expected in targeted validation/tests.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios:
    • Targeted tests for config migration/schema/plugin validation and fallback observability.
  • Edge cases checked:
    • Success and failure fallback attempt reporting.
    • Voice-call telephony fallback warning logging.
  • What you did not verify:
    • Full pnpm check / full-suite e2e outside touched TTS surfaces.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: compatibility behavior differs across obscure provider-key variants.
    • Mitigation: compatibility normalization + targeted regression coverage across config and plugin validation.
  • Risk: fallback visibility could be noisy.
    • Mitigation: scoped to status output and explicit warning logs only when fallback occurs.

Changed files

  • apps/macos/Sources/OpenClaw/HostEnvSecurityPolicy.generated.swift (modified, +0/-1)
  • docs/gateway/doctor.md (modified, +4/-0)
  • docs/plugins/voice-call.md (modified, +21/-11)
  • docs/tools/tts.md (modified, +4/-0)
  • docs/tts.md (modified, +4/-0)
  • extensions/speech-core/src/tts.ts (modified, +69/-7)
  • extensions/voice-call/index.ts (modified, +7/-7)
  • extensions/voice-call/openclaw.plugin.json (modified, +162/-110)
  • extensions/voice-call/src/runtime.ts (modified, +1/-0)
  • extensions/voice-call/src/telephony-tts.test.ts (modified, +24/-1)
  • extensions/voice-call/src/telephony-tts.ts (modified, +16/-1)
  • src/auto-reply/reply/commands-tts.test.ts (added, +120/-0)
  • src/auto-reply/reply/commands-tts.ts (modified, +12/-0)
  • src/config/config-misc.test.ts (modified, +75/-0)
  • src/config/config.plugin-validation.test.ts (modified, +75/-5)
  • src/config/legacy-migrate.test.ts (modified, +124/-0)
  • src/config/legacy.migrations.runtime.ts (modified, +92/-7)
  • src/infra/host-env-security-policy.json (modified, +0/-1)
  • src/plugins/contracts/tts.contract.test.ts (modified, +25/-0)

PR #57954: feat(tts): add structured provider diagnostics and fallback attempt analytics

Description (problem / solution / changelog)

Summary

  • Problem: fallback happened without enough structured detail to diagnose provider selection failures quickly.
  • Why it matters: users and maintainers need to distinguish configuration skips, timeouts, and provider API failures without deep log forensics.
  • What changed:
    • Added structured TTS attempt analytics (provider, outcome, reasonCode, latencyMs, error) through speech-core result surfaces.
    • Surfaced attempt details in /tts status output.
    • Enriched OpenAI/ElevenLabs provider API errors with parsed body details + request IDs when available.
    • Added telephony-path verbose skip/failure diagnostics and preserved attempt metadata.
    • Removed double-prefix TTS failure wrapping path.
  • What did NOT change (scope boundary): no provider-selection policy rewrite, no forced provider stickiness, no per-agent Discord voice assignment behavior.
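
To make the attempt analytics concrete, here is a minimal sketch of a fallback loop that records one entry per provider. The field names (provider, outcome, reasonCode, latencyMs, error) come from the summary above; the outcome values, the reason-code strings, the TtsProvider interface, and the function name are assumptions. Providers are modeled as synchronous callbacks to keep the sketch short, whereas real synthesis would be asynchronous.

```typescript
// Field names follow the PR summary; the value sets below are assumptions.
type AttemptOutcome = "success" | "failed" | "skipped";

interface TtsAttempt {
  provider: string;
  outcome: AttemptOutcome;
  reasonCode?: string; // hypothetical codes such as "not_configured" or "provider_error"
  latencyMs: number;
  error?: string;
}

interface TtsProvider {
  id: string;
  isConfigured: () => boolean;
  synthesize: (text: string) => Uint8Array; // synchronous only for brevity
}

interface TtsOutcome {
  audio?: Uint8Array;
  provider?: string;
  attempts: TtsAttempt[];
}

// Tries each provider in order and records every attempt, including skips.
function synthesizeWithFallback(text: string, chain: TtsProvider[]): TtsOutcome {
  const attempts: TtsAttempt[] = [];
  for (const p of chain) {
    if (!p.isConfigured()) {
      attempts.push({ provider: p.id, outcome: "skipped", reasonCode: "not_configured", latencyMs: 0 });
      continue;
    }
    const started = Date.now();
    try {
      const audio = p.synthesize(text);
      attempts.push({ provider: p.id, outcome: "success", latencyMs: Date.now() - started });
      return { audio, provider: p.id, attempts };
    } catch (err) {
      attempts.push({
        provider: p.id,
        outcome: "failed",
        reasonCode: "provider_error",
        latencyMs: Date.now() - started,
        error: err instanceof Error ? err.message : String(err),
      });
    }
  }
  return { attempts };
}

const result = synthesizeWithFallback("hello", [
  { id: "elevenlabs", isConfigured: () => true, synthesize: () => { throw new Error("HTTP 401"); } },
  { id: "openai", isConfigured: () => true, synthesize: () => new Uint8Array([1, 2, 3]) },
]);
console.log(result.provider, result.attempts.map((a) => `${a.provider}:${a.outcome}`));
```

In this example the first provider throws and the second succeeds, so the recorded chain distinguishes a failed primary from a successful fallback instead of exposing only the final provider.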

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Related #46015
  • Related #52186
  • Depends on #57953
  • This PR fixes a bug or regression

Root Cause / Regression History (if applicable)

  • Root cause: fallback telemetry captured only coarse final state, not structured per-attempt outcomes, and provider API errors were status-only.
  • Missing detection / guardrail: no explicit tests for detailed diagnostics formatting and request-id propagation.
  • Prior context (git blame, prior PR, issue, or refactor if known): follow-up scope from the 3.28 schema-regression bucket to improve fallback observability.
  • Why this regressed now: observability requirements expanded after cluster triage; existing behavior was under-instrumented rather than newly broken.
  • If unknown, what was ruled out: ruled out transport-only failure as the sole issue; diagnostics gap existed at provider and aggregation layers.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file:
    • src/auto-reply/reply/commands-tts.test.ts
    • extensions/openai/tts.test.ts
    • extensions/elevenlabs/tts.test.ts
    • extensions/voice-call/src/telephony-tts.test.ts
    • src/plugins/contracts/tts.contract.test.ts
  • Scenario the test should lock in: fallback attempt details and provider error diagnostics remain visible and machine-readable for status/logging paths.
  • Why this is the smallest reliable guardrail: it covers command surface, provider surface, telephony surface, and contract-level behavior with minimal unrelated scope.
  • Existing test that already covers this (if any): expanded existing OpenAI/TTS command and TTS contract coverage; added ElevenLabs diagnostics coverage.
  • If no new test is added, why not: N/A.

User-visible / Behavior Changes

  • /tts status now includes Attempt details: with structured per-provider outcomes.
  • OpenAI and ElevenLabs error messages now include parsed API error details and request IDs (when present).
  • Telephony fallback/skip behavior logs richer verbose diagnostics.
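
As an illustration of the "Attempt details:" section, the sketch below renders one line per attempt. Only the heading text is taken from the PR; the line layout, the AttemptDetail type, and the function name are assumptions, and the actual /tts status output may look different.

```typescript
interface AttemptDetail {
  provider: string;
  outcome: string;
  reasonCode?: string;
  latencyMs: number;
}

// Renders one line per attempt under an "Attempt details:" heading.
// The layout is assumed; the PR does not show the real output format.
function formatAttemptDetails(attempts: AttemptDetail[]): string {
  const lines = attempts.map((a) => {
    const reason = a.reasonCode ? ` reason=${a.reasonCode}` : "";
    return `  - ${a.provider}: ${a.outcome}${reason} (${a.latencyMs}ms)`;
  });
  return ["Attempt details:", ...lines].join("\n");
}

console.log(
  formatAttemptDetails([
    { provider: "elevenlabs", outcome: "failed", reasonCode: "timeout", latencyMs: 5000 },
    { provider: "openai", outcome: "success", latencyMs: 420 },
  ]),
);
```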

Diagram (if applicable)

Before:
request -> fallback maybe occurs -> only final provider + coarse error string

After:
request -> per-provider attempts recorded -> status/logs show outcome/reason/latency + provider API detail

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: Node 22+/pnpm workspace
  • Model/provider: OpenAI, ElevenLabs, Microsoft
  • Integration/channel (if any): /tts command + voice-call telephony path
  • Relevant config (redacted): messages.tts.*

Steps

  1. Trigger a TTS call where primary provider fails and fallback succeeds.
  2. Run /tts status.
  3. Trigger provider API failures and inspect surfaced error text.

Expected

  • Status includes attempt details with reason codes and latencies.
  • Provider API errors include parsed detail and request id when available.

Actual

  • Matched expected in targeted tests and build.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios:
    • Targeted tests and build across command/provider/contract/telephony touched surfaces.
  • Edge cases checked:
    • JSON vs non-JSON provider error payloads.
    • Timeout and provider-error reason code mapping.
  • What you did not verify:
    • Full repository pnpm check and full-suite e2e.
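
The "timeout and provider-error reason code mapping" edge case could look roughly like the sketch below. The reason-code strings and the convention that HTTP failures carry a numeric status property are assumptions for illustration. The error names are standard: a request aborted through AbortSignal.timeout() rejects with a TimeoutError, and a manual AbortController.abort() rejects with an AbortError.

```typescript
// Hypothetical reason codes; the PR names the concept but not the exact values.
type ReasonCode = "timeout" | "provider_error" | "unknown";

// Maps a thrown value to a coarse reason code.
function toReasonCode(err: unknown): ReasonCode {
  if (err instanceof Error) {
    // Treating a manual abort as a timeout is a simplification made in this sketch.
    if (err.name === "TimeoutError" || err.name === "AbortError") return "timeout";
    // Assumed convention: HTTP failures are thrown as Errors with a numeric `status`.
    if (typeof (err as { status?: unknown }).status === "number") return "provider_error";
  }
  return "unknown";
}

const timeoutError = new Error("request timed out");
timeoutError.name = "TimeoutError";
const httpError = Object.assign(new Error("Unauthorized"), { status: 401 });

console.log(toReasonCode(timeoutError), toReasonCode(httpError), toReasonCode("oops"));
```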

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: richer error strings may increase log noise.
    • Mitigation: bounded detail extraction + truncation + request-id only when present.
  • Risk: attempt-detail formatting drift in command output.
    • Mitigation: explicit status-output tests with fallback scenarios.
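
The "bounded detail extraction + truncation + request-id only when present" mitigation can be sketched as follows. This is illustrative only: the 200-character limit, the function names, the message layout, and the set of JSON fields probed (error, detail, message) are assumptions, and the actual helper in src/tts/provider-error-utils.ts is not shown in the PR text.

```typescript
// Assumed limit; the actual bound used by OpenClaw is not given in the PR text.
const MAX_DETAIL_CHARS = 200;

function truncate(text: string, max: number): string {
  return text.length <= max ? text : `${text.slice(0, max)}...`;
}

// Extracts a short detail string from a JSON or plain-text error body.
function extractErrorDetail(body: string): string {
  const trimmed = body.trim();
  if (trimmed === "") return "";
  try {
    const parsed: unknown = JSON.parse(trimmed);
    if (typeof parsed === "object" && parsed !== null) {
      const obj = parsed as Record<string, unknown>;
      const nested = obj["error"] ?? obj["detail"] ?? obj["message"];
      if (typeof nested === "string") return truncate(nested, MAX_DETAIL_CHARS);
      if (typeof nested === "object" && nested !== null) {
        const msg = (nested as Record<string, unknown>)["message"];
        if (typeof msg === "string") return truncate(msg, MAX_DETAIL_CHARS);
      }
    }
  } catch {
    // Not JSON: fall through and use the raw text.
  }
  return truncate(trimmed, MAX_DETAIL_CHARS);
}

// Builds the final message; the request id is appended only when one is supplied.
function formatProviderError(provider: string, status: number, body: string, requestId?: string): string {
  const detail = extractErrorDetail(body);
  let message = `${provider} TTS API error (${status})`;
  if (detail) message += `: ${detail}`;
  if (requestId) message += ` [request_id=${requestId}]`;
  return message;
}

console.log(formatProviderError("openai", 401, '{"error":{"message":"Invalid API key"}}', "req_1"));
console.log(formatProviderError("elevenlabs", 500, "Internal Server Error"));
```

Falling back to the raw (truncated) body when parsing fails covers the "JSON vs non-JSON provider error payloads" edge case without losing the provider's own wording.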

Changed files

  • CHANGELOG.md (modified, +2/-0)
  • docs/tools/tts.md (modified, +2/-0)
  • docs/tts.md (modified, +2/-0)
  • extensions/elevenlabs/tts.test.ts (added, +133/-0)
  • extensions/elevenlabs/tts.ts (modified, +52/-1)
  • extensions/openai/tts.test.ts (modified, +115/-1)
  • extensions/openai/tts.ts (modified, +55/-1)
  • extensions/speech-core/src/tts.ts (modified, +150/-29)
  • src/auto-reply/reply/commands-tts.test.ts (modified, +43/-0)
  • src/auto-reply/reply/commands-tts.ts (modified, +27/-0)
  • src/plugin-sdk/speech.ts (modified, +6/-0)
  • src/plugins/contracts/tts.contract.test.ts (modified, +176/-0)
  • src/tts/provider-error-utils.ts (added, +62/-0)

Bug type

Regression (worked before, now fails)

Summary

OpenAI voice plays despite ElevenLabs processing the request successfully.

Steps to reproduce

  • Version: 2026.3.13
  • OS: Ubuntu Linux
  • ElevenLabs API usage confirmed via portal (calls being received and processed)
  • openclaw status shows Provider: elevenlabs (configured) but Last attempt shows Provider: openai
  • Config: messages.tts.provider: elevenlabs, voiceId and modelId set correctly

Expected behavior

Expected: ElevenLabs audio played back.

Actual behavior

Actual: OpenAI voice plays despite ElevenLabs processing the request successfully.

OpenClaw version

2026.3.13

Operating system

Ubuntu 24.10

Install method

npm global

Model

claude sonnet 4.6

Provider / routing chain

openclaw -> github copilot

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Impact and severity

No response

Additional information

No response

extent analysis

Fix Plan

To resolve the issue of OpenAI voice playing despite ElevenLabs processing the request successfully, confirm that messages.tts.provider is set correctly and rule out overrides, stale cached state, or a provider fallback that causes OpenAI to be used instead.

Steps to Fix:

  1. Verify Configuration: Double-check that messages.tts.provider is set to elevenlabs in your configuration file.
  2. Check for Overrides: Ensure there are no environment variables or command-line arguments overriding the messages.tts.provider setting to use OpenAI.
  3. Clear Cache: If your application uses caching for TTS providers, clear the cache to prevent any stale configurations from being used.
  4. Update OpenClaw Configuration: If using a version of OpenClaw that supports it, update the configuration to explicitly set the TTS provider for the specific voice or model being used.

Example Code Snippet (if applicable):

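The issue does not show a programmatic configuration API, so treat the snippet below as an illustrative sketch only. openclaw.config.set is a hypothetical call, and placing voiceId and modelId under messages.tts.elevenlabs is an assumption based on the legacy messages.tts.<provider> shape described in PR #57953.

const openclaw = require('openclaw');

// Hypothetical helper: OpenClaw may not expose config.set, so adapt this to your config file.
function configureOpenClaw() {
  // Select ElevenLabs as the primary TTS provider.
  openclaw.config.set('messages.tts.provider', 'elevenlabs');
  // Assumed legacy provider-key placement for the voice and model IDs.
  openclaw.config.set('messages.tts.elevenlabs.voiceId', 'your_elevenlabs_voice_id');
  openclaw.config.set('messages.tts.elevenlabs.modelId', 'your_elevenlabs_model_id');
}

configureOpenClaw();

Replace 'your_elevenlabs_voice_id' and 'your_elevenlabs_model_id' with your actual ElevenLabs voice and model IDs.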

Verification

After applying these changes, restart your application and attempt to play audio again. Confirm which provider produced the audio by checking the Last attempt provider in openclaw status or, on builds that include PR #57953, the fallback chain shown by /tts status.

Extra Tips

  • Regularly review your configuration files and environment settings to catch any unexpected changes.
  • Consider implementing logging or monitoring to track which TTS provider is being used for each request, helping you identify issues sooner.
