openclaw - ✅(Solved) Fix [Feature]: Add MLX Talk provider MVP for local macOS TTS [1 pull request, 1 participant]

GitHub stats
openclaw/openclaw#63531 (fetched 2026-04-10 03:42:53)
Comments: 0 · Participants: 1 · Timeline: 3 · Reactions: 0
Timeline (top): labeled ×2, cross-referenced ×1

Add an mlx Talk provider for macOS so Talk Mode can use better local neural TTS without routing through ElevenLabs.

Fix Action

Fixed

PR fix notes

PR #63539: macOS: add MLX Talk provider MVP

Description (problem / solution / changelog)

Summary

  • add mlx as a real Talk provider in the macOS runtime/config path
  • add a local MLX synthesizer that generates a full utterance and reuses the existing Talk audio player
  • pin mlx-audio-swift to 0.1.2 and cover the new provider behavior with Talk tests

Notes

  • this is intentionally an MVP and skips streaming for now
  • MLX falls back to the system voice path if synthesis fails
  • supersedes #62534
  • refs #63531
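
The fallback note above can be sketched roughly as follows. This is a minimal illustration, not the code merged in the PR: `SpeechOutcome`, `SynthesisError`, `speak`, and the closure parameters are all hypothetical names. The playback and system-voice steps are injected as closures so the example runs without any audio stack.

```swift
// Hypothetical names throughout; the real types in PR #63539 may differ.
enum SpeechOutcome: Equatable {
    case mlx
    case systemFallback
}

// Placeholder error standing in for whatever MLX synthesis can throw.
struct SynthesisError: Error {}

/// Try MLX synthesis first; if it throws, speak the same text with the
/// system voice instead of failing the utterance.
func speak(
    _ text: String,
    synthesize: (String) throws -> [Float],
    playSamples: ([Float]) -> Void,
    playSystemVoice: (String) -> Void
) -> SpeechOutcome {
    do {
        let samples = try synthesize(text)
        playSamples(samples)
        return .mlx
    } catch {
        // Any synthesis failure routes to the system voice path.
        playSystemVoice(text)
        return .systemFallback
    }
}
```

Returning an outcome value is a choice made for this sketch so the fallback is observable in tests; the actual runtime may only log it.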

Testing

  • swift build --package-path apps/macos --configuration release
  • cd apps/macos && swift test --filter TalkMode

Changed files

  • CHANGELOG.md (modified, +2/-0)
  • apps/macos/Package.resolved (modified, +118/-1)
  • apps/macos/Package.swift (modified, +2/-0)
  • apps/macos/Sources/OpenClaw/TalkMLXSpeechSynthesizer.swift (added, +178/-0)
  • apps/macos/Sources/OpenClaw/TalkModeGatewayConfig.swift (modified, +7/-1)
  • apps/macos/Sources/OpenClaw/TalkModeRuntime.swift (modified, +125/-14)
  • apps/macos/Tests/OpenClawIPCTests/TalkModeGatewayConfigTests.swift (added, +48/-0)
  • apps/macos/Tests/OpenClawIPCTests/TalkModeRuntimeSpeechTests.swift (modified, +29/-6)
Original issue text

Summary

Add an mlx Talk provider for macOS so Talk Mode can use better local neural TTS without routing through ElevenLabs.

Problem to solve

The current local Talk fallback uses the default macOS system voice, which is a noticeable quality drop from ElevenLabs and makes Talk Mode feel worse when remote TTS is unavailable or undesirable. PR #62534 explored MLX-based local TTS, but it does so by branching inside the system voice path and adding a new streaming playback path. That shape is higher risk than needed for an MVP and makes cancellation, fallback, and playback semantics harder to reason about.

Proposed solution

Implement MLX as a first-class Talk provider instead of a special case inside playSystemVoice().

Scope the MVP to the smallest stable version:

  • Add talk.provider=mlx support to the existing Talk provider/config seam.
  • Keep provider selection explicit in the macOS Talk runtime.
  • Use a non-streaming MLX path for v1: synthesize one utterance, then play it with the existing TalkAudioPlayer.
  • Ensure stop() interrupts active MLX playback just like the existing Talk providers.
  • Pin mlx-audio-swift to a tagged release rather than branch: "main".
  • Keep the provider opt-in until quality and reliability are proven.
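
One way to picture explicit, opt-in provider selection is shown below. It is a sketch under assumptions: `TalkProvider` and `resolveProvider` are hypothetical names, and treating `elevenlabs` as the default is an assumption made for illustration, not something the issue specifies.

```swift
// Hypothetical enum; the raw values mirror the provider names in the issue.
enum TalkProvider: String {
    case elevenlabs
    case system
    case mlx
}

/// Resolve a configured `talk.provider` string. MLX is selected only when
/// it is named explicitly; missing or unknown values use `defaultProvider`.
/// The default of `.elevenlabs` is an assumption for this sketch.
func resolveProvider(
    _ raw: String?,
    defaultProvider: TalkProvider = .elevenlabs
) -> TalkProvider {
    guard let raw = raw else { return defaultProvider }
    return TalkProvider(rawValue: raw.lowercased()) ?? defaultProvider
}
```

Because an unknown value never resolves to `.mlx`, the provider stays opt-in: users get it only by setting `talk.provider=mlx`.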

Suggested implementation shape:

  • apps/macos/Sources/OpenClaw/TalkModeRuntime.swift: switch on provider (elevenlabs, system, mlx) instead of hiding MLX behind the system-voice path.
  • apps/macos/Sources/OpenClaw/TalkModeGatewayConfig.swift: preserve the generic provider parsing that already exists and let the runtime honor activeProvider.
  • apps/macos/Sources/OpenClaw/TalkMLXSpeechSynthesizer.swift: load the model once, synthesize a full utterance, and reuse the existing playback/cancellation contract.
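
The "load the model once" point for the synthesizer might look like the sketch below. `FakeModel` and `CachedSynthesizer` are hypothetical stand-ins: the real model type comes from mlx-audio-swift and its API is not reproduced here. The sketch is not thread-safe, so a real implementation would likely need an actor or other synchronization.

```swift
/// Stand-in for a loaded MLX model; not the mlx-audio-swift API.
final class FakeModel {
    func generate(_ text: String) -> [Float] {
        // Placeholder audio: one silent sample per character.
        return Array(repeating: 0, count: text.count)
    }
}

/// Hypothetical wrapper that loads the model lazily and reuses it.
final class CachedSynthesizer {
    private let loadModel: () -> FakeModel
    private var model: FakeModel?
    private(set) var loadCount = 0

    init(loadModel: @escaping () -> FakeModel) {
        self.loadModel = loadModel
    }

    /// Load the model on first use, then reuse it for later utterances.
    func synthesize(_ text: String) -> [Float] {
        let active: FakeModel
        if let cached = model {
            active = cached
        } else {
            active = loadModel()
            model = active
            loadCount += 1
        }
        return active.generate(text)
    }
}
```

The `loadCount` counter exists only to make the caching visible; it shows that a second utterance does not trigger a second load.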

Alternatives considered

  • Extending the current system voice path with an internal MLX branch. This is the shape in #62534 and it blurs provider boundaries.
  • Shipping streaming MLX playback first. That adds avoidable complexity for temp-file conversion, audio engine lifecycle, completion handling, and interruption semantics before basic provider behavior is proven.
  • Tuning only the built-in macOS voice. That may help somewhat, but it does not deliver a true local neural TTS option.

Impact

  • Affected: macOS Talk Mode users who want a local TTS option or dislike the current system voice fallback.
  • Severity: Medium to high. The current fallback voice quality is poor enough to materially hurt the Talk Mode experience.
  • Frequency: Whenever ElevenLabs is unavailable, disabled, or intentionally avoided.
  • Consequence: Worse UX, less willingness to use Talk Mode locally, and higher maintenance risk if MLX lands as a hidden special case instead of a real provider.

Evidence/examples

  • Prior attempt: #62534
  • Existing generic Talk provider/config seam already exists in:
    • src/config/talk.ts
    • apps/macos/Sources/OpenClaw/TalkModeGatewayConfig.swift
  • Current macOS runtime still effectively assumes ElevenLabs + system fallback in:
    • apps/macos/Sources/OpenClaw/TalkModeRuntime.swift

Additional information

This issue is intentionally narrower than #62534. The goal is a quick, stable MLX MVP first. Streaming, lower-latency playback, model selection, and richer voice controls can come later once the provider boundary and stop/fallback semantics are solid.

Extended analysis

TL;DR

Implement MLX as a first-class Talk provider in macOS Talk Mode to improve local TTS quality without routing through ElevenLabs.

Guidance

  • Add talk.provider=mlx support to the existing Talk provider/config seam to enable MLX as a provider option.
  • Modify TalkModeRuntime.swift to switch on the provider (e.g., elevenlabs, system, mlx) instead of hiding MLX behind the system-voice path.
  • Implement a non-streaming MLX path for the initial version, synthesizing one utterance and playing it with the existing TalkAudioPlayer.
  • Ensure stop() interrupts active MLX playback to maintain consistent playback semantics.
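
The stop() requirement can be illustrated with a generation-token pattern: every utterance gets a token, and stop() invalidates all earlier tokens no matter which provider produced the audio. This is only a sketch. `InterruptiblePlayer` is a hypothetical class, not the existing TalkAudioPlayer, and it does no real audio output.

```swift
/// Hypothetical player showing one way stop() can interrupt playback.
final class InterruptiblePlayer {
    private(set) var isPlaying = false
    private var generation = 0

    /// Begin playback and return a token identifying this utterance.
    func play(_ samples: [Float]) -> Int {
        generation += 1
        isPlaying = true
        return generation
    }

    /// Stop whatever is playing; tokens from earlier utterances become stale.
    func stop() {
        generation += 1
        isPlaying = false
    }

    /// True only while the utterance for `token` is still the active one.
    func isCurrent(_ token: Int) -> Bool {
        return isPlaying && token == generation
    }
}
```

A completion handler for an MLX utterance could check `isCurrent(token)` before acting, so a late callback from an interrupted utterance is ignored.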

Example

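// TalkModeRuntime.swift (illustrative sketch; the method signatures and the
// mlxSynthesizer property are assumptions, not the code merged in PR #63539)
switch provider {
case .elevenlabs:
    break // existing implementation
case .system:
    break // existing implementation
case .mlx:
    // New MLX path: synthesize the full utterance, then hand the audio to
    // the existing player. mlxSynthesizer is assumed to be a long-lived
    // TalkMLXSpeechSynthesizer so the model is loaded only once.
    mlxSynthesizer.synthesize(utterance) { audio in
        TalkAudioPlayer.play(audio)
    }
}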

Notes

This implementation focuses on a minimal, stable MLX MVP, deferring features like streaming playback and model selection for later development.

Recommendation

Apply the proposed solution to implement MLX as a first-class Talk provider, as it offers a more straightforward and maintainable approach than the alternatives considered.
