hermes - ✅(Solved) Fix [Bug]: Possible WSL2 TTS audio routing issue — playback doesn't reach Windows speakers? [1 pull requests, 1 comments, 2 participants]

hermes · 2026-04-29 17:28:27


GitHub stats

NousResearch/hermes-agent#17573 • Fetched 2026-04-30 06:46:41


Author

EternalShade3D

Participants

EternalShade3D

ygd58

Timeline (top)

labeled ×4 · commented ×1 · cross-referenced ×1 · referenced ×1


Fix / Workaround

Me + AI created a workaround that seems to work for me:

A hermes-play script converts MP3 → WAV and plays it via PowerShell SoundPlayer on Windows, and play_audio_file() checks for that script before the other players:

    # Check for custom Hermes play command first (WSL workaround)
    hermes_play = shutil.which("hermes-play")
    if hermes_play:
        players.insert(0, [hermes_play, file_path])

PR fix notes

PR #17608: fix(voice): add WSL2 PowerShell audio fallback for TTS playback

Repository: NousResearch/hermes-agent
Author: ygd58
State: open | merged: False
Link: https://github.com/NousResearch/hermes-agent/pull/17608

Description (problem / solution / changelog)

Problem

WSL2 does not expose Linux audio devices by default, causing TTS playback to fail silently — files are generated but no sound plays through Windows speakers.

Fix

When running in WSL2 and powershell.exe is available, add a PowerShell SoundPlayer fallback:

• Detect Windows %TEMP% dir dynamically via cmd.exe + wslpath (no hardcoded username)
• Convert MP3 → WAV via ffmpeg to a Windows-accessible temp path
• Play via PowerShell SoundPlayer.PlaySync()
• Clean up temp WAV after playback

Falls back to ffplay/aplay if PowerShell is unavailable. No new dependencies required.
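The steps listed in the PR description could look roughly like the sketch below. This is not the code from PR #17608, which is not reproduced on this page. The function names (`windows_temp_dir`, `ffmpeg_command`, `powershell_command`, `play_via_powershell`) and the `hermes-tts-<random>.wav` file name are hypothetical and chosen only for illustration. The sketch assumes `cmd.exe`, `wslpath`, `powershell.exe`, and `ffmpeg` are reachable from the WSL shell.

```python
import os
import shutil
import subprocess
import uuid


def _stdout(command: list[str]) -> str:
    """Run a command and return its trimmed standard output."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def windows_temp_dir() -> str:
    """Return the Windows %TEMP% directory as a WSL path (for example /mnt/c/...)."""
    # cmd.exe expands %TEMP% for the current Windows user.
    win_temp = _stdout(["cmd.exe", "/c", "echo %TEMP%"])
    # wslpath -u translates a Windows path into its WSL mount path.
    return _stdout(["wslpath", "-u", win_temp])


def ffmpeg_command(src: str, wav_path: str) -> list[str]:
    """Build the MP3 -> WAV conversion command."""
    return ["ffmpeg", "-y", "-loglevel", "quiet", "-i", src, "-f", "wav", wav_path]


def powershell_command(win_path: str) -> list[str]:
    """Build the SoundPlayer command for a Windows-style path."""
    # Double single quotes so the path survives PowerShell's '...' quoting.
    quoted = win_path.replace("'", "''")
    script = f"(New-Object Media.SoundPlayer '{quoted}').PlaySync()"
    return ["powershell.exe", "-NoProfile", "-Command", script]


def play_via_powershell(src: str) -> bool:
    """Convert src to WAV in the Windows temp dir, play it, then clean up."""
    if not (shutil.which("powershell.exe") and shutil.which("ffmpeg")):
        return False  # caller should fall back to ffplay/aplay
    wav_path = None
    try:
        name = f"hermes-tts-{uuid.uuid4().hex}.wav"
        wav_path = os.path.join(windows_temp_dir(), name)
        subprocess.run(ffmpeg_command(src, wav_path), check=True)
        win_path = _stdout(["wslpath", "-w", wav_path])
        subprocess.run(powershell_command(win_path), check=True)
        return True
    except (subprocess.CalledProcessError, OSError):
        return False
    finally:
        # Remove the temporary WAV whether or not playback succeeded.
        if wav_path and os.path.exists(wav_path):
            os.remove(wav_path)
```

Returning `False` instead of raising lets a caller continue with ffplay or aplay, which matches the fallback behavior the PR describes. The random file name is a design choice of this sketch so that two overlapping playbacks do not overwrite each other's WAV file.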

Fixes #17573

Changed files

tools/voice_mode.py (modified, +41/-0)


Bug Description

I'm not 100% sure if this is a bug or just expected WSL2 behavior, but /voice tts doesn't seem to produce audible output in my setup. The audio files are generated correctly, but I'm not hearing anything through my Windows speakers/earbuds.

It might be that WSL2 doesn't expose audio devices the same way native Linux does, but I wanted to flag it in case others are experiencing the same thing.

Steps to Reproduce

1. Run Hermes in WSL2 (Ubuntu) on Windows 11
2. Enable voice: /voice on → /voice tts
3. Send a message like "say something to test audio"
4. The CLI shows the TTS tool ran successfully and files appear in ~/.hermes/audio_cache/, but no sound plays through Windows

Expected Behavior

If TTS is enabled, I would expect to hear the audio through whatever audio device Windows is using (speakers, Bluetooth earbuds, etc.). Though maybe this needs special WSL configuration?

Actual Behavior

Audio files are generated but nothing plays. When I try ffplay manually, I see:

• ALSA lib pcm.c:2721:(snd_pcm_open_noupdate) Unknown PCM default
• sounddevice finds 0 audio devices

It looks like WSL2 might not be routing audio to Windows properly, but I'm not certain if this is a Hermes issue or a WSL limitation.

Affected Component

CLI (interactive chat), Other

Messaging Platform (if gateway-related)

N/A (CLI only)

Debug Report

Debug report uploaded:
  Report     https://paste.rs/wtryX
  agent.log  https://paste.rs/v0Lae

Operating System

Ubuntu 24.04 via WSL2 on Windows 11

Python Version

3.11.15

Hermes Version

0.11.0

Additional Logs / Traceback (optional)

Root Cause Analysis (optional)

I'm guessing the issue is that play_audio_file() in tools/voice_mode.py tries ffplay and aplay, which probably expect native ALSA/PulseAudio devices. WSL2 might not expose those by default?

I'm not super familiar with Linux audio subsystems, but it seems like the playback step doesn't have a fallback for WSL environments where audio needs to route through Windows instead.
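A WSL-specific fallback first needs a way to tell that the process is running under WSL. The sketch below uses a common heuristic: WSL kernels include "microsoft" in their release string. The function names `looks_like_wsl` and `running_in_wsl` are hypothetical, and whether Hermes detects WSL this way is not stated in the issue.

```python
import platform


def looks_like_wsl(kernel_release: str) -> bool:
    """Heuristic: WSL kernel release strings contain 'microsoft'."""
    return "microsoft" in kernel_release.lower()


def running_in_wsl() -> bool:
    """True when the current process appears to run on a WSL kernel."""
    return platform.system() == "Linux" and looks_like_wsl(platform.release())
```

The string check is kept separate from the `platform` calls so it can be exercised with sample release strings on any machine.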

Proposed Fix (optional)

Me + AI created a workaround that seems to work for me:

• Created a script that converts MP3→WAV and plays it via PowerShell's SoundPlayer
• Saves to a Windows temp folder so PowerShell can access it
• Audio now routes to my Bluetooth earbuds correctly

I'm not sure if this is the "right" way to handle it, but it got things working on my end. If it helps.

Here's what I changed and what dependencies are needed:

Dependencies

The following packages need to be installed in WSL:

sudo apt install ffmpeg libportaudio2 portaudio19-dev pulseaudio-utils alsa-utils

# Python packages (install in Hermes venv)
pip install sounddevice numpy edge-tts

Note: ffmpeg is already installed by the Hermes installer, but the others need to be added manually for voice mode to work properly in WSL2.

File 1: `/usr/local/bin/hermes-play` (new script)

#!/bin/bash
# WSL TTS Audio Playback Workaround
# Converts MP3 → WAV and plays via PowerShell SoundPlayer on Windows

WIN_TMP="/mnt/c/Users/Eternal/AppData/Local/Temp/hermes-tts.wav"
ffmpeg -i "$1" -f wav "$WIN_TMP" -loglevel quiet -y 2>/dev/null
WIN_PATH=$(wslpath -w "$WIN_TMP")
powershell.exe -NoProfile -Command "(New-Object Media.SoundPlayer '$WIN_PATH').PlaySync()"
rm -f "$WIN_TMP"

Make it executable:

sudo chmod +x /usr/local/bin/hermes-play

File 2: `tools/voice_mode.py` (modified `play_audio_file()` function)

Added hermes-play as the first player option on Linux:

    if system == "Darwin":
        players.append(["afplay", file_path])
    
    # Check for custom Hermes play command first (WSL workaround)
    hermes_play = shutil.which("hermes-play")
    if hermes_play:
        players.insert(0, [hermes_play, file_path])
    
    players.append(["ffplay", "-nodisp", "-autoexit", "-loglevel", "quiet", file_path])
    if system == "Linux":
        players.append(["aplay", "-q", file_path])
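Because hermes-play is inserted at index 0, it is tried before ffplay and aplay. How play_audio_file() consumes the `players` list is not shown in the issue, so the loop below is only an assumed shape: try each candidate in order and stop at the first one that exits with status 0. The names `first_working_player` and `run_quietly` are hypothetical.

```python
import shutil
import subprocess
from typing import Callable, Optional, Sequence


def run_quietly(command: Sequence[str]) -> int:
    """Run a player command with its output suppressed and return the exit code."""
    return subprocess.run(
        list(command), stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    ).returncode


def first_working_player(
    players: Sequence[Sequence[str]],
    run: Callable[[Sequence[str]], int] = run_quietly,
    which: Callable[[str], object] = shutil.which,
) -> Optional[str]:
    """Try each candidate in order; return the name of the first that exits 0."""
    for command in players:
        if not which(command[0]):
            continue  # binary is not installed, try the next candidate
        try:
            if run(command) == 0:
                return command[0]
        except OSError:
            continue  # could not be launched, try the next candidate
    return None
```

This kind of fallthrough only helps if a player that cannot reach an audio device exits with a non-zero status. Whether ffplay does so under WSL2 is not established in the issue, which is one reason for placing the WSL-specific player first.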

File 3: `tools/voice_mode.py` (modified `detect_audio_environment()` function)

Updated the device check to not block when WSL has PulseAudio configured:

        try:
            devices = sd.query_devices()
            if not devices:
                if termux_capture:
                    notices.append("No PortAudio devices detected, but Termux:API microphone capture is available")
                elif os.environ.get('PULSE_SERVER'):
                    notices.append("No PortAudio devices detected, but WSL PulseAudio bridge is configured -- TTS playback works")
                else:
                    warnings.append("No audio input/output devices detected")

Environment Setup

Add to ~/.bashrc for PulseAudio bridge detection:

export PULSE_SERVER=unix:/mnt/wslg/PulseServer
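Checking only that PULSE_SERVER is set does not show that the bridge is usable. A slightly stricter check, sketched below, parses the `unix:` form used above and tests whether the path is an existing socket. The function names `pulse_unix_socket` and `pulse_bridge_available` are hypothetical, and other PULSE_SERVER forms (for example `tcp:` addresses) are deliberately reported as unavailable in this sketch.

```python
import os
import stat
from typing import Mapping, Optional


def pulse_unix_socket(pulse_server: str) -> Optional[str]:
    """Extract the filesystem path from a 'unix:/path' PULSE_SERVER value."""
    prefix = "unix:"
    if pulse_server.startswith(prefix) and len(pulse_server) > len(prefix):
        return pulse_server[len(prefix):]
    return None


def pulse_bridge_available(environ: Mapping[str, str] = os.environ) -> bool:
    """True when PULSE_SERVER names a unix socket that exists on disk."""
    path = pulse_unix_socket(environ.get("PULSE_SERVER", ""))
    if path is None:
        return False
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        return False  # path is missing or not accessible
```

An existing socket still does not guarantee audible output, so a check like this is better suited to choosing between a notice and a warning than to asserting that playback works.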

How it works:

• The script converts MP3 → WAV via ffmpeg and saves to a Windows-accessible temp folder
• Uses PowerShell's SoundPlayer to play through the Windows audio stack
• play_audio_file() checks for hermes-play first, falling back to ffplay/aplay if not found
• The environment check now recognizes WSL+PulseAudio as a valid setup

Important notes:

• This is a workaround, not a proper fix. A real solution would need to handle WSL audio routing more gracefully in the codebase.
• The script uses a hardcoded Windows path. A better implementation would detect the Windows temp folder dynamically.
• Audio playback works on my setup (CMF Buds Pro 2 via Bluetooth), but I haven't tested with other audio devices.
• Voice input (microphone) still doesn't work in WSL2. This only fixes TTS playback.

Happy to help test any official fixes or refine this approach!

Are you willing to submit a PR for this?

I'd like to fix this myself and submit a PR

extent analysis

TL;DR

To fix the issue of no audible output from /voice tts in WSL2, use a workaround script that converts MP3 to WAV and plays it via PowerShell's SoundPlayer on Windows.

Guidance

Install required dependencies: In WSL, run sudo apt install ffmpeg libportaudio2 portaudio19-dev pulseaudio-utils alsa-utils and pip install sounddevice numpy edge-tts in the Hermes virtual environment.
Create a playback script: Add a script like /usr/local/bin/hermes-play that converts MP3 to WAV and plays it via PowerShell's SoundPlayer.
Modify play_audio_file(): Update the play_audio_file() function in tools/voice_mode.py to check for the custom hermes-play command first.
Configure PulseAudio bridge detection: Add export PULSE_SERVER=unix:/mnt/wslg/PulseServer to ~/.bashrc for PulseAudio bridge detection.

Example

The provided script /usr/local/bin/hermes-play demonstrates how to convert MP3 to WAV and play it via PowerShell's SoundPlayer:

#!/bin/bash
# WSL TTS Audio Playback Workaround
# Converts MP3 → WAV and plays via PowerShell SoundPlayer on Windows

WIN_TMP="/mnt/c/Users/Eternal/AppData/Local/Temp/hermes-tts.wav"
ffmpeg -i "$1" -f wav "$WIN_TMP" -loglevel quiet -y 2>/dev/null
WIN_PATH=$(wslpath -w "$WIN_TMP")
powershell.exe -NoProfile -Command "(New-Object Media.SoundPlayer '$WIN_PATH').PlaySync()"
rm -f "$WIN_TMP"

Notes

This workaround may not work for all audio devices or setups, and a proper fix would require handling WSL audio routing more gracefully in the codebase.

Recommendation

Apply the provided workaround
