hermes - ✅(Solved) Fix [Feature] /voice vad — hands-free VAD mode via slash command [2 pull requests, 1 participants]

GitHub stats
NousResearch/hermes-agent#19927 · Fetched 2026-05-05 06:04:19
Comments: 0
Participants: 1
Timeline events: 6 (labeled ×4, cross-referenced ×2)
Reactions: 0

Fix Action

Fixed

PR fix notes

PR #19929: Fix empty voice.record_key in CLI VAD mode

Description (problem / solution / changelog)

Closes #19915

Summary:

  • treat voice.record_key: "" as intentional hands-free VAD mode instead of trying to register an empty prompt_toolkit keybinding
  • centralize voice record-key normalization/display so voice enable/status output stays consistent
  • add regression tests that prove empty config skips keybinding registration
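The normalization described in the summary above can be sketched roughly as follows. This is an illustration only, not the code from PR #19929: the function names, the `DEFAULT_RECORD_KEY` value, and the `register` callback are all hypothetical stand-ins for whatever cli.py actually uses.

```python
from typing import Callable, Optional

# Hypothetical default; the real default key in hermes-agent may differ.
DEFAULT_RECORD_KEY = "c-b"


def normalize_record_key(raw: object) -> Optional[str]:
    """Return a key name, or None when the config asks for hands-free VAD.

    Assumption: a missing value (None) falls back to the default key, while
    an empty or whitespace-only string is an intentional "no key" setting.
    """
    if raw is None:
        return DEFAULT_RECORD_KEY
    key = str(raw).strip()
    return key or None


def describe_record_key(raw: object) -> str:
    """Produce one consistent label for enable/status output."""
    key = normalize_record_key(raw)
    return f"push-to-talk ({key})" if key else "hands-free VAD (no record key)"


def register_record_key(raw: object, register: Callable[[str], None]) -> bool:
    """Call `register` only when there is a real key; report whether it ran."""
    key = normalize_record_key(raw)
    if key is None:
        return False  # empty config: skip keybinding registration entirely
    register(key)
    return True
```

Routing both the display text and the registration decision through one helper is what keeps the two from disagreeing about what an empty value means.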

Validation:

  • python -m pytest tests/cli/test_voice_record_key.py -o 'addopts=' -q

Changed files

  • cli.py (modified, +33/-15)
  • tests/cli/test_voice_record_key.py (added, +77/-0)

PR #19984: feat(cli): add /voice vad command

Description (problem / solution / changelog)

Summary

  • add /voice vad to the classic CLI voice command handler
  • enable voice mode if needed, switch continuous VAD on, and start the first recording asynchronously
  • update the central slash command registry so help/autocomplete advertises vad

Scope

This intentionally stays in the classic CLI voice path. It does not change gateway voice-channel behavior or the existing /voice on|off|tts|status flows.
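As a rough picture of the flow described in the summary (dispatch on the subcommand, turn continuous mode on, start the first recording off the main thread), here is a toy model. `ToyVoiceCLI` and all of its attributes are hypothetical; it does not reproduce the real handler, the command registry, or any audio code.

```python
import threading


class ToyVoiceCLI:
    """Hypothetical stand-in for the CLI class; not the hermes-agent code."""

    SUBCOMMANDS = ("on", "off", "tts", "status", "vad")

    def __init__(self):
        self.voice_mode = False
        self.continuous = False
        self.recording = False
        self.lock = threading.Lock()
        self.messages = []

    def start_recording(self):
        # Placeholder for the real recorder; here it only flips a flag.
        with self.lock:
            self.recording = True

    def enable_vad(self):
        with self.lock:
            self.voice_mode = True
            self.continuous = True
        # Start the first recording asynchronously so the prompt stays responsive.
        worker = threading.Thread(target=self.start_recording, daemon=True)
        worker.start()
        return worker

    def handle_voice_command(self, text):
        parts = text.split()
        sub = parts[1] if len(parts) > 1 else ""
        if sub == "vad":
            return self.enable_vad()
        if sub not in self.SUBCOMMANDS:
            self.messages.append("Usage: /voice [on|off|tts|status|vad]")
        return None
```

Returning the worker thread is a convenience of this sketch so a caller or test can `join()` it; the proposal in the issue starts the thread and does not keep a handle.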

Fixes #19927

Verification

  • scripts/run_tests.sh tests/tools/test_voice_cli_integration.py tests/hermes_cli/test_commands.py -> 233 passed, 4 warnings
  • git diff --check -> passed
  • python -m ruff check cli.py tests/tools/test_voice_cli_integration.py -> unavailable locally (No module named ruff)

Changed files

  • cli.py (modified, +47/-2)
  • hermes_cli/commands.py (modified, +1/-1)
  • tests/tools/test_voice_cli_integration.py (modified, +66/-0)

Code Example

elif subcommand == "vad":
    self._enable_voice_vad()

---

def _enable_voice_vad(self):
    """Enable hands-free VAD (Voice Activity Detection) mode.

    Enables voice mode, sets continuous recording, and immediately
    starts listening — no push-to-talk key required. Recording
    auto-stops on silence and auto-restarts after each agent response.
    """
    if self._voice_mode:
        if self._voice_continuous and self._voice_recording:
            _cprint(f"{_DIM}VAD mode is already active.{_RST}")
            return
        # Already in voice mode (push-to-talk); switch to VAD
        with self._voice_lock:
            self._voice_continuous = True
    else:
        # Enable voice mode first (checks requirements)
        self._enable_voice_mode()
        if not self._voice_mode:
            return  # requirements not met
        with self._voice_lock:
            self._voice_continuous = True

    _cprint(f"\n{_ACCENT}🎤 VAD mode enabled — listening...{_RST}")
    _cprint(f"  {_DIM}Speak to start recording. Auto-stops on {self._get_silence_duration()}s silence.{_RST}")
    _cprint(f"  {_DIM}/voice off to disable{_RST}")

    def _start_vad():
        try:
            self._voice_start_recording()
            if hasattr(self, '_app') and self._app:
                self._app.invalidate()
        except Exception as e:
            _cprint(f"\n{_DIM}VAD recording failed: {e}{_RST}")

    threading.Thread(target=_start_vad, daemon=True).start()

---

def _get_silence_duration(self) -> float:
    """Return the configured silence auto-stop duration in seconds."""
    try:
        from hermes_cli.config import load_config
        return float(load_config().get("voice", {}).get("silence_duration", 3.0))
    except Exception:
        return 3.0

---

- _cprint("Usage: /voice [on|off|tts|status]")
+ _cprint("Usage: /voice [on|off|tts|status|vad]")

Feature Description

Add a /voice vad slash command that enables hands-free Voice Activity Detection mode from within a session — no config restart required.

Currently, hands-free VAD requires setting voice.record_key: "" in config.yaml and restarting the REPL. Even then, there's no way to start the initial recording — VAD mode removes the push-to-talk keybinding but doesn't auto-start listening. This makes the documented "hands-free VAD" workflow broken at the start.

Proposed Solution

Four changes to cli.py:

1. Command handler (in _handle_voice_command)

elif subcommand == "vad":
    self._enable_voice_vad()

2. New method: _enable_voice_vad()

def _enable_voice_vad(self):
    """Enable hands-free VAD (Voice Activity Detection) mode.

    Enables voice mode, sets continuous recording, and immediately
    starts listening — no push-to-talk key required. Recording
    auto-stops on silence and auto-restarts after each agent response.
    """
    if self._voice_mode:
        if self._voice_continuous and self._voice_recording:
            _cprint(f"{_DIM}VAD mode is already active.{_RST}")
            return
        # Already in voice mode (push-to-talk) — switch to VAD
        with self._voice_lock:
            self._voice_continuous = True
    else:
        # Enable voice mode first (checks requirements)
        self._enable_voice_mode()
        if not self._voice_mode:
            return  # requirements not met
        with self._voice_lock:
            self._voice_continuous = True

    _cprint(f"\n{_ACCENT}🎤 VAD mode enabled — listening...{_RST}")
    _cprint(f"  {_DIM}Speak to start recording. Auto-stops on {self._get_silence_duration()}s silence.{_RST}")
    _cprint(f"  {_DIM}/voice off to disable{_RST}")

    def _start_vad():
        try:
            self._voice_start_recording()
            if hasattr(self, '_app') and self._app:
                self._app.invalidate()
        except Exception as e:
            _cprint(f"\n{_DIM}VAD recording failed: {e}{_RST}")

    threading.Thread(target=_start_vad, daemon=True).start()

3. Helper: _get_silence_duration()

def _get_silence_duration(self) -> float:
    """Return the configured silence auto-stop duration in seconds."""
    try:
        from hermes_cli.config import load_config
        return float(load_config().get("voice", {}).get("silence_duration", 3.0))
    except Exception:
        return 3.0

4. Update usage text

- _cprint("Usage: /voice [on|off|tts|status]")
+ _cprint("Usage: /voice [on|off|tts|status|vad]")
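The fallback behavior of the silence-duration helper in step 3 is easier to see with the config passed in directly. The sketch below assumes the loaded config is a plain nested dict; `silence_duration_from` is a hypothetical name, and it catches a narrower set of exceptions than the `except Exception` in the proposal.

```python
def silence_duration_from(config, default=3.0):
    """Return voice.silence_duration as a float, or `default` if unusable."""
    try:
        return float(config.get("voice", {}).get("silence_duration", default))
    except (AttributeError, TypeError, ValueError):
        # AttributeError: config or its "voice" entry is not a mapping
        # TypeError / ValueError: the value cannot be converted to float
        return default
```

A missing section, a null `voice` entry, and a non-numeric value all fall back to the default, so the status line can always print a number.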

Behavior

| Scenario | Result |
| --- | --- |
| /voice vad (voice off) | Enables voice mode + sets continuous + starts recording immediately |
| /voice vad (voice on, push-to-talk) | Switches to VAD mode + starts recording |
| /voice vad (already recording) | No-op, "VAD mode is already active" |
| /voice vad (reqs not met) | Shows requirements warning, no crash |
| /voice off | Disables voice + continuous + stops recording |
| Agent responds | Auto-restarts recording (existing _voice_continuous logic) |
| Silence detected | Auto-stops, transcribes, restarts (existing logic) |
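The command scenarios above can be read as transitions over three flags. The following is a simplified model for reasoning about them, not the implementation: `VoiceState`, `voice_vad`, `voice_off`, and the message strings other than "VAD mode is already active." are hypothetical, and the model treats the asynchronous recording start as if it completed immediately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceState:
    voice_mode: bool = False
    continuous: bool = False
    recording: bool = False


def voice_vad(state, requirements_met=True):
    """Apply `/voice vad`; return (new_state, message)."""
    if state.voice_mode and state.continuous and state.recording:
        return state, "VAD mode is already active."
    if not state.voice_mode and not requirements_met:
        # Voice mode could not be enabled, so nothing changes.
        return state, "requirements not met"
    return VoiceState(True, True, True), "VAD mode enabled"


def voice_off(state):
    """Apply `/voice off`; everything is switched off."""
    return VoiceState(False, False, False), "voice disabled"
```

Note that the "already active" check requires all three flags, so a session in continuous mode that is momentarily not recording is restarted and is not treated as a no-op.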

Related

  • #19915: voice.record_key: "" crashes REPL startup — the empty record_key bug that prevents VAD config from working at all
  • This feature provides the in-session alternative to config-based VAD, and fixes the "first recording" gap (VAD mode had no way to start the initial recording loop)

extent analysis

TL;DR

Add a /voice vad slash command, as proposed, so that hands-free VAD can be enabled from within a session and the first recording starts immediately.

Guidance

  • Implement the _enable_voice_vad method in cli.py to enable hands-free VAD mode, which sets continuous recording and immediately starts listening.
  • Update the _handle_voice_command method to handle the new /voice vad command.
  • Add a helper method _get_silence_duration to return the configured silence auto-stop duration in seconds.
  • Update the usage text to include the new /voice vad command.

Example

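def _enable_voice_vad(self):
    # Implementation as proposed in the issue body: enable voice mode if
    # needed, set _voice_continuous, then start recording on a daemon thread.
    ...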

Notes

The proposed solution assumes that the voice.record_key: "" bug has been fixed, as mentioned in the related issue #19915.

Recommendation

Implement the /voice vad command as proposed (PR #19984 does this). It provides working hands-free VAD without editing config.yaml and restarting the REPL.
