claude-code: Feature request: JARVIS-style voice-to-voice hands-free mode for Claude Code [1 comment, 2 participants]

GitHub stats
anthropics/claude-code#50720 (fetched 2026-04-20 12:14:56)
Comments: 1 · Participants: 2 · Timeline events: 2 (commented ×1, labeled ×1) · Reactions: 0


Problem / use case

I run a field-service CRM project (HVAC dispatcher, 903+ clients, 2 crews). I spend a lot of my working time away from my desk — driving between sites, walking around, hands full. My laptop stays at home and I connect to Claude Code via Remote Desktop from my iPhone, wearing AirPods.

Right now the workflow on the phone is painful:

  • I have to tap the mic on the iOS keyboard to dictate
  • Then tap the send button
  • Then read Claude's response on screen, or swipe twice with two fingers to trigger iOS "Speak Screen"
  • Then tap mic again
  • ...repeat

What I actually want is a continuous voice-to-voice conversation — the same experience ChatGPT Advanced Voice Mode / "JARVIS" gives you. I talk, pause, Claude Code hears me, reasons, answers out loud, I can interrupt, I answer back, no taps in between.

The closest thing today is the Claude mobile app Voice Mode, but that app doesn't see my project files / git / codebase — so it can't actually help me ship features. Claude Code can ship features but has no voice layer at all.

Proposed feature

A built-in Voice Mode in Claude Code:

  • Toggleable per session: /voice on or a dedicated hotkey
  • Uses a realtime STT → model → TTS pipeline (Anthropic-hosted, ideally using Claude with native audio I/O if/when available)
  • Continuous listening with barge-in (interrupt the assistant mid-sentence)
  • Streams Claude's response to TTS so the user hears it as it's generated, not after
  • For tool calls and long file operations, speaks a short status ("reading calendar-settings.ts, running typecheck…") so the user knows work is happening
  • Keeps the full text transcript visible in the terminal / web UI exactly as today — voice is additive, not replacing
  • Works in the web session too (claude.ai/code/session_…) so remote-desktop / mobile use-cases work
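
One way to picture the "streams Claude's response to TTS" item is a small chunker that groups streamed text fragments into whole sentences before handing them to a speech engine. The following is a minimal sketch under stated assumptions, not how Claude Code works: `sentence_chunks` and `SENTENCE_END` are hypothetical names, and the rule "a terminator only ends a sentence when whitespace follows it" is an illustrative choice so that file names such as calendar-settings.ts are not split.

```python
from typing import Iterable, Iterator, Optional

# Characters treated as sentence terminators (an assumption for this sketch).
SENTENCE_END = (".", "!", "?", "…")


def _find_sentence_end(text: str) -> Optional[int]:
    """Return the index of the first terminator that is followed by whitespace."""
    for i, ch in enumerate(text[:-1]):
        if ch in SENTENCE_END and text[i + 1].isspace():
            return i
    return None


def sentence_chunks(fragments: Iterable[str]) -> Iterator[str]:
    """Group streamed text fragments into sentences suitable for a TTS engine."""
    buffer = ""
    for fragment in fragments:
        buffer += fragment
        # Flush every complete sentence currently sitting in the buffer.
        while (cut := _find_sentence_end(buffer)) is not None:
            yield buffer[: cut + 1].strip()
            buffer = buffer[cut + 1 :].lstrip()
    # Whatever is left when the stream ends is spoken as a final chunk.
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    stream = ["Reading calendar-set", "tings.ts now. Type", "check passed! Done"]
    for sentence in sentence_chunks(stream):
        print(sentence)
```

Speaking per sentence rather than per fragment keeps latency low without sending half-words to the speech engine. A real implementation would also need language-aware sentence splitting, which matters for the non-English locales requested in this issue.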

Why this is specifically valuable in Claude Code (not just chat)

Developers and small-business operators like me aren't only coding. We're reasoning about real work while in motion — triage, planning, reviewing what an agent did, deciding what to build next. Today I have to choose: tap-typing a complex instruction on a phone screen, or losing the project context by switching to the Claude chat app. A voice layer closes that gap.

Languages

Please make sure non-English locales are first-class — in my case Russian. Both STT and TTS should respect system locale / a --voice-lang flag.
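
The precedence described here (an explicit --voice-lang value first, then the system locale) could be resolved roughly as follows. This is a sketch under assumptions: --voice-lang is only the flag proposed in this issue and does not exist today, `resolve_voice_lang` is a hypothetical helper, the `en-US` fallback is an arbitrary choice, and the system locale is read from the POSIX-style `LC_ALL`, `LC_MESSAGES`, and `LANG` variables.

```python
from typing import Mapping, Optional


def resolve_voice_lang(
    flag_value: Optional[str],
    env: Mapping[str, str],
    default: str = "en-US",
) -> str:
    """Pick the STT/TTS language: explicit flag, then system locale, then default."""
    if flag_value:
        return flag_value
    # POSIX precedence: LC_ALL overrides LC_MESSAGES, which overrides LANG.
    for var in ("LC_ALL", "LC_MESSAGES", "LANG"):
        raw = env.get(var, "")
        # "ru_RU.UTF-8@modifier" -> "ru_RU": drop the encoding and the modifier.
        tag = raw.split(".")[0].split("@")[0]
        if tag and tag not in ("C", "POSIX"):
            return tag.replace("_", "-")
    return default


if __name__ == "__main__":
    import os

    print(resolve_voice_lang(None, os.environ))
```

Passing the environment in as a mapping keeps the function easy to test. On iOS or in a browser session the locale would come from a different source than these environment variables.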

Not what I'm asking for

  • Not "transcribe my dictation and paste it" — that exists everywhere
  • Not "read the last message aloud on command" — iOS Speak Screen already does that
  • What's missing is the continuous loop with barge-in
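
The "continuous loop with barge-in" can be thought of as a small state machine. The sketch below is illustrative only: the `State`, `Event`, and `VoiceLoop` names are hypothetical and do not come from Claude Code, and real audio capture, voice-activity detection, and playback are left out.

```python
from enum import Enum, auto


class State(Enum):
    LISTENING = auto()  # microphone open, waiting for the user to finish
    THINKING = auto()   # the model is working on a reply
    SPEAKING = auto()   # the reply is being played back


class Event(Enum):
    USER_SPEECH_START = auto()
    USER_SPEECH_END = auto()
    REPLY_READY = auto()
    REPLY_DONE = auto()


# (current state, event) -> next state; anything not listed is ignored.
TRANSITIONS = {
    (State.LISTENING, Event.USER_SPEECH_END): State.THINKING,
    (State.THINKING, Event.REPLY_READY): State.SPEAKING,
    (State.SPEAKING, Event.REPLY_DONE): State.LISTENING,
    # Barge-in: the user starts talking while the assistant is busy.
    (State.THINKING, Event.USER_SPEECH_START): State.LISTENING,
    (State.SPEAKING, Event.USER_SPEECH_START): State.LISTENING,
}


class VoiceLoop:
    def __init__(self) -> None:
        self.state = State.LISTENING
        self.cancelled_replies = 0

    def handle(self, event: Event) -> State:
        next_state = TRANSITIONS.get((self.state, event))
        if next_state is None:
            return self.state  # event is not meaningful in this state
        if event is Event.USER_SPEECH_START:
            # A real loop would stop playback and cancel generation here.
            self.cancelled_replies += 1
        self.state = next_state
        return self.state
```

The point of the table is that no tap is needed to move between states: the end of the user's speech starts a turn, and the start of the user's speech ends the assistant's turn.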

Alternatives considered

  • iOS Shortcuts → Claude Code CLI over SSH: fragile, can't barge in, no streaming
  • Custom pipeline with Whisper + Anthropic API + ElevenLabs: doable but out of scope for 99% of users, and loses the Claude Code tooling (git, file ops, hooks)
  • ChatGPT Advanced Voice Mode: great voice UX, no access to my project / files / codebase
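
For readers wondering what the custom-pipeline alternative involves, a single voice turn can be sketched as three pluggable stages. Everything here is a stand-in: `VoiceTurn` and its fields are hypothetical, and no real Whisper, Anthropic, or ElevenLabs SDK calls are shown. The stages are plain callables, so any provider could be wired in.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class VoiceTurn:
    transcribe: Callable[[bytes], str]        # speech-to-text stage
    respond: Callable[[str], Iterable[str]]   # model stage, yields reply parts
    speak: Callable[[str], None]              # text-to-speech stage

    def run(self, audio: bytes) -> str:
        """Run one turn: transcribe, stream the reply, speak each part."""
        prompt = self.transcribe(audio)
        parts = []
        for part in self.respond(prompt):
            self.speak(part)  # speak as soon as a part arrives
            parts.append(part)
        return "".join(parts)


if __name__ == "__main__":
    # Fake stages so the sketch runs without any network access.
    turn = VoiceTurn(
        transcribe=lambda audio: audio.decode("utf-8"),
        respond=lambda prompt: [f"You said: {prompt}. ", "Anything else?"],
        speak=print,
    )
    turn.run(b"run the typecheck")
```

As the bullet above notes, a pipeline like this has no access to the Claude Code tooling (git, file ops, hooks), which is the gap this request is about.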

Would love to see this land. Happy to beta-test on Russian + a real CRM codebase.

Thanks!

extent analysis

TL;DR

A continuous voice-to-voice mode in Claude Code, built on a real-time STT and TTS pipeline with barge-in and streamed responses, would address the user's pain points.

Guidance

  • Investigate integrating Anthropic-hosted STT and TTS services into Claude Code to enable real-time voice interactions.
  • Consider adding a toggleable voice mode (/voice on or dedicated hotkey) to allow users to switch between text and voice input.
  • Ensure the voice layer is additive, not replacing the existing text transcript, and works seamlessly in both web and mobile sessions.
  • Prioritize support for non-English locales, including Russian, for both STT and TTS.

Example

No specific code example can be provided without more context on the existing Claude Code architecture and technology stack.

Notes

The proposed feature requires significant development and integration effort: barge-in handling, streamed responses, and support for multiple locales. That complexity, and its impact on the existing codebase, should be weighed before committing to it.

Recommendation

Until a native voice layer is available in Claude Code, use a workaround built from existing tools, such as iOS Shortcuts or a custom pipeline. Both have limits that the issue author already points out: the Shortcuts route is fragile and cannot barge in or stream, and a custom pipeline is out of scope for most users.
