openclaw - 💡(How to fix) Fix DGX Spark voice-local: make STT/TTS GPU-capable and cut turn latency to a few seconds [1 participant]

GitHub stats
openclaw/openclaw#63203
Fetched 2026-04-09 07:57:04
Comments
0
Participants
1
Timeline
0
Reactions
0

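## Background

On the localhost-only voice prototype running on DGX Spark, STT and TTS currently run on the CPU path, so it can take more than 10 seconds from speaking until the response audio comes back.

## Goals

- Switch STT to a GPU-capable engine
- Run TTS on a GPU-capable engine as well
- Bring perceived latency down to a few seconds
- Keep the existing constraints: localhost only, and no token is passed to the browser

## Acceptance criteria

- A GPU STT provider and a GPU TTS provider can be selected from voice-local
- The README contains a procedure for production use on DGX Spark (Linux arm64 + NVIDIA GPU)
- A CPU fallback may remain, but the default should be the GPU path
- When one turn is measured, the STT / LLM / TTS segments each appear in the log

## Notes

- Changes to OpenClaw itself are kept to a minimum
- Partial transcripts are not sent to the LLM
- Raw audio is not stored persistently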

extent analysis

TL;DR

Moving STT and TTS from the CPU path to GPU-capable engines is likely to cut the voice prototype's turn latency, which can currently exceed 10 seconds; the issue targets a few seconds.

Guidance

  • Identify GPU-capable STT and TTS engines that run on DGX Spark (Linux arm64 + NVIDIA GPU) and can be integrated with the existing prototype.
  • Make the selected GPU engines selectable from voice-local and use them by default; a CPU fallback may remain.
  • Update the README with instructions for running the prototype on DGX Spark.
  • Log the latency of each stage (STT, LLM, TTS) separately for every measured turn.
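The provider-selection part of this guidance can be sketched as follows. This is a minimal sketch, not the prototype's actual code: the environment variable names (`VOICE_LOCAL_STT_PROVIDER`, `VOICE_LOCAL_TTS_PROVIDER`), the `Provider` type, and the `nvidia-smi` availability heuristic are all assumptions introduced for illustration. It defaults to the GPU path and falls back to CPU only when no GPU is detected.

```python
import logging
import shutil
from dataclasses import dataclass
from typing import Mapping

log = logging.getLogger("voice_local.providers")

# Hypothetical environment variable names, one per pipeline stage.
ENV_VARS = {"stt": "VOICE_LOCAL_STT_PROVIDER", "tts": "VOICE_LOCAL_TTS_PROVIDER"}
VALID_DEVICES = ("gpu", "cpu")


@dataclass(frozen=True)
class Provider:
    kind: str    # "stt" or "tts"
    device: str  # "gpu" or "cpu"


def gpu_looks_available() -> bool:
    """Rough heuristic: is an `nvidia-smi` binary on PATH?

    This does not prove that a given engine can actually use the GPU.
    """
    return shutil.which("nvidia-smi") is not None


def select_provider(kind: str, env: Mapping[str, str], has_gpu: bool) -> Provider:
    """Pick the device for one stage: GPU by default, CPU as fallback."""
    if kind not in ENV_VARS:
        raise ValueError(f"unknown provider kind: {kind!r}")
    requested = env.get(ENV_VARS[kind], "gpu").strip().lower()
    if requested not in VALID_DEVICES:
        raise ValueError(f"{ENV_VARS[kind]} must be one of {VALID_DEVICES}")
    if requested == "gpu" and not has_gpu:
        log.warning("%s: GPU requested but not detected, falling back to CPU", kind)
        return Provider(kind, "cpu")
    return Provider(kind, requested)


if __name__ == "__main__":
    import os

    has_gpu = gpu_looks_available()
    for kind in ("stt", "tts"):
        print(select_provider(kind, os.environ, has_gpu))
```

Passing `env` and `has_gpu` as arguments keeps the selection logic testable on machines without a GPU; the real engines would be constructed from the returned `Provider`.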

Example

No specific code snippet can be provided without more details on the current implementation.
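The per-stage latency logging from the guidance does not depend on the prototype's internals, so a generic sketch is possible. The `TurnTimer` class, the logger name, and the log line format below are hypothetical; the `time.sleep` calls stand in for the real STT, LLM, and TTS calls.

```python
import logging
import time
from contextlib import contextmanager
from typing import Dict, Iterator

log = logging.getLogger("voice_local.latency")


class TurnTimer:
    """Collects wall-clock durations for the stages of a single turn."""

    def __init__(self) -> None:
        self.stages: Dict[str, float] = {}  # stage name -> seconds

    @contextmanager
    def stage(self, name: str) -> Iterator[None]:
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the duration even if the stage raised.
            self.stages[name] = time.perf_counter() - start

    def log_summary(self) -> str:
        """Log one line with every stage and the total, in milliseconds."""
        total = sum(self.stages.values())
        parts = " ".join(f"{k}={v * 1000:.0f}ms" for k, v in self.stages.items())
        line = f"turn latency: {parts} total={total * 1000:.0f}ms"
        log.info(line)
        return line


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    timer = TurnTimer()
    with timer.stage("stt"):
        time.sleep(0.01)  # placeholder for speech-to-text
    with timer.stage("llm"):
        time.sleep(0.01)  # placeholder for the LLM call
    with timer.stage("tts"):
        time.sleep(0.01)  # placeholder for text-to-speech
    timer.log_summary()
```

Only durations are logged here, not transcripts or audio, which fits the issue's requirement that raw audio is not stored persistently.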

Notes

The solution may require significant changes to the prototype's architecture and dependencies. Ensuring compatibility with the DGX Spark environment and maintaining the localhost-only constraint are crucial considerations.
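One way to guard the localhost-only constraint when swapping engines is to validate the bind address at startup. The helper names below are hypothetical, and the check is deliberately conservative: it accepts only literal loopback IP addresses and rejects hostnames, since name resolution can be reconfigured.

```python
import ipaddress


def is_loopback(host: str) -> bool:
    """Return True only for a literal loopback IP address such as 127.0.0.1 or ::1."""
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not a literal IP address (for example a hostname): treat as non-loopback.
        return False


def require_loopback(host: str) -> str:
    """Return `host` unchanged, or raise if binding to it would expose the service."""
    if not is_loopback(host):
        raise ValueError(f"refusing to bind to non-loopback address: {host!r}")
    return host


if __name__ == "__main__":
    print(require_loopback("127.0.0.1"))
```

Calling `require_loopback` on the configured host before any server socket is opened makes a misconfiguration such as `0.0.0.0` fail fast instead of silently exposing the STT/TTS endpoints.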

Recommendation

Apply workaround: Switch to GPU-supported STT and TTS engines to reduce latency while maintaining the existing prototype's constraints. This approach is likely to achieve the desired reduction in latency without requiring significant changes to the OpenClaw core.
