openclaw - ✅(Solved) Fix Google Meet realtime voice should allow human barge-in during assistant playback [1 pull request, 1 comment, 2 participants]

GitHub stats
openclaw/openclaw#73850 · Fetched 2026-04-29 06:14:14
Comments: 1
Participants: 2
Timeline: 2 (commented ×1, cross-referenced ×1)
Reactions: 0

Fix Action

Fixed

PR fix notes

PR #73834: [codex] Fix Google Meet realtime barge-in playback

Description (problem / solution / changelog)

Summary

  • expose a realtime voice bridge handleBargeIn() hook so transports can interrupt provider output explicitly
  • add an optional Google Meet Chrome bargeInInputCommand side-channel microphone monitor with RMS, peak, and cooldown thresholds
  • suppress shared BlackHole loopback input while assistant audio is playing and hard-clear queued playback on human barge-in
  • add plugin entry/manifest config metadata, generated config baseline coverage, setup-status command validation, and docs for the local-command trust model

Why

The Chrome command-pair Meet bridge can use the same BlackHole device for injected assistant audio and captured meeting audio. During assistant playback, that loopback can mask human interruption and leave queued playback running even after provider-side truncation.

A separate local microphone command lets human speech take priority while loopback input is temporarily suppressed during assistant output.
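
The suppress-and-clear behavior described above can be sketched as a small gate around the playback queue. This is a minimal illustration, not the code from the PR: the class name LoopbackGate and all of its methods are hypothetical, and it treats "assistant is playing" as "the local queue is non-empty", which is a simplification.

```typescript
// Hypothetical sketch: LoopbackGate and its methods are illustrative names,
// not identifiers taken from the PR.
class LoopbackGate {
  private queue: Uint8Array[] = [];

  // Assistant audio arriving from the realtime provider is queued for playback.
  enqueueAssistantAudio(chunk: Uint8Array): void {
    this.queue.push(chunk);
  }

  // Simplifying assumption: playback is "active" while chunks remain queued.
  isAssistantPlaying(): boolean {
    return this.queue.length > 0;
  }

  // The output writer pulls the next chunk to send to the shared device.
  nextPlaybackChunk(): Uint8Array | undefined {
    return this.queue.shift();
  }

  // Shared loopback input is dropped while assistant audio is playing, so the
  // assistant's own voice is not forwarded to the provider as user speech.
  filterLoopbackInput(chunk: Uint8Array): Uint8Array | undefined {
    return this.isAssistantPlaying() ? undefined : chunk;
  }

  // Human barge-in: hard-clear everything still queued locally.
  // Returns the number of chunks that were dropped.
  bargeIn(): number {
    const dropped = this.queue.length;
    this.queue = [];
    return dropped;
  }
}
```

Clearing the local queue matters because provider-side truncation alone does not remove audio that has already been buffered on the machine running the bridge.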

Notes

  • bargeInInputCommand is optional and expects signed 16-bit little-endian mono PCM, for example a SoX CoreAudio capture command.
  • This is wired for the Gateway-hosted Chrome command-pair bridge. External bridge commands can continue to own their own interruption behavior.
  • The barge-in command uses the same operator-configured local command trust model as chrome.audioInputCommand and chrome.audioOutputCommand.
  • Follow-up for issue #73850 and the review notes in https://github.com/openclaw/openclaw/issues/73850#issuecomment-4339707135.
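
As a rough illustration of how RMS, peak, and cooldown thresholds could be applied to signed 16-bit little-endian mono PCM, consider the sketch below. Every name in it (BargeInThresholds, measurePcm16le, BargeInDetector) is hypothetical, levels are normalized to the 0..1 range, and requiring both thresholds to be met is an assumption; the PR may combine them differently.

```typescript
// Hypothetical names and semantics; not taken from the PR.
interface BargeInThresholds {
  rms: number; // minimum RMS level, normalized to 0..1
  peak: number; // minimum peak level, normalized to 0..1
  cooldownMs: number; // minimum gap between two triggers
}

interface Levels {
  rms: number;
  peak: number;
}

// Measure one chunk of signed 16-bit little-endian mono PCM.
function measurePcm16le(chunk: Uint8Array): Levels {
  const view = new DataView(chunk.buffer, chunk.byteOffset, chunk.byteLength);
  const sampleCount = Math.floor(chunk.byteLength / 2);
  if (sampleCount === 0) return { rms: 0, peak: 0 };
  let sumSquares = 0;
  let peak = 0;
  for (let i = 0; i < sampleCount; i++) {
    // The second argument `true` selects little-endian byte order.
    const sample = view.getInt16(i * 2, true) / 32768;
    sumSquares += sample * sample;
    peak = Math.max(peak, Math.abs(sample));
  }
  return { rms: Math.sqrt(sumSquares / sampleCount), peak };
}

class BargeInDetector {
  private readonly thresholds: BargeInThresholds;
  private lastTriggerAt = Number.NEGATIVE_INFINITY;

  constructor(thresholds: BargeInThresholds) {
    this.thresholds = thresholds;
  }

  // Returns true when this chunk should count as a human interruption.
  // Assumption: both the RMS and the peak threshold must be met.
  shouldTrigger(chunk: Uint8Array, assistantPlaying: boolean, nowMs: number): boolean {
    if (!assistantPlaying) return false;
    if (nowMs - this.lastTriggerAt < this.thresholds.cooldownMs) return false;
    const { rms, peak } = measurePcm16le(chunk);
    if (rms < this.thresholds.rms || peak < this.thresholds.peak) return false;
    this.lastTriggerAt = nowMs;
    return true;
  }
}
```

The cooldown keeps a single burst of speech from firing the interruption path repeatedly, and passing the clock in as nowMs keeps the detector easy to unit test.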

Testing

  • pnpm docs:list
  • pnpm test extensions/google-meet/index.test.ts
  • pnpm test extensions/openai/realtime-voice-provider.test.ts
  • pnpm build:plugin-sdk:dts
  • pnpm tsgo:core
  • pnpm tsgo:extensions
  • pnpm tsgo:test:extensions
  • pnpm format:docs:check
  • pnpm exec oxfmt --check --threads=1 extensions/google-meet/index.ts extensions/google-meet/src/runtime.ts extensions/google-meet/index.test.ts docs/plugins/google-meet.md
  • pnpm plugin-sdk:api:check
  • pnpm config:docs:check
  • git diff --check

No live Google Meet run was included in this PR validation; the behavior is covered by command-pair unit tests.

Changed files

  • docs/.generated/config-baseline.sha256 (modified, +2/-2)
  • docs/plugins/google-meet.md (modified, +38/-0)
  • docs/plugins/sdk-provider-plugins.md (modified, +5/-0)
  • extensions/google-meet/index.test.ts (modified, +202/-2)
  • extensions/google-meet/index.ts (modified, +20/-0)
  • extensions/google-meet/openclaw.plugin.json (modified, +38/-0)
  • extensions/google-meet/src/config.ts (modified, +23/-0)
  • extensions/google-meet/src/realtime.ts (modified, +134/-3)
  • extensions/google-meet/src/runtime.ts (modified, +5/-1)
  • extensions/google-meet/src/transports/types.ts (modified, +2/-0)
  • extensions/openai/realtime-voice-provider.ts (modified, +1/-1)
  • src/realtime-voice/provider-types.ts (modified, +1/-0)
  • src/realtime-voice/session-runtime.ts (modified, +2/-0)

Problem

In Google Meet realtime voice sessions, the assistant can be hard to interrupt while it is speaking. In practice, the human speaker may start talking but the OpenClaw Agent keeps playing queued audio, so the conversation becomes assistant-led instead of human-led.

This is especially visible with the Chrome command-pair path when local audio routing uses a shared virtual device such as BlackHole for both injected assistant audio and captured meeting audio. Assistant playback can leak back into the realtime input path, and provider-side interruption/truncation does not always clear already-buffered local playback quickly enough.

Goal

The desired behavior is human-first voice interaction:

  • when the human starts speaking during assistant playback, assistant audio should stop quickly
  • assistant loopback audio should not be treated as user speech
  • the realtime provider should still receive real meeting/user audio once playback is no longer active
  • the fix should preserve existing Chrome command-pair and external bridge behavior

Proposed direction

One possible approach is to make the Google Meet Chrome command-pair bridge explicitly support barge-in:

  • add an optional side-channel local microphone command, separate from the shared BlackHole command pair, for detecting human interruption
  • temporarily suppress shared loopback input while assistant audio is being played
  • when human speech crosses configurable RMS/peak thresholds during assistant playback, call a realtime voice bridge interruption hook and clear the local output buffer
  • keep the side-channel command optional so existing setups are unchanged

This likely needs a small additive realtime voice provider hook, for example handleBargeIn(), so transports can ask providers that support interruption/truncation to stop active output. Providers that do not implement it can remain compatible.
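
One way to keep such a hook additive is to declare it as an optional method and check for it at the call site. The sketch below assumes a no-argument handleBargeIn() that returns nothing; the interface name RealtimeVoiceBridge, the sendAudio member, and the requestBargeIn helper are hypothetical stand-ins rather than the project's actual types.

```typescript
// Hypothetical shapes for illustration; the real provider types may differ.
interface RealtimeVoiceBridge {
  sendAudio(chunk: Uint8Array): void;
  // Optional: providers without interruption support simply omit it.
  handleBargeIn?(): void;
}

// Called by a transport when it detects human speech during playback.
// Local playback is always cleared; the provider is asked to stop only if
// it implements the hook. Returns whether the provider was notified.
function requestBargeIn(
  bridge: RealtimeVoiceBridge,
  clearLocalPlayback: () => void,
): boolean {
  clearLocalPlayback();
  if (typeof bridge.handleBargeIn !== "function") return false;
  bridge.handleBargeIn();
  return true;
}
```

Because the method is optional, existing providers compile and run unchanged, which matches the compatibility goal stated above.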

Question for maintainers

Would this be an acceptable direction for a PR?

I have a draft prototype PR for reference at https://github.com/openclaw/openclaw/pull/73834, but I am opening this issue first to confirm whether the architecture is acceptable before making it review-ready or reshaping it.

extent analysis

TL;DR

Implementing a side-channel local microphone command and a handleBargeIn() hook in the realtime voice provider can help achieve human-first voice interaction by interrupting assistant audio when human speech is detected.

Guidance

  • Consider adding an optional side-channel local microphone command to detect human interruption during assistant playback.
  • Temporarily suppress shared loopback input while assistant audio is being played to prevent audio leakage.
  • Implement a handleBargeIn() hook in the realtime voice provider to stop active output when human speech crosses configurable thresholds.
  • Ensure the solution preserves existing Chrome command-pair and external bridge behavior.

Example

No code snippet is provided as the issue does not contain explicit code references.

Notes

The proposed direction seems to address the issue, but its feasibility and compatibility with existing setups need to be verified. The draft prototype PR can serve as a starting point for further discussion and refinement.

Recommendation

Apply workaround: Implement the proposed side-channel microphone command and handleBargeIn() hook so that human speech interrupts assistant playback. PR #73834 implements this direction, though it was validated with command-pair unit tests rather than a live Google Meet run.
