openclaw - 💡(How to fix) Fix [Bug]: Google Live browser Talk stuck at "Connecting Talk…" and failed sessions wedge embedded runner [2 comments, 2 participants]

openclaw/openclaw#73601 · Fetched 2026-04-29 06:17:40

Severity: High. Failed Talk attempts wedge the embedded runner, so subsequent text chats fail to dispatch until a full gateway restart.
Affects: Anyone configuring talk.provider: "google" on 2026.4.26 Control UI Talk
OpenClaw version: 2026.4.26 (be8c246)
Source file (per #73460): ui/src/ui/chat/realtime-talk-google-live.ts
Reported by: Scott Rolen (Tess), Source One Staffing instance, 2026-04-28

Related issues already filed (cross-reference)

  • #73427 "Control UI Realtime Talk: CSP blocks OpenAI WebRTC and UI hides realtime audio errors" (filed 2026-04-28). Different transport (OpenAI WebRTC, not Google Live), but the second half of that issue, "the UI provides no actionable failure state", is the same anti-pattern surfacing in Google Live: the connection is established, frames flow, the UI shows no error, and there is no transition out of "Connecting Talk…".
  • #73460 "Google Live browser Talk does not stop queued audio after interruption" (filed 2026-04-28). Same source file (realtime-talk-google-live.ts), different bug (interruption handling). It confirms that AudioBufferSourceNode playback is implemented in this file (lines 218-223 per that issue), so the playback path exists in code; the problem is gating on a state transition that never fires.
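
The "no actionable failure state" pattern noted in both issues could be addressed with a connect deadline. The following is a minimal sketch, not the Control UI's actual code: the ConnectStatus type, the checkConnectDeadline function, and the timeout value are all hypothetical names introduced for illustration.

```typescript
// Hypothetical status model; the real UI state shape is not known from this issue.
type ConnectStatus =
  | { phase: "connecting"; startedAtMs: number }
  | { phase: "live" }
  | { phase: "failed"; reason: string };

// Pure check, intended to be called from a timer: if the session has been
// "connecting" for at least timeoutMs, return a visible failure state
// instead of staying in "connecting" indefinitely.
function checkConnectDeadline(
  status: ConnectStatus,
  nowMs: number,
  timeoutMs: number,
): ConnectStatus {
  if (status.phase !== "connecting") return status;
  if (nowMs - status.startedAtMs < timeoutMs) return status;
  return {
    phase: "failed",
    reason: `Talk did not become live within ${timeoutMs} ms`,
  };
}
```

Keeping the check pure (time is passed in) makes it easy to unit-test without a browser or a real WebSocket.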

Headline symptom

After clicking the waves Talk button:

  1. talk.realtime.session returns 200 OK in ~300 ms (constrained-token mint succeeds)
  2. Browser opens wss://generativelanguage.googleapis.com/ws/google.ai.generativelanguage.v1alpha.GenerativeService.BidiGenerateContentConstrained?access_token=auth_tokens/... — HTTP 101 Switching Protocols (handshake good)
  3. Microphone PCM frames stream outbound correctly: {"realtimeInput":{"audio":{"data":"<base64-PCM>"}}} at ~10 KB/frame
  4. Google Live API responds with inbound binary audio frames (red ↓ Binary Message entries in DevTools Messages tab) — model IS generating audio
  5. Composer status row stays at Connecting Talk... forever — never transitions to Talk live
  6. No audible playback. Browser tab does not display Chrome's "this tab is producing audio" speaker icon
  7. No console errors related to decodeAudioData, AudioContext, AudioWorklet, play(), or anything realtime-Talk specific. The only console noise is the unrelated ScriptProcessorNode is deprecated warning from startMicrophonePump (mic capture path)
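
For reference when inspecting the WebSocket Messages tab, the outbound envelope in step 3 can be reproduced with a few lines. This is a sketch under assumptions: the helper names bytesToBase64 and encodeRealtimeInput are hypothetical, and only the fields visible in the captured frame are included (the real client may send more).

```typescript
// Base64-encode raw bytes. btoa is available in browsers and in recent Node.
function bytesToBase64(bytes: Uint8Array): string {
  let binary = "";
  bytes.forEach((b) => {
    binary += String.fromCharCode(b);
  });
  return btoa(binary);
}

// Build the envelope observed in DevTools:
// {"realtimeInput":{"audio":{"data":"<base64-PCM>"}}}
function encodeRealtimeInput(pcm: Uint8Array): string {
  return JSON.stringify({
    realtimeInput: { audio: { data: bytesToBase64(pcm) } },
  });
}
```

Comparing a locally built envelope against a captured frame is a quick way to confirm the outbound path is not the problem.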

Critical secondary symptom: embedded runner wedge

After a failed Talk session, subsequent text chat dispatches stop processing until the gateway is restarted:

  • chat.send returns 200 in ~50 ms (dispatch acked, runId allocated)
  • No agent-run-start, no harness send, no model-reply markers ever follow
  • The dispatch is queued but the embedded runner does not dequeue
  • Affects all sessions on the gateway, not just the Talk session
  • Resolution: launchctl kickstart -k "gui/$(id -u)/ai.openclaw.gateway" clears it

This makes Talk failures destructive to the whole chat experience, not just an isolated Talk feature failure.
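
The wedge described above is consistent with a resource that is acquired when Talk starts and released only on a success path. The sketch below illustrates the general remedy, releasing in a finally block. It is an assumption-laden illustration: RunnerSlot and runTalkSession are hypothetical stand-ins, and the actual embedded-runner internals are not shown in this issue.

```typescript
// Hypothetical stand-in for whatever the embedded runner serializes on.
class RunnerSlot {
  private holder: string | null = null;

  acquire(owner: string): boolean {
    if (this.holder !== null) return false;
    this.holder = owner;
    return true;
  }

  // Only the current holder can release; releasing twice is harmless.
  release(owner: string): void {
    if (this.holder === owner) this.holder = null;
  }

  isBusy(): boolean {
    return this.holder !== null;
  }
}

// Runs a session body and always releases the slot, whether the body
// completes, throws, or the session never reaches a "live" state.
function runTalkSession(slot: RunnerSlot, sessionId: string, body: () => void): boolean {
  if (!slot.acquire(sessionId)) return false;
  try {
    body();
    return true;
  } catch {
    return false;
  } finally {
    slot.release(sessionId);
  }
}
```

With this shape, a Talk session that fails before going live cannot leave the slot held, so later text dispatches are not blocked behind it.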

Browser audio stack confirmed healthy (rules out client-side cause)

  • new AudioContext() → state: running after await ctx.resume()
  • 440 Hz oscillator → AudioBuffer → ctx.destination → plays audibly
  • Other Chrome tabs play YouTube/etc. while Talk is failing in Control UI tab
  • macOS output device, OS volume, Chrome tab mute states all verified correct
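
A test-tone check along the lines described above might look like the following. This is a sketch rather than the exact one-liner used by the reporter: sineSamples and playTestTone are hypothetical names, and the tone is rendered into an AudioBuffer directly instead of through an oscillator node. Outside a browser, playTestTone returns "unavailable".

```typescript
// Generate a quiet sine wave as mono float samples in [-0.2, 0.2].
function sineSamples(freqHz: number, sampleRate: number, seconds: number): Float32Array {
  const n = Math.floor(sampleRate * seconds);
  const out = new Float32Array(n);
  for (let i = 0; i < n; i++) {
    out[i] = 0.2 * Math.sin((2 * Math.PI * freqHz * i) / sampleRate);
  }
  return out;
}

// Play a 440 Hz tone through the default output and report the context state.
async function playTestTone(): Promise<string> {
  const Ctx = (globalThis as any).AudioContext;
  if (!Ctx) return "unavailable"; // not running in a browser
  const ctx = new Ctx();
  await ctx.resume();
  const samples = sineSamples(440, ctx.sampleRate, 0.5);
  const buffer = ctx.createBuffer(1, samples.length, ctx.sampleRate);
  buffer.getChannelData(0).set(samples);
  const source = ctx.createBufferSource();
  source.buffer = buffer;
  source.connect(ctx.destination);
  source.start();
  return ctx.state;
}
```

If the tone is audible from the Control UI tab, the browser audio stack is working and the fault lies upstream of playback.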

Tests that did NOT fix it

  • API key swap: replaced the shared GCP key with a fresh AI Studio key. Inbound binary audio frames started arriving (confirming Live API access), but the UI was still stuck at "Connecting Talk…", still no playback.
  • Model name swap: gemini-2.5-flash-native-audio-preview-12-2025 (the OpenClaw default in docs/providers/google.md) → gemini-3.1-flash-live-preview (matches the model name in Google's own Live API quickstart), followed by hot reload, gateway restart, and hard reload. Still stuck at "Connecting Talk…", still no playback.
  • Multiple gateway restarts + hard browser reloads. No change to symptom.

Repro

  1. Install OpenClaw 2026.4.26 (npm package: npm install -g openclaw@2026.4.26; the brew cask alone is the GUI app, not the gateway).
  2. Configure:
    "models": { "providers": { "google": { "apiKey": "<AI Studio key with Live API access>" } } },
    "plugins": { "entries": { "google": { "enabled": true } } },
    "talk": {
      "provider": "google",
      "providers": {
        "google": {
          "model": "gemini-3.1-flash-live-preview",
          "voice": "Kore"
        }
      },
      "interruptOnSpeech": true,
      "silenceTimeoutMs": 900
    }
  3. Open Control UI at http://127.0.0.1:18789/chat?session=... (loopback, secure context).
  4. Hard reload to ensure 2026.4.26 bundle (index-DV4QBOU4.js or successor).
  5. Click waves Talk button.
  6. Speak. Observe in DevTools Network → click WebSocket row → Messages tab.
  7. Expected: UI transitions to "Talk live" within seconds; assistant audio plays back through speakers.
  8. Actual: UI stays at "Connecting Talk…"; outbound realtimeInput audio frames present, inbound Binary Message frames present, no audible reply.
  9. Click Stop Talk.
  10. Type a text message in the composer, hit Enter.
  11. Expected: normal text-mode reply.
  12. Actual: chat.send is acked but no agent run fires. Text chat is now wedged; only fix is launchctl kickstart -k "gui/$(id -u)/ai.openclaw.gateway".

Suspected cause

Two probable bugs in ui/src/ui/chat/realtime-talk-google-live.ts:

  1. State machine never advances out of "Connecting Talk…". The bundle is presumably waiting for some specific Google Live setup-complete signal (e.g., a JSON message with setupComplete) but Google Live's first inbound message is binary audio. The state-transition gate may be looking for a JSON envelope that never arrives, or parsing the binary frame as if it were JSON and silently failing. The audio playback path probably sits behind that "live" gate, which is why no frames are decoded even though they arrive.

  2. Failed Talk session leaves a server-side lock or session ref open in the embedded runner. When subsequent text chats try to acquire whatever resource the Talk session was holding, they queue but never dispatch.
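
One way to test the first hypothesis is to make the message handler tolerant of both possibilities: a binary WebSocket frame may contain raw audio, or it may contain UTF-8 JSON delivered as binary. The sketch below is an illustration under that assumption; classifyFrame and the InboundFrame type are hypothetical names, not part of the existing file.

```typescript
type InboundFrame =
  | { kind: "json"; value: unknown }
  | { kind: "binary"; bytes: Uint8Array };

// Classify an inbound WebSocket payload. Text frames are parsed as JSON.
// Binary frames are first tried as UTF-8 JSON; only if that fails are they
// treated as opaque bytes (for example raw PCM).
function classifyFrame(data: string | ArrayBuffer | Uint8Array): InboundFrame {
  if (typeof data === "string") {
    try {
      return { kind: "json", value: JSON.parse(data) };
    } catch {
      return { kind: "binary", bytes: new TextEncoder().encode(data) };
    }
  }
  const bytes = data instanceof Uint8Array ? data : new Uint8Array(data);
  try {
    // fatal: true makes invalid UTF-8 throw instead of being replaced.
    const text = new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return { kind: "json", value: JSON.parse(text) };
  } catch {
    return { kind: "binary", bytes };
  }
}
```

Logging the result of such a classifier for the first few inbound frames would show whether the "Binary Message" entries in DevTools are audio or JSON envelopes that the current handler never parses.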

Suggested investigation

  • In ui/src/ui/chat/realtime-talk-google-live.ts, audit the message handler for binary frames vs. JSON frames. What signal triggers the Connecting Talk… → Talk live transition? Is it tied to a JSON serverContent.modelTurn arriving, or to the first audio buffer scheduled? With current Google Live, the very first inbound message after the WS upgrade is often binary PCM, not a JSON setup ack.
  • Compare with the gateway-relay variant (ui/src/ui/chat/realtime-talk-gateway-relay.ts, referenced in #73460) — does it have the same gating logic, and if so does it work? If yes, the difference between the two is the fix.
  • Audit cleanup paths when a Talk session is stopped without ever reaching "Talk live". The embedded-runner wedge is consistent with a session-store entry created on Talk start that is never released because the "stop" path expects a "live" state to tear down.
  • The realtime-talk-live-smoke.ts smoke test referenced at docs/providers/google.md:361: does it actually exercise the Connecting Talk… → Talk live UI transition, or only the network-level token mint and WS upgrade? Fake-microphone smoke tests typically don't catch UI state-machine bugs.
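
The first and third points above can be expressed as a small state machine that is easy to unit-test without a browser. This is a sketch, assuming the session can be modeled with four states; the TalkState and TalkEvent names and the nextTalkState function are hypothetical and do not reflect the existing implementation.

```typescript
type TalkState = "idle" | "connecting" | "live" | "stopped";

type TalkEvent =
  | { type: "start" }
  | { type: "setupComplete" } // JSON setup acknowledgement, if one arrives
  | { type: "audioFrame" } // first inbound audio, whatever its framing
  | { type: "stop" };

// Pure transition function. Two properties matter for this bug:
//  1. "connecting" advances to "live" on either a setup ack or audio.
//  2. "stop" is accepted from every active state, including "connecting",
//     so teardown never depends on having reached "live".
function nextTalkState(state: TalkState, event: TalkEvent): TalkState {
  switch (event.type) {
    case "start":
      return state === "idle" || state === "stopped" ? "connecting" : state;
    case "setupComplete":
    case "audioFrame":
      return state === "connecting" ? "live" : state;
    case "stop":
      return state === "idle" ? "idle" : "stopped";
  }
}
```

A test that feeds an audioFrame event before any setupComplete, and another that stops a session while it is still connecting, would cover the two failure modes reported here.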

OpenClaw doc note (separate from this issue, worth flagging in PR)

docs/providers/google.md:305 documents the default Live model as gemini-2.5-flash-native-audio-preview-12-2025. Google's own current Live API quickstart uses gemini-3.1-flash-live-preview. The OpenClaw default appears to be a frozen older preview model that may have already been deprecated/silently retired by Google, even if it still accepts WebSocket handshakes. Recommend tracking Google's currently-published default rather than pinning to a date-suffixed variant.

Workaround

None clean. Switching to OpenAI Realtime via talk.provider: "openai" requires pay-per-token billing (rejected by this user). Removing the talk block from the config removes the broken Talk button from the composer and restores normal behavior, at the cost of the feature.

Materials gathered

  • DevTools Network screenshot: wss handshake (101 Switching Protocols) to generativelanguage.googleapis.com/.../BidiGenerateContentConstrained
  • DevTools Network → Messages tab screenshot: outbound realtimeInput frames (10,997 bytes each, ~10 Hz) and inbound Binary Message entries (silent — never plays)
  • Console capture confirming no audio-related errors
  • Console one-liner test confirming AudioContext test tone plays on the same page (rules out browser-level audio block)
  • Gateway log evidence of wedge: chat.send ack with runId but no follow-up agent-run-start or harness-latency markers
  • API key tested independently — Live API confirmed enabled (inbound binary frames present); ruled out as cause

extent analysis

TL;DR

The most likely fix involves modifying the state machine in ui/src/ui/chat/realtime-talk-google-live.ts to correctly handle the transition from "Connecting Talk…" to "Talk live" when receiving binary audio frames from the Google Live API.

Guidance

  • Investigate the message handler in ui/src/ui/chat/realtime-talk-google-live.ts to determine what signal triggers the state transition and ensure it correctly handles binary frames.
  • Compare the gating logic with the gateway-relay variant (ui/src/ui/chat/realtime-talk-gateway-relay.ts) to identify potential differences.
  • Audit the cleanup paths when a Talk session is stopped without reaching "Talk live" to prevent embedded-runner wedging.
  • Verify that the realtime-talk-live-smoke.ts smoke test exercises the UI state-machine transition correctly.

Example

No code example is provided due to the complexity of the issue and the need for a thorough investigation of the codebase.

Notes

The issue is specific to the Google Live API integration and the state machine implementation in ui/src/ui/chat/realtime-talk-google-live.ts. The fix may require changes to the message handler, state transition logic, or cleanup paths.

Recommendation

Until a proper fix is implemented, remove the talk block from the configuration. This removes the broken Talk button from the composer and restores normal text chat, at the cost of the feature.
