openclaw voice-call: OpenAI realtime provider is broken against GA Realtime API (uses deprecated beta protocol)


Summary

The bundled OpenAI realtime voice provider in openclaw@2026.5.7 (dist/realtime-voice-provider-*.js) is written against the preview/beta Realtime API and is broken against the GA Realtime API (which is the only version OpenAI now accepts).

End result: every inbound/outbound call through @openclaw/voice-call with realtime.provider = openai answers, attempts to bridge to OpenAI, fails immediately, and Twilio reports error 31921 (Stream — WebSocket Broken Pipe).

Environment

  • openclaw 2026.5.7
  • @openclaw/voice-call 2026.5.4
  • macOS 15.x arm64, Node v25.9
  • Provider: twilio
  • Realtime provider: openai (real sk-proj-... key, billed account)

What I observed (in order, by patching the dist file)

Each fix below uncovered the next failure. Errors come from gateway.err.log.

1. OpenAI-Beta: realtime=v1 header

[voice-call] realtime voice error: Unknown beta requested: 'realtime'.
[voice-call] Failed to connect realtime bridge: Error: Unknown beta requested: 'realtime'.
    at WebSocket.<anonymous> (file:///.../dist/realtime-voice-provider-*.js:252:74)

Source: GA migration explicitly removes this header.

"Don't include the OpenAI-Beta: header in any of the requests." — https://learn.microsoft.com/en-us/azure/foundry/openai/how-to/realtime-audio-preview-api-migration-guide

Affected lines in realtime-voice-provider-Bs3Q4Qlt.js:

defaultHeaders: {
  Authorization: `Bearer ${cfg.apiKey}`,
  "OpenAI-Beta": "realtime=v1"   // ← rejected by the GA API ("Unknown beta requested")
}
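
A minimal sketch of the header change, assuming the provider builds its handshake headers in one place. buildRealtimeHeaders is a hypothetical helper name, not a function from the bundled file; the only header it returns is Authorization.

```javascript
// Hypothetical helper (not from the bundled file): builds the WebSocket
// handshake headers for the GA API. The beta-only "OpenAI-Beta" header is
// deliberately left out.
function buildRealtimeHeaders(apiKey) {
  if (typeof apiKey !== "string" || apiKey.length === 0) {
    throw new Error("OpenAI API key is required");
  }
  return { Authorization: `Bearer ${apiKey}` };
}

const headers = buildRealtimeHeaders("sk-example");
console.log(Object.keys(headers));
```

Keeping the header construction in one function makes it easy to assert in a unit test that the beta header never comes back.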

2. Default model gpt-realtime-1.5

const OPENAI_REALTIME_DEFAULT_MODEL = "gpt-realtime-1.5";

Per current OpenAI docs (https://developers.openai.com/api/docs/guides/realtime-websocket) the GA model is gpt-realtime (alias) / gpt-realtime-2 (pinned). gpt-realtime-1.5 doesn't appear in the docs anymore.
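
A small sketch of the model change, assuming the provider connects to the standard wss://api.openai.com/v1/realtime endpoint with the model passed as a query parameter. buildRealtimeUrl is a hypothetical helper; the default below is the alias named above, and a pinned model can still be supplied through configuration.

```javascript
// Default switched to the GA alias named in the report.
const OPENAI_REALTIME_DEFAULT_MODEL = "gpt-realtime";

// Hypothetical helper: resolves the model and builds the WebSocket URL.
// ASSUMPTION: the endpoint and the "model" query parameter are the standard
// ones; confirm against the current API reference.
function buildRealtimeUrl(configuredModel) {
  const model = configuredModel || OPENAI_REALTIME_DEFAULT_MODEL;
  const url = new URL("wss://api.openai.com/v1/realtime");
  url.searchParams.set("model", model);
  return url.toString();
}

console.log(buildRealtimeUrl());
```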

3. Missing required session.type

After fixing 1+2:

[voice-call] realtime voice error: Missing required parameter: 'session.type'.

GA session.update requires session.type: "realtime" (or "transcription"). The provider's sendSessionUpdate() omits it.

4. Removed session.modalities

After fixing 3:

[voice-call] realtime voice error: Unknown parameter: 'session.modalities'.

modalities: ["text", "audio"] is no longer accepted at the session level in GA.
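
Steps 3 and 4 together could be sketched as follows. toGaSessionUpdate is a hypothetical adapter, not the provider's sendSessionUpdate(); it only adds session.type and drops session.modalities, and leaves every other field untouched for a later audit.

```javascript
// Hypothetical adapter: turns a beta-style session object into a GA-style
// session.update event by applying only the two changes from steps 3 and 4.
function toGaSessionUpdate(betaSession) {
  // Drop the field the GA API rejected; keep everything else as-is.
  const { modalities, ...rest } = betaSession;
  return {
    type: "session.update",
    // "realtime" is written last so it always wins over a stale value.
    session: { ...rest, type: "realtime" },
  };
}

const updateEvent = toGaSessionUpdate({
  modalities: ["text", "audio"],
  instructions: "You are a helpful phone assistant.",
});
console.log(JSON.stringify(updateEvent));
```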

5. (Probable — didn't reach this point) Event name changes

Per the migration guide, even once the session is established, audio won't flow because the bridge handler listens for old event names:

  • response.text.delta → response.output_text.delta
  • response.audio.delta → response.output_audio.delta
  • response.audio_transcript.delta → response.output_audio_transcript.delta

And conversation item content types:

  • type=text → type=output_text
  • type=audio → type=output_audio

The current handleEvent() in OpenAIRealtimeVoiceBridge matches on the old names.
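
The renames listed in this step can be expressed as lookup tables plus a handler that matches on the new names. This is a sketch only: the function below is an illustrative stand-in, not the real OpenAIRealtimeVoiceBridge.handleEvent(), and the sinks callbacks and the delta field are assumptions.

```javascript
// Beta event name -> GA event name, as listed in this step.
const GA_EVENT_NAMES = new Map([
  ["response.text.delta", "response.output_text.delta"],
  ["response.audio.delta", "response.output_audio.delta"],
  ["response.audio_transcript.delta", "response.output_audio_transcript.delta"],
]);

// Beta content type -> GA content type for conversation items.
const GA_CONTENT_TYPES = new Map([
  ["text", "output_text"],
  ["audio", "output_audio"],
]);

// Returns the GA spelling of an event name; unknown names pass through.
function toGaEventName(name) {
  return GA_EVENT_NAMES.get(name) ?? name;
}

// Illustrative handler matching on GA names. ASSUMPTION: the payload is
// carried in event.delta and the caller supplies onAudio / onTranscript.
function handleEvent(event, sinks) {
  switch (event.type) {
    case "response.output_audio.delta":
      sinks.onAudio(event.delta);
      break;
    case "response.output_audio_transcript.delta":
      sinks.onTranscript(event.delta);
      break;
    default:
      break; // events this sketch does not model are ignored
  }
}
```

A Map is used instead of a plain object so that names such as "constructor" cannot accidentally resolve to inherited properties.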

6. (Probable) Session field locations

The migration doc says "some properties are now in different locations" without enumerating them. turn_detection, input_audio_format, output_audio_format, temperature, input_audio_transcription may need restructuring too.
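
One way to structure that audit is a helper that moves the flat audio fields into a nested layout and reports anything it cannot place. Everything about the destination is an assumption here: the audio.input / audio.output paths, the format objects, and the name restructureSession are unverified guesses, not facts from the migration doc, and should be checked against the current API reference before use.

```javascript
// ASSUMPTION: every destination path and format object below is a guess at
// the GA layout and must be checked against the current API reference.
const ASSUMED_FORMATS = new Map([
  ["pcm16", { type: "audio/pcm", rate: 24000 }],
  ["g711_ulaw", { type: "audio/pcmu" }],
  ["g711_alaw", { type: "audio/pcma" }],
]);

// Hypothetical helper: moves the flat beta audio fields under audio.input /
// audio.output and lists the fields it could not place.
function restructureSession(beta) {
  const {
    turn_detection,
    input_audio_format,
    output_audio_format,
    input_audio_transcription,
    temperature,
    ...rest
  } = beta;

  const input = {};
  const output = {};
  const unmapped = [];

  if (turn_detection !== undefined) input.turn_detection = turn_detection;
  if (input_audio_transcription !== undefined) {
    input.transcription = input_audio_transcription;
  }

  // Translate a beta format name, or flag it when no mapping is assumed.
  const placeFormat = (name, target, field) => {
    if (name === undefined) return;
    const format = ASSUMED_FORMATS.get(name);
    if (format) target.format = { ...format };
    else unmapped.push(field);
  };
  placeFormat(input_audio_format, input, "input_audio_format");
  placeFormat(output_audio_format, output, "output_audio_format");

  // No destination is assumed for temperature; flag it for a manual audit.
  if (temperature !== undefined) unmapped.push("temperature");

  return { session: { ...rest, audio: { input, output } }, unmapped };
}
```

Returning the unmapped list instead of silently dropping fields keeps the audit visible: anything in it still needs a decision.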

Repro

  1. Configure plugins.entries.voice-call with provider: twilio, realtime.enabled: true, realtime.provider: openai, and a valid OpenAI key with realtime access.
  2. Place a real Twilio number's Voice URL behind any public tunnel (we tried both Tailscale Funnel and Cloudflare quick tunnel, with the same result).
  3. Place an inbound call. The plugin serves realtime TwiML, opens the WS from Twilio, attempts the OpenAI WS bridge, and fails as described above. Twilio drops the call within 2-6 seconds with notification code 31921.

Setup-doctor blocker (separate, smaller bug)

openclaw voicecall setup rejects the config if both streaming.enabled and realtime.enabled are true, but the bundled OpenClaw config has both set to true by default after running through the typical onboarding flow. The error message ("cannot both be true") is good, but the default config shouldn't land in this state. Recommend defaulting streaming.enabled: false (or auto-disabling it when realtime.enabled is set).
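
The auto-disable option might look like the sketch below. The config shape ({ streaming: { enabled }, realtime: { enabled } }) is inferred from the key names in this report rather than from the plugin's schema, and normalizeVoiceCallConfig is a hypothetical name.

```javascript
// Hypothetical normalizer: when both modes are enabled, realtime wins and
// streaming is switched off, so setup never sees the conflicting state.
// ASSUMPTION: config shape inferred from the key names in the report.
function normalizeVoiceCallConfig(config) {
  const streaming = { ...config.streaming };
  const realtime = { ...config.realtime };
  if (realtime.enabled && streaming.enabled) {
    streaming.enabled = false;
  }
  // Returns a new object; the caller's config is not mutated.
  return { ...config, streaming, realtime };
}

const normalized = normalizeVoiceCallConfig({
  provider: "twilio",
  streaming: { enabled: true },
  realtime: { enabled: true, provider: "openai" },
});
console.log(normalized.streaming.enabled, normalized.realtime.enabled);
```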

Suggested fix

Rewrite the OpenAI realtime voice provider in realtime-voice-provider-*.js against the GA WebSocket protocol:

  • Remove OpenAI-Beta header
  • Update default model to gpt-realtime
  • Restructure session.update payload (add type, remove modalities, audit other fields against current API reference)
  • Update event-name matching in handleEvent()
  • Update content type matching for assistant message items

Reference implementation: https://developers.openai.com/api/docs/guides/realtime-websocket

Workaround I'm using

None — voice calling is unusable for me right now. I've reverted my local patches and will wait for an upstream fix.

Happy to test a PR / pre-release if useful.
