openclaw - ✅(Solved) Fix [Bug]: Groq Whisper STT completely broken since 2026.3.31 — "deprecate legacy provider compat subpaths" broke audio provider resolution [3 pull requests, 1 participant]

GitHub stats
openclaw/openclaw#59875 · Fetched 2026-04-08 02:39:27
Comments: 0 · Participants: 1 · Timeline: 12 · Reactions: 0
Timeline (top): referenced ×5, cross-referenced ×3, labeled ×2, closed ×1

Groq Whisper audio transcription fails silently on 2026.3.31+ — provider resolution no longer finds Groq for the audio capability, confirmed working on 2026.3.14 with identical config and API key, verified via Groq dashboard showing zero requests.

Error Message

  1. Error log from 2026.4.2 instance:

audio understanding failed: Error: Media provider not available: groq

  2. Log trace from 2026.4.2-beta.1 instance (no error, pipeline bypassed):

Audio arrives at inbound monitor:

{"from":"<group-jid>","body":"<media:audio>","mediaPath":"...ogg","mediaType":"audio/ogg; codecs=opus"} → "inbound message"

Immediately forwarded to session with zero processing:

{"body":"[WhatsApp ...] <sender>: <media:audio>","mediaType":"audio/ogg; codecs=opus"} → "inbound web message"

No STT/Whisper/Groq/transcription entries appear between these two log lines. The audio pipeline is entirely skipped.

  3. Groq Console → Usage dashboard: Zero API requests from the 2026.4.x instance in 24+ hours. Confirms the request never leaves the machine.

  4. 2026.3.14 instance comparison: Same voice note, same API key — transcript delivered successfully. Groq Console shows the API request.

  5. Groq API key independently verified:

$ curl https://api.groq.com/openai/v1/models -H "Authorization: Bearer $GROQ_API_KEY"

Returns model list including whisper-large-v3-turbo ✅

  6. Related issue: #7573 — similar symptoms reported against 2026.2.1 (explicit audio models ignored, Groq Console shows zero requests).

[Screenshot: Groq Console - 24h zero usage]

[Screenshot: Groq Console - 7 days usage]



Fix Action

Fix / Workaround

Consequence: Voice notes are silently dropped as unprocessable input. On 2026.4.2-beta.1, no error is logged — users may not realise their voice messages aren't being understood. The only partial mitigation is auto-detect fallback to free OpenAI Whisper, which is intermittent and unreliable. Paid Groq API key sits unused.

Temporary workaround: Remove explicit Groq config and rely on auto-detect, which occasionally falls through to the built-in free OpenAI Whisper fallback. This is intermittent and unreliable but provides partial STT coverage.

PR fix notes

PR #59926: fix(media): always resolve bundled capability providers via compat config when cfg is provided

Description (problem / solution / changelog)

Summary

  • Problem: Groq (and potentially other bundled audio providers like Deepgram) are silently unavailable even when correctly configured via tools.media.audio.models, producing "Media provider not available: groq" on every transcription attempt.
  • Why it matters: Groq Whisper STT is completely broken for any user whose active gateway registry already contains other media understanding providers (e.g. OpenAI for image understanding). The early-return in resolvePluginCapabilityProviders caused the bundled-compat load path to be skipped entirely.
  • Root cause: resolvePluginCapabilityProviders checked activeProviders.length > 0 and returned immediately — but activeProviders is the full active registry across all capability types, not just audio. So if OpenAI was loaded for image, groq was never loaded for audio.
  • What changed: When a caller config is provided, always resolve via the compat-config path (which injects bundled providers like groq/deepgram via withBundledPluginEnablementCompat). Fall back to the active registry only when no cfg is provided or compat load yields nothing.
  • What did NOT change: No config schema changes. No behavior change when no cfg is passed.
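The early return described above can be illustrated with a small model. This is a sketch under stated assumptions, not the OpenClaw source: the types, `resolveBefore`, `resolveAfter`, `hasCfg`, and `loadViaCompatConfig` are hypothetical stand-ins for `resolvePluginCapabilityProviders` and the compat-config load path.

```typescript
// Hypothetical, simplified model of the provider lookup; not the actual OpenClaw code.
type Capability = "audio" | "image" | "video";

interface CapabilityProvider {
  id: string;
  capability: Capability;
}

interface ResolveParams {
  // Providers already loaded in the active gateway registry, across ALL capabilities.
  activeProviders: CapabilityProvider[];
  // Stand-in for the compat-config load that injects bundled providers such as groq.
  loadViaCompatConfig: () => CapabilityProvider[];
  // Whether the caller passed a config object.
  hasCfg: boolean;
}

// Before the fix: any non-empty active registry short-circuits the lookup,
// even if none of its providers serve the requested capability.
function resolveBefore(p: ResolveParams): CapabilityProvider[] {
  if (p.activeProviders.length > 0) return p.activeProviders;
  return p.loadViaCompatConfig();
}

// After the fix: with a config, try the compat path first; use the active
// registry only when there is no config or the compat load yields nothing.
function resolveAfter(p: ResolveParams): CapabilityProvider[] {
  if (p.hasCfg) {
    const viaCompat = p.loadViaCompatConfig();
    if (viaCompat.length > 0) return viaCompat;
  }
  return p.activeProviders;
}

const params: ResolveParams = {
  activeProviders: [{ id: "openai", capability: "image" }],
  loadViaCompatConfig: () => [
    { id: "openai", capability: "image" },
    { id: "groq", capability: "audio" },
  ],
  hasCfg: true,
};

const hasGroq = (list: CapabilityProvider[]): boolean =>
  list.some((provider) => provider.id === "groq");

console.log(hasGroq(resolveBefore(params))); // false: only openai is returned
console.log(hasGroq(resolveAfter(params))); // true: compat path includes groq
```

With OpenAI loaded for image understanding, the old shape never reaches the compat load, which matches the "Media provider not available: groq" symptom.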

Change Type

  • Bug fix

Scope

  • Integrations

Linked Issue/PR

  • Closes #59875
  • Related #59502, #59437
  • This PR fixes a bug or regression

Root Cause / Regression History

  • Root cause: resolvePluginCapabilityProviders line 72 if (activeProviders.length > 0) return — early return skips the compat config load that injects bundled capability providers.
  • Missing detection / guardrail: No test covered the case where the active registry has providers for capability A (image) but the request is for capability B (audio) with a bundled provider not in the startup registry.
  • Prior context: The early return was introduced to prefer the active gateway registry when already populated; this optimization broke the fallback for capability-specific providers.
  • Why this regressed now: Commit 3dbd81e610 (cited in the PR as "PR #3dbd81e610") restored bundled compat loading in resolveCapabilityProviderConfig but left the early-return guard intact, so the compat path was only reached when the active registry was completely empty.

Regression Test Plan

  • Coverage level: Unit test
  • Target test: src/plugins/capability-provider-runtime.test.ts (if exists) or src/media-understanding/runner.ts test suite
  • Scenario: active registry contains openai (image), user requests groq (audio) — groq must be returned
  • Existing test: extensions/groq/plugin-registration.contract.test.ts covers registration; gap is in resolvePluginCapabilityProviders multi-provider scenario

User-visible / Behavior Changes

Groq Whisper audio transcription now works when tools.media.audio.models: [{"provider": "groq", "model": "whisper-large-v3-turbo"}] is configured and GROQ_API_KEY is set, even when other media providers (OpenAI, Google) are already loaded in the gateway registry.

Security Impact

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No (Groq was already being called when active registry was empty)
  • Command/tool execution surface changed? No
  • Data access scope changed? No

Repro + Verification

Environment

  • OpenClaw 2026.4.1, Amazon Linux 2023
  • GROQ_API_KEY set, tools.media.audio.enabled: true

Steps

  1. Set tools.media.audio.models: [{"provider": "groq", "model": "whisper-large-v3-turbo"}]
  2. Enable OpenAI for image understanding (or any other media provider)
  3. Send voice note via WhatsApp/Telegram
  4. Before: audio understanding failed: Error: Media provider not available: groq
  5. After: Groq transcribes successfully

Expected

Groq is loaded and audio is transcribed

Actual (before fix)

Media provider not available: groq — provider silently missing from registry

Evidence

  • Issue #59875 reports exact error: "audio understanding failed: Error: Media provider not available: groq"
  • Code trace: resolvePluginCapabilityProviders → activeProviders.length > 0 early-return → compat load skipped → groq not in registry → line 521 in runner.entries.ts throws

Human Verification

  • Traced the code path manually through resolvePluginCapabilityProviders → resolveRuntimePluginRegistry → resolveCapabilityProviderConfig → withBundledPluginEnablementCompat
  • Confirmed the early-return bypasses compat loading when activeProviders.length > 0
  • Did NOT verify in a live environment with GROQ_API_KEY (no live Groq key available)

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.

Compatibility / Migration

  • Backward compatible? Yes — behavior only changes when cfg is provided and compat load succeeds
  • Config/env changes? No
  • Migration needed? No

Risks and Mitigations

  • Risk: compat load is slightly heavier than the active registry path
    • Mitigation: resolveRuntimePluginRegistry uses getCompatibleActivePluginRegistry cache internally; most calls will hit cache and be cheap

AI-assisted: Prepared with Claude via OpenClaw. Root cause identified by code trace through capability-provider-runtime.ts, bundled-compat.ts, and runner.entries.ts. Verified by contributor.

Changed files

  • docs/channels/slack.md (modified, +34/-1)
  • src/plugins/capability-provider-runtime.ts (modified, +20/-7)

PR #59982: fix: add enabledByDefault to groq and deepgram media plugin manifests

Description (problem / solution / changelog)

Problem

The groq and deepgram plugin manifests were missing "enabledByDefault": true. Without this flag, both plugins are treated as bundled-but-disabled-by-default during gateway startup.

resolveRuntimePluginRegistry() loads without them. When buildProviderRegistry later needs to resolve audio providers, resolvePluginCapabilityProviders short-circuits to the active registry (skipping the compat path), so groq and deepgram are absent from the registry.

This caused the error in #59875:

audio understanding failed: Error: Media provider not available: groq

Even with GROQ_API_KEY / DEEPGRAM_API_KEY correctly configured, the plugin never loaded.

Root cause

resolvePluginActivationState in src/plugins/config-state.ts reaches this branch for groq/deepgram:

if (params.origin === 'bundled') {
  return { enabled: false, reason: 'bundled (disabled by default)' };
}

Because params.enabledByDefault is undefined (missing from manifest).

Fix

Add "enabledByDefault": true to both extensions/groq/openclaw.plugin.json and extensions/deepgram/openclaw.plugin.json, matching the pattern used by mistral and other audio/media providers.
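The effect of the missing manifest flag can be sketched as follows. This is an illustrative simplification, not the real `resolvePluginActivationState`: the function name `resolveActivation`, the types, the "enabled by default" reason string, and the non-bundled branch are assumptions. Only the `bundled (disabled by default)` branch is taken from the PR text.

```typescript
// Hypothetical, simplified activation decision; the real function lives in
// src/plugins/config-state.ts and handles more cases than shown here.
type PluginOrigin = "bundled" | "external";

interface ActivationParams {
  origin: PluginOrigin;
  // Read from the plugin manifest; undefined when the key is missing.
  enabledByDefault?: boolean;
}

interface ActivationState {
  enabled: boolean;
  reason: string;
}

function resolveActivation(params: ActivationParams): ActivationState {
  // Assumed branch: a bundled plugin that opts in via its manifest is enabled.
  if (params.origin === "bundled" && params.enabledByDefault === true) {
    return { enabled: true, reason: "bundled (enabled by default)" };
  }
  // Branch quoted in the PR: bundled plugins without the flag stay disabled.
  if (params.origin === "bundled") {
    return { enabled: false, reason: "bundled (disabled by default)" };
  }
  // Placeholder: non-bundled plugins are not modeled in this sketch.
  return { enabled: false, reason: "not modeled in this sketch" };
}

// Manifest without the flag (groq/deepgram before the fix).
const beforeFix = resolveActivation({ origin: "bundled" });
// Manifest with "enabledByDefault": true (after the fix).
const afterFix = resolveActivation({ origin: "bundled", enabledByDefault: true });

console.log(beforeFix.enabled, beforeFix.reason);
console.log(afterFix.enabled, afterFix.reason);
```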

Testing

  • All 1213 plugin tests pass (pnpm exec vitest run src/plugins/)
  • All 117 media-understanding tests pass (pnpm exec vitest run src/media-understanding/)
  • Pre-commit hook (tsgo + oxlint) passes

Fixes #59875

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • extensions/deepgram/openclaw.plugin.json (modified, +1/-0)
  • extensions/groq/openclaw.plugin.json (modified, +1/-0)

PR #60392: fix(plugins): auto-enable media provider plugins referenced from tools.media

Description (problem / solution / changelog)

Problem

After the plugin activation refactor (f911bbc353), bundled media provider plugins like Groq silently stop working even when:

  • GROQ_API_KEY is set in env
  • tools.media.audio.models explicitly references provider: groq

The auto-enable logic handles channel plugins and web search/fetch providers, but never checks tools.media config for media understanding providers. Groq (and likely Deepgram) fall through to "bundled, disabled by default."

Multiple users affected since 2026.3.31 / 2026.4.x.

Fixes #59875, fixes #59502, fixes #54695, fixes #59437

Solution

  • Add collectConfiguredMediaProviderIds() to scan all tools.media.{models,audio,image,video}.models entries for provider references
  • Add resolveMediaUnderstandingProviderPluginIds() to map provider IDs → plugin IDs (bundled snapshots + third-party manifests via mediaUnderstandingProviders contract)
  • Wire both into resolveConfiguredPlugins() so referenced media providers get auto-enabled
  • Update configMayNeedPluginAutoEnable() and configMayNeedPluginManifestRegistry() guards
  • Add "media-provider-configured" kind to PluginAutoEnableCandidate union type
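The scanning step in the solution above might look roughly like this. It is a minimal sketch, assuming that `tools.media.{models,audio,image,video}.models` means a shared `tools.media.models` list plus one `models` list per capability; the interfaces and the function name `collectProviderIds` are hypothetical and are not the PR's `collectConfiguredMediaProviderIds()`.

```typescript
// Hypothetical config shapes, inferred from the config keys quoted in the issue.
interface MediaModelEntry {
  provider?: string;
  model?: string;
}

interface MediaSection {
  models?: MediaModelEntry[];
}

interface ToolsMediaConfig {
  models?: MediaModelEntry[]; // shared list (tools.media.models)
  audio?: MediaSection;
  image?: MediaSection;
  video?: MediaSection;
}

// Collect every provider id referenced anywhere under tools.media.
function collectProviderIds(media: ToolsMediaConfig | undefined): Set<string> {
  const ids = new Set<string>();
  if (!media) return ids;
  const lists = [
    media.models,
    media.audio?.models,
    media.image?.models,
    media.video?.models,
  ];
  for (const list of lists) {
    for (const entry of list ?? []) {
      const id = entry.provider?.trim().toLowerCase();
      if (id) ids.add(id);
    }
  }
  return ids;
}

const referenced = collectProviderIds({
  audio: { models: [{ provider: "groq", model: "whisper-large-v3-turbo" }] },
});
console.log(referenced.has("groq")); // true
```

Each collected id would then be mapped to a plugin id and treated as an auto-enable candidate, in the same way the PR describes for channels and web search/fetch providers.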

Review feedback addressed (v2)

  • P1 (greptile): Normalize bundled provider IDs via normalizeMediaProviderId() before map insertion to avoid alias mismatches (e.g. "gemini" → "google")
  • P1 (codex): Move media provider check outside the channels-only early return in configMayNeedPluginManifestRegistry so third-party media plugins can resolve without channel config
  • P2 (greptile): Skip registry loading for bundled-only media providers via isBundledMediaProviderId() gate
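The alias mismatch from the first review item can be sketched as below. The alias table, `normalizeId`, `buildProviderToPlugin`, and the plugin ids are illustrative assumptions; only the "gemini" → "google" example comes from the review note.

```typescript
// Hypothetical alias table; only the gemini -> google pair is from the PR text.
const ALIASES: Record<string, string> = { gemini: "google" };

// Map an id to its canonical form before using it as a key.
function normalizeId(id: string): string {
  const key = id.trim().toLowerCase();
  return ALIASES[key] ?? key;
}

// Build a provider-id -> plugin-id map, normalizing keys on insertion so that
// later lookups by either the alias or the canonical id hit the same entry.
function buildProviderToPlugin(entries: Array<[string, string]>): Map<string, string> {
  const map = new Map<string, string>();
  for (const [providerId, pluginId] of entries) {
    map.set(normalizeId(providerId), pluginId);
  }
  return map;
}

// Plugin ids below are placeholders for illustration.
const providerToPlugin = buildProviderToPlugin([
  ["gemini", "google-plugin"],
  ["groq", "groq-plugin"],
]);

// A config that says "google" still resolves, because the key was normalized.
console.log(providerToPlugin.get(normalizeId("google")));
console.log(providerToPlugin.get(normalizeId("Gemini")));
```

Without normalization at insertion time, the map would hold the key "gemini" while a normalized lookup asks for "google", and the plugin would not be found.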

Testing

  • Added tests for bundled (Groq) and third-party media provider plugin auto-enable
  • Manually verified: Groq audio transcription works with only tools.media.audio.models: [{ provider: "groq" }] — no plugins.entries.groq.enabled: true workaround needed
  • Existing tests pass: vitest run src/config/plugin-auto-enable.test.ts && vitest run src/plugins/providers.test.ts

AI-assisted

This PR was authored with AI assistance (Claude Opus 4.6). Fully tested locally (unit tests + manual Groq transcription verification on a dev OpenClaw instance). We understand what the code does — it extends the existing plugin auto-enable pattern to cover media understanding providers the same way it already covers channels, browser, and web search/fetch providers.

Changed files

  • src/config/plugin-auto-enable.test.ts (modified, +64/-1)
  • src/config/plugin-auto-enable.ts (modified, +95/-9)

Raw issue body

Bug type

Regression (worked before, now fails)

Beta release blocker

No

Summary

Groq Whisper audio transcription fails silently on 2026.3.31+ — provider resolution no longer finds Groq for the audio capability, confirmed working on 2026.3.14 with identical config and API key, verified via Groq dashboard showing zero requests.

Steps to reproduce

  1. Configure GROQ_API_KEY in env.vars, set tools.media.audio.enabled: true with scope: { default: "allow" }, and tools.media.audio.models: [{ "provider": "groq", "model": "whisper-large-v3-turbo" }].
  2. Restart gateway.
  3. Send a voice note (audio/ogg; codecs=opus) via WhatsApp to the OpenClaw instance running 2026.3.31 or later.
  4. Observe audio arrives as raw media:audio with no transcription — Groq dashboard confirms zero API requests.
  5. Repeat on an identical instance running 2026.3.14 with the same API key — transcription succeeds.
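For reference, the configuration from step 1 can be written out as a nested object. This is a sketch assembled from the dotted keys in the report; the exact nesting in a real OpenClaw config file, and the placeholder key value, are assumptions.

```typescript
// Sketch of the reproduction config, built from the keys quoted in step 1.
// The API key value is a placeholder, not a real credential.
const reproConfig = {
  env: {
    vars: { GROQ_API_KEY: "<your-groq-api-key>" },
  },
  tools: {
    media: {
      audio: {
        enabled: true,
        scope: { default: "allow" },
        models: [{ provider: "groq", model: "whisper-large-v3-turbo" }],
      },
    },
  },
};

console.log(JSON.stringify(reproConfig.tools.media.audio, null, 2));
```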

Expected behavior

In 2026.3.14, Groq Whisper transcribes WhatsApp voice notes successfully — audio is converted to text and delivered to the agent session, confirmed working daily for approximately two months. Groq Console confirms API requests are made.

Actual behavior

On OpenClaw 2026.3.31+, inbound voice notes arrive as raw media:audio with no transcription. Two distinct failure modes observed:

  • 2026.4.2: Gateway logs "audio understanding failed: Error: Media provider not available: groq" — provider lookup explicitly fails.
  • 2026.4.2-beta.1 (721cab2): No error logged. Logs show audio enters process-message pipeline and is forwarded directly to the agent session with zero processing steps — the STT pipeline is entirely bypassed.

Groq Console → Usage confirms zero API requests from the affected machine in 24+ hours. All config approaches (explicit tools.media.audio.models, shared tools.media.models with capabilities, and auto-detect) produce the same result.

OpenClaw version

Broken: 2026.3.31 (56b5ba0), 2026.4.1 (4f407d2), 2026.4.2 (f8e67ef), 2026.4.2-beta.1 (721cab2). Working: 2026.3.14 (594920f)

Operating system

macOS Monterey 12.7.x (x64)

Install method

pnpm dev

Model

anthropic/claude-sonnet-4.6 and anthropic/claude-opus-4.6

Provider / routing chain

WhatsApp (Baileys) → OpenClaw gateway → tools.media.audio pipeline → Groq API (whisper-large-v3-turbo) — pipeline fails before reaching Groq

Additional provider/model setup details

Auth: GROQ_API_KEY set in env.vars (2026.4.x instance) and top-level env (2026.3.14 instance, legacy format). No auth.profiles entry for Groq — key is resolved via env var only. models.providers is empty — Groq is not registered as a chat model provider (only used for audio).

Audio config tested on the 4.x instance:

  • tools.media.audio.enabled: true
  • tools.media.audio.scope: { "default": "allow" }
  • tools.media.audio.models: [{ "provider": "groq", "model": "whisper-large-v3-turbo" }]
  • Also tested: tools.media.models: [{ "provider": "groq", "model": "whisper-large-v3-turbo", "capabilities": ["audio"] }] — same result.
  • Also tested: auto-detect (no models arrays) — Groq is never selected, falls through to built-in OpenAI fallback.

No proxies, no baseUrl overrides, no per-agent audio config. Direct egress to api.groq.com.

Groq key verified independently: curl https://api.groq.com/openai/v1/models -H "Authorization: Bearer $GROQ_API_KEY" returns whisper models — confirms the key works outside OpenClaw.
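The same key check can be scripted. The sketch below assumes the endpoint returns an OpenAI-style list of the form `{ "data": [{ "id": "..." }] }`; the `ModelList` and `FetchLike` types and both function names are hypothetical. The fetch function is passed in, so the parsing logic can be exercised without network access.

```typescript
// Assumed response shape for GET /openai/v1/models (OpenAI-compatible list).
interface ModelList {
  data?: Array<{ id?: string }>;
}

// Minimal fetch-like signature so the caller can inject the real fetch or a stub.
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Pure helper: does the model list contain the given model id?
function listsModel(body: ModelList, modelId: string): boolean {
  return (body.data ?? []).some((model) => model.id === modelId);
}

// Equivalent of the curl command: list models with the key, then look for one.
async function keyListsModel(
  fetchImpl: FetchLike,
  apiKey: string,
  modelId: string,
): Promise<boolean> {
  const res = await fetchImpl("https://api.groq.com/openai/v1/models", {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!res.ok) return false;
  return listsModel((await res.json()) as ModelList, modelId);
}

// Offline check of the parsing logic with a hand-written sample body.
const sampleBody: ModelList = { data: [{ id: "whisper-large-v3-turbo" }] };
console.log(listsModel(sampleBody, "whisper-large-v3-turbo")); // true
```

On a runtime with a global `fetch`, `keyListsModel(fetch, process.env.GROQ_API_KEY ?? "", "whisper-large-v3-turbo")` would perform the live check. A `true` result only shows that the key works outside OpenClaw, which is the point the reporter makes.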

Suspected trigger: 2026.3.31 changelog entry "deprecate legacy provider compat subpaths" — likely broke Groq's registration path for the audio capability.

Logs, screenshots, and evidence

1. Error log from 2026.4.2 instance:

audio understanding failed: Error: Media provider not available: groq


2. Log trace from 2026.4.2-beta.1 instance (no error, pipeline bypassed):

Audio arrives at inbound monitor:

{"from":"<group-jid>","body":"<media:audio>","mediaPath":"...ogg","mediaType":"audio/ogg; codecs=opus"} → "inbound message"


Immediately forwarded to session with zero processing:

{"body":"[WhatsApp ...] <sender>: <media:audio>","mediaType":"audio/ogg; codecs=opus"} → "inbound web message"


No STT/Whisper/Groq/transcription entries appear between these two log lines. The audio pipeline is entirely skipped.

3. Groq Console → Usage dashboard:
Zero API requests from the 2026.4.x instance in 24+ hours. Confirms the request never leaves the machine.

4. 2026.3.14 instance comparison:
Same voice note, same API key — transcript delivered successfully. Groq Console shows the API request.

5. Groq API key independently verified:

$ curl https://api.groq.com/openai/v1/models -H "Authorization: Bearer $GROQ_API_KEY"
# Returns model list including whisper-large-v3-turbo ✅


6. Related issue: #7573 — similar symptoms reported against 2026.2.1 (explicit audio models ignored, Groq Console shows zero requests).

---
![Groq Console - 24h zero usage](https://github.com/user-attachments/assets/614beb4a-cf0f-452e-8e6d-89f2ec75e9fc)

![Groq Console - 7 days usage](https://github.com/user-attachments/assets/b8b2d201-4eca-4d41-9ac3-17d285a43d79)

---

Impact and severity

Affected: Any OpenClaw user running 2026.3.31+ with Groq as their audio transcription provider. Tested on WhatsApp (Baileys) — likely affects all channels since the failure is in the media understanding pipeline, not the channel layer.

Severity: Blocks workflow. Voice notes are a primary input method — without transcription, the agent receives raw `<media:audio>` tags and cannot process spoken content.

Frequency: Always. 100% reproducible on every voice note sent to a 2026.3.31+ instance. Not intermittent — Groq is never reached.

Consequence: Voice notes are silently dropped as unprocessable input. On 2026.4.2-beta.1, no error is logged — users may not realise their voice messages aren't being understood. The only partial mitigation is auto-detect fallback to free OpenAI Whisper, which is intermittent and unreliable. Paid Groq API key sits unused.

Additional information

Last known good: 2026.3.14 (594920f).

First known bad: 2026.3.31 (56b5ba0).

Suspected trigger: 2026.3.31 changelog entry "deprecate legacy provider compat subpaths" — Groq's audio provider registration path appears to have been broken by this change.

Temporary workaround: Remove explicit Groq config and rely on auto-detect, which occasionally falls through to the built-in free OpenAI Whisper fallback. This is intermittent and unreliable but provides partial STT coverage.

Note on failure mode change: 2026.4.2 logs "Media provider not available: groq" (explicit error). 2026.4.2-beta.1 produces complete silence — no error, no transcription attempt logged. This suggests active changes to the provider resolution code between these versions, but Groq audio remains broken on both.

Related: #7573 (filed Feb 2026, similar symptoms on 2026.2.1 — may indicate a recurring or never-fully-resolved issue with Groq audio provider registration).

extent analysis

TL;DR

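Groq Whisper transcription fails on 2026.3.31+ because the bundled Groq media plugin is not loaded for the audio capability. The reporter suspected the "deprecate legacy provider compat subpaths" change. The linked PRs point instead to plugin activation and provider resolution: groq and deepgram fall through to "bundled (disabled by default)" (PR #59982, PR #60392), and resolvePluginCapabilityProviders returns early when the active registry already contains other media providers (PR #59926).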

Guidance

  • Verify that the GROQ_API_KEY is correctly set in the environment variables and that the Groq API is accessible independently of OpenClaw.
  • Check the OpenClaw configuration to ensure that the Groq provider is correctly registered and configured for audio transcription.
  • Test the audio transcription with different versions of OpenClaw to confirm that the issue is indeed related to the change in the provider registration.
  • Consider temporarily removing the explicit Groq config and relying on auto-detect to fall back to the built-in free OpenAI Whisper fallback, although this is intermittent and unreliable.

Example

No code snippet was provided with this analysis. The linked PRs show that the fix touches plugin manifests, plugin auto-enable logic, and provider-resolution code; see the PR fix notes above for the changed files.

Notes

The issue seems to be related to a change in the provider registration mechanism in OpenClaw, which broke the registration path for the Groq audio provider. The fact that the issue is reproducible on every voice note sent to a 2026.3.31+ instance suggests that the problem is not intermittent and is likely related to the configuration or provider registration.

Recommendation

Apply a workaround by removing the explicit Groq config and relying on auto-detect to fall back to the built-in free OpenAI Whisper fallback, although this is intermittent and unreliable, until a proper fix is available that updates the provider registration to be compatible with the new changes.
