openclaw - ✅(Solved) Fix [Bug]:webchat: image/PDF visual inspection fails with 'No media-understanding provider registered' [3 pull requests, 1 participants]

GitHub stats
openclaw/openclaw#53687 · Fetched 2026-04-08 01:24:49
Comments: 0 · Participants: 1 · Timeline events: 6 · Reactions: 0
Timeline (top): cross-referenced ×4, labeled ×2

In OpenClaw webchat, Ballie can fetch Dropbox files and even render a linked PDF to PNG locally, but cannot visually inspect screenshots or rendered PDF pages because image analysis fails with a missing media-understanding provider.

Error Message

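Observed error:

All image models failed (2): openai/gpt-5-mini: No media-understanding provider registered for openai | anthropic/claude-opus-4-5: No media-understanding provider registered for anthropic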

Root Cause

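The issue itself does not identify a root cause beyond the symptom: image analysis fails because no media-understanding provider is registered for the configured image models. The notes in PR #29778 point to a possible contributor: hasInboundMedia() in get-reply.ts does not check opts.images, which may skip media understanding and sandbox staging for images sent through WebChat.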

PR fix notes

PR #29778: Fix WebChat image-only messages rejected as empty (opts.images not checked in hasMediaAttachment)

Description (problem / solution / changelog)

Fixes #24662 Fixes #43590 Fixes #45487 Fixes #45917 Fixes #46534 Fixes #52673 Fixes #53271 Fixes #56561 Fixes #57064 Related #41801 Related #44446 Related #53687


Summary

  • Problem: WebChat image-only messages are incorrectly rejected as empty with "I didn't receive any text in your message" because hasMediaAttachment does not check opts.images
  • Why it matters: Blocks all multimodal workflows via WebChat — users cannot send images through the Control UI for agent analysis
  • What changed: Added opts?.images && opts.images.length > 0 check to the hasMediaAttachment condition in get-reply-run.ts
  • What did NOT change: No changes to text handling, session context media paths, or other reply logic

Root Cause Analysis

The hasMediaAttachment guard in src/auto-reply/reply/get-reply-run.ts only checked sessionCtx.MediaPath and sessionCtx.MediaPaths, but not opts.images. When WebChat sends images via chat.send with the attachments parameter, they arrive as opts.images — completely bypassing the media check.

Other channels (Telegram, Discord, QQ, Feishu, etc.) use sessionCtx.MediaPath / sessionCtx.MediaPaths (set by their respective channel ingestors), so they are unaffected. This is a WebChat-specific bug.

  const hasMediaAttachment = Boolean(
-   sessionCtx.MediaPath || (sessionCtx.MediaPaths && sessionCtx.MediaPaths.length > 0),
+   sessionCtx.MediaPath ||
+   (sessionCtx.MediaPaths && sessionCtx.MediaPaths.length > 0) ||
+   (opts?.images && opts.images.length > 0),
  );
  if (!baseBodyTrimmed && !hasMediaAttachment) {
    await typing.onReplyStart();
    logVerbose("Inbound body empty after normalization; skipping agent run");
    typing.cleanup();
    return { text: "I didn't receive any text in your message. Please resend or add a caption." };
  }
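The guard above can be isolated as a pure function, which makes the WebChat path easy to exercise on its own. The sketch below is a minimal illustration, not the repository's code: the SessionMediaCtx and ReplyOpts types and the shouldRejectAsEmpty helper are assumptions, and only the field names MediaPath, MediaPaths, and images come from the diff.

```typescript
// Hypothetical, trimmed-down shapes; the real OpenClaw types carry more fields.
interface SessionMediaCtx {
  MediaPath?: string;
  MediaPaths?: string[];
}

interface ReplyOpts {
  images?: unknown[];
}

// Mirrors the patched condition: any of the three sources counts as media.
function hasMediaAttachment(sessionCtx: SessionMediaCtx, opts?: ReplyOpts): boolean {
  return Boolean(
    sessionCtx.MediaPath ||
      (sessionCtx.MediaPaths && sessionCtx.MediaPaths.length > 0) ||
      (opts?.images && opts.images.length > 0),
  );
}

// Hypothetical helper: a message is rejected as empty only when it has neither text nor media.
function shouldRejectAsEmpty(body: string, sessionCtx: SessionMediaCtx, opts?: ReplyOpts): boolean {
  return body.trim().length === 0 && !hasMediaAttachment(sessionCtx, opts);
}

// WebChat image-only message: no text, no session media paths, one image in opts.
console.log(shouldRejectAsEmpty("", {}, { images: [{ type: "image" }] })); // false
// An empty images array still counts as "no media".
console.log(shouldRejectAsEmpty("", {}, { images: [] })); // true
```

Keeping the check in one predicate means a later channel that delivers media through yet another field only needs one more clause here.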

WebChat Frontend Image Path (confirmed via source code audit)

The Control UI handles images through three paths:

  1. Paste: listens for paste events and reads clipboardData.items of type image/*
  2. File picker: input[type=file] change event with accept="image/*"
  3. Drag & drop: dragover + drop event handlers

All three convert images to base64 data URLs and send them via chat.send:

await e.client.request(`chat.send`, {
  sessionKey: e.sessionKey,
  message: r,
  deliver: false,
  idempotencyKey: s,
  attachments: [{type: "image", mimeType: t.mimeType, content: base64Content}]
})

The gateway correctly parses these into opts.images via normalizeRpcAttachmentsToChatAttachments(). The only missing piece is the hasMediaAttachment guard not checking opts.images.
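The step from a base64 data URL to an attachments entry can be sketched as a small pure function. This is an illustrative assumption about the shape of that conversion, not the Control UI's or the gateway's actual code: the RpcImageAttachment type and the dataUrlToAttachment name are hypothetical, and only the type, mimeType, and content fields come from the chat.send call shown above.

```typescript
// Hypothetical attachment shape, matching the fields in the chat.send payload.
interface RpcImageAttachment {
  type: "image";
  mimeType: string;
  content: string; // base64 payload without the "data:...;base64," prefix
}

// Splits "data:image/png;base64,AAAA" into a MIME type and the raw base64 body.
// Returns null for anything that is not a base64-encoded image data URL.
function dataUrlToAttachment(dataUrl: string): RpcImageAttachment | null {
  const match = /^data:(image\/[a-zA-Z0-9.+-]+);base64,([A-Za-z0-9+/]+={0,2})$/.exec(dataUrl);
  if (!match) {
    return null;
  }
  return { type: "image", mimeType: match[1], content: match[2] };
}

const attachment = dataUrlToAttachment("data:image/png;base64,iVBORw0KGgo=");
console.log(attachment?.mimeType, attachment?.content);
console.log(dataUrlToAttachment("data:text/plain;base64,aGk=")); // null
```

Rejecting non-image data URLs at this boundary keeps the attachments array limited to entries the backend guard will later count as images.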

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX (WebChat image handling)
  • CI/CD / infra

User-visible / Behavior Changes

  • Image-only messages sent via WebChat will now be correctly processed by the agent
  • Previously, such messages were rejected as empty with "I didn't receive any text in your message"

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No

Repro + Verification

Environment

  • OS: Linux 6.8.0-101-generic (x64)
  • Runtime/container: Node.js v22.22.1
  • Model/provider: Any multimodal model
  • Integration/channel: WebChat (Control UI)

Steps

  1. Open WebChat / Control UI
  2. Paste or drag-and-drop an image (no text caption)
  3. Send the message

Expected (after fix)

  • Agent receives and processes the image
  • Model sees the image content and responds accordingly

Actual (before fix)

  • Message rejected: "I didn't receive any text in your message. Please resend or add a caption."

Test Coverage

Added two test cases in src/auto-reply/reply/get-reply-run.media-only.test.ts:

  1. allows image-only messages via opts.images without sessionCtx media — Core scenario. Sends images via opts.images (WebChat path), verifies the message proceeds to the agent runner.
  2. still rejects empty messages when opts.images is an empty array — Boundary case ensuring images: [] still triggers the empty-body rejection.

Notes

  • #53271's analysis is partially incorrect: WebChat does upload images; the real problem is the backend rejecting them, not the frontend failing to upload.
  • A secondary issue exists: hasInboundMedia() in get-reply.ts has the same blind spot for opts.images, which may skip media understanding and sandbox staging for WebChat images. Tracked in #53687.

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No
  • Revert plan: Single-line revert

CI Note

The 3 failing jobs (checks-node-test-4, checks-windows-node-test-1/2) fail on resolveChannelModelOverride > keeps bundled Feishu parent fallback matching before registry bootstrap — a known upstream issue caused by Tiangong model endpoint timeouts. Unrelated to this PR's changes. The new tests pass across all other lanes.

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: Revert the commit
  • Files/config to restore: N/A
  • Known bad symptoms reviewers should watch for: None expected

Risks and Mitigations

  • Risk: Minimal — single condition addition following existing patterns
  • Mitigation: Uses optional chaining and array length validation, matching the MediaPaths pattern

Changed files

  • src/auto-reply/reply/get-reply-run.media-only.test.ts (modified, +60/-0)
  • src/auto-reply/reply/get-reply-run.ts (modified, +3/-1)

PR #61808: 🤖 Fix WebChat image-only messages rejected as empty (opts.images not checked in hasMediaAttachment)

Description (problem / solution / changelog)

🤖 AI-Assisted PR

Built with Claude Code (AI coding agent).

  • Testing degree: Lightly tested — change applies existing test patterns; tests exist upstream
  • I confirm I understand the code: Yes — single-condition addition to hasMediaAttachment guard

Fixes #24662 Fixes #43590 Fixes #45487 Fixes #45917 Fixes #46534 Fixes #52673 Fixes #53271 Fixes #56561 Fixes #57064 Related #41801 Related #44446 Related #53687

Summary

  • Problem: WebChat image-only messages are incorrectly rejected as empty with "I didn't receive any text in your message" because hasMediaAttachment does not check opts.images
  • Why it matters: Blocks all multimodal workflows via WebChat — users cannot send images through the Control UI for agent analysis
  • What changed: Added opts?.images && opts.images.length > 0 check to the hasMediaAttachment condition in get-reply-run.ts
  • What did NOT change: No changes to text handling, session context media paths, or other reply logic

Root Cause Analysis

The hasMediaAttachment guard in src/auto-reply/reply/get-reply-run.ts only checked sessionCtx.MediaPath and sessionCtx.MediaPaths, but not opts.images. When WebChat sends images via chat.send with the attachments parameter, they arrive as opts.images — completely bypassing the media check.

Other channels (Telegram, Discord, QQ, Feishu, etc.) use sessionCtx.MediaPath / sessionCtx.MediaPaths (set by their respective channel ingestors), so they are unaffected. This is a WebChat-specific bug.

  const hasMediaAttachment = Boolean(
-   sessionCtx.MediaPath || (sessionCtx.MediaPaths && sessionCtx.MediaPaths.length > 0),
+   sessionCtx.MediaPath ||
+   (sessionCtx.MediaPaths && sessionCtx.MediaPaths.length > 0) ||
+   (opts?.images && opts.images.length > 0),
  );
  if (!baseBodyTrimmed && !hasMediaAttachment) {
    await typing.onReplyStart();
    logVerbose("Inbound body empty after normalization; skipping agent run");
    typing.cleanup();
    return { text: "I didn't receive any text in your message. Please resend or add a caption." };
  }

Change Type

  • Bug fix
  • Feature
  • Refactor

Security Impact

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No

Compatibility

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No
  • Revert plan: Single-line revert

Changed files

  • src/auto-reply/reply/get-reply-run.media-only.test.ts (modified, +60/-0)
  • src/auto-reply/reply/get-reply-run.ts (modified, +233/-157)

PR #61811: 🤖 Fix WebChat image-only messages rejected as empty (opts.images not checked in hasMediaAttachment)

Description (problem / solution / changelog)

🤖 AI-Assisted PR

Built with Claude Code (AI coding agent).

  • Testing degree: Lightly tested — change follows existing patterns
  • I confirm I understand the code: Yes — single-condition addition to hasMediaAttachment guard

Fixes #24662 Fixes #43590 Fixes #45487 Fixes #45917 Fixes #46534 Fixes #52673 Fixes #53271 Fixes #56561 Fixes #57064 Related #41801 Related #44446 Related #53687

Summary

  • Problem: WebChat image-only messages are incorrectly rejected as empty with "I didn't receive any text in your message" because hasMediaAttachment does not check opts.images
  • Why it matters: Blocks all multimodal workflows via WebChat — users cannot send images through the Control UI for agent analysis
  • What changed: Added opts?.images && opts.images.length > 0 check to the hasMediaAttachment condition in get-reply-run.ts

Root Cause

The hasMediaAttachment guard only checked sessionCtx.MediaPath and sessionCtx.MediaPaths, but not opts.images. When WebChat sends images via chat.send with the attachments parameter, they arrive as opts.images — completely bypassing the media check.

Other channels (Telegram, Discord, QQ, Feishu, etc.) use sessionCtx.MediaPath / sessionCtx.MediaPaths, so they are unaffected. This is a WebChat-specific bug.

  const hasMediaAttachment = Boolean(
-   sessionCtx.MediaPath || (sessionCtx.MediaPaths && sessionCtx.MediaPaths.length > 0),
+   sessionCtx.MediaPath ||
+   (sessionCtx.MediaPaths && sessionCtx.MediaPaths.length > 0) ||
+   (opts?.images && opts.images.length > 0),
  );

Security Impact

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • Command/tool execution surface changed? No

Compatibility

  • Backward compatible? Yes
  • Config/env changes? No
  • Revert plan: Single-line revert

Changed files

  • src/auto-reply/reply/get-reply-run.media-only.test.ts (modified, +64/-0)
  • src/auto-reply/reply/get-reply-run.ts (modified, +3/-1)

Bug type

Regression (worked before, now fails)

Summary

In OpenClaw webchat, Ballie can fetch Dropbox files and even render a linked PDF to PNG locally, but cannot visually inspect screenshots or rendered PDF pages because image analysis fails with a missing media-understanding provider.

Steps to reproduce

  • Surface: webchat
  • Chat type: direct chat
  • Runtime observed in session: OpenClaw main agent session on macOS
  • Repo: openclaw/openclaw

Expected behavior

The assistant should be able to:

  • fetch image/PDF links (e.g. Dropbox)
  • inspect uploaded screenshots/images
  • inspect rendered PDF pages
  • discuss the actual visual content of diagrams/documents

Actual behavior

The assistant can:

  • fetch Dropbox links
  • download the PDF
  • convert the PDF to PNG locally

But image analysis fails at the final step, so the assistant cannot inspect the visual content.

Observed error:

All image models failed (2): openai/gpt-5-mini: No media-understanding provider registered for openai | anthropic/claude-opus-4-5: No media-understanding provider registered for anthropic
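When triaging this error, it can help to split the aggregated message into one entry per model, because identical reasons across every model suggest a registration problem rather than a model-specific failure. The parser below is a minimal sketch: the message format is inferred only from the single error string in this report, and the ModelFailure type and parseImageModelFailures name are hypothetical.

```typescript
interface ModelFailure {
  model: string;
  reason: string;
}

// Parses "All image models failed (N): <model>: <reason> | <model>: <reason>".
// Returns an empty array when the message does not start with the expected prefix.
function parseImageModelFailures(message: string): ModelFailure[] {
  const prefix = /^All image models failed \(\d+\):\s*/;
  if (!prefix.test(message)) {
    return [];
  }
  return message
    .replace(prefix, "")
    .split(" | ")
    .map((entry) => {
      // The first ": " separates the model ref from the failure reason.
      const sep = entry.indexOf(": ");
      return sep === -1
        ? { model: entry.trim(), reason: "" }
        : { model: entry.slice(0, sep).trim(), reason: entry.slice(sep + 2).trim() };
    });
}

const observed =
  "All image models failed (2): openai/gpt-5-mini: No media-understanding provider registered for openai" +
  " | anthropic/claude-opus-4-5: No media-understanding provider registered for anthropic";

const failures = parseImageModelFailures(observed);
// Every entry failing for the same reason points at registration, not at one model.
const allMissingProvider = failures.every((f) =>
  f.reason.startsWith("No media-understanding provider registered"),
);
console.log(failures.map((f) => f.model), allMissingProvider);
```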

OpenClaw version

2026.3.23

Operating system

macOS 15.7.5

Install method

npm global

Model

OpenAI 5

Provider / routing chain

Gateway

Additional provider/model setup details

~/.openclaw/openclaw.json

Logs, screenshots, and evidence

Impact and severity

Why this matters

This blocks a very basic assistant workflow:

  • reading diagrams
  • reviewing screenshots
  • discussing PDF page layouts/graphics

It creates unnecessary friction for users who reasonably expect image/PDF inspection to work in chat.

Additional information

Suggestion

Please ensure a media-understanding provider is registered and available for image analysis in this runtime/surface, or expose clearer capability/status reporting before the assistant attempts visual inspection.
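The capability reporting requested here could take the form of a pre-flight check that runs before any image is sent to a model. The sketch below is purely illustrative and does not reflect OpenClaw internals: the ProviderRegistry type, the CapabilityReport shape, and both function names are hypothetical, and model refs are assumed to follow the "provider/model" form seen in the observed error.

```typescript
// Hypothetical registry: the set of provider ids with a registered media-understanding provider.
type ProviderRegistry = Set<string>;

interface CapabilityReport {
  usable: string[]; // model refs whose provider is registered
  missing: string[]; // provider ids with no registration
}

// Checks each "provider/model" ref against the registry without calling any model.
function checkImageCapability(modelRefs: string[], registry: ProviderRegistry): CapabilityReport {
  const usable: string[] = [];
  const missing = new Set<string>();
  for (const ref of modelRefs) {
    const providerId = ref.split("/")[0];
    if (registry.has(providerId)) {
      usable.push(ref);
    } else {
      missing.add(providerId);
    }
  }
  return { usable, missing: Array.from(missing) };
}

// Turns the report into a status line the assistant could show up front.
function describeCapability(report: CapabilityReport): string {
  if (report.usable.length > 0) {
    return `Image analysis available via ${report.usable.join(", ")}`;
  }
  return `Image analysis unavailable: no media-understanding provider registered for ${report.missing.join(", ")}`;
}

const report = checkImageCapability(
  ["openai/gpt-5-mini", "anthropic/claude-opus-4-5"],
  new Set<string>(),
);
console.log(describeCapability(report));
```

With a check like this, the assistant could state up front that visual inspection is unavailable instead of failing after the PDF has already been downloaded and rendered.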

extent analysis

Fix Plan

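To address the missing media-understanding provider for image analysis, the generated plan suggests the steps below. The mediaUnderstandingProviders configuration key and the registerMediaUnderstandingProvider call are not confirmed by the issue or the linked PRs, so verify them against the OpenClaw documentation for your version before applying them: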

  • Register a media-understanding provider for the OpenClaw webchat:
    • Update the openclaw.json file in ~/.openclaw/ to include the media-understanding provider configuration.
    • Example configuration:
{
  "mediaUnderstandingProviders": [
    {
      "name": "openai",
      "provider": "openai/gpt-5-mini",
      "enabled": true
    },
    {
      "name": "anthropic",
      "provider": "anthropic/claude-opus-4-5",
      "enabled": true
    }
  ]
}
  • Ensure the media-understanding provider is installed and configured correctly:
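    • Run npm install openai and npm install @anthropic-ai/sdk to install the provider SDKs (the official Anthropic Node.js package is @anthropic-ai/sdk, not anthropic).
    • Update the OpenClaw code to import and register the media-understanding providers:
// Assumption: `openclaw` refers to an already-initialized OpenClaw instance, and
// registerMediaUnderstandingProvider is a hypothetical API not confirmed by the issue or PRs.
const OpenAI = require('openai');
const Anthropic = require('@anthropic-ai/sdk');

// Register media-understanding providers
openclaw.registerMediaUnderstandingProvider('openai', OpenAI);
openclaw.registerMediaUnderstandingProvider('anthropic', Anthropic);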
  • Restart the OpenClaw webchat to apply the changes.

Verification

To verify that the fix worked, test the image analysis functionality in the OpenClaw webchat:

  • Upload a screenshot or PDF file to the chat.
  • Attempt to inspect the visual content of the uploaded file.
  • Verify that the assistant can successfully analyze the image and discuss its content.

Extra Tips

  • Ensure that the media-understanding providers are properly configured and enabled in the openclaw.json file.
  • Check the OpenClaw logs for any errors or warnings related to the media-understanding providers.
  • If issues persist, try updating the OpenClaw version or seeking further assistance from the OpenClaw community.
