openclaw - ✅ (Solved) Fix [Bug]: Vision pipeline failure on macOS (M2): "Model does not support images" despite correct config, and "Failed to optimize image" persists after sharp install [1 pull request, 1 comment, 2 participants]

GitHub stats

openclaw/openclaw#74552 (fetched 2026-04-30 06:23:04)

  • Comments: 1
  • Participants: 2
  • Timeline events: 4
  • Reactions: 2
  • Timeline (top): labeled ×2, commented ×1, cross-referenced ×1

Bug Description

On OpenClaw 2026.4.26, the vision pipeline is fundamentally broken across multiple platforms (macOS ARM64 and Linux ARM64). The issue is two-fold:

  1. Model Capability Misdetection: LLMs configured with input: ["text", "image"] are rejected with reason=Model does not support images.
  2. Missing/Inaccessible Dependency: The fallback image optimization fails because sharp is not bundled or correctly linked, even after manual attempts to install it globally.

Environment 1: macOS (M2 Mac Mini)

  • OpenClaw: 2026.4.26
  • Node.js: v25.8.2
  • Arch: Apple Silicon (M2)

Environment 2: Linux (Oracle Cloud VPS)

  • OS: Ubuntu 24.04.4 LTS (Kernel 6.17.0-1011-oracle)
  • Arch: aarch64 (ARM64)
  • Node.js: v25.9.0
  • OpenClaw: 2026.4.26 (be8c246)

Steps to Reproduce

  1. Deploy OpenClaw on an ARM64 host (macOS or Linux).
  2. Configure any vision-capable model.
  3. Send an image.
  4. Observe Model does not support images error followed by Failed to optimize image.

Environment 1: macOS (M2 Mac Mini) - Primary Focus

  • OpenClaw: 2026.4.26
  • Node.js: v25.8.2
  • Dependency Status: require('sharp') works perfectly in the terminal.
  • Actual Behavior: Despite sharp being available, the logs show:

01:58:32 [media-understanding] image: failed (0/1) reason=Model does not support images
01:59:03 [tools] image failed: Failed to optimize image raw_params={}

Environment 2: Linux (Oracle Cloud VPS)

  • OS: Ubuntu 24.04.4 LTS (aarch64)
  • Node.js: v25.9.0
  • Dependency Status: sharp is missing/not found.
  • Actual Behavior: Same as macOS.

Key Findings

  1. Model Detection Bug: The system incorrectly triggers a fallback because it fails to recognize the image capability of the model (likely Issue #65431).
  2. Optimization Logic Bug: On macOS, the fallback fails even with sharp installed. This suggests that the error Failed to optimize image might be swallowing a different underlying error (as suggested in Issue #73148) or that the optimization parameters (size/quality) are invalid for the M2 architecture.
  3. Inconsistency: The error message Failed to optimize image is too opaque, hiding whether it's a missing dependency (Linux case) or a runtime processing error (macOS case).
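The first finding concerns a capability check that decides whether a model may receive images. The sketch below shows what such a check could look like when it reads the declared input list and reports the model id and inputs on rejection. The names ModelConfig and checkImageSupport are hypothetical and are not taken from the OpenClaw source; treating a missing input list as text-only is also an assumption.

```typescript
// Hypothetical shapes: illustrative only, not the OpenClaw implementation.
type InputKind = "text" | "image";

interface ModelConfig {
  id: string;
  input?: InputKind[]; // declared capabilities, e.g. ["text", "image"]
}

interface ImageCheck {
  ok: boolean;
  reason?: string; // set only when ok is false
}

// Decide whether a model may receive images, and explain the decision when it may not.
function checkImageSupport(model: ModelConfig): ImageCheck {
  // Assumption: a model with no declared inputs is treated as text-only.
  const declared: InputKind[] = model.input ?? ["text"];
  if (declared.indexOf("image") !== -1) {
    return { ok: true };
  }
  return {
    ok: false,
    reason: `Model does not support images (model=${model.id}, input=${JSON.stringify(declared)})`,
  };
}

const vision = checkImageSupport({ id: "example/vision-model", input: ["text", "image"] });
const textOnly = checkImageSupport({ id: "example/text-model", input: ["text"] });
console.log(vision.ok);
console.log(textOnly.reason);
```

With a reason string of this shape, a log line would show which resolved model and which input list the check actually saw, which is the information needed to tell a configuration problem from a detection bug.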

Steps to Reproduce

  1. On an M2 Mac with sharp installed, run OpenClaw 2026.4.26.
  2. Send an image to a vision-capable model.
  3. Observe the tool failure despite the environment being correctly set up.

Error Message

  1. Model does not support images: logged by [media-understanding] when the model capability check rejects the model.
  2. Failed to optimize image: logged by [tools] when the local optimization fallback fails. The message does not say whether the cause is a missing dependency (Linux case) or a runtime processing error (macOS case).

Root Cause

Not yet confirmed. The issue reports two suspected causes:

  1. Model Capability Misdetection: LLMs configured with input: ["text", "image"] are rejected with reason=Model does not support images.
  2. Missing/Inaccessible Dependency: The fallback image optimization fails because sharp is not bundled or correctly linked, even after manual attempts to install it globally.

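Fix Action

Partially fixed. PR #74567 improves the diagnostics only; per its own scope note, the image-model routing and packaged sharp problems remain under investigation.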

PR fix notes

PR #74567: fix: improve error messages for vision pipeline failures on ARM64

Description (problem / solution / changelog)

Summary

  • Improve diagnostics for ARM64/image pipeline failures without changing routing behavior.
  • Preserve redacted resize/Sharp failure details in optimizeImageToJpeg instead of throwing only Failed to optimize image.
  • Include the resolved model id and input capabilities when media understanding rejects a text-only model for image input.
  • Add focused regression coverage for both diagnostics.
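The second bullet describes keeping the underlying failure detail instead of throwing only the generic message. A minimal sketch of that idea follows, assuming hypothetical helpers describeError, redactPaths, and toOptimizeError; the real optimizeImageToJpeg may be structured quite differently, and the redaction rule here (POSIX-style absolute paths only) is an illustrative choice.

```typescript
// Hypothetical helpers: illustrative only, not the code from PR #74567.

// Turn any thrown value into a short description.
function describeError(err: unknown): string {
  if (err instanceof Error) {
    return `${err.name}: ${err.message}`;
  }
  return String(err);
}

// Redact POSIX-style absolute paths that follow whitespace, a quote, or "(".
function redactPaths(text: string): string {
  return text.replace(
    /(^|[\s'"(])(\/[^\s'")]+)/g,
    (_match: string, lead: string) => lead + "<path>",
  );
}

// Build an error that keeps the generic prefix and appends the redacted cause.
function toOptimizeError(err: unknown): Error {
  return new Error(`Failed to optimize image: ${redactPaths(describeError(err))}`);
}

// Usage: a stand-in failure with a file path in its message.
try {
  throw new Error("ENOENT: no such file or directory, open '/home/alice/pic.png'");
} catch (err) {
  console.log(toOptimizeError(err).message);
}
```

Keeping the original prefix means existing log searches for Failed to optimize image still match, while the appended detail separates a missing module from a processing error.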

Scope

This is a diagnostics-only improvement for #74552. It does not claim to fix the remaining image-model routing or packaged sharp investigation, so #74552 should stay open after this lands.

Refs openclaw/openclaw#74552

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/media-understanding/image.test.ts (modified, +29/-0)
  • src/media-understanding/image.ts (modified, +8/-1)
  • src/media/web-media.test.ts (modified, +8/-1)
  • src/media/web-media.ts (modified, +11/-2)


Bug type

Regression (worked before, now fails)

Beta release blocker

No

Steps to reproduce

  1. Install OpenClaw 2026.4.26 on an ARM64 host (macOS M2 or Linux).
  2. Configure a vision-capable model (e.g., Gemini 1.5) with input: ["text", "image"] in openclaw.json.
  3. (On macOS) Ensure sharp is installed and verifiable via node -e "require('sharp')".
  4. Send any JPEG/PNG image to the agent and ask "What is in this image?".
  5. Observe the logs: It first triggers reason=Model does not support images, then fails with Failed to optimize image despite sharp being present on macOS.
[media-understanding] image: failed (0/1) reason=Model does not support images
[tools] image failed: Failed to optimize image raw_params={}

Expected behavior

The system should correctly identify that the model supports image input based on the configuration (input: ["text", "image"]). It should then pass the image data directly to the LLM (or use the local sharp/sips optimization successfully if a fallback is truly required) and provide a relevant analysis of the image content.

As seen in versions prior to 2026.4.x, the vision pipeline should be seamless when the correct provider and model capabilities are declared, without triggering an erroneous "Model does not support images" rejection.

Actual behavior

The image tool fails immediately when provided with an image. The logs show a two-stage failure:

  1. First, the model capability check fails: [media-understanding] image: failed (0/1) reason=Model does not support images.
  2. Then, the local optimization fallback fails: [tools] image failed: Failed to optimize image raw_params={}.

On macOS (M2), this happens even when sharp is manually installed and verifiable via require('sharp'). On Linux (ARM64), it fails similarly due to the module being missing from the expected path. The final result is that the LLM responds as if no image was provided or hallucinates an error message.
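A require('sharp') that succeeds in a terminal only proves that sharp resolves from the directory where the command was run. Node resolves modules relative to the requiring file, so a globally installed package can still fail to find it. The sketch below checks resolution from a chosen directory using Node's createRequire. The helper name resolveFrom is hypothetical, and the directory to test (for example, the OpenClaw folder under the path printed by npm root -g) is an assumption the reader must supply.

```typescript
import { createRequire } from "node:module";
import * as path from "node:path";

// Return the resolved path of `moduleName` as seen from inside `fromDir`,
// or null when it cannot be resolved from there.
function resolveFrom(moduleName: string, fromDir: string): string | null {
  // createRequire needs an absolute file path; the file itself does not have to exist.
  const localRequire = createRequire(path.join(path.resolve(fromDir), "noop.js"));
  try {
    return localRequire.resolve(moduleName);
  } catch {
    return null;
  }
}

// Hypothetical usage: pass the install directory as the first CLI argument.
const installDir = process.argv[2] ?? process.cwd();
const resolved = resolveFrom("sharp", installDir);
console.log(
  resolved !== null
    ? `sharp resolves to ${resolved}`
    : `sharp is not resolvable from ${installDir}`,
);
```

If this reports that sharp is not resolvable from the install directory while the terminal check passes, the two checks are finding different module trees, which would be consistent with the Linux symptom described above.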

OpenClaw version

2026.4.26

Operating system

macOS 15.7.7 (24G718) / Ubuntu 24.04.4 LTS (Kernel 6.17.0-1011-oracle)

Install method

npm global

Model

gemini-3.1-flash-lite, gemini-3-flash, nvidia_nim/google/gemma-4-31b-it

Provider / routing chain

google / Nvidia-nim / liteLLM-proxy / ollama

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Impact and severity

No response

Additional information

No response

extent analysis

TL;DR

The issue can be addressed by ensuring the correct installation and linking of the sharp dependency, and potentially fixing the model capability detection logic.

Guidance

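  • Verify that sharp is correctly installed and linked by running node -e "require('sharp')" from the OpenClaw install directory, not only from an arbitrary terminal location. Node resolves modules relative to the requiring file, so a check that passes elsewhere (as it did on macOS in this report) does not prove that OpenClaw can load it.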
  • Check the openclaw.json configuration file to ensure that the model is correctly configured with input: ["text", "image"].
  • Investigate the model capability detection logic to determine why it is failing to recognize the image capability of the model.
  • Consider updating the error message for Failed to optimize image to provide more detailed information about the underlying error.

Example

No code example is provided as the issue is more related to configuration and dependency management.

Notes

The issue seems to be related to a regression in the vision pipeline of OpenClaw 2026.4.26, and the provided information suggests that it is not a simple fix. Further investigation is required to determine the root cause of the model capability detection logic failure.

Recommendation

Apply a workaround by manually installing and linking the sharp dependency, and investigate the model capability detection logic to determine the root cause of the issue. This is because the issue is likely related to a combination of dependency management and logic errors, and a simple upgrade or fix may not be available.
