openclaw - 💡(How to fix) Fix [Bug]: read tool and gateway media handler strip image data — multimodal models cannot see images [1 comments, 2 participants]

openclaw • 2026-05-13 15:17:19

GitHub stats

openclaw/openclaw#81452 • Fetched 2026-05-14 03:32:07

Author

Chaocro

Participants

Chaocro

clawsweeper[bot]

Timeline (top)

labeled ×2, closed ×1, commented ×1

The read tool and gateway media handler strip image data entirely, preventing vision-capable models from receiving user-uploaded images in their prompt context. This is a follow-up to #14707 (self-generated images) — the same root cause also affects images sent by users via webchat.

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

Summary

Steps to reproduce

1. Start the OpenClaw gateway with a vision-capable model configured (e.g., kimi/kimi-k2.6).
2. Open the Control UI webchat (openclaw dashboard).
3. Send a screenshot/image to the agent via the chat input.
4. Ask the agent to describe what is in the image.
5. Observe that the image data is replaced with [media reference removed - already processed by model] in the agent context.
6. Alternatively, save an image to disk and use the read tool on it.
7. Observe that read outputs [image data removed - already processed by model] instead of passing the image to the model.

Expected behavior

When a vision-capable multimodal model (e.g., Kimi K2.6) is active, uploaded or read images should be passed as native image blocks in the model's prompt context, allowing the model to "see" and analyze the image directly.
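
To make the expected behavior concrete, here is a minimal sketch of the decision I would expect somewhere in the media path. Everything in it is hypothetical: the block types, the `supportsImageInput` flag, and `toContentBlock` are illustrative names, not OpenClaw's actual internals, which are not shown in this report.

```typescript
// Hypothetical content block shapes; OpenClaw's real types may differ.
type TextBlock = { type: "text"; text: string };
type ImageBlock = { type: "image"; mimeType: string; dataBase64: string };
type ContentBlock = TextBlock | ImageBlock;

interface ModelInfo {
  id: string;
  // Assumed capability flag; the real catalog field name is unknown.
  supportsImageInput: boolean;
}

interface ImageAttachment {
  mimeType: string;
  dataBase64: string;
}

// Pass the image through as a native block when the model accepts images,
// and fall back to a text note only when it does not.
function toContentBlock(image: ImageAttachment, model: ModelInfo): ContentBlock {
  if (model.supportsImageInput) {
    return { type: "image", mimeType: image.mimeType, dataBase64: image.dataBase64 };
  }
  return { type: "text", text: `[image omitted: ${model.id} does not accept image input]` };
}

const kimi: ModelInfo = { id: "kimi/kimi-k2.6", supportsImageInput: true };
const textOnly: ModelInfo = { id: "example/text-only", supportsImageInput: false };
// Placeholder bytes (the PNG signature only), not a complete image.
const screenshot: ImageAttachment = { mimeType: "image/png", dataBase64: "iVBORw0KGgo=" };

const forKimi = toContentBlock(screenshot, kimi);
const forTextOnly = toContentBlock(screenshot, textOnly);
console.log(forKimi.type, forTextOnly.type);
```

The point of the sketch is only that the placeholder path should be the fallback for models without image input, not the default for every model.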

Actual behavior

Gateway media handler replaces uploaded images with [media reference removed - already processed by model].
The read tool replaces image file content with [image data removed - already processed by model].
The model receives only a text placeholder instead of the actual image, making it impossible to analyze visual content.
This occurs even when the active model (kimi/kimi-k2.6) natively supports multimodal image input.

OpenClaw version

2026.5.7 (eeef486)

Operating system

Windows 11 (26200)

Install method

npm global

Model

kimi/kimi-k2.6 (vision-capable multimodal model)

Provider / routing chain

openclaw -> kimi

Additional provider/model setup details

Both deepseek and kimi providers are configured in ~/.openclaw/openclaw.json. The active session model is kimi/kimi-k2.6, which supports multimodal image input natively. The agents.defaults.models catalog includes kimi/kimi-k2.6 with alias "Kimi".

Logs, screenshots, and evidence

Gateway message processing shows:
[media reference removed - already processed by model]

read tool output for image files shows:
[image data removed - already processed by model]

In both cases, no image data reaches the model context.
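
For anyone triaging a similar session, the two placeholder strings above can be searched for in a dumped context to tell which component stripped the image. This is a rough sketch; `findStrippedImages` is a hypothetical helper written for this report, not an OpenClaw API, and it assumes the context is available as plain text.

```typescript
// The two placeholder strings quoted in this report.
const GATEWAY_PLACEHOLDER = "[media reference removed - already processed by model]";
const READ_TOOL_PLACEHOLDER = "[image data removed - already processed by model]";

type StrippedBy = "gateway" | "read-tool";

// Report which component's placeholder appears in a dumped context string.
function findStrippedImages(contextText: string): StrippedBy[] {
  const hits: StrippedBy[] = [];
  if (contextText.indexOf(GATEWAY_PLACEHOLDER) !== -1) hits.push("gateway");
  if (contextText.indexOf(READ_TOOL_PLACEHOLDER) !== -1) hits.push("read-tool");
  return hits;
}

// Example: a context dump where only the webchat upload was stripped.
const sample = "user: what is in this picture?\n" + GATEWAY_PLACEHOLDER;
console.log(findStrippedImages(sample));
```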

Impact and severity

Affected: All users attempting to send images to vision-capable models via webchat or the read tool.
Severity: High (breaks core multimodal functionality).
Frequency: 100% reproducible (every image upload/read attempt).
Consequence: Users cannot use image input with multimodal models; agents cannot analyze screenshots, photos, or diagrams. This significantly limits OpenClaw's utility for visual tasks.

Additional information

This appears related to #14707 (self-generated images cannot be injected into agent context) and #62514 (WEBUI image input edge cases). The root cause may be that the gateway media handler and read tool treat images as opaque attachments rather than passing them as multimodal content blocks to vision-capable models. A unified fix for native multimodal context injection might address all three issues.

Note on testing: This should be caught by a basic E2E test — upload an image to a vision-capable model and assert the model receives the image in its context. If such a test exists, it is not working. If it doesn't exist, it should be added.
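
A rough sketch of the kind of check meant here, using a stub model client that records what it is sent. All names (`RecordingModelClient`, `Pipeline`, `checkImageReachesModel`) are hypothetical, and the two toy pipelines only stand in for the real gateway path, which a real E2E test would call instead.

```typescript
type Block =
  | { type: "text"; text: string }
  | { type: "image"; mimeType: string; dataBase64: string };

interface ModelRequest { model: string; content: Block[] }
interface Upload { mimeType: string; dataBase64: string }

// Stub client: records requests instead of calling a provider.
class RecordingModelClient {
  requests: ModelRequest[] = [];
  send(request: ModelRequest): void {
    this.requests.push(request);
  }
}

// Hypothetical stand-in for the upload-to-model path under test.
type Pipeline = (client: RecordingModelClient, model: string, prompt: string, image: Upload) => void;

// Returns a list of failures; an empty list means the image reached the model.
function checkImageReachesModel(pipeline: Pipeline): string[] {
  const client = new RecordingModelClient();
  pipeline(client, "kimi/kimi-k2.6", "Describe this image.", {
    mimeType: "image/png",
    dataBase64: "iVBORw0KGgo=", // placeholder bytes, not a complete image
  });
  const failures: string[] = [];
  const request = client.requests[0];
  if (client.requests.length !== 1 || !request) {
    failures.push("expected exactly one model request");
    return failures;
  }
  let imageCount = 0;
  for (const block of request.content) {
    if (block.type === "image") {
      imageCount++;
    } else if (block.text.indexOf("already processed by model") !== -1) {
      failures.push("placeholder text found in model context");
    }
  }
  if (imageCount === 0) failures.push("no image block reached the model");
  return failures;
}

// Toy pipeline that forwards the image as a native block.
const forwarding: Pipeline = (client, model, prompt, image) => {
  client.send({
    model,
    content: [
      { type: "text", text: prompt },
      { type: "image", mimeType: image.mimeType, dataBase64: image.dataBase64 },
    ],
  });
};

// Toy pipeline that reproduces the behavior described in this report.
const stripping: Pipeline = (client, model, prompt) => {
  client.send({
    model,
    content: [
      { type: "text", text: prompt },
      { type: "text", text: "[media reference removed - already processed by model]" },
    ],
  });
};

console.log(checkImageReachesModel(forwarding));
console.log(checkImageReachesModel(stripping));
```

With the forwarding pipeline the check returns no failures; with the stripping pipeline it reports both the placeholder text and the missing image block.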
