openclaw - Bug: WebChat HEIC images: tools.media.image pipeline [Image] description not injected correctly [1 pull request]

Summary

When HEIC/HEIF images are uploaded via WebChat/Control UI, the tools.media.image pipeline processes the image (using a configured image model such as grok-4.1-fast via dmxapi), but the resulting [Image] description block is either not injected into the agent conversation context or describes the wrong image (the previous upload instead of the current one).

The agent receives only [User sent media without caption] with no image data and no description, causing it to hallucinate or describe a previous image.

Environment

  • OpenClaw: 2026.5.19 (a185ca2)
  • Host: Linux (AlmaLinux 9.5, x86_64)
  • Surface: WebChat / Control UI
  • Session model: xiaomi-coding/mimo-v2.5 (vision-capable)
  • Image model: dmxapi/grok-4.1-fast (configured via agents.defaults.imageModel)
  • tools.media.image: enabled with provider model entry
  • heif-convert (libheif 1.16.1) installed for HEIC→JPEG conversion

Repro

  1. Upload a .heic image via WebChat
  2. Ask the agent to describe it
  3. Agent either:
    • Hallucinates a description (no [Image] block present)
    • Describes the PREVIOUS image instead of the current one

What works

  • HEIC→JPEG conversion at filesystem level via heif-convert + symlink (xxx.heic → xxx.jpg) works correctly
  • The converted JPEG file content is valid and can be read by the image tool
  • JPG/PNG uploads work correctly — [Image] descriptions are properly injected

What fails

  • tools.media.image pipeline picks up the .heic file
  • grok-4.1-fast processes it
  • But the [Image] description block either:
    1. Is not injected into the agent context at all (agent sees only [User sent media without caption])
    2. OR describes a different image (race condition — pipeline reads the file before filesystem-level HEIC→JPEG conversion completes)

Analysis from session transcript

User message content:
  - Only TEXT block: "[User sent media without caption]"  
  - NO image_url block (image data was stripped)
  - NO [Image] description block injected

The [image data removed - already processed by model] tag appears, suggesting the pipeline ran, but the description is never provided to the agent.
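The failure state in the transcript can be checked mechanically, which may help when scanning sessions for affected messages. The following is a minimal sketch: the `ContentBlock` shape and the function name are assumptions made for illustration, not OpenClaw's actual transcript types.

```typescript
// Hypothetical content-block shapes; the real OpenClaw transcript types may differ.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "image_url"; url: string };

const MEDIA_PLACEHOLDER = "[User sent media without caption]";

// Returns true when a user message carries only the media placeholder:
// no image data and no "[Image]" description text.
function isMissingImageContext(blocks: ContentBlock[]): boolean {
  const hasPlaceholder = blocks.some(
    (b) => b.type === "text" && b.text.includes(MEDIA_PLACEHOLDER),
  );
  const hasImageData = blocks.some((b) => b.type === "image_url");
  const hasDescription = blocks.some(
    (b) => b.type === "text" && b.text.includes("[Image]"),
  );
  return hasPlaceholder && !hasImageData && !hasDescription;
}
```

A message flagged by this check matches case 1 under "What fails"; it cannot detect case 2, where a description is present but belongs to a different image.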

Root cause conjecture

Two likely contributing factors:

  1. Race condition: The media pipeline reads the .heic file IMMEDIATELY upon arrival, before any external conversion (e.g., inotify-based heif-convert + symlink replacement) can complete. The pipeline encounters raw HEIC bytes and either fails silently or produces a stale/wrong result.

  2. Missing description injection: Even when the HEIC file is pre-converted to JPEG content (via symlink), the [Image] description block is not reliably injected into the agent conversation context for text-only or vision-capable session models when HEIC is the original upload format.
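If the race in factor 1 is real, the pipeline cannot trust the file extension (or a symlink target) to tell it what bytes it is about to read. One way to notice unconverted input is to sniff the container header. The sketch below checks the ISO base media file format `ftyp` box for common HEIC/HEIF major brands; the function name is hypothetical, the brand list is not exhaustive, and only the major brand is inspected (compatible brands are ignored).

```typescript
// Major brands commonly seen in HEIC/HEIF files. Not exhaustive.
const HEIF_BRANDS = new Set([
  "heic", "heix", "hevc", "hevx",
  "heim", "heis", "hevm", "hevs",
  "mif1", "msf1",
]);

// Reads bytes[start, end) as an ASCII string.
function asciiSlice(bytes: Uint8Array, start: number, end: number): string {
  let out = "";
  for (let i = start; i < end; i++) {
    out += String.fromCharCode(bytes[i]);
  }
  return out;
}

// True when the buffer starts with an "ftyp" box whose major brand is a HEIF brand.
// Layout: bytes 0-3 box size, bytes 4-7 "ftyp", bytes 8-11 major brand.
function looksLikeHeif(bytes: Uint8Array): boolean {
  if (bytes.length < 12) return false;
  if (asciiSlice(bytes, 4, 8) !== "ftyp") return false;
  return HEIF_BRANDS.has(asciiSlice(bytes, 8, 12));
}
```

A pipeline could call a check like this on the bytes it actually read and either convert them itself or wait, instead of sending raw HEIC to the image model.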

Suggested fix

  1. Add HEIC/HEIF normalization to the tools.media.image pipeline path (before the image model API call), similar to the normalization already done in src/media/input-files.ts (ref #50081)
  2. Ensure the [Image] description block is injected even when the original upload format was HEIC/HEIF
  3. Consider adding HEIC MIME type support to the media understanding pipeline so it can handle HEIC files natively without relying on external conversion
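Suggestions 1 and 2 can be sketched together as a single step that normalizes HEIC/HEIF before the image model call and always returns a description block. This is an illustration only: `describeUpload`, `Upload`, `Converter`, and `Describer` are hypothetical names, the converter and the model call are injected rather than taken from OpenClaw, and the exact text format of the [Image] block is assumed.

```typescript
// Hypothetical types; none of these come from the OpenClaw codebase.
interface Upload {
  bytes: Uint8Array;
  mime: string;
}
// Converts HEIC/HEIF bytes to JPEG bytes (for example by wrapping heif-convert).
type Converter = (heif: Uint8Array) => Promise<Uint8Array>;
// Calls the configured image model and returns its description.
type Describer = (image: Uint8Array, mime: string) => Promise<string>;

const HEIF_MIMES = new Set(["image/heic", "image/heif"]);

// Normalizes HEIC/HEIF to JPEG in-process, then describes the image.
// Conversion happens inside the pipeline, so it does not depend on an
// external watcher having finished before the file is read.
async function describeUpload(
  upload: Upload,
  toJpeg: Converter,
  describe: Describer,
): Promise<string> {
  let { bytes, mime } = upload;
  if (HEIF_MIMES.has(mime.toLowerCase())) {
    bytes = await toJpeg(bytes);
    mime = "image/jpeg";
  }
  const description = await describe(bytes, mime);
  // Assumed block format; the real pipeline may format this differently.
  return `[Image] ${description}`;
}
```

Because the description is produced from the bytes of the current upload within one call, this shape also avoids pairing a message with a description left over from an earlier image.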

Related

  • #50081 (closed) — fixed HEIC normalization for direct model prompt images, but not the media pipeline description path
  • #17670 (closed) — iMessage HEIC attachments
