openclaw - ✅(Solved) Fix [Bug] Image tool returns empty text for all OpenRouter vision models [1 pull requests, 1 participants]

GitHub stats

  • Issue: openclaw/openclaw#70410 (fetched 2026-04-23 07:25:07)
  • Comments: 0
  • Participants: 1
  • Timeline: 0
  • Reactions: 0

The image tool returns "Image model returned no text" for all OpenRouter vision models, both free and paid. Local models (LM Studio) work fine.

Error Message

Image model returned no text

Root Cause

  1. Models resolve correctly — no "Unknown model" error
  2. API auth works — OpenRouter accepts the request (no 401/403)
  3. Response: finish_reason is "stop", but content is an empty string
  4. Format: pi-ai correctly converts {type: "image", data: base64} to OpenAI {type: "image_url", image_url: {url: "data:..."}}

Fix / Workaround

I tried patching this to include {type: "text", text: prompt} in the user message content array, but it still returned empty text after a full gateway restart. The issue may be deeper — possibly in how extractAssistantText processes the response, or in pi-ai's response handling for OpenRouter.

PR fix notes

PR #50492: feat(openrouter): add media-understanding provider for image analysis

Description (problem / solution / changelog)

AI-assisted: Yes

Built with Vox (OpenClaw agent) + Claude.

Problem

The image tool doesn't work with OpenRouter vision models (e.g. openrouter/xiaomi/mimo-v2-omni) because there's no OpenRouter media-understanding provider registered. The provider whitelist in src/media-understanding/providers/index.ts is hardcoded to anthropic, openai, minimax, moonshot, zai, groq, and deepgram.

Solution

Add a media-understanding provider to the OpenRouter extension that:

  • Bypasses the model registry (OpenRouter models aren't registered there)
  • Calls OpenRouter's /v1/chat/completions API directly with inline base64 images
  • Handles both single-image (describeImage) and multi-image (describeImages) requests
  • Registers in the extension's register() method alongside the existing provider registration

This means any OpenRouter vision-capable model works with the image tool, including ones like openrouter/xiaomi/mimo-v2-omni that aren't in the model registry.
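The PR's provider code is not reproduced on this page, so the following is only a minimal sketch of what a direct call with inline base64 images could look like. The function name buildVisionRequest, the placeholder API key, and the choice to strip the openrouter/ prefix from the model ID are assumptions for illustration; the URL is OpenRouter's publicly documented chat completions endpoint. The sketch builds the arguments for fetch() without sending anything.

```javascript
// Sketch only: not the code from PR #50492.
const OPENROUTER_CHAT_URL = "https://openrouter.ai/api/v1/chat/completions";

// Builds fetch() arguments for one vision request. `images` uses the
// { buffer, mime } shape seen in buildImageContext.
function buildVisionRequest({ apiKey, model, prompt, images }) {
    const content = [
        { type: "text", text: prompt }, // text part first, then the images
        ...images.map((image) => ({
            type: "image_url",
            image_url: {
                url: `data:${image.mime ?? "image/jpeg"};base64,${image.buffer.toString("base64")}`,
            },
        })),
    ];
    return {
        url: OPENROUTER_CHAT_URL,
        init: {
            method: "POST",
            headers: {
                Authorization: `Bearer ${apiKey}`,
                "Content-Type": "application/json",
            },
            body: JSON.stringify({ model, messages: [{ role: "user", content }] }),
        },
    };
}

// Single-image and multi-image requests can share the same builder.
const request = buildVisionRequest({
    apiKey: "sk-or-placeholder", // placeholder, not a real key
    model: "xiaomi/mimo-v2-omni", // assumed: OpenClaw's "openrouter/" prefix removed
    prompt: "Describe this image.",
    images: [{ buffer: Buffer.from("fake-bytes"), mime: "image/png" }],
});
console.log(JSON.parse(request.init.body).messages[0].content.length); // 2
```

Passing request.url and request.init to fetch() would send the request. Keeping the builder separate from the network call makes the payload easy to inspect when a model returns empty text.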

Files changed

  • extensions/openrouter/index.ts (modified, +2/-0): import + register the media-understanding provider
  • extensions/openrouter/media-understanding-provider.ts (added, +129/-0): new file, OpenAI-compatible vision API wrapper

Description

The image tool returns "Image model returned no text" for all OpenRouter vision models, both free and paid. Local models (LM Studio) work fine.

Environment

  • OpenClaw: 2026.4.9
  • Node: v22.22.0
  • OS: macOS arm64 (Darwin 25.3.0)

Models Tested (all fail)

  • openrouter/openai/gpt-4.1-mini (paid, non-reasoning)
  • openrouter/google/gemini-2.5-flash (paid)
  • openrouter/google/gemma-4-31b-it:free (free)
  • openrouter/qwen/qwen3.5-flash-02-23 (paid)

Working

  • lm-studio/qwen/qwen3.5-9b (local) — works in 14-21 seconds

Analysis

  1. Models resolve correctly — no "Unknown model" error
  2. API auth works — OpenRouter accepts the request (no 401/403)
  3. Response: finish_reason is "stop", but content is an empty string
  4. Format: pi-ai correctly converts {type: "image", data: base64} to OpenAI {type: "image_url", image_url: {url: "data:..."}}
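The conversion described in item 4 can be sketched as follows. This is an illustration of the mapping, not pi-ai's actual code; the helper name toOpenAIImagePart is hypothetical, and the image/jpeg default mirrors the fallback used in buildImageContext.

```javascript
// Hypothetical helper: maps the internal image part shape quoted in the issue
// to an OpenAI-style image_url part carrying a base64 data URL.
function toOpenAIImagePart(part) {
    if (part.type !== "image") {
        throw new TypeError(`expected an image part, got "${part.type}"`);
    }
    const mimeType = part.mimeType ?? "image/jpeg";
    return {
        type: "image_url",
        image_url: { url: `data:${mimeType};base64,${part.data}` },
    };
}

// Usage: a tiny fake payload stands in for real image bytes.
const converted = toOpenAIImagePart({
    type: "image",
    data: Buffer.from("fake-bytes").toString("base64"),
    mimeType: "image/png",
});
console.log(converted.image_url.url.startsWith("data:image/png;base64,")); // true
```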

Source Code Findings

In image-BekbXrJh.js, buildImageContext puts the prompt in systemPrompt but the user message content array contains only images, no text:

function buildImageContext(prompt, images) {
    return {
        systemPrompt: prompt,
        messages: [{
            role: "user",
            content: images.map((image) => ({
                type: "image",
                data: image.buffer.toString("base64"),
                mimeType: image.mime ?? "image/jpeg"
            })),
            // ^ No text content in user message!
        }]
    };
}

OpenRouter docs recommend: "we recommend sending the text prompt first, then the images" — the current implementation only sends images in the user message.

I tried patching this to include {type: "text", text: prompt} in the user message content array, but it still returned empty text after a full gateway restart. The issue may be deeper — possibly in how extractAssistantText processes the response, or in pi-ai's response handling for OpenRouter.

Suggested Fix

  1. Add {type: "text", text: prompt} to user message content alongside images (per OpenRouter docs)
  2. Add debug logging in coerceImageAssistantText to log the raw stopReason, errorMessage, and message content when text is empty
  3. Investigate if pi-ai openai-completions.js handles OpenRouter's response format correctly for vision requests
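A rough sketch of the kind of diagnostics suggested in item 2 follows. It is a hypothetical stand-in, not the real extractAssistantText or coerceImageAssistantText, and it assumes an OpenAI-style chat completions response. Handling content as an array of parts is a defensive assumption, not something the issue confirms OpenRouter does.

```javascript
// Hypothetical extractor: returns the assistant text plus a diagnostics
// object whenever that text turns out to be empty.
function extractTextWithDiagnostics(response) {
    const choice = response?.choices?.[0];
    const content = choice?.message?.content;
    let text = "";
    if (typeof content === "string") {
        text = content;
    } else if (Array.isArray(content)) {
        // Assumed fallback: join text parts if content arrives as an array.
        text = content
            .filter((part) => part?.type === "text" && typeof part.text === "string")
            .map((part) => part.text)
            .join("");
    }
    text = text.trim();
    const diagnostics = text
        ? null
        : {
              finishReason: choice?.finish_reason ?? null,
              contentType: Array.isArray(content) ? "array" : typeof content,
              messageKeys: Object.keys(choice?.message ?? {}),
              error: response?.error ?? null,
          };
    return { text, diagnostics };
}

// Usage: the failing case from the issue (finish_reason "stop", empty content).
const empty = extractTextWithDiagnostics({
    choices: [{ finish_reason: "stop", message: { role: "assistant", content: "" } }],
});
console.log(empty.diagnostics);
```

Logging messageKeys shows which fields the assistant message actually carried, which helps tell an empty reply apart from a reply whose shape the extractor did not expect.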

Error Log

[tools] image failed: All image models failed (2): openrouter/openai/gpt-4.1-mini: Image model returned no text (openrouter/openai/gpt-4.1-mini). | openrouter/google/gemini-2.5-flash: Image model returned no text (openrouter/google/gemini-2.5-flash).

extent analysis

TL;DR

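Modify the buildImageContext function to include the text prompt in the user message content array, as recommended by OpenRouter docs. The reporter already tried this patch and still got empty text, so treat it as one step rather than a confirmed fix; the response handling (extractAssistantText, pi-ai) also needs checking.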

Guidance

  • Verify that the buildImageContext function is correctly modified to include the text prompt in the user message content array, alongside the images.
  • Add debug logging in coerceImageAssistantText to log the raw stopReason, errorMessage, and message content when text is empty, to help identify potential issues in the response handling.
  • Investigate if pi-ai's openai-completions.js handles OpenRouter's response format correctly for vision requests, as this may be a contributing factor to the issue.
  • Test the modified implementation with a local model, such as lm-studio/qwen/qwen3.5-9b, to ensure that the fix does not introduce any regressions.

Example

function buildImageContext(prompt, images) {
    return {
        systemPrompt: prompt,
        messages: [{
            role: "user",
            content: [
                { type: "text", text: prompt },
                ...images.map((image) => ({
                    type: "image",
                    data: image.buffer.toString("base64"),
                    mimeType: image.mime ?? "image/jpeg"
                })),
            ],
        }]
    };
}
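To confirm that a patched build really produces the shape shown in the Example, a small check along these lines may help. The helper name hasTextBeforeImages and both sample contexts are hypothetical; only the context shape comes from the issue.

```javascript
// Hypothetical check, not part of OpenClaw: true when the first user message
// starts with a non-empty text part and carries at least one image part.
function hasTextBeforeImages(context) {
    const user = context.messages.find((message) => message.role === "user");
    if (!user || !Array.isArray(user.content) || user.content.length === 0) {
        return false;
    }
    const [first, ...rest] = user.content;
    const startsWithText =
        first.type === "text" && typeof first.text === "string" && first.text.length > 0;
    return startsWithText && rest.some((part) => part.type === "image");
}

const imagePart = { type: "image", data: "QUJD", mimeType: "image/png" };
const patched = {
    systemPrompt: "Describe the image.",
    messages: [{ role: "user", content: [{ type: "text", text: "Describe the image." }, imagePart] }],
};
const original = {
    systemPrompt: "Describe the image.",
    messages: [{ role: "user", content: [imagePart] }],
};
console.log(hasTextBeforeImages(patched), hasTextBeforeImages(original)); // true false
```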

Notes

The issue may be deeper than just the buildImageContext function, and may require further investigation into the response handling and pi-ai's implementation.

Recommendation

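Apply the suggested change to buildImageContext, since it brings the request in line with OpenRouter's guidance, but do not expect it to resolve the problem on its own: the reporter's version of this patch still returned empty text. Add the debug logging next, inspect how the response is handled, and test thoroughly to ensure that the change does not introduce any regressions.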
