ollama - 💡(How to fix) Fix Anthropic compatibility: image content blocks are dropped when forwarded to vision-capable cloud models (v0.21.0) [2 comments, 2 participants]

ollama · 2026-04-21 06:47:41


ollama/ollama#15727•Fetched 2026-04-22 07:43:46


Author

peter20201011-cmyk

Participants

dani931004

peter20201011-cmyk



Bug description

When using Ollama's Anthropic-compatible endpoint (POST /v1/messages) with a vision-capable cloud model such as kimi-k2.6:cloud, image content blocks in the request are silently dropped before being forwarded to the model. The model only receives the surrounding text, so it responds as if no image was attached.

This breaks Claude Code (and any other Anthropic-API client) when combined with tools that send screenshots — e.g. the computer-use MCP server — since the model has no idea an image was ever sent.

Environment

Ollama: v0.21.0 (confirmed via /api/version)
Model: kimi-k2.6:cloud — ollama show reports Capabilities: vision, thinking, completion, tools
OS: Windows 11
Client: curl (identical behavior seen via Claude Code CLI)

Reproduction

curl -s http://localhost:11434/v1/messages \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "kimi-k2.6:cloud",
    "max_tokens": 200,
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe this tiny test image in one sentence."},
        {"type": "image", "source": {"type": "base64", "media_type": "image/png",
         "data": "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNkYAAAAAYAAjCB0C8AAAAASUVORK5CYII="}}
      ]
    }]
  }'
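Before suspecting the translator, it can help to rule out a malformed payload. The following is a minimal Python sketch (the helper name `png_dimensions` is made up for illustration) that decodes the base64 string from the request above and reads the width and height from the PNG header. It only checks the PNG signature and the IHDR chunk, not the rest of the file.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# The base64 payload used in the reproduction request.
TEST_IMAGE = (
    "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNkYAAAAAYAAjCB"
    "0C8AAAAASUVORK5CYII="
)


def png_dimensions(b64_data: str) -> tuple[int, int]:
    """Decode a base64 PNG and return (width, height) from its IHDR chunk."""
    raw = base64.b64decode(b64_data, validate=True)
    if raw[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG: bad signature")
    if raw[12:16] != b"IHDR":
        raise ValueError("not a PNG: first chunk is not IHDR")
    # IHDR data begins with two big-endian 32-bit integers: width, then height.
    width, height = struct.unpack(">II", raw[16:24])
    return width, height


if __name__ == "__main__":
    print(png_dimensions(TEST_IMAGE))  # prints (1, 1)
```

The payload decodes to a valid 1x1 PNG, so the request itself is well formed.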

Expected

Model receives the image bytes and describes it (or at minimum reports back that it was received).

Actual

The model's thinking output says:

"looking at the input, I don't actually see an image - I see the text [Image: but no actual image content loaded"

usage.input_tokens is 23 — far below what a real image would add — confirming the base64 payload never reaches the model. The image block appears to be replaced by a text placeholder during the Anthropic → Ollama format conversion.
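The two symptoms described above (the `[Image:` placeholder surfacing in the model's output, and an input token count that barely moves) can be checked mechanically. This is a rough sketch, not part of Ollama: `dropped_image_signals` is a hypothetical helper, the response shape assumed is the Anthropic Messages format (`content` blocks of type `thinking` or `text`, plus `usage.input_tokens`), and the text-only baseline is something you would measure yourself by sending the same prompt without the image block.

```python
def dropped_image_signals(response: dict, text_only_input_tokens: int) -> dict:
    """Summarize two hints that an image never reached the model.

    `text_only_input_tokens` is a baseline measured separately by sending
    the same prompt with the image block removed.
    """
    produced = []
    for block in response.get("content", []):
        if block.get("type") == "thinking":
            produced.append(block.get("thinking", ""))
        elif block.get("type") == "text":
            produced.append(block.get("text", ""))

    input_tokens = response.get("usage", {}).get("input_tokens", 0)
    return {
        # True when the model's output mentions the placeholder text.
        "placeholder_in_output": any("[Image:" in text for text in produced),
        # How many input tokens the image request added over the baseline.
        "token_delta": input_tokens - text_only_input_tokens,
    }
```

Neither signal is conclusive alone: the model may not mention the placeholder, and the token cost of a tiny image is model-specific. A placeholder hit together with a near-zero delta is a strong hint that the image was dropped.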

Notes

The same Kimi model works fine with images when used through Moonshot's own Anthropic endpoint, so the model itself is not the issue.
Ollama's own OpenAI-compatibility layer (/v1/chat/completions) handles images correctly for other vision models, which suggests the Anthropic layer's request translator is just missing the image → images[] mapping.
DeepWiki's Anthropic compatibility layer doc lists image blocks as a recognized type in MessagesRequest but does not indicate whether they are actively processed — this report confirms they are not.
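To compare the two compatibility layers side by side, the same image can be sent to /v1/chat/completions in OpenAI format, where a base64 image travels as a data URI inside an `image_url` content part. The sketch below only builds the request body. The function names are illustrative, and whether this particular cloud model accepts images through the OpenAI layer is not confirmed in this report.

```python
def anthropic_image_to_openai_part(block: dict) -> dict:
    """Convert an Anthropic base64 image block into an OpenAI image_url part."""
    source = block["source"]
    if source.get("type") != "base64":
        raise ValueError("only base64 image sources are handled in this sketch")
    data_uri = f"data:{source['media_type']};base64,{source['data']}"
    return {"type": "image_url", "image_url": {"url": data_uri}}


def build_openai_request(model: str, prompt: str, image_block: dict,
                         max_tokens: int) -> dict:
    """Build a /v1/chat/completions body equivalent to the Anthropic request."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                anthropic_image_to_openai_part(image_block),
            ],
        }],
    }
```

If the OpenAI-format request yields a real description while the Anthropic-format one does not, that isolates the problem to the Anthropic translator rather than the model or the cloud backend.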

Suggested fix

In the Anthropic compatibility translator, when a user message contains an image content block whose source has type: "base64", the source's base64 data should be appended to the images array of the corresponding Ollama /api/chat message (or stored in the multimodal buffer for vision models), instead of being stringified into a [Image: placeholder.
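The mapping described above could look roughly like this. It is an illustration in Python, not Ollama's actual translator code (Ollama itself is written in Go), and `anthropic_message_to_ollama` is a hypothetical name. It assumes the target is an /api/chat message whose `images` field is a list of raw base64 strings. Joining multiple text blocks with newlines is also an assumption.

```python
def anthropic_message_to_ollama(message: dict) -> dict:
    """Translate one Anthropic-style message into an Ollama /api/chat message."""
    content = message["content"]
    if isinstance(content, str):
        # Anthropic allows plain-string content; pass it through unchanged.
        return {"role": message["role"], "content": content}

    texts: list[str] = []
    images: list[str] = []
    for block in content:
        block_type = block.get("type")
        if block_type == "text":
            texts.append(block["text"])
        elif block_type == "image":
            source = block["source"]
            if source.get("type") != "base64":
                raise ValueError("only base64 image sources are handled here")
            # Forward the raw base64 payload instead of a text placeholder.
            images.append(source["data"])
        else:
            # Tool and other block types are out of scope for this sketch.
            raise ValueError(f"unhandled content block type: {block_type!r}")

    translated = {"role": message["role"], "content": "\n".join(texts)}
    if images:
        translated["images"] = images
    return translated
```

Raising on unknown block types, rather than skipping them, is a deliberate choice in this sketch: silently discarding a block is exactly the failure mode this issue reports.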

extent analysis

TL;DR

Modify the Anthropic compatibility translator in Ollama to properly handle image content blocks by appending base64 data to the images array.

Guidance

Review the Anthropic compatibility translator code to identify where the image content block is being stringified into a [Image: placeholder.
Update the translator to append the base64 data to the images array of the corresponding Ollama /api/chat message.
Verify that the model receives the image bytes by checking usage.input_tokens, which should rise above the 23 tokens observed when the image was dropped. How much it rises depends on how the model tokenizes images.
Test the updated translator with the provided curl command to ensure that the model correctly describes the image.

Example

// Illustrative target of the translation: the Ollama /api/chat request the
// translator should produce. The base64 string moves into the "images" array
// of the message itself, not a top-level field.
{
  "model": "kimi-k2.6:cloud",
  "messages": [{
    "role": "user",
    "content": "Describe this tiny test image in one sentence.",
    "images": [
      "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNkYAAAAAYAAjCB0C8AAAAASUVORK5CYII="
    ]
  }],
  // Assumption: max_tokens is carried over as the num_predict option.
  "options": {"num_predict": 200}
}

Notes

The provided curl command and expected output can be used to test the updated translator. The usage.input_tokens value can be used to verify that the image bytes are being received by the model.

Recommendation

Apply the suggested fix to the Anthropic compatibility translator to properly handle image content blocks and ensure that the model receives the image bytes. This should resolve the issue and allow the model to correctly describe the image.
