ollama - 💡(How to fix) Fix Huge difference in image input tokens with local Qwen3.5 versions when format="json" specified [3 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
ollama/ollama#15592Fetched 2026-04-16 06:35:56
View on GitHub
Comments
3
Participants
2
Timeline
5
Reactions
0
Author
Participants
Timeline (top)
commented ×3labeled ×1renamed ×1

Code Example

IMG=$(base64 < Image_001.jpeg | tr -d '\n')

echo "{\"model\": \"qwen3.5:27b\", \"messages\": [{\"role\": \"user\",  \"content\": \"\",\"images\": [\"$IMG\"] }],\"stream\": false}" | curl -X POST http://0.0.0.0:11434/api/chat -H "Content-Type:
 application/json" -d @-
{"model":"qwen3.5:27b","created_at":"2026-04-14T12:57:14.954993416Z","message":{"..."},"done":true,"done_reason":"stop","total_duration":160473406601,"load_duration":6219169752,"prompt_eval_count":432,"prompt_eval_duration":724455268,"eval_count":2003,"eval_duration":152220636272}


echo "{\"model\": \"qwen3.5:27b-q4_K_M\", \"messages\": [{\"role\": \"user\",  \"content\": \"\",\"images\": [\"$IMG\"] }],\"stream\": false, \"format\": \"json\"}" | curl -X POST http://0.0.0.0:11434/api/chat -H "Content-Type: application/json" -d @-
{"model":"qwen3.5:27b-q4_K_M","created_at":"2026-04-14T21:27:56.795489648Z","message":{"..."},"done":true,"done_reason":"stop","total_duration":156916654288,"load_duration":8222672080,"prompt_eval_count":1789,"prompt_eval_duration":2405578208,"eval_count":300,"eval_duration":26354827648}

echo "{\"model\": \"qwen3.5:397b-cloud\", \"messages\": [{\"role\": \"user\",  \"content\": \"\",\"images\": [\"$IMG\"] }],\"stream\": false}" | curl -X POST http://0.0.0.0:11434/api/chat -H "Content-Type: application/json" -H "Authorization: Bearer $OLLAMA_API_KEY" -d @-
{"model":"qwen3.5:397b","created_at":"2026-04-14T12:53:46.888392524Z","message":{"..."},"done":true,"done_reason":"stop","total_duration":15905907345,"prompt_eval_count":432,"eval_count":775}

echo "{\"model\": \"qwen3.5:397b-cloud\", \"messages\": [{\"role\": \"user\",  \"content\": \"\",\"images\": [\"$IMG\"] }],\"stream\": false, \"format\": \"json\"}" | curl -X POST http://0.0.0.0:11434/api/chat -H "Content-Type: application/json" -H "Authorization: Bearer "$OLLAMA_API_KEY -d @-
{"model":"qwen3.5:397b","created_at":"2026-04-14T21:51:49.78707483Z","message":{"..."},"done":true,"done_reason":"stop","total_duration":15994634694,"prompt_eval_count":432,"eval_count":889}

echo "{\"model\": \"gemma4:31b-it-q4_K_M\", \"messages\": [{\"role\": \"user\",  \"content\": \"\",\"images\": [\"$IMG\"] }],\"stream\": false}" | curl -X POST http://0.0.0.0:11434/api/chat -H "Content-Type: application/json" -H "Authorization: Bearer "$OLLAMA_API_KEY -d @-
{"model":"gemma4:31b-it-q4_K_M","created_at":"2026-04-14T21:56:58.18721552Z","message":{"..."},"done":true,"done_reason":"stop","total_duration":90482110976,"load_duration":285760912,"prompt_eval_count":281,"prompt_eval_duration":99862896,"eval_count":915,"eval_duration":8977992150

echo "{\"model\": \"gemma4:31b-it-q4_K_M\", \"messages\": [{\"role\": \"user\",  \"content\": \"\",\"images\": [\"$IMG\"] }],\"stream\": false, \"format\": \"json\"}" | curl -X POST http://0.0.0.0:11434/api/chat -H "Content-Type: application/json" -H "Authorization: Bearer "$OLLAMA_API_KEY -d @-
{"model":"gemma4:31b-it-q4_K_M","created_at":"2026-04-14T21:59:13.743806688Z","message":{"..."},"done":true,"done_reason":"stop","total_duration":56701847024,"load_duration":220037664,"prompt_eval_count":286,"prompt_eval_duration":116230352,"eval_count":28,"eval_duration":2635753616}

---
RAW_BUFFERClick to expand / collapse

What is the issue?

With local versions of Qwen3.5 (9b, 27b, 35b, 122b), using a blank prompt and a single image (attached) as input, I notice a huge difference in input tokens depending whether format is unspecified or assigned to "json" (prompt_eval_count increases from 432 to 1789).

This does not occur with the cloud version (397b-cloud) (prompt_eval_count remains unchanged at 432)

This does not occur either with other models such as gemma4 (prompt_eval_count slightly increases from 281 to 286)

This phenomenon can be easily reproduced using the attached image an the curl commands below.

What is the reason for that?

IMG=$(base64 < Image_001.jpeg | tr -d '\n')

echo "{\"model\": \"qwen3.5:27b\", \"messages\": [{\"role\": \"user\",  \"content\": \"\",\"images\": [\"$IMG\"] }],\"stream\": false}" | curl -X POST http://0.0.0.0:11434/api/chat -H "Content-Type:
 application/json" -d @-
{"model":"qwen3.5:27b","created_at":"2026-04-14T12:57:14.954993416Z","message":{"..."},"done":true,"done_reason":"stop","total_duration":160473406601,"load_duration":6219169752,"prompt_eval_count":432,"prompt_eval_duration":724455268,"eval_count":2003,"eval_duration":152220636272}


echo "{\"model\": \"qwen3.5:27b-q4_K_M\", \"messages\": [{\"role\": \"user\",  \"content\": \"\",\"images\": [\"$IMG\"] }],\"stream\": false, \"format\": \"json\"}" | curl -X POST http://0.0.0.0:11434/api/chat -H "Content-Type: application/json" -d @-
{"model":"qwen3.5:27b-q4_K_M","created_at":"2026-04-14T21:27:56.795489648Z","message":{"..."},"done":true,"done_reason":"stop","total_duration":156916654288,"load_duration":8222672080,"prompt_eval_count":1789,"prompt_eval_duration":2405578208,"eval_count":300,"eval_duration":26354827648}

echo "{\"model\": \"qwen3.5:397b-cloud\", \"messages\": [{\"role\": \"user\",  \"content\": \"\",\"images\": [\"$IMG\"] }],\"stream\": false}" | curl -X POST http://0.0.0.0:11434/api/chat -H "Content-Type: application/json" -H "Authorization: Bearer $OLLAMA_API_KEY" -d @-
{"model":"qwen3.5:397b","created_at":"2026-04-14T12:53:46.888392524Z","message":{"..."},"done":true,"done_reason":"stop","total_duration":15905907345,"prompt_eval_count":432,"eval_count":775}

echo "{\"model\": \"qwen3.5:397b-cloud\", \"messages\": [{\"role\": \"user\",  \"content\": \"\",\"images\": [\"$IMG\"] }],\"stream\": false, \"format\": \"json\"}" | curl -X POST http://0.0.0.0:11434/api/chat -H "Content-Type: application/json" -H "Authorization: Bearer "$OLLAMA_API_KEY -d @-
{"model":"qwen3.5:397b","created_at":"2026-04-14T21:51:49.78707483Z","message":{"..."},"done":true,"done_reason":"stop","total_duration":15994634694,"prompt_eval_count":432,"eval_count":889}

echo "{\"model\": \"gemma4:31b-it-q4_K_M\", \"messages\": [{\"role\": \"user\",  \"content\": \"\",\"images\": [\"$IMG\"] }],\"stream\": false}" | curl -X POST http://0.0.0.0:11434/api/chat -H "Content-Type: application/json" -H "Authorization: Bearer "$OLLAMA_API_KEY -d @-
{"model":"gemma4:31b-it-q4_K_M","created_at":"2026-04-14T21:56:58.18721552Z","message":{"..."},"done":true,"done_reason":"stop","total_duration":90482110976,"load_duration":285760912,"prompt_eval_count":281,"prompt_eval_duration":99862896,"eval_count":915,"eval_duration":8977992150

echo "{\"model\": \"gemma4:31b-it-q4_K_M\", \"messages\": [{\"role\": \"user\",  \"content\": \"\",\"images\": [\"$IMG\"] }],\"stream\": false, \"format\": \"json\"}" | curl -X POST http://0.0.0.0:11434/api/chat -H "Content-Type: application/json" -H "Authorization: Bearer "$OLLAMA_API_KEY -d @-
{"model":"gemma4:31b-it-q4_K_M","created_at":"2026-04-14T21:59:13.743806688Z","message":{"..."},"done":true,"done_reason":"stop","total_duration":56701847024,"load_duration":220037664,"prompt_eval_count":286,"prompt_eval_duration":116230352,"eval_count":28,"eval_duration":2635753616}

Image

(note; the attached image is a fake bank transfer order with synthetic imaginary data)

Relevant log output

OS

Linux

GPU

NVIDIA GB10 (DGX SPARK)

CPU

No response

Ollama version

0.20.4

extent analysis

TL;DR

The issue can be mitigated by removing the "format": "json" parameter from the API request.

Guidance

  • The large difference in prompt_eval_count is observed only when the "format" is specified as "json" in the API request for local versions of Qwen3.5.
  • The issue does not occur with the cloud version of Qwen3.5 or with other models like gemma4, suggesting a potential model-specific or version-specific bug.
  • To verify, try removing the "format": "json" parameter from the API request and check if the prompt_eval_count remains consistent.
  • If the issue persists, further investigation into the model's implementation or the API's handling of the "format" parameter may be necessary.

Example

No code snippet is provided as the issue seems to be related to the API request parameters rather than code implementation.

Notes

The root cause of the issue is unclear, but it appears to be related to the interaction between the Qwen3.5 model and the "format": "json" parameter. Further debugging or investigation into the model's implementation or the API's handling of this parameter may be necessary to fully resolve the issue.

Recommendation

Apply workaround: Remove the "format": "json" parameter from the API request, as it seems to cause the discrepancy in prompt_eval_count for local versions of Qwen3.5. This change can help mitigate the issue until a more permanent fix is found.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

ollama - 💡(How to fix) Fix Huge difference in image input tokens with local Qwen3.5 versions when format="json" specified [3 comments, 2 participants]