litellm - ✅(Solved) Fix [Bug]: ollama_chat provider drops image_url content blocks — images never reach Ollama [2 pull requests, 1 participants]

GitHub stats
BerriAI/litellm#24598 (fetched 2026-04-08 01:32:29)
Comments: 0
Participants: 1
Timeline: 6
Reactions: 0
Timeline (top): labeled ×3, cross-referenced ×2, referenced ×1

When sending a multimodal message (text + image) via ollama_chat, the image is silently dropped before the HTTP request reaches Ollama. The model receives "images": [] despite the image being present in the input.

Error Message

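When api_base already ends with /api/chat, get_model_info requests http://localhost:11434/api/chat/api/show, which returns a 404. The fallback ModelInfoBase returned on error does not set supports_vision=True, so LiteLLM may treat the model as text-only.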

Root Cause

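There are three bugs in the ollama_chat pipeline that interact: _flatten_ollama_content silently drops image_url blocks, extract_images_from_message is called too late, and get_model_info constructs a broken api/show URL.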


Fix Action

Workaround

Monkey-patching extract_images_from_message to also check image_url blocks when content is still a list, and ensuring _flatten_ollama_content does not run before image extraction, restores correct behaviour.


Suggested labels: bug ollama multimodal vision

PR fix notes

PR #24615: fix(ollama): preserve image_url blocks in ollama_chat multimodal requests

Description (problem / solution / changelog)

Summary

Fixes #24598 — ollama_chat provider silently drops image_url content blocks in multimodal requests, so Ollama always receives "images": [] even when the caller supplies images.

Four bugs fixed across two files:


Bug 1 — Operation ordering: extract images before flattening content (ollama/chat/transformation.py)

convert_content_list_to_str (the "flatten" step) only retains type: text blocks — type: image_url blocks are intentionally omitted because images travel in Ollamaʼs separate images field. Previously extract_images_from_message was called after the flatten call. While neither function mutated m["content"] today, the ordering made the invariant fragile. Moving image extraction before the flatten call makes the contract explicit and prevents a silent regression if any future code mutates m["content"] in the text-processing path.


Bug 2 — if images is not None guard always True (ollama/chat/transformation.py)

extract_images_from_message returns [] (never None) when there are no images. The old guard

if images is not None:
    ollama_message["images"] = images

therefore set "images": [] on every Ollama message, even text-only ones. Changed to if images: so the key is omitted when the list is empty.


Bug 3 — if content_str is not None guard always True (ollama/chat/transformation.py)

convert_content_list_to_str returns "" (never None) for image-only messages (lists that contain no text blocks). The old guard forwarded "content": "" to Ollama for those messages. Changed to if content_str:.
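
The effect of the two guard changes (Bug 2 and Bug 3) can be illustrated with a minimal sketch. build_ollama_message is a hypothetical helper written for this illustration and is not LiteLLM's code; it only mirrors the guard logic described above, with a flag to switch between the old and new checks.

```python
from typing import Any, Dict, List


def build_ollama_message(
    role: str, content_str: str, images: List[str], truthy_guards: bool
) -> Dict[str, Any]:
    """Hypothetical helper: attach keys using the old or the new guards."""
    message: Dict[str, Any] = {"role": role}
    if truthy_guards:
        # New behaviour: empty strings and empty lists are omitted.
        if content_str:
            message["content"] = content_str
        if images:
            message["images"] = images
    else:
        # Old behaviour: "" and [] are not None, so both keys are always set.
        if content_str is not None:
            message["content"] = content_str
        if images is not None:
            message["images"] = images
    return message


old_text_only = build_ollama_message("user", "hello", [], truthy_guards=False)
new_text_only = build_ollama_message("user", "hello", [], truthy_guards=True)
new_image_only = build_ollama_message("user", "", ["<base64>"], truthy_guards=True)

print(old_text_only)   # {'role': 'user', 'content': 'hello', 'images': []}
print(new_text_only)   # {'role': 'user', 'content': 'hello'}
print(new_image_only)  # {'role': 'user', 'images': ['<base64>']}
```

With the truthiness checks, a text-only message carries no images key and an image-only message carries no content key, which matches the behaviour the updated tests are described as expecting.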


Bug 4 — /api/show URL construction appends path twice (ollama/completion/transformation.py)

get_model_info built the model-info URL as:

url=f"{api_base}/api/show"

When api_base already ends with /api/chat (set internally by get_complete_url), this produced:

http://localhost:11434/api/chat/api/show   ← 404

instead of:

http://localhost:11434/api/show            ← correct

The fix strips the trailing /api/chat segment before appending /api/show, mirroring the guard that already exists in get_complete_url.
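
A minimal sketch of the URL handling described above. build_show_url is a hypothetical name used only for this illustration; the trailing-slash handling is an extra assumption that the PR description does not mention.

```python
def build_show_url(api_base: str) -> str:
    """Hypothetical helper: derive the /api/show URL from api_base."""
    # Assumption: tolerate a trailing slash on the configured base URL.
    base = api_base.rstrip("/")
    # Strip the chat endpoint if it was already appended upstream.
    if base.endswith("/api/chat"):
        base = base[: -len("/api/chat")]
    return f"{base}/api/show"


print(build_show_url("http://localhost:11434/api/chat"))  # http://localhost:11434/api/show
print(build_show_url("http://localhost:11434"))           # http://localhost:11434/api/show
```

Both inputs resolve to the same endpoint, so the model-info lookup no longer depends on whether api_base was rewritten before the call.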


Tests

7 new regression tests added to tests/test_litellm/llms/ollama/test_ollama_chat_transformation.py:

Class | Tests
TestOllamaImageUrlFix | data-URI extraction, HTTP URL passthrough, image-only message, text-only message, multiple images
TestOllamaGetModelInfoUrlFix | strips /api/chat suffix, plain base URL unchanged

Two existing tests updated to reflect corrected behaviour (empty-images list no longer attached to text-only messages; empty-content string no longer forwarded).

All 25 tests in the file pass (python3 -m pytest tests/test_litellm/llms/ollama/test_ollama_chat_transformation.py).


Type of change

  • Bug fix (non-breaking change that fixes an issue)
  • New tests

Changed files

  • litellm/llms/ollama/chat/transformation.py (modified, +9/-3)
  • litellm/llms/ollama/completion/transformation.py (modified, +5/-0)
  • tests/test_litellm/llms/ollama/test_ollama_chat_transformation.py (modified, +293/-9)

PR #24618: fix(ollama): preserve image_url blocks in ollama_chat multimodal requests

Description (problem / solution / changelog)

Summary

Fixes https://github.com/BerriAI/litellm/issues/24615 (ollama_chat provider silently drops image_url content blocks in multimodal requests).

Four bugs fixed across two files:

Bug 1 — Operation ordering: extract images before flattening content

convert_content_list_to_str only retains type: text blocks — type: image_url blocks are intentionally omitted because images travel in Ollama's separate images field. Previously extract_images_from_message was called after the flatten call. Moving image extraction before the flatten call makes the contract explicit and prevents a silent regression.

Bug 2 — if images is not None guard always True

extract_images_from_message returns [] (never None) when there are no images. The old guard set "images": [] on every Ollama message, even text-only ones. Changed to if images: so the key is omitted when the list is empty.

Bug 3 — if content_str is not None guard always True

convert_content_list_to_str returns "" (never None) for image-only messages. The old guard forwarded "content": "" to Ollama for those messages. Changed to if content_str:.

Bug 4 — /api/show URL construction appends path twice

When api_base ends with /api/chat, the URL became http://localhost:11434/api/chat/api/show (404). Fixed by stripping the /api/chat suffix before appending /api/show.

Tests

Updated 2 existing tests to expect correct behavior (empty keys not set). All 18 tests in test_ollama_chat_transformation.py pass.

Changed files

  • litellm/llms/ollama/chat/transformation.py (modified, +3/-3)
  • litellm/llms/ollama/completion/transformation.py (modified, +3/-0)
  • litellm/model_prices_and_context_window_backup.json (modified, +17/-0)
  • model_prices_and_context_window.json (modified, +17/-0)
  • tests/test_litellm/llms/ollama/test_ollama_chat_transformation.py (modified, +4/-5)

Code Example

content_str = convert_content_list_to_str(cast(AllMessageValues, m))
images = extract_images_from_message(cast(AllMessageValues, m))

---

def extract_images_from_message(message):
    images = []
    message_content = message.get("content")
    if isinstance(message_content, list):   # <-- never True after flattening
        for m in message_content:
            image_url = m.get("image_url")
            ...
    return images

---

http://localhost:11434/api/chat/api/show   ← 404

---

http://localhost:11434/api/show            ← correct

---

if api_base.endswith("/api/chat"):
    url = api_base
else:
    url = f"{api_base}/api/chat"

---

if api_base.endswith("/api/chat"):
    api_base = api_base[: -len("/api/chat")]
url = f"{api_base}/api/show"

---

import litellm

response = litellm.completion(
    model="ollama_chat/llama3.2-vision:11b",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "data:image/png;base64,<base64>"},
                },
            ],
        }
    ],
)
print(response)
# Model responds as if no image was provided.
# Ollama receives:
# {"messages": [{"role": "user", "content": "What is in this image?", "images": []}]}


Check for existing issues

  • I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

Description

When sending a multimodal message (text + image) via ollama_chat, the image is silently dropped before the HTTP request reaches Ollama. The model receives "images": [] despite the image being present in the input.

Environment

  • LiteLLM version: 1.82.4
  • Provider: ollama_chat
  • Ollama version: latest
  • Model: any vision-capable model (reproduced with ministral-3:8b, llama3.2-vision:11b)

Root Cause

There are three bugs in the ollama_chat pipeline that interact:


Bug 1 — _flatten_ollama_content silently drops image_url blocks

_flatten_ollama_content iterates over content blocks and only preserves "type": "text" blocks. All other block types — including "type": "image_url" — are silently discarded. This means by the time transform_request receives the message, content is already a plain string with no image data.


Bug 2 — extract_images_from_message is called too late

In litellm/llms/ollama/chat/transformation.py, transform_request calls:

content_str = convert_content_list_to_str(cast(AllMessageValues, m))
images = extract_images_from_message(cast(AllMessageValues, m))

extract_images_from_message (in common_utils.py:1265) only extracts images when message["content"] is a list:

def extract_images_from_message(message):
    images = []
    message_content = message.get("content")
    if isinstance(message_content, list):   # <-- never True after flattening
        for m in message_content:
            image_url = m.get("image_url")
            ...
    return images

If any upstream normalization has already flattened content to a string (which _flatten_ollama_content does), extract_images_from_message always returns [] and ollama_message["images"] is never populated.
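
The ordering problem can be reproduced in isolation. The sketch below uses simplified stand-ins (flatten_text and extract_image_urls are hypothetical names, not the LiteLLM functions) that follow the behaviour described above: the flatten step keeps only text blocks, and extraction only works while content is still a list.

```python
from typing import Any, Dict, List


def flatten_text(message: Dict[str, Any]) -> str:
    """Stand-in for the flatten step: keeps only text blocks."""
    content = message.get("content")
    if isinstance(content, str):
        return content
    return "".join(
        block.get("text", "")
        for block in (content or [])
        if block.get("type") == "text"
    )


def extract_image_urls(message: Dict[str, Any]) -> List[str]:
    """Stand-in for image extraction: only works while content is a list."""
    content = message.get("content")
    images: List[str] = []
    if isinstance(content, list):
        for block in content:
            image_url = block.get("image_url")
            if isinstance(image_url, dict):
                images.append(image_url["url"])
            elif isinstance(image_url, str):
                images.append(image_url)
    return images


message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,AAAA"}},
    ],
}

# Safe order: extract while content is still a list, then flatten.
images_first = extract_image_urls(message)
text = flatten_text(message)

# Failure mode: content was replaced by its flattened string upstream.
flattened_upstream = {"role": "user", "content": text}
images_late = extract_image_urls(flattened_upstream)

print(images_first)  # ['data:image/png;base64,AAAA']
print(images_late)   # []
```

Once content has been replaced by a string, no later step can recover the image, which is why the extraction has to run against the original list.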


Bug 3 — get_model_info constructs a broken api/show URL

In litellm/llms/ollama/chat/transformation.py, get_model_info calls api/show to check model capabilities. However when api_base already ends with /api/chat (which is the case when LiteLLM sets it internally), the URL becomes:

http://localhost:11434/api/chat/api/show   ← 404

instead of:

http://localhost:11434/api/show            ← correct

This causes a 404, and the fallback ModelInfoBase returned on error does not set supports_vision=True, so LiteLLM may treat the model as text-only.

The fix in get_complete_url already guards against double-appending /api/chat:

if api_base.endswith("/api/chat"):
    url = api_base
else:
    url = f"{api_base}/api/chat"

get_model_info needs the same guard — strip /api/chat from api_base before constructing the /api/show URL.


Expected Fix

Bug 1 & 2: _flatten_ollama_content should not discard image_url blocks. extract_images_from_message should be called before convert_content_list_to_str, or convert_content_list_to_str should not be called on messages where content is still needed as a list for image extraction.

Bug 3: get_model_info should sanitize api_base the same way get_complete_url does:

if api_base.endswith("/api/chat"):
    api_base = api_base[: -len("/api/chat")]
url = f"{api_base}/api/show"

Minimal Reproduction

import litellm

response = litellm.completion(
    model="ollama_chat/llama3.2-vision:11b",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "data:image/png;base64,<base64>"},
                },
            ],
        }
    ],
)
print(response)
# Model responds as if no image was provided.
# Ollama receives:
# {"messages": [{"role": "user", "content": "What is in this image?", "images": []}]}
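
The <base64> placeholder in the reproduction has to be replaced with real data. The sketch below shows one way to build such a data URI with the standard library. Both function names are hypothetical. Ollama's chat API documents images as a list of base64-encoded images, so the second helper shows, as an assumption about what that conversion looks like rather than as LiteLLM's implementation, how the data-URI prefix could be removed.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a data URI for an image_url block."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def strip_data_uri_prefix(data_uri: str) -> str:
    """Return the bare base64 payload; non-data-URI input is returned as-is."""
    header, sep, payload = data_uri.partition(",")
    if sep and header.startswith("data:") and header.endswith(";base64"):
        return payload
    return data_uri


sample_bytes = b"\x89PNG\r\n\x1a\n"  # PNG signature only, not a complete image
data_uri = to_data_uri(sample_bytes)

print(data_uri)                         # data:image/png;base64,iVBORw0KGgo=
print(strip_data_uri_prefix(data_uri))  # iVBORw0KGgo=
```

For a real test, read the bytes from an image file instead of using the eight-byte signature shown here.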

Workaround

Monkey-patching extract_images_from_message to also check image_url blocks when content is still a list, and ensuring _flatten_ollama_content does not run before image extraction, restores correct behaviour.


Suggested labels: bug ollama multimodal vision

Relevant log output

What part of LiteLLM is this about?

SDK (litellm Python package)

What LiteLLM version are you on ?

1.82.4

extent analysis

Fix Plan

To fix the issues with the ollama_chat pipeline, we need to address the three bugs mentioned:

  • Bug 1 & 2: Modify _flatten_ollama_content to preserve image_url blocks and ensure extract_images_from_message is called before convert_content_list_to_str.
  • Bug 3: Sanitize api_base in get_model_info to prevent double-appending /api/chat.

Here are the concrete steps:

Step 1: Modify _flatten_ollama_content

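def _flatten_ollama_content(content):
    # Sketch only: keep image_url blocks instead of discarding them.
    # Note that the result is a list, not a single string.
    if isinstance(content, str):
        return content  # already a plain string, nothing to preserve
    flattened_content = []
    for block in content:
        # .get avoids a KeyError on blocks that carry no "type" key
        if block.get("type") == "text":
            flattened_content.append(block.get("text", ""))
        elif block.get("type") == "image_url":
            flattened_content.append(block["image_url"])
    return flattened_content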

Step 2: Update transform_request

def transform_request(message):
    # Call extract_images_from_message before convert_content_list_to_str
    images = extract_images_from_message(message)
    content_str = convert_content_list_to_str(message)
    # ...

Step 3: Sanitize api_base in get_model_info

def get_model_info(api_base):
    if api_base.endswith("/api/chat"):
        api_base = api_base[: -len("/api/chat")]
    url = f"{api_base}/api/show"
    # ...

Verification

To verify the fixes, you can use the minimal reproduction code provided in the issue body:

import litellm

response = litellm.completion(
    model="ollama_chat/llama3.2-vision:11b",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "data:image/png;base64,<base64>"},
                },
            ],
        }
    ],
)
print(response)

The model should now respond correctly, taking into account the provided image.
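
Model output alone is a weak signal, so it can help to inspect the request body as well. The sketch below assumes the JSON payload sent to Ollama has been captured by some means (for example a debug log or a local proxy) and has the shape shown in the reproduction comment. messages_missing_images is a hypothetical helper, and the "fixed" payload is an assumed example, not captured output.

```python
from typing import Any, Dict, List


def messages_missing_images(payload: Dict[str, Any]) -> List[int]:
    """Return indexes of messages whose "images" key is absent or empty."""
    return [
        index
        for index, message in enumerate(payload.get("messages", []))
        if not message.get("images")
    ]


# Payload shape reported in the issue: the image was dropped.
broken = {
    "messages": [
        {"role": "user", "content": "What is in this image?", "images": []}
    ]
}
# Assumed shape after the fix: the base64 data is forwarded.
fixed = {
    "messages": [
        {"role": "user", "content": "What is in this image?", "images": ["<base64>"]}
    ]
}

print(messages_missing_images(broken))  # [0]
print(messages_missing_images(fixed))   # []
```

The helper flags every message without images, so in a mixed conversation only the indexes of messages that were sent with an image need to be checked.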

Extra Tips

  • Make sure to test the fixes thoroughly to ensure they do not introduce any new issues.
  • Consider adding additional logging or debugging statements to help identify any potential problems.
  • If you encounter any further issues, refer to the LiteLLM documentation and community resources for support.
