langchain - 💡(How to fix) Fix fix(ollama): multimodal message text content incorrectly prefixed with \n

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

I´m trying to use deepseek-ocr running on my local hosted ollama to do an ocr of a document

Error Message

Vision models that are sensitive to exact prompt formatting return empty responses without any error or warning. One confirmed case: deepseek-ocr, which requires single-line prompts and silently produces an empty response when the content starts with \n. The failure is invisible to the user and very hard to debug.

Error Message and Stack Trace (if applicable)

Root Cause

I´m trying to use deepseek-ocr running on my local hosted ollama to do an ocr of a document

Fix Action

Fix / Workaround

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • This is not related to the langchain-community package.
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Code Example

# langchain_ollama/chat_models.py, inside _convert_messages_to_ollama_messages
content = ""
for content_part in message.content:
    ...
    elif content_part.get("type") == "text":
        content += f"\n{content_part['text']}"   # ← \n always prepended

---

HumanMessage(content=[
    {"type": "text", "text": "Extract all visible text from this image."},
    {"type": "image_url", "image_url": "data:image/png;base64,<data>"},
])

---

content = "\nExtract all visible text from this image."

---

from langchain_core.messages import HumanMessage
from langchain_ollama import ChatOllama

# A typical multimodal message: text instruction + base64 image
message = HumanMessage(
    content=[
        {"type": "text", "text": "Extract all text from this image."},
        {
            "type": "image_url",
            # Minimal valid base64 PNG (1×1 transparent pixel)
            "image_url": {
                "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg=="
            },
        },
    ]
)

# Instantiate with any model name — no network call is made here
model = ChatOllama(model="any")

ollama_messages = model._convert_messages_to_ollama_messages([message])
content = ollama_messages[0]["content"]

---
RAW_BUFFERClick to expand / collapse

Submission checklist

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • This is not related to the langchain-community package.
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Package (Required)

  • langchain
  • langchain-openai
  • langchain-anthropic
  • langchain-classic
  • langchain-core
  • langchain-model-profiles
  • langchain-tests
  • langchain-text-splitters
  • langchain-chroma
  • langchain-deepseek
  • langchain-exa
  • langchain-fireworks
  • langchain-groq
  • langchain-huggingface
  • langchain-mistralai
  • langchain-nomic
  • langchain-ollama
  • langchain-openrouter
  • langchain-perplexity
  • langchain-qdrant
  • langchain-xai
  • Other / not sure / general

Related Issues / PRs

When a HumanMessage (or any message) has list-format content — which is required for multimodal inputs containing both text and an image — _convert_messages_to_ollama_messages unconditionally prepends \n to every text segment, including the very first one:

# langchain_ollama/chat_models.py, inside _convert_messages_to_ollama_messages
content = ""
for content_part in message.content:
    ...
    elif content_part.get("type") == "text":
        content += f"\n{content_part['text']}"   # ← \n always prepended

A call like:

HumanMessage(content=[
    {"type": "text", "text": "Extract all visible text from this image."},
    {"type": "image_url", "image_url": "data:image/png;base64,<data>"},
])

produces the Ollama message

content = "\nExtract all visible text from this image."

Impact

Vision models that are sensitive to exact prompt formatting return empty responses without any error or warning. One confirmed case: deepseek-ocr, which requires single-line prompts and silently produces an empty response when the content starts with \n. The failure is invisible to the user and very hard to debug.

Reproduction Steps / Example Code (Python)

from langchain_core.messages import HumanMessage
from langchain_ollama import ChatOllama

# A typical multimodal message: text instruction + base64 image
message = HumanMessage(
    content=[
        {"type": "text", "text": "Extract all text from this image."},
        {
            "type": "image_url",
            # Minimal valid base64 PNG (1×1 transparent pixel)
            "image_url": {
                "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg=="
            },
        },
    ]
)

# Instantiate with any model name — no network call is made here
model = ChatOllama(model="any")

ollama_messages = model._convert_messages_to_ollama_messages([message])
content = ollama_messages[0]["content"]

Error Message and Stack Trace (if applicable)

Description

I´m trying to use deepseek-ocr running on my local hosted ollama to do an ocr of a document

System Info

System Information

OS: Linux OS Version: #1 SMP PREEMPT_DYNAMIC Thu Jun 5 18:30:46 UTC 2025 Python Version: 3.12.3 (main, Mar 23 2026, 19:04:32) [GCC 13.3.0]

Package Information

langchain_core: 1.4.0 langsmith: 0.8.5 langchain_ollama: 1.1.0 langchain_protocol: 0.0.15

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

langchain - 💡(How to fix) Fix fix(ollama): multimodal message text content incorrectly prefixed with \n