litellm - ✅(Solved) Fix [Bug]: LiteLLM incorrectly transforms tool result messages for gemma4:26b via Ollama 0.20.7+ [2 pull requests]

litellm2026-04-20 10:05:53

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

When using ollama/gemma4:26b through LiteLLM proxy, tool result messages (role: tool) are not correctly forwarded to Ollama after Ollama 0.20.6+. The model enters an infinite loop, re-calling the same tool indefinitely instead of consuming the result and producing a text answer.

The same request sent directly to Ollama works correctly, proving the issue is in LiteLLM's message transformation layer.

Root Cause

Root Cause Indicator

Fix Action

Workaround

Use Ollama 0.20.5. The message format expected by 0.20.5 is compatible with LiteLLM's current transformation.

PR fix notes

PR #26121: fix(ollama): forward tool_calls and tool_call_id in transform_request

Repository: BerriAI/litellm
Author: mverrilli
State: closed | merged: False
Link: https://github.com/BerriAI/litellm/pull/26121

Description (problem / solution / changelog)

Summary

transform_request translated tool_calls on assistant messages to OllamaToolCall format but never copied them into the outgoing OllamaChatCompletionMessage — Ollama received {role: assistant, content: ''} with no tool_calls
The model had no record of having made a tool call and re-issued the same call on every turn, causing an infinite loop
tool_call_id on role: tool messages was also silently dropped; Ollama uses this field to resolve the tool function name from conversation history
Added tool_call_id to OllamaChatCompletionMessage TypedDict

Fixes #26094 (reported via https://github.com/ollama/ollama/issues/15719)

Test plan

TestOllamaToolCallTransformation::test_transform_request_preserves_tool_calls — asserts tool_calls survive the transform on assistant messages
TestOllamaToolCallTransformation::test_transform_request_forwards_tool_call_id — asserts tool_call_id is forwarded on tool response messages
Full test_ollama_chat_transformation.py suite: 24/24 pass

Changed files

litellm/llms/ollama/chat/transformation.py (modified, +5/-0)
litellm/types/llms/ollama.py (modified, +1/-0)
tests/test_litellm/llms/ollama/test_ollama_chat_transformation.py (modified, +95/-1)

PR #26122: fix(ollama): forward tool_calls and tool_call_id in transform_request

Repository: BerriAI/litellm
Author: mverrilli
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/26122

Description (problem / solution / changelog)

Summary

transform_request translated tool_calls on assistant messages to OllamaToolCall format but never copied them into the outgoing OllamaChatCompletionMessage — Ollama received {role: assistant, content: ''} with no tool_calls
The model had no record of having made a tool call and re-issued the same call on every turn, causing an infinite loop
tool_call_id on role: tool messages was also silently dropped; Ollama uses this field to resolve the tool function name from conversation history
Added tool_call_id to OllamaChatCompletionMessage TypedDict

Fixes #26094 (reported via https://github.com/ollama/ollama/issues/15719)

Test plan

TestOllamaToolCallTransformation::test_transform_request_preserves_tool_calls — asserts tool_calls survive the transform on assistant messages
TestOllamaToolCallTransformation::test_transform_request_forwards_tool_call_id — asserts tool_call_id is forwarded on tool response messages
Full test_ollama_chat_transformation.py suite: 24/24 pass

Changed files

litellm/llms/ollama/chat/transformation.py (modified, +7/-2)
litellm/types/llms/ollama.py (modified, +1/-0)
tests/test_litellm/llms/ollama/test_ollama_chat_transformation.py (modified, +93/-0)

Code Example

{
  "model_name": "ollama/gemma4:26b",
  "model_info": {
    "supports_function_calling": false
  }
}

---

promptTokens  completionTokens  tool_called
5232          18                calendar_read({"overdue": true})
5237          18                calendar_read({"overdue": true})   +5 tokens
5239          18                calendar_read({"overdue": true})   +2 tokens
5241          18                calendar_read({"overdue": true})   +2 tokens
5243          18                calendar_read({"overdue": true})   +2 tokens

---

import json, urllib.request

LITELLM_URL = 'http://<litellm-host>/v1/chat/completions'
LITELLM_KEY = '<your-key>'

tools = [{
    'type': 'function',
    'function': {
        'name': 'get_weather',
        'description': 'Get current weather for a city',
        'parameters': {
            'type': 'object',
            'properties': {'city': {'type': 'string'}},
            'required': ['city']
        }
    }
}]

def call(messages, tools=None):
    payload = {'model': 'ollama/gemma4:26b', 'messages': messages, 'stream': False}
    if tools:
        payload['tools'] = tools
    data = json.dumps(payload).encode()
    req = urllib.request.Request(LITELLM_URL, data=data, headers={
        'Content-Type': 'application/json',
        'Authorization': f'Bearer {LITELLM_KEY}'
    })
    with urllib.request.urlopen(req, timeout=120) as r:
        return json.loads(r.read())['choices'][0]['message']

# Step 1 — model generates tool call (works fine)
m1 = call([{'role': 'user', 'content': 'Weather in Warsaw?'}], tools)

# Step 2 — send tool result (BUG: model loops instead of answering)
msgs = [
    {'role': 'user', 'content': 'Weather in Warsaw?'},
    {'role': 'assistant', 'content': '', 'tool_calls': m1['tool_calls']},
    {'role': 'tool',
     'tool_call_id': m1['tool_calls'][0]['id'],
     'content': '{"temperature": "15C", "conditions": "sunny"}'}
]
m2 = call(msgs, tools)
# Expected: m2['content'] = "The weather in Warsaw is 15°C and sunny."
# Actual:   m2['tool_calls'] = [get_weather(city=Warsaw)]  ← infinite loop

---

RAW_BUFFERClick to expand / collapse

Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

Bug Report: LiteLLM incorrectly transforms tool result messages for gemma4:26b via Ollama

Date: 2026-04-17
Reporter: legalos.ai project
Severity: Critical — tool calling broken for gemma4 via Ollama backend
LiteLLM version: 1.82.3
Ollama version: 0.20.7

Summary

The same request sent directly to Ollama works correctly, proving the issue is in LiteLLM's message transformation layer.

Environment

Component	Version
LiteLLM	1.82.3
Ollama	0.20.7
Model	`gemma4:26b`

Root Cause Indicator

LiteLLM's internal model registry reports gemma4:26b as not supporting function calling:

{
  "model_name": "ollama/gemma4:26b",
  "model_info": {
    "supports_function_calling": false
  }
}

This is incorrect — gemma4:26b does support native tool calling via Ollama's /api/chat endpoint. The false flag likely causes LiteLLM to use a prompt-injection fallback instead of passing tool messages natively, producing a format that Ollama 0.20.7 no longer accepts.

Observed vs Expected

	Step 2 `content`	Step 2 `tool_calls`
LiteLLM → Ollama 0.20.7	`''`	`[get_weather(city=Warsaw)]` ← BUG
LiteLLM → Ollama 0.20.5	`'The weather in Warsaw is 15°C...'`	`None` ← OK
Direct Ollama 0.20.7	`'The weather in Warsaw is 15°C...'`	`None` ← OK

Evidence from Production

Production logs showing identical args on every iteration (prompt grows by 2–5 tokens instead of ~80, confirming tool content is not reaching the model):

promptTokens  completionTokens  tool_called
5232          18                calendar_read({"overdue": true})
5237          18                calendar_read({"overdue": true})   +5 tokens
5239          18                calendar_read({"overdue": true})   +2 tokens
5241          18                calendar_read({"overdue": true})   +2 tokens
5243          18                calendar_read({"overdue": true})   +2 tokens

Observed for multiple different tools: calendar_read, law_lookup, dokuwiki_search, deep_research.

Suggested Fix

Update model registry: set supports_function_calling: true for gemma4 models when backend is ollama_chat.
Pass tool messages natively: when the Ollama backend is ollama_chat, forward role: tool messages using Ollama's native /api/chat format rather than the prompt-injection fallback.
Regression test: add a test case for the multi-turn tool call flow (tool_call → tool_result → text_answer) for ollama_chat backends.

Workaround

Use Ollama 0.20.5. The message format expected by 0.20.5 is compatible with LiteLLM's current transformation.

Ollama bug report: see docs/bug-reports/ollama-0.20.7-gemma4-tool-loop.md
Ollama 0.20.6 changelog: https://github.com/ollama/ollama/releases/tag/v0.20.6

Steps to Reproduce

import json, urllib.request

LITELLM_URL = 'http://<litellm-host>/v1/chat/completions'
LITELLM_KEY = '<your-key>'

tools = [{
    'type': 'function',
    'function': {
        'name': 'get_weather',
        'description': 'Get current weather for a city',
        'parameters': {
            'type': 'object',
            'properties': {'city': {'type': 'string'}},
            'required': ['city']
        }
    }
}]

def call(messages, tools=None):
    payload = {'model': 'ollama/gemma4:26b', 'messages': messages, 'stream': False}
    if tools:
        payload['tools'] = tools
    data = json.dumps(payload).encode()
    req = urllib.request.Request(LITELLM_URL, data=data, headers={
        'Content-Type': 'application/json',
        'Authorization': f'Bearer {LITELLM_KEY}'
    })
    with urllib.request.urlopen(req, timeout=120) as r:
        return json.loads(r.read())['choices'][0]['message']

# Step 1 — model generates tool call (works fine)
m1 = call([{'role': 'user', 'content': 'Weather in Warsaw?'}], tools)

# Step 2 — send tool result (BUG: model loops instead of answering)
msgs = [
    {'role': 'user', 'content': 'Weather in Warsaw?'},
    {'role': 'assistant', 'content': '', 'tool_calls': m1['tool_calls']},
    {'role': 'tool',
     'tool_call_id': m1['tool_calls'][0]['id'],
     'content': '{"temperature": "15C", "conditions": "sunny"}'}
]
m2 = call(msgs, tools)
# Expected: m2['content'] = "The weather in Warsaw is 15°C and sunny."
# Actual:   m2['tool_calls'] = [get_weather(city=Warsaw)]  ← infinite loop

Relevant log output

What part of LiteLLM is this about?

No response

What LiteLLM version are you on ?

v1.82.3

Twitter / LinkedIn details

https://www.linkedin.com/in/lukasz-pozniak/

extent analysis

TL;DR

The most likely fix is to update the model registry to set supports_function_calling: true for gemma4 models when the backend is ollama_chat, allowing LiteLLM to pass tool messages natively.

Guidance

Verify that the issue is indeed caused by the incorrect supports_function_calling flag in the model registry by checking the LiteLLM internal model registry reports.
Update the model registry to set supports_function_calling: true for gemma4 models when the backend is ollama_chat.
Test the multi-turn tool call flow (tool_call → tool_result → text_answer) for ollama_chat backends to ensure the fix works as expected.
Consider using Ollama 0.20.5 as a temporary workaround until the fix is implemented.

Example

No code snippet is provided as the issue is related to the model registry configuration.

Notes

The fix assumes that the gemma4:26b model does support native tool calling via Ollama's /api/chat endpoint. If this is not the case, further investigation is needed.

Recommendation

Apply the workaround by using Ollama 0.20.5 until the model registry update can be implemented, as this version is compatible with LiteLLM's current transformation.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #installation #tensor shape #autograd error #model save/load

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

litellm - ✅(Solved) Fix [Bug]: LiteLLM incorrectly transforms tool result messages for gemma4:26b via Ollama 0.20.7+ [2 pull requests]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Root Cause Indicator

Fix Action

Workaround

PR fix notes

PR #26121: fix(ollama): forward tool_calls and tool_call_id in transform_request

Description (problem / solution / changelog)

Summary

Test plan

Changed files

PR #26122: fix(ollama): forward tool_calls and tool_call_id in transform_request

Description (problem / solution / changelog)

Summary

Test plan

Changed files

Code Example

Check for existing issues

What happened?

Bug Report: LiteLLM incorrectly transforms tool result messages for gemma4:26b via Ollama

Summary

Environment

Root Cause Indicator

Observed vs Expected

Evidence from Production

Suggested Fix

Workaround

Related

Steps to Reproduce

Steps to Reproduce

Relevant log output

What part of LiteLLM is this about?

What LiteLLM version are you on ?

Twitter / LinkedIn details

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING