litellm - ✅(Solved) Fix [Bug]: LiteLLM incorrectly transforms tool result messages for gemma4:26b via Ollama 0.20.7+ [2 pull requests]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

When using ollama/gemma4:26b through LiteLLM proxy, tool result messages (role: tool) are not correctly forwarded to Ollama after Ollama 0.20.6+. The model enters an infinite loop, re-calling the same tool indefinitely instead of consuming the result and producing a text answer.

The same request sent directly to Ollama works correctly, proving the issue is in LiteLLM's message transformation layer.


Root Cause

Root Cause Indicator

Fix Action

Workaround

Use Ollama 0.20.5. The message format expected by 0.20.5 is compatible with LiteLLM's current transformation.


PR fix notes

PR #26121: fix(ollama): forward tool_calls and tool_call_id in transform_request

Description (problem / solution / changelog)

Summary

  • transform_request translated tool_calls on assistant messages to OllamaToolCall format but never copied them into the outgoing OllamaChatCompletionMessage — Ollama received {role: assistant, content: ''} with no tool_calls
  • The model had no record of having made a tool call and re-issued the same call on every turn, causing an infinite loop
  • tool_call_id on role: tool messages was also silently dropped; Ollama uses this field to resolve the tool function name from conversation history
  • Added tool_call_id to OllamaChatCompletionMessage TypedDict

Fixes #26094 (reported via https://github.com/ollama/ollama/issues/15719)

Test plan

  • TestOllamaToolCallTransformation::test_transform_request_preserves_tool_calls — asserts tool_calls survive the transform on assistant messages
  • TestOllamaToolCallTransformation::test_transform_request_forwards_tool_call_id — asserts tool_call_id is forwarded on tool response messages
  • Full test_ollama_chat_transformation.py suite: 24/24 pass

Changed files

  • litellm/llms/ollama/chat/transformation.py (modified, +5/-0)
  • litellm/types/llms/ollama.py (modified, +1/-0)
  • tests/test_litellm/llms/ollama/test_ollama_chat_transformation.py (modified, +95/-1)

PR #26122: fix(ollama): forward tool_calls and tool_call_id in transform_request

Description (problem / solution / changelog)

Summary

  • transform_request translated tool_calls on assistant messages to OllamaToolCall format but never copied them into the outgoing OllamaChatCompletionMessage — Ollama received {role: assistant, content: ''} with no tool_calls
  • The model had no record of having made a tool call and re-issued the same call on every turn, causing an infinite loop
  • tool_call_id on role: tool messages was also silently dropped; Ollama uses this field to resolve the tool function name from conversation history
  • Added tool_call_id to OllamaChatCompletionMessage TypedDict

Fixes #26094 (reported via https://github.com/ollama/ollama/issues/15719)

Test plan

  • TestOllamaToolCallTransformation::test_transform_request_preserves_tool_calls — asserts tool_calls survive the transform on assistant messages
  • TestOllamaToolCallTransformation::test_transform_request_forwards_tool_call_id — asserts tool_call_id is forwarded on tool response messages
  • Full test_ollama_chat_transformation.py suite: 24/24 pass

Changed files

  • litellm/llms/ollama/chat/transformation.py (modified, +7/-2)
  • litellm/types/llms/ollama.py (modified, +1/-0)
  • tests/test_litellm/llms/ollama/test_ollama_chat_transformation.py (modified, +93/-0)

Code Example

{
  "model_name": "ollama/gemma4:26b",
  "model_info": {
    "supports_function_calling": false
  }
}

---

promptTokens  completionTokens  tool_called
5232          18                calendar_read({"overdue": true})
5237          18                calendar_read({"overdue": true})   +5 tokens
5239          18                calendar_read({"overdue": true})   +2 tokens
5241          18                calendar_read({"overdue": true})   +2 tokens
5243          18                calendar_read({"overdue": true})   +2 tokens

---

import json, urllib.request

LITELLM_URL = 'http://<litellm-host>/v1/chat/completions'
LITELLM_KEY = '<your-key>'

tools = [{
    'type': 'function',
    'function': {
        'name': 'get_weather',
        'description': 'Get current weather for a city',
        'parameters': {
            'type': 'object',
            'properties': {'city': {'type': 'string'}},
            'required': ['city']
        }
    }
}]

def call(messages, tools=None):
    payload = {'model': 'ollama/gemma4:26b', 'messages': messages, 'stream': False}
    if tools:
        payload['tools'] = tools
    data = json.dumps(payload).encode()
    req = urllib.request.Request(LITELLM_URL, data=data, headers={
        'Content-Type': 'application/json',
        'Authorization': f'Bearer {LITELLM_KEY}'
    })
    with urllib.request.urlopen(req, timeout=120) as r:
        return json.loads(r.read())['choices'][0]['message']

# Step 1 — model generates tool call (works fine)
m1 = call([{'role': 'user', 'content': 'Weather in Warsaw?'}], tools)

# Step 2 — send tool result (BUG: model loops instead of answering)
msgs = [
    {'role': 'user', 'content': 'Weather in Warsaw?'},
    {'role': 'assistant', 'content': '', 'tool_calls': m1['tool_calls']},
    {'role': 'tool',
     'tool_call_id': m1['tool_calls'][0]['id'],
     'content': '{"temperature": "15C", "conditions": "sunny"}'}
]
m2 = call(msgs, tools)
# Expected: m2['content'] = "The weather in Warsaw is 15°C and sunny."
# Actual:   m2['tool_calls'] = [get_weather(city=Warsaw)]  ← infinite loop

---
RAW_BUFFERClick to expand / collapse

Check for existing issues

  • I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

Bug Report: LiteLLM incorrectly transforms tool result messages for gemma4:26b via Ollama

Date: 2026-04-17
Reporter: legalos.ai project
Severity: Critical — tool calling broken for gemma4 via Ollama backend
LiteLLM version: 1.82.3
Ollama version: 0.20.7


Summary

When using ollama/gemma4:26b through LiteLLM proxy, tool result messages (role: tool) are not correctly forwarded to Ollama after Ollama 0.20.6+. The model enters an infinite loop, re-calling the same tool indefinitely instead of consuming the result and producing a text answer.

The same request sent directly to Ollama works correctly, proving the issue is in LiteLLM's message transformation layer.


Environment

ComponentVersion
LiteLLM1.82.3
Ollama0.20.7
Modelgemma4:26b

Root Cause Indicator

LiteLLM's internal model registry reports gemma4:26b as not supporting function calling:

{
  "model_name": "ollama/gemma4:26b",
  "model_info": {
    "supports_function_calling": false
  }
}

This is incorrect — gemma4:26b does support native tool calling via Ollama's /api/chat endpoint. The false flag likely causes LiteLLM to use a prompt-injection fallback instead of passing tool messages natively, producing a format that Ollama 0.20.7 no longer accepts.



Observed vs Expected

Step 2 contentStep 2 tool_calls
LiteLLM → Ollama 0.20.7''[get_weather(city=Warsaw)]BUG
LiteLLM → Ollama 0.20.5'The weather in Warsaw is 15°C...'None ← OK
Direct Ollama 0.20.7'The weather in Warsaw is 15°C...'None ← OK

Evidence from Production

Production logs showing identical args on every iteration (prompt grows by 2–5 tokens instead of ~80, confirming tool content is not reaching the model):

promptTokens  completionTokens  tool_called
5232          18                calendar_read({"overdue": true})
5237          18                calendar_read({"overdue": true})   +5 tokens
5239          18                calendar_read({"overdue": true})   +2 tokens
5241          18                calendar_read({"overdue": true})   +2 tokens
5243          18                calendar_read({"overdue": true})   +2 tokens

Observed for multiple different tools: calendar_read, law_lookup, dokuwiki_search, deep_research.


Suggested Fix

  1. Update model registry: set supports_function_calling: true for gemma4 models when backend is ollama_chat.

  2. Pass tool messages natively: when the Ollama backend is ollama_chat, forward role: tool messages using Ollama's native /api/chat format rather than the prompt-injection fallback.

  3. Regression test: add a test case for the multi-turn tool call flow (tool_call → tool_result → text_answer) for ollama_chat backends.


Workaround

Use Ollama 0.20.5. The message format expected by 0.20.5 is compatible with LiteLLM's current transformation.


Related

Steps to Reproduce

Steps to Reproduce

import json, urllib.request

LITELLM_URL = 'http://<litellm-host>/v1/chat/completions'
LITELLM_KEY = '<your-key>'

tools = [{
    'type': 'function',
    'function': {
        'name': 'get_weather',
        'description': 'Get current weather for a city',
        'parameters': {
            'type': 'object',
            'properties': {'city': {'type': 'string'}},
            'required': ['city']
        }
    }
}]

def call(messages, tools=None):
    payload = {'model': 'ollama/gemma4:26b', 'messages': messages, 'stream': False}
    if tools:
        payload['tools'] = tools
    data = json.dumps(payload).encode()
    req = urllib.request.Request(LITELLM_URL, data=data, headers={
        'Content-Type': 'application/json',
        'Authorization': f'Bearer {LITELLM_KEY}'
    })
    with urllib.request.urlopen(req, timeout=120) as r:
        return json.loads(r.read())['choices'][0]['message']

# Step 1 — model generates tool call (works fine)
m1 = call([{'role': 'user', 'content': 'Weather in Warsaw?'}], tools)

# Step 2 — send tool result (BUG: model loops instead of answering)
msgs = [
    {'role': 'user', 'content': 'Weather in Warsaw?'},
    {'role': 'assistant', 'content': '', 'tool_calls': m1['tool_calls']},
    {'role': 'tool',
     'tool_call_id': m1['tool_calls'][0]['id'],
     'content': '{"temperature": "15C", "conditions": "sunny"}'}
]
m2 = call(msgs, tools)
# Expected: m2['content'] = "The weather in Warsaw is 15°C and sunny."
# Actual:   m2['tool_calls'] = [get_weather(city=Warsaw)]  ← infinite loop

Relevant log output

What part of LiteLLM is this about?

No response

What LiteLLM version are you on ?

v1.82.3

Twitter / LinkedIn details

https://www.linkedin.com/in/lukasz-pozniak/

extent analysis

TL;DR

The most likely fix is to update the model registry to set supports_function_calling: true for gemma4 models when the backend is ollama_chat, allowing LiteLLM to pass tool messages natively.

Guidance

  • Verify that the issue is indeed caused by the incorrect supports_function_calling flag in the model registry by checking the LiteLLM internal model registry reports.
  • Update the model registry to set supports_function_calling: true for gemma4 models when the backend is ollama_chat.
  • Test the multi-turn tool call flow (tool_call → tool_result → text_answer) for ollama_chat backends to ensure the fix works as expected.
  • Consider using Ollama 0.20.5 as a temporary workaround until the fix is implemented.

Example

No code snippet is provided as the issue is related to the model registry configuration.

Notes

The fix assumes that the gemma4:26b model does support native tool calling via Ollama's /api/chat endpoint. If this is not the case, further investigation is needed.

Recommendation

Apply the workaround by using Ollama 0.20.5 until the model registry update can be implemented, as this version is compatible with LiteLLM's current transformation.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

litellm - ✅(Solved) Fix [Bug]: LiteLLM incorrectly transforms tool result messages for gemma4:26b via Ollama 0.20.7+ [2 pull requests]