ollama - 💡(How to fix) Fix [Bug] gemma4 parser fails to extract tool_calls when combining system prompt + think:false + tools [6 comments, 2 participants]

Fix Action

Fix / Workaround

<h3>Expected behavior</h3> <p>Test 3 should produce the same structured <code>tool_calls</code> output as Test 1, since the only difference is the addition of a system prompt. The <code>think: false</code> flag should disable thinking without breaking tool call parsing.</p> <h3>Impact</h3> <p>This bug makes <code>gemma4:e4b</code> unusable with any client that sends a system prompt alongside tools and <code>think: false</code>, including:</p> <ul> <li><strong>Home Assistant</strong> Ollama integration (always sends a system prompt with entity definitions)</li> <li>Any OpenAI-compatible client using system prompts with tool definitions</li> </ul> <p>The workaround of leaving thinking enabled (Test 2) works for tool calling but adds 10+ seconds of latency and causes thinking tokens to leak into streaming clients.</p> <h3>Possibly related issues</h3> <ul> <li>#15241 — gemma4 tool call parsing fails</li> <li>#15315 — gemma4:e4b tool parsing errors persist in 0.20.1</li> <li>#15254 — fix gemma4 arg parsing with quoted strings</li> <li>#15306 — rework gemma4 tool call handling</li> </ul></body></html> </body> </html>

<h3>What is the issue?</h3> <p>The <code>gemma4</code> parser in Ollama 0.20.6 fails to extract tool calls from the model response when a <strong>system prompt</strong> is combined with <strong><code>think: false</code></strong> and <strong>tools</strong>. The model correctly generates the tool call JSON, but the parser does not intercept it — the raw JSON leaks into the <code>content</code> field instead of being placed in the <code>tool_calls</code> field.</p> <p>This breaks Home Assistant's Ollama integration, which always sends a system prompt (containing assistant instructions and exposed entity definitions) along with tool definitions.</p> <h3>Environment</h3> <ul> <li><strong>Ollama version:</strong> 0.20.6</li> <li><strong>Model:</strong> <code>gemma4:e4b</code> (official, pulled via <code>ollama pull gemma4:e4b</code>)</li> <li><strong>OS:</strong> Ubuntu 24.04 (LXC container on Proxmox VE)</li> <li><strong>Hardware:</strong> AMD Ryzen 7 8745HS, Radeon 780M iGPU (ROCm), 32 GB RAM</li> <li><strong>Client:</strong> Home Assistant OS (Core 2026.4.2, Supervisor 2026.03.3, OS 17.2, Frontend 20260325.7) Ollama integration + direct curl testing</li> </ul> <h3>Reproduction steps</h3> <p>Run the following three curl commands against a fresh <code>gemma4:e4b</code> model. 
They demonstrate that the bug only occurs with a specific combination.</p> <p><strong>Test 1 — No system prompt + <code>think: false</code> → ✅ WORKS</strong></p> <pre><code class="language-bash">curl -s http://localhost:11434/api/chat -d '{ "model": "gemma4:e4b", "messages": [{"role": "user", "content": "What is the weather in Talence?"}], "tools": [{"type": "function", "function": {"name": "get_weather", "description": "Get weather info", "parameters": {"type": "object", "properties": {"location": {"type": "string"}}, "required": ["location"]}}}], "stream": false, "think": false }' | python3 -m json.tool </code></pre> <p><strong>Result:</strong> <code>content</code> is empty, <code>tool_calls</code> is correctly populated:</p> <pre><code class="language-json">{ "message": { "role": "assistant", "content": "", "tool_calls": [ { "id": "call_g76u5xbz", "function": { "index": 0, "name": "get_weather", "arguments": {"location": "Talence"} } } ] } } </code></pre> <p><strong>Test 2 — System prompt + thinking active (default) → ⚠️ tool_calls OK but thinking leaks</strong></p> <pre><code class="language-bash">curl -s http://localhost:11434/api/chat -d '{ "model": "gemma4:e4b", "messages": [ {"role": "system", "content": "Tu es Jarvis, assistant domotique. Réponds en français."}, {"role": "user", "content": "Je veux la météo"} ], "tools": [{"type": "function", "function": {"name": "GetLiveContext", "description": "Get live context", "parameters": {"type": "object", "properties": {}, "required": []}}}], "stream": false }' | python3 -m json.tool </code></pre> <p><strong>Result:</strong> <code>tool_calls</code> is correctly populated, but <code>thinking</code> field contains a long reasoning chain (14 seconds). The tool calling itself works:</p> <pre><code class="language-json">{ "message": { "role": "assistant", "content": "", "thinking": "1. **Analyze the Request:** ... (long reasoning) ... 6. 
**Generate the tool call:** Call GetLiveContext.", "tool_calls": [ { "id": "call_bgl0bmz2", "function": { "index": 0, "name": "GetLiveContext", "arguments": {} } } ] } } </code></pre> <p><strong>Test 3 — System prompt + <code>think: false</code> → ❌ BUG — tool_calls not parsed</strong></p> <pre><code class="language-bash">curl -s http://localhost:11434/api/chat -d '{ "model": "gemma4:e4b", "messages": [ {"role": "system", "content": "Tu es Jarvis, assistant domotique. Réponds en français."}, {"role": "user", "content": "Je veux la météo"} ], "tools": [{"type": "function", "function": {"name": "GetLiveContext", "description": "Get live context", "parameters": {"type": "object", "properties": {}, "required": []}}}], "stream": false, "think": false }' | python3 -m json.tool </code></pre> <p><strong>Result:</strong> The model generates the correct tool call JSON, but the parser does NOT intercept it. The raw JSON leaks into <code>content</code> with a trailing <code>&lt;channel|&gt;</code> token:</p> <pre><code class="language-json">{ "message": { "role": "assistant", "content": "{\n \"tool_calls\": [\n {\n \"function\": \"GetLiveContext\",\n \"args\": {}\n }\n ]\n}\n&lt;channel|&gt;" } } </code></pre> <p>No <code>tool_calls</code> field is present. No <code>thinking</code> field.</p> <h3>Summary</h3>

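<table>
<thead> <tr> <th>Test</th> <th>System prompt</th> <th><code>think: false</code></th> <th><code>tool_calls</code> parsed</th> <th>Duration</th> </tr> </thead>
<tbody>
<tr> <td>1</td> <td>❌ No</td> <td>✅ Yes</td> <td>✅ Yes</td> <td>~2s</td> </tr>
<tr> <td>2</td> <td>✅ Yes</td> <td>❌ No</td> <td>✅ Yes (but thinking leaks)</td> <td>~14s</td> </tr>
<tr> <td>3</td> <td>✅ Yes</td> <td>✅ Yes</td> <td>❌ No (JSON in <code>content</code>)</td> <td>~2s</td> </tr>
</tbody>
</table>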

extent analysis

TL;DR

The gemma4 parser in Ollama 0.20.6 fails to extract tool calls when a system prompt is combined with think: false and tools, causing the raw JSON to leak into the content field.

Guidance

The failure occurs only when all three are present in the same request: a system prompt, think: false, and tools. Dropping the system prompt (Test 1) or leaving thinking enabled (Test 2) yields a correctly populated tool_calls field.
To verify, run the three curl commands (Test 1, Test 2, and Test 3) against a fresh gemma4:e4b model. Only Test 3 should return the tool call JSON inside content, followed by a stray <channel|> token, with no tool_calls field.
As a temporary workaround, leave thinking enabled (as in Test 2). Tool calling then works, but the report notes 10+ seconds of added latency and thinking tokens leaking into streaming clients.
Review the possibly related issues (#15241, #15315, #15254, #15306) to see if they provide any insights or fixes for the tool call parsing issue.
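
The verification step above can be scripted so that each response is labelled automatically. This is a minimal sketch, assuming Python 3 and the response shapes shown in Tests 1 and 3. The names chat and classify_response are hypothetical helpers written for this page, not part of Ollama, and the <channel|> token is taken from the single Test 3 sample.

```python
import json
import urllib.request

# Trailing token observed in the Test 3 output; assumed to be the only stray token.
LEAK_TOKEN = "<channel|>"


def chat(payload, url="http://localhost:11434/api/chat"):
    """POST a payload (same JSON as the curl tests) and return the decoded response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def classify_response(resp):
    """Label a non-streaming /api/chat response.

    'parsed'       -> tool_calls field is populated (Tests 1 and 2)
    'leaked'       -> content holds JSON with a "tool_calls" key (Test 3)
    'no_tool_call' -> neither of the above
    """
    message = resp.get("message", {})
    if message.get("tool_calls"):
        return "parsed"
    # Drop the stray token, then check whether content is tool-call JSON.
    content = (message.get("content") or "").replace(LEAK_TOKEN, "").strip()
    try:
        data = json.loads(content)
    except json.JSONDecodeError:
        return "no_tool_call"
    if isinstance(data, dict) and "tool_calls" in data:
        return "leaked"
    return "no_tool_call"
```

Calling classify_response(chat(payload)) with the Test 3 payload should report "leaked" on an affected install, while the Test 1 and Test 2 payloads should report "parsed".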

Example

The original report does not include a fix. Its only code is the three curl reproductions (Tests 1 to 3), because the fault lies in the server-side gemma4 parser rather than in client code.

Notes

The report covers only gemma4:e4b on Ollama 0.20.6. Other models and versions were not tested, so the behavior may differ there. The curl commands and their outputs are the basis for reproducing the issue.

Recommendation

Until a fix is available, leave thinking enabled (as in Test 2). Tool calling works in that configuration, at the cost of extra latency (about 14s instead of about 2s in the tests) and thinking tokens leaking into streaming clients.
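
Where the latency of the thinking workaround is not acceptable, a client could instead keep think: false and repair the leaked response itself. The following is a hedged sketch of that idea, not the Ollama fix and not something proposed in the issue. recover_tool_calls is a hypothetical function, and the mapping of the leaked keys (function to name, args to arguments) is an assumption based only on the Test 3 sample. The recovered entries carry no id field, since the leaked JSON does not contain one.

```python
import json

# Trailing token observed in the Test 3 output.
LEAK_TOKEN = "<channel|>"


def recover_tool_calls(message):
    """Return a message whose leaked tool-call JSON is moved into tool_calls.

    The input is returned unchanged when tool_calls is already populated or
    when content does not match the shape seen in Test 3.
    """
    if message.get("tool_calls"):
        return message  # the server parsed the call correctly
    content = (message.get("content") or "").replace(LEAK_TOKEN, "").strip()
    try:
        data = json.loads(content)
    except json.JSONDecodeError:
        return message  # ordinary text reply
    if not isinstance(data, dict) or not isinstance(data.get("tool_calls"), list):
        return message
    calls = []
    for index, call in enumerate(data["tool_calls"]):
        # Leaked shape (assumed): {"function": "<name>", "args": {...}}
        if not isinstance(call, dict) or not isinstance(call.get("function"), str):
            return message  # unexpected shape, leave untouched
        calls.append({
            "function": {
                "index": index,
                "name": call["function"],
                "arguments": call.get("args") or {},
            }
        })
    if not calls:
        return message
    fixed = dict(message)  # shallow copy so the caller's dict is not mutated
    fixed["content"] = ""
    fixed["tool_calls"] = calls
    return fixed
```

The function is deliberately conservative: anything that does not look exactly like the Test 3 leak is passed through untouched, so normal text replies and correctly parsed tool calls are not altered.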
