ollama - 💡(How to fix) Fix gemma4-64k chat-template special tokens (<|tool_call|>, <|"|>, <|channel|>) leak into OpenAI-compat tool-calling output [1 participants]

ollama2026-04-24 14:33:30

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

ollama/ollama#15798•Fetched 2026-04-25 06:03:30

View on GitHub

Comments

Participants

Timeline

Reactions

Author

mons-bot

Participants

mons-bot

Timeline (top)

closed ×1

Fix Action

Workaround

None. Swapping model family is the only reliable option.

Closing: will redo with a generic minimal repro if needed.

RAW_BUFFERClick to expand / collapse

Observed

Running gemma4-64k:latest through Ollama's OpenAI-compat endpoint (/v1/chat/completions with tool schemas), the model emits tool calls as plain text inside the assistant message — often inside a thinking block — with Gemma chat-template special tokens leaked verbatim (<|tool_call|>, <|"|>, <|channel|>, <|tool_response|>). finish_reason returns stop, so the harness ends the turn as if the model had spoken a normal reply, and no tool call is dispatched.

Ollama version

0.21.1. Prior related issues (#15241, #15315) claim a fix in 0.20.6 but the behavior is still reproducible on 0.21.1.

Pattern

String arguments wrapped in <|"|>...<|"|> instead of real quotes.
call: prefix where a structured tool-call should be.
Surrounding <|tool_call|> / <|channel|> template tokens emitted as plain text inside content.

Workaround

None. Swapping model family is the only reliable option.

Closing: will redo with a generic minimal repro if needed.

extent analysis

TL;DR

The issue can be potentially resolved by upgrading to a version where the fix for similar issues (#15241, #15315) is confirmed to be stable, or by exploring alternative model families as a workaround.

Guidance

Review the release notes for versions after 0.20.6 to confirm if the fix for issues #15241 and #15315 is stable and applicable to the current scenario.
Test the behavior with different model families to identify if the issue is model-specific or a broader compatibility problem with the Ollama OpenAI-compat endpoint.
Investigate the tool schema configuration for /v1/chat/completions to ensure it aligns with the expected format for tool calls, potentially adjusting the schema to better match the model's output.
Consider filing a new issue or updating the existing ones (#15241, #15315) with the current version (0.21.1) and detailed reproduction steps for further assistance.

Example

No specific code example can be provided without more details on the tool schema and model configuration. However, ensuring that the tool calls are properly formatted and that the model is correctly configured to handle these calls is crucial.

Notes

The provided information suggests that the issue might be related to how the model handles tool calls and special tokens, but without direct access to the model's configuration or the exact tool schema used, it's challenging to provide a definitive fix. The issue seems to persist despite previous fixes, indicating a potential regression or a specific scenario not covered by those fixes.

Recommendation

Apply workaround: Since the issue persists in version 0.21.1 and previous fixes do not seem to apply, swapping to a different model family appears to be the most reliable temporary solution until a more permanent fix can be identified and implemented.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#tensor shape #autograd error #model save/load #optimization #mixed precision

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

ollama - 💡(How to fix) Fix gemma4-64k chat-template special tokens (<|tool_call|>, <|"|>, <|channel|>) leak into OpenAI-compat tool-calling output [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Fix Action

Workaround

Observed

Ollama version

Pattern

Workaround

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

TRENDING

ollama - 💡(How to fix) Fix gemma4-64k chat-template special tokens (<|tool_call|>, <|"|>, <|channel|>) leak into OpenAI-compat tool-calling output [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Fix Action

Workaround

Observed

Ollama version

Pattern

Workaround

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING