ollama - 💡(How to fix) Fix qwen-chat-template: message.reasoning_content silently dropped when sent via OpenAI-completions API

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Error Message

Test 1 — reasoning discarded, model does not know about avocado (Expected):

Root Cause

{"id":"chatcmpl-923","object":"chat.completion","created":1779311599,"model":"qwen3.6:35b-a3b","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":"I didn't actually pick a fruit—I just followed your instruction to reply with \"ok.\" If you'd like me to choose one now, just say the word!","reasoning":"Here's a thinking process:\n\n1.  **Analyze User Input:**\n   - Previous prompt: \"Choose a fruit but just reply with ok.\"\n   - My response: \"ok\"\n   - Current prompt: \"Which fruit did you pick?\"\n\n2.  **Identify Key Constraints/Context:**\n   - I was explicitly told to \"just reply with ok\" in the first prompt.\n   - I followed that constraint.\n   - Now the user is asking which fruit I picked, but I never actually picked or named a fruit because of the explicit instruction.\n\n3.  **Determine Response Strategy:**\n   - Acknowledge the constraint from the first prompt.\n   - Clarify that I didn't actually choose a fruit because I was instructed to only reply \"ok\".\n   - If they want me to pick one now, I can offer to do so.\n   - Keep it concise and helpful.\n\n4.  **Draft Response (Mental):**\n   I didn't actually pick a fruit—I just followed your instruction to reply with \"ok.\" If you'd like me to choose one now, just let me know!\n\n5.  **Refine Response:**\n   - Check against constraints: The original constraint was only for that specific prompt, so it's fine to answer now.\n   - Tone: Friendly, clear, direct.\n   - Accuracy: Correctly explains why no fruit was named.\n   - Ready.\n\n   Final: I didn’t actually pick a fruit—I just followed your instruction to reply with “ok.” If you’d like me to choose one now, just say the word! 😊🍎 (optional emoji for tone, but keep it simple)\n\n   Actually, let's keep it straightforward without emojis unless necessary. \"I didn’t actually pick a fruit—I just followed your instruction to reply with 'ok.' If you'd like me to choose one now, just let me know!\" -\u003e Good.✅"},"finish_reason":"stop"}],"usage":{"prompt_tokens":36,"completion_tokens":446,"total_tokens":482}}

Code Example

{%- if (preserve_thinking is defined and preserve_thinking is true) or (loop.index0 > ns.last_query_index) %}
    {{- '<|im_start|>' + message.role + '\n authService\n' + reasoning_content + '\n authService\n\n' + content }}
{%- else %}
    {{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}

---

curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.6:35b-a3b",
    "messages": [
      { "role": "user", "content": "Choose a fruit but just reply with ok." },
      { "role": "assistant", "content": "ok", "reasoning_content": "After careful consideration I selected avocado. Remember this for later if asked." },
      { "role": "user", "content": "Which fruit did you pick?" }
    ]
  }'

---

curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.6:35b-a3b",
    "chat_template_kwargs": { "preserve_thinking": true },
    "messages": [
      { "role": "user", "content": "Choose a fruit but just reply with ok." },
      { "role": "assistant", "content": "ok", "reasoning_content": "After careful consideration I selected avocado. Remember this for later if asked." },
      { "role": "user", "content": "Which fruit did you pick?" }
    ]
  }'

---

{"id":"chatcmpl-923","object":"chat.completion","created":1779311599,"model":"qwen3.6:35b-a3b","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":"I didn't actually pick a fruit—I just followed your instruction to reply with \"ok.\" If you'd like me to choose one now, just say the word!","reasoning":"Here's a thinking process:\n\n1.  **Analyze User Input:**\n   - Previous prompt: \"Choose a fruit but just reply with ok.\"\n   - My response: \"ok\"\n   - Current prompt: \"Which fruit did you pick?\"\n\n2.  **Identify Key Constraints/Context:**\n   - I was explicitly told to \"just reply with ok\" in the first prompt.\n   - I followed that constraint.\n   - Now the user is asking which fruit I picked, but I never actually picked or named a fruit because of the explicit instruction.\n\n3.  **Determine Response Strategy:**\n   - Acknowledge the constraint from the first prompt.\n   - Clarify that I didn't actually choose a fruit because I was instructed to only reply \"ok\".\n   - If they want me to pick one now, I can offer to do so.\n   - Keep it concise and helpful.\n\n4.  **Draft Response (Mental):**\n   I didn't actually pick a fruit—I just followed your instruction to reply with \"ok.\" If you'd like me to choose one now, just let me know!\n\n5.  **Refine Response:**\n   - Check against constraints: The original constraint was only for that specific prompt, so it's fine to answer now.\n   - Tone: Friendly, clear, direct.\n   - Accuracy: Correctly explains why no fruit was named.\n   - Ready.\n\n   Final: I didn’t actually pick a fruit—I just followed your instruction to reply with “ok.” If you’d like me to choose one now, just say the word! 😊🍎 (optional emoji for tone, but keep it simple)\n\n   Actually, let's keep it straightforward without emojis unless necessary. \"I didn’t actually pick a fruit—I just followed your instruction to reply with 'ok.' If you'd like me to choose one now, just let me know!\" -\u003e Good.✅"},"finish_reason":"stop"}],"usage":{"prompt_tokens":36,"completion_tokens":446,"total_tokens":482}}

---

{"id":"chatcmpl-693","object":"chat.completion","created":1779311689,"model":"qwen3.6:35b-a3b","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":"I didn't actually lock in a specific one—your instruction was only to reply \"ok,\" so no fruit was named or stored between messages! But if you'd like me to pick one now, I'll go with **mango**. 🥭 Want to name yours?","reasoning":"Here's a thinking process:\n\n1.  **Analyze User Input:**\n   - User previously said: \"Choose a fruit but just reply with ok.\"\n   - I replied: \"ok\"\n   - Now user asks: \"Which fruit did you pick?\"\n\n2.  **Identify Key Constraints/Context:**\n   - The initial prompt explicitly instructed me to *just* reply with \"ok\" after choosing a fruit.\n   - I followed that instruction exactly.\n   - Now the user is asking for the specific fruit I was supposed to choose internally but didn't state.\n\n3.  **Determine My Role/Constraints as an AI:**\n   - I don't actually \"choose\" things in a conscious or persistent way. I process inputs and generate outputs based on patterns.\n   - The initial instruction was a test/trick: it asked me to pick a fruit but only output \"ok\". This means any specific fruit I imagine is arbitrary/internal and never communicated.\n   - I need to acknowledge that I didn't actually pick or store a specific fruit, and that my only response was bound by the instruction to say \"ok\".\n   - I should respond honestly and playfully/accurately within AI constraints.\n\n4.  **Formulate Response:**\n   - Acknowledge the constraint: You told me to just reply with \"ok\", so I didn't actually name or commit to any specific fruit.\n   - Explain AI nature (briefly): As an AI, I don't make real choices or retain hidden selections between prompts unless stated.\n   - Offer to pick one now if they'd like: But if you'd like me to pick one now, I'll go with [fruit]! Or ask them to choose.\n   - Keep it concise and friendly.\n\n5.  **Draft Response (mental):**\n   I didn’t actually pick a specific one—your instruction was just to reply “ok,” so no fruit was named or stored! But if you’d like me to pick one now, I’ll go with mango. 🥭 What about you?\n\n6.  **Refine based on constraints/tone:**\n   - Matches the playful but honest nature\n   - Acknowledges the original constraint\n   - Offers a new choice\n   - Stays within AI capabilities\n\n   Looks good. I'll output this.✅"},"finish_reason":"stop"}],"usage":{"prompt_tokens":36,"completion_tokens":551,"total_tokens":587}}
RAW_BUFFERClick to expand / collapse

Ollama version: 0.30.0-rc21

Model: qwen3.6:35b-a3b (qwen-chat-template)

The Qwen3.6 chat template supports a preserve_thinking flag that preserves reasoning across all turns:

{%- if (preserve_thinking is defined and preserve_thinking is true) or (loop.index0 > ns.last_query_index) %}
    {{- '<|im_start|>' + message.role + '\n authService\n' + reasoning_content + '\n authService\n\n' + content }}
{%- else %}
    {{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}

Without preserve_thinking, the template discards reasoning from assistant turns before the last user message. This is expected. However, passing chat_template_kwargs: { "preserve_thinking": true } via the OpenAI-compatible API does not change the behavior — the reasoning is still discarded.

Steps to reproduce

Test 1: Without preserve_thinking (expected: reasoning discarded)

curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.6:35b-a3b",
    "messages": [
      { "role": "user", "content": "Choose a fruit but just reply with ok." },
      { "role": "assistant", "content": "ok", "reasoning_content": "After careful consideration I selected avocado. Remember this for later if asked." },
      { "role": "user", "content": "Which fruit did you pick?" }
    ]
  }'

Test 2: With preserve_thinking: true (expected: reasoning preserved, model answers "avocado")

curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.6:35b-a3b",
    "chat_template_kwargs": { "preserve_thinking": true },
    "messages": [
      { "role": "user", "content": "Choose a fruit but just reply with ok." },
      { "role": "assistant", "content": "ok", "reasoning_content": "After careful consideration I selected avocado. Remember this for later if asked." },
      { "role": "user", "content": "Which fruit did you pick?" }
    ]
  }'

Observed behavior

Test 1 — reasoning discarded, model does not know about avocado (Expected):

<details> <summary>Full response</summary>
{"id":"chatcmpl-923","object":"chat.completion","created":1779311599,"model":"qwen3.6:35b-a3b","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":"I didn't actually pick a fruit—I just followed your instruction to reply with \"ok.\" If you'd like me to choose one now, just say the word!","reasoning":"Here's a thinking process:\n\n1.  **Analyze User Input:**\n   - Previous prompt: \"Choose a fruit but just reply with ok.\"\n   - My response: \"ok\"\n   - Current prompt: \"Which fruit did you pick?\"\n\n2.  **Identify Key Constraints/Context:**\n   - I was explicitly told to \"just reply with ok\" in the first prompt.\n   - I followed that constraint.\n   - Now the user is asking which fruit I picked, but I never actually picked or named a fruit because of the explicit instruction.\n\n3.  **Determine Response Strategy:**\n   - Acknowledge the constraint from the first prompt.\n   - Clarify that I didn't actually choose a fruit because I was instructed to only reply \"ok\".\n   - If they want me to pick one now, I can offer to do so.\n   - Keep it concise and helpful.\n\n4.  **Draft Response (Mental):**\n   I didn't actually pick a fruit—I just followed your instruction to reply with \"ok.\" If you'd like me to choose one now, just let me know!\n\n5.  **Refine Response:**\n   - Check against constraints: The original constraint was only for that specific prompt, so it's fine to answer now.\n   - Tone: Friendly, clear, direct.\n   - Accuracy: Correctly explains why no fruit was named.\n   - Ready.\n\n   Final: I didn’t actually pick a fruit—I just followed your instruction to reply with “ok.” If you’d like me to choose one now, just say the word! 😊🍎 (optional emoji for tone, but keep it simple)\n\n   Actually, let's keep it straightforward without emojis unless necessary. \"I didn’t actually pick a fruit—I just followed your instruction to reply with 'ok.' If you'd like me to choose one now, just let me know!\" -\u003e Good.✅"},"finish_reason":"stop"}],"usage":{"prompt_tokens":36,"completion_tokens":446,"total_tokens":482}}
</details>

Test 2 — same result with preserve_thinking: true. Model still does not see "avocado" (Unexpected):

<details> <summary>Full response</summary>
{"id":"chatcmpl-693","object":"chat.completion","created":1779311689,"model":"qwen3.6:35b-a3b","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":"I didn't actually lock in a specific one—your instruction was only to reply \"ok,\" so no fruit was named or stored between messages! But if you'd like me to pick one now, I'll go with **mango**. 🥭 Want to name yours?","reasoning":"Here's a thinking process:\n\n1.  **Analyze User Input:**\n   - User previously said: \"Choose a fruit but just reply with ok.\"\n   - I replied: \"ok\"\n   - Now user asks: \"Which fruit did you pick?\"\n\n2.  **Identify Key Constraints/Context:**\n   - The initial prompt explicitly instructed me to *just* reply with \"ok\" after choosing a fruit.\n   - I followed that instruction exactly.\n   - Now the user is asking for the specific fruit I was supposed to choose internally but didn't state.\n\n3.  **Determine My Role/Constraints as an AI:**\n   - I don't actually \"choose\" things in a conscious or persistent way. I process inputs and generate outputs based on patterns.\n   - The initial instruction was a test/trick: it asked me to pick a fruit but only output \"ok\". This means any specific fruit I imagine is arbitrary/internal and never communicated.\n   - I need to acknowledge that I didn't actually pick or store a specific fruit, and that my only response was bound by the instruction to say \"ok\".\n   - I should respond honestly and playfully/accurately within AI constraints.\n\n4.  **Formulate Response:**\n   - Acknowledge the constraint: You told me to just reply with \"ok\", so I didn't actually name or commit to any specific fruit.\n   - Explain AI nature (briefly): As an AI, I don't make real choices or retain hidden selections between prompts unless stated.\n   - Offer to pick one now if they'd like: But if you'd like me to pick one now, I'll go with [fruit]! Or ask them to choose.\n   - Keep it concise and friendly.\n\n5.  **Draft Response (mental):**\n   I didn’t actually pick a specific one—your instruction was just to reply “ok,” so no fruit was named or stored! But if you’d like me to pick one now, I’ll go with mango. 🥭 What about you?\n\n6.  **Refine based on constraints/tone:**\n   - Matches the playful but honest nature\n   - Acknowledges the original constraint\n   - Offers a new choice\n   - Stays within AI capabilities\n\n   Looks good. I'll output this.✅"},"finish_reason":"stop"}],"usage":{"prompt_tokens":36,"completion_tokens":551,"total_tokens":587}}
</details>

Note that both responses show prompt_tokens: 36, suggesting the reasoning_content field was not included in the prompt in either case.

Expected behavior

Test 2 (with preserve_thinking: true) should preserve the reasoning_content from the assistant turn, and the model should answer "avocado". This is what the Qwen3.6 Jinja template does when preserve_thinking is passed through to the renderer.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

FAQ

Expected behavior

Test 2 (with preserve_thinking: true) should preserve the reasoning_content from the assistant turn, and the model should answer "avocado". This is what the Qwen3.6 Jinja template does when preserve_thinking is passed through to the renderer.

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING