openclaw - Ollama: think:true forwarded to non-reasoning models causes HTTP 400 (regression of #69902 fix in 2026.4.23)


Summary

PR #69967 (fixes #69902, shipped in 2026.4.23) added top-level think forwarding to native Ollama /api/chat requests, but does not gate the forwarding on the target model's reasoning capability. As a result, any agent configured with thinkingLevel != "off" that routes to a non-reasoning Ollama model (e.g. qwen2.5:72b-instruct, qwen2.5-coder:32b) receives:

HTTP 400 "<model>" does not support thinking

The same release (2026.4.23) added capability gating for thinking on three other surfaces (CHANGELOG entries for OpenAI Responses /think off, Discord /think autocomplete, and provider max reasoning). The Ollama path was simply missed.

Environment

  • openclaw 2026.4.23 (npm-installed)
  • Ollama 0.x serving qwen2.5:72b-instruct (declared reasoning: false in openclaw catalog) and qwen2.5-coder:32b (declared reasoning: false)
  • Tailscale ollama-remote at 100.126.23.34:11434, but the bug is independent of transport — reproducible against any native-ollama provider.

Reproduction

  1. Configure an agent with thinkingDefault: "high" (or any level other than "off") bound to a non-reasoning Ollama model. Minimal openclaw.json excerpt:
{
  "models": {
    "providers": {
      "ollama-remote": {
        "baseUrl": "http://localhost:11434",
        "api": "ollama",
        "models": [
          {
            "id": "qwen2.5:72b-instruct",
            "name": "Qwen2.5 72B Instruct",
            "reasoning": false,
            "input": ["text"],
            "api": "ollama"
          }
        ]
      }
    }
  },
  "agents": {
    "list": [
      {
        "id": "repro-agent",
        "model": {
          "primary": "ollama-remote/qwen2.5:72b-instruct"
        },
        "thinkingDefault": "high"
      }
    ]
  }
}
  2. Route any chat through repro-agent. The gateway constructs the outbound /api/chat body with think: true, and Ollama rejects it with HTTP 400.

  3. Direct wire-level reproduction (bypasses openclaw and demonstrates the underlying Ollama behavior the gateway is incompatible with):

curl -sS http://localhost:11434/api/chat \
  -d '{"model":"qwen2.5:72b-instruct","messages":[{"role":"user","content":"ok"}],"stream":false,"think":true}'
# → {"error":"\"qwen2.5:72b-instruct\" does not support thinking"}

curl -sS http://localhost:11434/api/chat \
  -d '{"model":"qwen2.5:72b-instruct","messages":[{"role":"user","content":"ok"}],"stream":false}'
# → 200, {"message":{"role":"assistant","content":"ok"},...}

Root cause

In dist/stream-D2e1fsSh.js, createConfiguredOllamaCompatStreamWrapper:

if (isNativeOllamaTransport && ctx.thinkingLevel === "off")
  streamFn = createOllamaThinkingWrapper(streamFn, false);
else if (isNativeOllamaTransport && ctx.thinkingLevel)
  streamFn = createOllamaThinkingWrapper(streamFn, true);   // ← unconditional

The wrapper injects think: true into the payload whenever thinkingLevel is set to anything other than "off", without consulting model.reasoning. Compare this with the gating already in place for OpenAI Responses and Discord /think (added in the same 2026.4.23 release), which do consult per-model reasoning capability.

Proposed fix

One-line change at the affected else if:

-else if (isNativeOllamaTransport && ctx.thinkingLevel)
+else if (isNativeOllamaTransport && ctx.thinkingLevel && model?.reasoning !== false)
   streamFn = createOllamaThinkingWrapper(streamFn, true);

The predicate is !== false rather than truthy: only suppress when the catalog positively asserts non-reasoning. Catalog entries that omit the reasoning field (auto-discovered models, future additions) retain the existing default-on behavior; only models explicitly flagged reasoning: false get the think suppression.

The === "off" branch (which injects think: false) is unchanged — explicit opt-out by agent still propagates correctly. The Moonshot/Kimi cloud thinking wrapper at the same site is also unchanged — it operates on kimi-k...:cloud models via a separate path.
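
The resulting decision table can be sketched as a small pure function. This is a hypothetical model of the branch logic only, not openclaw's actual code: the name resolveOllamaThink and its return convention (undefined meaning "leave think out of the payload") are assumptions made for illustration, and the Moonshot/Kimi path is not modeled.

```javascript
// Hypothetical helper (not from openclaw): models the branch with the proposed gate.
// Returns true/false for the value of `think`, or undefined to leave the field out.
function resolveOllamaThink({ isNativeOllamaTransport, thinkingLevel, model }) {
  if (!isNativeOllamaTransport) return undefined; // other transports are not affected
  if (thinkingLevel === "off") return false;      // explicit opt-out still sends think: false
  if (thinkingLevel && model?.reasoning !== false) return true; // gated opt-in
  return undefined;                               // catalog says reasoning: false, so omit think
}

const base = { isNativeOllamaTransport: true, thinkingLevel: "high" };
console.log(resolveOllamaThink({ ...base, model: { reasoning: false } })); // undefined
console.log(resolveOllamaThink({ ...base, model: { reasoning: true } }));  // true
console.log(resolveOllamaThink({ ...base, model: {} }));                   // true (field omitted in catalog)
console.log(resolveOllamaThink({ ...base, thinkingLevel: "off", model: { reasoning: false } })); // false
```

The third case is the reason for comparing against false rather than testing truthiness: a catalog entry without a reasoning field keeps today's behavior.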

Impact severity

Low to medium for any single agent, but the blast radius is wider than expected because 2026.4.23 also raised the implicit default thinking level from off to medium for reasoning-capable models (see the release-note item under Related). Operators who upgraded to 2026.4.23 with non-reasoning Ollama fallbacks will see those fallback chains break the first time something fails over, with nothing signaling the problem beforehand. The qwen2.5 family is the typical case; qwen3 added thinking and is the model the original #69902 fix targeted.

Workaround: explicitly set thinkingDefault: "off" on every agent bound to a non-reasoning Ollama model. This works, but it is applied per agent binding rather than derived from per-model capability, so it is easy to miss when new agents or models are added to the config.
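
Until a fix lands, one way to make the workaround less fragile is to audit the config for bindings that still need it. The sketch below is a hypothetical helper, not part of openclaw: findRiskyAgents is an invented name, and the config shape is assumed from the openclaw.json excerpt in this issue. It only inspects model.primary, because the field used for fallback chains is not shown here.

```javascript
// Hypothetical audit helper (not part of openclaw). Config shape assumed from
// the openclaw.json excerpt in this issue.
function findRiskyAgents(config) {
  // Collect "provider/model" keys for Ollama models the catalog marks reasoning: false.
  const nonReasoning = new Set();
  for (const [providerId, provider] of Object.entries(config?.models?.providers ?? {})) {
    for (const m of provider.models ?? []) {
      const api = m.api ?? provider.api;
      if (api === "ollama" && m.reasoning === false) nonReasoning.add(`${providerId}/${m.id}`);
    }
  }
  // Flag agents bound to such a model unless thinking is explicitly "off".
  // Agents with no thinkingDefault are flagged too, as a conservative choice.
  return (config?.agents?.list ?? [])
    .filter((a) => nonReasoning.has(a.model?.primary) && a.thinkingDefault !== "off")
    .map((a) => a.id);
}

const config = {
  models: { providers: { "ollama-remote": { api: "ollama", models: [
    { id: "qwen2.5:72b-instruct", reasoning: false, api: "ollama" },
  ] } } },
  agents: { list: [
    { id: "repro-agent", model: { primary: "ollama-remote/qwen2.5:72b-instruct" }, thinkingDefault: "high" },
    { id: "safe-agent", model: { primary: "ollama-remote/qwen2.5:72b-instruct" }, thinkingDefault: "off" },
  ] },
};
console.log(JSON.stringify(findRiskyAgents(config))); // ["repro-agent"]
```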

Related

  • #69902 (original issue, qwen3 thinking idle until watchdog)
  • #69967 (the fix that introduced the regression for non-reasoning models)
  • CHANGELOG 2026.4.23: "Thinking defaults/status: raise the implicit default thinking level for reasoning-capable models from legacy off/low fallback behavior to a safe provider-supported medium equivalent…" (related; this is what made the bug noticeable in normal operation)
  • CHANGELOG 2026.4.23: capability-gated thinking landed for OpenAI Responses, Discord /think autocomplete, and provider max reasoning in the same release — the proposed fix is consistent with that established pattern.

Notes

Happy to open a PR with the one-line fix + a unit test against a mock provider with reasoning: false if it would help land this faster.
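
As a rough idea of what such a test could look like, here is a framework-free sketch. Everything in it is a stand-in: wrapWithThink and configureStream only imitate the wrapper selection described above (the real createOllamaThinkingWrapper signature and payload shape are not shown in this issue), and createMockOllama imitates the observed Ollama behavior of rejecting think: true for a model without thinking support.

```javascript
// Stand-ins only; none of these names exist in openclaw.
function wrapWithThink(streamFn, think) {
  return (payload) => streamFn({ ...payload, think });
}

// Imitates the wrapper selection with the proposed gate applied.
function configureStream(streamFn, { thinkingLevel, model }) {
  if (thinkingLevel === "off") return wrapWithThink(streamFn, false);
  if (thinkingLevel && model?.reasoning !== false) return wrapWithThink(streamFn, true);
  return streamFn;
}

// Mock provider: records each payload and rejects think: true for non-reasoning models.
function createMockOllama(model) {
  const seen = [];
  const streamFn = (payload) => {
    seen.push(payload);
    return payload.think === true && model.reasoning === false
      ? { status: 400, error: `"${model.id}" does not support thinking` }
      : { status: 200 };
  };
  return { seen, streamFn };
}

function expectEqual(actual, expected, label) {
  if (actual !== expected) throw new Error(`${label}: expected ${expected}, got ${actual}`);
}

function runGateTests() {
  const model = { id: "qwen2.5:72b-instruct", reasoning: false };
  const request = { model: model.id, messages: [{ role: "user", content: "ok" }] };

  // thinkingLevel "high" on a reasoning: false model must not send think at all.
  const high = createMockOllama(model);
  const res = configureStream(high.streamFn, { thinkingLevel: "high", model })(request);
  expectEqual(res.status, 200, "status for gated request");
  expectEqual("think" in high.seen[0], false, "think present in payload");

  // The explicit opt-out still propagates as think: false.
  const off = createMockOllama(model);
  configureStream(off.streamFn, { thinkingLevel: "off", model })(request);
  expectEqual(off.seen[0].think, false, "think value for off");

  return "ok";
}

console.log(runGateTests());
```

A real test would exercise createConfiguredOllamaCompatStreamWrapper directly instead of configureStream; the assertions on the recorded payload would stay the same.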
