openclaw - 💡(How to fix) Fix Ollama provider: missing response-level reasoning stripper for Kimi models causes inline reasoning leak to chat

openclaw · 2026-05-24 16:54:31

When using ollama/kimi-k2.6:cloud (and likely kimi-k2.5:cloud) with Think: off, the model's inline reasoning text leaks into the visible chat output. The gateway correctly sends think: false (native Ollama) and thinking: { type: "disabled" } (Moonshot wrapper) on outgoing requests, but the model still emits reasoning text inline — separated from the actual response by a boundary delimiter. The Ollama provider has no response-level stripper for this inline reasoning, unlike the opencode-go provider which has stripOpencodeGoKimiReasoningPayload.

Environment

OpenClaw version: 2026.5.22 (a374c3a)
Provider: ollama
Model: ollama/kimi-k2.6:cloud (also observed with kimi-k2.5:cloud)
Runtime: Think: off
OS: Windows 10.0.26200 (x64)
Ollama base URL: http://192.168.1.72:11434

Reproduction

Config (relevant excerpt)

{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://192.168.1.72:11434",
        api: "ollama",
        models: [
          {
            id: "kimi-k2.6:cloud",
            name: "kimi-k2.6:cloud",
            reasoning: false,
            params: {
              num_ctx: 262144
            }
          }
        ]
      }
    }
  },
  agents: {
    defaults: {
      model: { primary: "ollama/kimi-k2.6:cloud" }
    }
  }
}

Steps

1. Set the primary model to ollama/kimi-k2.6:cloud
2. Ensure thinking / Think is off
3. Send any message that triggers the agent
4. Observe the assistant response in chat
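
To tell a gateway problem apart from a model problem, it may help to repeat the request against Ollama directly, bypassing OpenClaw. The following is a minimal sketch, assuming Ollama's native /api/chat endpoint with its think request field and an optional message.thinking field in the response. The helper names (buildNoThinkRequest, probeOllama, HttpSend) are hypothetical and not part of OpenClaw or Ollama.

```typescript
// Hypothetical helpers for a direct probe; only the /api/chat endpoint and
// the think / message.thinking fields are assumed to come from Ollama itself.
interface OllamaChatRequest {
  model: string;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
  think: boolean;
  stream: boolean;
}

interface OllamaMessage {
  content: string;
  thinking?: string;
}

// Minimal HTTP shape so the global fetch (or a test double) can be injected.
type HttpSend = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Build a non-streaming chat request with thinking explicitly disabled.
function buildNoThinkRequest(model: string, prompt: string): OllamaChatRequest {
  return {
    model,
    messages: [{ role: "user", content: prompt }],
    think: false,
    stream: false,
  };
}

// Send the request straight to Ollama and return the raw assistant message.
async function probeOllama(
  send: HttpSend,
  baseUrl: string,
  request: OllamaChatRequest,
): Promise<OllamaMessage> {
  const response = await send(`${baseUrl}/api/chat`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(request),
  });
  if (!response.ok) {
    throw new Error(`Ollama returned HTTP ${response.status}`);
  }
  const data = (await response.json()) as { message: OllamaMessage };
  return data.message;
}
```

Calling probeOllama(fetch, "http://192.168.1.72:11434", buildNoThinkRequest("kimi-k2.6:cloud", "What projects have we worked on?")) and inspecting content should show whether the monologue is already present in Ollama's own output. If it is, the leak originates upstream of the gateway and only a response-side filter can hide it.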

Actual behavior

The visible response contains the model's internal reasoning monologue, followed by a boundary delimiter, then the actual response. Example from session history:

"The user is asking what projects we've worked on so far. Based on my memory files...

...Let me provide a clear summary. ️ So far we've got **3 active projects** tracked:"

The text before the boundary delimiter is raw reasoning that should be internal only.

Expected behavior

Only the text after the reasoning delimiter should be visible to the user. The reasoning block should be stripped, discarded, or stored as internal metadata — never rendered as chat content.

Root cause analysis

1. Request side is correct: createConfiguredOllamaCompatStreamWrapper applies both:
   - createOllamaThinkingWrapper(..., false) → sets think: false on the native Ollama payload
   - createMoonshotThinkingWrapper(..., "disabled") → sets thinking: { type: "disabled" }
2. Model ignores the disable signal: kimi-k2.6 still outputs reasoning inline, likely because the Ollama API passthrough doesn't propagate the disable parameter correctly to the underlying model, or because the model emits reasoning regardless.
3. Missing response stripper: the opencode-go provider has stripOpencodeGoKimiReasoningPayload, which:
   - Deletes the reasoning, reasoning_details, reasoning_content, and reasoning_text fields
   - Filters out type: "thinking" / type: "reasoning" content parts from messages
   - Replaces stripped content with [assistant reasoning omitted]
   The ollama provider has no equivalent response sanitizer for Kimi models.
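
For orientation, the three behaviours attributed to stripOpencodeGoKimiReasoningPayload can be sketched roughly as follows. This is not the actual OpenClaw implementation: the function name, the message types, and the choice to substitute the placeholder only when filtering leaves no content parts are all assumptions made for illustration.

```typescript
// Illustrative only: the names and types below are hypothetical, not OpenClaw's.
const REASONING_FIELDS = [
  "reasoning",
  "reasoning_details",
  "reasoning_content",
  "reasoning_text",
] as const;
const REASONING_PART_TYPES = new Set(["thinking", "reasoning"]);
const OMITTED_PLACEHOLDER = "[assistant reasoning omitted]";

interface ContentPart {
  type: string;
  text?: string;
  [key: string]: unknown;
}

interface ChatMessage {
  role: string;
  content: string | ContentPart[];
  [key: string]: unknown;
}

// Return a copy of the message with structured reasoning removed.
function stripReasoningFromMessage(message: ChatMessage): ChatMessage {
  const cleaned: ChatMessage = { ...message };

  // 1. Drop top-level reasoning fields.
  for (const field of REASONING_FIELDS) {
    delete cleaned[field];
  }

  // 2. Drop thinking / reasoning content parts.
  if (Array.isArray(cleaned.content)) {
    const kept = cleaned.content.filter(
      (part) => !REASONING_PART_TYPES.has(part.type),
    );
    // 3. Assumption: use the placeholder when nothing else remains.
    cleaned.content = kept.length > 0 ? kept : OMITTED_PLACEHOLDER;
  }
  return cleaned;
}
```

A sanitizer of this shape only catches reasoning that arrives in dedicated fields or typed content parts. The leak described here sits inside the plain text of the assistant message, so an Ollama equivalent would also need text-level handling.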

Cross-references

#81988 — opencode-go/kimi-k2.6: reasoning field leaks through passthrough replay policy (same family, different provider)
#83812 — opencode-go/kimi-k2.6 sends unsupported reasoning_details in replayed messages (request-side fix for opencode-go)
#6470 — Discord: reasoning content posted as regular messages (general reasoning leak class)
ollama/ollama#10456 — Ollama-level discussion on disabling thinking mode

Suggested fix

Add a Kimi-specific response sanitizer in the Ollama provider, analogous to what exists for opencode-go:

Option A (provider-level): In createConfiguredOllamaCompatStreamWrapper, when isOllamaCloudKimiModelRef(modelId) is true, wrap the stream with a response interceptor that strips inline reasoning text from assistant message content before it reaches the user.
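
One way Option A could look is a wrapper over the text-chunk stream that withholds output until the delimiter has passed. The sketch below assumes the stream can be viewed as an AsyncIterable of strings and takes the delimiter as a parameter, since the exact marker is not pinned down here. The function name stripLeadingReasoning is hypothetical.

```typescript
// Hypothetical stream interceptor: buffers text until the delimiter appears,
// then forwards only what follows it.
async function* stripLeadingReasoning(
  chunks: AsyncIterable<string>,
  delimiter: string,
): AsyncGenerator<string> {
  let buffer = "";
  let passedDelimiter = false;

  for await (const chunk of chunks) {
    if (passedDelimiter) {
      // Reasoning is already behind us: forward chunks unchanged.
      yield chunk;
      continue;
    }
    // Accumulate so a delimiter split across chunks is still found.
    buffer += chunk;
    const index = buffer.indexOf(delimiter);
    if (index !== -1) {
      passedDelimiter = true;
      const visible = buffer.slice(index + delimiter.length).trimStart();
      buffer = "";
      if (visible.length > 0) {
        yield visible;
      }
    }
  }

  // Assumption: a stream that never contained the delimiter had no
  // reasoning, so the buffered text is released as-is at the end.
  if (!passedDelimiter && buffer.length > 0) {
    yield buffer;
  }
}
```

The trade-off in this design is latency: nothing reaches the user until the delimiter arrives, and a reply with no reasoning at all is held back until the stream ends. A real implementation might cap the buffer or apply the wrapper only when isOllamaCloudKimiModelRef(modelId) is true.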

Option B (gateway-level): Add a general stripInlineReasoningFromAssistantText utility in the message processing pipeline that recognizes the boundary delimiter (or an equivalent marker) and splits off or omits the reasoning portion.
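
A whole-message version of Option B could be as small as the sketch below. The function name follows the proposal, but the signature, the SplitResult type, and the decision to split at the first occurrence of the delimiter are assumptions. The delimiter is passed in rather than hard-coded.

```typescript
// Hypothetical result shape: the reasoning is returned rather than thrown
// away so the caller can discard it or keep it as internal metadata.
interface SplitResult {
  reasoning: string | null;
  visible: string;
}

// Split a complete assistant message at the first delimiter occurrence.
function stripInlineReasoningFromAssistantText(
  text: string,
  delimiter: string,
): SplitResult {
  // An empty delimiter would match at index 0, so treat it as "not found".
  const index = delimiter.length > 0 ? text.indexOf(delimiter) : -1;
  if (index === -1) {
    return { reasoning: null, visible: text };
  }
  return {
    reasoning: text.slice(0, index).trim(),
    visible: text.slice(index + delimiter.length).trim(),
  };
}
```

Messages without the delimiter pass through unchanged, which keeps the utility safe to run on output from models that never emit inline reasoning.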

Option C (model registry): Mark kimi-k2.6:cloud and kimi-k2.5:cloud under the Ollama provider as reasoning: true with a reasoningOutputMode: "inline" so the gateway knows to apply stripping regardless of what the model claims.
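
As a rough illustration of Option C, the registry entries and the check the gateway would run against them might look like this. The reasoningOutputMode field exists only in this proposal. The ModelEntry type, the "separate" mode value, and the needsInlineReasoningStrip helper are hypothetical.

```typescript
// Hypothetical registry types; only the "inline" value is named in the proposal.
type ReasoningOutputMode = "separate" | "inline";

interface ModelEntry {
  id: string;
  name: string;
  reasoning: boolean;
  reasoningOutputMode?: ReasoningOutputMode;
  params?: Record<string, number>;
}

// Entries for the two affected models under the Ollama provider.
const kimiEntries: ModelEntry[] = [
  {
    id: "kimi-k2.6:cloud",
    name: "kimi-k2.6:cloud",
    reasoning: true,
    reasoningOutputMode: "inline",
    params: { num_ctx: 262144 },
  },
  {
    id: "kimi-k2.5:cloud",
    name: "kimi-k2.5:cloud",
    reasoning: true,
    reasoningOutputMode: "inline",
  },
];

// The gateway would strip inline reasoning whenever the registry says so,
// independent of the per-request thinking setting.
function needsInlineReasoningStrip(entry: ModelEntry): boolean {
  return entry.reasoning && entry.reasoningOutputMode === "inline";
}
```

Keeping the decision in the registry means the stripping logic does not have to pattern-match on model names, at the cost of one more field to maintain per model.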

Workarounds

Switch to a non-reasoning model (e.g., llama3.2:3b, gemma4)
Modify the model's Ollama Modelfile to inject a system prompt forbidding reasoning output

Impact

Severity: Medium-High — leaks internal decision-making and planning to user-visible chat
Affected channels: All (webchat, Discord, Telegram, etc.)
Frequency: Every multi-turn assistant message with ollama/kimi-k2.6:cloud
