openclaw - ✅(Solved) Fix [Bug] Moonshot K2.6 multi-turn tool calls fail: reasoning_content stripped during replay [1 pull requests, 1 comments, 2 participants]

GitHub stats
openclaw/openclaw#70392 (fetched 2026-04-23 07:25:24)
Comments: 1
Participants: 2
Timeline: 4
Reactions: 0
Timeline (top): cross-referenced ×2, closed ×1, commented ×1

Error Message

  1. 400 error: reasoning_content is missing in assistant tool call message

Root Cause

K2.6 is a thinking model. When it generates a tool call, the Moonshot API returns reasoning_content alongside the standard content and tool_calls fields. The API is stateless and requires clients to echo back the exact reasoning_content when sending the next turn with the tool result.

OpenClaw strips the reasoning_content field during conversation replay. The OPENAI_COMPATIBLE_REPLAY_HOOKS used by the Moonshot provider handle only sanitizeToolCallIds and applyAssistantFirstOrderingFix; they do not preserve reasoning_content on assistant messages.

While pi-ai's convertMessages() does convert thinking blocks back to reasoning_content via thinkingSignature, this only works if the thinking blocks survive through the replay/sanitize pipeline. For Moonshot models, they appear to be lost before reaching convertMessages().
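
The difference between a lossy and a preserving replay step can be sketched as follows. This is a minimal illustration, not OpenClaw's or pi-ai's code: the field names reasoning_content, content, and tool_calls come from the issue, while the types and the functions replayLossy and replayPreserving are hypothetical.

```typescript
// Hypothetical message shapes for illustration only.
interface SketchToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

interface AssistantTurn {
  role: "assistant";
  content: string;
  tool_calls?: SketchToolCall[];
  reasoning_content?: string;
}

interface ToolTurn {
  role: "tool";
  tool_call_id: string;
  content: string;
}

interface UserTurn {
  role: "user";
  content: string;
}

type ReplayMessage = AssistantTurn | ToolTurn | UserTurn;

// Lossy replay: rebuilds assistant turns from a fixed set of known fields,
// so reasoning_content is silently dropped.
function replayLossy(history: ReplayMessage[]): ReplayMessage[] {
  return history.map((m): ReplayMessage => {
    if (m.role !== "assistant") return m;
    const out: AssistantTurn = { role: "assistant", content: m.content };
    if (m.tool_calls) out.tool_calls = m.tool_calls;
    return out;
  });
}

// Preserving replay: same rebuild, but reasoning_content is copied verbatim.
function replayPreserving(history: ReplayMessage[]): ReplayMessage[] {
  return history.map((m): ReplayMessage => {
    if (m.role !== "assistant") return m;
    const out: AssistantTurn = { role: "assistant", content: m.content };
    if (m.tool_calls) out.tool_calls = m.tool_calls;
    if (m.reasoning_content !== undefined) {
      out.reasoning_content = m.reasoning_content;
    }
    return out;
  });
}

// Illustrative history: one user turn, one assistant tool call, one tool result.
const sampleHistory: ReplayMessage[] = [
  { role: "user", content: "Read README.md" },
  {
    role: "assistant",
    content: "",
    reasoning_content: "I should call the read tool.",
    tool_calls: [
      {
        id: "functions.read:0",
        type: "function",
        function: { name: "read", arguments: '{"path":"README.md"}' },
      },
    ],
  },
  { role: "tool", tool_call_id: "functions.read:0", content: "# Project" },
];
```

With replayLossy, the assistant turn sent back on the next request has no reasoning_content, which is the condition the 400 error reports. With replayPreserving, the field survives unchanged.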

Fix Action

Workaround

Disable thinking mode by sending {"thinking": {"type": "disabled"}} in the request. This makes K2.6 behave as a standard non-thinking model and bypasses the reasoning_content requirement entirely.

Note: OpenClaw currently does not expose a per-model config option to pass this parameter. Previous attempts to set thinking, thinkingLevel, or params.thinking in agents.defaults.models were rejected as unrecognized keys.
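
For callers who build the request themselves, the workaround amounts to adding one field to the request body. The sketch below assumes a chat-completions style body; the thinking field and its value come from the issue, the model ID kimi-k2.6 from the PR notes, and the type ChatRequestSketch and helper withThinkingDisabled are hypothetical. It does not apply to OpenClaw itself, which has no config option for this parameter.

```typescript
// Hypothetical request-body shape; only `thinking` is taken from the issue.
interface ChatRequestSketch {
  model: string;
  messages: { role: string; content: string }[];
  thinking?: { type: "disabled" };
}

// Returns a copy of the body with thinking mode disabled.
function withThinkingDisabled(body: ChatRequestSketch): ChatRequestSketch {
  return { ...body, thinking: { type: "disabled" } };
}

const requestBody = withThinkingDisabled({
  model: "kimi-k2.6",
  messages: [{ role: "user", content: "List the files in this repo." }],
});

// Serialized form that would be sent as the HTTP request body.
const serializedBody = JSON.stringify(requestBody);
```

The helper copies the input rather than mutating it, so the same base body can still be reused for requests that keep thinking enabled.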

PR fix notes

PR #70030: fix(moonshot): preserve native Kimi tool_call IDs in openai-completions replay

Description (problem / solution / changelog)

Summary

  • Problem: Moonshot's bundled OpenAI-compatible replay policy strict-sanitizes tool_call IDs down to [a-zA-Z0-9], which rewrites Kimi K2.6's native IDs (functions.<name>:<index>, e.g., functions.read:0 → functionsread0). Kimi's serving layer then fails to match the mangled IDs back to the original tool definitions in multi-turn history.
  • Why it matters: All multi-turn agentic flows through Kimi K2.6 break after 2–3 tool-calling rounds, with finish_reason: "stop" returned instead of "tool_calls" ~80% of the time.
  • What changed: Added a sanitizeToolCallIds opt-out to the shared openai-compatible replay family helper (buildOpenAICompatibleReplayPolicy + buildProviderReplayFamilyHooks), and wired the Moonshot plugin to opt out. Default behavior for all other openai-compatible providers is unchanged.
  • What did NOT change (scope boundary): Not touching the generic ToolCallIdMode enum or sanitizeToolCallId() implementation. Not touching the kimi-coding plugin (its policy is minimal and runs on anthropic-messages, not openai-completions; the 2026-04-10 comment about it being "also affected" is not reproducible from the code and deserves its own repro before scope expansion). No @mariozechner/pi-ai patches.
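
The shape of the opt-out can be sketched as below. This is a simplified stand-in, not the code in src/plugins/provider-replay-helpers.ts: sketchReplayPolicy and its types are hypothetical, the field names and the "strict" mode come from the PR description, and the value given to applyAssistantFirstOrderingFix is an assumption.

```typescript
type SketchModelApi = "openai-completions" | "openai-responses";

// Hypothetical subset of the replay policy fields named in the PR.
interface ReplayPolicySketch {
  sanitizeToolCallIds?: boolean;
  toolCallIdMode?: "strict";
  applyAssistantFirstOrderingFix?: boolean;
}

interface ReplayPolicyOptionsSketch {
  // Defaults to true, matching the PR's "existing behavior" default.
  sanitizeToolCallIds?: boolean;
}

function sketchReplayPolicy(
  api: SketchModelApi,
  options: ReplayPolicyOptionsSketch = {},
): ReplayPolicySketch {
  const sanitize = options.sanitizeToolCallIds ?? true;
  return {
    // When opted out, both ID-related fields are omitted entirely.
    ...(sanitize
      ? { sanitizeToolCallIds: true, toolCallIdMode: "strict" as const }
      : {}),
    // Assumption: this field is set only for openai-completions.
    ...(api === "openai-completions"
      ? { applyAssistantFirstOrderingFix: true }
      : {}),
  };
}

// Default: other providers keep strict sanitization.
const defaultPolicy = sketchReplayPolicy("openai-completions");
// Opt-out: the shape the Moonshot plugin is described as resolving.
const optedOutPolicy = sketchReplayPolicy("openai-completions", {
  sanitizeToolCallIds: false,
});
```

Omitting the fields, rather than setting sanitizeToolCallIds to false, mirrors the test assertion quoted in the Evidence section, which expects the policy not to have the property at all.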

Change Type (select all)

  • Bug fix
  • Refactor required for the fix

Scope (select all touched areas)

  • API / contracts
  • Integrations

(Additive opt-out on buildProviderReplayFamilyHooks / buildOpenAICompatibleReplayPolicy; Moonshot provider plugin wiring.)

Linked Issue/PR

  • Closes #62319
  • This PR fixes a bug or regression

Root Cause

  • Root cause: Moonshot's plugin entry spread ...OPENAI_COMPATIBLE_REPLAY_HOOKS, which unconditionally returns { sanitizeToolCallIds: true, toolCallIdMode: "strict" }. Kimi K2.6 returns IDs containing . and :, which are not in [a-zA-Z0-9], so strict sanitization drops them. The mangled ID is then sent back in conversation history; Kimi's serving layer can't match it, and emits text instead of a structured tool call.
  • Missing detection / guardrail: No family-level opt-out from ID sanitization existed for openai-compatible transports — the only per-provider dial was to re-implement the whole policy locally (as mistral does with "strict9"). That friction pushed Moonshot toward copying the default.
  • Contributing context (if known): OpenAI's own call_<uuid> and the generic alphanumeric tests pass through sanitization losslessly, so the bug only surfaces for providers whose native IDs contain non-alphanumeric structure. Kimi is the first such case in the bundled set.
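
The effect of strict sanitization on a Kimi-style ID can be reproduced in a few lines. This is a sketch based on the [a-zA-Z0-9] character class quoted above; strictSanitizeSketch and changedBySanitization are hypothetical names, and the real sanitizeToolCallId() may do more than this.

```typescript
// Removes every character outside [a-zA-Z0-9], as the PR describes for "strict" mode.
function strictSanitizeSketch(id: string): string {
  return id.replace(/[^a-zA-Z0-9]/g, "");
}

// Returns the IDs that would no longer match their original form after replay.
function changedBySanitization(ids: string[]): string[] {
  return ids.filter((id) => strictSanitizeSketch(id) !== id);
}

const nativeKimiId = "functions.read:0";
const sanitizedKimiId = strictSanitizeSketch(nativeKimiId); // "functionsread0"

// A purely alphanumeric ID is unaffected; the Kimi ID is rewritten.
const affectedIds = changedBySanitization(["abc123", nativeKimiId]);
```

Because the model originally issued functions.read:0 and the replayed history carries functionsread0, the tool result no longer refers to an ID the serving layer has seen.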

Regression Test Plan

  • Coverage level that should have caught this:
    • Unit test
  • Target test or file:
    • src/plugins/provider-replay-helpers.test.ts — two new cases for buildOpenAICompatibleReplayPolicy(api, { sanitizeToolCallIds: false }) on both openai-completions and openai-responses.
    • src/plugin-sdk/provider-model-shared.test.ts — new case for buildProviderReplayFamilyHooks({ family: "openai-compatible", sanitizeToolCallIds: false }).
    • extensions/moonshot/index.test.ts — flipped the existing plugin-boundary assertion: Moonshot's policy must not carry sanitizeToolCallIds or toolCallIdMode on openai-completions, while keeping applyAssistantFirstOrderingFix/validateGeminiTurns/validateAnthropicTurns.
  • Scenario the test should lock in: with the Moonshot plugin registered against openai-completions, the resolved policy passes through tool_call IDs unchanged so native functions.<name>:<index> IDs survive round-trip.
  • Why this is the smallest reliable guardrail: the bug sits on a pure helper → family hook → plugin entry chain. Unit-level assertions at each of the three seams catch regressions without requiring full transcript-replay integration runs; the existing sanitize-session-history harness already owns the cross-cutting ID-rewriting contract.
  • Existing test that already covers this (if any): src/agents/pi-embedded-runner.openai-tool-id-preservation.test.ts owns the downstream replay behavior but is keyed on toolCallIdMode: "strict". With Moonshot opting out, those paths simply don't apply — no contradiction.

User-visible / Behavior Changes

  • Moonshot / Kimi K2.6 multi-turn tool calling over openai-completions now survives more than ~2 rounds.
  • No behavioral change for any other bundled or third-party provider on the openai-compatible family; the new option defaults to true (existing behavior) and must be explicitly opted out.

Diagram

Before (strict sanitization, openai-compatible family):
  Kimi returns: functions.read:0
    -> strict sanitize [^a-zA-Z0-9] -> functionsread0
    -> sent back as tool_call_id in next turn
    -> Kimi serving layer fails to match, emits finish_reason: "stop"

After (Moonshot opts out of sanitization):
  Kimi returns: functions.read:0
    -> pass-through (no sanitize step)
    -> sent back as tool_call_id in next turn
    -> Kimi serving layer matches; finish_reason: "tool_calls"

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No
  • If any Yes, explain risk + mitigation: N/A.

Repro + Verification

Environment

  • OS: macOS (Darwin 25.3.0, Apple Silicon)
  • Runtime/container: local Node 22 pnpm workspace checkout
  • Model/provider: Moonshot provider plugin with Kimi K2.6 on the openai-completions transport (issue reporter's CanopyWave endpoint; bug is endpoint-independent, governed by OpenClaw's own replay policy)
  • Integration/channel (if any): N/A (replay policy is transport-layer)
  • Relevant config (redacted): N/A

Steps

  1. Register the Moonshot plugin (bundled).
  2. Resolve its replay policy for { modelApi: "openai-completions", modelId: "kimi-k2.6" }.
  3. Confirm the resolved policy carries applyAssistantFirstOrderingFix / validateGeminiTurns / validateAnthropicTurns but does not carry sanitizeToolCallIds or toolCallIdMode.
  4. Confirm other openai-compatible providers (e.g., xai on openai-completions) continue to carry sanitizeToolCallIds: true and toolCallIdMode: "strict".

Expected

  • Moonshot policy passes native tool_call IDs through untouched.
  • All other openai-compatible provider defaults unchanged.

Actual

  • Matches expected. Three test suites lock this in at pure-helper, family-hook, and plugin-entry layers.

Evidence

  • Failing test/log before + passing after

Before the fix (on the new tests):

FAIL  extensions/moonshot/index.test.ts > moonshot provider plugin > owns replay policy for OpenAI-compatible Moonshot transports without mangling native Kimi tool_call IDs
AssertionError: expected { sanitizeToolCallIds: true, …(4) } to not have property "sanitizeToolCallIds"

After the fix:

Test Files  2 passed (2)
Tests       15 passed (15)   (unit-fast)

Test Files  1 passed (1)
Tests       2 passed (2)     (extension-providers)

Issue reporter's independent streaming-API measurements (pre-merge):

ID format                                          Pass rate (5 trials, 66K system prompt)
functionsread0 (current strict)                    1/5 (20%)
functions.read:0 (native, as this PR preserves)    5/5 (100%)
call_abc123def456 (OpenAI style)                   5/5 (100%)

Human Verification (required)

What I personally verified (not just CI), and how:

  • Verified scenarios:
    • Unit: buildOpenAICompatibleReplayPolicy("openai-completions", { sanitizeToolCallIds: false }) omits both sanitizeToolCallIds and toolCallIdMode and keeps the openai-completions-shaped fields.
    • Unit: same for "openai-responses" (which has the non-openai-completions shape).
    • Family hook: buildProviderReplayFamilyHooks({ family: "openai-compatible", sanitizeToolCallIds: false }) threads through to the helper.
    • Plugin entry: Moonshot plugin resolves a policy that matches the opted-out shape.
    • Ran pnpm build — bundled plugin dist emits cleanly, no [INEFFECTIVE_DYNAMIC_IMPORT] warnings.
    • Ran pnpm plugin-sdk:api:check — no baseline drift (the added optional field lives on a module-internal type).
    • Ran pnpm check:changed — typecheck (core + core tests + extensions + extension tests), lint (core + extensions), import cycles, webhook/pairing guards all green.
  • Edge cases checked:
    • Default behavior unchanged when no option passed (existing tests locked in).
    • Opt-out works independently of the openai-completions vs openai-responses branch (both covered by unit tests).
    • Other bundled openai-compatible providers (xAI, etc.) untouched because they continue to spread OPENAI_COMPATIBLE_REPLAY_HOOKS without the opt-out.
  • What I did not verify:
    • Live round-trip against a real Kimi K2.6 endpoint from this checkout. The issue reporter's pre-fix measurements (patched locally) already cover the empirical side; this PR is the structural delivery of that fix.
    • kimi-coding plugin behavior. The 2026-04-10 comment flagging it as "also affected" is not consistent with its code (api: "anthropic-messages", minimal KIMI_REPLAY_POLICY, default sanitizeToolCallIds: false merge). Left for a separate issue/PR with a fresh repro.
  • Unrelated CI signal: pnpm check:changed reports 2 failures in src/agents/tools/web-fetch.provider-fallback.test.ts (SSRF guard rejects DNS that resolves to a private IP in my local network environment). These fail identically on a stashed clean upstream/main, so they are not introduced by this PR. All guards relevant to this diff are green.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes (sanitizeToolCallIds defaults to true; OPENAI_COMPATIBLE_REPLAY_HOOKS unchanged; only Moonshot opts out).
  • Config/env changes? No
  • Migration needed? No
  • If yes, exact upgrade steps: N/A.

Risks and Mitigations

  • Risk: A future bundled openai-compatible provider appears whose ID format does need sanitization and accidentally copies the Moonshot pattern.
    • Mitigation: The opt-out is explicit (sanitizeToolCallIds: false); defaults still sanitize. The inline comment at extensions/moonshot/index.ts documents the rationale so future readers understand the opt-out is Kimi-specific.
  • Risk: A third-party Moonshot-compatible gateway returns tool_call IDs that a different endpoint would reject.
    • Mitigation: OpenAI and Moonshot API docs both accept arbitrary string IDs on tool_call_id; Kimi's native format is already valid wire-level and works across all three formats tested in the issue.

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • extensions/moonshot/index.test.ts (modified, +10/-10)
  • extensions/moonshot/index.ts (modified, +8/-2)
  • extensions/xai/package.json (modified, +2/-1)
  • pnpm-lock.yaml (modified, +3/-0)
  • src/plugin-sdk/provider-model-shared.test.ts (modified, +17/-0)
  • src/plugin-sdk/provider-model-shared.ts (modified, +5/-3)
  • src/plugins/provider-replay-helpers.test.ts (modified, +26/-0)
  • src/plugins/provider-replay-helpers.ts (modified, +6/-2)

Code Example

400 Bad Request: thinking is enabled but reasoning_content is missing in assistant tool call message at index N

Problem

When using moonshot/kimi-k2.6 (or any Moonshot thinking model) with multi-turn tool calls, the second tool-call turn always fails with:

400 Bad Request: thinking is enabled but reasoning_content is missing in assistant tool call message at index N

Root Cause

K2.6 is a thinking model. When it generates a tool call, the Moonshot API returns reasoning_content alongside the standard content and tool_calls fields. The API is stateless and requires clients to echo back the exact reasoning_content when sending the next turn with the tool result.

OpenClaw strips the reasoning_content field during conversation replay. The OPENAI_COMPATIBLE_REPLAY_HOOKS used by the Moonshot provider handle only sanitizeToolCallIds and applyAssistantFirstOrderingFix; they do not preserve reasoning_content on assistant messages.

While pi-ai's convertMessages() does convert thinking blocks back to reasoning_content via thinkingSignature, this only works if the thinking blocks survive through the replay/sanitize pipeline. For Moonshot models, they appear to be lost before reaching convertMessages().

Steps to Reproduce

  1. Configure OpenClaw with moonshot/kimi-k2.6 as primary or fallback model
  2. Start a session and send a message that triggers a tool call
  3. The tool call succeeds (first turn, no replay needed)
  4. Send another message that triggers a second tool call
  5. 400 error: reasoning_content is missing in assistant tool call message

This affects ANY session type (main or subagent). It is not related to history length.
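
A client-side pre-flight check can locate the offending message before the request is sent. The sketch below is a guess at the condition based only on the wording of the error message ("at index N"); it is not Moonshot's validation logic, and OutgoingMessageSketch and findMissingReasoningIndex are hypothetical names. Whether the API accepts an empty reasoning_content string is not stated in the issue, so the check only tests for presence.

```typescript
// Hypothetical outgoing message shape for illustration.
interface OutgoingMessageSketch {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
  tool_calls?: { id: string }[];
  tool_call_id?: string;
  reasoning_content?: string;
}

// Returns the index of the first assistant tool-call message that lacks
// reasoning_content, or -1 when every such message carries it.
function findMissingReasoningIndex(messages: OutgoingMessageSketch[]): number {
  return messages.findIndex(
    (m) =>
      m.role === "assistant" &&
      (m.tool_calls?.length ?? 0) > 0 &&
      typeof m.reasoning_content !== "string",
  );
}

// Second-turn payload after a lossy replay: the assistant tool call at
// index 1 has lost its reasoning_content.
const secondTurnPayload: OutgoingMessageSketch[] = [
  { role: "user", content: "Read README.md" },
  { role: "assistant", content: "", tool_calls: [{ id: "functions.read:0" }] },
  { role: "tool", content: "# Project", tool_call_id: "functions.read:0" },
  { role: "user", content: "Now read package.json" },
];

const missingAt = findMissingReasoningIndex(secondTurnPayload); // 1
```

Running such a check in a debug build would show whether reasoning_content is already missing when the payload leaves the client, which narrows the search to the replay pipeline.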

Environment

  • OpenClaw: 2026.4.21 (f788c88)
  • Model: moonshot/kimi-k2.6
  • Provider: Moonshot AI (direct, not OpenRouter)
  • Auth: moonshot:default profile

Expected Behavior

The reasoning_content field from Moonshot assistant messages should be preserved during replay and sent back to the API on subsequent turns.

Workaround

Disable thinking mode by sending {"thinking": {"type": "disabled"}} in the request. This makes K2.6 behave as a standard non-thinking model and bypasses the reasoning_content requirement entirely.

Note: OpenClaw currently does not expose a per-model config option to pass this parameter. Previous attempts to set thinking, thinkingLevel, or params.thinking in agents.defaults.models were rejected as unrecognized keys.

extent analysis

TL;DR

Disable thinking mode by sending {"thinking": {"type": "disabled"}} in the request to bypass the reasoning_content requirement.

Guidance

  • Identify the root cause: OpenClaw strips the reasoning_content field during conversation replay, causing the Moonshot API to return a 400 error.
  • Verify the issue: Reproduce the error by following the steps outlined in the issue, specifically triggering a second tool call.
  • Mitigate the issue: Disable thinking mode as a temporary workaround, or explore modifying the OPENAI_COMPATIBLE_REPLAY_HOOKS to preserve reasoning_content.
  • Investigate long-term solutions: Consider updating OpenClaw to preserve reasoning_content or adding a per-model config option to pass the thinking parameter.

Example

No code snippet is provided as the issue does not imply a specific code change.

Notes

The provided workaround may not be ideal for all use cases, as it disables thinking mode entirely. A more permanent solution would involve modifying OpenClaw to preserve reasoning_content or adding a config option to pass the thinking parameter.

Recommendation

Apply the workaround by disabling thinking mode, as it is a straightforward and immediate solution to bypass the reasoning_content requirement. However, it is recommended to explore long-term solutions that preserve the functionality of thinking models.
