openclaw - ✅(Solved) Fix [Bug] Moonshot K2.6 multi-turn tool calls fail: reasoning_content stripped during replay [1 pull requests, 1 comments, 2 participants]

RoseKongPS · 2026-04-22T23:49:31Z



GitHub stats

Issue: openclaw/openclaw#70392 (fetched 2026-04-23 07:25:24)
Author: RoseKongPS
Participants: RoseKongPS, steipete
Timeline (top): cross-referenced ×2, closed ×1, commented ×1

Error Message

400 error: reasoning_content is missing in assistant tool call message

Root Cause

K2.6 is a thinking model. When it generates a tool call, the Moonshot API returns reasoning_content alongside the standard content and tool_calls fields. The API is stateless and requires clients to echo back the exact reasoning_content when sending the next turn with the tool result.

OpenClaw strips the reasoning_content field during conversation replay. The OPENAI_COMPATIBLE_REPLAY_HOOKS used by the Moonshot provider only handle sanitizeToolCallIds and applyAssistantFirstOrderingFix; they do not preserve reasoning_content on assistant messages.

While pi-ai's convertMessages() does convert thinking blocks back to reasoning_content via thinkingSignature, this only works if the thinking blocks survive through the replay/sanitize pipeline. For Moonshot models, they appear to be lost before reaching convertMessages().
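To make the failure condition concrete, here is a minimal TypeScript sketch of the check the API appears to apply, inferred only from the error text above. The ChatMessage type and the findMissingReasoning helper are hypothetical names used for illustration; they are not OpenClaw, pi-ai, or Moonshot code, and treating an empty string as missing is an assumption.

```typescript
// Hypothetical message shape; only the fields discussed in this issue.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string | null;
  reasoning_content?: string;
  tool_calls?: ToolCall[];
  tool_call_id?: string;
}

// Returns the indices of assistant tool-call messages that lack
// reasoning_content, i.e. the messages the 400 error would point at.
// Assumption: an empty string counts as missing.
function findMissingReasoning(history: ChatMessage[]): number[] {
  const missing: number[] = [];
  history.forEach((msg, index) => {
    const isToolCallTurn =
      msg.role === "assistant" && (msg.tool_calls?.length ?? 0) > 0;
    if (isToolCallTurn && !msg.reasoning_content) {
      missing.push(index);
    }
  });
  return missing;
}

// A replayed history in which the assistant turn lost its reasoning_content.
const strippedHistory: ChatMessage[] = [
  { role: "user", content: "Read config.json" },
  {
    role: "assistant",
    content: null,
    tool_calls: [
      { id: "functions.read:0", type: "function", function: { name: "read", arguments: "{}" } },
    ],
  },
  { role: "tool", content: "{}", tool_call_id: "functions.read:0" },
];

const missingIndices = findMissingReasoning(strippedHistory);
```

Running a check like this on the outgoing message list before the request is sent would show whether the thinking blocks were lost in the replay pipeline or later in conversion.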

Fix Action

Workaround

Disable thinking mode by sending {"thinking": {"type": "disabled"}} in the request. This makes K2.6 behave as a standard non-thinking model and bypasses the reasoning_content requirement entirely.

Note: OpenClaw currently does not expose a per-model config option to pass this parameter. Previous attempts to set thinking, thinkingLevel, or params.thinking in agents.defaults.models were rejected as unrecognized keys.
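For callers that build the request body themselves (the note above means this cannot currently be done through OpenClaw configuration), the workaround amounts to adding one field. The sketch below is illustrative only: the ChatRequestBody type and the withThinkingDisabled helper are hypothetical, and only the thinking field and its value come from the workaround text.

```typescript
// Hypothetical request-body shape; only `thinking` comes from the workaround.
interface ChatRequestBody {
  model: string;
  messages: { role: string; content: string }[];
  thinking?: { type: "disabled" };
}

// Returns a copy of the body with thinking mode switched off,
// leaving the caller's object unmodified.
function withThinkingDisabled(body: ChatRequestBody): ChatRequestBody {
  return { ...body, thinking: { type: "disabled" } };
}

const requestBody = withThinkingDisabled({
  model: "kimi-k2.6",
  messages: [{ role: "user", content: "List the files in this directory." }],
});

// JSON payload as it would be sent on the wire.
const serialized = JSON.stringify(requestBody);
```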

PR fix notes

PR #70030: fix(moonshot): preserve native Kimi tool_call IDs in openai-completions replay

Repository: openclaw/openclaw
Author: LeoDu0314
State: closed | merged: True
Link: https://github.com/openclaw/openclaw/pull/70030

Description (problem / solution / changelog)

Summary

Problem: Moonshot's bundled OpenAI-compatible replay policy strict-sanitizes tool_call IDs down to [a-zA-Z0-9], which rewrites Kimi K2.6's native IDs (functions.<name>:<index>, e.g., functions.read:0 → functionsread0). Kimi's serving layer then fails to match the mangled IDs back to the original tool definitions in multi-turn history.
Why it matters: All multi-turn agentic flows through Kimi K2.6 break after 2–3 tool-calling rounds, with finish_reason: "stop" returned instead of "tool_calls" ~80% of the time.
What changed: Added a sanitizeToolCallIds opt-out to the shared openai-compatible replay family helper (buildOpenAICompatibleReplayPolicy + buildProviderReplayFamilyHooks), and wired the Moonshot plugin to opt out. Default behavior for all other openai-compatible providers is unchanged.
What did NOT change (scope boundary): Not touching the generic ToolCallIdMode enum or sanitizeToolCallId() implementation. Not touching the kimi-coding plugin (its policy is minimal and runs on anthropic-messages, not openai-completions; the 2026-04-10 comment about it being "also affected" is not reproducible from the code and deserves its own repro before scope expansion). No @mariozechner/pi-ai patches.
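The shape of the opt-out can be sketched as follows. This is a simplified stand-in, not the real buildOpenAICompatibleReplayPolicy: the function name, the option and policy types, and the value given to applyAssistantFirstOrderingFix are assumptions made for illustration. What it takes from the description is the behavior: the option defaults to true, and on opt-out both ID fields are omitted rather than set to false.

```typescript
// Hypothetical stand-in for the helper described above; the real
// signature and return type may differ.
type ToolCallIdModeSketch = "strict" | "strict9";

interface ReplayPolicySketch {
  sanitizeToolCallIds?: boolean;
  toolCallIdMode?: ToolCallIdModeSketch;
  applyAssistantFirstOrderingFix: boolean; // value below is an assumption
}

interface ReplayPolicyOptions {
  // Defaults to true so existing providers keep sanitizing.
  sanitizeToolCallIds?: boolean;
}

function buildReplayPolicySketch(
  options: ReplayPolicyOptions = {}
): ReplayPolicySketch {
  const sanitize = options.sanitizeToolCallIds ?? true;
  return {
    // On opt-out both ID fields are left out entirely.
    ...(sanitize
      ? { sanitizeToolCallIds: true, toolCallIdMode: "strict" as const }
      : {}),
    applyAssistantFirstOrderingFix: true,
  };
}

// Default: unchanged behavior for other openai-compatible providers.
const defaultPolicy = buildReplayPolicySketch();

// Moonshot-style opt-out: native IDs pass through untouched.
const moonshotPolicy = buildReplayPolicySketch({ sanitizeToolCallIds: false });
```

Omitting the fields instead of writing false keeps the opted-out policy indistinguishable from one that never knew about sanitization, which is what the flipped plugin-boundary assertion checks.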

Change Type (select all)

Bug fix
Refactor required for the fix

Scope (select all touched areas)

API / contracts
Integrations

(Additive opt-out on buildProviderReplayFamilyHooks / buildOpenAICompatibleReplayPolicy; Moonshot provider plugin wiring.)

Linked Issue/PR

Closes #62319
This PR fixes a bug or regression

Root Cause

Root cause: Moonshot's plugin entry spread ...OPENAI_COMPATIBLE_REPLAY_HOOKS, which unconditionally returns { sanitizeToolCallIds: true, toolCallIdMode: "strict" }. Kimi K2.6 returns IDs containing . and :, which are not in [a-zA-Z0-9], so strict sanitization drops them. The mangled ID is then sent back in conversation history; Kimi's serving layer can't match it, and emits text instead of a structured tool call.
Missing detection / guardrail: No family-level opt-out from ID sanitization existed for openai-compatible transports — the only per-provider dial was to re-implement the whole policy locally (as mistral does with "strict9"). That friction pushed Moonshot toward copying the default.
Contributing context (if known): OpenAI's own call_<uuid> and the generic alphanumeric tests pass through sanitization losslessly, so the bug only surfaces for providers whose native IDs contain non-alphanumeric structure. Kimi is the first such case in the bundled set.

Regression Test Plan

Coverage level that should have caught this:
- Unit test
Target test or file:
- src/plugins/provider-replay-helpers.test.ts — two new cases for buildOpenAICompatibleReplayPolicy(api, { sanitizeToolCallIds: false }) on both openai-completions and openai-responses.
- src/plugin-sdk/provider-model-shared.test.ts — new case for buildProviderReplayFamilyHooks({ family: "openai-compatible", sanitizeToolCallIds: false }).
- extensions/moonshot/index.test.ts — flipped the existing plugin-boundary assertion: Moonshot's policy must not carry sanitizeToolCallIds or toolCallIdMode on openai-completions, while keeping applyAssistantFirstOrderingFix/validateGeminiTurns/validateAnthropicTurns.
Scenario the test should lock in: with the Moonshot plugin registered against openai-completions, the resolved policy passes through tool_call IDs unchanged so native functions.<name>:<index> IDs survive round-trip.
Why this is the smallest reliable guardrail: the bug sits on a pure helper → family hook → plugin entry chain. Unit-level assertions at each of the three seams catch regressions without requiring full transcript-replay integration runs; the existing sanitize-session-history harness already owns the cross-cutting ID-rewriting contract.
Existing test that already covers this (if any): src/agents/pi-embedded-runner.openai-tool-id-preservation.test.ts owns the downstream replay behavior but is keyed on toolCallIdMode: "strict". With Moonshot opting out, those paths simply don't apply — no contradiction.

User-visible / Behavior Changes

Moonshot / Kimi K2.6 multi-turn tool calling over openai-completions now survives more than ~2 rounds.
No behavioral change for any other bundled or third-party provider on the openai-compatible family; the new option defaults to true (existing behavior) and must be explicitly opted out.

Diagram

Before (strict sanitization, openai-compatible family):
  Kimi returns: functions.read:0
    -> strict sanitize [^a-zA-Z0-9] -> functionsread0
    -> sent back as tool_call_id in next turn
    -> Kimi serving layer fails to match, emits finish_reason: "stop"

After (Moonshot opts out of sanitization):
  Kimi returns: functions.read:0
    -> pass-through (no sanitize step)
    -> sent back as tool_call_id in next turn
    -> Kimi serving layer matches; finish_reason: "tool_calls"
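The "before" path in the diagram can be reproduced with a one-line approximation of strict sanitization. The function below is an assumed equivalent written for illustration (drop every character outside [a-zA-Z0-9]); it is not the actual sanitizeToolCallId() implementation.

```typescript
// Assumed approximation of strict sanitization:
// remove every character outside [a-zA-Z0-9].
function strictSanitizeSketch(id: string): string {
  return id.replace(/[^a-zA-Z0-9]/g, "");
}

const nativeId = "functions.read:0";
const mangledId = strictSanitizeSketch(nativeId); // "functionsread0"

// The ID sent back in history no longer equals the ID the model issued.
const survivesRoundTrip = mangledId === nativeId; // false
```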

Security Impact (required)

New permissions/capabilities? No
Secrets/tokens handling changed? No
New/changed network calls? No
Command/tool execution surface changed? No
Data access scope changed? No
If any Yes, explain risk + mitigation: N/A.

Repro + Verification

Environment

OS: macOS (Darwin 25.3.0, Apple Silicon)
Runtime/container: local Node 22 pnpm workspace checkout
Model/provider: Moonshot provider plugin with Kimi K2.6 on the openai-completions transport (issue reporter's CanopyWave endpoint; bug is endpoint-independent, governed by OpenClaw's own replay policy)
Integration/channel (if any): N/A (replay policy is transport-layer)
Relevant config (redacted): N/A

Steps

Register the Moonshot plugin (bundled).
Resolve its replay policy for { modelApi: "openai-completions", modelId: "kimi-k2.6" }.
Confirm the resolved policy carries applyAssistantFirstOrderingFix / validateGeminiTurns / validateAnthropicTurns but does not carry sanitizeToolCallIds or toolCallIdMode.
Confirm other openai-compatible providers (e.g., xai on openai-completions) continue to carry sanitizeToolCallIds: true and toolCallIdMode: "strict".

Expected

Moonshot policy passes native tool_call IDs through untouched.
All other openai-compatible provider defaults unchanged.

Actual

Matches expected. Three test suites lock this in at pure-helper, family-hook, and plugin-entry layers.
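The property-absence check behind these steps can be sketched in a few lines. The policy objects here are hypothetical literals standing in for resolved replay policies, and carriesIdSanitization is an illustrative helper, not a function from the repository.

```typescript
// Hypothetical policy objects standing in for resolved replay policies.
type PolicyLike = Record<string, unknown>;

// True when a policy still carries either ID-sanitization field.
function carriesIdSanitization(policy: PolicyLike): boolean {
  return "sanitizeToolCallIds" in policy || "toolCallIdMode" in policy;
}

// Shape expected for Moonshot after the fix (field values are assumptions).
const optedOutPolicy: PolicyLike = { applyAssistantFirstOrderingFix: true };

// Shape expected for other openai-compatible providers.
const defaultFamilyPolicy: PolicyLike = {
  sanitizeToolCallIds: true,
  toolCallIdMode: "strict",
  applyAssistantFirstOrderingFix: true,
};

const moonshotOk = !carriesIdSanitization(optedOutPolicy);
const othersUnchanged = carriesIdSanitization(defaultFamilyPolicy);
```

Checking for the presence of the key, rather than for a falsy value, mirrors the "to not have property" assertion shown in the failing test output.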

Evidence

Failing test/log before + passing after

Before the fix (on the new tests):

FAIL  extensions/moonshot/index.test.ts > moonshot provider plugin > owns replay policy for OpenAI-compatible Moonshot transports without mangling native Kimi tool_call IDs
AssertionError: expected { sanitizeToolCallIds: true, …(4) } to not have property "sanitizeToolCallIds"

After the fix:

Test Files  2 passed (2)
Tests       15 passed (15)   (unit-fast)

Test Files  1 passed (1)
Tests       2 passed (2)     (extension-providers)

Issue reporter's independent streaming-API measurements (pre-merge):

ID format	Pass rate (5 trials, 66K system prompt)
`functionsread0` (current strict)	1/5 (20%)
`functions.read:0` (native, as this PR preserves)	5/5 (100%)
`call_abc123def456` (OpenAI style)	5/5 (100%)

Human Verification (required)

What I personally verified (not just CI), and how:

Verified scenarios:
- Unit: buildOpenAICompatibleReplayPolicy("openai-completions", { sanitizeToolCallIds: false }) omits both sanitizeToolCallIds and toolCallIdMode and keeps the openai-completions-shaped fields.
- Unit: same for "openai-responses" (which has the non-openai-completions shape).
- Family hook: buildProviderReplayFamilyHooks({ family: "openai-compatible", sanitizeToolCallIds: false }) threads through to the helper.
- Plugin entry: Moonshot plugin resolves a policy that matches the opted-out shape.
- Ran pnpm build — bundled plugin dist emits cleanly, no [INEFFECTIVE_DYNAMIC_IMPORT] warnings.
- Ran pnpm plugin-sdk:api:check — no baseline drift (the added optional field lives on a module-internal type).
- Ran pnpm check:changed — typecheck (core + core tests + extensions + extension tests), lint (core + extensions), import cycles, webhook/pairing guards all green.
Edge cases checked:
- Default behavior unchanged when no option passed (existing tests locked in).
- Opt-out works independently of the openai-completions vs openai-responses branch (both covered by unit tests).
- Other bundled openai-compatible providers (xAI, etc.) untouched because they continue to spread OPENAI_COMPATIBLE_REPLAY_HOOKS without the opt-out.
What I did not verify:
- Live round-trip against a real Kimi K2.6 endpoint from this checkout. The issue reporter's pre-fix measurements (patched locally) already cover the empirical side; this PR is the structural delivery of that fix.
- kimi-coding plugin behavior. The 2026-04-10 comment flagging it as "also affected" is not consistent with its code (api: "anthropic-messages", minimal KIMI_REPLAY_POLICY, default sanitizeToolCallIds: false merge). Left for a separate issue/PR with a fresh repro.
Unrelated CI signal: pnpm check:changed reports 2 failures in src/agents/tools/web-fetch.provider-fallback.test.ts (SSRF guard rejects DNS that resolves to a private IP in my local network environment). These fail identically on a stashed clean upstream/main — not introduced by this PR. All guards relevant to this diff are green.

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

Backward compatible? Yes (sanitizeToolCallIds defaults to true; OPENAI_COMPATIBLE_REPLAY_HOOKS unchanged; only Moonshot opts out).
Config/env changes? No
Migration needed? No
If yes, exact upgrade steps: N/A.

Risks and Mitigations

Risk: A future bundled openai-compatible provider appears whose ID format does need sanitization and accidentally copies the Moonshot pattern.
- Mitigation: The opt-out is explicit (sanitizeToolCallIds: false); defaults still sanitize. The inline comment at extensions/moonshot/index.ts documents the rationale so future readers understand the opt-out is Kimi-specific.
Risk: A third-party Moonshot-compatible gateway returns tool_call IDs that a different endpoint would reject.
- Mitigation: OpenAI and Moonshot API docs both accept arbitrary string IDs on tool_call_id; Kimi's native format is already valid wire-level and works across all three formats tested in the issue.

Changed files

CHANGELOG.md (modified, +1/-0)
extensions/moonshot/index.test.ts (modified, +10/-10)
extensions/moonshot/index.ts (modified, +8/-2)
extensions/xai/package.json (modified, +2/-1)
pnpm-lock.yaml (modified, +3/-0)
src/plugin-sdk/provider-model-shared.test.ts (modified, +17/-0)
src/plugin-sdk/provider-model-shared.ts (modified, +5/-3)
src/plugins/provider-replay-helpers.test.ts (modified, +26/-0)
src/plugins/provider-replay-helpers.ts (modified, +6/-2)

Code Example

400 Bad Request: thinking is enabled but reasoning_content is missing in assistant tool call message at index N

Problem

When using moonshot/kimi-k2.6 (or any Moonshot thinking model) with multi-turn tool calls, the second tool-call turn always fails with:

400 Bad Request: thinking is enabled but reasoning_content is missing in assistant tool call message at index N

Steps to Reproduce

1. Configure OpenClaw with moonshot/kimi-k2.6 as the primary or fallback model.
2. Start a session and send a message that triggers a tool call.
3. The tool call succeeds (first turn, no replay needed).
4. Send another message that triggers a second tool call.
5. The request fails with: 400 error: reasoning_content is missing in assistant tool call message

This affects ANY session type (main or subagent). It is not related to history length.

Environment

OpenClaw: 2026.4.21 (f788c88)
Model: moonshot/kimi-k2.6
Provider: Moonshot AI (direct, not OpenRouter)
Auth: moonshot:default profile

Expected Behavior

The reasoning_content field from Moonshot assistant messages should be preserved during replay and sent back to the API on subsequent turns.
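A minimal sketch of the expected behavior, assuming a replay step that rebuilds assistant messages field by field: the AssistantTurn type and toReplayMessage function are hypothetical and do not reflect OpenClaw's actual replay pipeline. The point is only that reasoning_content is copied verbatim when present.

```typescript
// Hypothetical assistant-message shape for illustration.
interface AssistantTurn {
  role: "assistant";
  content: string | null;
  reasoning_content?: string;
  tool_calls?: {
    id: string;
    type: "function";
    function: { name: string; arguments: string };
  }[];
}

// Builds the assistant message to replay from the raw API response message,
// carrying reasoning_content over unchanged when the provider returned it.
function toReplayMessage(raw: AssistantTurn): AssistantTurn {
  const replay: AssistantTurn = { role: "assistant", content: raw.content };
  if (raw.tool_calls !== undefined) {
    replay.tool_calls = raw.tool_calls;
  }
  if (raw.reasoning_content !== undefined) {
    replay.reasoning_content = raw.reasoning_content;
  }
  return replay;
}

const rawTurn: AssistantTurn = {
  role: "assistant",
  content: null,
  reasoning_content: "The user wants the file contents, so call read.",
  tool_calls: [
    { id: "functions.read:0", type: "function", function: { name: "read", arguments: "{}" } },
  ],
};

const replayTurn = toReplayMessage(rawTurn);
```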

References

LiteLLM fixed the same issue: https://github.com/BerriAI/litellm/issues/21672 (v1.82.2)
Nanobot: https://github.com/HKUDS/nanobot/issues/1313
Cherry Studio: https://github.com/CherryHQ/cherry-studio/issues/12619
Moonshot docs: https://platform.kimi.ai/docs/guide/use-kimi-k2-thinking-model

extent analysis

TL;DR

Disable thinking mode by sending {"thinking": {"type": "disabled"}} in the request to bypass the reasoning_content requirement.

Guidance

Identify the root cause: OpenClaw strips the reasoning_content field during conversation replay, causing the Moonshot API to return a 400 error.
Verify the issue: Reproduce the error by following the steps outlined in the issue, specifically triggering a second tool call.
Mitigate the issue: Disable thinking mode as a temporary workaround, or explore modifying the OPENAI_COMPATIBLE_REPLAY_HOOKS to preserve reasoning_content.
Investigate long-term solutions: Consider updating OpenClaw to preserve reasoning_content or adding a per-model config option to pass the thinking parameter.

Example

No code snippet is provided as the issue does not imply a specific code change.

Notes

The provided workaround may not be ideal for all use cases, as it disables thinking mode entirely. A more permanent solution would involve modifying OpenClaw to preserve reasoning_content or adding a config option to pass the thinking parameter.

Recommendation

Apply the workaround by disabling thinking mode, as it is a straightforward and immediate solution to bypass the reasoning_content requirement. However, it is recommended to explore long-term solutions that preserve the functionality of thinking models.
