openclaw: DeepSeek-v4-pro thinking mode breaks on multi-turn tool-call flows (4.24 fix is incomplete)

GitHub stats
openclaw/openclaw#72044 (fetched 2026-04-27 05:35:39)
Comments: 0
Participants: 1
Timeline: 0
Reactions: 2

DeepSeek-v4-pro with thinking-level high consistently fails on multi-turn tool-call flows (e.g. sessions_spawn subagent paths) with a 400 error from the DeepSeek server. The agent then falls back to the configured fallback model (e.g. openai-codex/gpt-5.4), so the user never gets DeepSeek-quality output. Single-turn flows (direct chat with the same agent) work fine — only multi-turn tool-call flows trigger this.

The 4.24 release notes claim "DeepSeek thinking/replay behavior is fixed for follow-up tool-call turns", but the fix only covers short tool-call chains — long chains in sessions_spawn subagent paths still break.

Error Message

error=LLM request failed: provider rejected the request schema or tool payload.
rawError=400 The `reasoning_content` in the thinking mode must be passed back to the API.

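Root Cause

Likely in pi-ai: when an assistant message has no thinking block (e.g. a tool-call-only follow-up turn), the provider code sets reasoning_content to an empty string, and DeepSeek's server rejects that with a 400.

Fix / Workaround

Workaround attempted (failed): removing the empty-string fallback and restarting the gateway still produces the same 400, so DeepSeek does not accept the field being absent either.

This means the proper fix needs to traverse the transformedMessages history and copy the real reasoning_content from prior turns into messages that don't carry their own thinking blocks, rather than patching with empty or missing values.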

Code Example

[agent/embedded] embedded run agent end: ... isError=true model=deepseek-v4-pro provider=deepseek
  error=LLM request failed: provider rejected the request schema or tool payload.
  rawError=400 The `reasoning_content` in the thinking mode must be passed back to the API.

[agent/embedded] auth profile failure state updated: ... profile=sha256:9eec02c17471
  provider=deepseek reason=format window=cooldown reused=false

[model-fallback/decision] model fallback decision:
  decision=candidate_failed requested=deepseek/deepseek-v4-pro candidate=deepseek/deepseek-v4-pro
  reason=format next=openai-codex/gpt-5.4
  detail=400 The `reasoning_content` in the thinking mode must be passed back to the API.

[model-fallback/decision] model fallback decision:
  decision=candidate_succeeded requested=deepseek/deepseek-v4-pro candidate=openai-codex/gpt-5.4

---

if (compat.requiresReasoningContentOnAssistantMessages &&
    model.reasoning &&
    assistantMsg.reasoning_content === undefined) {
    assistantMsg.reasoning_content = "";
}

Environment

  • OpenClaw 2026.4.24 (commit cbcfdf6)
  • pi-ai @mariozechner/[email protected]
  • macOS Darwin 25.4.0 (ARM64)
  • Model: deepseek/deepseek-v4-pro, thinkingDefault high

Reproduction

  1. Configure an agent (e.g. reasoning) with model.primary = "deepseek/deepseek-v4-pro" and thinkingDefault = "high".
  2. From a parent agent (e.g. main), call sessions_spawn with agentId="reasoning", thinking="high", and a task that requires several tool calls (e.g. read 2-3 files, create a feishu doc, write a bitable record, then return result).
  3. Observe gateway.err.log:
[agent/embedded] embedded run agent end: ... isError=true model=deepseek-v4-pro provider=deepseek
  error=LLM request failed: provider rejected the request schema or tool payload.
  rawError=400 The `reasoning_content` in the thinking mode must be passed back to the API.

[agent/embedded] auth profile failure state updated: ... profile=sha256:9eec02c17471
  provider=deepseek reason=format window=cooldown reused=false

[model-fallback/decision] model fallback decision:
  decision=candidate_failed requested=deepseek/deepseek-v4-pro candidate=deepseek/deepseek-v4-pro
  reason=format next=openai-codex/gpt-5.4
  detail=400 The `reasoning_content` in the thinking mode must be passed back to the API.

[model-fallback/decision] model fallback decision:
  decision=candidate_succeeded requested=deepseek/deepseek-v4-pro candidate=openai-codex/gpt-5.4

This pattern repeats every time DeepSeek is requested in the multi-turn flow — it always falls back. The DeepSeek auth profile enters cooldown, blocking subsequent attempts.

Same agent in single-turn chat (no tool calls): no error, DeepSeek answers normally.

Root cause analysis (likely in pi-ai)

The relevant code is in pi-ai (@mariozechner/pi-ai), file dist/providers/openai-completions.js lines 684-688:

if (compat.requiresReasoningContentOnAssistantMessages &&
    model.reasoning &&
    assistantMsg.reasoning_content === undefined) {
    assistantMsg.reasoning_content = "";
}

When an assistant message has no thinking block (e.g. a tool-call-only follow-up turn), this fallback sets reasoning_content = "" (empty string). DeepSeek's server rejects this with 400 — it requires the actual previous reasoning content to be replayed in subsequent turns, not an empty string.
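To make the failing payload shape concrete, here is a minimal sketch of a pre-flight check that flags assistant messages whose reasoning_content is missing or empty. The helper name findMissingReasoning and the sample history are hypothetical and not part of pi-ai or OpenClaw. The rule it encodes (every assistant message needs non-empty reasoning content) is only an assumption inferred from the 400 error text, not a documented server contract.

```javascript
// Hypothetical helper, not part of pi-ai or OpenClaw.
// Returns the indexes of assistant messages that would likely trigger the 400,
// ASSUMING the provider wants non-empty reasoning_content on each of them.
function findMissingReasoning(messages) {
    const missing = [];
    messages.forEach((msg, index) => {
        if (msg.role !== "assistant") return;
        const rc = msg.reasoning_content;
        if (typeof rc !== "string" || rc === "") missing.push(index);
    });
    return missing;
}

// Illustrative history: the first assistant turn carries reasoning, while the
// tool-call-only follow-up turn was patched with an empty string.
const history = [
    { role: "user", content: "Read the files and summarize them." },
    {
        role: "assistant",
        content: "",
        reasoning_content: "Plan: read a.txt first.",
        tool_calls: [{ id: "call_1", type: "function", function: { name: "read", arguments: "{\"path\":\"a.txt\"}" } }],
    },
    { role: "tool", tool_call_id: "call_1", content: "contents of a.txt" },
    {
        role: "assistant",
        content: "",
        reasoning_content: "", // the empty-string fallback described above
        tool_calls: [{ id: "call_2", type: "function", function: { name: "read", arguments: "{\"path\":\"b.txt\"}" } }],
    },
    { role: "tool", tool_call_id: "call_2", content: "contents of b.txt" },
];

console.log(findMissingReasoning(history)); // [ 3 ]
```

A check like this could run just before the request is sent, so the failure shows up locally with the offending message index instead of as a provider-side 400.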

Workaround attempted (failed)

I tried removing the empty-string fallback (commenting out the block shown above) and restarting the gateway. Result: DeepSeek still returns the same 400 error, so it does not accept the field being absent either. It requires the actual previous-turn reasoning content to be replayed.

This means the proper fix needs to traverse transformedMessages history and copy real reasoning_content from prior turns into messages that don't carry their own thinking blocks, not just patch with empty/missing values.
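The traversal described above could be sketched as a single forward pass over the history that carries the latest real reasoning forward. This is an illustration only: backfillReasoningContent is a hypothetical name, the messages are assumed to expose reasoning_content directly (the real pi-ai message shape may differ), and whether DeepSeek accepts replayed reasoning from an earlier turn has not been verified here.

```javascript
// Hypothetical sketch, not the pi-ai implementation.
// Returns a new array in which assistant messages lacking reasoning_content
// reuse the most recent non-empty reasoning_content seen earlier in the history.
function backfillReasoningContent(messages) {
    let lastReasoning = null;
    return messages.map((msg) => {
        if (msg.role !== "assistant") return msg;
        const rc = msg.reasoning_content;
        if (typeof rc === "string" && rc !== "") {
            lastReasoning = rc; // a real thinking block: remember it
            return msg;
        }
        if (lastReasoning === null) return msg; // nothing to replay yet
        return { ...msg, reasoning_content: lastReasoning };
    });
}

// Illustrative three-step tool-call chain (message contents are made up).
const chain = [
    { role: "user", content: "Do the task." },
    { role: "assistant", content: "", reasoning_content: "Step 1 plan." },
    { role: "tool", content: "result 1" },
    { role: "assistant", content: "" },                        // field absent
    { role: "tool", content: "result 2" },
    { role: "assistant", content: "", reasoning_content: "" }, // empty string
];

const patched = backfillReasoningContent(chain);
console.log(patched[3].reasoning_content); // "Step 1 plan."
console.log(patched[5].reasoning_content); // "Step 1 plan."
```

The pass copies messages rather than mutating them, so the stored session history keeps its original shape and only the outgoing request is patched.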

Impact

  • DeepSeek-v4-pro is unusable in any multi-turn tool-call flow in OpenClaw, including:
    • sessions_spawn subagent flows (e.g. content-factory writing skill calling Thinker)
    • Any agent that issues 3+ tool calls per turn
  • Users hit silent fallback to openai-codex/gpt-5.4 (or whatever fallback is configured), which dramatically changes output quality and language style without obvious indication.
  • The DeepSeek auth profile getting put into cooldown also blocks parallel sessions.

Suggested fix direction (for upstream pi-ai)

In openai-completions.js, instead of the empty-string fallback at lines 684-688:

  1. When building assistantMsg.reasoning_content for a turn without thinking blocks, search backwards in transformedMessages for the most recent assistant message with non-empty thinking blocks, and replay that content.
  2. Or maintain a per-conversation lastReasoningContent cache that gets refreshed every time a real thinking block is processed, and use it as fallback.
  3. Add an integration test using DeepSeek-v4-pro thinking-mode + ≥3 tool-call turns.
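Option 2 above could look roughly like the following. This is a minimal sketch under stated assumptions: the ReasoningCache class, its method names, and the conversation ID used as the key are all hypothetical and do not exist in pi-ai.

```javascript
// Hypothetical per-conversation cache, not part of pi-ai.
class ReasoningCache {
    constructor() {
        this.byConversation = new Map();
    }

    // Call whenever a real (non-empty) thinking block is processed.
    remember(conversationId, reasoning) {
        if (typeof reasoning === "string" && reasoning !== "") {
            this.byConversation.set(conversationId, reasoning);
        }
    }

    // Returns the cached reasoning, or null when none has been seen yet.
    fallbackFor(conversationId) {
        return this.byConversation.get(conversationId) ?? null;
    }

    // Drop the entry when the conversation ends so the map does not grow forever.
    forget(conversationId) {
        this.byConversation.delete(conversationId);
    }
}

const cache = new ReasoningCache();
cache.remember("session-a", "Plan: read both files, then write the doc.");
cache.remember("session-a", ""); // empty values never overwrite real reasoning

console.log(cache.fallbackFor("session-a")); // "Plan: read both files, then write the doc."
console.log(cache.fallbackFor("session-b")); // null
```

Compared with searching backwards through transformedMessages, a cache avoids rescanning long histories on every turn, at the cost of extra state that has to be cleared when a session ends.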

The bug really lives in pi-mono (https://github.com/badlogic/pi-mono, packages/ai) — happy to file there too if preferred.

Trace evidence available

Full trajectory files and gateway logs available on request. Multiple recurrences observed on 2026-04-25 and 2026-04-26 in independent sessions.

extent analysis

TL;DR

Modify the openai-completions.js file in pi-ai to properly handle reasoning_content in multi-turn tool-call flows by replaying the actual previous reasoning content.

Guidance

  • Identify the lines of code in openai-completions.js (684-688) that set reasoning_content to an empty string and modify them to search for the most recent assistant message with non-empty thinking blocks.
  • Implement a cache to store the last reasoning content for each conversation, updating it whenever a real thinking block is processed, and use this cache as a fallback.
  • Add integration tests to verify the fix using DeepSeek-v4-pro with thinking mode and multiple tool-call turns.
  • Consider filing the bug in pi-mono (https://github.com/badlogic/pi-mono, packages/ai) for a more comprehensive solution.

Example

// Sketch of how the fallback in openai-completions.js could change.
// Assumes entries in transformedMessages expose reasoning_content directly;
// the real pi-ai message shape may differ.
if (compat.requiresReasoningContentOnAssistantMessages &&
    model.reasoning &&
    assistantMsg.reasoning_content === undefined) {
    // Replay the most recent non-empty reasoning content, if any exists
    const lastReasoningContent = findLastReasoningContent(transformedMessages);
    if (lastReasoningContent !== null) {
        assistantMsg.reasoning_content = lastReasoningContent;
    }
}

// Walk the history backwards and return the latest non-empty reasoning content
function findLastReasoningContent(messages) {
    for (let i = messages.length - 1; i >= 0; i--) {
        const content = messages[i].reasoning_content;
        if (typeof content === "string" && content !== "") {
            return content;
        }
    }
    return null; // no earlier turn carried reasoning
}

Notes

The provided code snippet is a simplified example and may require adjustments to fit the actual implementation. The fix should be thoroughly tested to ensure it works correctly in all scenarios.

Recommendation

Apply the workaround by modifying the
