openclaw - 💡(How to fix) DeepSeek-v4-pro thinking mode breaks on multi-turn tool-call flows (4.24 fix is incomplete) [1 participant]

openclaw · 2026-04-26 08:02:41

GitHub stats

openclaw/openclaw#72044 • Fetched 2026-04-27 05:35:39

Author

yuxiaoyang2007-prog

Participants

yuxiaoyang2007-prog

The 4.24 release notes claim "DeepSeek thinking/replay behavior is fixed for follow-up tool-call turns", but the fix only covers short tool-call chains — long chains in sessions_spawn subagent paths still break.

Error Message

DeepSeek-v4-pro with thinking-level high consistently fails on multi-turn tool-call flows (e.g. sessions_spawn subagent paths) with a 400 error from the DeepSeek server. The agent then falls back to the configured fallback model (e.g. openai-codex/gpt-5.4), so the user never gets DeepSeek-quality output.

Single-turn flows (direct chat with the same agent, no tool calls) work fine: DeepSeek answers normally. Only multi-turn tool-call flows trigger the failure, which is reported as:

error=LLM request failed: provider rejected the request schema or tool payload.

Removing the empty-string fallback in pi-ai's openai-completions.js (by commenting it out) and restarting the gateway did not help: DeepSeek still returns the same 400 error. So DeepSeek does not accept the field being absent either; it requires the actual previous-turn reasoning content to be replayed.

Root Cause

Root cause analysis (likely in pi-ai)

Fix Action

Fix / Workaround

Workaround attempted (failed)

This means the proper fix needs to traverse transformedMessages history and copy real reasoning_content from prior turns into messages that don't carry their own thinking blocks, not just patch with empty/missing values.

Code Example

[agent/embedded] embedded run agent end: ... isError=true model=deepseek-v4-pro provider=deepseek
  error=LLM request failed: provider rejected the request schema or tool payload.
  rawError=400 The `reasoning_content` in the thinking mode must be passed back to the API.

[agent/embedded] auth profile failure state updated: ... profile=sha256:9eec02c17471
  provider=deepseek reason=format window=cooldown reused=false

[model-fallback/decision] model fallback decision:
  decision=candidate_failed requested=deepseek/deepseek-v4-pro candidate=deepseek/deepseek-v4-pro
  reason=format next=openai-codex/gpt-5.4
  detail=400 The `reasoning_content` in the thinking mode must be passed back to the API.

[model-fallback/decision] model fallback decision:
  decision=candidate_succeeded requested=deepseek/deepseek-v4-pro candidate=openai-codex/gpt-5.4

---

if (compat.requiresReasoningContentOnAssistantMessages &&
    model.reasoning &&
    assistantMsg.reasoning_content === undefined) {
    assistantMsg.reasoning_content = "";
}

Summary

Environment

OpenClaw 2026.4.24 (commit cbcfdf6)
pi-ai @mariozechner/[email protected]
macOS Darwin 25.4.0 (ARM64)
Model: deepseek/deepseek-v4-pro, thinkingDefault high

Reproduction

1. Configure an agent (e.g. reasoning) with model.primary = "deepseek/deepseek-v4-pro" and thinkingDefault = "high".
2. From a parent agent (e.g. main), call sessions_spawn with agentId="reasoning", thinking="high", and a task that requires several tool calls (e.g. read 2-3 files, create a feishu doc, write a bitable record, then return result).
3. Observe gateway.err.log:

[agent/embedded] embedded run agent end: ... isError=true model=deepseek-v4-pro provider=deepseek
  error=LLM request failed: provider rejected the request schema or tool payload.
  rawError=400 The `reasoning_content` in the thinking mode must be passed back to the API.

[agent/embedded] auth profile failure state updated: ... profile=sha256:9eec02c17471
  provider=deepseek reason=format window=cooldown reused=false

[model-fallback/decision] model fallback decision:
  decision=candidate_failed requested=deepseek/deepseek-v4-pro candidate=deepseek/deepseek-v4-pro
  reason=format next=openai-codex/gpt-5.4
  detail=400 The `reasoning_content` in the thinking mode must be passed back to the API.

[model-fallback/decision] model fallback decision:
  decision=candidate_succeeded requested=deepseek/deepseek-v4-pro candidate=openai-codex/gpt-5.4

This pattern repeats every time DeepSeek is requested in the multi-turn flow — it always falls back. The DeepSeek auth profile enters cooldown, blocking subsequent attempts.

Same agent in single-turn chat (no tool calls): no error, DeepSeek answers normally.

Root cause analysis (likely in pi-ai)

The relevant code is in pi-ai (@mariozechner/pi-ai), file dist/providers/openai-completions.js lines 684-688:

if (compat.requiresReasoningContentOnAssistantMessages &&
    model.reasoning &&
    assistantMsg.reasoning_content === undefined) {
    assistantMsg.reasoning_content = "";
}

When an assistant message has no thinking block (e.g. a tool-call-only follow-up turn), this fallback sets reasoning_content = "" (empty string). DeepSeek's server rejects this with 400 — it requires the actual previous reasoning content to be replayed in subsequent turns, not an empty string.
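To make the failing shape concrete, here is a small diagnostic sketch that scans an outgoing message list and reports the assistant messages whose reasoning_content is missing or empty. It is not part of pi-ai: the helper name and the sample payload are hypothetical, and the message layout (role, tool_calls, reasoning_content on assistant messages) is assumed from the OpenAI-style chat format that openai-completions.js targets.

```javascript
// Hypothetical diagnostic, not a pi-ai API: list assistant messages in an
// OpenAI-style payload that carry no usable reasoning_content.
function findAssistantMessagesMissingReasoning(messages) {
  const offenders = [];
  messages.forEach((msg, index) => {
    if (msg.role !== "assistant") return;
    const rc = msg.reasoning_content;
    // Both an absent field and an empty string count as "missing" here,
    // since the report observed a 400 in both cases.
    if (typeof rc !== "string" || rc.length === 0) {
      offenders.push({
        index,
        hasToolCalls: Array.isArray(msg.tool_calls) && msg.tool_calls.length > 0,
        reasoningContent: rc,
      });
    }
  });
  return offenders;
}

// Illustrative payload (assumed shape): the second assistant turn is a
// tool-call-only follow-up that received the empty-string fallback.
const examplePayload = [
  { role: "user", content: "Read the two files and summarize them." },
  {
    role: "assistant",
    content: null,
    reasoning_content: "I should read a.txt first.",
    tool_calls: [
      { id: "call_1", type: "function", function: { name: "read_file", arguments: "{\"path\":\"a.txt\"}" } },
    ],
  },
  { role: "tool", tool_call_id: "call_1", content: "contents of a.txt" },
  {
    role: "assistant",
    content: null,
    reasoning_content: "",
    tool_calls: [
      { id: "call_2", type: "function", function: { name: "read_file", arguments: "{\"path\":\"b.txt\"}" } },
    ],
  },
  { role: "tool", tool_call_id: "call_2", content: "contents of b.txt" },
];

console.log(findAssistantMessagesMissingReasoning(examplePayload));
```

Running a check like this on the request body just before it is sent would show which turns in a long chain end up with the empty-string value.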

Workaround attempted (failed)

Tried removing the empty-string fallback (commenting out the four lines above) and restarting the gateway. Result: DeepSeek still returns the same 400 error. So DeepSeek does not accept the field being absent either — it requires the actual previous-turn reasoning content to be replayed.

Impact

DeepSeek-v4-pro is unusable in any multi-turn tool-call flow in OpenClaw, including:
- sessions_spawn subagent flows (e.g. content-factory writing skill calling Thinker)
- Any agent that issues 3+ tool calls per turn
Users hit silent fallback to openai-codex/gpt-5.4 (or whatever fallback is configured), which dramatically changes output quality and language style without obvious indication.
The DeepSeek auth profile getting put into cooldown also blocks parallel sessions.
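Because the fallback is silent from the user's side, one way to surface it is to scan the gateway log for fallback decisions where a different model ended up answering. The sketch below is a hypothetical helper, not an OpenClaw tool. It assumes log entries look like the excerpt in this report: each entry starts with a bracketed tag at the beginning of a line, followed by indented key=value lines. Real logs may carry timestamps or other prefixes that would need handling.

```javascript
// Hypothetical log scanner; the entry layout is assumed from the excerpt above.
function parseFallbackDecisions(logText) {
  const entries = [];
  for (const line of logText.split(/\r?\n/)) {
    if (line.startsWith("[")) {
      entries.push(line); // a bracketed tag opens a new entry
    } else if (entries.length > 0 && line.trim() !== "") {
      entries[entries.length - 1] += "\n" + line; // continuation line
    }
  }
  // Extract a single whitespace-free key=value field from an entry.
  const field = (entry, name) => {
    const match = entry.match(new RegExp("\\b" + name + "=(\\S+)"));
    return match ? match[1] : undefined;
  };
  return entries
    .filter((entry) => entry.startsWith("[model-fallback/decision]"))
    .map((entry) => ({
      decision: field(entry, "decision"),
      requested: field(entry, "requested"),
      candidate: field(entry, "candidate"),
      next: field(entry, "next"),
    }));
}

// A "silent fallback" here means: a candidate succeeded, but it is not the
// model that was originally requested.
function findSilentFallbacks(logText) {
  return parseFallbackDecisions(logText).filter(
    (d) => d.decision === "candidate_succeeded" && d.candidate !== d.requested
  );
}

// Sample input copied from the log excerpt in this report.
const sampleLog = [
  "[model-fallback/decision] model fallback decision:",
  "  decision=candidate_failed requested=deepseek/deepseek-v4-pro candidate=deepseek/deepseek-v4-pro",
  "  reason=format next=openai-codex/gpt-5.4",
  "  detail=400 The `reasoning_content` in the thinking mode must be passed back to the API.",
  "",
  "[model-fallback/decision] model fallback decision:",
  "  decision=candidate_succeeded requested=deepseek/deepseek-v4-pro candidate=openai-codex/gpt-5.4",
].join("\n");

console.log(findSilentFallbacks(sampleLog));
```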

Suggested fix direction (for upstream pi-ai)

In openai-completions.js, instead of the empty-string fallback at lines 684-688:

- When building assistantMsg.reasoning_content for a turn without thinking blocks, search backwards in transformedMessages for the most recent assistant message with non-empty thinking blocks, and replay that content.
- Or maintain a per-conversation lastReasoningContent cache that gets refreshed every time a real thinking block is processed, and use it as fallback.
- Add an integration test using DeepSeek-v4-pro thinking-mode + ≥3 tool-call turns.
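The cache option could look roughly like the sketch below. Everything here is hypothetical: ReasoningContentCache, fillReasoningContent, and the conversationId key are illustrative names, not pi-ai APIs, and this report does not confirm that DeepSeek accepts reasoning replayed from an earlier turn. When nothing has been cached yet, the sketch leaves the message untouched rather than sending an empty string.

```javascript
// Hypothetical per-conversation cache of the last non-empty reasoning text.
class ReasoningContentCache {
  constructor() {
    this.lastByConversation = new Map();
  }

  // Call whenever a real (non-empty) thinking block has been processed.
  remember(conversationId, reasoningContent) {
    if (typeof reasoningContent === "string" && reasoningContent.length > 0) {
      this.lastByConversation.set(conversationId, reasoningContent);
    }
  }

  // Returns the cached reasoning text, or undefined if none was seen yet.
  recall(conversationId) {
    return this.lastByConversation.get(conversationId);
  }

  // Drop the entry when a conversation ends, so the map does not grow forever.
  forget(conversationId) {
    this.lastByConversation.delete(conversationId);
  }
}

// Returns a message with reasoning_content filled from the cache when the
// message has none of its own. The input message is never mutated.
function fillReasoningContent(cache, conversationId, assistantMsg) {
  const own = assistantMsg.reasoning_content;
  if (typeof own === "string" && own.length > 0) {
    cache.remember(conversationId, own); // refresh the cache from real content
    return assistantMsg;
  }
  const cached = cache.recall(conversationId);
  return cached === undefined
    ? assistantMsg
    : { ...assistantMsg, reasoning_content: cached };
}

// Usage: the first turn carries reasoning, the follow-up is tool-call-only.
const cache = new ReasoningContentCache();
fillReasoningContent(cache, "conv-1", { role: "assistant", reasoning_content: "Plan: read a.txt" });
const followUp = fillReasoningContent(cache, "conv-1", { role: "assistant", content: null });
console.log(followUp.reasoning_content);
```

Compared with searching backwards through transformedMessages, a cache avoids rescanning long histories on every request, at the cost of extra state that has to be cleared when a conversation ends.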

The bug really lives in pi-mono (https://github.com/badlogic/pi-mono, packages/ai) — happy to file there too if preferred.

Trace evidence available

Full trajectory files and gateway logs available on request. Multiple recurrences observed on 2026-04-25 and 2026-04-26 in independent sessions.

extent analysis

TL;DR

Modify the openai-completions.js file in pi-ai to properly handle reasoning_content in multi-turn tool-call flows by replaying the actual previous reasoning content.

Guidance

Identify the lines of code in openai-completions.js (684-688) that set reasoning_content to an empty string and modify them to search for the most recent assistant message with non-empty thinking blocks.
Implement a cache to store the last reasoning content for each conversation, updating it whenever a real thinking block is processed, and use this cache as a fallback.
Add integration tests to verify the fix using DeepSeek-v4-pro with thinking mode and multiple tool-call turns.
Consider filing the bug in pi-mono (https://github.com/badlogic/pi-mono, packages/ai) for a more comprehensive solution.

Example

// Example of how openai-completions.js could be modified (simplified sketch)
if (compat.requiresReasoningContentOnAssistantMessages &&
    model.reasoning &&
    assistantMsg.reasoning_content === undefined) {
    // Search for the most recent assistant message with non-empty reasoning content
    const lastReasoningContent = findLastReasoningContent(transformedMessages);
    // Only set the field when something was found; never send null or ""
    if (lastReasoningContent !== null) {
        assistantMsg.reasoning_content = lastReasoningContent;
    }
}

// Return the most recent non-empty reasoning_content among assistant messages, or null.
// Assumes each entry exposes role and reasoning_content, which may not match the real structure.
function findLastReasoningContent(messages) {
    for (let i = messages.length - 1; i >= 0; i--) {
        const msg = messages[i];
        if (msg.role === "assistant" &&
            typeof msg.reasoning_content === "string" &&
            msg.reasoning_content !== "") {
            return msg.reasoning_content;
        }
    }
    return null;
}

Notes

The provided code snippet is a simplified example and may require adjustments to fit the actual implementation. The fix should be thoroughly tested to ensure it works correctly in all scenarios.

Recommendation

Apply the workaround by modifying the
