openclaw - 💡(How to fix) Fix DeepSeek V4 Flash streaming with reasoning_content always triggers fallback to fallback model [1 comments, 2 participants]

laobang · 2026-04-29T14:46:44Z

## Description

When using `deepseek/deepseek-v4-flash` as the primary model with fallbacks configured (e.g., `minimax-cn/MiniMax-M2.7`), **every** response generation triggers fallback to the fallback model, even though DeepSeek API responds correctly when tested directly.

This happens 100% of the time. The model only works for the first request after a session reset (`session_status(model="default")`), but falls back immediately on the next response generation.

## Steps to Reproduce

1. Configure `deepseek/deepseek-v4-flash` as primary model with `minimax-cn/MiniMax-M2.7` as fallback
2. Session model shows `deepseek/deepseek-v4-flash` correctly
3. Generate any response (even a simple heartbeat reply)
4. Session model switches to `minimax-cn/MiniMax-M2.7`
5. `session_status(model="default")` resets back to deepseek, but next response falls back again

## Evidence

### DeepSeek API is healthy

- Direct curl to DeepSeek streaming API: 10/10 requests succeed
- Latency: TTFB 0.09-0.22s
- Auth state shows errorCount=0, no cooldown

### No fallback logs

- No "model-fallback/decision" log entries for the failing requests
- No timeout/auth errors logged
- The fallback happens silently with zero log evidence

### Session status pattern

Before:

```
Model: deepseek/deepseek-v4-flash (1M context)
```

After same-turn HEARTBEAT_OK:

```
Model: minimax-cn/MiniMax-M2.7 (200K context)
```

### DeepSeek streaming format

DeepSeek V4 Flash outputs reasoning tokens before visible content. The streaming chunks have:

```json
// First chunk
{"delta":{"content":null,"reasoning_content":""}}
// During reasoning
{"delta":{"content":null,"reasoning_content":"thinking..."}}
// Final content
{"delta":{"content":"answer","reasoning_content":null}}
```

The `reasoning_content` field is a DeepSeek-specific extension of OpenAI streaming. `delta.content` is `null` during the thinking phase.

## Suspected Root Cause

OpenClaw's OpenAI-completions streaming parser may not handle `delta.content: null` during streaming chunks, interpreting it as a stream error rather than "content not yet available". This would cause fallback on every request with reasoning-enabled models.

Notably, direct user chat works fine - the issue only manifests during automated/system event responses (heartbeat), suggesting different code paths for streaming parsing.

## Environment

- OpenClaw: 2026.4.26
- Runtime: Pi Default
- Model: deepseek/deepseek-v4-flash (openai-completions API)
- Fallback: minimax-cn/MiniMax-M2.7
- Auth: api_key
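
The chunk sequence reported under "DeepSeek streaming format" can be consumed without treating a `null` `content` as an error. The following is a minimal sketch of that idea, not OpenClaw's actual parser (which the issue does not show); `accumulateDeltas` is a hypothetical helper name, and the input is the three example deltas from the issue.

```javascript
// Hypothetical helper: illustrates null-tolerant handling of streamed deltas.
// It is NOT taken from OpenClaw; the real parser code is not shown in the issue.
function accumulateDeltas(deltas) {
  let reasoning = "";
  let content = "";
  for (const delta of deltas) {
    // `!= null` is false for both null and undefined, so null or absent fields are skipped.
    if (delta.reasoning_content != null) reasoning += delta.reasoning_content;
    if (delta.content != null) content += delta.content;
  }
  return { reasoning, content };
}

// The three example deltas quoted in the issue.
const result = accumulateDeltas([
  { content: null, reasoning_content: "" },
  { content: null, reasoning_content: "thinking..." },
  { content: "answer", reasoning_content: null },
]);
console.log(result);
```

With this approach a chunk that carries only reasoning text contributes nothing to the visible answer, and the stream is judged by what has accumulated when it ends rather than by any single chunk.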

GitHub stats

openclaw/openclaw#74416 • Fetched 2026-04-30 06:24:08

Author: laobang

Participants: clawsweeper[bot], laobang

Timeline (top): closed ×1, commented ×1, mentioned ×1, subscribed ×1


extent analysis

TL;DR

If the suspected root cause is correct, the fix is to make OpenClaw's OpenAI-completions streaming parser treat delta.content: null as "no visible content yet" rather than as a stream error.

Guidance

Verify that the issue is indeed caused by the streaming parser by checking the parser's behavior when encountering delta.content: null.
Check the OpenClaw version and see if there are any updates or patches available that address this issue.
Consider adding a temporary workaround to the parser to ignore delta.content: null during streaming chunks.
Investigate the different code paths for streaming parsing during automated/system event responses and direct user chat to understand why the issue only manifests in one case.
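
To support the first guidance item, a captured stream (for example, the output of the direct curl test) can be checked for which chunks carry reasoning and which carry visible content. This is a rough sketch under two assumptions: the stream uses the OpenAI-style server-sent-event framing (`data: ...` lines ending with `data: [DONE]`), and each chunk wraps the delta in `choices[0]`, whereas the issue shows only the abbreviated `delta` object. `classifySseLines` is a hypothetical helper, not an OpenClaw function.

```javascript
// Hypothetical inspection helper; not part of OpenClaw.
// Assumes OpenAI-style SSE framing with the delta nested under choices[0].
function classifySseLines(rawText) {
  const kinds = [];
  for (const line of rawText.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("data:")) continue; // skip blank separator lines
    const payload = trimmed.slice("data:".length).trim();
    if (payload === "[DONE]") {
      kinds.push("done");
      continue;
    }
    const chunk = JSON.parse(payload);
    const delta = (chunk.choices && chunk.choices[0] && chunk.choices[0].delta) || {};
    if (delta.content != null && delta.content !== "") kinds.push("content");
    else if (delta.reasoning_content != null) kinds.push("reasoning");
    else kinds.push("empty");
  }
  return kinds;
}

// Sample built from the deltas quoted in the issue, wrapped in the assumed envelope.
const sample = [
  'data: {"choices":[{"delta":{"content":null,"reasoning_content":""}}]}',
  "",
  'data: {"choices":[{"delta":{"content":null,"reasoning_content":"thinking..."}}]}',
  "",
  'data: {"choices":[{"delta":{"content":"answer","reasoning_content":null}}]}',
  "",
  "data: [DONE]",
].join("\n");

const kinds = classifySseLines(sample);
console.log(kinds);
```

If every stream begins with a run of "reasoning" chunks before the first "content" chunk, that matches the pattern described in the issue and gives a concrete input to replay against the parser.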

Example

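// Hypothetical sketch: the actual OpenClaw parser code is not shown in the issue.
// Treat a null or missing delta.content as "no visible text yet", not as a stream error.
if (delta.content == null) {
  // Reasoning phase (or an otherwise empty chunk): nothing to append to the visible answer.
  return;
}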

Notes

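The issue was reported against OpenClaw 2026.4.26 with deepseek/deepseek-v4-flash. Other versions and models are not confirmed to be affected, although the suspected cause would apply to any reasoning-enabled model that streams delta.content: null.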

Recommendation

Apply a workaround: modify OpenClaw's OpenAI-completions streaming parser to handle delta.content: null during streaming chunks. This is the most likely cause based on the report, and the issue does not mention a fix in version 2026.4.26.
