openclaw: [Request] Backport fix for Gemma 4 reasoning_content replay bug (#68704) to v2026.5.x stable

Root Cause

Root cause (supported by commit diff): Commit 2fd1e7b added normalizeLmstudioTransportReasoningCompat(), which enables reasoning_effort round-trip for all LM Studio models. For Gemma 4, this causes prior-turn thinking blocks to be included in replay messages. LM Studio's OpenAI-compat Gemma 4 endpoint silently rejects or mishandles these blocks, which corrupts the message history.

Fix Action

Fix / Workaround

Workaround: /reasoning off (with LM Studio Reasoning Parsing still ON) prevents corruption. Agent still reasons internally; OpenClaw simply does not manage the thinking blocks.

- Affected: Users running OpenClaw with LM Studio + Gemma 4 models (any variant) on v2026.5.2.
- Severity: High. The agent produces hallucinated or context-mismatched responses, making the model unusable for multi-turn conversations.
- Frequency: Always reproducible after 3–5 turns with reasoning enabled.
- Consequence: Effectively blocks use of Gemma 4 via LM Studio until the workaround (/reasoning off) is applied or 556c3e8 is released in stable.

Request: Please backport 556c3e8 to a v2026.5.x patch release.

Bug type

Regression (worked before, now fails)

Beta release blocker

Summary

Issue #68704 was fixed in commit 556c3e8 and merged to main, but has not been backported to stable. Users on v2026.5.2 (current stable) are still affected — after 3–5 turns with reasoning enabled, agent responses become incoherent due to prior-turn thinking blocks being re-sent to the LM Studio OpenAI-compat Gemma 4 endpoint.

Steps to reproduce

1. Set up OpenClaw with the LM Studio provider, model: google/gemma-4-26b-a4b-it (or a similar Gemma 4 variant).
2. Enable reasoning: /reasoning on (or leave at default with Think: high).
3. Keep LM Studio's Reasoning Parsing enabled in Inference settings.
4. Send 3–5 messages in a single session.
5. Observe that agent responses become incoherent: answers reference unrelated content from earlier turns instead of the current message.

Expected behavior

Prior-turn thinking blocks should be stripped from replay history before being sent to the LM Studio Gemma 4 endpoint, as Gemma 4's OpenAI-compat API does not accept reasoning_content in conversation history. On v2026.4.26, Gemma 4 worked correctly (reasoning blocks were not injected). Commit 556c3e8 implements this stripping policy for main; backporting to v2026.5.x stable should restore correct behavior.
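The stripping described above can be sketched as follows. This is only an illustration, assuming the replay history is an array of OpenAI-style chat messages in which prior assistant turns may carry a reasoning_content field. The ChatMessage type and the stripReasoningFromHistory name are hypothetical and are not taken from commit 556c3e8.

```typescript
// Hypothetical message shape; only the `reasoning_content` field name comes from the issue.
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
  reasoning_content?: string;
}

// Return a copy of the history with reasoning removed from every message.
// The input array and its messages are left untouched.
function stripReasoningFromHistory(history: readonly ChatMessage[]): ChatMessage[] {
  return history.map((message) => {
    const copy: ChatMessage = { ...message };
    delete copy.reasoning_content;
    return copy;
  });
}

// Example replay history with one prior-turn thinking block.
const replay: ChatMessage[] = [
  { role: "user", content: "What is 2 + 2?" },
  { role: "assistant", content: "4", reasoning_content: "Add the two numbers." },
  { role: "user", content: "And 3 + 3?" },
];

const cleaned = stripReasoningFromHistory(replay);
```

Only the reasoning field is dropped; the visible assistant answer stays in the history, so the model still sees what it said in earlier turns.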

Actual behavior

After 3–5 turns with reasoning enabled, agent responses become incoherent: the model pattern-matches from stale earlier turns instead of responding to the current message. The context window grows normally (21% → 22%), but responses reference unrelated content from earlier in the conversation. Running /reasoning off stops the corruption immediately.
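One way to confirm this symptom is to inspect an outgoing request body and check whether prior assistant turns still carry reasoning. The helper below is a rough sketch under that assumption. The OutgoingMessage shape and the findReplayedReasoning name are hypothetical, and how the request body is captured (for example from a proxy or debug log) is left open.

```typescript
// Hypothetical shape of one message in a captured request body.
interface OutgoingMessage {
  role: string;
  content: string;
  reasoning_content?: string;
}

// Return the indices of assistant messages that still carry non-empty reasoning.
function findReplayedReasoning(messages: readonly OutgoingMessage[]): number[] {
  const hits: number[] = [];
  messages.forEach((message, index) => {
    const reasoning = message.reasoning_content ?? "";
    if (message.role === "assistant" && reasoning.trim() !== "") {
      hits.push(index);
    }
  });
  return hits;
}

// Example: a five-message payload where both prior assistant turns include reasoning.
const payload: OutgoingMessage[] = [
  { role: "user", content: "First question" },
  { role: "assistant", content: "First answer", reasoning_content: "Thinking about question one." },
  { role: "user", content: "Second question" },
  { role: "assistant", content: "Second answer", reasoning_content: "Thinking about question two." },
  { role: "user", content: "Third question" },
];

const offending = findReplayedReasoning(payload);
```

A non-empty result means thinking blocks from earlier turns are being re-sent, which matches the condition described in this report.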

OpenClaw version

v2026.5.2 (8b2a6e5)

Operating system

WSL2 Ubuntu on Windows 11

Install method

npm global

Model

lmstudio/gemma-4-26b-a4b-it

Provider / routing chain

openclaw -> LM Studio (OpenAI-compat local endpoint)

Additional provider/model setup details

Provider config: LM Studio local server at http://localhost:1234/v1, api: openai-responses, reasoning: true. Model entry: { id: "google/gemma-4-26b-a4b-it", reasoning: true, input: ["text", "image"] }

Logs, screenshots, and evidence

Version comparison (cross-version control):
- v2026.4.26: Qwen 3.6 35B (LM Studio) — reasoning NOT displayed. Gemma 4 26B — reasoning NOT displayed. No context corruption on either.
- v2026.5.2: Qwen 3.6 35B — reasoning NOW displayed (new). Gemma 4 26B — reasoning still NOT displayed. Context corruption appears on Gemma 4.

The Qwen change confirms commit 2fd1e7b ("fix: normalize LM Studio binary reasoning efforts") introduced a universal normalize layer affecting all LM Studio local models — not Gemma-specific.

Related upstream fix (not yet in stable release):
- Commit 556c3e8 "fix(agents): strip Gemma reasoning from local replay" (v2026.5.4-beta.1) — adds dropReasoningFromHistory policy for isStrictOpenAiCompatible + isGemma4ModelRequiringReasoningStrip models.
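The gating described for that commit (strip only for strict OpenAI-compatible endpoints serving Gemma 4) might look roughly like the sketch below. The ModelInfo type, the looksLikeGemma4 and shouldDropReasoningFromHistory names, and the id-matching rule are all assumptions made for illustration; the actual signatures of isStrictOpenAiCompatible and isGemma4ModelRequiringReasoningStrip are not shown in this report.

```typescript
// Hypothetical description of a configured model.
interface ModelInfo {
  id: string;
  strictOpenAiCompatible: boolean;
}

// Assumption: a Gemma 4 model can be recognized by a "gemma-4" segment in its id,
// e.g. "lmstudio/gemma-4-26b-a4b-it" or "google/gemma-4-26b-a4b-it".
function looksLikeGemma4(modelId: string): boolean {
  return /(^|\/)gemma-4(-|$)/i.test(modelId);
}

// Drop prior-turn reasoning only when both conditions hold, so that other
// LM Studio models keep their reasoning round-trip.
function shouldDropReasoningFromHistory(model: ModelInfo): boolean {
  return model.strictOpenAiCompatible && looksLikeGemma4(model.id);
}

const gemma: ModelInfo = { id: "lmstudio/gemma-4-26b-a4b-it", strictOpenAiCompatible: true };
const other: ModelInfo = { id: "lmstudio/some-other-model", strictOpenAiCompatible: true };

const dropForGemma = shouldDropReasoningFromHistory(gemma);
const dropForOther = shouldDropReasoningFromHistory(other);
```

Keeping the check narrow reflects the observation above that the normalize layer from 2fd1e7b affects all LM Studio models, while the corruption was only reported for Gemma 4.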

Session snapshot:
- OpenClaw v2026.5.2 (8b2a6e5)
- Model: lmstudio/gemma-4-26b-a4b-it
- Think: high · elevated
- Context grew from 21% → 22% (55k–104k tokens) across observed sessions
- Compactions: 0

Impact and severity

Additional information

Last known good version: v2026.4.26 (Gemma 4 worked without context corruption, reasoning simply not displayed). First known bad version: v2026.5.2.

Commit 556c3e8 (already merged to main, not yet in stable) addresses this by stripping prior-turn reasoning from Gemma 4 replay via dropReasoningFromHistory policy.

Related: #68704 (locked as resolved — fix in main but not backported to stable).

Request: Please backport 556c3e8 to a v2026.5.x patch release.

extent analysis

TL;DR

The most likely fix is to backport commit 556c3e8 to the stable release v2026.5.x to strip prior-turn reasoning from Gemma 4 replay messages.

Guidance

- The root cause is the introduction of normalizeLmstudioTransportReasoningCompat() in commit 2fd1e7b, which enables reasoning effort round-trip for all LM Studio models. For Gemma 4, this causes prior-turn thinking blocks to be included in replay messages.
- To verify the issue, follow the steps to reproduce and confirm that agent responses become incoherent after 3–5 turns with reasoning enabled.
- A temporary workaround is to disable reasoning with /reasoning off, which prevents corruption while still allowing the agent to reason internally.
- To mitigate the issue, consider upgrading to a version that includes the fix (e.g., v2026.5.4-beta.1) or waiting for commit 556c3e8 to be backported to the stable release.

Notes

The fix is already merged to the main branch but not yet available in the stable release. The backport of commit 556c3e8 to v2026.5.x is requested to address the issue.

Recommendation

Until the fix is backported to the stable release, apply the workaround by disabling reasoning with /reasoning off. This prevents the corruption while the agent still reasons internally; the trade-off is that OpenClaw no longer manages the thinking blocks.
