openclaw - ✅(Solved) Fix Session-file repair drops blank user-role messages, breaking strict OpenAI-compat providers (Qwen3.6 / mlx-vlm) [1 pull requests, 1 comments, 2 participants]

GitHub stats
openclaw/openclaw#75313 · Fetched 2026-05-01 05:35:22
Comments: 1 · Participants: 2 · Timeline: 4 · Reactions: 2
Timeline (top): referenced ×2, commented ×1, cross-referenced ×1

repairUserEntryWithBlankTextContent in OpenClaw's session-file repair returns { kind: "drop" } for user-role entries whose only text content is blank. The repair function then removes those entries from the on-disk session entirely. When session-memory rehydration produces a session whose post-repair message array is [system, asst, asst, asst, asst] (no user role at all), the array passes OpenAI's permissive Chat Completions API but is rejected by stricter providers.

With mlx-vlm serving Qwen3.6 (mlx-community/Qwen3.6-35B-A3B-8bit), the Jinja chat template raises TemplateError: No user query found in messages. and the server returns HTTP 500. The gateway then either fails over to the next provider (cloud cost + latency) or surfaces the error.

Affected version: OpenClaw 2026.4.27 (the same code path was present in 2026.4.26). In the published bundle, the function lives in dist/compaction-successor-transcript-*.js.

Error Message

Chat template error: TemplateError: No user query found in messages.
Server response (HTTP 500): {"detail":"An unexpected error occurred: No user query found in messages."}


Root Cause

repairUserEntryWithBlankTextContent returns { kind: "drop" } for a user-role entry whose text content is entirely blank, and the repair loop then deletes that entry from the on-disk session. If it was the only user message, the request sent to the provider carries no user role, which strict chat templates reject.


PR fix notes

PR #75368: fix(agents): keep blank user entries during session repair instead of dropping them (#75313)

Description (problem / solution / changelog)

Summary

Fixes #75313. Session repair previously dropped blank user-role entries entirely, which could leave a session with no user role at all (e.g. [system, assistant, assistant, …]). Strict OpenAI-compatible chat templates (Qwen3.6, mlx-vlm, etc.) reject such message arrays with No user query found in messages. → HTTP 500.

Fix

Instead of { kind: "drop" }, blank user entries are now rewritten with a synthetic "(continue)" text placeholder, preserving the user role in the session transcript.

Changes

  • src/agents/session-file-repair.ts: repairUserEntryWithBlankTextContent returns { kind: "rewrite" } with BLANK_USER_FALLBACK_TEXT = "(continue)" instead of { kind: "drop" }
  • Report field renamed: droppedBlankUserMessages → rewrittenBlankUserMessages
  • Test updates: 10/10 tests pass

Impact

  • Affects all users of repairSessionFileIfNeeded (compaction, session resume)
  • Prevents HTTP 500 from strict providers when session repair encounters blank user messages
  • Related to #75305 (pre-compaction flush sending empty user messages to Anthropic)

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/agents/session-file-repair.test.ts (modified, +69/-9)
  • src/agents/session-file-repair.ts (modified, +55/-20)


Reproduction

# 1) seed a JSONL session matching the bug pattern: blank user + N rehydrated assistants
SESS=~/.openclaw/agents/main/sessions/repro-$(date +%s).jsonl
cat > "$SESS" <<'EOF'
{"type":"session","sessionId":"repro","createdAt":"2026-04-30T00:00:00Z"}
{"type":"model_change","model":"mlx-community/Qwen3.6-35B-A3B-8bit"}
{"type":"thinking_level_change","level":"off"}
{"type":"custom","payload":{}}
{"type":"message","message":{"role":"user","content":[{"type":"text","text":""}]}}
{"type":"message","message":{"role":"assistant","content":[{"type":"text","text":"rehydrated turn 1"}]}}
{"type":"message","message":{"role":"assistant","content":[{"type":"text","text":"rehydrated turn 2"}]}}
{"type":"message","message":{"role":"assistant","content":[{"type":"text","text":"rehydrated turn 3"}]}}
{"type":"message","message":{"role":"assistant","content":[{"type":"text","text":"rehydrated turn 4"}]}}
EOF

# 2) send a turn forcing the strict provider
openclaw agent --json --thinking off \
  --model "mbpro/mlx-community/Qwen3.6-35B-A3B-8bit" \
  --message "Reply with one word: ok" \
  --session-id "$(basename "$SESS" .jsonl)"

Observed

  • gateway.err.log: session file repaired: rewrote 4 assistant message(s), dropped 1 blank user message(s)
  • ~10 s later: embedded run failover decision: runId=… stage=assistant decision=surface_error reason=timeout from=mbpro/mlx-community/Qwen3.6-35B-A3B-8bit profile=- rawError=500 status code (no body)
  • Direct curl confirms the wire-level rejection:
    $ curl -s http://<mlx-server>/v1/chat/completions \
        -H 'content-type: application/json' \
        -d '{"model":"mlx-community/Qwen3.6-35B-A3B-8bit","messages":[
              {"role":"system","content":"helper"},
              {"role":"assistant","content":"prior A"},
              {"role":"assistant","content":"prior B"}],"max_tokens":4}'
    {"detail":"An unexpected error occurred: No user query found in messages."}
    HTTP 500
    The same array with one user message appended → HTTP 200.

Expected

After the repair runs, the on-disk session should always contain at least one user-role message. Dropping the only user entry violates the hard requirements of multiple provider chat templates (Qwen3.6 confirmed; expected to surface on other strict templates over time).
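The invariant stated above can be checked mechanically. The following is a minimal sketch, assuming the JSONL entry shape used in the reproduction; hasUserMessage is a hypothetical helper for illustration and is not part of OpenClaw, and the "(continue)" placeholder text is borrowed from PR #75368.

```javascript
// Hypothetical helper (not part of OpenClaw): checks the invariant that a
// session transcript contains at least one user-role message entry.
function hasUserMessage(jsonl) {
    return jsonl
        .split("\n")
        .filter((line) => line.trim().length > 0)
        .map((line) => JSON.parse(line))
        .some((entry) => entry.type === "message" && entry.message?.role === "user");
}

// Sample lines shaped like the reproduction session above.
const header = '{"type":"session","sessionId":"repro","createdAt":"2026-04-30T00:00:00Z"}';
const assistant = '{"type":"message","message":{"role":"assistant","content":[{"type":"text","text":"rehydrated turn 1"}]}}';
const placeholderUser = '{"type":"message","message":{"role":"user","content":[{"type":"text","text":"(continue)"}]}}';

// Transcript after a repair that drops the blank user entry.
const afterDrop = [header, assistant].join("\n");
// Transcript after a repair that rewrites the blank user entry instead.
const afterRewrite = [header, placeholderUser, assistant].join("\n");

console.log(hasUserMessage(afterDrop));    // false
console.log(hasUserMessage(afterRewrite)); // true
```

A check like this could run after repair in a test, so a regression that removes the last user entry fails locally rather than at the provider.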

Suggested fix

Smallest possible change: in repairUserEntryWithBlankTextContent, replace both { kind: "drop" } returns with { kind: "rewrite", entry: <placeholder> }. The existing repair loop already handles { kind: "rewrite", entry } correctly, so no caller changes are needed.

function repairUserEntryWithBlankTextContent(entry) {
    const PLACEHOLDER_TEXT = "[Continuing previous conversation. Please proceed.]";
    const placeholder = () => ({
        ...entry,
        message: {
            ...entry.message,
            content: [{ type: "text", text: PLACEHOLDER_TEXT }]
        }
    });
    const content = entry.message.content;
    if (typeof content === "string") {
        return content.trim()
            ? { kind: "keep" }
            : { kind: "rewrite", entry: placeholder() };  // was: { kind: "drop" }
    }
    if (!Array.isArray(content)) return { kind: "keep" };
    let touched = false;
    const nextContent = content.filter((block) => {
        if (!block || typeof block !== "object") return true;
        if (block.type !== "text") return true;
        const text = block.text;
        if (typeof text !== "string" || text.trim().length > 0) return true;
        touched = true;
        return false;
    });
    if (nextContent.length === 0) {
        return { kind: "rewrite", entry: placeholder() };  // was: { kind: "drop" }
    }
    if (!touched) return { kind: "keep" };
    return {
        kind: "rewrite",
        entry: { ...entry, message: { ...entry.message, content: nextContent } }
    };
}

The repair summary log line then changes from dropped N blank user message(s) to rewrote N user message(s), which is also more honest about what the function actually did.
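To illustrate why no caller changes are needed, here is a rough sketch of a repair loop that consumes keep / rewrite / drop results. Only the { kind } result shape and the placeholder text come from the issue. applyRepairs, isBlankUserEntry, the report field names, and both stand-in repair functions are assumptions made for this example; the stand-ins only handle the all-blank case and do not strip blank blocks from mixed content.

```javascript
const PLACEHOLDER_TEXT = "[Continuing previous conversation. Please proceed.]";

// Simplified stand-in: true when a user entry has only blank text content.
const isBlankUserEntry = (entry) => {
    const content = entry.message.content;
    if (typeof content === "string") return content.trim() === "";
    return Array.isArray(content) && content.every(
        (b) => b && b.type === "text" && typeof b.text === "string" && b.text.trim() === ""
    );
};

// Old behaviour (drop) and proposed behaviour (rewrite), reduced to the blank case.
const dropBlank = (entry) => (isBlankUserEntry(entry) ? { kind: "drop" } : { kind: "keep" });
const rewriteBlank = (entry) => (isBlankUserEntry(entry)
    ? { kind: "rewrite", entry: { ...entry, message: { ...entry.message, content: [{ type: "text", text: PLACEHOLDER_TEXT }] } } }
    : { kind: "keep" });

// Hypothetical repair loop: the same code path serves both repair functions.
function applyRepairs(entries, repairUserEntry) {
    const out = [];
    const report = { rewrittenUserMessages: 0, droppedBlankUserMessages: 0 };
    for (const entry of entries) {
        const isUser = entry.type === "message" && entry.message?.role === "user";
        if (!isUser) { out.push(entry); continue; }
        const result = repairUserEntry(entry);
        if (result.kind === "drop") { report.droppedBlankUserMessages += 1; continue; }
        if (result.kind === "rewrite") { report.rewrittenUserMessages += 1; out.push(result.entry); continue; }
        out.push(entry);
    }
    return { entries: out, report };
}

const session = [
    { type: "message", message: { role: "user", content: [{ type: "text", text: "" }] } },
    { type: "message", message: { role: "assistant", content: [{ type: "text", text: "rehydrated turn 1" }] } },
    { type: "message", message: { role: "assistant", content: [{ type: "text", text: "rehydrated turn 2" }] } },
];
const roles = (r) => r.entries.map((e) => e.message.role);

const before = applyRepairs(session, dropBlank);
const after = applyRepairs(session, rewriteBlank);
console.log(roles(before), before.report); // no user role survives
console.log(roles(after), after.report);   // user role preserved via placeholder
```

With the drop variant the resulting roles are assistant, assistant; with the rewrite variant they are user, assistant, assistant, and the loop itself is unchanged.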

Related issues (none are exact duplicates)

  • #73472 (closed) — same file, opposite-direction concern (sanitizing empty text blocks for Anthropic). This is the sibling bug: the function DOES handle them, just destructively.
  • #75305 (open) — pre-compaction memory flush sends empty user message to Anthropic; same family of "session-memory produces broken user shape," different code path.
  • #75235 (open) — leading-assistant transcript causes infinite loop; adjacent (the corruption that leaves no leading user message).
  • #32936 / #68868 (closed) — Qwen "No user query found" misclassified as context overflow; same upstream provider error string, but those issues were about diagnosis/classification, not the repair-drop root cause.

Local workaround applied

Three-layer defense-in-depth patch applied locally and verified end-to-end:

  1. Source: repairUserEntryWithBlankTextContent rewrite-don't-drop (same as proposed fix above).
  2. Transport guard inside dist/openai-transport-stream-*.js (createOpenAICompletionsTransportStreamFn), placed AFTER the onPayload override — pre-onPayload placement was silently undone by middleware that returned a fresh params object.
  3. Defensive guard in pi-ai (@mariozechner/pi-ai/dist/providers/openai-completions.js) for any path that calls pi-ai's exported streamOpenAICompletions directly.

Happy to share the full patch script + STATUS document if useful — patches are content-based and survive content-hashed bundle renames across openclaw upgrades.
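The ordering point in layer 2 can be sketched as follows. This is an illustration under stated assumptions, not the actual patch: ensureUserMessage and FALLBACK_USER_TEXT are hypothetical names, and onPayload here is a stand-in that simulates middleware returning a fresh params object. Appending a user message mirrors the observation that the same array with one user message appended returns HTTP 200.

```javascript
// Hypothetical guard (illustrative name): appends a placeholder user message
// when the outgoing Chat Completions payload has no user-role message.
const FALLBACK_USER_TEXT = "(continue)"; // placeholder text borrowed from PR #75368

function ensureUserMessage(params) {
    const messages = Array.isArray(params.messages) ? params.messages : [];
    if (messages.some((m) => m && m.role === "user")) return params;
    return { ...params, messages: [...messages, { role: "user", content: FALLBACK_USER_TEXT }] };
}

// Payload as it would look after the blank user entry was dropped.
const upstream = {
    model: "mlx-community/Qwen3.6-35B-A3B-8bit",
    messages: [
        { role: "system", content: "helper" },
        { role: "assistant", content: "prior A" },
    ],
};

// Stand-in for middleware that ignores its input and returns a fresh object
// rebuilt from upstream state (simulated; the real middleware is not shown in the issue).
const onPayload = (_params) => ({ ...upstream, messages: [...upstream.messages] });

// Guard placed before onPayload: the middleware discards the guarded object.
const guardedBefore = onPayload(ensureUserMessage(upstream));
// Guard placed after onPayload: the final payload is what gets checked.
const guardedAfter = ensureUserMessage(onPayload(upstream));

console.log(guardedBefore.messages.map((m) => m.role)); // [ 'system', 'assistant' ]
console.log(guardedAfter.messages.map((m) => m.role));  // [ 'system', 'assistant', 'user' ]
```

Running the guard on the last object handed to the HTTP client is what makes it robust to middleware that replaces, rather than mutates, the params.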

extent analysis

TL;DR

The most likely fix is to modify the repairUserEntryWithBlankTextContent function to rewrite blank user messages instead of dropping them.

Guidance

  • Identify and modify the repairUserEntryWithBlankTextContent function (src/agents/session-file-repair.ts in the source tree, per PR #75368; dist/compaction-successor-transcript-*.js in the published bundle) to replace both { kind: "drop" } returns with { kind: "rewrite", entry: <placeholder> }.
  • Verify the fix by checking the repair summary log line, which should change from dropped N blank user message(s) to rewrote N user message(s).
  • Test the fix using the provided reproduction steps to ensure that the on-disk session always contains at least one user-role message after repair.
  • Consider applying additional defensive guards, such as the transport guard in dist/openai-transport-stream-*.js or the defensive guard in pi-ai, to prevent similar issues in the future.

Example

The suggested fix provides an example of how to modify the repairUserEntryWithBlankTextContent function to rewrite blank user messages:

function repairUserEntryWithBlankTextContent(entry) {
    // ...
    return { kind: "rewrite", entry: placeholder() };  // was: { kind: "drop" }
    // ...
}

Notes

The provided fix is a minimal change that should resolve the issue, but additional testing and verification may be necessary to ensure that it does not introduce any new problems.

Recommendation

Apply the suggested fix to the repairUserEntryWithBlankTextContent function to rewrite blank user messages instead of dropping them, as it is a targeted and minimal change that should resolve the issue.
