openclaw: `opencode-go/kimi-k2.6` rejects every multi-turn task with opaque 400 "Provider returned error" (reason=format), rotates to fallback (2026.5.26 and 2026.5.27)



Summary

Multi-turn agent runs against opencode-go/kimi-k2.6 (any agent that calls a tool and then continues the turn) hit a generic provider 400 on the second model call and rotate to fallback. The fallback decision logs classify the failure as reason: "format", but the actual failure detail from the provider is opaque — just "400 Error from provider: Provider returned error" with no field, schema name, or other diagnostic. Stable across 2026.5.26 and 2026.5.27.

Distinct from #81988 (closed 2026-05-17, was a messages[N].reasoning field replay issue with reason: model_not_found and an explicit field name in the failure detail). That fix shipped in 5.18 and resolved the specific symptom. This is a different failure mode — the failure body carries no field info and the reason is format instead of model_not_found. Whatever sanitizer or replay path the 5.18 fix added isn't covering this case.

Environment

  • OpenClaw 2026.5.26 stable (also reproduced on 2026.5.27 stable)
  • Provider plugin: opencode-go
  • Model: opencode-go/kimi-k2.6
  • Agent: coder (sandboxed, node:22-bookworm docker image, thinkingDefault: medium)
  • Identical fallback chain: opencode-go/kimi-k2.6 → opencode-go/deepseek-v4-pro
  • No models.providers block in ~/.openclaw/openclaw.json; opencode-go plugin defaults

Reproduction

openclaw agent --agent coder --local --json --message \
  'In your sandbox at /workspace, create probe/hello.txt with contents "READY".
   Use the Write tool. Reply with exactly: PROBE_DONE.'

This is a 2-turn flow: turn 1 the model issues a Write tool call, turn 2 it produces the final assistant text. Single-turn probes against the same model (Reply with exactly: PONG) succeed every time on the primary — the bug only manifests when there is at least one assistant→tool→assistant cycle in the replay.
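To make the assistant→tool→assistant cycle concrete, here is a minimal sketch of what the replayed history might look like on the second model call, assuming the OpenAI Chat Completions message shape. The call id, the Write tool's argument names, and the tool-result text are made up for illustration; the actual payload the gateway sends may carry additional fields.

```typescript
// Hypothetical illustration only: ids, argument names, and result text are
// assumptions, not values taken from a real session.
type ToolCall = {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
};

type ChatMessage =
  | { role: "user"; content: string }
  | { role: "assistant"; content: string | null; tool_calls?: ToolCall[] }
  | { role: "tool"; tool_call_id: string; content: string };

// History as it might be replayed before the second model call.
const replayedHistory: ChatMessage[] = [
  { role: "user", content: 'Create probe/hello.txt with contents "READY".' },
  {
    role: "assistant",
    content: null,
    tool_calls: [
      {
        id: "call_1",
        type: "function",
        function: {
          name: "Write",
          arguments: JSON.stringify({ path: "probe/hello.txt", content: "READY" }),
        },
      },
    ],
  },
  { role: "tool", tool_call_id: "call_1", content: "ok" },
];

// True when an assistant tool call is followed by its matching tool result,
// i.e. the history contains at least one assistant→tool cycle.
function hasToolCycle(messages: ChatMessage[]): boolean {
  const seenCallIds = new Set<string>();
  for (const message of messages) {
    if (message.role === "assistant") {
      for (const call of message.tool_calls ?? []) seenCallIds.add(call.id);
    } else if (message.role === "tool" && seenCallIds.has(message.tool_call_id)) {
      return true;
    }
  }
  return false;
}

console.log(hasToolCycle(replayedHistory)); // true
console.log(hasToolCycle([{ role: "user", content: "Reply with exactly: PONG" }])); // false
```

The single-turn PONG probe never produces a history of the first kind, which is consistent with it succeeding on the primary.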

Observed behavior

The first turn succeeds (Write tool fires, file lands on disk). The second turn — when the gateway replays history including the turn-1 assistant + tool-result blocks — fails with:

fallbackStepFromModel: "opencode-go/kimi-k2.6"
fallbackStepToModel:   "opencode-go/deepseek-v4-pro"
fallbackStepFromFailureReason: "format"
fallbackStepFromFailureDetail: "400 Error from provider: Provider returned error"
fallbackStepFinalOutcome: "succeeded" (deepseek-v4-pro completes the task)

Per attempt, the request is retried twice on the primary (both 400) before rotation; total kimi overhead ~8–13s, then ~25s for the fallback to complete. End result: finalAssistantVisibleText: PROBE_DONE, failures: 0, winnerModel: "deepseek-v4-pro", fallbackUsed: false (5.12+ semantic: watchdog rotation doesn't flip fallbackUsed).

Reproduced 3 times across 2 OpenClaw versions (2× on 5.26 between 2026-05-27 20:23–20:24 WIB, 1× on 5.27 at 2026-05-28 19:53 WIB). No transient pattern — happens on every multi-turn run.

Expected behavior

Either:

  1. The replay sanitizer that fixed #81988 (PR #82380, src/agents/openai-transport-stream.ts:2832) catches whatever the new failing field is, OR
  2. The provider 400 surfaces a specific field/schema diagnostic so operators can identify what kimi is rejecting now, OR
  3. The opencode-go plugin transforms the replay payload to a form kimi accepts.

Today, all three are missing — the failure is opaque and consistent.

Suggested investigation

  • The fallback classifier labels this reason: format, which suggests OpenClaw is interpreting the 400 body as a schema-format issue. Whatever heuristic does that classification has more information than what reaches fallbackStepFromFailureDetail. Surfacing the underlying provider body (or the specific JSON-Schema path that failed validation) in the failure detail would let operators / clawsweeper identify the field.
  • If the bug is in the same family as #81988, it likely involves a new field name the provider has started rejecting that the post-#82380 sanitizer doesn't strip. Candidates: cache_control, signature, tool_use_id shape, or another assistant-message field that's valid in the OpenAI Chat Completions contract but kimi-via-opencode-go now validates strictly.
  • Worth checking against the explicit kimi-k2.6 allowlist in src/agents/openai-transport-stream.ts:2841 — that's where the post-#82380 sanitizer applies. The opencode-go plugin path (extensions/opencode-go/index.js) installs PASSTHROUGH_GEMINI_REPLAY_HOOKS separately; both code paths should reach the same sanitizer for openai-completions models, but there may be a hole.
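One way to narrow down the candidate fields listed above is to audit the failing request payload for message keys outside a baseline set, then retry with one candidate stripped at a time. The sketch below is hypothetical throughout: `BASELINE_KEYS`, `findExtraKeys`, and `withoutKey` are illustrative names, not OpenClaw APIs, and the baseline set is an assumption about which top-level message keys are uncontroversial. It inspects top-level message keys only, so a field nested inside content parts would need a deeper walk.

```typescript
// Assumed baseline of top-level message keys; adjust to taste.
const BASELINE_KEYS = new Set(["role", "content", "name", "tool_calls", "tool_call_id"]);

type LooseMessage = Record<string, unknown>;

// Maps each message index to the keys that fall outside the baseline set.
function findExtraKeys(messages: LooseMessage[]): Map<number, string[]> {
  const report = new Map<number, string[]>();
  messages.forEach((message, index) => {
    const extras = Object.keys(message).filter((key) => !BASELINE_KEYS.has(key));
    if (extras.length > 0) report.set(index, extras);
  });
  return report;
}

// Returns a copy of the payload with one candidate key removed from every
// message, so each candidate can be tested in isolation. The input is not mutated.
function withoutKey(messages: LooseMessage[], key: string): LooseMessage[] {
  return messages.map((message) => {
    const copy: LooseMessage = { ...message };
    delete copy[key];
    return copy;
  });
}

// Made-up payload for illustration; "signature" is one of the candidates named above.
const auditPayload: LooseMessage[] = [
  { role: "user", content: "hi" },
  { role: "assistant", content: null, signature: "abc", tool_calls: [] },
];

const extraKeys = findExtraKeys(auditPayload);
console.log(extraKeys.get(1)); // the keys flagged on the assistant message
const strippedPayload = withoutKey(auditPayload, "signature");
console.log(Object.keys(strippedPayload[1]));
```

If resending the stripped payload directly to the provider stops the 400, the removed key is a strong suspect for what the sanitizer is missing.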

Workaround

The configured fallback (opencode-go/deepseek-v4-pro in our setup) handles it. Tasks complete with ~13s overhead per multi-turn call. No data loss, no operator action required per call — but the failure adds noise to logs (model fallback decision events on every multi-turn coder task) and could mask real model failures.
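Until this is fixed, it may help to separate these known format-classified rotations from other fallback reasons so that real model failures stay visible. The following is a rough sketch that assumes each fallback decision is logged as one JSON object per line with the field names shown under "Observed behavior" as top-level keys; the actual log layout may differ. `summarizeRotations` is a hypothetical helper, and the "timeout" reason in the sample is invented for the example.

```typescript
type FallbackEvent = {
  fallbackStepFromModel?: string;
  fallbackStepToModel?: string;
  fallbackStepFromFailureReason?: string;
  fallbackStepFromFailureDetail?: string;
};

// Counts rotations away from `model`, split into reason "format" and everything else.
function summarizeRotations(jsonl: string, model: string): { format: number; other: number } {
  const summary = { format: 0, other: 0 };
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      continue; // skip lines that are not valid JSON
    }
    if (typeof parsed !== "object" || parsed === null) continue;
    const event = parsed as FallbackEvent;
    if (event.fallbackStepFromModel !== model) continue;
    if (event.fallbackStepFromFailureReason === "format") summary.format += 1;
    else summary.other += 1;
  }
  return summary;
}

// Made-up sample log: one format rotation, one invented "timeout" rotation, one junk line.
const sampleLog = [
  JSON.stringify({
    fallbackStepFromModel: "opencode-go/kimi-k2.6",
    fallbackStepToModel: "opencode-go/deepseek-v4-pro",
    fallbackStepFromFailureReason: "format",
    fallbackStepFromFailureDetail: "400 Error from provider: Provider returned error",
  }),
  JSON.stringify({
    fallbackStepFromModel: "opencode-go/kimi-k2.6",
    fallbackStepFromFailureReason: "timeout",
  }),
  "not json",
].join("\n");

const rotationSummary = summarizeRotations(sampleLog, "opencode-go/kimi-k2.6");
console.log(rotationSummary); // { format: 1, other: 1 }
```

A non-zero `other` count would be the signal worth alerting on, since the `format` count is expected to rise with every multi-turn coder task while this issue is open.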

Cross-references

  • #81988 (closed 2026-05-17, fixed by PR #82380 in 5.18): predecessor with messages[N].reasoning rejection. Symptom-distinct from this one (different field, different reason classification, opaque error body).
  • #87170 (open): similar-looking "Provider returned error" symptom but different version/provider/scope (OpenRouter on 2026.3.2, every message fails). Not the same issue.

I have full session JSONL excerpts and the failing API request payload available on request.
