openclaw - ✅ (Solved) Corrupt session model overrides return in 2026.5.3-1: providerOverride/modelOverride stored as ('anthropic', 'claude-haiku-4.5') for OpenRouter ids

GitHub stats
openclaw/openclaw#78161 · Fetched 2026-05-06 06:16:26
Comments: 1 · Participants: 2 · Timeline: 2 (commented ×1, cross-referenced ×1) · Reactions: 2

On 2026.5.3-1, the bench gateway returned FailoverError: Unknown model: anthropic/claude-haiku-4.5 on every agent run while agents.defaults.model.primary was correctly set to openrouter/anthropic/claude-haiku-4.5. The session entry for agent:main:main held providerOverride: "anthropic" and modelOverride: "claude-haiku-4.5", the back half of the OpenRouter id with the openrouter/ wrapper missing.
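To make the two shapes concrete, the sketch below splits the configured id at the first slash and then splits the remainder again. It only illustrates how the stored pair relates to the configured id; `split_first` is a hypothetical helper, not openclaw's `splitModelRef`, and the sketch does not claim to show how the corruption actually arises.

```python
# Illustration only: split_first is a hypothetical helper, not openclaw code.

def split_first(ref: str) -> tuple[str, str]:
    """Split a qualified model ref at the first '/' into (provider, model)."""
    provider, _, model = ref.partition("/")
    return provider, model


configured = "openrouter/anthropic/claude-haiku-4.5"

# Expected shape: the OpenRouter wrapper is the provider, the rest is the model id.
good = split_first(configured)

# Shape found in the session entry: the same split applied to the remainder,
# which is the configured id with the "openrouter/" wrapper missing.
corrupt = split_first(good[1])

print(good)     # ('openrouter', 'anthropic/claude-haiku-4.5')
print(corrupt)  # ('anthropic', 'claude-haiku-4.5')
```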

The dispatcher in agent-command reads the override before the global default. Since the per agent allowlist is empty for most operators, allowAnyModel is true and the existing self heal block at the top of the override handling never fires. The corrupt override is then passed to runWithModelFallback, the resolver looks up an anthropic provider that is not registered, and the run fails with FailoverError: Unknown model: anthropic/claude-haiku-4.5. openclaw models list --agent <id> continues to show the configured model as valid the entire time, which makes the bug hard to find.

This is the same bug class as #70572 (closed 2026-04-25 as not reproducible). The session state layer is still vulnerable in 2026.5.3-1.

Error Message

FailoverError: Unknown model: anthropic/claude-haiku-4.5

PR fix notes

PR #78174: fix: guard corrupt session model overrides

Description (problem / solution / changelog)

Fixes #78161.

Summary

  • reject non-default session model overrides whose provider is absent from cfg.models.providers
  • make sessions.patch surface that refusal as an invalid model patch instead of persisting corrupt state
  • clear already-stored overrides with unconfigured providers during agent dispatch, even when allowAnyModel=true, so old OpenRouter-split sessions fall back to the configured default

Tests

  • PATH="/tmp/openclaw-pnpm-shim:$PATH" node scripts/run-vitest.mjs run src/sessions/model-overrides.test.ts src/gateway/sessions-patch.test.ts src/agents/agent-command.live-model-switch.test.ts
  • git diff --check
  • PATH="/tmp/openclaw-pnpm-shim:$PATH" node scripts/check-changed.mjs
  • PATH="/tmp/openclaw-pnpm-shim:$PATH" pnpm oxfmt --check src/sessions/model-overrides.ts src/sessions/model-overrides.test.ts src/gateway/sessions-patch.ts src/gateway/sessions-patch.test.ts src/agents/agent-command.ts src/agents/agent-command.live-model-switch.test.ts

Changed files

  • src/agents/agent-command.live-model-switch.test.ts (modified, +2/-0)
  • src/agents/agent-command.ts (modified, +13/-2)
  • src/gateway/sessions-patch.test.ts (modified, +22/-0)
  • src/gateway/sessions-patch.ts (modified, +6/-1)
  • src/sessions/model-overrides.test.ts (modified, +69/-1)
  • src/sessions/model-overrides.ts (modified, +33/-1)

Reproduction

The corrupt state can be created directly:

python3 -c "import json, pathlib; sj = pathlib.Path.home()/'.openclaw/agents/main/sessions/sessions.json'; d = json.loads(sj.read_text()); d['agent:main:main'].update({'providerOverride':'anthropic','modelOverride':'claude-haiku-4.5','modelOverrideSource':'user'}); sj.write_text(json.dumps(d, indent=2))"

Then openclaw agent --agent main -m ping produces FailoverError: Unknown model: anthropic/claude-haiku-4.5. In some flows, the Control UI per-session model picker can produce the same shape when given an OpenRouter id; the deterministic reproduction above is the safer regression case.

Side observation: the same agent:main:main entry has modelProvider: "openrouter" and model: "anthropic/claude-haiku-4.5" (the runtime fields, correctly shaped) right next to the corrupt providerOverride/modelOverride pair. Two writers, disagreeing on shape.
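A read-only check along these lines could help spot affected sessions before running an agent. It is a sketch under assumptions: it treats sessions.json as a JSON object keyed by session key (as the reproduction command does), uses only the four field names quoted in this issue, and takes the set of configured providers as a hand-supplied argument, because the config file layout is not described here. `find_suspect_overrides` is a hypothetical name.

```python
import json
import pathlib


def find_suspect_overrides(sessions: dict, configured_providers: set) -> list:
    """Return one record per session whose override provider is not configured."""
    suspects = []
    for key, entry in sessions.items():
        if not isinstance(entry, dict):
            continue
        provider = entry.get("providerOverride")
        if provider is None or provider in configured_providers:
            continue
        suspects.append({
            "session": key,
            # The override pair that the dispatcher would read first.
            "override": (provider, entry.get("modelOverride")),
            # The runtime fields stored next to it, for comparison.
            "runtime": (entry.get("modelProvider"), entry.get("model")),
        })
    return suspects


if __name__ == "__main__":
    # Path taken from the reproduction command; the provider set is an assumption.
    path = pathlib.Path.home() / ".openclaw/agents/main/sessions/sessions.json"
    if path.exists():
        for record in find_suspect_overrides(json.loads(path.read_text()), {"openrouter"}):
            print(record)
```

The script never writes to the file, so it is safe to run against a live gateway's session store.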

Expected

Either the writer rejects the corrupt provider/model pair, or the dispatcher self heals when it reads an override whose provider is not registered in cfg.models.providers. Ideally both, since the writer side prevents new corruption while the read side fixes existing corrupt sessions on the next run.

Suggested fix

Two narrow patches close the loop. Verified locally against the 2026.5.3-1 dist build.

1. Write side guard in applyModelOverrideToSessionEntry (src/sessions/model-overrides.ts): refuse to persist when selection.provider is not present in cfg.models.providers. Return {updated: false} with a warning. The single sessions.patch caller should pass cfg and surface the refusal as an invalid response so the UI sees an error rather than a silent write.

2. Read side self heal in agent-command (the override cleanup block right before override consumption): the existing block currently only runs when !allowAnyModel. Extend the predicate to also clear the override when its provider is not registered in cfg.models.providers, regardless of allowAnyModel mode. This auto repairs sessions corrupted before the write side guard exists.
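The write side guard from item 1 can be modelled in a few lines. This is a Python sketch of the logic only, not the TypeScript in src/sessions/model-overrides.ts: `apply_model_override` and its parameters are hypothetical, and passing `configured_providers=None` stands in for a caller that does not pass cfg, which the issue says should keep the previous behavior.

```python
import sys
from typing import Optional


def apply_model_override(entry: dict, provider: str, model: str,
                         configured_providers: Optional[set] = None) -> dict:
    """Model of the write side guard; returns {'updated': bool} like the issue describes."""
    # Callers that pass no provider set keep the old, unguarded behavior.
    if configured_providers is not None and provider not in configured_providers:
        print(f"refused session override write: unregistered provider {provider!r}",
              file=sys.stderr)
        return {"updated": False}
    entry["providerOverride"] = provider
    entry["modelOverride"] = model
    return {"updated": True}


providers = {"openrouter"}
entry = {}

# The split pair is refused and nothing is persisted.
print(apply_model_override(entry, "anthropic", "claude-haiku-4.5", providers))
# The correctly qualified pair writes through.
print(apply_model_override(entry, "openrouter", "anthropic/claude-haiku-4.5", providers))
```

Returning a result object rather than raising lets the sessions.patch caller decide how to surface the refusal to the UI.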

I applied both against the local dist build (dist/model-overrides-*.js, dist/sessions-patch-*.js, dist/agent-command-*.js). Unit tested by importing the writer directly and confirmed:

  • broken ("anthropic", "claude-haiku-4.5") is refused with a [openclaw-local-patch] refused session override write warning,
  • valid ("openrouter", "anthropic/claude-haiku-4.5") writes through,
  • callers that do not pass cfg keep the previous behavior, so failover bookkeeping and subagent spawn paths are unaffected.

Regression tested by reinjecting the corrupt pattern into sessions.json and running the agent: the entry auto cleared, the run completed against the global default, no FailoverError, stderr showed [openclaw-local-patch] clearing session override with unregistered provider "anthropic". Happy to send a PR if useful, just point me at whichever entry points should also be covered (sessions.patch is the obvious one; subagent spawn already builds qualified ids via splitModelRef so it should be safe to leave alone).
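The read side self heal and the regression case above can be sketched the same way. Again this is a Python model rather than the code in src/agents/agent-command.ts: `resolve_model` is a hypothetical name, and the allowlist membership test is an assumption about what the existing `!allowAnyModel` block checks. Only the added "provider not registered" condition comes directly from the issue.

```python
def resolve_model(entry: dict, default_ref: str, configured_providers: set,
                  allow_any_model: bool, allowlist: frozenset = frozenset()) -> str:
    """Pick the model ref for a run, clearing an unusable session override first."""
    provider = entry.get("providerOverride")
    model = entry.get("modelOverride")
    if provider is None or model is None:
        return default_ref

    ref = f"{provider}/{model}"
    # Existing condition (assumed shape): only applies when an allowlist is in force.
    not_allowed = not allow_any_model and ref not in allowlist
    # Added condition: applies regardless of allow_any_model.
    unregistered = provider not in configured_providers

    if not_allowed or unregistered:
        # Clear the stored override so later runs also use the default.
        for field in ("providerOverride", "modelOverride", "modelOverrideSource"):
            entry.pop(field, None)
        return default_ref
    return ref


# Regression case: the corrupt pair with an empty allowlist (allow_any_model=True).
entry = {"providerOverride": "anthropic", "modelOverride": "claude-haiku-4.5",
         "modelOverrideSource": "user"}
chosen = resolve_model(entry, "openrouter/anthropic/claude-haiku-4.5",
                       {"openrouter"}, allow_any_model=True)
print(chosen)  # openrouter/anthropic/claude-haiku-4.5
print(entry)   # {}
```

With only the first condition, the corrupt entry would survive whenever `allow_any_model` is true, which is the failure described in this issue.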

Environment

openclaw 2026.5.3-1 (commit 2eae30e), node 22.x, macOS 14, gateway run via launchd. 8 agent bench, OpenRouter key in per agent auth-profiles.json.

Extent analysis

TL;DR

The most likely fix is to apply two narrow patches: a write side guard to refuse corrupt provider/model pairs and a read side self-heal to clear unregistered provider overrides.

Guidance

  • Implement a write side guard in applyModelOverrideToSessionEntry to refuse persisting corrupt provider/model pairs by checking if selection.provider is present in cfg.models.providers.
  • Extend the read side self-heal predicate in agent-command to clear overrides with unregistered providers, regardless of allowAnyModel mode.
  • Verify the fix by regression testing with the corrupt pattern injected into sessions.json and checking for the absence of FailoverError.
  • Review the code changes to ensure they do not affect other parts of the system, such as subagent spawn paths.

Example

The issue describes both patches in prose and does not include the diff itself; the merged changes are in PR #78174.

Notes

The provided fix is specific to the openclaw 2026.5.3-1 version, and its applicability to other versions is uncertain. The suggested fix should be reviewed and tested thoroughly before applying it to production environments.

Recommendation

Apply the suggested workaround by implementing the two narrow patches, as it directly addresses the identified issue and has been verified locally against the 2026.5.3-1 dist build.
