openclaw - ✅(Solved) Fix [Bug]: Isolated cron sessions silently ignore model override — modelApplied: true returned but wrong model runs [4 pull requests, 7 comments, 4 participants]

GitHub stats
openclaw/openclaw#59257 · Fetched 2026-04-08 02:26:49
Comments: 7 · Participants: 4 · Timeline: 14 · Reactions: 1
Timeline (top): commented ×7, subscribed ×3, labeled ×2, mentioned ×2

When a cron job specifies a model override (e.g. "model": "ollama/nemotron-3-super"), the gateway returns modelApplied: true but silently runs the session on the default cloud model instead.

Error Message

  • No error, no warning, no log entry indicating the fallback

PR fix notes

PR #57094: fix(agents): fix inverted model fallback order and spurious live session switches

Description (problem / solution / changelog)

Summary

  • Problem: In v2026.3.28, model fallback logic appears inverted for child/thread/cron sessions. The system attempts the fallback model first, then immediately triggers a "live session model switch" back to the primary model. This causes every request to be processed twice, leading to massive latency spikes (e.g., 393ms gateway latency, 59s nested lane queueing) and leaking session modelOverrides into child sessions. The issue is tracked in #57063 and #56788. Affected files: src/agents/live-model-switch.ts, src/agents/model-fallback.ts

  • Root Cause: The root cause lies in resolveLiveSessionModelSelection() (src/agents/live-model-switch.ts). When a child/heartbeat session has no explicit modelOverride in its session store, the function falls back to resolveDefaultModelForAgent() (which returns the config-level default model). It completely ignores the defaultProvider and defaultModel parameters passed by the caller (pi-embedded-runner/run.ts), which actually contain the correctly resolved model for the current run (including parent-session inherited overrides and heartbeat configs).

    Because the live selection incorrectly returns the config default instead of the inherited model, hasDifferentLiveSessionModelSelection returns true, throwing a LiveSessionModelSwitchError.

    Compounding this, in the agent-command fallback path (src/agents/model-fallback.ts), LiveSessionModelSwitchError is caught by runFallbackCandidate but not rethrown. It is treated as a candidate failure, causing the fallback loop to skip the correct inherited model and move to the next candidate (which happens to be the config default), effectively inverting the fallback order.

  • Fix:

    1. In live-model-switch.ts: Modified resolveLiveSessionModelSelection to honour the caller-supplied defaultProvider and defaultModel when the session entry lacks an explicit modelOverride. The config-level default is now only used if the caller defaults are missing. This ensures the live selection matches the runtime-resolved model, preventing spurious switch errors.
    2. In model-fallback.ts: Added an explicit check in runFallbackCandidate to immediately rethrow LiveSessionModelSwitchError. This error indicates a state change, not a model availability failure, and must never be swallowed by the fallback loop.
  • What changed:

    • src/agents/live-model-switch.ts: Updated fallback logic in resolveLiveSessionModelSelection to prioritize caller defaults over config defaults.
    • src/agents/model-fallback.ts: Added rethrow logic for LiveSessionModelSwitchError in runFallbackCandidate.
    • src/agents/live-model-switch.test.ts: Added 6 new test cases covering child sessions, heartbeat sessions, and explicit overrides.
    • src/agents/model-fallback.test.ts: Added 2 new test cases ensuring LiveSessionModelSwitchError is not swallowed.
  • What did NOT change (scope boundary):

    • The core auto-reply model resolution logic (resolveStoredModelOverride, resolveThreadParentSessionKey) remains untouched.
    • The session store data structure and merge logic are unchanged.
    • The general FailoverError coercion and retry backoff logic remain unchanged.
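
The precedence described in the fix (explicit session override first, then caller-supplied defaults, then the config-level default) can be sketched as follows. This is a minimal illustration, not the actual body of resolveLiveSessionModelSelection: the SessionEntry, ModelSelection, and ResolveParams shapes and the resolveLiveSelection name are assumptions made for this example.

```typescript
// Hypothetical shapes; the real OpenClaw types are not shown in this report.
interface SessionEntry {
  modelOverride?: string;
  providerOverride?: string;
}

interface ModelSelection {
  provider: string;
  model: string;
}

interface ResolveParams {
  entry?: SessionEntry; // stored session state, if any
  defaultProvider?: string; // provider the caller resolved for this run
  defaultModel?: string; // model the caller resolved for this run
  configDefault: ModelSelection; // config-level default, used as a last resort
}

function resolveLiveSelection(params: ResolveParams): ModelSelection {
  const { entry, defaultProvider, defaultModel, configDefault } = params;

  // 1. An explicit, non-blank session override always wins.
  const override = entry?.modelOverride?.trim();
  if (override) {
    const provider =
      entry?.providerOverride?.trim() || defaultProvider || configDefault.provider;
    return { provider, model: override };
  }

  // 2. Otherwise honour what the caller already resolved for this run
  //    (inherited parent override, heartbeat config, and so on).
  if (defaultProvider && defaultModel) {
    return { provider: defaultProvider, model: defaultModel };
  }

  // 3. Only when the caller supplied nothing, use the config-level default.
  return configDefault;
}
```

With this ordering, a child session that has no stored override resolves to the same model the caller is about to run, so a comparison between the two finds no difference and no switch error is raised.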

Reproduction

  1. Configure a primary model (e.g., anthropic/claude-opus-4-6) and a fallback model (e.g., google/gemini-3-flash-preview) in config.yaml.
  2. Start a session and explicitly set the model to the fallback via /model google/gemini-3-flash-preview.
  3. Trigger a nested subagent run or wait for a heartbeat/cron job in that session.
  4. Before fix: The logs will show live session model switch detected before attempt... google/gemini-3-flash-preview -> anthropic/claude-opus-4-6. The run will fail on the first attempt and retry, doubling the latency.
  5. After fix: The nested/cron session correctly inherits and uses google/gemini-3-flash-preview without triggering a live switch or fallback loop.
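
The effect of swallowing versus rethrowing the switch error in a fallback loop can be sketched like this. It is a simplified model under stated assumptions, not the code in model-fallback.ts: runWithFallback, makeSwitchError, isSwitchError, and the rethrowSwitch flag are hypothetical names, and the error is identified by its name rather than by a real class.

```typescript
// Hypothetical stand-in for LiveSessionModelSwitchError, identified by name.
const SWITCH_ERROR_NAME = "LiveSessionModelSwitchError";

function makeSwitchError(target: string): Error {
  const err = new Error("live session model switch to " + target);
  err.name = SWITCH_ERROR_NAME;
  return err;
}

function isSwitchError(err: unknown): boolean {
  return err instanceof Error && err.name === SWITCH_ERROR_NAME;
}

type Attempt = (model: string) => string;

// Tries each candidate in order and returns the first successful result.
// When rethrowSwitch is false, a switch error is treated like any other
// candidate failure, so the loop silently moves on to the next model.
function runWithFallback(
  candidates: string[],
  attempt: Attempt,
  rethrowSwitch: boolean,
): string {
  let lastError: unknown;
  for (const model of candidates) {
    try {
      return attempt(model);
    } catch (err) {
      if (rethrowSwitch && isSwitchError(err)) {
        // A switch error signals a state change, not an unavailable model.
        throw err;
      }
      lastError = err;
    }
  }
  throw new Error("all candidates failed: " + String(lastError));
}
```

If the attempt for the inherited model raises a switch error, the swallowing variant ends up running the next candidate (here the config default), which mirrors the inverted order in step 4. The rethrowing variant surfaces the error to the caller instead of hiding it.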

Risk / Mitigation

  • Risk: Changes to live model selection could potentially break explicit user /model commands if the override is ignored.
  • Mitigation: The fix explicitly checks Boolean(entry?.modelOverride?.trim()) and continues to honour explicit session store overrides. Comprehensive unit tests were added to verify that explicit overrides still take precedence over caller defaults, while inherited/heartbeat models correctly use the caller defaults.

Change Type (select all)

  • Bug fix

Scope (select all touched areas)

  • Gateway / orchestration
  • App: web-ui

Linked Issue/PR

Fixes #57063 Fixes #56788

Changed files

  • src/agents/live-model-switch.test.ts (modified, +225/-0)
  • src/agents/live-model-switch.ts (modified, +38/-8)
  • src/agents/model-fallback.test.ts (modified, +80/-0)
  • src/agents/model-fallback.ts (modified, +39/-0)
  • src/auto-reply/reply/agent-runner-execution.test.ts (modified, +53/-0)
  • src/auto-reply/reply/agent-runner-execution.ts (modified, +24/-3)

PR #58992: fix(cron): clear stale model state on new isolated sessions

Description (problem / solution / changelog)

Summary

  • Clear model, modelProvider, modelOverride, and providerOverride when creating new isolated cron sessions
  • Prevents stale model state from shadowing payload.model override and triggering LiveSessionModelSwitchError

Details

When resolveCronSession creates a new session for an isolated cron job, it spreads ...entry from the previous session to preserve per-session overrides. However, this also preserves stale model fields:

  • modelOverride/providerOverride from a previous /model command can shadow the cron payload's model override in resolveCronModelSelection
  • model/modelProvider from a previous fallback can trigger LiveSessionModelSwitchError when the next run tries to use the primary model

The fix adds these four fields to the isNewSession cleanup block, alongside the existing delivery context cleanup.
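
A minimal sketch of that cleanup, assuming a simplified session shape. CronSessionEntry, resolveNextCronSession, and the thinkingLevel field (standing in for any per-session override worth keeping) are hypothetical; only the four cleared field names come from the PR description.

```typescript
// Hypothetical session shape; only the four model fields are named in the PR.
interface CronSessionEntry {
  sessionId: string;
  thinkingLevel?: string; // example of a per-session override to preserve
  model?: string;
  modelProvider?: string;
  modelOverride?: string;
  providerOverride?: string;
}

function resolveNextCronSession(
  previous: CronSessionEntry | undefined,
  freshSessionId: string,
  isNewSession: boolean,
): CronSessionEntry {
  // Reused sessions keep their model state untouched.
  if (!isNewSession && previous) {
    return { ...previous };
  }

  // New isolated session: carry over per-session overrides via the spread...
  const next: CronSessionEntry = { ...previous, sessionId: freshSessionId };

  // ...but drop stale model state so it cannot shadow payload.model
  // or trigger a spurious live model switch on the next run.
  delete next.model;
  delete next.modelProvider;
  delete next.modelOverride;
  delete next.providerOverride;

  return next;
}
```

The spread is kept on purpose: removing it would also discard the overrides the original code meant to preserve, so the sketch clears only the model-related fields.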

Fixes #58575 Fixes #58585

Test plan

  • Create cron job with --model haiku --session isolated; verify session uses haiku, not agent default
  • Trigger model fallback in one cron run; verify next run uses primary model without error
  • Verify non-isolated (reused) sessions still preserve model state correctly

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 (1M context)

Changed files

  • src/cron/isolated-agent/delivery-target.ts (modified, +65/-35)
  • src/cron/isolated-agent/session.ts (modified, +8/-0)

PR #67765: fix(cron): loud error when payload.model silently falls back to wrong provider (#67756)

Description (problem / solution / changelog)

Problem

When a cron job payload explicitly declares "model": "ollama/llama3.2:3b", the gateway silently falls back to the default provider/model (openai/gpt-4o-mini) if the model is not in the allowlist. There is a warning field returned, but it does not include enough detail to diagnose the issue.

Impact: User configures cron job to use ollama, job actually hits OpenAI, user has no idea without deep debugging.

Root Cause

In resolveCronModelSelection() (src/cron/isolated-agent/model-selection.ts), when resolveAllowedModelRef() returns "model not allowed" for an explicit payload.model, the code returns defaults with only a vague warning — effectively hiding the misconfiguration.

Fix

  1. Added getLog().error() with full diagnostics (intended provider/model, fallback target, resolved key, fix suggestion)
  2. Enhanced warning string to include actual fallback provider/model
  3. Added test coverage for both resolution and fallback paths
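
The three points above can be illustrated with a small sketch. It assumes an allowlist of plain provider/model strings and a logger with a single error method. selectCronModel, splitModelRef, CronModelSelection, and Logger are hypothetical names for this example and are not the signatures of resolveCronModelSelection or resolveAllowedModelRef.

```typescript
interface CronModelSelection {
  provider: string;
  model: string;
  warning?: string;
}

interface Logger {
  error(message: string): void;
}

// Splits "provider/model" at the first slash; the model part may contain ":".
function splitModelRef(ref: string): { provider: string; model: string } | undefined {
  const slash = ref.indexOf("/");
  if (slash <= 0 || slash === ref.length - 1) return undefined;
  return { provider: ref.slice(0, slash), model: ref.slice(slash + 1) };
}

function selectCronModel(
  payloadModel: string | undefined,
  allowlist: string[],
  fallback: { provider: string; model: string },
  log: Logger,
): CronModelSelection {
  // No explicit request: using the default is expected and needs no warning.
  if (!payloadModel) {
    return { provider: fallback.provider, model: fallback.model };
  }

  const parsed = splitModelRef(payloadModel);
  if (parsed && allowlist.indexOf(payloadModel) !== -1) {
    return { provider: parsed.provider, model: parsed.model };
  }

  // Explicit request that cannot be honoured: name both sides of the fallback.
  const fallbackRef = fallback.provider + "/" + fallback.model;
  const warning =
    'payload.model "' + payloadModel + '" is not allowed; falling back to ' + fallbackRef;
  log.error(warning + '. Add "' + payloadModel + '" to the model allowlist or remove the override.');
  return { provider: fallback.provider, model: fallback.model, warning };
}
```

The design point is that the warning names the fallback target, so a caller reading only the returned value can still see which provider will be billed.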

AI Usage Disclosure: None — human-authored.

Changed files

  • .agents/maintainers.md (added, +1/-0)
  • .agents/skills/openclaw-ghsa-maintainer/SKILL.md (added, +87/-0)
  • .agents/skills/openclaw-parallels-smoke/SKILL.md (added, +150/-0)
  • .agents/skills/openclaw-pr-maintainer/SKILL.md (added, +75/-0)
  • .agents/skills/openclaw-qa-testing/SKILL.md (added, +148/-0)
  • .agents/skills/openclaw-qa-testing/agents/openai.yaml (added, +4/-0)
  • .agents/skills/openclaw-release-maintainer/SKILL.md (added, +303/-0)
  • .agents/skills/openclaw-secret-scanning-maintainer/SKILL.md (added, +220/-0)
  • .agents/skills/openclaw-secret-scanning-maintainer/scripts/secret-scanning.mjs (added, +790/-0)
  • .agents/skills/openclaw-test-heap-leaks/SKILL.md (added, +75/-0)
  • .agents/skills/openclaw-test-heap-leaks/agents/openai.yaml (added, +4/-0)
  • .agents/skills/openclaw-test-heap-leaks/scripts/heapsnapshot-delta.mjs (added, +553/-0)
  • .agents/skills/parallels-discord-roundtrip/SKILL.md (added, +62/-0)
  • .agents/skills/security-triage/SKILL.md (added, +111/-0)
  • .codex (added, +0/-0)
  • .detect-secrets.cfg (added, +45/-0)
  • .dockerignore (added, +72/-0)
  • .env.example (added, +83/-0)
  • .gitattributes (added, +3/-0)
  • .github/CODEOWNERS (added, +54/-0)
  • .github/ISSUE_TEMPLATE/bug_report.yml (added, +148/-0)
  • .github/ISSUE_TEMPLATE/config.yml (added, +8/-0)
  • .github/ISSUE_TEMPLATE/feature_request.yml (added, +70/-0)
  • .github/actionlint.yaml (added, +24/-0)
  • .github/actions/detect-docs-changes/action.yml (added, +53/-0)
  • .github/actions/ensure-base-commit/action.yml (added, +61/-0)
  • .github/actions/setup-node-env/action.yml (added, +99/-0)
  • .github/actions/setup-pnpm-store-cache/action.yml (added, +76/-0)
  • .github/codeql/codeql-javascript-typescript.yml (added, +18/-0)
  • .github/dependabot.yml (added, +127/-0)
  • .github/instructions/copilot.instructions.md (added, +64/-0)
  • .github/labeler.yml (added, +371/-0)
  • .github/pr-assets/compaction-checkpoints/sessions-checkpoints-inline.png (added, +0/-0)
  • .github/pr-assets/compaction-checkpoints/sessions-overview-inline.png (added, +0/-0)
  • .github/pull_request_template.md (added, +147/-0)
  • .github/workflows/auto-response.yml (added, +534/-0)
  • .github/workflows/ci.yml (added, +1405/-0)
  • .github/workflows/codeql.yml (added, +137/-0)
  • .github/workflows/control-ui-locale-refresh.yml (added, +172/-0)
  • .github/workflows/docker-release.yml (added, +389/-0)
  • .github/workflows/docs-sync-publish.yml (added, +70/-0)
  • .github/workflows/docs-translate-trigger-release.yml (added, +42/-0)
  • .github/workflows/install-smoke.yml (added, +216/-0)
  • .github/workflows/labeler.yml (added, +877/-0)
  • .github/workflows/macos-release.yml (added, +93/-0)
  • .github/workflows/openclaw-cross-os-release-checks-reusable.yml (added, +320/-0)
  • .github/workflows/openclaw-npm-release.yml (added, +406/-0)
  • .github/workflows/openclaw-release-checks.yml (added, +120/-0)
  • .github/workflows/parity-gate.yml (added, +93/-0)
  • .github/workflows/plugin-clawhub-release.yml (added, +276/-0)
  • .github/workflows/plugin-npm-release.yml (added, +217/-0)
  • .github/workflows/sandbox-common-smoke.yml (added, +64/-0)
  • .github/workflows/stale.yml (added, +217/-0)
  • .github/workflows/workflow-sanity.yml (added, +98/-0)
  • .gitignore (added, +152/-0)
  • .jscpd.json (added, +16/-0)
  • .mailmap (added, +13/-0)
  • .markdownlint-cli2.jsonc (added, +55/-0)
  • .npmignore (added, +3/-0)
  • .npmrc (added, +4/-0)
  • .oxfmtrc.jsonc (added, +27/-0)
  • .oxlintrc.json (added, +67/-0)
  • .pi/extensions/diff.ts (added, +117/-0)
  • .pi/extensions/files.ts (added, +134/-0)
  • .pi/extensions/prompt-url-widget.ts (added, +190/-0)
  • .pi/extensions/redraws.ts (added, +26/-0)
  • .pi/extensions/ui/paged-select.ts (added, +82/-0)
  • .pi/git/.gitignore (added, +2/-0)
  • .pi/prompts/cl.md (added, +58/-0)
  • .pi/prompts/is.md (added, +22/-0)
  • .pi/prompts/landpr.md (added, +73/-0)
  • .pi/prompts/reviewpr.md (added, +134/-0)
  • .pre-commit-config.yaml (added, +157/-0)
  • .prettierignore (added, +1/-0)
  • .secrets.baseline (added, +13017/-0)
  • .shellcheckrc (added, +25/-0)
  • .swiftformat (added, +51/-0)
  • .swiftlint.yml (added, +150/-0)
  • .vscode/extensions.json (added, +3/-0)
  • .vscode/settings.json (added, +21/-0)
  • AGENTS.md (added, +318/-0)
  • CHANGELOG.md (added, +6277/-0)
  • CLAUDE.md (added, +1/-0)
  • CONTRIBUTING.md (added, +229/-0)
  • Dockerfile (added, +282/-0)
  • Dockerfile.sandbox (added, +24/-0)
  • Dockerfile.sandbox-browser (added, +36/-0)
  • Dockerfile.sandbox-common (added, +48/-0)
  • INCIDENT_RESPONSE.md (added, +52/-0)
  • LICENSE (added, +21/-0)
  • Makefile (added, +4/-0)
  • README.md (added, +534/-0)
  • SECURITY.md (added, +325/-0)
  • Swabble/.github/workflows/ci.yml (added, +54/-0)
  • Swabble/.gitignore (added, +33/-0)
  • Swabble/.swiftformat (added, +8/-0)
  • Swabble/.swiftlint.yml (added, +43/-0)
  • Swabble/CHANGELOG.md (added, +11/-0)
  • Swabble/LICENSE (added, +21/-0)
  • Swabble/Package.resolved (added, +69/-0)

PR #68277: docs(troubleshooting): document isolated-cron payload.model silent-fallback workaround

Description (problem / solution / changelog)

Summary

Adds a troubleshooting section for users hitting the silent-fallback bug cluster (#59257, #58575, #49168) where an isolated cron job with payload.model: "ollama/..." completes with lastStatus: "ok" but the actual inference ran on the configured agent default, silently billing the wrong provider.

The root-cause fixes are currently split across #57094 (caller-supplied defaults seam) and #58992 (stale runtime-model-state seam), both open. Until they land, affected users need a working configuration pattern — this doc section documents it so people can find it during triage rather than having to dig through the long issue comment thread.

What's included

  • Symptoms and detection steps (log grep for embedded run start: runId=... provider=... model=..., compare against payload.model)
  • Working workaround pattern: per-agent profile in agents.list with fallbacks: [], pointing the affected cron's agentId at the worker profile
  • Auth profile + allowlist setup steps
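
The detection step can be approximated with a small log check. This sketch assumes the log line carries whitespace-free runId, provider, and model values in the order quoted above, and that payload.model is written as provider/model. The exact log format is not shown in this report, and parseRunStart and ranOnRequestedModel are hypothetical helpers.

```typescript
interface RunStart {
  runId: string;
  provider: string;
  model: string;
}

// Extracts the fields from an "embedded run start" log line, if present.
function parseRunStart(line: string): RunStart | undefined {
  const match = /embedded run start: runId=(\S+) provider=(\S+) model=(\S+)/.exec(line);
  if (!match) return undefined;
  const runId = match[1];
  const provider = match[2];
  const model = match[3];
  if (!runId || !provider || !model) return undefined;
  return { runId, provider, model };
}

// True when the run started on the model the cron payload asked for.
// Assumes payload.model has the form "provider/model".
function ranOnRequestedModel(run: RunStart, payloadModel: string): boolean {
  return run.provider + "/" + run.model === payloadModel;
}
```

Feeding each matching log line through parseRunStart and comparing the result with the job's payload.model flags runs that completed with lastStatus: "ok" on a different provider.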

Provenance

The workaround pattern was originally posted by @mschultz77-gravelroad in https://github.com/openclaw/openclaw/issues/59257#issuecomment-4196017313. This PR surfaces it into the canonical troubleshooting doc location.

I've independently verified the pattern on a local install at v2026.4.15: 5 affected cron jobs routed cleanly to Ollama after applying the fix (real 258s+ local inference durations, correct sessionKey=agent:worker-qwen:cron:... in logs).

Cleanup plan

Once #57094 + #58992 both merge, this section should be updated (or removed in favor of a simpler note) since the workaround won't be needed anymore.

Test plan

  • Mintlify link rules: all internal links root-relative, no .md suffix
  • Docs content rules: uses placeholder paths (~/.openclaw/...), no personal hostnames
  • Section styling matches existing troubleshooting.md sections (symptoms / detection / fix / related)

🤖 Generated with Claude Code

Changed files

  • docs/gateway/troubleshooting.md (modified, +81/-0)
  • src/cron/isolated-agent/session.ts (modified, +11/-0)

Bug type

Regression (worked before, now fails)

Beta release blocker

No

Summary

When a cron job specifies a model override (e.g. "model": "ollama/nemotron-3-super"), the gateway returns modelApplied: true but silently runs the session on the default cloud model instead.

Steps to reproduce

  1. Configure the Ollama provider in openclaw.json with a baseUrl, an API key, and a model definition
  2. Create an isolated cron job with "model": "ollama/nemotron-3-super" in payload
  3. Let the cron fire
  4. Check the actual model used in session logs

Expected behavior

  • If the requested model cannot be used, modelApplied should be false
  • Gateway should log a warning when falling back from a requested model
  • Ideally: fail the session rather than silently billing a different provider
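
The expected behavior can be expressed as a small sketch that derives modelApplied from what actually ran instead of from what was requested. describeOutcome, RunOutcome, and the strict flag are hypothetical and do not describe how the gateway builds its response today.

```typescript
interface RunOutcome {
  modelApplied: boolean;
  actualModel: string;
  warning?: string;
}

function describeOutcome(
  requested: string | undefined,
  actual: string,
  strict: boolean,
): RunOutcome {
  // No override requested: nothing to apply, nothing to warn about.
  if (requested === undefined) {
    return { modelApplied: false, actualModel: actual };
  }

  // Override requested and honoured.
  if (requested === actual) {
    return { modelApplied: true, actualModel: actual };
  }

  // Override requested but a different model ran.
  const warning = "requested model " + requested + " was not used; session ran on " + actual;
  if (strict) {
    // The "ideally" case: fail rather than bill another provider.
    throw new Error(warning);
  }
  return { modelApplied: false, actualModel: actual, warning };
}
```

In strict mode the mismatch becomes a failure, which matches the third bullet. In lenient mode the caller still gets modelApplied: false plus a warning naming the model that ran.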

Actual behavior

  • Cron fires, isolated session starts
  • API response includes modelApplied: true
  • Actual inference runs on the default cloud model (claude-sonnet-4-6), NOT the requested local model
  • Verified across 19 consecutive cron runs dating back to March 20 — every single one ran on Sonnet despite requesting Nemotron
  • No error, no warning, no log entry indicating the fallback

OpenClaw version

  • OpenClaw 2026.3.13

Operating system

  • macOS (arm64)

Install method

npm global

Model

  • Ollama provider configured with local Nemotron 3 Super (120B MoE)
  • Ollama running and responsive (ollama run nemotron-3-super works in terminal)

Provider / routing chain

openclaw

Additional provider/model setup details

  • #5769 (Ollama streaming drops tool calls)
  • #13159 (model override in isolated sessions)
  • #59224 (exec-approvals not honored in isolated cron sessions)

Logs, screenshots, and evidence

Impact and severity

No response

Additional information

Update: Gateway ignores empty fallbacks array — falls back to Haiku even when removed from config

Additional reproduction (April 1, 2026)

To isolate whether the local model was actually running or silently falling back, I performed a controlled test:

  1. Set ollama/nemotron-3-super as the only model in Nate's dedicated gateway config (~/Nate/configs/nate.json)
  2. Set "fallbacks": [] — empty array, no fallback models
  3. Removed anthropic/claude-haiku-4-5 from the models block entirely
  4. Restarted the gateway via launchctl kickstart -k
  5. Verified gateway was up (HTTP 200 on port 19450)
  6. Sent a simple tool-calling task: "List the files in ~/Projects/mission-control/app/"

Result

The gateway ignored the empty fallbacks array and fell back to anthropic/claude-haiku-4-5 anyway. The UI displayed an orange banner: "Fallback active: anthropic/claude-haiku-4-5"

The task executed successfully on Haiku (tool calls worked, exec returned real output). This confirms:

  • The gateway has a hardcoded fallback behavior that overrides the config
  • Setting "fallbacks": [] does not prevent cloud model fallback
  • Removing the fallback model from the models block does not prevent it either
  • There is no config combination that forces the gateway to use only the specified local model
  • modelApplied: true continues to be unreliable — the user has no way to verify which model actually ran without checking the UI fallback banner (which isn't available in cron/isolated sessions)
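
The report does not show why the empty array is ignored. One common way this class of bug arises, offered purely as a hypothesis and not as a confirmed description of the gateway code, is a check that treats an empty list the same as a missing one. BUILT_IN_FALLBACKS, fallbacksLoose, and fallbacksStrict are hypothetical names for this illustration.

```typescript
// Hypothetical built-in list, for illustration only.
const BUILT_IN_FALLBACKS: string[] = ["anthropic/claude-haiku-4-5"];

// Loose check: [] is treated like "not configured", so built-ins come back.
function fallbacksLoose(configured?: string[]): string[] {
  return configured && configured.length > 0 ? configured : BUILT_IN_FALLBACKS;
}

// Strict check: only a missing setting selects the built-ins;
// an explicit [] means "no fallbacks at all".
function fallbacksStrict(configured?: string[]): string[] {
  return configured === undefined ? BUILT_IN_FALLBACKS : configured;
}
```

Under the loose variant, a config with "fallbacks": [] behaves exactly like one that omits the key, which would be consistent with the behavior observed in this test. Only the strict variant lets a user opt out of cloud fallback.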

Impact (compounded)

This means every isolated cron session specifying a local model has been silently billed to Anthropic since setup. The user cannot prevent this through configuration alone. The gateway will always find a cloud model to fall back to, even when explicitly told not to.

[Screenshot of orange "Fallback active: anthropic/claude-haiku-4-5" banner attached]

Screenshots: https://github.com/user-attachments/assets/f6856e42-4379-481b-97b4-88df48ac4f38 and https://github.com/user-attachments/assets/d861c0b9-e058-4b79-a3e4-3c0c6604dbf7

extent analysis

TL;DR

The gateway appears to apply a hardcoded fallback: it overrides the configured model and runs a default cloud model instead. Setting an empty fallbacks array, or removing the fallback model from the config, does not prevent this.

Guidance

  • Confirm that the fallbacks array in the gateway config is an empty array ([]). The reporter's controlled test shows this alone does not stop the fallback, so treat it as a precondition to rule out rather than a fix.
  • Check the gateway logs for any warnings or errors related to model fallback to understand why the configured model is not being used.
  • Consider setting up a custom logging mechanism to track which model is actually being used for each session, as the modelApplied flag is unreliable.
  • Review the OpenClaw documentation and configuration options to see if there are any other settings that can be used to prevent the gateway from falling back to a default cloud model.

Example

No code example is provided as the issue is related to configuration and gateway behavior rather than code.

Notes

The issue seems to be related to a limitation in the OpenClaw gateway's configuration options, and it may not be possible to completely prevent the fallback behavior without modifying the gateway's code or waiting for an update that addresses this issue.

Recommendation

Apply a workaround by setting up a custom logging mechanism to track which model is actually being used for each session, and consider reaching out to the OpenClaw support team to report the issue and request a fix.
