openclaw - Cron model preflight skips entire run when local primary is unreachable, ignoring configured cloud fallbacks [AI] [1 pull request]

Fix Action

Fixed

Summary

When a cron job uses an agent whose model.primary is a local provider (e.g. ollama/gemma4:26b-nvfp4) and model.fallbacks lists a cloud provider (e.g. openrouter/nvidia/nemotron-3-super-120b-a12b:free), and the local provider endpoint is temporarily unreachable, the cron run is silently skipped with status: skipped instead of falling back to the cloud provider that is healthy.

The skip happens at preflight time inside the cron isolated-agent runner, before any model invocation. The fallback chain configured on the agent is never consulted.

This is distinct from the post-invocation fallback failures discussed in #44353 (provider-level errors) and #74985 (embedded agent timeout): here, the failure happens earlier — the preflight short-circuits the run.

Net effect for operators relying on local→cloud failover: when Ollama hiccups (busy, paused for upgrade, momentary network blip on 127.0.0.1), entire scheduled cron runs disappear with no retry until the next scheduled tick. With 6 scheduled runs per day, a transient 5-minute Ollama outage can silently cost 1 entire run, and the operator pays for it the next morning when a watchdog finally fires.

Real behavior proof

A. Source-level proof

The preflight is implemented in src/cron/isolated-agent/model-preflight.runtime.ts (compiled at dist/model-preflight.runtime-D3BkBmU5.js):

// preflightCronModelProvider — params.provider/model = the *resolved primary*.
async function preflightCronModelProvider(params) {
    const providerConfig = resolveProviderConfig(params.cfg, params.provider);
    if (!providerConfig) return { status: "available" };
    const baseUrl = normalizeBaseUrl(providerConfig.baseUrl);
    const api = normalizeProbeApi(providerConfig);
    if (!baseUrl || !api || !isLocalProviderBaseUrl(baseUrl)) return { status: "available" };
    // ...probes baseUrl with 2.5s timeout, returns "unavailable" on failure...
}

The function only ever consults cfg.models.providers[params.provider].baseUrl. It never reads cfg.agents.list[*].model.fallbacks or cfg.agents.defaults.model.fallbacks.
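To make the scope of that probe concrete, here is a minimal sketch of a local-only reachability check. The helper names isLoopbackBaseUrl and probeBaseUrl are hypothetical and this is not OpenClaw's code: the real isLocalProviderBaseUrl may recognize more hosts than loopback, and only the 2.5 s timeout is taken from the source comment above. Treating any HTTP response as "reachable" is also an assumption.

```javascript
// Hypothetical sketch, not OpenClaw's implementation.
// Treats only loopback hosts as "local"; the real check may be broader.
function isLoopbackBaseUrl(baseUrl) {
    let host;
    try {
        host = new URL(baseUrl).hostname;
    } catch {
        return false; // unparsable URL: do not treat it as local
    }
    return host === "localhost" || host === "[::1]" || /^127\.\d+\.\d+\.\d+$/.test(host);
}

// Probes a base URL with a timeout. Any response counts as reachable (assumption);
// a network error or timeout yields "unavailable".
// fetchImpl is injectable so the sketch can be exercised without a network.
async function probeBaseUrl(baseUrl, { timeoutMs = 2500, fetchImpl = fetch } = {}) {
    try {
        await fetchImpl(baseUrl, { signal: AbortSignal.timeout(timeoutMs) });
        return { status: "available" };
    } catch (err) {
        return { status: "unavailable", reason: `${err.name}: ${err.message}` };
    }
}
```

Nothing in this shape knows about the agent: the only inputs are a URL and a timeout, which is why a failed probe cannot lead to a fallback unless the caller supplies the chain.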

The caller in src/cron/isolated-agent/run.ts (compiled at dist/isolated-agent-DPJcOmiU.js:485-502) consumes only the status and short-circuits:

const preflight = await (await loadCronModelPreflightRuntime()).preflightCronModelProvider({
    cfg: cfgWithAgentDefaults,
    provider,    // resolved primary only
    model,
});
if (preflight.status === "unavailable") {
    logWarn(`[cron:${input.job.id}] ${preflight.reason}`);
    return {
        ok: false,
        result: withRunSession({
            status: "skipped",
            error: preflight.reason,
            diagnostics: createCronRunDiagnosticsFromError("model-preflight", preflight.reason, { severity: "warn" }),
            provider,
            model,
        })
    };
}

The return happens before any fallback resolver runs. The skip is final for this scheduled tick.

B. Standalone repro (no infra needed)

Save as repro.mjs and run with node repro.mjs:

import { preflightCronModelProvider } from "/opt/homebrew/lib/node_modules/openclaw/dist/model-preflight.runtime.js";

// Unused port: the probe fails (fetch failed), so preflight returns status "unavailable"
const cfg = {
    models: {
        providers: {
            ollama:     { api: "ollama", baseUrl: "http://127.0.0.1:11999" },
            openrouter: { api: "openai-completions", baseUrl: "https://openrouter.ai/api/v1" },
        },
    },
    agents: {
        list: [{
            id: "bourse",
            model: {
                primary: "ollama/gemma4:26b-nvfp4",
                fallbacks: ["openrouter/nvidia/nemotron-3-super-120b-a12b:free"],
            },
        }],
    },
};

const r = await preflightCronModelProvider({
    cfg, provider: "ollama", model: "gemma4:26b-nvfp4",
});
console.log(r);

Output (verbatim):

{
  status: 'unavailable',
  provider: 'ollama',
  model: 'gemma4:26b-nvfp4',
  baseUrl: 'http://127.0.0.1:11999',
  retryAfterMs: 300000,
  reason: 'Agent cron job uses ollama/gemma4:26b-nvfp4 but the local provider endpoint is not reachable at http://127.0.0.1:11999. Skipping this cron run; OpenClaw will retry the provider preflight on a later scheduled run. Last error: TypeError: fetch failed'
}

cfg.agents.list[0].model.fallbacks is populated and points to a healthy cloud provider, but the preflight never reads it.

C. Production trace (real cron run, redacted IDs)

Cron marche-preopen-eu (agent bourse), agent config at the time:

{
  "id": "bourse",
  "model": {
    "primary":  "ollama/gemma4:26b-nvfp4",
    "fallbacks": ["openrouter/nvidia/nemotron-3-super-120b-a12b:free"]
  }
}

Run history entry (Ollama was briefly busy at 07:45 because a concurrent cron job was consuming RAM):

{
  "ts": 1778132782311,
  "action": "finished",
  "status": "skipped",
  "error": "Agent cron job uses ollama/gemma4:26b-nvfp4 but the local provider endpoint is not reachable at http://127.0.0.1:11434. Skipping this cron run; OpenClaw will retry the provider preflight on a later scheduled run. Last error: TimeoutError: request timed out",
  "diagnostics": {
    "summary": "Agent cron job uses ollama/gemma4:26b-nvfp4 but the local provider endpoint is not reachable at http://127.0.0.1:11434. Skipping this cron run; OpenClaw will retry the provider preflight on a later scheduled run. Last error: TimeoutError: request timed out",
    "entries": [{
      "source": "model-preflight",
      "severity": "warn",
      "message": "Agent cron job uses ollama/gemma4:26b-nvfp4 but the local provider endpoint is not reachable at http://127.0.0.1:11434. Skipping this cron run; OpenClaw will retry the provider preflight on a later scheduled run. Last error: TimeoutError: request timed out"
    }]
  },
  "model": "gemma4:26b-nvfp4",
  "provider": "ollama"
}

Three minutes later, Ollama responded normally; OpenRouter Nemotron was healthy throughout. The configured fallback would have run the cron successfully.

Verification

Reproducing the bug (no Ollama interference required)

  1. Confirm OpenClaw version: openclaw --version (tested on 2026.5.4).
  2. Save the standalone repro above as repro.mjs.
  3. Run: node repro.mjs.
  4. Observe status: "unavailable". The agents.list[*].model.fallbacks entry in cfg is never consulted.

End-to-end live verification (optional, requires controlled outage)

  1. Configure an agent with model.primary: "ollama/<model>" and model.fallbacks: ["<healthy cloud provider>/<model>"].
  2. Schedule a one-shot cron: openclaw cron add --agent <agent> --at 1m --message "..." --tools exec.
  3. Briefly stop Ollama (launchctl kill TERM gui/$UID/com.ollama on macOS, or systemctl stop ollama on Linux) ≈ 30s before the cron fires.
  4. Restart Ollama after the cron has fired.
  5. Inspect run with openclaw cron runs --id <id>.

Current behavior: status: "skipped", diagnostic source model-preflight, no fallback attempted.

Desired behavior: status: "ok" with the fallback model used, or at minimum status: "skipped" only after the fallback chain has been exhausted.
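When checking run history by hand gets tedious, a small filter can flag the runs that the preflight skipped. This is only a sketch: it assumes run entries have the shape shown in the production trace above (status, diagnostics.entries[].source), the name findPreflightSkips is hypothetical, and it is not part of the OpenClaw CLI. The second sample entry is invented for illustration.

```javascript
// Hypothetical helper: returns the run entries that were skipped by the model preflight.
function findPreflightSkips(runs) {
    return runs.filter((run) =>
        run.status === "skipped" &&
        (run.diagnostics?.entries ?? []).some((entry) => entry.source === "model-preflight"));
}

// Sample input mirroring the trace above (message text elided, second entry illustrative).
const runs = [
    {
        ts: 1778132782311, status: "skipped", provider: "ollama", model: "gemma4:26b-nvfp4",
        diagnostics: { entries: [{ source: "model-preflight", severity: "warn", message: "..." }] },
    },
    { ts: 1778132962311, status: "ok", provider: "ollama", model: "gemma4:26b-nvfp4" },
];

for (const run of findPreflightSkips(runs)) {
    console.log(`${new Date(run.ts).toISOString()} skipped by preflight: ${run.provider}/${run.model}`);
}
```

Filtering on the diagnostic source rather than on the error text keeps the check stable if the wording of the reason changes.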

Suggested fix (sketch — feedback welcome)

Two non-exclusive options:

  1. Defer preflight until after fallback resolution. Extend preflightCronModelProvider to receive the full fallback chain and walk it in order, returning available as soon as one candidate's local probe succeeds (or as soon as a cloud candidate is hit, since cloud preflight is currently a no-op). This keeps the existing semantics of "we only probe local providers".

  2. On unavailable, attempt fallback before returning skipped. In cron/isolated-agent/run.ts, when preflight is unavailable, look up the agent's model.fallbacks and rotate to the next candidate (re-running preflight for it if local). Only emit skipped when no candidate passes preflight.

Option 1 is preferred — it keeps the failure path centralized in one runtime and avoids racing with the in-flight fallback resolver used during agent invocation.
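Option 1 could look roughly like the sketch below. It is an illustration under stated assumptions, not a patch: preflightModelChain and splitModelRef are hypothetical names, preflightOne stands in for the existing single-provider preflight (assumed to resolve to { status: "available" } or { status: "unavailable", reason }), and splitting a model reference at the first "/" is inferred from the repro, where "ollama/gemma4:26b-nvfp4" is passed as provider "ollama" and model "gemma4:26b-nvfp4".

```javascript
// Hypothetical: split "provider/model" at the first "/" (assumed convention).
function splitModelRef(ref) {
    const slash = ref.indexOf("/");
    if (slash < 0) throw new Error(`Malformed model reference: ${ref}`);
    return { provider: ref.slice(0, slash), model: ref.slice(slash + 1) };
}

// Hypothetical sketch of Option 1: walk primary + fallbacks in order and
// stop at the first candidate whose preflight reports "available".
async function preflightModelChain({ cfg, primary, fallbacks = [] }, preflightOne) {
    const reasons = [];
    for (const ref of [primary, ...fallbacks]) {
        const { provider, model } = splitModelRef(ref);
        const result = await preflightOne({ cfg, provider, model });
        if (result.status === "available") {
            // Cloud candidates land here immediately if their preflight is a no-op.
            return { status: "available", provider, model };
        }
        reasons.push(result.reason);
    }
    // Every candidate failed its probe: only now is a skip justified.
    return { status: "unavailable", reason: reasons.join("; ") };
}
```

The caller would then start the run with the returned provider and model instead of the resolved primary, and emit skipped only when the whole chain comes back unavailable.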

Related

  • #44353 — Fallback models not triggered on provider-level errors. Different code path: that issue is about runtime fallback after invocation; this issue is about preflight skip before any invocation. Fixing this issue would also help #44353-style cases when the failure is detectable as "provider unreachable" rather than "provider returned bad response".
  • #74985 — Embedded agent Kimi timeout with no fallback. Different code path: that one is in pi-embedded-runner; this one is in cron isolated-agent.
  • #63229 — Gateway falsely marks healthy local vLLM endpoints as timed out. Related but distinct: that one concerns false positives from the timeout heuristic; this one concerns legitimate unavailable results that should not skip the run when fallbacks exist.

Environment

  • OpenClaw 2026.5.4 (commit 325df3e)
  • macOS Darwin arm64 (Apple Silicon M4 Pro)
  • Node.js 25.x (npm global install)
  • Affected runtime: dist/model-preflight.runtime-D3BkBmU5.js + dist/isolated-agent-DPJcOmiU.js
  • Tested with: provider ollama, baseUrl http://127.0.0.1:11999 (unused port, guarantees fetch failed)
