openclaw: Fallback chain becomes empty during deferred config reload, causing session interruption on 429



Title

Fallback chain becomes empty during deferred config reload, causing session interruption on 429

Describe the bug

When a config change is detected during an embedded run, the restart is deferred until the run completes. However, during this deferred period, the model fallback chain becomes empty (next=none), so when a 429 rate limit error occurs, no fallback models are attempted and the session is interrupted.

To Reproduce

  1. Have an embedded run in progress (e.g., agent executing exec commands)
  2. Trigger a config change (e.g., via config.patch or agent exec modifying config)
  3. Restart gets deferred because embedded run is running
  4. While deferred, primary model hits 429 (rate limit / quota exceeded)
  5. Fallback decision returns next=none — no fallback models are attempted
  6. FailoverError is returned to user, session interrupted

Expected behavior

The fallback chain (minimax/MiniMax-M2.7 → bailian/qwen3.6-plus → deepseek/deepseek-v4-pro) should remain intact during a deferred restart, so that when the primary model fails with 429, fallbacks are properly tried.
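To make the expected behavior concrete, here is a minimal sketch of a fallback decision over an ordered chain. The function name `nextCandidate` and the array representation are assumptions for illustration only; they are not taken from OpenClaw's source. The chain order follows the sentence above.

```typescript
// Hypothetical sketch: not OpenClaw's actual implementation.
// Returns the model to try after `failed`, or "none" when nothing is left.
function nextCandidate(chain: readonly string[], failed: string): string {
  const index = chain.indexOf(failed);
  // A candidate that is not in the chain has no successor.
  if (index === -1) {
    return "none";
  }
  // The last entry in the chain also has no successor.
  return chain[index + 1] ?? "none";
}

const intactChain = [
  "minimax/MiniMax-M2.7",
  "bailian/qwen3.6-plus",
  "deepseek/deepseek-v4-pro",
];

// With the chain intact, a 429 on bailian/qwen3.6-plus moves on to deepseek.
const expectedNext = nextCandidate(intactChain, "bailian/qwen3.6-plus");

// With an empty chain (the state the logs suggest), the same failure yields "none".
const observedNext = nextCandidate([], "bailian/qwen3.6-plus");

console.log({ expectedNext, observedNext });
```

Under these assumptions, `next=none` for a candidate in the middle of the chain can only appear if the chain seen by the decision logic has lost entries.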

Actual logs (from ~/.openclaw/logs/)

Normal fallback (same day, 08:01):

[08:01:02] candidate=minimax/MiniMax-M2.7 reason=timeout next=bailian/qwen3.6-plus  ✓
[08:01:45] candidate=bailian/qwen3.6-plus reason=unknown next=none → succeeded         ✓

Broken fallback (22:58, during deferred restart):

[22:58:23.136] model-fallback/decision:
  decision=candidate_failed
  requested=bailian/qwen3.6-plus
  candidate=bailian/qwen3.6-plus
  reason=rate_limit
  next=none                    ← fallback chain is empty
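When scanning logs for this failure, a small parser can flag decision entries where a candidate failed and no successor was offered. The sketch below assumes the entry format shown above (indented `key=value` lines); the helper names `parseDecision` and `isDeadEnd` are hypothetical and not part of OpenClaw.

```typescript
// Hypothetical helper: parses the key=value lines of a model-fallback/decision entry.
function parseDecision(block: string): Record<string, string> {
  const fields: Record<string, string> = {};
  for (const line of block.split("\n")) {
    // Capture "key=value" and ignore anything after the value, such as annotations.
    const match = line.trim().match(/^([a-z_]+)=(\S+)/);
    if (match) {
      fields[match[1]] = match[2];
    }
  }
  return fields;
}

// True when a candidate failed and no further model was offered.
function isDeadEnd(fields: Record<string, string>): boolean {
  return fields["decision"] === "candidate_failed" && fields["next"] === "none";
}

const sample = [
  "[22:58:23.136] model-fallback/decision:",
  "  decision=candidate_failed",
  "  requested=bailian/qwen3.6-plus",
  "  candidate=bailian/qwen3.6-plus",
  "  reason=rate_limit",
  "  next=none",
].join("\n");

const parsed = parseDecision(sample);
console.log(parsed["reason"], isDeadEnd(parsed));
```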

Timeline

  • 22:52:24 — Gateway restart completed, agent model set to minimax/MiniMax-M2.7
  • 22:52:54 — Feishu message arrived, dispatched to agent, embedded run started
  • 22:54:35 — Config change detected (acp), restart deferred (embedded run in progress)
  • 22:55:22 — Liveness warning: model_call in progress
  • 22:58:01 — First 429 from bailian/qwen3.6-plus
  • 22:58:23 — Fallback decision: next=none, FailoverError returned
  • 22:59:36 — SIGUSR1 forced restart

Environment

  • OpenClaw version: 2026.5.7
  • Node.js: v22.22.2
  • OS: macOS Darwin 25.4.0 (arm64)
  • Agent config: "model": { "primary": "bailian/qwen3.6-plus", "fallbacks": ["minimax/MiniMax-M2.7", "deepseek/deepseek-v4-pro"] }

Root cause hypothesis

The deferred config reload likely corrupts or clears the embedded run's internal model configuration snapshot. The fallback chain was correctly configured and working earlier (08:01), but during the deferred restart window, next=none suggests the fallback list became inaccessible.
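If the hypothesis holds, one possible mitigation is for each embedded run to take an immutable copy of its model chain at start, so a later reload cannot clear it. The sketch below is illustrative only: `ModelConfig` and `EmbeddedRun` are hypothetical names, and OpenClaw's real config handling may be structured differently. The config values are taken from the Environment section.

```typescript
// Hypothetical types and names; OpenClaw's real config handling may differ.
interface ModelConfig {
  primary: string;
  fallbacks: string[];
}

class EmbeddedRun {
  // Frozen copy taken at run start, so later config mutations cannot reach it.
  private readonly chain: readonly string[];

  constructor(config: ModelConfig) {
    this.chain = Object.freeze([config.primary, ...config.fallbacks]);
  }

  fallbackChain(): readonly string[] {
    return this.chain;
  }
}

const liveConfig: ModelConfig = {
  primary: "bailian/qwen3.6-plus",
  fallbacks: ["minimax/MiniMax-M2.7", "deepseek/deepseek-v4-pro"],
};

const run = new EmbeddedRun(liveConfig);

// Simulate a deferred reload clearing the live config while the run is in progress.
liveConfig.fallbacks.length = 0;

console.log(run.fallbackChain());
```

The spread into a new array is what isolates the run from the live config; `Object.freeze` additionally guards the copy against accidental mutation inside the run.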

Impact

When quota is exhausted on the primary model, the assistant session is completely interrupted instead of falling back to alternative models. This breaks continuity for any in-progress task.
