openclaw - ✅(Solved) Fix Feature: Degraded mode for model fallback (prevent death spiral when cloud model is throttled) [1 pull request, 1 participant]

openclaw · 2026-04-04 16:26:29

GitHub stats

openclaw/openclaw#60947 • Fetched 2026-04-08 02:45:15

Author

andyk-ms

Participants

andyk-ms

Timeline (top)

cross-referenced ×1

When the primary cloud model is throttled or unavailable and OpenClaw falls back to a smaller local model, the fallback model receives the same full context (often 100K+ tokens) that was designed for a large cloud model. This creates a death spiral:

1. Cloud model returns 429/overloaded → fallback to small local model
2. Local model (e.g. 7B/3B with 8-32K context) receives 100K+ token session
3. Local model generates garbage or fails silently
4. Garbage triggers compaction attempt
5. Compaction model (also small) can't summarize the massive context → fails
6. Session grows larger → next attempt is even worse
7. Repeat until session file is corrupted or gateway crashes
8. User must manually delete session file to recover
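
The mismatch in step 2 can be caught before the fallback call is made. The following is a minimal sketch, not OpenClaw code: estimateTokens, checkFit, and the 4-characters-per-token heuristic are all assumptions introduced for illustration.

```typescript
// Hypothetical helpers; none of these names are OpenClaw APIs.

// Rough token estimate: about 4 characters per token for English text.
// This is a heuristic assumption, not a real tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

interface FitCheck {
  fits: boolean;
  overflowTokens: number; // tokens that must be dropped before the call
}

// Decide whether a session can be sent to a model with the given context
// window, keeping `reserveForOutput` tokens free for the reply.
function checkFit(
  sessionTokens: number,
  contextWindow: number,
  reserveForOutput: number,
): FitCheck {
  const budget = Math.max(0, contextWindow - reserveForOutput);
  const overflowTokens = Math.max(0, sessionTokens - budget);
  return { fits: overflowTokens === 0, overflowTokens };
}
```

With the numbers from this issue, a 100K-token session against an 8K window overflows by more than 90K tokens, so the request would be rejected or reduced before it reaches the local model.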

PR fix notes

PR #60990: fix(macos): prevent menubar icon from freezing when AI providers overload

Repository: openclaw/openclaw
Author: xantorres
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/60990

Description (problem / solution / changelog)

Prevents menubar icon from freezing when AI providers return overload/timeout errors and the gateway never sends run-end events.

IMPORTANT: This only affects menubar display state. Background jobs continue running normally.

Implementation

Fast watchdog timer: runs every 30 seconds to update menubar display
3-minute eviction threshold: clears stale UI state to prevent frozen menubar
Minimum age protection (10 seconds): prevents premature eviction of recent activities
Job refresh on activity: streaming updates and tool starts refresh display timestamps
Role recomputation fix: streaming jobs now properly update role when main session changes
MainActor race condition fixes: proper async/await patterns and error handling
Immediate cleanup method: clearAllActivities() for gateway disconnection UI cleanup
Safe collection pattern: avoids race conditions during UI state updates
Proper task lifecycle: TaskStorage helper for clean cancellation
Configurable constants: magic numbers extracted for maintainability

What This Does NOT Do

Does not kill or interrupt actual background processes
Does not stop running shell commands, model requests, or other work
Only updates the menubar icon state for better user experience

Philosophy

The menubar should always show current status. If we lose track of job state due to API issues, it's better to show "idle" than a frozen "working" state that never updates.

Priority: The menubar must NEVER freeze - user experience is #1

Testing

Comprehensive test coverage with 15 test cases including 6 new edge case tests
Validates UI state cleanup, role recomputation, timestamp refresh, and race conditions
Tests minimum age protection, error handling, and MainActor thread safety
Tests ensure stale display state is cleared to prevent frozen menubar
All tests pass with proper public API usage

Fixes Addressed

✅ Role recomputation bug in streaming job updates
✅ MainActor race conditions in background tasks
✅ Error handling gaps in watchdog loop
✅ Missing minimum age protection for eviction
✅ Magic numbers extracted to constants
✅ Comprehensive test coverage for all scenarios

Result

Guarantees the menubar never stays frozen while preserving all actual background work.

Changed files

apps/macos/Sources/OpenClaw/WorkActivityStore.swift (modified, +112/-8)
apps/macos/Tests/OpenClawIPCTests/WorkActivityStoreTests.swift (modified, +94/-0)

Real-world frequency

This happens daily for heavy users who:

Run 50M+ tokens/day through cloud providers
Get throttled by provider rate limits (Anthropic 429s, overloaded_error)
Have local fallback models configured (Ollama, etc.)
Have large session contexts from workspace files, tool outputs, etc.

Proposed: Degraded Mode

When falling back to a smaller model, OpenClaw should automatically enter a degraded mode that adapts to the fallback model's capabilities:

Context reduction

Strip workspace files (SOUL.md, USER.md, TOOLS.md, MEMORY.md) from context
Use lightContext: true automatically
Truncate conversation history to fit fallback model's context window
Skip loading skills/knowledge files
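
The history-truncation item above could look roughly like this. It is a sketch under assumptions: ChatMessage and truncateHistory are hypothetical names, token counts are assumed to be precomputed per message, and the policy (keep the newest messages that fit) is one possible choice rather than anything the issue specifies.

```typescript
// Hypothetical shape; OpenClaw's real session types may differ.
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  tokens: number; // precomputed token count for this message
  text: string;
}

// Keep the newest messages that fit in `budgetTokens`, preserving order.
// Walks backwards from the most recent message and stops at the first
// message that would push the total over the budget.
function truncateHistory(history: ChatMessage[], budgetTokens: number): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let used = 0;
  for (let i = history.length - 1; i >= 0; i--) {
    const msg = history[i];
    if (used + msg.tokens > budgetTokens) break;
    kept.push(msg);
    used += msg.tokens;
  }
  return kept.reverse();
}
```

The budget would typically be the fallback model's context window minus room for the reply. Stopping at the first message that does not fit keeps the retained history contiguous, at the cost of sometimes leaving part of the budget unused.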

Behavior adaptation

Auto-reply to user: "⚠️ Cloud model unavailable, running in degraded mode. I can run scripts and check status, but complex analysis will wait."
Limit to tool execution only (exec, read, write) — no complex reasoning
Queue complex requests for when cloud returns
Shorter response generation (avoid runaway token generation)
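
One way to picture the tool-only limit and the request queue is a small router. This is an illustrative sketch: DegradedRouter, UserRequest, and the routing rule are assumptions; only the tool names (exec, read, write) and the notice text come from the proposal.

```typescript
// Hypothetical names; the tool list mirrors the proposal (exec, read, write).
const DEGRADED_TOOLS = new Set(["exec", "read", "write"]);

const DEGRADED_NOTICE =
  "⚠️ Cloud model unavailable, running in degraded mode. " +
  "I can run scripts and check status, but complex analysis will wait.";

interface UserRequest {
  id: string;
  tool?: string; // set when the request is a direct tool call
  prompt: string;
}

type Routing =
  | { action: "run"; request: UserRequest }
  | { action: "queued"; notice: string };

class DegradedRouter {
  private readonly deferred: UserRequest[] = [];

  // Allowed tool calls run immediately; everything else is queued.
  route(request: UserRequest): Routing {
    if (request.tool !== undefined && DEGRADED_TOOLS.has(request.tool)) {
      return { action: "run", request };
    }
    this.deferred.push(request);
    return { action: "queued", notice: DEGRADED_NOTICE };
  }

  // Hand back everything that was deferred, oldest first, and clear the queue.
  drain(): UserRequest[] {
    return this.deferred.splice(0, this.deferred.length);
  }
}
```

When the cloud model returns, drain() yields the deferred requests in arrival order so they can be replayed with full context.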

Compaction protection

If fallback model is active, skip compaction entirely (prevent cascade failure)
Or use a separate compaction strategy with aggressive truncation
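
Both compaction options can be expressed as a single decision function. The sketch below is hypothetical: planCompaction and CompactionPlan are invented names, and keeping half of the fallback window when truncating is an arbitrary placeholder, not a value from the issue.

```typescript
type CompactionPlan =
  | { kind: "summarize" } // normal path: the model summarizes old turns
  | { kind: "skip" } // leave the session untouched
  | { kind: "truncate"; keepTokens: number }; // drop old turns, no model call

function planCompaction(
  fallbackActive: boolean,
  sessionTokens: number,
  fallbackContextWindow: number,
  strategy: "skip" | "truncate" = "skip",
): CompactionPlan {
  // Cloud model is healthy: compaction behaves as usual.
  if (!fallbackActive) return { kind: "summarize" };
  // Fallback is active: never ask the small model to summarize.
  if (strategy === "skip" || sessionTokens <= fallbackContextWindow) {
    return { kind: "skip" };
  }
  // Assumption: keep half of the window so the reply still has room.
  return { kind: "truncate", keepTokens: Math.floor(fallbackContextWindow / 2) };
}
```

Either branch avoids the model-driven summarization that fails in the death spiral. The truncate branch trades lost history for a session that stays within the fallback window.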

Recovery

Periodically probe the cloud model (every 5 min) to check if it's back
When cloud returns, seamlessly resume with full context
Log the degraded period for the user
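
The recovery steps amount to a small state machine. A possible sketch follows, with RecoveryMonitor and its methods as hypothetical names. The caller is assumed to perform the actual cloud probe (for example a cheap request to the provider) and report the result; timestamps are passed in as milliseconds so the logic does not depend on a real clock.

```typescript
type Mode = "normal" | "degraded";

interface DegradedPeriod {
  startedAt: number; // ms timestamp when degraded mode began
  endedAt: number; // ms timestamp when the cloud model came back
}

class RecoveryMonitor {
  mode: Mode = "normal";
  readonly log: DegradedPeriod[] = [];
  private degradedSince = 0;
  private lastProbeAt = 0;
  private readonly probeIntervalMs: number;

  // Default interval of 5 minutes follows the proposal.
  constructor(probeIntervalMs: number = 5 * 60 * 1000) {
    this.probeIntervalMs = probeIntervalMs;
  }

  enterDegraded(now: number): void {
    if (this.mode === "degraded") return;
    this.mode = "degraded";
    this.degradedSince = now;
    this.lastProbeAt = now;
  }

  // True when degraded and at least one interval has passed since the last probe.
  shouldProbe(now: number): boolean {
    return this.mode === "degraded" && now - this.lastProbeAt >= this.probeIntervalMs;
  }

  // Record the outcome of a probe performed by the caller.
  recordProbe(now: number, cloudHealthy: boolean): void {
    this.lastProbeAt = now;
    if (this.mode === "degraded" && cloudHealthy) {
      this.log.push({ startedAt: this.degradedSince, endedAt: now });
      this.mode = "normal";
    }
  }
}
```

A timer would call shouldProbe on each tick and, when it returns true, run the probe and pass the result to recordProbe. The log array then holds one entry per degraded period, which can be shown to the user afterwards.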

Related issues

#58556 — Live model switch check prevents model fallback from working
Anthropic billing change (April 4, 2026) — subscription no longer covers third-party harness traffic, increasing throttling likelihood

Environment

OpenClaw 2026.4.2
Primary: github-copilot/claude-opus-4.6
Fallbacks: ollama/granite4:7b-a1b-h, ollama/granite4:3b-h
Session sizes: 2-4MB JSONL, 100K+ token context
Token usage: 50M+/day

extent analysis

TL;DR

Implement a degraded mode in OpenClaw that adapts to the fallback model's capabilities when the primary cloud model is throttled or unavailable.

Guidance

Identify the fallback model's context window and truncate conversation history accordingly to prevent overload.
Implement context reduction by stripping workspace files and using lightContext: true automatically.
Limit the fallback model to tool execution only and queue complex requests for when the cloud model returns.
Consider skipping compaction entirely when the fallback model is active to prevent cascade failure.

Example

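// Sketch of degraded mode, entered when the cloud model is unavailable.
// All names are hypothetical, not OpenClaw APIs.
interface SessionState {
  lightContext: boolean;
  toolsOnly: boolean;
  skipCompaction: boolean;
  contextBudgetTokens: number;
}

function enterDegradedMode(state: SessionState, fallbackContextWindow: number): SessionState {
  return {
    lightContext: true, // strip workspace files from the prompt
    toolsOnly: true, // tool execution only; queue complex requests for later
    skipCompaction: true, // or switch to an aggressive truncation strategy
    contextBudgetTokens: Math.min(state.contextBudgetTokens, fallbackContextWindow),
  };
}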

Notes

The proposed degraded mode should help mitigate the death spiral, but its effectiveness will depend on the specific fallback model and usage patterns, so additional testing and tuning may be needed. Note that the linked PR #60990 only changes macOS menubar display state; it does not implement degraded mode.

Recommendation

Implement the degraded mode. It is a more robust and adaptive response to throttling than an unmodified fallback, and it should improve the user experience and help prevent session corruption or gateway crashes.
