openclaw - Feature: Degraded mode for model fallback (prevent death spiral when cloud model is throttled) [Solved; 1 pull request, 1 participant]

GitHub stats
openclaw/openclaw#60947 · Fetched 2026-04-08 02:45:15
Comments: 0 · Participants: 1 · Timeline: 1 (cross-referenced ×1) · Reactions: 0

When the primary cloud model is throttled or unavailable and OpenClaw falls back to a smaller local model, the fallback model receives the same full context (often 100K+ tokens) that was designed for a large cloud model. This creates a death spiral:

  1. Cloud model returns 429/overloaded → fallback to small local model
  2. Local model (e.g. 7B/3B with 8-32K context) receives 100K+ token session
  3. Local model generates garbage or fails silently
  4. Garbage triggers compaction attempt
  5. Compaction model (also small) can't summarize the massive context → fails
  6. Session grows larger → next attempt is even worse
  7. Repeat until session file is corrupted or gateway crashes
  8. User must manually delete session file to recover
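
Step 2 is where the spiral starts: nothing checks whether the session fits the fallback model before it is dispatched. A minimal sketch of such a pre-flight check follows. The names `FallbackModel`, `estimateTokens`, and `fitsFallback`, the 8192-token window, and the 4-characters-per-token heuristic are all assumptions for illustration, not OpenClaw APIs.

```typescript
// Hypothetical types; OpenClaw's real model and session shapes may differ.
interface FallbackModel {
  id: string;
  contextWindow: number; // maximum tokens the model accepts
}

// Rough estimate: about 4 characters per token (an assumed heuristic, not a tokenizer).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// True only if the session plus a reserved reply budget fits the model's window.
function fitsFallback(
  sessionText: string,
  model: FallbackModel,
  reservedForReply = 1024,
): boolean {
  return estimateTokens(sessionText) + reservedForReply <= model.contextWindow;
}

// Assumed 8K window for a small local model.
const smallModel: FallbackModel = { id: "ollama/granite4:3b-h", contextWindow: 8192 };
const hugeSession = "x".repeat(400000); // about 100K tokens by the heuristic above

console.log(fitsFallback(hugeSession, smallModel)); // false
console.log(fitsFallback("short prompt", smallModel)); // true
```

A check like this would let the gateway refuse or shrink the request at step 2 instead of discovering the mismatch through garbage output at step 3.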

PR fix notes

PR #60990: fix(macos): prevent menubar icon from freezing when AI providers overload

Description (problem / solution / changelog)

Prevents menubar icon from freezing when AI providers return overload/timeout errors and the gateway never sends run-end events.

IMPORTANT: This only affects menubar display state. Background jobs continue running normally.

Implementation

  • Fast watchdog timer: runs every 30 seconds to update menubar display
  • 3-minute eviction threshold: clears stale UI state to prevent frozen menubar
  • Minimum age protection (10 seconds): prevents premature eviction of recent activities
  • Job refresh on activity: streaming updates and tool starts refresh display timestamps
  • Role recomputation fix: streaming jobs now properly update role when main session changes
  • MainActor race condition fixes: proper async/await patterns and error handling
  • Immediate cleanup method: clearAllActivities() for gateway disconnection UI cleanup
  • Safe collection pattern: avoids race conditions during UI state updates
  • Proper task lifecycle: TaskStorage helper for clean cancellation
  • Configurable constants: magic numbers extracted for maintainability
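
The eviction rule described above can be sketched as a pure function. The PR itself is written in Swift (WorkActivityStore.swift); the TypeScript below is only a language-neutral illustration, and `TrackedActivity`, `evictStale`, and the field names are hypothetical. It assumes an entry is evicted when it has not been refreshed for 3 minutes, unless it was created less than 10 seconds ago.

```typescript
// Hypothetical shape of one tracked activity; the PR's Swift types may differ.
interface TrackedActivity {
  id: string;
  startedAtMs: number;   // when the activity was first seen
  refreshedAtMs: number; // last streaming update or tool start (0 = never, assumed)
}

// Thresholds taken from the PR notes. The watchdog itself would call
// evictStale() every 30 seconds; the timer is omitted here.
const STALE_AFTER_MS = 3 * 60 * 1000;
const MIN_AGE_MS = 10 * 1000;

// Returns the activities that survive one watchdog tick.
function evictStale(activities: TrackedActivity[], nowMs: number): TrackedActivity[] {
  return activities.filter((a) => {
    const tooYoung = nowMs - a.startedAtMs < MIN_AGE_MS;
    const stale = nowMs - a.refreshedAtMs >= STALE_AFTER_MS;
    return tooYoung || !stale; // keep new entries and recently refreshed entries
  });
}

const tickMs = 1000000;
const survivors = evictStale(
  [
    { id: "fresh", startedAtMs: tickMs - 5000, refreshedAtMs: 0 },
    { id: "streaming", startedAtMs: tickMs - 600000, refreshedAtMs: tickMs - 20000 },
    { id: "stuck", startedAtMs: tickMs - 600000, refreshedAtMs: tickMs - 200000 },
  ],
  tickMs,
);
console.log(survivors.map((a) => a.id)); // [ 'fresh', 'streaming' ]
```

In this sketch the minimum-age guard only matters when an entry has no usable refresh timestamp yet, as with "fresh" above. Whether the PR applies the guard the same way is an assumption.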

What This Does NOT Do

  • Does not kill or interrupt actual background processes
  • Does not stop running shell commands, model requests, or other work
  • Only updates the menubar icon state for better user experience

Philosophy

The menubar should always show current status. If we lose track of job state due to API issues, it's better to show "idle" than a frozen "working" state that never updates.

Priority: The menubar must NEVER freeze - user experience is #1

Testing

  • Comprehensive test coverage with 15 test cases including 6 new edge case tests
  • Validates UI state cleanup, role recomputation, timestamp refresh, and race conditions
  • Tests minimum age protection, error handling, and MainActor thread safety
  • Tests ensure stale display state is cleared to prevent frozen menubar
  • All tests pass with proper public API usage

Fixes Addressed

  • ✅ Role recomputation bug in streaming job updates
  • ✅ MainActor race conditions in background tasks
  • ✅ Error handling gaps in watchdog loop
  • ✅ Missing minimum age protection for eviction
  • ✅ Magic numbers extracted to constants
  • ✅ Comprehensive test coverage for all scenarios

Result

Guarantees the menubar never stays frozen while preserving all actual background work.

Changed files

  • apps/macos/Sources/OpenClaw/WorkActivityStore.swift (modified, +112/-8)
  • apps/macos/Tests/OpenClawIPCTests/WorkActivityStoreTests.swift (modified, +94/-0)
Raw issue body

Real-world frequency

This happens daily for heavy users who:

  • Run 50M+ tokens/day through cloud providers
  • Get throttled by provider rate limits (Anthropic 429s, overloaded_error)
  • Have local fallback models configured (Ollama, etc.)
  • Have large session contexts from workspace files, tool outputs, etc.

Proposed: Degraded Mode

When falling back to a smaller model, OpenClaw should automatically enter a degraded mode that adapts to the fallback model's capabilities:

Context reduction

  • Strip workspace files (SOUL.md, USER.md, TOOLS.md, MEMORY.md) from context
  • Use lightContext: true automatically
  • Truncate conversation history to fit fallback model's context window
  • Skip loading skills/knowledge files
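
A minimal sketch of the history-truncation bullet, assuming workspace and skills files have already been dropped from the message list. `ChatMessage`, `approxTokens`, and `truncateToBudget` are hypothetical names, and the 4-characters-per-token estimate is an assumed stand-in for a real tokenizer.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

// Assumed heuristic: about 4 characters per token.
const approxTokens = (m: ChatMessage): number => Math.ceil(m.content.length / 4);

// Keep system messages, then add the newest other messages until the budget is spent.
function truncateToBudget(messages: ChatMessage[], budgetTokens: number): ChatMessage[] {
  const system = messages.filter((m) => m.role === "system");
  let remaining = budgetTokens - system.reduce((n, m) => n + approxTokens(m), 0);
  const kept: ChatMessage[] = [];
  // Walk from newest to oldest so recent turns win.
  for (const m of messages.slice().reverse()) {
    if (m.role === "system") continue;
    const cost = approxTokens(m);
    if (cost > remaining) break; // stop at the first message that does not fit
    remaining -= cost;
    kept.unshift(m); // restore chronological order
  }
  return [...system, ...kept];
}

const transcript: ChatMessage[] = [
  { role: "system", content: "You are helpful." }, // 4 tokens
  { role: "user", content: "a".repeat(400) },      // 100 tokens
  { role: "assistant", content: "b".repeat(40) },  // 10 tokens
  { role: "user", content: "c".repeat(40) },       // 10 tokens
];
const trimmed = truncateToBudget(transcript, 30);
console.log(trimmed.map((m) => m.role)); // [ 'system', 'assistant', 'user' ]
```

Stopping at the first message that does not fit keeps the retained history contiguous, so the fallback model never sees a reply without the turn that prompted it.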

Behavior adaptation

  • Auto-reply to user: "⚠️ Cloud model unavailable, running in degraded mode. I can run scripts and check status, but complex analysis will wait."
  • Limit to tool execution only (exec, read, write) — no complex reasoning
  • Queue complex requests for when cloud returns
  • Shorter response generation (avoid runaway token generation)
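
The tool-only restriction and the deferred queue could look roughly like this. The tool names `exec`, `read`, and `write` come from the proposal; `Incoming`, `DegradedRouter`, and the split between "tool" and "chat" requests are assumptions made for the sketch.

```typescript
// Tool names come from the proposal; everything else here is hypothetical.
const DEGRADED_TOOLS = new Set(["exec", "read", "write"]);

type Incoming =
  | { kind: "tool"; tool: string; args: string }
  | { kind: "chat"; text: string };

type Decision = "run-now" | "queued";

class DegradedRouter {
  readonly deferred: Incoming[] = [];

  // Run allowlisted tool calls immediately; defer everything else.
  route(req: Incoming): Decision {
    if (req.kind === "tool" && DEGRADED_TOOLS.has(req.tool)) return "run-now";
    this.deferred.push(req);
    return "queued";
  }

  // Hand back (and clear) the deferred requests once the cloud model returns.
  drain(): Incoming[] {
    return this.deferred.splice(0, this.deferred.length);
  }
}

const router = new DegradedRouter();
console.log(router.route({ kind: "tool", tool: "exec", args: "ls" }));       // run-now
console.log(router.route({ kind: "chat", text: "Refactor the auth module" })); // queued
console.log(router.drain().length); // 1
```

How a request is classified as "complex" is left open by the proposal; this sketch simply treats anything that is not an allowlisted tool call as deferred work.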

Compaction protection

  • If fallback model is active, skip compaction entirely (prevent cascade failure)
  • Or use a separate compaction strategy with aggressive truncation
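
One way to combine the two options above is to decide a compaction plan up front: skip when the session already fits, and fall back to plain truncation (no model call) when it does not. `CompactionPlan`, `planCompaction`, and the keep-half-the-window policy are assumptions for this sketch.

```typescript
type CompactionPlan =
  | { action: "normal" }                        // primary model: leave compaction as-is
  | { action: "skip" }                          // fallback active and session fits
  | { action: "truncate"; keepTokens: number }; // drop old history without a model call

function planCompaction(
  onFallback: boolean,
  sessionTokens: number,
  fallbackContextWindow: number,
): CompactionPlan {
  if (!onFallback) return { action: "normal" };
  // Never ask the small model to summarize; that is the cascade in steps 4-6.
  if (sessionTokens <= fallbackContextWindow) return { action: "skip" };
  // Assumed policy: keep half the window so the reply still has room.
  return { action: "truncate", keepTokens: Math.floor(fallbackContextWindow / 2) };
}

console.log(planCompaction(false, 100000, 8192)); // { action: 'normal' }
console.log(planCompaction(true, 4000, 8192));    // { action: 'skip' }
console.log(planCompaction(true, 100000, 8192));  // { action: 'truncate', keepTokens: 4096 }
```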

Recovery

  • Periodically probe the cloud model (every 5 min) to check if it's back
  • When cloud returns, seamlessly resume with full context
  • Log the degraded period for the user
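
The recovery loop can be modeled as a small state machine with the clock passed in, which keeps it testable without timers or network calls. `DegradedState`, `probeDue`, and `applyProbe` are hypothetical names; the health check itself (the `cloudHealthy` flag) is assumed to come from whatever probe request the gateway sends.

```typescript
const PROBE_INTERVAL_MS = 5 * 60 * 1000; // "every 5 min" from the proposal

interface DegradedState {
  degradedSinceMs: number | null; // null means the cloud model is in use
  lastProbeMs: number;
}

// Decide whether a probe is due at time nowMs.
function probeDue(s: DegradedState, nowMs: number): boolean {
  return s.degradedSinceMs !== null && nowMs - s.lastProbeMs >= PROBE_INTERVAL_MS;
}

// Apply a probe result; returns the new state and, on recovery, a log line.
function applyProbe(
  s: DegradedState,
  nowMs: number,
  cloudHealthy: boolean,
): { next: DegradedState; log?: string } {
  if (s.degradedSinceMs === null) return { next: s };
  if (!cloudHealthy) return { next: { ...s, lastProbeMs: nowMs } };
  const minutes = Math.round((nowMs - s.degradedSinceMs) / 60000);
  return {
    next: { degradedSinceMs: null, lastProbeMs: nowMs },
    log: `Degraded mode lasted ${minutes} min; cloud model restored.`,
  };
}

const start: DegradedState = { degradedSinceMs: 0, lastProbeMs: 0 };
console.log(probeDue(start, 4 * 60000)); // false
console.log(probeDue(start, 5 * 60000)); // true
const stillDown = applyProbe(start, 5 * 60000, false).next;
console.log(applyProbe(stillDown, 10 * 60000, true).log);
// Degraded mode lasted 10 min; cloud model restored.
```

Resuming with full context after recovery is not shown; it depends on the untruncated session being preserved while degraded mode is active.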

Related issues

  • #58556 — Live model switch check prevents model fallback from working
  • Anthropic billing change (April 4, 2026) — subscription no longer covers third-party harness traffic, increasing throttling likelihood

Environment

  • OpenClaw 2026.4.2
  • Primary: github-copilot/claude-opus-4.6
  • Fallbacks: ollama/granite4:7b-a1b-h, ollama/granite4:3b-h
  • Session sizes: 2-4MB JSONL, 100K+ token context
  • Token usage: 50M+/day

extent analysis

TL;DR

Implement a degraded mode in OpenClaw that adapts to the fallback model's capabilities when the primary cloud model is throttled or unavailable.

Guidance

  • Identify the fallback model's context window and truncate conversation history accordingly to prevent overload.
  • Implement context reduction by stripping workspace files and using lightContext: true automatically.
  • Limit the fallback model to tool execution only and queue complex requests for when the cloud model returns.
  • Consider skipping compaction entirely when the fallback model is active to prevent cascade failure.

Example

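// Pseudo-code sketch of degraded mode; every function name here is hypothetical.
if (cloudModelUnavailable) {
  // Enter degraded mode: shrink the context to fit the fallback model.
  setLightContext(true);
  truncateConversationHistory(fallbackModelContextWindow);
  // Restrict behavior to tool execution and defer complex requests.
  limitExecutionToToolsOnly();
  queueComplexRequests();
  // Avoid the compaction cascade: skip compaction, or use aggressive truncation instead.
  skipCompaction();
}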

Notes

The proposed degraded mode should help mitigate the death spiral issue, but its effectiveness may depend on the specific fallback model and usage patterns. Additional testing and tuning may be necessary to ensure a seamless user experience.

Recommendation

Implement the proposed degraded mode. Adapting context size, behavior, and compaction to the fallback model addresses the throttling problem more robustly than plain fallback, and should prevent the session corruption and gateway crashes described above.
