openclaw - 💡(How to fix) Fix Feature request: configurable OVERLOAD_FAILOVER_BACKOFF_POLICY [1 comments, 2 participants]

openclaw • 2026-03-18 16:11:49


GitHub stats

openclaw/openclaw#49912 • Fetched 2026-04-08 01:01:19

Author: alfkaech-heartforge

Participants: alfkaech-heartforge, astroclaw

Timeline (top): commented ×1


Problem

The hardcoded OVERLOAD_FAILOVER_BACKOFF_POLICY uses an initial backoff of 250ms and a max of 1500ms. This is too aggressive for agents with large system prompts (e.g. ~40K tokens), where API overload events are more likely and token processing time is longer.

What happens in practice

When Anthropic returns an overload response during normal operation, OpenClaw retries with 250ms → 500ms → 1000ms → 1500ms backoffs. For a large-prompt agent, the retry itself can also hit overload (the system hasn't recovered yet), causing a cascading failure loop — a restart death spiral where the agent continuously crashes and restarts, never successfully making a request.
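The retry sequence quoted above follows directly from the hardcoded policy: each delay is the previous one multiplied by factor, capped at maxMs. A minimal sketch of that arithmetic, ignoring jitter, is shown here. The helper name backoffSchedule is hypothetical and not part of OpenClaw; the second policy uses the values proposed in this issue.

```javascript
// Hypothetical helper (not an OpenClaw API): lists the first `attempts`
// delays of an exponential backoff policy, ignoring jitter.
function backoffSchedule(policy, attempts) {
  const delays = [];
  let delay = policy.initialMs;
  for (let i = 0; i < attempts; i++) {
    delays.push(Math.min(delay, policy.maxMs)); // cap each delay at maxMs
    delay *= policy.factor;
  }
  return delays;
}

const hardcoded = { initialMs: 250, maxMs: 1500, factor: 2, jitter: 0.2 };
const proposed = { initialMs: 2000, maxMs: 15000, factor: 2, jitter: 0.2 };

console.log(backoffSchedule(hardcoded, 4)); // [ 250, 500, 1000, 1500 ]
console.log(backoffSchedule(proposed, 4));  // [ 2000, 4000, 8000, 15000 ]
```

With the hardcoded values, four retries are spent within about 3.25 seconds in total; with the proposed values the same four retries are spread over 29 seconds, which gives an overloaded backend more time to recover.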

Current workaround

Manually patching the dist files:

// In dist/reply-*.js, dist/compact-*.js, dist/plugin-sdk/dispatch-*.js
const OVERLOAD_FAILOVER_BACKOFF_POLICY = {
  initialMs: 2000,   // was 250
  maxMs: 15000,      // was 1500
  factor: 2,
  jitter: 0.2
};

This workaround is fragile: every npm update openclaw silently overwrites the patched dist files, so the patch has to be reapplied by hand after each update.

Feature Request

Expose overloadFailoverBackoffPolicy as a configurable option in agent config (e.g., clawdbot.json):

{
  "overloadFailoverBackoffPolicy": {
    "initialMs": 2000,
    "maxMs": 15000,
    "factor": 2,
    "jitter": 0.2
  }
}

Requirements

Keep 250ms as the default for backward compatibility — this change should be opt-in
Allow per-agent override so agents with large system prompts can use longer backoffs
Alternatively, auto-scale the backoff based on estimated prompt token count
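The auto-scaling alternative in the last requirement could look roughly like the sketch below. Everything in it is an assumption made for illustration: the function name scalePolicyForPrompt, the 4,000-token baseline, the linear scaling rule, and the 10x cap are not taken from OpenClaw.

```javascript
// Hypothetical sketch: stretch the backoff policy linearly with prompt size.
const DEFAULT_POLICY = { initialMs: 250, maxMs: 1500, factor: 2, jitter: 0.2 };

// Assumed tuning constants, not OpenClaw values.
const BASELINE_TOKENS = 4000; // prompts up to this size keep the default policy
const MAX_SCALE = 10;         // never stretch delays by more than 10x

function scalePolicyForPrompt(estimatedTokens, base = DEFAULT_POLICY) {
  const raw = estimatedTokens / BASELINE_TOKENS;
  const scale = Math.min(Math.max(raw, 1), MAX_SCALE); // clamp to [1, MAX_SCALE]
  return {
    ...base,
    initialMs: Math.round(base.initialMs * scale),
    maxMs: Math.round(base.maxMs * scale),
  };
}

console.log(scalePolicyForPrompt(2000));  // small prompt: default policy unchanged
console.log(scalePolicyForPrompt(40000)); // 40K tokens: initialMs 2500, maxMs 15000
```

Clamping the scale at 1 keeps the 250ms default for lightweight agents, which matches the backward-compatibility requirement; an explicit per-agent override would presumably still take precedence over any computed value.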

Impact

Agents with large system prompts (~40K+ tokens) are significantly more vulnerable to overload death spirals. The 250ms default was likely designed for lightweight agents and doesn't account for the variance in prompt complexity across different deployments.

Environment

OpenClaw version: latest (npm)
Affected agent sizes: ~40K token system prompts
Workaround patch applied to: dist/reply-*.js, dist/compact-*.js, dist/plugin-sdk/dispatch-*.js

Extended analysis

Fix Plan

To address the issue, we will expose overloadFailoverBackoffPolicy as a configurable option in the agent config. Here are the steps:

Update the clawdbot.json config file to include the overloadFailoverBackoffPolicy option:

{
  "overloadFailoverBackoffPolicy": {
    "initialMs": 2000,
    "maxMs": 15000,
    "factor": 2,
    "jitter": 0.2
  }
}

In the OpenClaw code, add a check to load the overloadFailoverBackoffPolicy from the agent config:

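// Illustrative: the config path is an assumption.
const config = require('./clawdbot.json');
// Hardcoded defaults, kept for backward compatibility.
const DEFAULTS = { initialMs: 250, maxMs: 1500, factor: 2, jitter: 0.2 };
// Merge so a partial override (e.g. only initialMs) keeps the other defaults.
const overloadFailoverBackoffPolicy = {
  ...DEFAULTS,
  ...config.overloadFailoverBackoffPolicy
};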

Use the loaded overloadFailoverBackoffPolicy in the retry logic:

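// Illustrative: assumes the `backoff` npm package; OpenClaw's own retry code may differ.
const backoff = require('backoff');
// The policy multiplies the delay by `factor`, so it maps to the exponential strategy.
const retry = backoff.exponential({
  initialDelay: overloadFailoverBackoffPolicy.initialMs,
  maxDelay: overloadFailoverBackoffPolicy.maxMs,
  factor: overloadFailoverBackoffPolicy.factor,
  randomisationFactor: overloadFailoverBackoffPolicy.jitter
});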
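To see the policy applied end to end without a third-party library, here is a minimal sketch of a retry loop. The names delayFor, retryWithBackoff, isOverloaded, and maxAttempts are hypothetical, and the symmetric jitter formula is one common choice, not necessarily what OpenClaw does internally.

```javascript
// Hypothetical sketch of a retry loop driven by a backoff policy.
// Delay for a 0-based attempt: initialMs * factor^attempt, capped at maxMs,
// then randomized by +/- jitter (0.2 means +/- 20% of the capped delay).
function delayFor(policy, attempt, random = Math.random) {
  const base = Math.min(policy.initialMs * policy.factor ** attempt, policy.maxMs);
  const spread = base * policy.jitter;
  return Math.round(base - spread + 2 * spread * random());
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `request` until it succeeds, the error is not an overload,
// or `maxAttempts` tries have been used; the last error is rethrown.
async function retryWithBackoff(
  request,
  policy,
  { maxAttempts = 5, isOverloaded = () => true } = {}
) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await request();
    } catch (error) {
      if (attempt + 1 >= maxAttempts || !isOverloaded(error)) throw error;
      await sleep(delayFor(policy, attempt));
    }
  }
}

// Usage (assumed): wrap whatever function performs the API call, e.g.
// retryWithBackoff(() => sendRequest(prompt), overloadFailoverBackoffPolicy);
```

Passing the random source as a parameter keeps delayFor deterministic in tests, and bounding the loop with maxAttempts means a persistent overload surfaces as an error instead of retrying indefinitely.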

Verification

To verify that the fix worked, you can test the agent with a large system prompt (~40K tokens) and check that it no longer enters a restart death spiral. You can also monitor the agent's logs to ensure that the retry backoff policy is being applied correctly.

Extra Tips

Make sure to update the clawdbot.json config file for each agent that requires a custom overloadFailoverBackoffPolicy.
Consider adding a warning or error message if the overloadFailoverBackoffPolicy is not configured correctly.
You can also explore auto-scaling the backoff based on estimated prompt token count to further improve the agent's resilience.
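The second tip, warning on a misconfigured policy, might be implemented along these lines. The function name validateBackoffPolicy and the specific rules are assumptions chosen for illustration; the issue does not define what counts as a valid policy.

```javascript
// Hypothetical validator: returns a list of human-readable problems
// (empty when the policy looks sane). The rules are illustrative assumptions.
function validateBackoffPolicy(policy) {
  if (policy === null || typeof policy !== "object") {
    return ["overloadFailoverBackoffPolicy must be an object"];
  }
  const problems = [];
  for (const key of ["initialMs", "maxMs", "factor", "jitter"]) {
    if (typeof policy[key] !== "number" || !Number.isFinite(policy[key])) {
      problems.push(`${key} must be a finite number`);
    }
  }
  if (problems.length > 0) return problems; // skip range checks on bad types
  if (policy.initialMs <= 0) problems.push("initialMs must be greater than 0");
  if (policy.maxMs < policy.initialMs) problems.push("maxMs must be at least initialMs");
  if (policy.factor < 1) problems.push("factor must be at least 1");
  if (policy.jitter < 0 || policy.jitter > 1) problems.push("jitter must be between 0 and 1");
  return problems;
}

// Example: maxMs is smaller than initialMs, so one warning is printed.
const problems = validateBackoffPolicy({ initialMs: 2000, maxMs: 1500, factor: 2, jitter: 0.2 });
for (const problem of problems) {
  console.warn(`Invalid overloadFailoverBackoffPolicy: ${problem}`);
}
```

Returning a list instead of throwing lets the caller decide whether to log a warning and fall back to the defaults or to refuse to start.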
