openclaw - ✅(Solved) Fix [Bug]: active-memory plugin infinite retry on API timeout causes token waste [1 pull requests, 2 comments, 3 participants]

ariesy · 2026-04-29T03:57:07Z

openclaw/openclaw#74054•Fetched 2026-04-30 06:29:21

When using the xiaomi/mimo-v2-flash model with the active-memory plugin, API calls frequently time out (15s threshold), but the plugin keeps retrying indefinitely because no retry limit can be configured. This causes:

1. Token waste (API requests are sent and tokens consumed even on timeout)
2. Context overflow when messages pile up
3. Stuck sessions that remain in processing state for 10+ minutes

Error Message

error=Context overflow: estimated context size exceeds safe threshold during tool loop

PR fix notes

PR #74158: feat(active-memory): add timeout circuit breaker to skip recall after consecutive failures

Repository: openclaw/openclaw
Author: yelog
State: closed | merged: True
Link: https://github.com/openclaw/openclaw/pull/74158

Description (problem / solution / changelog)

Summary

Add a per-agent/model circuit breaker to the Active Memory plugin that skips recall after consecutive timeouts, preventing token waste and stuck sessions.
Add two new configurable options: circuitBreakerMaxTimeouts (default: 3) and circuitBreakerCooldownMs (default: 60s).
Add 4 focused tests covering circuit breaker trip, reset, and config normalization.

Closes #74054

Problem

When using a slow-responding model (e.g., xiaomi/mimo-v2-flash with 15s+ response times) with the Active Memory plugin, API calls frequently time out, but the plugin continues to attempt recall on every eligible prompt. Because timeout results are deliberately not cached, each new message triggers another full recall attempt against the same slow model. Every attempt consumes tokens, the backlog causes context overflow, and sessions stay stuck in processing state for 10+ minutes.

The root cause: Active Memory has no plugin-owned mechanism to stop trying after repeated failures against the same provider/model.

Design

Circuit Breaker State

A module-level Map<string, CircuitBreakerEntry> keyed by agentId:provider/model tracks:

consecutiveTimeouts: count of consecutive timeout/timeout_partial results
lastTimeoutAt: timestamp of the last timeout

Behavior

1. Before each recall attempt: Check if the circuit breaker is open (consecutive timeouts >= threshold AND within cooldown window). If open, return { status: "timeout", elapsedMs: 0 } immediately without calling the subagent.
2. After timeout/timeout_partial: Increment the counter via recordCircuitBreakerTimeout().
3. After ok/empty (success): Reset the counter via resetCircuitBreaker().
4. After cooldown expires: The next attempt goes through (one retry allowed), and if it succeeds the breaker resets; if it times out again the breaker re-trips.
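
The four rules above can be sketched as a small state machine. This is an illustrative sketch, not the code merged in the PR: `recordCircuitBreakerTimeout`, `resetCircuitBreaker`, the entry fields, the key format, and the status values come from the description, while `breakerKey`, `isCircuitBreakerOpen`, `recallWithBreaker`, the explicit `now` parameter, and the synchronous `runRecall` callback are assumptions made to keep the example self-contained.

```typescript
// Illustrative sketch only; not the implementation from PR #74158.
interface CircuitBreakerEntry {
  consecutiveTimeouts: number;
  lastTimeoutAt: number; // epoch milliseconds
}

type RecallResult = {
  status: "ok" | "empty" | "timeout" | "timeout_partial";
  elapsedMs: number;
};

const breakers = new Map<string, CircuitBreakerEntry>();

// Key format described in the PR: agentId:provider/model (helper name is hypothetical).
function breakerKey(agentId: string, provider: string, model: string): string {
  return `${agentId}:${provider}/${model}`;
}

// Open = threshold reached AND still inside the cooldown window (hypothetical helper).
function isCircuitBreakerOpen(key: string, maxTimeouts: number, cooldownMs: number, now: number): boolean {
  const entry = breakers.get(key);
  if (!entry) return false;
  return entry.consecutiveTimeouts >= maxTimeouts && now - entry.lastTimeoutAt < cooldownMs;
}

function recordCircuitBreakerTimeout(key: string, now: number): void {
  const entry = breakers.get(key) ?? { consecutiveTimeouts: 0, lastTimeoutAt: 0 };
  entry.consecutiveTimeouts += 1;
  entry.lastTimeoutAt = now; // restarts the cooldown, so a failed probe re-trips the breaker
  breakers.set(key, entry);
}

function resetCircuitBreaker(key: string): void {
  breakers.delete(key);
}

// Hypothetical wrapper; the real recall is asynchronous, this one is synchronous for brevity.
function recallWithBreaker(
  key: string,
  maxTimeouts: number,
  cooldownMs: number,
  now: number,
  runRecall: () => RecallResult,
): RecallResult {
  if (isCircuitBreakerOpen(key, maxTimeouts, cooldownMs, now)) {
    return { status: "timeout", elapsedMs: 0 }; // skip the subagent call entirely
  }
  const result = runRecall();
  if (result.status === "timeout" || result.status === "timeout_partial") {
    recordCircuitBreakerTimeout(key, now);
  } else {
    resetCircuitBreaker(key);
  }
  return result;
}
```

Passing `now` explicitly rather than reading the clock inside the helpers is a design choice of this sketch; it makes the cooldown behavior easy to test without fake timers.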

Config Options

| Option | Type | Default | Range |
|--------|------|---------|-------|
| `circuitBreakerMaxTimeouts` | integer | 3 | 1–20 |
| `circuitBreakerCooldownMs` | integer | 60000 | 5000–600000 |

Both are added to the plugin manifest schema with additionalProperties: false preserved, and include UI hints.
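
The defaults and ranges in the table suggest a normalization step along these lines. This is a sketch under assumptions: `normalizeCircuitBreakerConfig` and `clampInt` are hypothetical names, and falling back to the default for non-numeric input and truncating fractions are guesses about behavior the PR text does not spell out.

```typescript
// Hypothetical normalization sketch; only the option names, defaults, and ranges come from the PR.
interface CircuitBreakerConfig {
  circuitBreakerMaxTimeouts: number;
  circuitBreakerCooldownMs: number;
}

// Assumed behavior: non-numeric or non-finite input falls back to the default,
// fractions are truncated, and the result is clamped into [min, max].
function clampInt(value: unknown, fallback: number, min: number, max: number): number {
  if (typeof value !== "number" || !Number.isFinite(value)) return fallback;
  return Math.min(max, Math.max(min, Math.trunc(value)));
}

function normalizeCircuitBreakerConfig(raw: Record<string, unknown> = {}): CircuitBreakerConfig {
  return {
    circuitBreakerMaxTimeouts: clampInt(raw["circuitBreakerMaxTimeouts"], 3, 1, 20),
    circuitBreakerCooldownMs: clampInt(raw["circuitBreakerCooldownMs"], 60000, 5000, 600000),
  };
}
```

With this sketch, an empty config yields 3 / 60000, and out-of-range values such as 0 timeouts or a 100 ms cooldown are raised to the minimums 1 / 5000.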

Files Changed

| File | Change |
|------|--------|
| `extensions/active-memory/index.ts` | Circuit breaker state, helpers, integration into `maybeResolveActiveRecall`, config normalization, `__testing` exports |
| `extensions/active-memory/index.test.ts` | 4 new tests: breaker trips after consecutive timeouts, breaker resets on success after cooldown, config defaults, config clamping |
| `extensions/active-memory/openclaw.plugin.json` | New config schema properties and UI hints |

Test Plan

pnpm test extensions/active-memory — 99 tests (95 existing + 4 new), all pass
New test cases:
- Circuit breaker trips after maxTimeouts consecutive timeouts (subagent not called again)
- Circuit breaker resets after cooldown + successful recall
- Config normalization produces correct defaults (3 / 60000)
- Config clamping enforces min bounds (1 / 5000)

Notes

Circuit breaker is extension-owned, not a generic runner feature — this follows the architecture principle that extension-specific behavior stays in the extension.
The breaker keys are scoped by agentId:provider/model to avoid cross-agent or cross-model interference.
The existing behavior of not caching timeout results is preserved; the circuit breaker is an additional layer above the cache.
Backward compatible: new config options are optional with sensible defaults.

Changed files

extensions/active-memory/index.test.ts (modified, +143/-0)
extensions/active-memory/index.ts (modified, +98/-0)
extensions/active-memory/openclaw.plugin.json (modified, +10/-0)

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Steps to reproduce

1. Configure OpenClaw with a slow-responding model (e.g., xiaomi/mimo-v2-flash with 15s+ response times)
2. Enable active-memory plugin with default settings
3. Have multiple conversations that trigger active-memory recall
4. Observe repeated timeout events in gateway logs
5. Check token usage - tokens are consumed despite timeouts

Expected behavior

When an API call times out:

1. The plugin should stop retrying after a configurable number of attempts
2. Token consumption should be minimized on timeout
3. The system should gracefully degrade (skip memory recall) instead of infinite retry

Actual behavior

No retry limit: The agent-runner layer retries indefinitely on timeout
Token waste: Each timeout still consumes tokens (API request is sent)
Context overflow: Messages pile up during retry loops, causing Context overflow: estimated context size exceeds safe threshold
Stuck sessions: Sessions remain in processing state for 10+ minutes
No circuit breaker: No mechanism to stop retries after repeated failures

OpenClaw version

2026.4.26 (be8c246)

Install method

npm global

Model

xiaomi/mimo-v2-flash

Provider / routing chain

openclaw -> xiaomi (direct API)

Additional provider/model setup details

"active-memory": {
  "enabled": true,
  "config": {
    "agents": ["main"],
    "queryMode": "recent",
    "promptStyle": "balanced",
    "maxSummaryChars": 220,
    "logging": true,
    "timeoutMs": 15000
  }
}

Logs, screenshots, and evidence

API Timeout Events

[02:16:05.824] active-memory: agent=main session=agent:main:main 
               activeProvider=xiaomi activeModel=mimo-v2-flash 
               start timeoutMs=15000 queryChars=962

[02:16:24.172] active-memory: agent=main session=agent:main:main 
               activeProvider=xiaomi activeModel=mimo-v2-flash 
               done status=timeout elapsedMs=18350 summaryChars=0

[03:01:58.709] active-memory: agent=main session=... 
               activeProvider=xiaomi activeModel=mimo-v2-flash 
               start timeoutMs=15000 queryChars=434

[03:02:33.908] active-memory: agent=main session=... 
               activeProvider=xiaomi activeModel=mimo-v2-flash 
               done status=timeout elapsedMs=35201 summaryChars=0

Context Overflow

[03:06:57.804] [context-overflow-diag] sessionKey=agent:main:feishu:group:... 
               provider=xiaomi/mimo-v2-flash 
               source=assistantError 
               messages=116 
               error=Context overflow: estimated context size exceeds safe threshold during tool loop

Stuck Session

[03:15:46] stuck session: sessionId=main 
           sessionKey=agent:main:feishu:group:... 
           state=processing 
           age=493s (8+ minutes)

[03:17:46] stuck session: ... age=613s (10+ minutes)

Token Consumption

Timeout rate: 100% (4/4 calls timed out)
Average response time: 20.89s (threshold: 15s)
Longest response: 35.20s (135% over threshold)
Estimated token waste: Hundreds of thousands of tokens
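
The "135% over threshold" figure can be reproduced from the logged values (elapsedMs=35201 against timeoutMs=15000). A small sketch of the arithmetic; `overrunPercent` is a hypothetical helper name, not part of the plugin.

```typescript
// Hypothetical helper: how far a call ran past its timeout, as a percentage of the timeout.
function overrunPercent(elapsedMs: number, timeoutMs: number): number {
  return ((elapsedMs - timeoutMs) / timeoutMs) * 100;
}

// Values taken from the two timeout log entries in this issue.
console.log(Math.round(overrunPercent(35201, 15000))); // 135
console.log(Math.round(overrunPercent(18350, 15000))); // 22
```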

Impact and severity

Affected: All users using slow-responding models with active-memory plugin

Severity: High

Token waste (direct cost impact)
Session stuck (blocks user interactions)
Context overflow (degrades system stability)

Frequency: 100% timeout rate observed during testing

Consequence:

Unnecessary token consumption
Poor user experience (long waits, no responses)
System instability (stuck sessions, context overflow)

Additional information

Suggested Fixes

Add retry limit configuration: Add maxRetries config option to active-memory plugin (e.g., 0-3 retries)
Add circuit breaker: Stop retries after N consecutive failures
Add timeout-aware token tracking: Track and report token consumption on timeout
Improve timeout handling: Cancel API request on client-side timeout to potentially save tokens
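
The last suggestion, cancelling the request on client-side timeout, could look roughly like the sketch below. It relies only on the standard `AbortController` API; `callWithTimeout` and `TimedResult` are hypothetical names, and whether aborting actually reduces billed tokens depends on the provider, which this issue does not establish.

```typescript
// Hypothetical sketch of a client-side timeout that also cancels the underlying request.
type TimedResult<T> = { status: "ok"; value: T } | { status: "timeout" };

async function callWithTimeout<T>(
  task: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<TimedResult<T>> {
  const controller = new AbortController();
  // Abort the task once the timeout elapses.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const value = await task(controller.signal);
    return { status: "ok", value };
  } catch (err) {
    // A rejection after our own abort is reported as a timeout; anything else is rethrown.
    if (controller.signal.aborted) return { status: "timeout" };
    throw err;
  } finally {
    clearTimeout(timer);
  }
}
```

The task must honor the signal for this to work; a task that ignores it keeps running and the promise never settles as a timeout. With `fetch`, that means passing the signal through, for example `callWithTimeout((signal) => fetch(url, { signal }), 15000)`, where `url` stands in for whatever endpoint the provider exposes.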

extent analysis

TL;DR

Implement a retry limit and circuit breaker in the active-memory plugin to prevent indefinite retries and token waste on API timeouts.

Guidance

Add a maxRetries configuration option to the active-memory plugin to limit the number of retries on timeout.
Implement a circuit breaker mechanism to stop retries after a specified number of consecutive failures.
Consider adding timeout-aware token tracking to report and minimize token consumption on timeouts.
Evaluate the feasibility of canceling API requests on client-side timeouts to potentially save tokens.

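Example

Using the option names from the merged PR #74158, with their default values (placement under config is assumed to match the plugin's other options):

"active-memory": {
  "config": {
    "circuitBreakerMaxTimeouts": 3,
    "circuitBreakerCooldownMs": 60000
  }
}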

Notes

The suggested fixes require modifications to the active-memory plugin and potentially the agent-runner layer. The temporary workaround of disabling the active-memory plugin or using a faster model may not be suitable for all use cases.

Recommendation

On versions that do not yet include PR #74158, apply the workaround: disable the active-memory plugin or configure a faster fallback model. Either option immediately stops the token waste and the stuck sessions.
