openclaw - Idle-stream timeout (v3.31+) breaks local models with heavy context - PR #55072

openclaw/openclaw#63200 · Fetched 2026-04-09 07:57:06
Comments: 0 · Participants: 1 · Timeline: 0 · Reactions: 0


Description

Starting with OpenClaw v2026.3.31, local models with large context injections fail to respond because a new idle-stream timeout fires before the first token is generated.

Version Affected

  • Broken: v2026.3.31 and newer
  • Working: v2026.3.28 and older

Symptoms

  • Agent processes the prompt for about 60 seconds, then stops with no output
  • No error message displayed - connection simply terminates
  • Issue occurs during prompt prefill/encoding phase, before first token generation
  • Affects large local models (30B+ parameters) with heavy context injection (MEMORY.md + SOUL.md + USER.md + daily logs + conversation history = 50K+ tokens)

Root Cause

PR #55072 introduced an idle-stream timeout that aborts the connection if no tokens arrive within ~60 seconds. For large local models running via LM Studio or llama.cpp, prompt prefill can take longer than 60 seconds before the first token is emitted, especially with:

  • 397B parameter models (e.g., Qwen3.5-397B-A17B)
  • 50K+ tokens of context injection
  • Slower token generation speeds (~10-15 tok/s on consumer hardware)
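To make the failure condition concrete, here is a rough sketch that estimates prefill time as prompt tokens divided by prompt-processing throughput. The ~60-second window and the 50K-token prompt come from this report; the throughput values are hypothetical, since the report does not state a prompt-processing rate (the tok/s figures it quotes are generation speeds).

```python
IDLE_TIMEOUT_S = 60.0    # approximate window described in this report
PROMPT_TOKENS = 50_000   # context size described in this report


def prefill_seconds(prompt_tokens: int, prefill_tok_per_s: float) -> float:
    """Estimate the time spent on prompt prefill before the first token."""
    if prefill_tok_per_s <= 0:
        raise ValueError("prefill_tok_per_s must be positive")
    return prompt_tokens / prefill_tok_per_s


def min_prefill_rate(prompt_tokens: int, timeout_s: float) -> float:
    """Slowest prefill throughput that still finishes within timeout_s."""
    return prompt_tokens / timeout_s


if __name__ == "__main__":
    needed = min_prefill_rate(PROMPT_TOKENS, IDLE_TIMEOUT_S)
    print(f"needed to beat the timeout: {needed:.0f} tok/s")
    for rate in (200.0, 500.0, 1000.0):  # hypothetical prefill rates
        seconds = prefill_seconds(PROMPT_TOKENS, rate)
        verdict = "times out" if seconds > IDLE_TIMEOUT_S else "ok"
        print(f"{rate:>6.0f} tok/s -> {seconds:6.1f} s ({verdict})")
```

Under these assumptions, a backend that processes the prompt at less than roughly 833 tok/s would hit the 60-second window before emitting any token.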

Impact

Users running local models with substantial context cannot upgrade past v2026.3.28. This effectively cuts them off from security patches, bug fixes, and feature updates.

Related Issues

  • #41371 (feature request for configurable timeouts - filed 1 month ago, still open)
  • #59604 (same timeout bug reported 5 days ago)
  • #61487 (closed as "user configuration issue" but actually the same timeout problem)

Proposed Solution

  1. Make the idle-stream timeout configurable via agents.defaults or model-level settings
  2. Increase default timeout for local/runner models to 300-600 seconds
  3. Separate "connection timeout" from "idle-stream timeout" - prefill time should not count against idle timeout
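A minimal sketch of how points 1 and 3 could fit together, assuming an async token stream. It is not OpenClaw's implementation: `StreamTimeouts`, `guarded_stream`, and the field names are hypothetical, and the defaults simply mirror the numbers in this report. The first item gets a long budget that covers prefill; the shorter idle limit applies only once streaming has started.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class StreamTimeouts:
    # Hypothetical settings; these are not real OpenClaw config keys.
    first_token_s: float = 600.0  # budget for prompt prefill
    idle_s: float = 60.0          # max gap between tokens once streaming


async def guarded_stream(
    source: AsyncIterator[T], timeouts: StreamTimeouts
) -> AsyncIterator[T]:
    """Yield items from source, using a longer limit for the first item."""
    iterator = source.__aiter__()
    limit = timeouts.first_token_s
    while True:
        try:
            # Raises asyncio.TimeoutError if nothing arrives within `limit`.
            item = await asyncio.wait_for(iterator.__anext__(), timeout=limit)
        except StopAsyncIteration:
            return
        yield item
        limit = timeouts.idle_s  # prefill is over; switch to the idle limit


async def slow_prefill(prefill_s: float, tokens: List[str]) -> AsyncIterator[str]:
    """Stand-in for a local model: silent during prefill, then streams."""
    await asyncio.sleep(prefill_s)
    for token in tokens:
        yield token


async def collect(prefill_s: float, timeouts: StreamTimeouts) -> List[str]:
    stream = slow_prefill(prefill_s, ["a", "b"])
    return [token async for token in guarded_stream(stream, timeouts)]


if __name__ == "__main__":
    # Prefill (0.05 s) exceeds the idle limit (0.01 s) but not the
    # first-token budget (1 s), so the stream completes.
    print(asyncio.run(collect(0.05, StreamTimeouts(first_token_s=1.0, idle_s=0.01))))
```

With a single shared limit, the same stream would be aborted during prefill. Separating the two limits keeps stall detection for mid-stream hangs while tolerating a slow first token.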

Workaround

Stay on v2026.3.28 until a fix is implemented. No configuration workaround exists in current versions.

Environment

  • OS: Ubuntu 24.04 LTS
  • Model: Qwen3.5-397B-A17B via LM Studio
  • Context size: ~50K+ tokens (MEMORY.md + SOUL.md + USER.md + daily logs)
  • Token generation: ~60 tok/s

This appears to be a regression affecting the local model community. Happy to provide additional testing or logs if helpful.

Extent analysis

TL;DR

The most likely fix is to make the idle-stream timeout configurable and increase the default timeout for local models to accommodate large context injections.

Guidance

  • Verify that the issue occurs only with large local models (30B+ parameters) and substantial context injection (50K+ tokens) by testing with smaller models and context sizes.
  • Consider downgrading to v2026.3.28 as a temporary workaround until a fix is implemented, as no configuration workaround exists in current versions.
  • Monitor the progress of related issues, such as #41371, which requests configurable timeouts, and #59604, which reports the same timeout bug.
  • Test the proposed solution of increasing the default timeout for local models to 300-600 seconds to determine its effectiveness.
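One way to check the first point empirically is to record how long a stream takes to deliver its first chunk and how long the longest mid-stream gap is. The helper below is a hypothetical sketch that works on any iterable of chunks, for example a streaming response from the local server. It is not part of OpenClaw or LM Studio.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class StreamTiming:
    first_chunk_s: Optional[float]  # None if the stream produced nothing
    max_gap_s: float                # longest wait between consecutive chunks
    chunks: int                     # number of chunks received


def time_stream(
    chunks: Iterable[object],
    clock: Callable[[], float] = time.monotonic,
) -> StreamTiming:
    """Consume a stream, recording time-to-first-chunk and the largest gap."""
    start = clock()
    last = start
    first: Optional[float] = None
    max_gap = 0.0
    count = 0
    for _ in chunks:
        now = clock()
        if first is None:
            first = now - start  # covers prompt prefill
        else:
            max_gap = max(max_gap, now - last)
        last = now
        count += 1
    return StreamTiming(first, max_gap, count)
```

If `first_chunk_s` sits above the ~60-second window for large prompts while `max_gap_s` stays small, that points to prefill time rather than a mid-stream stall.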

Example

The issue itself provides no code snippet; it concerns timeout configuration and model settings.

Notes

The issue appears to be a regression introduced in v2026.3.31, and the proposed solution aims to address the idle-stream timeout. However, the effectiveness of the solution may depend on the specific model and context size.

Recommendation

Apply the workaround of staying on v2026.3.28 until a fix is implemented, as it is the most straightforward solution currently available.
