openclaw - ✅(Solved) Fix doctor: agents.defaults.llm.idleTimeoutSeconds auto-fix discards the user value; runtime gives no signal until doctor runs [1 pull requests, 1 comments, 2 participants]

openclaw · 2026-04-30 06:29:47

GitHub stats

openclaw/openclaw#74910 • Fetched 2026-05-01 05:40:03

Author: andhai
Participants: andhai, clawsweeper[bot]
Timeline (top): subscribed ×3, mentioned ×2, commented ×1, cross-referenced ×1

agents.defaults.llm.idleTimeoutSeconds (legacy timeout knob) is correctly recognized by openclaw doctor's deprecation rule introduced in v2026.4.27. Two gaps remain that combined to silently produce 120s timeouts on slow local models:

doctor --fix deletes the legacy block without preserving the user's value. The migrated key (models.providers.<id>.timeoutSeconds) is not populated; the user's "I want 180s" intent is silently dropped.
The runtime emits no warning when the legacy key is present, so users who haven't run openclaw doctor since v2026.4.27 see only the symptom: prefills on slow local models hit a hardcoded 120s ceiling and fall back to the configured fallback model.

Root Cause

resolveLlmIdleTimeoutMs(...) takes its override only from params.modelRequestTimeoutMs, which is derived from models.providers.<id>.timeoutSeconds. Nothing feeds agents.defaults.llm.idleTimeoutSeconds into that resolver, so a config that sets only the legacy key runs with DEFAULT_LLM_IDLE_TIMEOUT_SECONDS = 120.
Two gaps hide this from the user: doctor --fix deletes the legacy block without copying the value into models.providers.<id>.timeoutSeconds, and the runtime emits no warning while the legacy key is present.

Fix Action

Fix proposed (the linked PR was open and not merged at fetch time)

Proposed by PR: Fix legacy LLM timeout diagnostics (https://github.com/openclaw/openclaw/pull/74940)

PR fix notes

PR #74940: Fix legacy LLM timeout diagnostics

Repository: openclaw/openclaw
Author: chiyouYCH
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/74940

Description (problem / solution / changelog)

Summary

Preserve the numeric agents.defaults.llm.idleTimeoutSeconds value in doctor --fix output instead of silently dropping it.
Add a one-shot runtime config warning when agents.defaults.llm is still present, pointing users to models.providers.<id>.timeoutSeconds.
Cover doctor migration and runtime-load diagnostics with focused tests.

Fixes #74910

Tests

pnpm test src/commands/doctor/shared/legacy-config-migrate.test.ts src/config/io.compat.test.ts src/config/config-misc.test.ts src/agents/pi-embedded-runner/run/llm-idle-timeout.test.ts
pnpm exec oxfmt --check --threads=1 src/commands/doctor/shared/legacy-config-migrations.runtime.agents.ts src/commands/doctor/shared/legacy-config-migrate.test.ts src/config/io.ts src/config/io.compat.test.ts src/config/config-misc.test.ts CHANGELOG.md
pnpm check:changed

Changed files

CHANGELOG.md (modified, +0/-8412)
src/commands/doctor/shared/legacy-config-migrate.test.ts (modified, +14/-4)
src/commands/doctor/shared/legacy-config-migrations.runtime.agents.ts (modified, +29/-1)
src/config/io.compat.test.ts (modified, +52/-0)
src/config/io.ts (modified, +57/-0)

Issue body

What I observed

On v2026.4.27 (cbc2ba0931), with this config:

"agents": {
  "defaults": {
    "model": { "primary": "mlx/<slow-local-30b-model>" },
    "llm": { "idleTimeoutSeconds": 180 }
  }
}

openclaw agent --message "<prompt that triggers >120s prefill>" consistently hit a 120s idle timeout, not 180s, then fell back to OpenAI.
src/config/agent-timeout-defaults.ts → DEFAULT_LLM_IDLE_TIMEOUT_SECONDS = 120.
src/agents/pi-embedded-runner/run/llm-idle-timeout.ts → resolveLlmIdleTimeoutMs(...) uses params.modelRequestTimeoutMs (derived from models.providers.<id>.timeoutSeconds) as the override path. There is no path from agents.defaults.llm.idleTimeoutSeconds into this resolver.

The actual user-side fix is to set models.providers.<id>.timeoutSeconds instead, which works as expected.
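
The resolution order described above can be sketched roughly as follows. This is an illustration, not the code in llm-idle-timeout.ts: the function name resolveIdleTimeoutMsSketch and the parameter shape are hypothetical, and only the 120s default and the provider-level override path come from the observations in this issue.

```typescript
// Hypothetical sketch of the resolution order described in the issue.
// Only the 120s default and the provider-level override are taken from the source.
const DEFAULT_LLM_IDLE_TIMEOUT_SECONDS = 120;

interface SketchParams {
  // Derived from models.providers.<id>.timeoutSeconds when that key is set.
  modelRequestTimeoutMs?: number;
}

function resolveIdleTimeoutMsSketch(params: SketchParams): number {
  const override = params.modelRequestTimeoutMs;
  if (typeof override === "number" && Number.isFinite(override) && override > 0) {
    return override;
  }
  // agents.defaults.llm.idleTimeoutSeconds is never consulted, so a config
  // that sets only the legacy key falls through to the default.
  return DEFAULT_LLM_IDLE_TIMEOUT_SECONDS * 1000;
}

// Legacy key only: nothing reaches the resolver.
console.log(resolveIdleTimeoutMsSketch({}));
// Provider-level timeoutSeconds of 180 arrives as milliseconds.
console.log(resolveIdleTimeoutMsSketch({ modelRequestTimeoutMs: 180_000 }));
```

Under these assumptions, the legacy value has no way to influence the result, which matches the observed 120s ceiling.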

What doctor does today

src/commands/doctor/shared/legacy-config-migrations.runtime.agents.ts:

{
  id: "agents.defaults.llm->models.providers.timeoutSeconds",
  legacyRules: [{
    path: ["agents","defaults","llm"],
    message: 'agents.defaults.llm is legacy; use models.providers.<id>.timeoutSeconds for slow model/provider timeouts. Run "openclaw doctor --fix".'
  }],
  apply: (raw, changes) => {
    delete defaults.llm;
    changes.push("Removed agents.defaults.llm; model idle timeout now follows models.providers.<id>.timeoutSeconds.");
  }
}

The detection rule is correct. The apply step:

Deletes the legacy block.
Does not copy idleTimeoutSeconds into any models.providers.<id>.timeoutSeconds.
Does not quote the user's number in the change message, so the user has no breadcrumb to recreate their intent.

End state: the user's explicit "180s" preference is silently dropped on auto-fix.
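
For contrast, one possible shape of an apply step that keeps the number is sketched below. It is not the code from the repository or from any PR. The name migrateLegacyLlmTimeout, the RawConfig type, and the rule "copy only when exactly one provider is configured and it has no timeoutSeconds yet" are all assumptions made for illustration.

```typescript
// Hypothetical sketch; names, types, and the copy rule are illustrative assumptions.
type ProviderConfig = { timeoutSeconds?: number };

interface RawConfig {
  agents?: { defaults?: { llm?: { idleTimeoutSeconds?: unknown } } };
  models?: { providers?: Record<string, ProviderConfig> };
}

function migrateLegacyLlmTimeout(raw: RawConfig, changes: string[]): void {
  const defaults = raw.agents?.defaults;
  const llm = defaults?.llm;
  if (!defaults || llm === undefined) return;

  // Read the legacy value before deleting the block.
  const candidate = llm.idleTimeoutSeconds;
  const value =
    typeof candidate === "number" && Number.isFinite(candidate) && candidate > 0
      ? candidate
      : undefined;

  delete defaults.llm;

  if (value === undefined) {
    changes.push("Removed agents.defaults.llm (no numeric idleTimeoutSeconds to carry over).");
    return;
  }

  const providers = raw.models?.providers ?? {};
  const ids = Object.keys(providers);
  const onlyId = ids.length === 1 ? ids[0] : undefined;
  const only = onlyId === undefined ? undefined : providers[onlyId];

  // Assumption: copy automatically only when the target is unambiguous
  // and the provider has no timeout of its own.
  if (onlyId !== undefined && only !== undefined && only.timeoutSeconds === undefined) {
    only.timeoutSeconds = value;
    changes.push(
      `Moved agents.defaults.llm.idleTimeoutSeconds: ${value} to models.providers.${onlyId}.timeoutSeconds.`,
    );
    return;
  }

  // Otherwise leave providers untouched but quote the number back to the user.
  changes.push(
    `Removed agents.defaults.llm.idleTimeoutSeconds: ${value}. ` +
      `To preserve this behavior, add models.providers.<id>.timeoutSeconds: ${value} ` +
      `to slow providers (detected providers: ${ids.join(", ") || "none"}).`,
  );
}

const config: RawConfig = {
  agents: { defaults: { llm: { idleTimeoutSeconds: 180 } } },
  models: { providers: { mlx: {} } },
};
const changes: string[] = [];
migrateLegacyLlmTimeout(config, changes);
console.log(changes.join("\n"));
```

Either branch leaves the user with the original number, in the config or in the change message, which is the breadcrumb the current apply step lacks.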

Suggested improvements

Either alone would help; both would be ideal.

Preserve intent in the change message. Echo the legacy value back so the user can recreate it:

"Removed agents.defaults.llm.idleTimeoutSeconds: 180. To preserve this behavior, add models.providers.<id>.timeoutSeconds: 180 to slow providers (detected providers: mlx, ollama)."

If there's exactly one non-OpenAI provider configured, optionally offer to apply the value there.
Runtime warning at startup when agents.defaults.llm exists, regardless of whether doctor has been run. One-shot warning naming the legacy path and pointing at openclaw doctor. Today, deprecation visibility is effectively opt-in to running doctor.
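
The one-shot runtime warning could look something like the sketch below. The factory createLegacyLlmWarner and its wiring are hypothetical; where the real runtime loads its config and how it logs are not shown in this issue. The message text reuses the wording of the existing doctor rule.

```typescript
// Hypothetical sketch of a one-shot warning at config load time.
function createLegacyLlmWarner(warn: (message: string) => void = console.warn) {
  let shown = false; // ensures the warning is emitted at most once per warner

  return (raw: unknown): boolean => {
    if (shown) return false;
    const llm = (raw as { agents?: { defaults?: { llm?: unknown } } } | null | undefined)
      ?.agents?.defaults?.llm;
    if (llm === undefined) return false;

    shown = true;
    warn(
      "agents.defaults.llm is legacy; use models.providers.<id>.timeoutSeconds " +
        'for slow model/provider timeouts. Run "openclaw doctor --fix".',
    );
    return true;
  };
}

// Usage: call on every config load; only the first legacy hit produces output.
const warnOnLegacyLlm = createLegacyLlmWarner();
const loaded = { agents: { defaults: { llm: { idleTimeoutSeconds: 180 } } } };
warnOnLegacyLlm(loaded);
warnOnLegacyLlm(loaded); // silent on the second call
```

Keeping the "already shown" flag inside a closure, rather than in module state, makes the behavior easy to test in isolation.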

Repro

// ~/.openclaw/openclaw.json
{
  "agents": {
    "defaults": {
      "model": { "primary": "mlx/<some-slow-local-30b-model>" },
      "llm": { "idleTimeoutSeconds": 600 }
    }
  },
  "models": {
    "providers": {
      "mlx": {
        "baseUrl": "http://127.0.0.1:8080/v1",
        "api": "openai-completions",
        "models": [{ "id": "<some-slow-local-30b-model>" }]
      }
    }
  }
}

Then openclaw agent --message "<prompt that triggers >120s prefill>" times out at 120s, not 600s, with the user having no in-band signal that agents.defaults.llm.idleTimeoutSeconds was the wrong place to set it.

Related diagnostic gap (not a separate issue, just adjacent)

While debugging the timeout symptom, we also bumped into agents.list[i].model.primary silently shadowing agents.defaults.model.primary. The behavior is correct (resolveAgentEffectiveModelPrimary does what it says on the tin), but there's no diagnostic when a user changes their default and a per-agent override quietly keeps the old value. There's already a precedent for inspecting agents.list[].model in src/commands/doctor/shared/codex-route-warnings.ts; a generalized "explicit override differs from defaults" info-level hit there (or in openclaw models list grouping) would close the same kind of "config-says-X-but-runtime-uses-Y" debugging gap.
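
A generalized "explicit override differs from defaults" check might be sketched as below. The function findShadowingOverrides and the AgentsConfig type are hypothetical and do not reflect how codex-route-warnings.ts is written; only the config paths agents.defaults.model.primary and agents.list[i].model.primary come from the issue.

```typescript
// Hypothetical sketch of an info-level diagnostic for per-agent model overrides.
interface AgentEntry {
  model?: { primary?: string };
}

interface AgentsConfig {
  defaults?: { model?: { primary?: string } };
  list?: AgentEntry[];
}

function findShadowingOverrides(agents: AgentsConfig): string[] {
  const defaultPrimary = agents.defaults?.model?.primary;
  if (defaultPrimary === undefined) return [];

  const hits: string[] = [];
  (agents.list ?? []).forEach((agent, index) => {
    const override = agent.model?.primary;
    // Report only explicit overrides that disagree with the default.
    if (override !== undefined && override !== defaultPrimary) {
      hits.push(
        `agents.list[${index}].model.primary (${override}) overrides ` +
          `agents.defaults.model.primary (${defaultPrimary})`,
      );
    }
  });
  return hits;
}

const hits = findShadowingOverrides({
  defaults: { model: { primary: "mlx/new-model" } },
  list: [{ model: { primary: "mlx/old-model" } }, {}],
});
console.log(hits.join("\n"));
```

The check does not change which model is resolved; it only surfaces the difference so the user can see why a changed default had no effect for a given agent.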

Environment

OpenClaw v2026.4.27 (cbc2ba0931)
macOS Darwin 25.3.0, Apple silicon, 48 GB unified memory
mlx_lm.server v0.31.1

extent analysis

TL;DR

As a workaround, set models.providers.<id>.timeoutSeconds instead of agents.defaults.llm.idleTimeoutSeconds for slow model timeouts; the legacy key never reaches the idle-timeout resolver.

Guidance

Identify the legacy agents.defaults.llm.idleTimeoutSeconds configuration and replace it with the new models.providers.<id>.timeoutSeconds configuration to preserve the intended timeout behavior.
Run openclaw doctor to detect the legacy key. On v2026.4.27, doctor --fix deletes the legacy block without preserving the value, so note the configured number before running it.
Consider adding a runtime warning at startup when agents.defaults.llm exists to notify users of the deprecation and point them to the correct configuration.
Verify the fix by checking the timeout behavior with the updated configuration.

Example

// Updated configuration
{
  "agents": {
    "defaults": {
      "model": { "primary": "mlx/<slow-local-30b-model>" }
    }
  },
  "models": {
    "providers": {
      "mlx": {
        "baseUrl": "http://127.0.0.1:8080/v1",
        "api": "openai-completions",
        "models": [{ "id": "<slow-local-30b-model>" }],
        "timeoutSeconds": 180
      }
    }
  }
}

Notes

The current implementation of openclaw doctor deletes the legacy block without preserving the user's value, which can lead to silent timeouts. The suggested improvements aim to address this issue by preserving the user's intent and providing a runtime warning.

Recommendation

Apply the workaround by updating the configuration to use models.providers.<id>.timeoutSeconds instead of agents.defaults.llm.idleTimeoutSeconds for slow model timeouts. This will ensure that the intended timeout behavior is preserved.
