openclaw - ✅ (Solved) Fix doctor: agents.defaults.llm.idleTimeoutSeconds auto-fix discards the user value; runtime gives no signal until doctor runs [1 pull request, 1 comment, 2 participants]

GitHub stats
openclaw/openclaw#74910 · Fetched 2026-05-01 05:40:03
Comments: 1 · Participants: 2 · Timeline: 7 · Reactions: 2
Timeline (top): subscribed ×3, mentioned ×2, commented ×1, cross-referenced ×1

Root Cause

agents.defaults.llm.idleTimeoutSeconds (legacy timeout knob) is correctly recognized by openclaw doctor's deprecation rule introduced in v2026.4.27. Two gaps remain that combined to silently produce 120s timeouts on slow local models:

  1. doctor --fix deletes the legacy block without preserving the user's value. The migrated key (models.providers.<id>.timeoutSeconds) is not populated; the user's "I want 180s" intent is silently dropped.
  2. The runtime emits no warning when the legacy key is present, so users who haven't run openclaw doctor since v2026.4.27 see only the symptom: prefills on slow local models hit a hardcoded 120s ceiling and fall back to the configured fallback model.

Fix Action

Fixed

PR fix notes

PR #74940: Fix legacy LLM timeout diagnostics

Description (problem / solution / changelog)

Summary

  • Preserve the numeric agents.defaults.llm.idleTimeoutSeconds value in doctor --fix output instead of silently dropping it.
  • Add a one-shot runtime config warning when agents.defaults.llm is still present, pointing users to models.providers.<id>.timeoutSeconds.
  • Cover doctor migration and runtime-load diagnostics with focused tests.
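
As a rough illustration of the second bullet, a one-shot warning could be gated by a module-level set so that repeated config loads stay quiet. This is a minimal sketch under stated assumptions, not the code from PR #74940: the names RawConfig, warnedKeys and warnLegacyLlmOnce, and the message wording, are all hypothetical.

```typescript
// Hypothetical sketch: none of these names come from the OpenClaw codebase.
type RawConfig = { agents?: { defaults?: { llm?: unknown } } };

// Remembers which legacy paths have already produced a warning in this process.
const warnedKeys = new Set<string>();

/**
 * Calls `warn` the first time a config containing agents.defaults.llm is seen.
 * Returns true when a warning was emitted, false otherwise.
 */
function warnLegacyLlmOnce(config: RawConfig, warn: (msg: string) => void): boolean {
  const key = "agents.defaults.llm";
  if (config.agents?.defaults?.llm === undefined || warnedKeys.has(key)) {
    return false;
  }
  warnedKeys.add(key);
  // Assumed wording; the real message may differ.
  warn(
    `${key} is legacy; use models.providers.<id>.timeoutSeconds for slow model/provider timeouts. Run "openclaw doctor --fix".`,
  );
  return true;
}
```

Passing the `warn` callback in, rather than logging directly, keeps the sketch testable without capturing console output.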

Fixes #74910

Tests

  • pnpm test src/commands/doctor/shared/legacy-config-migrate.test.ts src/config/io.compat.test.ts src/config/config-misc.test.ts src/agents/pi-embedded-runner/run/llm-idle-timeout.test.ts
  • pnpm exec oxfmt --check --threads=1 src/commands/doctor/shared/legacy-config-migrations.runtime.agents.ts src/commands/doctor/shared/legacy-config-migrate.test.ts src/config/io.ts src/config/io.compat.test.ts src/config/config-misc.test.ts CHANGELOG.md
  • pnpm check:changed

Changed files

  • CHANGELOG.md (modified, +0/-8412)
  • src/commands/doctor/shared/legacy-config-migrate.test.ts (modified, +14/-4)
  • src/commands/doctor/shared/legacy-config-migrations.runtime.agents.ts (modified, +29/-1)
  • src/config/io.compat.test.ts (modified, +52/-0)
  • src/config/io.ts (modified, +57/-0)


What I observed

On v2026.4.27 (cbc2ba0931), with this config:

"agents": {
  "defaults": {
    "model": { "primary": "mlx/<slow-local-30b-model>" },
    "llm": { "idleTimeoutSeconds": 180 }
  }
}

  • openclaw agent --message "<prompt that triggers >120s prefill>" consistently hit a 120s idle timeout, not 180s, then fell back to OpenAI.
  • src/config/agent-timeout-defaults.ts: DEFAULT_LLM_IDLE_TIMEOUT_SECONDS = 120.
  • src/agents/pi-embedded-runner/run/llm-idle-timeout.ts: resolveLlmIdleTimeoutMs(...) uses params.modelRequestTimeoutMs (derived from models.providers.<id>.timeoutSeconds) as the override path. There is no path from agents.defaults.llm.idleTimeoutSeconds into this resolver.

The actual user-side fix is to set models.providers.<id>.timeoutSeconds instead, which works as expected.
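
The lookup described above can be modelled in a few lines to show why the legacy key has no effect. This is only a sketch of the observed behavior, not the real resolveLlmIdleTimeoutMs: the function name resolveIdleTimeoutMsSketch and the ResolveParams shape are assumptions, and only the 120-second default and the modelRequestTimeoutMs override path come from the report.

```typescript
// Hypothetical model of the lookup described above; not the real resolver.
const DEFAULT_LLM_IDLE_TIMEOUT_SECONDS = 120; // value quoted from src/config/agent-timeout-defaults.ts

interface ResolveParams {
  /** Derived from models.providers.<id>.timeoutSeconds, in milliseconds. */
  modelRequestTimeoutMs?: number;
}

function resolveIdleTimeoutMsSketch(params: ResolveParams): number {
  const override = params.modelRequestTimeoutMs;
  if (typeof override === "number" && Number.isFinite(override) && override > 0) {
    return override;
  }
  // agents.defaults.llm.idleTimeoutSeconds is never consulted, so the default applies.
  return DEFAULT_LLM_IDLE_TIMEOUT_SECONDS * 1000;
}

// Legacy key only: nothing reaches the resolver.
console.log(resolveIdleTimeoutMsSketch({})); // 120000
// Provider-level timeoutSeconds of 180 arrives as 180000 ms.
console.log(resolveIdleTimeoutMsSketch({ modelRequestTimeoutMs: 180 * 1000 })); // 180000
```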

What doctor does today

src/commands/doctor/shared/legacy-config-migrations.runtime.agents.ts:

{
  id: "agents.defaults.llm->models.providers.timeoutSeconds",
  legacyRules: [{
    path: ["agents","defaults","llm"],
    message: 'agents.defaults.llm is legacy; use models.providers.<id>.timeoutSeconds for slow model/provider timeouts. Run "openclaw doctor --fix".'
  }],
  apply: (raw, changes) => {
    delete defaults.llm;
    changes.push("Removed agents.defaults.llm; model idle timeout now follows models.providers.<id>.timeoutSeconds.");
  }
}

The detection rule is correct. The apply step:

  • Deletes the legacy block.
  • Does not copy idleTimeoutSeconds into any models.providers.<id>.timeoutSeconds.
  • Does not quote the user's number in the change message, so the user has no breadcrumb to recreate their intent.

End state: the user's explicit "180s" preference is silently dropped on auto-fix.

Suggested improvements

Either alone would help; both would be ideal.

  1. Preserve intent in the change message. Echo the legacy value back so the user can recreate it:

    "Removed agents.defaults.llm.idleTimeoutSeconds: 180. To preserve this behavior, add models.providers.<id>.timeoutSeconds: 180 to slow providers (detected providers: mlx, ollama)."

    If there's exactly one non-OpenAI provider configured, optionally offer to apply the value there.

  2. Runtime warning at startup when agents.defaults.llm exists, regardless of whether doctor has been run. One-shot warning naming the legacy path and pointing at openclaw doctor. Today, deprecation visibility is effectively opt-in to running doctor.
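
Suggestion 1 could be sketched roughly as follows. This is an illustration under stated assumptions, not the change that shipped: the RawConfig and Provider types and the name migrateLegacyLlmTimeout are hypothetical, and the sketch simplifies "exactly one non-OpenAI provider" to "exactly one configured provider with no timeoutSeconds of its own".

```typescript
// Hypothetical sketch; types and helper names are illustrative, not OpenClaw's.
type Provider = { timeoutSeconds?: number; [key: string]: unknown };
type RawConfig = {
  agents?: { defaults?: { llm?: { idleTimeoutSeconds?: unknown }; model?: unknown } };
  models?: { providers?: Record<string, Provider> };
};

function migrateLegacyLlmTimeout(raw: RawConfig, changes: string[]): void {
  const defaults = raw.agents?.defaults;
  if (!defaults || defaults.llm === undefined) return;

  const legacy = defaults.llm.idleTimeoutSeconds;
  delete defaults.llm;

  if (typeof legacy !== "number" || !Number.isFinite(legacy) || legacy <= 0) {
    // No usable number to preserve; fall back to the existing message.
    changes.push("Removed agents.defaults.llm; model idle timeout now follows models.providers.<id>.timeoutSeconds.");
    return;
  }

  const providers = raw.models?.providers ?? {};
  const ids = Object.keys(providers);
  const [onlyId] = ids;
  const only = onlyId !== undefined ? providers[onlyId] : undefined;

  // Copy automatically only when the target is unambiguous and has no explicit timeout.
  if (ids.length === 1 && only !== undefined && only.timeoutSeconds === undefined) {
    only.timeoutSeconds = legacy;
    changes.push(`Moved agents.defaults.llm.idleTimeoutSeconds: ${legacy} to models.providers.${onlyId}.timeoutSeconds.`);
    return;
  }

  // Otherwise echo the value so the user can recreate their intent by hand.
  const detected = ids.length > 0 ? ` (detected providers: ${ids.join(", ")})` : "";
  changes.push(
    `Removed agents.defaults.llm.idleTimeoutSeconds: ${legacy}. To preserve this behavior, add models.providers.<id>.timeoutSeconds: ${legacy} to slow providers${detected}.`,
  );
}
```

In the ambiguous case the sketch changes nothing under models.providers and relies on the change message alone, which keeps the auto-fix from guessing which provider is the slow one.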

Repro

// ~/.openclaw/openclaw.json
{
  "agents": {
    "defaults": {
      "model": { "primary": "mlx/<some-slow-local-30b-model>" },
      "llm": { "idleTimeoutSeconds": 600 }
    }
  },
  "models": {
    "providers": {
      "mlx": {
        "baseUrl": "http://127.0.0.1:8080/v1",
        "api": "openai-completions",
        "models": [{ "id": "<some-slow-local-30b-model>" }]
      }
    }
  }
}

Then openclaw agent --message "<prompt that triggers >120s prefill>" times out at 120s, not 600s, with the user having no in-band signal that agents.defaults.llm.idleTimeoutSeconds was the wrong place to set it.

Related diagnostic gap (not a separate issue, just adjacent)

While debugging the timeout symptom, we also bumped into agents.list[i].model.primary silently shadowing agents.defaults.model.primary. The behavior is correct (resolveAgentEffectiveModelPrimary does what it says on the tin), but there's no diagnostic when a user changes their default and a per-agent override quietly keeps the old value. There's already a precedent for inspecting agents.list[].model in src/commands/doctor/shared/codex-route-warnings.ts; a generalized "explicit override differs from defaults" info-level hit there (or in openclaw models list grouping) would close the same kind of "config-says-X-but-runtime-uses-Y" debugging gap.
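
A generalized "explicit override differs from defaults" check might look like the following. This is a minimal sketch, assuming a config shape inferred from the key paths named above: the types and the function name findShadowedPrimaries are hypothetical and do not mirror codex-route-warnings.ts.

```typescript
// Hypothetical sketch; the config shape is inferred from the key paths in the text.
type ModelRef = { primary?: string };
type AgentsConfig = {
  defaults?: { model?: ModelRef };
  list?: Array<{ model?: ModelRef }>;
};

/** Returns one info-level message per agent whose model.primary differs from the default. */
function findShadowedPrimaries(agents: AgentsConfig): string[] {
  const defaultPrimary = agents.defaults?.model?.primary;
  if (defaultPrimary === undefined) return [];

  const hits: string[] = [];
  (agents.list ?? []).forEach((agent, i) => {
    const override = agent.model?.primary;
    if (override !== undefined && override !== defaultPrimary) {
      hits.push(
        `agents.list[${i}].model.primary (${override}) overrides agents.defaults.model.primary (${defaultPrimary})`,
      );
    }
  });
  return hits;
}
```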

Environment

  • OpenClaw v2026.4.27 (cbc2ba0931)
  • macOS Darwin 25.3.0, Apple silicon, 48 GB unified memory
  • mlx_lm.server v0.31.1

extent analysis

TL;DR

To fix the issue, update the configuration to use models.providers.<id>.timeoutSeconds instead of agents.defaults.llm.idleTimeoutSeconds for slow model timeouts.

Guidance

  • Replace the legacy agents.defaults.llm.idleTimeoutSeconds setting with models.providers.<id>.timeoutSeconds on each slow provider to keep the intended timeout.
  • Run openclaw doctor to detect the deprecated key. On v2026.4.27, doctor --fix deletes the legacy block without carrying the value over, so note the number before running it. PR #74940 changes doctor --fix to preserve the numeric value in its output.
  • On v2026.4.27 the runtime emits no warning when agents.defaults.llm is present. PR #74940 adds a one-shot runtime config warning that points to models.providers.<id>.timeoutSeconds.
  • Verify the fix by rerunning a prompt whose prefill exceeds 120s and confirming that it no longer times out at 120s.

Example

// Updated configuration
{
  "agents": {
    "defaults": {
      "model": { "primary": "mlx/<slow-local-30b-model>" }
    }
  },
  "models": {
    "providers": {
      "mlx": {
        "baseUrl": "http://127.0.0.1:8080/v1",
        "api": "openai-completions",
        "models": [{ "id": "<slow-local-30b-model>" }],
        "timeoutSeconds": 180
      }
    }
  }
}

Notes

On v2026.4.27, openclaw doctor --fix deletes the legacy block without preserving the user's value, which leaves slow models on the 120s default with no warning. PR #74940 addresses both gaps: it preserves the value in the doctor --fix output and adds a one-shot runtime warning.

Recommendation

Apply the workaround by updating the configuration to use models.providers.<id>.timeoutSeconds instead of agents.defaults.llm.idleTimeoutSeconds for slow model timeouts. This will ensure that the intended timeout behavior is preserved.
