openclaw - ✅(Solved) Fix [Bug]: The local ollama llm was invalid after 4.26 version [2 pull requests, 2 comments, 3 participants]

GitHub stats
openclaw/openclaw#76117 · Fetched 2026-05-03 04:42:09
Comments: 2 · Participants: 3 · Timeline: 11 · Reactions: 2
Timeline (top): referenced ×4, commented ×2, cross-referenced ×2, labeled ×2

When I updated to v2026.4.29, the local Ollama LLM was ruined. Before v2026.4.26, the local LLM thought slowly, but it called the tools precisely and understood my purpose. Now the local LLM doesn't understand my commands, while the expensive remote LLM does. I asked these LLMs, and they said that code such as selection-CwAy0mf2.js limits the timeout to 120 seconds, for example: const DEFAULT_LLM_IDLE_TIMEOUT_MS = 120 * 1e3; (line 5977). While the local LLM is thinking, it gets cut off and the LLM says nonsense.

So I fixed it, and the local LLM still act like a moron. Too many codes to review.

I hope you can fix this issue, because I don't care about the latency. I can wait for a more precise answer. Thank you.

Root Cause

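Per PR #76181, commit 7559845597 ("fix(ollama): avoid implicit native num_ctx override", shipped in v2026.4.27 through v2026.4.29) dropped the catalog-based num_ctx fallback. Native /api/chat requests then went out without options.num_ctx, Ollama fell back to the Modelfile default context (typically 2048 tokens), and the model lost most of the system prompt and tool definitions. According to that PR, the 120s idle timeout was not the cause of the reported symptom.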

Fix Action

Fixed

PR fix notes

PR #76122: fix(agent-timeout): increase DEFAULT_LLM_IDLE_TIMEOUT_SECONDS from 120 to 300

Description (problem / solution / changelog)

Summary

Increases the default LLM idle timeout from 120s to 300s (5 minutes) for local Ollama models.

Problem

Local Ollama models, especially larger models like qwen3.6:35B, require significantly more think time than cloud APIs. The hardcoded 120-second default was too aggressive and caused local models to be cut off mid-think, producing nonsensical responses.

Issue: #76117

Fix

Changed DEFAULT_LLM_IDLE_TIMEOUT_SECONDS from 120 to 300 in src/config/agent-timeout-defaults.ts.

Additional Notes

  • Users can still override this via agents.defaults.timeoutSeconds in their openclaw.json config
  • This is a non-breaking change — users who explicitly set timeoutSeconds will see no difference
  • The change primarily benefits local model users on slower hardware
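
The default-versus-override behavior described in these notes can be sketched as follows. This is a minimal illustration, not the code from the PR: only the constant name and the 120 to 300 change come from the PR text, while the AgentDefaults shape and the resolveIdleTimeoutMs helper are hypothetical names introduced here.

```typescript
// From the PR: the default was raised from 120 to 300 seconds.
const DEFAULT_LLM_IDLE_TIMEOUT_SECONDS = 300;

// Hypothetical shape mirroring agents.defaults.timeoutSeconds in openclaw.json.
interface AgentDefaults {
  timeoutSeconds?: number;
}

// Hypothetical helper: an explicit, positive user value wins,
// otherwise the built-in default applies.
function resolveIdleTimeoutMs(defaults: AgentDefaults = {}): number {
  const configured = defaults.timeoutSeconds;
  const seconds =
    typeof configured === "number" && Number.isFinite(configured) && configured > 0
      ? configured
      : DEFAULT_LLM_IDLE_TIMEOUT_SECONDS;
  return seconds * 1000;
}
```

Under this sketch, a user who sets timeoutSeconds to 600 gets a 600000 ms timeout both before and after the change, which is why the PR calls it non-breaking.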

Changed files

  • src/config/agent-timeout-defaults.ts (modified, +1/-1)

PR #76181: fix(ollama): restore catalog-driven num_ctx for native /api/chat and skip idle watchdog for local streams

Description (problem / solution / changelog)

Summary

  • Problem: Issue #76117 reports that after upgrading from v2026.4.26 to v2026.4.29, local Ollama runs against models such as qwen3.6:35B:a3b produce broken output: the model "doesn't know my commands", uses wrong tools, and "says nonsense" on agent turns. Critically, the reporter explicitly says: "So I fixed it [the timeout], and the local LLM still act like a moron" and "I don't care about the latency" — proving the symptom is not caused by the 120s LLM idle watchdog (src/agents/pi-embedded-runner/run/llm-idle-timeout.ts). Bumping the timeout does not help; the request never reaches a timeout — the model finishes streaming, but its output is garbage because it lost context.

  • Root Cause: Commit 7559845597 "fix(ollama): avoid implicit native num_ctx override" (2026-04-27, shipped in v2026.4.27 / 4.28 / 4.29) changed resolveOllamaModelOptions in extensions/ollama/src/stream.ts:293 from

    options.num_ctx = resolveOllamaNumCtx(model);  // user-config OR catalog fallback

    to

    const numCtx = resolveOllamaConfiguredNumCtx(model);
    if (numCtx !== undefined) {
      options.num_ctx = numCtx;
    }

    The catalog-based fallback (model.contextWindow ?? model.maxTokens) was lost. For users with a default openclaw.json (no explicit models.providers.ollama.timeoutSeconds and no params.num_ctx), every native /api/chat request now ships without options.num_ctx. Ollama silently falls back to the model's Modelfile default — typically 2048 tokens. A typical OpenClaw agent turn carries a system prompt (~3-5K tokens) plus tool definitions (~3-8K tokens) plus history; the request is silently truncated to the last ~2K, the model loses the tool catalog, and the stream completes with "wrong tools / nonsense" output. This explains every line of the reporter's symptom description and why bumping the timeout did nothing.

    The author's stated intent ("avoid implicit override") was reasonable in spirit — don't second-guess Ollama's Modelfile when the catalog has no opinion — but the implementation also dropped catalog-known windows (qwen3.6 → 32K/128K, llama3 → 128K, gemma3 → 128K) which OpenClaw's catalog already records. Those values are not an implicit override; they are explicit information Ollama has no way to recover.

  • Fix: Add a narrow resolveOllamaNativeNumCtx(model) helper that resolves in priority order: (1) explicit params.num_ctx; (2) catalog contextWindow / maxTokens if present; (3) undefined (let the Modelfile decide for unknown models). Wire it into resolveOllamaModelOptions. This restores 4.26 behavior for all known models without re-introducing the DEFAULT_CONTEXT_TOKENS fallback that the original commit deliberately removed for unknown models — which preserves the "avoid implicit override" intent for genuinely catalog-less models.

    In the same PR, this change is bundled with a related but distinct fix on the LLM idle watchdog (the original scope of this PR): the 120s default watchdog encodes a network-silence-as-hang assumption that does not hold for local providers (loopback / RFC 1918 / RFC 6598 CGNAT / .local). When no explicit timeout is configured, the watchdog is now skipped for local provider URLs. This is a real bug — local sockets do not stall — but it is adjacent to, not the cause of, the user-reported symptom in #76117. We keep the two changes together because the watchdog change had already passed bot review on this branch and removing it would be churn.

  • What changed:

    • extensions/ollama/src/stream.ts — added resolveOllamaNativeNumCtx; resolveOllamaModelOptions now uses it instead of resolveOllamaConfiguredNumCtx. Catalog windows survive the trip to Ollama.
    • extensions/ollama/src/stream-runtime.test.ts — updated four assertions that previously locked in the broken num_ctx === undefined behavior on models that explicitly set contextWindow; added two new assertions: catalog window is forwarded as num_ctx, and an unknown catalog still leaves num_ctx absent (preserving the original commit's intent for that case).
    • src/agents/pi-embedded-runner/run/llm-idle-timeout.ts — provider-aware watchdog: skip the default fallback when model.baseUrl is loopback / RFC 1918 / RFC 6598 / .local and no explicit timeout is configured. Strict IPv4-literal regex guards against numeric-looking hostnames such as http://10.0.0.5evil:11434.
    • src/agents/pi-embedded-runner/run/llm-idle-timeout.test.ts — coverage for local/non-local IPv4 boundaries (RFC 1918 ranges, RFC 6598 CGNAT, 127/8 full range), numeric-hostname injection cases, malformed baseUrl, explicit timeout interaction.
    • src/agents/pi-embedded-runner/run/attempt.ts — passes params.model (typed as { baseUrl?: string }, mirroring the existing requestTimeoutMs cast on the same call) into resolveLlmIdleTimeoutMs.
  • What did NOT change (scope boundary):

    • extensions/ollama/src/stream.ts:createOllamaStreamFn request flow, /api/chat body shape, header construction, SSRF policy — unchanged. Only the value carried in options.num_ctx changes, and only for models that have catalog data.
    • resolveOllamaNumCtx (used by the OpenAI-compatibility wrapper path) — unchanged. The compat path still has its DEFAULT_CONTEXT_TOKENS fallback because that path wraps a request that already shipped without num_ctx; it is the right place to backstop with a constant.
    • streamWithIdleTimeout itself, abort plumbing, hook signatures — unchanged.
    • DEFAULT_LLM_IDLE_TIMEOUT_MS, DEFAULT_LLM_IDLE_TIMEOUT_SECONDS, DEFAULT_CONTEXT_TOKENS — same values, same semantics, same defaults for cloud/wrapper paths.
    • Provider plugins, models.json schema, config schema, docs (docs/providers/ollama.md), CHANGELOG.md — unchanged. (CHANGELOG is intentionally left for the maintainer to slot under the right release on merge.)
    • Cron trigger, explicit runTimeoutMs, explicit agents.defaults.timeoutSeconds, explicit models.providers.<id>.timeoutSeconds paths — bit-for-bit identical.
    • Remote provider behavior (any non-local baseUrl) — bit-for-bit identical.
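
The three-step priority order described under "Fix" might look roughly like the sketch below. The helper name and its (model) => number | undefined signature come from the PR description; the simplified ProviderRuntimeModel shape, the placement of params on the model object, and the isPositiveInt guard are assumptions made for illustration, since the real types are not shown on this page.

```typescript
// Assumed, simplified model shape; the real ProviderRuntimeModel is not shown in the PR text.
interface ProviderRuntimeModel {
  params?: { num_ctx?: number }; // explicit user configuration (assumed location)
  contextWindow?: number;        // catalog-known context window
  maxTokens?: number;            // catalog fallback when contextWindow is absent
}

// Illustrative guard, not taken from the PR.
function isPositiveInt(value: unknown): value is number {
  return typeof value === "number" && Number.isInteger(value) && value > 0;
}

function resolveOllamaNativeNumCtx(model: ProviderRuntimeModel): number | undefined {
  // (1) An explicit params.num_ctx always wins.
  const explicit = model.params?.num_ctx;
  if (isPositiveInt(explicit)) return explicit;
  // (2) Otherwise use the catalog value, if the catalog has one.
  const catalog = model.contextWindow ?? model.maxTokens;
  if (isPositiveInt(catalog)) return catalog;
  // (3) Unknown model: return undefined so the Modelfile decides.
  return undefined;
}

function resolveOllamaModelOptions(model: ProviderRuntimeModel): { num_ctx?: number } {
  const options: { num_ctx?: number } = {};
  const numCtx = resolveOllamaNativeNumCtx(model);
  if (numCtx !== undefined) options.num_ctx = numCtx;
  return options;
}
```

The key design point is step (3): no constant such as DEFAULT_CONTEXT_TOKENS is substituted for catalog-less models, so the "avoid implicit override" intent of the original commit is kept for that case.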

Reproduction

  1. Install OpenClaw v2026.4.29 (or any version since 7559845597, 2026-04-27) with a default openclaw.json — no models.providers.ollama.timeoutSeconds, no models.providers.ollama.params.num_ctx.
  2. Run an Ollama daemon on http://127.0.0.1:11434:
    ollama pull qwen3.6:35B
  3. Issue any agent-style request that requires tool selection:
    openclaw infer model run \
      --model ollama/qwen3.6:35B \
      --prompt "Plan three sequential bash commands and call them via tools."
  4. Before this PR: The model loses the tool catalog (truncated to ~2048 tokens), invents tool names that do not exist, picks the wrong tool, or returns plain prose where structured tool calls were required. Bumping agents.defaults.timeoutSeconds does not help — confirmed by the issue reporter: "I fixed it, and the local LLM still act like a moron."
  5. After this PR: The full system prompt and tool definitions reach the model because num_ctx = 131072 (or whatever the catalog records for that model) flows through to Ollama. Tool selection and answers behave as they did on v2026.4.26.
  6. Regression check (must keep working): a remote provider such as https://api.openai.com/v1 still receives the existing 120s idle watchdog; an Ollama model with no catalog contextWindow still ships without num_ctx so the Modelfile decides — preserving the 7559845597 author's "avoid implicit override" intent for that case.
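
To make the before/after difference in steps 4 and 5 concrete, here is a rough sketch of the request body sent to Ollama's native /api/chat endpoint with and without options.num_ctx. The buildChatBody helper and its types are hypothetical; the model, messages, stream, and options.num_ctx fields are part of Ollama's chat API.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

interface OllamaChatBody {
  model: string;
  messages: ChatMessage[];
  stream: boolean;
  options?: { num_ctx?: number };
}

// Hypothetical helper: only attach options.num_ctx when a value was resolved.
function buildChatBody(model: string, messages: ChatMessage[], numCtx?: number): OllamaChatBody {
  const body: OllamaChatBody = { model, messages, stream: true };
  if (numCtx !== undefined) body.options = { num_ctx: numCtx };
  return body;
}

const messages: ChatMessage[] = [
  { role: "user", content: "Plan three sequential bash commands and call them via tools." },
];

// Regressed behavior: no options.num_ctx, so Ollama uses the Modelfile default.
const regressed = buildChatBody("qwen3.6:35B", messages);

// Restored behavior: the catalog window travels with the request.
const restored = buildChatBody("qwen3.6:35B", messages, 131072);
```

Comparing JSON.stringify(regressed) with JSON.stringify(restored) shows that the only difference on the wire is the options object, which matches the PR's claim that the /api/chat body shape is otherwise unchanged.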

Risk / Mitigation

  • Risk: Forwarding catalog contextWindow as num_ctx will cause Ollama to allocate the corresponding KV cache. On memory-constrained hosts, a model whose catalog says 131072 will use noticeably more RAM/VRAM than the Modelfile's 2048 default. This matches v2026.4.26 behavior exactly, but represents a change vs. current main for users who upgraded between 4.27 and now and silently adapted to the truncated context.
  • Risk: A user who manually created a custom Ollama provider entry without populating contextWindow will get the new "no num_ctx" behavior — the same as today, no regression.
  • Risk: The watchdog change disables a guard for local providers that was incorrectly applied to begin with. A genuinely hung local Ollama daemon will no longer self-abort at 120s; agent / run / explicit provider timeouts still bound the request.
  • Mitigation:
    • Users who want the post-7559845597 "trust Modelfile" behavior can opt out by removing contextWindow from their custom catalog entries, or by setting models.providers.<provider>.params.num_ctx to the Modelfile value explicitly.
    • The new resolveOllamaNativeNumCtx helper is internal; the public resolveOllamaNumCtx (used by the compat wrapper) is unchanged so the wrapper-side fallback semantics are untouched.
    • Coverage: existing extensions/ollama/src/stream-runtime.test.ts tests for native /api/chat are updated; two new tests assert the catalog-fallback and the no-catalog-no-num_ctx contracts. Coverage on the watchdog side asserts both local and non-local ranges, numeric-hostname injection, and explicit-timeout interaction.
    • No use of any. The new helper signature is (model: ProviderRuntimeModel) => number | undefined. The watchdog parameter is structurally typed as { baseUrl?: string }.
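
The provider-aware watchdog rule (skip the default idle timeout for local base URLs unless a timeout is explicitly configured) could be approximated as below. This is a sketch under assumptions: isLocalBaseUrl and sketchIdleTimeoutMs are hypothetical names, treating "localhost" as local is an assumption, IPv6 loopback is left out for brevity, and returning undefined to mean "no watchdog" is an illustrative convention rather than the PR's actual return contract.

```typescript
// Value quoted in the issue; used here as the fallback for non-local providers.
const DEFAULT_LLM_IDLE_TIMEOUT_MS = 120 * 1e3;

// Strict dotted-quad match, so "10.0.0.5evil" is not treated as an IP literal.
const IPV4_LITERAL = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/;

function isLocalBaseUrl(baseUrl: string | undefined): boolean {
  if (!baseUrl) return false;
  let host: string;
  try {
    host = new URL(baseUrl).hostname.toLowerCase();
  } catch {
    return false; // malformed baseUrl is never considered local
  }
  if (host === "localhost" || host.endsWith(".local")) return true;
  const match = IPV4_LITERAL.exec(host);
  if (!match) return false;
  const a = Number(match[1]);
  const b = Number(match[2]);
  const c = Number(match[3]);
  const d = Number(match[4]);
  if (a > 255 || b > 255 || c > 255 || d > 255) return false;
  return (
    a === 127 ||                          // loopback, 127.0.0.0/8
    a === 10 ||                           // RFC 1918, 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) ||  // RFC 1918, 172.16.0.0/12
    (a === 192 && b === 168) ||           // RFC 1918, 192.168.0.0/16
    (a === 100 && b >= 64 && b <= 127)    // RFC 6598 CGNAT, 100.64.0.0/10
  );
}

// undefined means "no idle watchdog" in this sketch.
function sketchIdleTimeoutMs(
  model: { baseUrl?: string },
  explicitTimeoutMs?: number,
): number | undefined {
  if (explicitTimeoutMs !== undefined) return explicitTimeoutMs;
  if (isLocalBaseUrl(model.baseUrl)) return undefined;
  return DEFAULT_LLM_IDLE_TIMEOUT_MS;
}
```

Parsing the hostname first and only then applying the strict IPv4 pattern is what keeps a hostname such as 10.0.0.5evil from being classified as a private address.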

Why this is the root cause and not a symptom patch

The reporter's exact words ("I fixed it [the timeout], and the local LLM still act like a moron") are the load-bearing evidence. A fix that targets the 120s watchdog leaves the reporter's symptom intact, because the request was never being aborted; it was being truncated before it even left OpenClaw. The smallest change that restores 4.26-equivalent behavior for the reporter's repro is to put num_ctx back into the request body. The PR does exactly that, and bundles the orthogonal watchdog cleanup that was already accepted on this branch.
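
The truncation argument can be checked with simple arithmetic. The sketch below uses the low end of the token estimates quoted in the root-cause description (about 3K for the system prompt and 3K for tool definitions); the PromptBudget type and droppedTokens helper are hypothetical, and real token counts depend on the tokenizer and the actual prompt.

```typescript
interface PromptBudget {
  systemPromptTokens: number;
  toolDefinitionTokens: number;
  historyTokens: number;
}

// Number of prompt tokens that cannot fit in a context window of numCtx tokens.
function droppedTokens(budget: PromptBudget, numCtx: number): number {
  const total =
    budget.systemPromptTokens + budget.toolDefinitionTokens + budget.historyTokens;
  return Math.max(0, total - numCtx);
}

// Low-end estimate from the PR description, with no conversation history yet.
const firstTurn: PromptBudget = {
  systemPromptTokens: 3000,
  toolDefinitionTokens: 3000,
  historyTokens: 0,
};

const droppedAtModelfileDefault = droppedTokens(firstTurn, 2048);  // 3952
const droppedAtCatalogWindow = droppedTokens(firstTurn, 131072);   // 0
```

Even on the very first turn, roughly two thirds of a 6000-token prompt would not fit in a 2048-token window, which is consistent with the model losing the tool catalog regardless of how long it is allowed to think.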

Change Type (select all)

  • Bug fix

Scope (select all touched areas)

  • Ollama provider extension
  • Agents / pi-embedded-runner

Changed files

  • extensions/ollama/src/stream-runtime.test.ts (modified, +60/-4)
  • extensions/ollama/src/stream.ts (modified, +29/-1)
  • src/agents/pi-embedded-runner/run/attempt.ts (modified, +1/-0)
  • src/agents/pi-embedded-runner/run/llm-idle-timeout.test.ts (modified, +102/-0)
  • src/agents/pi-embedded-runner/run/llm-idle-timeout.ts (modified, +87/-0)
Raw issue body

Bug type

Regression (worked before, now fails)

Beta release blocker

No

Steps to reproduce

  1. Update OpenClaw to v2026.4.29 without changing openclaw.json.
  2. Send a prompt to ollama/qwen3.6:latest.

Expected behavior

The model uses the correct tools and gives a good answer.

Actual behavior

The model says nonsense.

OpenClaw version

2026.4.29

Operating system

Ubuntu 24

Install method

No response

Model

ollama/qwen3.6:35B:a3b

Provider / routing chain

gateway webchat

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Impact and severity

No response

Additional information

No response

extent analysis

TL;DR

Increase the DEFAULT_LLM_IDLE_TIMEOUT_MS value to allow the local LLM more time to respond.

Guidance

  • Review the selection-CwAy0mf2.js file and update the DEFAULT_LLM_IDLE_TIMEOUT_MS constant to a higher value, such as 300 seconds or more, to give the local LLM sufficient time to process commands.
  • Verify that the local LLM is working correctly after updating the timeout value by testing it with various commands and checking the responses.
  • If increasing the timeout value does not resolve the issue, investigate other potential causes, such as changes in the openclaw.json file or updates to the ollama/qwen3.6 model.
  • Consider testing the local LLM with a different model or version to isolate the issue.

Example

const DEFAULT_LLM_IDLE_TIMEOUT_MS = 300 * 1e3; // updated timeout value

Notes

The issue may be specific to the ollama/qwen3.6:35B:a3b model or the v2026.4.29 version of OpenClaw, and further investigation may be needed to determine the root cause.

Recommendation

Apply workaround: increase the DEFAULT_LLM_IDLE_TIMEOUT_MS value to allow the local LLM more time to respond, as this is a simple and non-invasive change that may resolve the issue.
