openclaw - 💡(How to fix) Fix [Bug]: Ollama `num_ctx` regression [2 pull requests]

After upgrading from OpenClaw v2026.4.x to v2026.5.10+, Ollama models ignore the configured contextWindow and silently fall back to the context size from Ollama's Modelfile (typically 4096 tokens). This breaks agents whose prompt, tool definitions, and history together exceed 4096 tokens. Commit a8c745a623 removed the fallback to contextWindow without a migration path, a CHANGELOG note, or backward compatibility.

Error Message

time=2026-05-12T09:57:14.465Z level=WARN source=runner.go:187 msg="truncating input prompt" limit=4096 prompt=10974 keep=4 new=4096
time=2026-05-12T09:57:24.186Z level=DEBUG source=cache.go:301 msg="kv cache removal unsupported, clearing cache and returning inputs for reprocessing" id=0 error="model does not support operation"

Root Cause

Commit a8c745a623 ("fix: stop forcing native ollama num_ctx") reduced resolveOllamaNativeNumCtx() to honoring only params.num_ctx. The earlier fallback to contextWindow, then maxTokens, is gone, so a model configured with contextWindow alone ends up running at Ollama's Modelfile default (typically 4096 tokens). The change shipped without a migration path, a CHANGELOG note, or backward compatibility.

Fix Action

Fixed


Bug type

Regression (worked before, now fails)

Beta release blocker

No

Summary

After upgrading from OpenClaw v2026.4.x to v2026.5.10+, Ollama models ignore the configured contextWindow and silently fall back to the context size from Ollama's Modelfile (typically 4096 tokens). This breaks agents whose prompt, tool definitions, and history together exceed 4096 tokens. Commit a8c745a623 removed the fallback to contextWindow without a migration path, a CHANGELOG note, or backward compatibility.

Steps to reproduce

  1. Configure an Ollama model in openclaw.json with "contextWindow": 81920 and "maxTokens": 8192, without params.num_ctx.
  2. Start the gateway on v2026.5.10+ (confirmed on v2026.5.10-beta.3 and v2026.5.12-beta.6).
  3. Send a message to the agent.
  4. Observe in gateway logs that the native /api/chat request carries "options": { "num_ctx": 4096 } instead of the configured 81920.

Expected behavior

The model receives num_ctx: 81920, matching the configured contextWindow. This was the behavior in v2026.4.x and v2026.5.9 (commit c1b9af2770, PR #76181).

Actual behavior

The model receives num_ctx: 4096 (Ollama Modelfile default). Agent conversation context (system prompt + tool definitions + message history) exceeds 4096 tokens and is truncated, causing wrong tool selection and incorrect responses.

OpenClaw version

v2026.5.10-beta.3 through v2026.5.12-beta.8

Operating system

Linux 6.8.0-111-generic (Ubuntu 24.04)

Install method

git clone + pnpm

Model

ollama/qwen3.6-35b-a3b-UD-Q4_K_M-unsloth:latest

Provider / routing chain

openclaw -> ollama (native /api/chat on http://127.0.0.1:11434)

Additional provider/model setup details

The breaking change is in commit a8c745a623 ("fix: stop forcing native ollama num_ctx") on 2026-05-10. The resolveOllamaNativeNumCtx() function was simplified from a multi-step resolver (params.num_ctx → contextWindow → maxTokens → undefined) to only honoring params.num_ctx:

// BEFORE (commit c1b9af2770):
function resolveOllamaNativeNumCtx(model) {
  const configured = resolveOllamaConfiguredNumCtx(model);
  if (configured !== undefined) return configured;
  const catalog = model.contextWindow ?? model.maxTokens;
  if (typeof catalog === "number" && Number.isFinite(catalog) && catalog > 0) return Math.floor(catalog);
  return undefined;
}

// AFTER (commit a8c745a623):
function resolveOllamaNativeNumCtx(model) {
  return resolveOllamaConfiguredNumCtx(model);
}

No CHANGELOG section documents this as a breaking change, and openclaw doctor --fix does not migrate contextWindow to params.num_ctx.

Last known good commit: c1b9af2770 (2026-05-03). First known bad commit: a8c745a623 (2026-05-10).
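
To make the difference between the two resolvers concrete, here is a minimal runnable sketch that evaluates both against the configuration from the reproduction steps. The issue does not show resolveOllamaConfiguredNumCtx(), so resolveConfiguredNumCtx() below is a hypothetical stand-in that assumes the explicit setting lives at model.params.num_ctx. The names resolveNumCtxBefore and resolveNumCtxAfter are also illustrative, not OpenClaw identifiers.

```javascript
// Hypothetical stand-in: the real resolveOllamaConfiguredNumCtx() is not
// shown in the issue. Assumes the explicit value sits at model.params.num_ctx.
function resolveConfiguredNumCtx(model) {
  const value = model.params?.num_ctx;
  return typeof value === "number" && Number.isFinite(value) && value > 0
    ? Math.floor(value)
    : undefined;
}

// Logic as quoted for commit c1b9af2770: explicit value first,
// then contextWindow, then maxTokens.
function resolveNumCtxBefore(model) {
  const configured = resolveConfiguredNumCtx(model);
  if (configured !== undefined) return configured;
  const catalog = model.contextWindow ?? model.maxTokens;
  if (typeof catalog === "number" && Number.isFinite(catalog) && catalog > 0) {
    return Math.floor(catalog);
  }
  return undefined;
}

// Logic as quoted for commit a8c745a623: only the explicit value counts.
function resolveNumCtxAfter(model) {
  return resolveConfiguredNumCtx(model);
}

// Model entry from the reproduction steps, without and with params.num_ctx.
const reported = { contextWindow: 81920, maxTokens: 8192 };
const withParam = { ...reported, params: { num_ctx: 81920 } };

console.log(resolveNumCtxBefore(reported)); // 81920
console.log(resolveNumCtxAfter(reported));  // undefined
console.log(resolveNumCtxAfter(withParam)); // 81920
```

Under these assumptions, the new resolver yields no value for a model that sets only contextWindow, which is consistent with the 4096-token limit seen in the logs.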

Logs, screenshots, and evidence

time=2026-05-12T09:57:14.021Z level=DEBUG source=server.go:1438 msg="model load progress 1.00"
time=2026-05-12T09:57:14.249Z level=DEBUG source=ggml.go:326 msg="key with type not found" key=qwen35moe.pooling_type default=0
time=2026-05-12T09:57:14.272Z level=INFO source=server.go:1432 msg="llama runner started in 43.01 seconds"
time=2026-05-12T09:57:14.272Z level=DEBUG source=sched.go:573 msg="finished setting up" runner.name=registry.ollama.ai/library/qwen3.6-35b-a3b-UD-Q4_K_M-unsloth:latest runner.inference="[{ID:GPU-8416b7d6-2f5f-d827-5992-4affda11c96a Library:CUDA} {ID:GPU-a4c65198-ad5c-f3c2-d8a3-2829beb71c35 Library:CUDA}]" runner.size="22.7 GiB" runner.vram="15.9 GiB" runner.parallel=1 runner.pid=109 runner.model=/root/.ollama/models/blobs/sha256-ac0e2c1189e055faa36eff361580e79c5bd6f8e76bffb4ce547f167d53e31a61 runner.num_ctx=4096
time=2026-05-12T09:57:14.373Z level=DEBUG source=prompt.go:75 msg="truncating input messages which exceed context length" truncated=1
time=2026-05-12T09:57:14.375Z level=DEBUG source=server.go:1580 msg="completion request" images=0 prompt=43512 format=""
time=2026-05-12T09:57:14.465Z level=WARN source=runner.go:187 msg="truncating input prompt" limit=4096 prompt=10974 keep=4 new=4096
time=2026-05-12T09:57:14.465Z level=DEBUG source=cache.go:151 msg="loading cache slot" id=0 cache=0 prompt=4096 used=0 remaining=4096
time=2026-05-12T09:57:24.186Z level=DEBUG source=cache.go:295 msg="context limit hit - shifting" id=0 limit=4096 input=4096 keep=4 discard=2046
time=2026-05-12T09:57:24.186Z level=DEBUG source=cache.go:301 msg="kv cache removal unsupported, clearing cache and returning inputs for reprocessing" id=0 error="model does not support operation"
[GIN] 2026/05/12 - 09:57:41 | 200 |         1m11s |      172.18.0.1 | POST     "/api/chat"
time=2026-05-12T09:57:41.626Z level=DEBUG source=sched.go:581 msg="context for request finished"
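
For anyone checking their own Ollama logs for the same symptom, the following sketch scans log text for the "truncating input prompt" warning and extracts the limit and prompt fields. It assumes the key=value line format shown above. parseLogLine and findTruncations are illustrative helpers, not part of Ollama or OpenClaw.

```javascript
// Parse one key=value log line into an object.
// Handles bare values (limit=4096) and double-quoted values (msg="...").
function parseLogLine(line) {
  const fields = {};
  const pattern = /(\w[\w.]*)=("(?:[^"\\]|\\.)*"|\S*)/g;
  for (const match of line.matchAll(pattern)) {
    const raw = match[2];
    const quoted = raw.length >= 2 && raw.startsWith('"') && raw.endsWith('"');
    fields[match[1]] = quoted ? raw.slice(1, -1) : raw;
  }
  return fields;
}

// Return every truncation warning with its numeric limit and prompt size.
function findTruncations(logText) {
  return logText
    .split("\n")
    .map(parseLogLine)
    .filter((f) => f.msg === "truncating input prompt")
    .map((f) => ({ time: f.time, limit: Number(f.limit), prompt: Number(f.prompt) }));
}

// Two lines copied from the log excerpt above.
const sample = [
  'time=2026-05-12T09:57:14.375Z level=DEBUG source=server.go:1580 msg="completion request" images=0 prompt=43512 format=""',
  'time=2026-05-12T09:57:14.465Z level=WARN source=runner.go:187 msg="truncating input prompt" limit=4096 prompt=10974 keep=4 new=4096',
].join("\n");

console.log(findTruncations(sample));
```

A hit whose limit is 4096 while the configured contextWindow is much larger is a strong hint that this regression applies.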

Impact and severity

  • Affected users: All OpenClaw users running Ollama with contextWindow set but no params.num_ctx, upgrading from v2026.4.x to v2026.5.10+
  • Severity: High — agents are non-functional due to truncated context, broken tool selection, incorrect responses
  • Frequency: Always (deterministic on every agent response)
  • Consequence: Agent functionality completely broken until users manually add params.num_ctx to every model config entry
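
The manual workaround described above (adding params.num_ctx to every model entry) can be sketched as a small transformation. This is only an illustration: the array-of-entries shape and the id field are assumptions, since the issue does not show the full openclaw.json layout, and addExplicitNumCtx is a hypothetical helper rather than an OpenClaw command.

```javascript
// Copy contextWindow into params.num_ctx wherever no explicit value exists.
// Hypothetical shape: an array of model entries; the real openclaw.json
// layout may differ. The input array and its entries are not mutated.
function addExplicitNumCtx(models) {
  return models.map((model) => {
    const hasExplicit = typeof model.params?.num_ctx === "number";
    const size = model.contextWindow;
    const usable = typeof size === "number" && Number.isFinite(size) && size > 0;
    if (hasExplicit || !usable) return model; // nothing to migrate
    return { ...model, params: { ...model.params, num_ctx: Math.floor(size) } };
  });
}

const models = [
  { id: "qwen3.6-35b-a3b-UD-Q4_K_M-unsloth:latest", contextWindow: 81920, maxTokens: 8192 },
  { id: "already-set", contextWindow: 32768, params: { num_ctx: 16384 } },
  { id: "no-window" },
];

const migrated = addExplicitNumCtx(models);
console.log(JSON.stringify(migrated, null, 2));
```

Entries that already set params.num_ctx are left alone, so an explicit choice is never overwritten by the larger contextWindow.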

Additional information

  • Affected versions: v2026.5.10-beta.3 through beta.6, v2026.5.12-beta.1 through beta.6
  • Last known good: v2026.4.29 / commit c1b9af2770
  • First known bad: v2026.5.10-beta.3 / commit a8c745a623
  • No open issues or PRs address this regression as of 2026-05-14
  • The CHANGELOG contains no "Breaking changes" section documenting this config change
  • Ollama documentation and GitHub Issue ollama/ollama#8903 confirm that changing num_ctx between requests causes full model reloads (re-allocating the KV cache), making the "avoid pathological latency" justification counterproductive
  • Ollama regression pattern: 182 Ollama-related commits in the past year, 109 labeled "fix", suggesting repeated breakage without adequate local testing
