openclaw - ✅ (Solved) Fix llamacpp provider: session table shows n_ctx_train (262k) instead of actual runtime context (n_ctx) [1 pull request, 2 comments, 3 participants]

GitHub stats
openclaw/openclaw#73664 · Fetched 2026-04-29 06:16:43
Comments: 2 · Participants: 3 · Timeline events: 3 · Reactions: 0
Timeline (top): commented ×2, cross-referenced ×1

When a llamacpp provider model is active, the Sessions table displays the model's training context window (n_ctx_train) as the denominator rather than the actual runtime context window (n_ctx). This makes token usage percentages misleading.

Error Message

Sessions table shows:

agent:main:main  |  qwen3.6-mxfp4-moe  |  56k/262k (21%)

Root Cause

The /v1/models OpenAI-compatible endpoint only exposes n_ctx_train in the meta field:

"meta": {
  "n_ctx_train": 262144,
  ...
}

It does not include n_ctx (the actual loaded context). However, llama.cpp's /props endpoint does expose the correct value:

"n_ctx": 65536

Additionally, the agent's models.json file has contextWindow: 65536 set correctly, but this value appears to be overridden by the live server query result.
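
To make the difference between the two payloads concrete, here is a minimal TypeScript sketch that reads each value. The interface and function names are hypothetical and are not OpenClaw's code. The /v1/models response is assumed to use the usual OpenAI list layout (a `data` array whose entries carry `meta`). The top-level `n_ctx` on /props follows the excerpt above. The fallback to `default_generation_settings.n_ctx` is an assumption for builds that nest the field.

```typescript
// Hypothetical shapes: only the fields discussed in the issue are modelled.
interface ModelsResponse {
  data?: Array<{ id?: string; meta?: { n_ctx_train?: number } }>;
}

interface PropsResponse {
  n_ctx?: number; // top-level field, as quoted in the issue
  default_generation_settings?: { n_ctx?: number }; // assumed fallback location
}

// Accept only positive integers as context sizes.
function asContextSize(value: unknown): number | undefined {
  return typeof value === "number" && Number.isInteger(value) && value > 0
    ? value
    : undefined;
}

// Training-time ceiling reported by /v1/models (n_ctx_train).
function trainingContext(models: ModelsResponse, modelId: string): number | undefined {
  const entry = models.data?.find((m) => m.id === modelId);
  return asContextSize(entry?.meta?.n_ctx_train);
}

// Context actually loaded by the server, reported by /props (n_ctx).
function runtimeContext(props: PropsResponse): number | undefined {
  return (
    asContextSize(props.n_ctx) ??
    asContextSize(props.default_generation_settings?.n_ctx)
  );
}

// Values taken from the issue.
const models: ModelsResponse = {
  data: [{ id: "qwen3.6-mxfp4-moe", meta: { n_ctx_train: 262144 } }],
};
const props: PropsResponse = { n_ctx: 65536 };

const trained = trainingContext(models, "qwen3.6-mxfp4-moe"); // 262144
const loaded = runtimeContext(props); // 65536
```

Returning `undefined` for missing or malformed values lets a caller fall back to another source instead of budgeting against a bogus number.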

PR fix notes

PR #74057: [AI-assisted] fix(providers): use llama.cpp runtime context cap

Description (problem / solution / changelog)

Summary

  • Adds opportunistic llama.cpp /props discovery for self-hosted OpenAI-compatible providers.
  • Strips a trailing /v1 before calling /props, matching llama.cpp's OpenAI-compatible layout.
  • Uses /props.n_ctx as contextTokens so runtime/session budgeting reflects the loaded context window.
  • Preserves explicit configured contextWindow values ahead of live /props discovery.
  • Keeps n_ctx_train from /v1/models as native contextWindow metadata when present.

Closes #73664.
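
The behaviour listed in the summary can be sketched roughly as follows. This is an illustration only, not the code from PR #74057: `propsUrl`, `ContextSources`, and `resolveContextTokens` are hypothetical names, and using `n_ctx_train` as a last-resort budget is an assumption of this sketch (the PR summary only says it is kept as native contextWindow metadata).

```typescript
// Hypothetical helper: derive the /props URL from an OpenAI-compatible base URL
// by dropping a trailing "/v1" (with or without a trailing slash).
function propsUrl(baseUrl: string): string {
  const trimmed = baseUrl.replace(/\/+$/, "");
  const root = trimmed.replace(/\/v1$/, "");
  return `${root}/props`;
}

interface ContextSources {
  configuredContextWindow?: number; // explicit user config, e.g. models.json
  propsNCtx?: number; // n_ctx from /props, if discovery succeeded
  nCtxTrain?: number; // n_ctx_train from /v1/models
}

// Precedence from the PR summary: explicit configuration first, then the live
// runtime value. The final fallback to n_ctx_train is an assumption here.
function resolveContextTokens(src: ContextSources): number | undefined {
  return src.configuredContextWindow ?? src.propsNCtx ?? src.nCtxTrain;
}

// With the values from the issue and no explicit config, the runtime cap wins.
const budget = resolveContextTokens({ propsNCtx: 65536, nCtxTrain: 262144 }); // 65536
```

Keeping the configured value first means a user can deliberately budget below what the server reports, while an unconfigured setup still picks up the loaded context rather than the training ceiling.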

Testing

  • node scripts/test-projects.mjs src/plugins/provider-self-hosted-setup.test.ts
  • git diff --check
  • corepack pnpm tsgo:core
  • corepack pnpm tsgo:core:test

AI Assistance

AI-assisted with Codex. I understand the change: it keeps user-configured caps authoritative while allowing llama.cpp-compatible servers to report their actual loaded runtime context through /props.n_ctx.

Changed files

  • src/plugins/provider-self-hosted-setup.test.ts (modified, +96/-0)
  • src/plugins/provider-self-hosted-setup.ts (modified, +88/-15)

Issue report (raw)

Environment

  • OpenClaw version: 2026.4.25
  • Provider: llamacpp (OpenAI-compatible server via llama.cpp)
  • Model: qwen3.6-mxfp4-moe
  • llama.cpp build: b8951-665abc609

Observed behavior

Sessions table shows:

agent:main:main  |  qwen3.6-mxfp4-moe  |  56k/262k (21%)

The denominator 262k is n_ctx_train — the model's trained context ceiling — not the server's actual loaded context.

Expected behavior

The denominator should reflect the actual runtime context window the server was started with (--ctx-size 65536), showing 56k/65k and a correct usage percentage (~86%).
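
The effect of the denominator on the displayed percentage can be checked with a short calculation. The sketch below is illustrative: `formatK` and `usageLabel` are hypothetical helpers, truncating to whole thousands is an assumption chosen to match the labels in the issue (56k, 262k, 65k), and the 56,000 token count is hypothetical because the issue only shows the rounded "56k".

```typescript
// Truncate to whole thousands, matching the labels shown in the issue.
function formatK(tokens: number): string {
  return `${Math.floor(tokens / 1000)}k`;
}

// Build a "used/total (percent%)" label like the Sessions table column.
function usageLabel(used: number, contextTokens: number): string {
  const percent = Math.round((used / contextTokens) * 100);
  return `${formatK(used)}/${formatK(contextTokens)} (${percent}%)`;
}

const used = 56000; // hypothetical exact count behind the displayed "56k"

const againstTraining = usageLabel(used, 262144); // "56k/262k (21%)"
const againstRuntime = usageLabel(used, 65536); // "56k/65k (85%)"
```

With exactly 56,000 tokens the runtime figure comes out at 85%; the issue's estimate of about 86% is consistent with a slightly higher unrounded count. Either way, the session is roughly four times closer to its limit than the 21% label suggests.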

Suggested fix

For llamacpp provider types, OpenClaw should either:

  1. Prefer models.json — trust the contextWindow value configured in the agent's models.json over the live server metadata, or
  2. Query /props — for llama.cpp servers, fetch n_ctx from the /props endpoint instead of (or in addition to) n_ctx_train from /v1/models

Option 1 is simpler and consistent with how other providers work. Option 2 is more accurate but requires provider-specific logic.

Extended analysis

TL;DR

Prefer the contextWindow value from models.json or query the /props endpoint for n_ctx to accurately display the runtime context window.

Guidance

  • Verify that the contextWindow value in models.json matches the expected runtime context window (--ctx-size 65536).
  • For llamacpp providers, consider querying the /props endpoint to fetch the actual n_ctx value.
  • Update the Sessions table to use the preferred n_ctx value instead of n_ctx_train from the /v1/models endpoint.
  • Test the change to ensure the token usage percentages are accurately displayed.

Example

The issue itself does not include a patch. The fix is a change to which data source supplies the context window shown in the Sessions table, and it was implemented in PR #74057 (src/plugins/provider-self-hosted-setup.ts).

Notes

The suggested fix assumes that the models.json file is correctly configured and up-to-date. If the file is not reliable, querying the /props endpoint may be a more accurate solution.

Recommendation

Apply workaround: prefer the contextWindow value from models.json, which is simple and consistent with other providers. The trade-off is that an outdated models.json will not reflect the actual runtime context window. Per its summary, PR #74057 keeps an explicitly configured contextWindow authoritative and otherwise uses /props.n_ctx.
