openclaw - ✅(Solved) Fix llamacpp provider: session table shows n_ctx_train (262k) instead of actual runtime context (n_ctx) [1 pull requests, 2 comments, 3 participants]

openclaw · 2026-04-28 16:18:30


openclaw/openclaw#73664 • Fetched 2026-04-29 06:16:43


When a llamacpp provider model is active, the Sessions table displays the model's training context window (n_ctx_train) as the denominator rather than the actual runtime context window (n_ctx). This makes token usage percentages misleading.

Error Message

Sessions table shows:

agent:main:main  |  qwen3.6-mxfp4-moe  |  56k/262k (21%)

Root Cause

The /v1/models OpenAI-compatible endpoint only exposes n_ctx_train in the meta field:

"meta": {
  "n_ctx_train": 262144,
  ...
}

It does not include n_ctx (the actual loaded context). However, llama.cpp's /props endpoint does expose the correct value:

"n_ctx": 65536

Additionally, the agent's models.json file has contextWindow: 65536 set correctly, but this value appears to be overridden by the live server query result.
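To make the difference between the two endpoints concrete, here is a minimal sketch of reading each value from an already-parsed JSON body. The helper names are hypothetical, and only `meta.n_ctx_train` and `n_ctx` come from the issue; the fallback lookup under `default_generation_settings` is an assumption about how some llama.cpp builds lay out `/props`.

```typescript
type JsonObject = Record<string, unknown>;

function isObject(value: unknown): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Accept only positive integers as a context size.
function positiveInt(value: unknown): number | undefined {
  return typeof value === "number" && Number.isInteger(value) && value > 0
    ? value
    : undefined;
}

// Reads meta.n_ctx_train (the training ceiling) from one model entry
// of a /v1/models response.
function readTrainContext(modelEntry: unknown): number | undefined {
  if (!isObject(modelEntry)) return undefined;
  const meta = modelEntry.meta;
  if (!isObject(meta)) return undefined;
  return positiveInt(meta.n_ctx_train);
}

// Reads n_ctx (the loaded runtime context) from a /props response body.
function readRuntimeContext(props: unknown): number | undefined {
  if (!isObject(props)) return undefined;
  const topLevel = positiveInt(props.n_ctx);
  if (topLevel !== undefined) return topLevel;
  // Assumption: the value may be nested under default_generation_settings.
  const nested = props.default_generation_settings;
  return isObject(nested) ? positiveInt(nested.n_ctx) : undefined;
}
```

With the values from this issue, `readTrainContext` yields 262144 and `readRuntimeContext` yields 65536, so the two must not be treated as interchangeable.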

PR fix notes

PR #74057: [AI-assisted] fix(providers): use llama.cpp runtime context cap

Repository: openclaw/openclaw
Author: brokemac79
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/74057

Description (problem / solution / changelog)

Summary

Adds opportunistic llama.cpp /props discovery for self-hosted OpenAI-compatible providers.
Strips a trailing /v1 before calling /props, matching llama.cpp's OpenAI-compatible layout.
Uses /props.n_ctx as contextTokens so runtime/session budgeting reflects the loaded context window.
Preserves explicit configured contextWindow values ahead of live /props discovery.
Keeps n_ctx_train from /v1/models as native contextWindow metadata when present.
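The `/v1` stripping and the opportunistic lookup described above might look roughly like this. It is a sketch, not the code from PR #74057: the function names, the `FetchLike` type, and the example host and port are all hypothetical, and the lookup assumes `n_ctx` sits at the top level of the `/props` body as shown in the issue.

```typescript
// Derives the /props URL from an OpenAI-compatible base URL by dropping
// trailing slashes and a trailing /v1 segment.
function propsUrlFromBaseUrl(baseUrl: string): string {
  const trimmed = baseUrl.replace(/\/+$/, "");
  const root = trimmed.replace(/\/v1$/, "");
  return `${root}/props`;
}

// Minimal fetch shape so the lookup can be exercised with a stub.
type FetchLike = (
  url: string,
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Best-effort discovery: any failure yields undefined instead of throwing,
// so servers without /props keep working as before.
async function discoverRuntimeContext(
  baseUrl: string,
  fetchImpl: FetchLike,
): Promise<number | undefined> {
  try {
    const response = await fetchImpl(propsUrlFromBaseUrl(baseUrl));
    if (!response.ok) return undefined;
    const body = await response.json();
    if (typeof body !== "object" || body === null) return undefined;
    const nCtx = (body as { n_ctx?: unknown }).n_ctx;
    return typeof nCtx === "number" && nCtx > 0 ? nCtx : undefined;
  } catch {
    return undefined;
  }
}
```

In a runtime with a global `fetch`, a call such as `discoverRuntimeContext("http://127.0.0.1:8080/v1", fetch)` would request `http://127.0.0.1:8080/props`. Returning `undefined` on failure keeps the discovery optional, which fits the "opportunistic" wording in the summary.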

Closes #73664.

Testing

node scripts/test-projects.mjs src/plugins/provider-self-hosted-setup.test.ts
git diff --check
corepack pnpm tsgo:core
corepack pnpm tsgo:core:test

AI Assistance

AI-assisted with Codex. I understand the change: it keeps user-configured caps authoritative while allowing llama.cpp-compatible servers to report their actual loaded runtime context through /props.n_ctx.

Changed files

src/plugins/provider-self-hosted-setup.test.ts (modified, +96/-0)
src/plugins/provider-self-hosted-setup.ts (modified, +88/-15)


Summary

Environment

OpenClaw version: 2026.4.25
Provider: llamacpp (OpenAI-compatible server via llama.cpp)
Model: qwen3.6-mxfp4-moe
llama.cpp build: b8951-665abc609

Observed behavior

Sessions table shows:

agent:main:main  |  qwen3.6-mxfp4-moe  |  56k/262k (21%)

The denominator 262k is n_ctx_train — the model's trained context ceiling — not the server's actual loaded context.

Expected behavior

The denominator should reflect the actual runtime context window the server was started with (--ctx-size 65536), showing 56k/65k and a correct usage percentage (~86%).
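The effect of the denominator on the displayed percentage can be checked with a small formatter. This is a hypothetical imitation of the Sessions table cell, not OpenClaw's rendering code, and it assumes exactly 56,000 used tokens, which the issue only reports as "56k".

```typescript
// Formats a token count as whole thousands, e.g. 65536 -> "65k".
function formatK(tokens: number): string {
  return `${Math.floor(tokens / 1000)}k`;
}

// Renders "used/limit (percent%)" for a given context limit.
function formatUsage(usedTokens: number, contextLimit: number): string {
  if (contextLimit <= 0) return `${formatK(usedTokens)}/?`;
  const percent = Math.round((usedTokens / contextLimit) * 100);
  return `${formatK(usedTokens)}/${formatK(contextLimit)} (${percent}%)`;
}

const usedTokens = 56_000; // assumed exact value behind the reported "56k"

// Denominator from n_ctx_train: the misleading figure in the report.
console.log(formatUsage(usedTokens, 262_144)); // 56k/262k (21%)

// Denominator from n_ctx: the figure the issue asks for.
console.log(formatUsage(usedTokens, 65_536)); // 56k/65k (85%)
```

Under that assumption the second line prints 85% rather than 86%; the issue's ~86% follows from dividing the rounded figures, 56 / 65 ≈ 0.86. Either way the session is far closer to its limit than the 21% reading suggests.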

Root cause

The /v1/models OpenAI-compatible endpoint only exposes n_ctx_train in the meta field:

"meta": {
  "n_ctx_train": 262144,
  ...
}

It does not include n_ctx (the actual loaded context). However, llama.cpp's /props endpoint does expose the correct value:

"n_ctx": 65536

Additionally, the agent's models.json file has contextWindow: 65536 set correctly, but this value appears to be overridden by the live server query result.

Suggested fix

For llamacpp provider types, OpenClaw should either:

1. Prefer models.json: trust the contextWindow value configured in the agent's models.json over the live server metadata, or
2. Query /props: for llama.cpp servers, fetch n_ctx from the /props endpoint instead of (or in addition to) n_ctx_train from /v1/models

Option 1 is simpler and consistent with how other providers work. Option 2 is more accurate but requires provider-specific logic.
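The two options are not mutually exclusive. A minimal sketch of a combined precedence rule, assuming the order configured value, then runtime `n_ctx`, then `n_ctx_train` (the interface and function names are hypothetical):

```typescript
interface ContextSources {
  configured?: number; // contextWindow from models.json
  runtime?: number; // n_ctx from /props
  trained?: number; // n_ctx_train from /v1/models
}

// Returns the first usable value in precedence order, or undefined
// when no source reports a positive context size.
function resolveContextTokens(sources: ContextSources): number | undefined {
  const candidates = [sources.configured, sources.runtime, sources.trained];
  return candidates.find((value) => value !== undefined && value > 0);
}

// The setup from this issue: all three sources are present and the
// configured value takes precedence.
console.log(
  resolveContextTokens({ configured: 65_536, runtime: 65_536, trained: 262_144 }),
); // 65536
```

Putting the configured value first keeps a user's explicit cap authoritative, while the runtime value still corrects the display when nothing is configured.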

extent analysis

TL;DR

Prefer the contextWindow value from models.json or query the /props endpoint for n_ctx to accurately display the runtime context window.

Guidance

Verify that the contextWindow value in models.json matches the expected runtime context window (--ctx-size 65536).
For llamacpp providers, consider querying the /props endpoint to fetch the actual n_ctx value.
Update the Sessions table to use the preferred n_ctx value instead of n_ctx_train from the /v1/models endpoint.
Test the change to ensure the token usage percentages are accurately displayed.
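The first guidance step, checking models.json against the running server, can be expressed as a small comparison. This is an illustrative sketch with hypothetical names; it takes the two numbers as inputs and makes no assumption about how models.json is structured or loaded.

```typescript
type ContextCheck =
  | { status: "match" }
  | { status: "mismatch"; configured: number; runtime: number }
  | { status: "unknown" };

// Compares the configured contextWindow with the n_ctx reported by /props.
// "unknown" means at least one side was unavailable, so nothing can be said.
function compareConfiguredToRuntime(
  configured: number | undefined,
  runtime: number | undefined,
): ContextCheck {
  if (configured === undefined || runtime === undefined) {
    return { status: "unknown" };
  }
  return configured === runtime
    ? { status: "match" }
    : { status: "mismatch", configured, runtime };
}
```

A "mismatch" result is a hint that models.json is out of date relative to the `--ctx-size` the server was started with.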

Example

The issue itself includes no code snippet. The fix it asks for is a change to the data source used for the displayed context window, not a specific code change.

Notes

The suggested fix assumes that the models.json file is correctly configured and up-to-date. If the file is not reliable, querying the /props endpoint may be a more accurate solution.

Recommendation

Apply workaround: prefer the contextWindow value from models.json, which is simple and consistent with other providers. The trade-off is that the displayed window may not reflect the actual runtime context window if models.json is outdated.
