openclaw - (Solved) [Bug] Context display shows ?/131k with llama.cpp after upgrading to 2026.5.4: field name mismatch not resolved [1 pull request, 1 comment, 2 participants]

GitHub stats
openclaw/openclaw#77992 (fetched 2026-05-06 06:18:11)
Comments: 1, Participants: 2, Timeline events: 4, Reactions: 2
Timeline (top): labeled ×2, commented ×1, cross-referenced ×1

After upgrading from OpenClaw 2026.2.26 to 2026.5.4, the context display shows '?/131k' instead of actual token usage when using llama.cpp as the model provider. OpenClaw expects 'input_tokens' and 'output_tokens' fields but llama.cpp returns 'prompt_tokens' and 'completion_tokens'.

Root Cause

Context display shows '?/131k' (question mark instead of actual token count). OpenClaw fails to find the expected 'input_tokens' and 'output_tokens' fields because llama.cpp returns 'prompt_tokens' and 'completion_tokens' instead. This is the same issue reported in #53448 but still unfixed in 2026.5.4.

Fix Action

Fixed

PR fix notes

PR #78085: fix(agents): parse prompt_tokens/completion_tokens in CLI usage for llama.cpp compatibility (#77992)

Description (problem / solution / changelog)

Summary

  • toCliUsage() in cli-output.ts only recognized input_tokens/output_tokens (and camelCase aliases) from CLI runner output. llama.cpp and other OpenAI-compatible local providers return prompt_tokens/completion_tokens instead, which are the standard OpenAI field names.
  • Without the fallback, usage was silently dropped and context display showed ?/131k for all llama.cpp, Ollama, and similar OpenAI-compatible users.
  • Fix: in toCliUsage(), fall back to prompt_tokens for totalInput and to completion_tokens for output when the existing field names are absent. Both parseCliJson and parseCliJsonl route through this function, so all CLI output parsing paths are covered.
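
The fallback described in the summary can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from PR #78085: `normalizeUsage`, `firstNumber`, `RawUsage`, and `Usage` are hypothetical names, and the alias list and precedence used by the real toCliUsage() may differ.

```typescript
// Hypothetical sketch; NOT the actual toCliUsage() from cli-output.ts.
type RawUsage = Record<string, unknown>;

interface Usage {
  input: number;
  output: number;
}

// Return the value of the first key that holds a finite number.
function firstNumber(raw: RawUsage, keys: string[]): number | undefined {
  for (const key of keys) {
    const value = raw[key];
    if (typeof value === "number" && Number.isFinite(value)) return value;
  }
  return undefined;
}

function normalizeUsage(raw: RawUsage | undefined): Usage | undefined {
  if (!raw) return undefined;
  // Existing names first, then the OpenAI-style names as a fallback.
  const input = firstNumber(raw, ["input_tokens", "inputTokens", "prompt_tokens"]);
  const output = firstNumber(raw, ["output_tokens", "outputTokens", "completion_tokens"]);
  // Report "unknown" rather than zero when no recognised field is present.
  if (input === undefined && output === undefined) return undefined;
  return { input: input ?? 0, output: output ?? 0 };
}
```

Given the llama.cpp shape `{ prompt_tokens: 11, completion_tokens: 1, total_tokens: 12 }`, this sketch yields `{ input: 11, output: 1 }`. Returning `undefined` for an empty usage object keeps "no data" distinguishable from "zero tokens".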

Closes #77992

Testing

  • pnpm vitest run src/agents/cli-output.test.ts

Real behavior proof

  • Behavior: Context display shows ?/131k with llama.cpp after upgrading to 2026.5.4 — field name mismatch causes usage to be silently dropped
  • Tested via targeted unit test added in this PR that exercises the exact llama.cpp response shape (prompt_tokens, completion_tokens, total_tokens).
  • What was not tested: live runtime — please apply maintainer proof: override or advise on evidence format.

Changed files

  • src/agents/cli-output.test.ts (modified, +37/-0)
  • src/agents/cli-output.ts (modified, +9/-2)

Code Example

Related issue: #53448 (reported March 24, 2026, still unfixed in 2026.5.4)

llama.cpp server returns usage in OpenAI-compatible format:
{
  "usage": {
    "prompt_tokens": 11,
    "completion_tokens": 1,
    "total_tokens": 12
  }
}

OpenClaw expects 'input_tokens' and 'output_tokens' which don't exist in llama.cpp's response.
Original issue

Bug type

Regression (worked before, now fails)

Beta release blocker

No

Summary

After upgrading from OpenClaw 2026.2.26 to 2026.5.4, the context display shows '?/131k' instead of actual token usage when using llama.cpp as the model provider. OpenClaw expects 'input_tokens' and 'output_tokens' fields but llama.cpp returns 'prompt_tokens' and 'completion_tokens'.

Steps to reproduce

  1. Run OpenClaw 2026.5.4 with llama.cpp server as model backend (running locally on port 8080)
  2. Send a message through the Telegram channel
  3. Check the session status display - context shows '?/131k' instead of actual token count
  4. Verify the llama.cpp server returns usage with 'prompt_tokens' and 'completion_tokens' fields (OpenAI-compatible format)
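
Step 4 can be checked with a short script along these lines. It is a sketch under assumptions: `readUsageFieldNames` and `FetchLike` are hypothetical names, and it relies on llama-server exposing the OpenAI-compatible `/v1/chat/completions` endpoint at the base URL from this report. The fetch function is passed in so the helper can be exercised without a running server.

```typescript
// Minimal fetch signature so a fake can be injected when no server is running.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Send a one-token request and return the field names found under "usage".
async function readUsageFieldNames(
  baseUrl: string,
  fetchImpl: FetchLike,
): Promise<string[]> {
  const res = await fetchImpl(`${baseUrl}/v1/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      messages: [{ role: "user", content: "ping" }],
      max_tokens: 1,
    }),
  });
  if (!res.ok) throw new Error(`llama-server returned HTTP ${res.status}`);
  const body = (await res.json()) as { usage?: Record<string, unknown> };
  return Object.keys(body.usage ?? {});
}
```

On a recent Node.js the global `fetch` should fit this signature, so `readUsageFieldNames("http://127.0.0.1:8080", fetch)` would be the call against the setup described here. If the result lists `prompt_tokens` and `completion_tokens` but not `input_tokens` or `output_tokens`, the server matches the shape reported in this issue.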

Expected behavior

In OpenClaw 2026.2.26, the context display showed actual token usage (e.g., '45/131k'). The system should correctly parse llama.cpp's 'prompt_tokens' and 'completion_tokens' fields and display the real-time token usage rate.
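
To make the difference between '45/131k' and '?/131k' concrete, here is a hypothetical formatter. `formatContext` and `formatTokens` are illustrative names, the rounding rule is a guess, and the 131k window is assumed to be 131072 tokens; OpenClaw's real status rendering may differ.

```typescript
// Hypothetical formatter; not OpenClaw's actual status rendering.
function formatTokens(n: number): string {
  // Values of 1000 or more are abbreviated to whole thousands.
  return n >= 1000 ? `${Math.round(n / 1000)}k` : String(n);
}

function formatContext(used: number | undefined, limit: number): string {
  // An unknown usage count renders as "?" instead of a number.
  const usedText = used === undefined ? "?" : formatTokens(used);
  return `${usedText}/${formatTokens(limit)}`;
}
```

With these assumptions, `formatContext(45, 131072)` gives `45/131k`, while `formatContext(undefined, 131072)` gives `?/131k`, which is the symptom when usage is dropped before it reaches the display.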

Actual behavior

Context display shows '?/131k' (question mark instead of actual token count). OpenClaw fails to find the expected 'input_tokens' and 'output_tokens' fields because llama.cpp returns 'prompt_tokens' and 'completion_tokens' instead. This is the same issue reported in #53448 but still unfixed in 2026.5.4.

OpenClaw version

2026.5.4

Operating system

Linux Mint 22.1 (based on Ubuntu 24.04) - Linux 6.14.0-37-generic (x64)

Install method

No response

Model

llamacpp/Qwen3.6-35B-A3B-UD-IQ3_XXS.gguf

Provider / routing chain

openclaw -> llamacpp (local llama-server on http://127.0.0.1:8080)

Additional provider/model setup details

llama.cpp server running locally on port 8080 with OpenAI-compatible API format. Model: Qwen3.6-35B-A3B-UD-IQ3_XXS.gguf (131k context window). Configured in openclaw.json under models.providers.llamacpp.

Logs, screenshots, and evidence

Related issue: #53448 (reported March 24, 2026, still unfixed in 2026.5.4)

llama.cpp server returns usage in OpenAI-compatible format:
{
  "usage": {
    "prompt_tokens": 11,
    "completion_tokens": 1,
    "total_tokens": 12
  }
}

OpenClaw expects 'input_tokens' and 'output_tokens' which don't exist in llama.cpp's response.

Impact and severity

Affected: All self-hosted OpenClaw users running llama.cpp or Ollama as local model provider
Severity: High - prevents accurate context monitoring and may cause context overflow without warning
Frequency: Always (100% of sessions with llama.cpp)
Consequence: LCM auto-compression may not trigger, context window can overflow silently, user cannot monitor token usage

Additional information

Last known good version: 2026.2.26
First known bad version: 2026.5.4

This is a regression that broke context tracking for llama.cpp users. The fix suggested in #53448 is straightforward - add fallback field name support:

input: response.usage?.prompt_tokens ?? response.usage?.input_tokens ?? 0,
output: response.usage?.completion_tokens ?? response.usage?.output_tokens ?? 0,

Extent analysis

TL;DR

Update OpenClaw to handle 'prompt_tokens' and 'completion_tokens' fields returned by llama.cpp by adding fallback field name support.

Guidance

  • Verify that the llama.cpp server returns the expected 'prompt_tokens' and 'completion_tokens' fields in its response.
  • Update the OpenClaw code so that it also accepts 'prompt_tokens' and 'completion_tokens' alongside 'input_tokens' and 'output_tokens', as suggested in the related issue #53448.
  • Test the updated OpenClaw version with the llama.cpp server to ensure accurate token usage display.
  • Consider temporarily downgrading to OpenClaw version 2026.2.26 if a fix is not immediately available.

Example

input: response.usage?.prompt_tokens ?? response.usage?.input_tokens ?? 0,
output: response.usage?.completion_tokens ?? response.usage?.output_tokens ?? 0,

Notes

The provided fix is based on the information given in the issue and the suggested solution in #53448. It is assumed that adding fallback field name support will resolve the issue.

Recommendation

Apply the workaround by adding fallback field name support as suggested, since the issue is a regression and the fix is relatively straightforward.
