openclaw: [Bug] CLI backend token counter / context window display is unreliable, shows wildly wrong values (often >100% of cap) on small sessions [1 comment, 2 participants]

GitHub stats: openclaw/openclaw#76979, fetched 2026-05-04 04:59:50
Comments: 1 · Participants: 2 · Timeline events: 2 (closed ×1, commented ×1) · Reactions: 2

Description

The displayed token / context window percentage is consistently wrong on CLI-routed sessions (e.g., claude-cli/* provider). Numbers stored in sessions.json.agent:main:main.totalTokens and shown in the TUI status bar do not match the actual context size — frequently showing >100% of the model's context cap on small sessions where the real usage is tiny.

This is a follow-up / addendum to #69016 (which is locked). The 4.20+ #70625 fix wired up the lifecycle hooks (compaction now triggers, summaries get written), but token observability remains broken on the CLI path.

Concrete repro from production setup

Setup:

  • OpenClaw 2026.4.23
  • lossless-claw 0.9.3
  • macOS 26.3.1 (arm64), Node 25.8.0
  • Active model: claude-cli/claude-opus-4-7
  • Auth profile: anthropic:claude-cli (OAuth, sk-ant-oat01-* token)

Sequence:

| Event | TUI / sessions.json shows |
| --- | --- |
| After /lcm rotate (transcript reduced from 7.6MB → 81KB, 32-message tail preserved) | tokens 10.0m/1.0m (955%) |
| After one quick test message | tokens 1.7m/1.0m (163%) |
| After full gateway hard restart (kill -9 + start) | sessions.json totalTokens: 4,298,594 (totalTokensFresh: true) |
| Manual /compact invocation | Compaction failed: live context still exceeds target • Context 208k/1.0m (20%) |

The Context 208k/1.0m figure from the compaction message is closer to plausible, but it still looks high compared with the measured inputs listed below.

Actual measured context floor

Bootstrap files injected per turn (workspace bootstrap pipeline): ~26k tokens total

  • MEMORY.md: ~14,810 tokens
  • life-lessons.md: ~9,474 tokens
  • SOUL.md / USER.md / AGENTS.md / TOOLS.md / SESSION_BUFFER.md / HEARTBEAT.md / INVENTORY.md / MEMORY-CHECKLIST.md / bookkeeping.md combined: ~13k
  • context/*.md: ~2k

Active session JSONL (post-rotate): 83KB / 41 lines = ~20k tokens worst case

Realistic context floor: ~100-150k tokens even with tool defs and system prompt accounted for.

The 4.3M / 855k / 1.7M / 208k numbers seen in sessions.json.totalTokens are all wildly inconsistent with this reality, but the field is flagged totalTokensFresh: true so the TUI trusts and displays it.
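The gap can be made concrete with a little arithmetic. The sketch below only adds up the approximate figures quoted above and compares them with the values seen in sessions.json; the variable names are illustrative and not part of OpenClaw. Note that the per-file figures sum to roughly 39k, somewhat above the ~26k headline, so the sketch uses the per-file figures as the more conservative input.

```python
# Approximate token figures quoted in this issue (not re-measured here).
bootstrap_tokens = {
    "MEMORY.md": 14_810,
    "life-lessons.md": 9_474,
    "other workspace files (combined)": 13_000,
    "context/*.md": 2_000,
}
session_jsonl_tokens = 20_000  # worst case for the 83KB post-rotate transcript
context_cap = 1_048_576        # contextTokens from sessions.json
generous_floor = 150_000       # upper end of the "realistic context floor"

# Everything that was actually measured: bootstrap files plus live transcript.
measured = sum(bootstrap_tokens.values()) + session_jsonl_tokens

# totalTokens values observed in sessions.json.
reported = {"post-rotate": 855_290, "after restart": 4_298_594}

print(f"measured inputs: ~{measured:,} tokens ({measured / context_cap:.1%} of cap)")
for label, total in reported.items():
    print(f"{label}: {total:,} is {total / generous_floor:.1f}x the generous floor")
```

Even against the generous 150k floor, both reported totals are several times too large, which is what makes the totalTokensFresh: true flag misleading.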

Cross-validation: API path is correct

The same setup, switching the active model from claude-cli/claude-opus-4-7 to anthropic/claude-opus-4-7 (direct API path with sk-ant-oat01 OAuth token + oauth-2025-04-20 beta header), displays token usage correctly:

  • Percentages stay under 100%
  • Numbers track real activity
  • Compaction succeeds when triggered

Switching back to claude-cli/* immediately reverts to the broken display. Same data, same conversation, same model — only the provider routing differs.

Likely root cause

The CLI backend's usage JSON response (when --output-format json is set on the claude command) reports Claude Code's own internal session token usage, which already includes:

  • Claude Code's system prompt
  • Claude Code's MCP tool definitions
  • Claude Code's project context
  • Whatever else Claude Code injects

OpenClaw appears to add its own estimate on top (bootstrap files + lossless-claw summaries + fresh tail), double-counting overlapping content. The same calculation works correctly for direct API providers because OpenClaw owns the full message construction there, so nothing is counted twice.
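A minimal sketch of the suspected double count, under the assumption that the hypothesis above is right. The function names and the numbers are hypothetical and chosen only to illustrate the shape of the problem; they are not taken from OpenClaw's source or from a real CLI response.

```python
def suspected_total(cli_reported: int, gateway_estimate: int) -> int:
    """Hypothesised behaviour: the gateway stacks its own estimate on top of
    a CLI figure that already covers the same content."""
    return cli_reported + gateway_estimate


def deduplicated_total(cli_reported: int, gateway_estimate: int) -> int:
    """If the CLI figure already includes everything the gateway injected,
    the CLI figure alone is the live context size."""
    return cli_reported


# Illustrative numbers only: the CLI reports 120k, of which 60k is the
# bootstrap content the gateway also estimates on its side.
cli_reported = 120_000
gateway_estimate = 60_000

inflated = suspected_total(cli_reported, gateway_estimate)
actual = deduplicated_total(cli_reported, gateway_estimate)
print(f"displayed: {inflated:,}  actual: {actual:,}  overcount: {inflated - actual:,}")
```

On a direct API path there is no separate CLI figure to stack on, which would explain why the same arithmetic behaves there. A single overlap of this kind does not by itself explain multi-million totals, so the real mechanism may involve more than one addition.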

Downstream impact

Users hit Compaction failed: live context still exceeds target warnings when the actual context is fine. This causes them to either:

  1. Raise maxAssemblyTokenBudget to absurd values chasing the wrong target
  2. Assume their setup is broken when it isn't
  3. Lose intuition about how much real context they're actually using

The wrong number becomes a confidently-wrong signal that breaks tuning intuition.

It also makes the CLI path effectively unusable for anyone who relies on the context counter to manage their session — they have to either accept the broken display or move to API providers.

Suggested fixes (any one would help)

  1. Trust the CLI's reported usage as authoritative — when the provider is a CLI backend, don't add openclaw's bootstrap/system estimate. Just use what the CLI returns.
  2. Subtract overlapping overhead — openclaw could subtract its own bootstrap/system contribution from the CLI's reported total before displaying.
  3. Mark as "estimate only" — show the token count with a ~ or warning marker on CLI backends so users know it's not authoritative.
  4. Provide a "true unknown" mode — display ?/1M or similar when the gateway can't compute confidently, instead of showing a wrong-but-confident number.
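Options 3 and 4 are purely display-side and can be sketched without touching how the total is computed. The formatter below is a hypothetical illustration, not OpenClaw's status bar code: it marks non-authoritative counts with ~ and falls back to ? when the total is missing or exceeds the cap, on the assumption that a live context cannot be larger than the model's window.

```python
from typing import Optional


def _short(n: int) -> str:
    """Render a token count in the compact style used by the status bar."""
    return f"{n / 1_000_000:.1f}m" if n >= 1_000_000 else f"{n / 1000:.0f}k"


def format_context_usage(total: Optional[int], cap: int, *, authoritative: bool) -> str:
    # Option 4: refuse to show a confident number that cannot be right.
    if total is None or total > cap:
        return f"?/{_short(cap)}"
    # Option 3: flag counts the gateway cannot vouch for.
    prefix = "" if authoritative else "~"
    return f"{prefix}{_short(total)}/{_short(cap)} ({total / cap:.0%})"


cap = 1_048_576
print(format_context_usage(208_000, cap, authoritative=False))
print(format_context_usage(4_298_594, cap, authoritative=True))
print(format_context_usage(None, cap, authoritative=False))
```

Treating an over-cap value as unknown is a design choice: it hides the magnitude of the error from the user, so a real implementation would probably also log the raw value for debugging.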

Related

  • #69016 — Original CLI backend lifecycle gap (locked, addendum filed here)
  • Martian-Engineering/lossless-claw#464 — Bootstrap mismatch on CLI backends
  • #69004 — Config rewriter wipes user config (not directly related but compounds the pain because the workaround paths get wiped)

Repro data

sessions.json.agent:main:main BEFORE one test turn (post-/lcm rotate):
  totalTokens: 855,290
  totalTokensFresh: true
  inputTokens: 6
  outputTokens: 64
  contextTokens: 1,048,576

sessions.json.agent:main:main AFTER hard gateway restart + 1 system event:
  totalTokens: 4,298,594  (← jumped UP 5x with no real activity)
  totalTokensFresh: true
  inputTokens: 10
  outputTokens: 5,030
  contextTokens: 1,048,576

The jump from 855k → 4.3M with only ~5k worth of input/output activity in between is concrete evidence that the field is not measuring what its name implies.
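The mismatch is easy to quantify from the two snapshots. This sketch uses only the values shown in the repro data; treating the later inputTokens + outputTokens as an upper bound on the activity between snapshots is an assumption made for the comparison.

```python
# Values copied from the two sessions.json snapshots above.
before = {"totalTokens": 855_290, "inputTokens": 6, "outputTokens": 64}
after = {"totalTokens": 4_298_594, "inputTokens": 10, "outputTokens": 5_030}
cap = 1_048_576  # contextTokens in both snapshots

total_delta = after["totalTokens"] - before["totalTokens"]
# Assumed upper bound on real activity between the snapshots.
activity = after["inputTokens"] + after["outputTokens"]
ratio = after["totalTokens"] / before["totalTokens"]

print(f"totalTokens grew by {total_delta:,} ({ratio:.1f}x)")
print(f"input + output activity: at most {activity:,} tokens")
print(f"share of cap: {before['totalTokens'] / cap:.1%} -> {after['totalTokens'] / cap:.1%}")
```

A growth of several million tokens against a few thousand tokens of activity cannot come from the conversation itself.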

extent analysis

TL;DR

The most likely fix is to trust the CLI's reported usage as authoritative when the provider is a CLI backend, and not add openclaw's bootstrap/system estimate.

Guidance

  • Verify the issue by checking the sessions.json file and comparing the totalTokens value with the actual context size.
  • Check the CLI backend's usage JSON response to see if it includes the system prompt, tool definitions, and project context.
  • Consider implementing one of the suggested fixes, such as trusting the CLI's reported usage or subtracting overlapping overhead.
  • Test the fix by running the sequence of events described in the issue and checking if the totalTokens value is accurate.

Example

No code snippet is provided as the issue is more related to the logic and calculation of token usage rather than a specific code implementation.
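That said, the first guidance step (checking sessions.json for implausible totals) can be sketched in a few lines. The layout assumed here, a top-level object keyed by session name with totalTokens and contextTokens fields, is inferred from the sessions.json.agent:main:main.totalTokens notation used in the issue and may not match the real file exactly.

```python
import json


def suspicious_sessions(sessions: dict) -> list[str]:
    """Return a description of every session whose totalTokens exceeds its
    contextTokens cap, which cannot be true of a live context."""
    flagged = []
    for key, entry in sessions.items():
        if not isinstance(entry, dict):
            continue
        total = entry.get("totalTokens")
        cap = entry.get("contextTokens")
        if isinstance(total, int) and isinstance(cap, int) and cap > 0 and total > cap:
            flagged.append(f"{key}: {total:,}/{cap:,} ({total / cap:.0%})")
    return flagged


# Inline sample mirroring the post-restart snapshot; a real check would
# load the gateway's sessions.json from disk instead.
sample = json.loads(
    '{"agent:main:main": {"totalTokens": 4298594, '
    '"totalTokensFresh": true, "contextTokens": 1048576}}'
)
for line in suspicious_sessions(sample):
    print(line)
```

This only catches totals above the cap. Values that are wrong but below the cap, such as the 855,290 snapshot, still need a manual comparison against the measured context floor.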

Notes

The issue seems to be specific to the CLI backend and the way it reports token usage. The suggested fixes are based on the assumption that the CLI backend's reported usage is accurate and that openclaw's estimate is the one causing the issue.

Recommendation

Apply the workaround of trusting the CLI's reported usage as authoritative when the provider is a CLI backend. If the likely root cause holds, the CLI backend's reported usage already includes the system prompt, tool definitions, and project context, so adding OpenClaw's estimate on top double-counts them.
