openclaw - 💡(How to fix) Fix [Bug] CLI backend token counter / context window display is unreliable — shows wildly wrong values (often >100% of cap) on small sessions [1 comments, 2 participants]

Description

The displayed token / context window percentage is consistently wrong on CLI-routed sessions (e.g., claude-cli/* provider). Numbers stored in sessions.json.agent:main:main.totalTokens and shown in the TUI status bar do not match the actual context size — frequently showing >100% of the model's context cap on small sessions where the real usage is tiny.

This is a follow-up / addendum to #69016 (which is locked). The #70625 fix in 4.20+ wired up the lifecycle hooks (compaction now triggers, summaries get written), but token observability remains broken on the CLI path.

Concrete repro from production setup

Setup:

OpenClaw 2026.4.23
lossless-claw 0.9.3
macOS 26.3.1 (arm64), Node 25.8.0
Active model: claude-cli/claude-opus-4-7
Auth profile: anthropic:claude-cli (OAuth, sk-ant-oat01-* token)

Sequence:

Event	TUI / sessions.json shows
After `/lcm rotate` (transcript reduced from 7.6MB → 81KB, 32-message tail preserved)	`tokens 10.0m/1.0m (955%)`
After one quick test message	`tokens 1.7m/1.0m (163%)`
After full gateway hard restart (`kill -9` + start)	`sessions.json totalTokens: 4,298,594` (`totalTokensFresh: true`)
Manual `/compact` invocation	`Compaction failed: live context still exceeds target • Context 208k/1.0m (20%)`

The Context 208k/1.0m figure from the compaction message is closer to reasonable, but still high relative to the context that can actually be measured.

Actual measured context floor

Bootstrap files injected per turn (workspace bootstrap pipeline): ~26k tokens total

MEMORY.md: ~14,810 tokens
life-lessons.md: ~9,474 tokens
SOUL.md / USER.md / AGENTS.md / TOOLS.md / SESSION_BUFFER.md / HEARTBEAT.md / INVENTORY.md / MEMORY-CHECKLIST.md / bookkeeping.md combined: ~13k
context/*.md: ~2k

Active session JSONL (post-rotate): 83KB / 41 lines = ~20k tokens worst case

Realistic context floor: ~100-150k tokens even with tool defs and system prompt accounted for.
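The arithmetic behind these figures can be re-run as a quick check. This is a minimal sketch: the token counts are copied from the list above, while the 4-bytes-per-token ratio and the rounding of "83KB" to 83,000 bytes are assumptions rather than tokenizer measurements. Note that the itemized bootstrap files sum to roughly 39k tokens, more than the ~26k total quoted above, so one of those figures may be stale or may cover a different set of files.

```python
# Token figures copied from the list above.
BOOTSTRAP_TOKENS = {
    "MEMORY.md": 14_810,
    "life-lessons.md": 9_474,
    "other workspace files (combined)": 13_000,
    "context/*.md": 2_000,
}
SESSION_JSONL_BYTES = 83_000   # "83KB", taken as 83,000 bytes (assumption)
ASSUMED_BYTES_PER_TOKEN = 4    # rough heuristic, not a tokenizer measurement


def estimate_tokens_from_bytes(size_bytes: int,
                               bytes_per_token: int = ASSUMED_BYTES_PER_TOKEN) -> int:
    """Coarse token estimate from a file size."""
    return size_bytes // bytes_per_token


bootstrap_total = sum(BOOTSTRAP_TOKENS.values())
session_tokens = estimate_tokens_from_bytes(SESSION_JSONL_BYTES)
measured_floor = bootstrap_total + session_tokens

print(f"bootstrap files: {bootstrap_total:,} tokens")
print(f"session JSONL:   {session_tokens:,} tokens")
print(f"measured floor:  {measured_floor:,} tokens")
```

The measured pieces come to about 60k tokens, which leaves room for tool definitions and the system prompt within the ~100-150k floor.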

The 4.3M / 855k / 1.7M / 208k numbers seen in sessions.json.totalTokens are all wildly inconsistent with this reality, but the field is flagged totalTokensFresh: true so the TUI trusts and displays it.

Cross-validation: API path is correct

The same setup, switching the active model from claude-cli/claude-opus-4-7 to anthropic/claude-opus-4-7 (direct API path with sk-ant-oat01 OAuth token + oauth-2025-04-20 beta header), displays token usage correctly:

Percentages stay under 100%
Numbers track real activity
Compaction succeeds when triggered

Switching back to claude-cli/* immediately reverts to the broken display. Same data, same conversation, same model — only the provider routing differs.

Likely root cause

The CLI backend's usage JSON response (when --output-format json is set on the claude command) reports Claude Code's own internal session token usage, which already includes:

Claude Code's system prompt
Claude Code's MCP tool definitions
Claude Code's project context
Whatever else Claude Code injects

OpenClaw appears to add its own estimate on top (bootstrap files + lossless-claw summaries + fresh tail), double-counting the overlapping content. The same calculation works correctly for direct API providers because OpenClaw owns the full message construction there, so nothing is counted twice.
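The suspected mechanism can be illustrated with a small sketch. The class, the function names, and the token figures are all hypothetical and are not taken from OpenClaw's code; the only point is how stacking a gateway-side estimate on a CLI-reported figure inflates the result. A one-off overlap like this would not by itself reach the multi-million values in the repro data, so treat it as an illustration of the hypothesis rather than a full explanation.

```python
from dataclasses import dataclass


@dataclass
class CliUsage:
    """Hypothetical shape of the usage a CLI backend reports for one turn."""
    # Assumed to already include system prompt, tool definitions, project context.
    reported_context_tokens: int


def double_counted_total(usage: CliUsage, gateway_estimate: int) -> int:
    """Suspected behaviour: the gateway's own estimate is added to the CLI's figure."""
    return usage.reported_context_tokens + gateway_estimate


def cli_authoritative_total(usage: CliUsage, gateway_estimate: int) -> int:
    """Alternative: treat the CLI's figure as the whole context size."""
    return usage.reported_context_tokens


# Illustrative numbers only.
usage = CliUsage(reported_context_tokens=120_000)
gateway_estimate = 60_000

print(double_counted_total(usage, gateway_estimate))     # overlap counted twice
print(cli_authoritative_total(usage, gateway_estimate))  # CLI figure used as-is
```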

Downstream impact

Users hit "Compaction failed: live context still exceeds target" warnings when the actual context is fine. This causes them to either:

Raise maxAssemblyTokenBudget to absurd values chasing the wrong target
Assume their setup is broken when it isn't
Lose intuition about how much real context they're actually using

The wrong number becomes a confidently-wrong signal that breaks tuning intuition.

It also makes the CLI path effectively unusable for anyone who relies on the context counter to manage their session — they have to either accept the broken display or move to API providers.

Suggested fixes (any one would help)

Trust the CLI's reported usage as authoritative — when the provider is a CLI backend, don't add openclaw's bootstrap/system estimate. Just use what the CLI returns.
Subtract overlapping overhead — openclaw could subtract its own bootstrap/system contribution from the CLI's reported total before displaying.
Mark as "estimate only" — show the token count with a ~ or warning marker on CLI backends so users know it's not authoritative.
Provide a "true unknown" mode — display ?/1M or similar when the gateway can't compute confidently, instead of showing a wrong-but-confident number.
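Options 3 and 4 can be sketched as a status-bar formatter. This is a minimal sketch, not OpenClaw's implementation: the function names and the rule "a count above the context cap is treated as unknown" are assumptions made for illustration, and the output format only imitates the TUI strings quoted in this issue.

```python
from typing import Optional


def format_count(tokens: int) -> str:
    """Compact form similar to the TUI strings quoted in this issue (208k, 1.0m)."""
    if tokens >= 1_000_000:
        return f"{tokens / 1_000_000:.1f}m"
    if tokens >= 1_000:
        return f"{tokens // 1_000}k"
    return str(tokens)


def format_context_status(total_tokens: Optional[int],
                          context_cap: int,
                          is_cli_backend: bool) -> str:
    cap = format_count(context_cap)
    # Option 4: missing or implausible (above the cap) counts are shown as "?"
    # instead of a wrong-but-confident number.
    if total_tokens is None or total_tokens > context_cap:
        return f"tokens ?/{cap}"
    percent = round(100 * total_tokens / context_cap)
    # Option 3: mark CLI-backend counts as estimates with a "~" prefix.
    prefix = "~" if is_cli_backend else ""
    return f"tokens {prefix}{format_count(total_tokens)}/{cap} ({percent}%)"


print(format_context_status(4_298_594, 1_048_576, is_cli_backend=True))
print(format_context_status(208_000, 1_048_576, is_cli_backend=True))
print(format_context_status(208_000, 1_048_576, is_cli_backend=False))
```

With the values from this report, the 4,298,594 reading would render as unknown, while the 208k reading would render with an estimate marker on the CLI path.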

Related

#69016: Original CLI backend lifecycle gap (locked, addendum filed here)
Martian-Engineering/lossless-claw#464: Bootstrap mismatch on CLI backends
#69004: Config rewriter wipes user config (not directly related but compounds the pain because the workaround paths get wiped)

Repro data

sessions.json.agent:main:main BEFORE one test turn (post-/lcm rotate):
  totalTokens: 855,290
  totalTokensFresh: true
  inputTokens: 6
  outputTokens: 64
  contextTokens: 1,048,576

sessions.json.agent:main:main AFTER hard gateway restart + 1 system event:
  totalTokens: 4,298,594  (← jumped UP 5x with no real activity)
  totalTokensFresh: true
  inputTokens: 10
  outputTokens: 5,030
  contextTokens: 1,048,576

The jump from 855k → 4.3M with only ~5k worth of input/output activity in between is concrete evidence that the field is not measuring what its name implies.
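The same comparison can be expressed as a plausibility check on the two snapshots. This is a sketch under assumptions: the field values are copied from the repro data, the helper names are hypothetical, and inputTokens / outputTokens are assumed to describe the most recent turn only.

```python
CONTEXT_CAP = 1_048_576  # contextTokens from the snapshots

before = {"totalTokens": 855_290, "inputTokens": 6, "outputTokens": 64}
after = {"totalTokens": 4_298_594, "inputTokens": 10, "outputTokens": 5_030}


def unexplained_growth(prev: dict, curr: dict) -> int:
    """Growth in totalTokens not covered by the latest turn's input + output tokens."""
    growth = curr["totalTokens"] - prev["totalTokens"]
    turn_activity = curr["inputTokens"] + curr["outputTokens"]
    return growth - turn_activity


def percent_of_cap(total_tokens: int, cap: int = CONTEXT_CAP) -> float:
    """totalTokens as a percentage of the context cap, to one decimal place."""
    return round(100 * total_tokens / cap, 1)


print(f"unexplained growth: {unexplained_growth(before, after):,} tokens")
print(f"before: {percent_of_cap(before['totalTokens'])}% of cap")
print(f"after:  {percent_of_cap(after['totalTokens'])}% of cap")
```

Roughly 3.4M tokens of growth have no matching activity, and the second snapshot sits above 400% of the cap, which a real context size cannot do.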

extent analysis

TL;DR

The most likely fix is to trust the CLI's reported usage as authoritative when the provider is a CLI backend, and not add openclaw's bootstrap/system estimate.

Guidance

Verify the issue by checking the sessions.json file and comparing the totalTokens value with the actual context size.
Check the CLI backend's usage JSON response to see if it includes the system prompt, tool definitions, and project context.
Consider implementing one of the suggested fixes, such as trusting the CLI's reported usage or subtracting overlapping overhead.
Test the fix by running the sequence of events described in the issue and checking if the totalTokens value is accurate.

Example

The issue does not include a code snippet; the problem lies in how token usage is calculated rather than in a specific, identified code path.

Notes

The issue seems to be specific to the CLI backend and the way it reports token usage. The suggested fixes are based on the assumption that the CLI backend's reported usage is accurate and that openclaw's estimate is the one causing the issue.

Recommendation

Treat the CLI's reported usage as authoritative when the provider is a CLI backend. If the CLI's figure already includes the system prompt, tool definitions, and project context, as the issue suspects, then adding OpenClaw's own estimate on top double-counts them. This is a change to the gateway's calculation, not a setting users can apply themselves.
