hermes - 💡(How to fix) Fix [Bug] HUD context_percent shows stale value, not current request token count [1 pull requests]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Root Cause

In tui_gateway/server.py, the _get_usage() function (lines 1289-1296) computes context_percent from last_prompt_tokens:

ctx_used = getattr(comp, "last_prompt_tokens", 0) or usage["total"] or 0
ctx_max = getattr(comp, "context_length", 0) or 0
if ctx_max:
    usage["context_percent"] = max(0, min(100, round(ctx_used / ctx_max * 100)))

last_prompt_tokens is updated after each API response completes. So the HUD always shows the token count from the previous request, not the current one.
Claude Code reportedly updates its HUD before sending the request, which is why it stays in sync.
Suggested Fix
Before sending an API request, estimate the current request token count using estimate_request_tokens_rough() and update the HUD display.

Fix Action

Fixed

RAW_BUFFERClick to expand / collapse

Bug Description

The HUD context percentage display does not match the actual token count when context compression triggers.

Steps to Reproduce

  1. Start a Hermes CLI or gateway session
  2. Do several tool-heavy operations (read_file, terminal, etc.)
  3. Watch context compression trigger

Expected vs Actual

Expected: HUD shows ~80%+ when compression fires (with threshold=0.8) Actual: HUD shows 53% while compression fires at 165,349 / 204,800 tokens (80.7%)

Example from actual session:

Preflight compression: ~165,349 tokens => 163,840 threshold. This may take a moment. MiniMax-M2.7 109K/204.8K [██████████░░░░░░░░░░░░░░░░░░░] 53%

Note: 165,349 / 204,800 = 80.7%, but HUD shows 53% with "109K"

Root Cause

In tui_gateway/server.py, the _get_usage() function (lines 1289-1296) computes context_percent from last_prompt_tokens:

ctx_used = getattr(comp, "last_prompt_tokens", 0) or usage["total"] or 0
ctx_max = getattr(comp, "context_length", 0) or 0
if ctx_max:
    usage["context_percent"] = max(0, min(100, round(ctx_used / ctx_max * 100)))

last_prompt_tokens is updated after each API response completes. So the HUD always shows the token count from the previous request, not the current one.
Claude Code reportedly updates its HUD before sending the request, which is why it stays in sync.
Suggested Fix
Before sending an API request, estimate the current request token count using estimate_request_tokens_rough() and update the HUD display.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

hermes - 💡(How to fix) Fix [Bug] HUD context_percent shows stale value, not current request token count [1 pull requests]