[Bug]: state.db input_tokens and cache_read_tokens incorrectly recorded for MiMo/xiaomi provider (Hermes)


Bug Description

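Hermes state.db records incorrect token values for the MiMo (xiaomi) provider:

  1. input_tokens is ~3.5x higher than the cache-miss count reported by the API dashboard (879,344 vs. 247,803), or ~3.9x the miss count derived from the agent log (226,818)
  2. cache_read_tokens is always 0 even though caching is active (89.3% hit rate per the API dashboard)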

This makes state.db unusable for billing analysis and cache hit rate calculation.

Evidence

API Official Data (from MiMo dashboard)

  • Total Input: 2,318,203 tokens
  • Cache Hit: 2,070,400 tokens (89.3%)
  • Cache Miss: 247,803 tokens

Hermes Agent Log

  • Total prompt_tokens (cumulative): 2,229,419 tokens
  • Cache Hit (from log): 1,981,824 tokens
  • Cache Miss (calculated): 226,818 tokens

Hermes state.db

SELECT input_tokens, cache_read_tokens FROM sessions WHERE id = 'e03df98b569c';
-- Result: 879344 | 0

The DB shows input_tokens = 879,344 (should be ~247,803) and cache_read_tokens = 0 (should be ~2,070,400).
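The figures above can be cross-checked with a few lines of arithmetic. This is only a sanity check on the numbers quoted in this report; the variable names are made up for illustration and nothing here touches Hermes code.

```python
# Figures copied verbatim from the Evidence section of this report.
api_total_input = 2_318_203   # MiMo dashboard: total input tokens
api_cache_hit = 2_070_400     # MiMo dashboard: cache hit tokens
api_cache_miss = 247_803      # MiMo dashboard: cache miss tokens
log_cache_miss = 226_818      # Hermes agent log: calculated miss tokens
db_input_tokens = 879_344     # state.db: sessions.input_tokens

# The dashboard numbers are internally consistent: hit + miss == total.
dashboard_consistent = api_cache_hit + api_cache_miss == api_total_input

# Cache hit rate according to the dashboard.
api_hit_rate = api_cache_hit / api_total_input

# How far state.db overstates the miss count, against each reference.
overcount_vs_api = db_input_tokens / api_cache_miss
overcount_vs_log = db_input_tokens / log_cache_miss

print(f"dashboard consistent: {dashboard_consistent}")
print(f"API cache hit rate:   {api_hit_rate:.1%}")
print(f"DB / API miss:        {overcount_vs_api:.2f}x")
print(f"DB / log miss:        {overcount_vs_log:.2f}x")
```

The overcount factor depends on which miss count is used as the reference, so it helps to state the reference explicitly when quoting it.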

Root Cause Analysis

Code Flow

  1. API Response Parsing (agent/usage_pricing.py:738-757):

      # OpenAI-compatible API (MiMo uses this)
      prompt_total = response.usage.prompt_tokens  # Total input (includes cache)
      cache_read_tokens = details.cached_tokens    # From prompt_tokens_details
      input_tokens = prompt_total - cache_read_tokens - cache_write_tokens  # Miss

  2. Session Accumulation (agent/conversation_loop.py:1607-1610):

      agent.session_input_tokens += canonical_usage.input_tokens
      agent.session_cache_read_tokens += canonical_usage.cache_read_tokens

  3. DB Storage (agent/turn_finalizer.py:341-344):

      "input_tokens": agent.session_input_tokens,
      "cache_read_tokens": agent.session_cache_read_tokens,

Hypothesis

MiMo API does not return prompt_tokens_details.cached_tokens (or uses a non-standard field name), causing:

  1. normalize_usage() returns cache_read_tokens=0
  2. input_tokens = prompt_total - 0 = prompt_total (treats all input as miss)
  3. DB accumulates prompt_total instead of actual miss

This explains why:

  • DB input_tokens (879,344) is much higher than actual miss (247,803)
  • DB cache_read_tokens is 0 despite a ~89% cache hit rate
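The hypothesized failure mode can be illustrated with a small stand-in for the parsing step. This is a minimal sketch, not the Hermes implementation: `normalize_usage_sketch` is a hypothetical name, the token values are invented, and cache writes are assumed to be zero.

```python
from types import SimpleNamespace


def normalize_usage_sketch(usage):
    """Hypothetical stand-in for normalize_usage(); mirrors the flow described above."""
    prompt_total = int(getattr(usage, "prompt_tokens", 0) or 0)
    # If the provider omits prompt_tokens_details, details is None and
    # the getattr() below silently falls back to 0.
    details = getattr(usage, "prompt_tokens_details", None)
    cache_read_tokens = int(getattr(details, "cached_tokens", 0) or 0)
    cache_write_tokens = 0  # assumption: no cache-write accounting in this sketch
    input_tokens = max(prompt_total - cache_read_tokens - cache_write_tokens, 0)
    return {"input_tokens": input_tokens, "cache_read_tokens": cache_read_tokens}


# A response that reports cache hits in the standard OpenAI-compatible place.
with_details = SimpleNamespace(
    prompt_tokens=1000,
    prompt_tokens_details=SimpleNamespace(cached_tokens=900),
)
# A response that omits prompt_tokens_details entirely (the hypothesized MiMo case).
without_details = SimpleNamespace(prompt_tokens=1000)

standard = normalize_usage_sketch(with_details)
degraded = normalize_usage_sketch(without_details)
print(standard)
print(degraded)
```

With the details object missing, every prompt token is counted as a miss and the cache-read counter never moves, which matches the direction of the discrepancy seen in state.db.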

Steps to Reproduce

  1. Configure Hermes with the MiMo provider:

      providers:
        xiaomi:
          base_url: https://token-plan-cn.xiaomimimo.com/v1
          model: mimo-v2.5

  2. Run a multi-turn conversation with WebUI or Gateway
  3. Check state.db:

      SELECT input_tokens, cache_read_tokens FROM sessions ORDER BY started_at DESC LIMIT 1;

  4. Compare with the MiMo API dashboard data; the values will not match
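The state.db check can also be scripted so the stored values and the hit rate they imply are printed together. This is a sketch that assumes only what the query above shows: a `sessions` table with `input_tokens`, `cache_read_tokens`, and `started_at` columns. The helper name and the database path are placeholders.

```python
import sqlite3


def latest_session_tokens(db_path):
    """Return token counters for the most recent session, or None if there is none."""
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT input_tokens, cache_read_tokens "
            "FROM sessions ORDER BY started_at DESC LIMIT 1"
        ).fetchone()
    finally:
        conn.close()
    if row is None:
        return None
    input_tokens = row[0] or 0
    cache_read_tokens = row[1] or 0
    total = input_tokens + cache_read_tokens
    # Hit rate implied by the stored counters; 0.0 when nothing was recorded.
    hit_rate = cache_read_tokens / total if total else 0.0
    return {
        "input_tokens": input_tokens,
        "cache_read_tokens": cache_read_tokens,
        "cache_hit_rate": hit_rate,
    }


# Usage (placeholder path; point it at your own state.db):
# print(latest_session_tokens("/path/to/state.db"))
```

An implied hit rate of 0.0 next to a dashboard hit rate near 89% is a quick way to confirm the bug is present.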

Expected Behavior

state.db should record:

  • input_tokens = actual miss tokens (new KV computation)
  • cache_read_tokens = actual cache hit tokens

Actual Behavior

state.db records:

  • input_tokens ≈ cumulative prompt_tokens (includes cache)
  • cache_read_tokens = 0

Proposed Fix

Option 1: Add MiMo-specific usage parsing

Check if MiMo returns cache data in a non-standard field. Add detection in normalize_usage():

# MiMo-specific: check for alternative cache fields
if not cache_read_tokens:
    cache_read_tokens = _to_int(getattr(response_usage, "cache_tokens", 0))
if not cache_read_tokens:
    cache_read_tokens = _to_int(getattr(response_usage, "cached_tokens", 0))
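A self-contained version of this fallback might look like the sketch below. The helpers `_to_int` and `_field` are reimplemented here for illustration and may differ from the ones in Hermes, and the candidate field names are the guesses from this proposal, not fields MiMo is confirmed to return.

```python
from types import SimpleNamespace

# Guessed top-level field names from the proposal above; unverified against MiMo.
CANDIDATE_CACHE_FIELDS = ("cache_tokens", "cached_tokens")


def _to_int(value):
    """Coerce a possibly missing or malformed value to a non-negative-safe int."""
    try:
        return int(value or 0)
    except (TypeError, ValueError):
        return 0


def _field(obj, name):
    """Read a field from either a dict or an attribute-style object."""
    if obj is None:
        return None
    if isinstance(obj, dict):
        return obj.get(name)
    return getattr(obj, name, None)


def split_prompt_tokens(usage):
    """Return (input_tokens, cache_read_tokens), trying the standard field first."""
    details = _field(usage, "prompt_tokens_details")
    cache_read = _to_int(_field(details, "cached_tokens"))
    for name in CANDIDATE_CACHE_FIELDS:
        if cache_read:
            break
        cache_read = _to_int(_field(usage, name))
    prompt_total = _to_int(_field(usage, "prompt_tokens"))
    # Never report a negative miss count if the provider's numbers disagree.
    return max(prompt_total - cache_read, 0), cache_read


print(split_prompt_tokens({"prompt_tokens": 1000, "cached_tokens": 900}))
print(split_prompt_tokens(SimpleNamespace(prompt_tokens=1000)))
```

Checking the standard location first keeps behavior unchanged for providers that already report `prompt_tokens_details.cached_tokens`; the fallbacks only apply when that value is missing or zero.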

Option 2: Log raw API response for debugging

Add debug logging to capture MiMo's actual usage structure:

logger.debug("MiMo usage response: %s", response.usage)
logger.debug("prompt_tokens_details: %s", getattr(response.usage, "prompt_tokens_details", None))
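To make the logged structure easier to diff against other providers, the usage object could be serialized to JSON before logging. The sketch below is one way to do that under stated assumptions: the logger name and both helper functions are hypothetical, and the `model_dump()` branch only applies if the usage object is a pydantic v2 style model.

```python
import json
import logging
from types import SimpleNamespace

logger = logging.getLogger("usage_debug")  # hypothetical logger name


def usage_to_dict(usage):
    """Best-effort conversion of a usage object to a plain dict."""
    if usage is None:
        return {}
    if isinstance(usage, dict):
        return dict(usage)
    model_dump = getattr(usage, "model_dump", None)
    if callable(model_dump):
        # Assumption: pydantic v2 style objects expose model_dump().
        return model_dump()
    # Fallback for simple attribute containers; recurse into nested objects.
    return {
        key: usage_to_dict(value) if hasattr(value, "__dict__") else value
        for key, value in vars(usage).items()
    }


def log_raw_usage(usage):
    """Log the full usage payload as sorted JSON and return the string."""
    payload = json.dumps(usage_to_dict(usage), default=str, sort_keys=True)
    logger.debug("raw usage payload: %s", payload)
    return payload


example = SimpleNamespace(
    prompt_tokens=10,
    prompt_tokens_details=SimpleNamespace(cached_tokens=4),
)
print(log_raw_usage(example))
```

Logging the whole payload rather than selected fields is deliberate: the point of this option is to discover which field, if any, MiMo uses for cached tokens.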

Workaround

Use the agent log or the official API dashboard for billing analysis. Do not rely on state.db token values for the MiMo provider.

Environment

  • Hermes Agent: latest main
  • Provider: xiaomi (MiMo)
  • Model: mimo-v2.5
  • Python: 3.11+

Related

  • #29553 — cache tokens missing from SSE events (different layer)
  • #41177 — Desktop UI shows 0% cache hit (downstream effect)
  • This issue focuses on DB storage layer specifically for MiMo provider
