hermes: Auxiliary model API calls are invisible in analytics / workspace dashboard

GitHub stats
NousResearch/hermes-agent#23270 · Fetched 2026-05-11 03:30:15
Comments: 0 · Participants: 1 · Timeline events: 4 (labeled ×4) · Reactions: 0


Problem

The workspace dashboard "TOP MODELS" list and /api/analytics/usage by_model data only show the main agent model, even when auxiliary models (compression, vision, session search) are actively making API calls.

Root Cause

Three layers conspire:

  1. auxiliary_client.py:call_llm() extracts usage from API responses but never persists it to the session DB. The response object carries input_tokens/output_tokens/total_tokens, and call_llm() then discards them.

  2. Callers (context_compressor.py:951, vision_tools.py, etc.) call call_llm() and only extract .content — they never touch the usage data.

  3. The analytics endpoint (web_server.py:2826) queries SELECT model, SUM(input_tokens)... FROM sessions GROUP BY model — but the sessions table has a single model column per session (the main agent model). There is no per-API-call model tracking table or column.
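
A minimal sketch of why the third layer hides auxiliary models, assuming a heavily simplified sessions table (the real schema has more columns, and the token counts here are made up for illustration). Because each session row stores only one model, a GROUP BY on that column can only ever report the main agent model:

```python
import sqlite3

# Simplified stand-in for the session DB; column set is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions ("
    "id TEXT PRIMARY KEY, model TEXT, "
    "input_tokens INTEGER DEFAULT 0, output_tokens INTEGER DEFAULT 0)"
)

# The main agent model is recorded once per session.
conn.execute("INSERT INTO sessions VALUES ('s1', 'deepseek-v4-flash', 1200, 300)")

# An auxiliary call returns usage too, but nothing writes it to the DB,
# so it never reaches the analytics query below.
aux_usage = {"model": "glm-5", "input_tokens": 800, "output_tokens": 100}

by_model = conn.execute(
    "SELECT model, SUM(input_tokens), SUM(output_tokens) "
    "FROM sessions GROUP BY model"
).fetchall()
print(by_model)  # [('deepseek-v4-flash', 1200, 300)]
```

The glm-5 usage exists in memory but has no row to land in, which matches the "1 model at 100% of calls" symptom.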

Impact

  • Dashboard shows "1 model" (e.g. deepseek-v4-flash) at "100% of calls" — misleading when compression (glm-5) and vision (mimo-v2.5-pro) are also consuming tokens
  • Cost attribution is invisible — subscription providers like OpenCode Go pool all model costs under one monthly fee, but users on per-token providers get no visibility into aux model spend
  • Debugging: no way to see how many tokens compression vs vision vs main model are consuming
  • /api/analytics/models (line 2872, presumably in the same web_server.py) is similarly blind

Suggested Solution

Add a new api_calls table to the session DB:

CREATE TABLE api_calls (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL REFERENCES sessions(id),
    model TEXT NOT NULL,
    provider TEXT,
    input_tokens INTEGER DEFAULT 0,
    output_tokens INTEGER DEFAULT 0,
    cache_read_tokens INTEGER DEFAULT 0,
    cache_write_tokens INTEGER DEFAULT 0,
    reasoning_tokens INTEGER DEFAULT 0,
    estimated_cost_usd REAL DEFAULT 0,
    actual_cost_usd REAL DEFAULT 0,
    task TEXT,
    created_at INTEGER DEFAULT (unixepoch())
);
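
To illustrate how such a table would feed a by_model breakdown, here is a hedged sketch using Python's sqlite3 module with a trimmed column set and invented token counts. One caveat worth noting: unixepoch() requires SQLite 3.38.0 or newer, so this sketch substitutes strftime('%s', 'now') for portability; that substitution is an assumption of the example, not part of the proposal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sessions (id TEXT PRIMARY KEY, model TEXT);
CREATE TABLE api_calls (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL REFERENCES sessions(id),
    model TEXT NOT NULL,
    input_tokens INTEGER DEFAULT 0,
    output_tokens INTEGER DEFAULT 0,
    task TEXT,
    -- portable stand-in for unixepoch() on older SQLite builds
    created_at INTEGER DEFAULT (CAST(strftime('%s', 'now') AS INTEGER))
);
""")
conn.execute("INSERT INTO sessions VALUES ('s1', 'deepseek-v4-flash')")

# One row per API call, tagged with the model that actually served it.
calls = [
    ("s1", "deepseek-v4-flash", 1200, 300, "main"),
    ("s1", "glm-5", 800, 100, "compression"),
    ("s1", "mimo-v2.5-pro", 500, 50, "vision"),
    ("s1", "deepseek-v4-flash", 400, 100, "main"),
]
conn.executemany(
    "INSERT INTO api_calls (session_id, model, input_tokens, output_tokens, task) "
    "VALUES (?, ?, ?, ?, ?)",
    calls,
)

# Per-model aggregation that includes auxiliary models.
by_model = conn.execute("""
    SELECT model, COUNT(*) AS calls, SUM(input_tokens), SUM(output_tokens)
    FROM api_calls
    GROUP BY model
    ORDER BY calls DESC, model
""").fetchall()
for row in by_model:
    print(row)
```

Grouping by task instead of model would give the compression vs vision vs main split mentioned under Impact.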

Then wire it:

  • run_agent.py:12825 writes with task="main"
  • call_llm() in auxiliary_client.py accepts optional session_id + task params and writes after each API call
  • Analytics endpoint aggregates from api_calls for the by_model breakdown
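
The wiring above could look roughly like the following sketch. Everything here is hypothetical: the real call_llm() signature is not shown in the issue, so the send callable, the conn parameter, the Usage and LLMResponse types, and record_api_call() are illustrative names only. The point is that logging stays optional, so existing callers that pass no session_id behave as before.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Usage:
    input_tokens: int = 0
    output_tokens: int = 0


@dataclass
class LLMResponse:  # hypothetical response shape
    content: str
    model: str
    usage: Usage = field(default_factory=Usage)


def record_api_call(conn: sqlite3.Connection, session_id: str,
                    task: Optional[str], response: LLMResponse) -> None:
    """Persist one call's usage instead of discarding it."""
    conn.execute(
        "INSERT INTO api_calls (session_id, model, input_tokens, output_tokens, task) "
        "VALUES (?, ?, ?, ?, ?)",
        (session_id, response.model, response.usage.input_tokens,
         response.usage.output_tokens, task),
    )
    conn.commit()


def call_llm(send: Callable[[str], LLMResponse], prompt: str, *,
             conn: Optional[sqlite3.Connection] = None,
             session_id: Optional[str] = None,
             task: Optional[str] = None) -> LLMResponse:
    """Call the model; log usage only when a session is supplied."""
    response = send(prompt)
    if conn is not None and session_id is not None:
        record_api_call(conn, session_id, task, response)
    return response


# Usage with a stubbed provider call and an in-memory DB.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE api_calls (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "session_id TEXT NOT NULL, model TEXT NOT NULL, "
    "input_tokens INTEGER DEFAULT 0, output_tokens INTEGER DEFAULT 0, task TEXT)"
)


def fake_send(prompt: str) -> LLMResponse:
    return LLMResponse("summary", "glm-5", Usage(800, 100))


reply = call_llm(fake_send, "Summarize the context", conn=conn,
                 session_id="s1", task="compression")
logged = conn.execute(
    "SELECT session_id, model, input_tokens, output_tokens, task FROM api_calls"
).fetchall()
print(logged)  # [('s1', 'glm-5', 800, 100, 'compression')]
```

Callers such as the context compressor would then only need to pass their session_id and a task label; they could keep reading .content as they do today.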

Implementation Notes

  • update_token_counts() in hermes_state.py already handles incremental deltas — could accept a model override per call
  • billing_provider / billing_base_url on sessions track the main provider — aux calls may use different providers (e.g. main=OpenCode Go, aux=OpenRouter)
  • Cost estimation: agent/usage_pricing.py has per-model pricing
  • Workspace DashboardModelInfoSection type and dashboard aggregator need extension to display aux model info
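
For the cost-estimation note, a per-call estimate could be computed along these lines. This is only a sketch: the contents and interface of agent/usage_pricing.py are not shown in the issue, so the PRICES_PER_MILLION table, the model names, and the prices below are placeholders invented for illustration.

```python
from typing import Optional

# Placeholder prices in USD per 1M tokens: (input, output). Not real pricing.
PRICES_PER_MILLION = {
    "example-main": (1.0, 2.0),
    "example-aux": (0.5, 1.5),
}


def estimate_cost_usd(model: str, input_tokens: int,
                      output_tokens: int) -> Optional[float]:
    """Return an estimated cost, or None when the model has no known price."""
    prices = PRICES_PER_MILLION.get(model)
    if prices is None:
        return None  # leave unknown rather than silently recording 0
    in_price, out_price = prices
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


aux_cost = estimate_cost_usd("example-aux", 800_000, 100_000)
unknown_cost = estimate_cost_usd("unlisted-model", 1_000, 1_000)
print(aux_cost, unknown_cost)
```

Returning None for unpriced models keeps "no estimate available" distinguishable from a genuine zero-cost call, which matters when aux calls go through a different provider than the main model.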
