hermes: Per-model token usage is lost during mid-session model switches (solved; 2 pull requests, 2 comments, 3 participants)

GitHub stats
NousResearch/hermes-agent#28637 (fetched 2026-05-20 04:03:03)
Comments: 2 · Participants: 3 · Timeline events: 8 · Reactions: 0
Timeline (top): cross-referenced ×3, labeled ×3, commented ×2

Fix Action

Fixed

PR fix notes

PR #1: Per-model token tracking for mid-session model switches

Description (problem / solution / changelog)

Closes #28637

Adds a usage_by_model column (TEXT, holding JSON) to the sessions table. It records a per-model token and cost breakdown on every API call, so cost can be attributed and displayed accurately after mid-session /model switches.

Changes

  • Schema: usage_by_model TEXT column on sessions, auto-added to existing DBs via _reconcile_columns(); SCHEMA_VERSION bumped to 12
  • In-memory: agent.session_usage_by_model dict on AIAgent
  • Per-call tracking: conversation_loop.py populates per-model counters (tokens, cache, reasoning, API calls, and cost), then serializes them to JSON and persists them on every turn
  • CLI /usage: per-model breakdown when multiple models are detected; single-model fallback for legacy sessions
  • Gateway /usage: same breakdown on messaging platforms
  • /insights: model breakdown uses usage_by_model when present and falls back to the model column for backward compatibility
  • TUI: usage_by_model included in the Python → TypeScript payload (display rendering not yet implemented; separate follow-up)
  • Tests: round-trip serialization, cumulative sum invariant, backward-compatibility fallback, schema version bump
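
The per-call tracking and the "cumulative sum invariant" test listed above can be illustrated with a minimal sketch. This is not the PR's code: the counter names in FIELDS, the helper names, and the model names are all hypothetical, chosen only to show how a per-model dict can be accumulated, serialized to JSON for a TEXT column, and checked against session totals.

```python
import json
from collections import defaultdict

# Hypothetical counter names. The PR lists tokens, cache, reasoning,
# API calls, and cost, but does not spell out the exact keys.
FIELDS = (
    "input_tokens",
    "output_tokens",
    "cache_read_tokens",
    "reasoning_tokens",
    "api_calls",
    "cost_usd",
)


def new_usage_by_model():
    """Return a dict mapping model name -> zeroed counters."""
    return defaultdict(lambda: {field: 0 for field in FIELDS})


def record_call(usage_by_model, model, **counts):
    """Add one API call's usage to the bucket of the model that served it."""
    bucket = usage_by_model[model]
    for field, value in counts.items():
        if field not in bucket:
            raise KeyError(f"unknown usage field: {field}")
        bucket[field] += value
    bucket["api_calls"] += 1


def to_json(usage_by_model):
    """Serialize the breakdown for storage in a TEXT column."""
    return json.dumps(usage_by_model, sort_keys=True)


usage = new_usage_by_model()
record_call(usage, "model-a", input_tokens=1200, output_tokens=300, cost_usd=0.012)
# The user switches models mid-session; later calls land in a separate bucket.
record_call(usage, "model-b", input_tokens=800, output_tokens=150, cost_usd=0.004)
record_call(usage, "model-b", input_tokens=500, output_tokens=50, cost_usd=0.002)

# Round trip through JSON, then check that per-model input tokens sum to
# the session-wide total (1200 + 800 + 500).
restored = json.loads(to_json(usage))
total_input = sum(m["input_tokens"] for m in restored.values())
```

Keying the dict by model name keeps the session-wide cumulative counters derivable as a sum over buckets, which is presumably what the invariant test in the PR checks.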

Backward compatibility

  • Old sessions without the column still render as single-model, as before
  • _reconcile_columns() adds the column on the next startup, with zero downtime
  • Cumulative counters (input_tokens, output_tokens, etc.) are unchanged
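
A rough sketch of how such a column reconciliation and legacy fallback could work, assuming the sessions table lives in SQLite (the source does not name the database engine). The functions reconcile_columns and models_for_display are illustrative stand-ins, not the project's _reconcile_columns() or its rendering code.

```python
import json
import sqlite3


def reconcile_columns(conn, table, expected):
    """Add any column in `expected` (name -> SQL type) that `table` lacks."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, sql_type in expected.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {sql_type}")
            added.append(name)
    conn.commit()
    return added


def models_for_display(model, usage_json):
    """Prefer the per-model JSON; fall back to the single model column."""
    if usage_json:
        return sorted(json.loads(usage_json))
    return [model]


conn = sqlite3.connect(":memory:")
# A legacy sessions table created before the new column existed.
conn.execute(
    "CREATE TABLE sessions (id TEXT PRIMARY KEY, model TEXT, input_tokens INTEGER)"
)
conn.execute("INSERT INTO sessions VALUES ('s1', 'model-a', 1200)")

first_run = reconcile_columns(conn, "sessions", {"usage_by_model": "TEXT"})
second_run = reconcile_columns(conn, "sessions", {"usage_by_model": "TEXT"})

# Existing rows get NULL in the new column, so they take the fallback path.
model, usage_json = conn.execute(
    "SELECT model, usage_by_model FROM sessions WHERE id = 's1'"
).fetchone()
legacy_models = models_for_display(model, usage_json)
```

Because the added column is nullable with no default, the ALTER TABLE does not rewrite existing rows, and running the reconciliation a second time is a no-op.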

Changed files

  • agent/agent_init.py (modified, +2/-1)
  • agent/conversation_loop.py (modified, +20/-0)
  • agent/insights.py (modified, +48/-20)
  • cli.py (modified, +79/-34)
  • gateway/run.py (modified, +42/-17)
  • hermes_state.py (modified, +20/-15)
  • run_agent.py (modified, +2/-1)
  • tests/agent/test_insights.py (modified, +60/-0)
  • tests/test_hermes_state.py (modified, +223/-3)
  • tui_gateway/server.py (modified, +1/-0)

PR #28842: Per-model token tracking for mid-session model switches

Description, changes, backward-compatibility notes, and changed files are identical to PR #1 above.

Issue description

The sessions table's model column is written with COALESCE(model, ?), which is first-writer-wins: it records only the first model and never updates on a /model switch. Cumulative counters (input_tokens, output_tokens, etc.) keep summing across all models, but there is no per-model breakdown anywhere in the database: no column, no table, no per-message model attribution.

After a mid-session model switch, /usage and /insights attribute all tokens and cost to the single locked-in model.
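
The first-writer-wins behavior can be reproduced in a few lines. This sketch assumes SQLite and a simplified sessions table; the UPDATE statement is a guess at the shape of the real query, built around the COALESCE(model, ?) expression quoted in the issue.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions ("
    "id TEXT PRIMARY KEY, model TEXT, input_tokens INTEGER DEFAULT 0)"
)
conn.execute("INSERT INTO sessions (id) VALUES ('s1')")


def record_turn(conn, session_id, model, input_tokens):
    """Hypothetical per-turn write mirroring the pattern described above."""
    # COALESCE returns its first non-NULL argument, so once model is set
    # the new value is ignored while the token counter keeps growing.
    conn.execute(
        "UPDATE sessions "
        "SET model = COALESCE(model, ?), input_tokens = input_tokens + ? "
        "WHERE id = ?",
        (model, input_tokens, session_id),
    )


record_turn(conn, "s1", "model-a", 1200)
record_turn(conn, "s1", "model-b", 800)  # after a /model switch

stored_model, stored_tokens = conn.execute(
    "SELECT model, input_tokens FROM sessions WHERE id = 's1'"
).fetchone()
```

After both turns the row still names the first model while the counter holds the combined total, so any report built from this row credits the second model's tokens to the first.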
