hermes: delegate_task subagent costs never persisted to state DB (in-memory only) [1 pull request]

Fix Action

Fixed

Bug Description

Subagent costs from delegate_task are rolled up into the parent agent's in-memory session_estimated_cost_usd counter but are never flushed to the state DB. As a result:

  • The live cost footer during a session is correct (includes subagents)
  • hermes insights and the sessions.estimated_cost_usd column are permanently wrong — they only contain the parent's direct API call costs

This means the DB undercounts costs in direct proportion to subagent usage. For sessions that heavily use delegate_task, the gap can be enormous.

How It Was Observed

Compared OpenRouter actual billing to Hermes DB totals for May 25, 2026:

| Source | Total |
| --- | --- |
| OpenRouter billing | $15.57 |
| Hermes state DB (sessions.estimated_cost_usd) | $3.37 |
| Gap | $12.20 |

The largest discrepancy came from a GLM-5.1 session (midnight–1:30 AM) that spawned subagents via delegate_task. The DB recorded only ~225K input tokens for the parent session, but hermes insights reported 4.5M total tokens for those GLM sessions — meaning ~4.25M tokens of subagent usage had costs tracked in memory but never persisted.

At GLM-5.1 pricing (~$2.50/M input tokens), that single session accounts for ~$10.63 of the missing $12.20.
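
The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers quoted in this report; the token count and the per-million price are the approximate values given above, and the variable names are illustrative.

```python
# Figures quoted in this report; the token count and price are approximate.
openrouter_billed_usd = 15.57
hermes_db_total_usd = 3.37
gap_usd = round(openrouter_billed_usd - hermes_db_total_usd, 2)

unpersisted_tokens_millions = 4.25    # ~4.25M subagent tokens missing from the DB
glm_input_price_per_million = 2.50    # ~$2.50 per million input tokens (GLM-5.1)
glm_session_missing_usd = unpersisted_tokens_millions * glm_input_price_per_million

# Portion of the gap that the single GLM session does not explain.
unexplained_usd = gap_usd - glm_session_missing_usd

print(f"gap: ${gap_usd:.2f}")
print(f"GLM session: ${glm_session_missing_usd:.3f}")
print(f"unexplained: ${unexplained_usd:.3f}")
```

Under these assumptions the GLM session covers most of the gap, and the remainder of roughly $1.6 would have to come from other sessions or from pricing differences that this estimate ignores.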

Root Cause

The fix at delegate_tool.py lines 2286–2297 (ported from Kilo-Org/kilocode#9448) correctly folds subagent costs into the parent's in-memory counter:

# delegate_tool.py line 2289
parent_agent.session_estimated_cost_usd = current + _children_cost_total

But it never calls parent_agent._session_db.update_token_counts() to persist this to the DB.

By contrast, the parent's own API calls go through conversation_loop.py line 1738 which calls update_token_counts() with per-call deltas:

# conversation_loop.py line 1738
agent._session_db.update_token_counts(
    agent.session_id,
    estimated_cost_usd=float(cost_result.amount_usd),
    ...
)

The DB uses COALESCE(estimated_cost_usd, 0) + COALESCE(?, 0) — an additive delta, not a set. Since the subagent rollup never calls this, the subagent portion never reaches the DB.

Additionally, parent_agent.session_estimated_cost_usd is written to the run_conversation() result dict (line 4221), but no caller flushes the result dict's estimated_cost_usd field back to the DB after the conversation ends.
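
The divergence can be reproduced in isolation. The sketch below is not the Hermes code: the one-column `sessions` table, the `add_cost_delta` helper, and the dollar amounts are assumptions chosen for illustration. Only the `COALESCE(...) + COALESCE(?, 0)` update shape is taken from the description above.

```python
import sqlite3

# Hypothetical minimal schema; the real sessions table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, estimated_cost_usd REAL)")
conn.execute("INSERT INTO sessions (id) VALUES ('s1')")


def add_cost_delta(session_id: str, delta_usd: float) -> None:
    """Additive update in the style described above (hypothetical helper)."""
    conn.execute(
        "UPDATE sessions "
        "SET estimated_cost_usd = COALESCE(estimated_cost_usd, 0) + COALESCE(?, 0) "
        "WHERE id = ?",
        (delta_usd, session_id),
    )


in_memory_cost = 0.0

# Parent API calls: each one updates the in-memory counter and the DB.
for call_cost in (0.10, 0.25):
    in_memory_cost += call_cost
    add_cost_delta("s1", call_cost)

# Subagent rollup: in-memory counter only, no DB call (the reported behavior).
children_cost_total = 2.00
in_memory_cost += children_cost_total

db_cost = conn.execute(
    "SELECT estimated_cost_usd FROM sessions WHERE id = 's1'"
).fetchone()[0]

print(f"in memory: {in_memory_cost:.2f}, in DB: {db_cost:.2f}")
```

The in-memory counter and the DB column differ by exactly the subagent total, which matches the symptom: a correct live footer and an undercounted `sessions.estimated_cost_usd`.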

Affected files and lines:

  • tools/delegate_tool.py lines 2251–2297: Cost rollup — updates in-memory only
  • agent/conversation_loop.py line 1738: Per-call DB persistence — subagent cost never routed through here
  • hermes_state.py lines 830–834: update_token_counts — additive delta model has no path for bulk subagent rollup

Sites that read estimated_cost_usd from the DB (all affected):

  • agent/insights.py lines 430–432: _compute_overview → undercounts
  • hermes insights CLI command → undercounts
  • Any external dashboard/analytics querying the sessions table → undercounts

Proposed Fix

Option A: Flush after rollup (minimal, targeted)

In delegate_tool.py, after the rollup at line 2289, add a DB flush:

# After line 2289: parent_agent.session_estimated_cost_usd = current + _children_cost_total
try:
    parent_agent._session_db.update_token_counts(
        parent_agent.session_id,
        estimated_cost_usd=float(_children_cost_total),
        absolute=False,  # additive — adds to existing DB value
    )
except Exception:
    logger.debug("Subagent cost DB flush failed", exc_info=True)

Problem: update_token_counts() currently does not expose an absolute parameter at the call site. It always uses the non-absolute (additive) path unless called from the gateway path. The fix would need to either:

  (a) add absolute=False as an explicit parameter, or
  (b) accept that the default additive mode works here (the subagent cost adds to the parent cost already in the DB)
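
A minimal sketch of Option A under choice (b), relying on the default additive behavior and passing no `absolute` argument. `FakeSessionDB` and `roll_up_children_cost` are hypothetical stand-ins written for this example, not Hermes APIs. The point it illustrates is that the flush sends only the children's delta, not the running total, so the additive update does not double-count the parent's own calls.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger(__name__)


class FakeSessionDB:
    """Hypothetical stand-in that mimics an additive update_token_counts."""

    def __init__(self):
        self.cost_by_session = {}

    def update_token_counts(self, session_id, estimated_cost_usd=None):
        previous = self.cost_by_session.get(session_id, 0.0)
        self.cost_by_session[session_id] = previous + (estimated_cost_usd or 0.0)


def roll_up_children_cost(parent_agent, children_cost_total):
    """Fold subagent cost into the parent counter, then flush the delta."""
    current = float(getattr(parent_agent, "session_estimated_cost_usd", 0.0) or 0.0)
    parent_agent.session_estimated_cost_usd = current + children_cost_total
    try:
        # Pass only the delta: the DB adds it to what is already stored.
        parent_agent._session_db.update_token_counts(
            parent_agent.session_id,
            estimated_cost_usd=float(children_cost_total),
        )
    except Exception:
        # A failed flush should not break the delegation result.
        logger.debug("Subagent cost DB flush failed", exc_info=True)


db = FakeSessionDB()
db.update_token_counts("s1", estimated_cost_usd=0.35)  # parent's own calls, already persisted
parent = SimpleNamespace(session_id="s1", session_estimated_cost_usd=0.35, _session_db=db)

roll_up_children_cost(parent, 2.00)
print(parent.session_estimated_cost_usd, db.cost_by_session["s1"])
```

After the rollup, the in-memory counter and the stand-in DB agree, which is the real-time accuracy Option A is meant to provide.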

Option B: Final cost sync at session end (more robust)

After run_conversation() returns (in both CLI and gateway paths), sync the full in-memory agent.session_estimated_cost_usd to the DB as an absolute write. This catches ALL in-memory cost mutations (subagent rollups, mid-session model switches, etc.):

# In run_agent.py close() or end_session path
agent._session_db.update_token_counts(
    agent.session_id,
    estimated_cost_usd=agent.session_estimated_cost_usd,
    absolute=True,
)
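
What an absolute write could look like at the SQL level is sketched below. The schema and the `sync_cost_absolute` helper are assumptions for illustration; the report does not show how `absolute=True` is implemented in hermes_state.py. The property of interest is idempotence: repeating the sync leaves the stored value unchanged, so it is safe to run at session end even when earlier additive flushes have already landed.

```python
import sqlite3

# Hypothetical minimal schema; the DB holds only the parent's direct costs so far.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, estimated_cost_usd REAL)")
conn.execute("INSERT INTO sessions (id, estimated_cost_usd) VALUES ('s1', 0.35)")


def sync_cost_absolute(session_id: str, total_usd: float) -> None:
    """Overwrite the stored cost with the in-memory total (hypothetical helper)."""
    conn.execute(
        "UPDATE sessions SET estimated_cost_usd = ? WHERE id = ?",
        (float(total_usd), session_id),
    )


def read_cost(session_id: str) -> float:
    row = conn.execute(
        "SELECT estimated_cost_usd FROM sessions WHERE id = ?", (session_id,)
    ).fetchone()
    return row[0]


session_estimated_cost_usd = 2.35  # in-memory total, subagent rollups included

sync_cost_absolute("s1", session_estimated_cost_usd)
first = read_cost("s1")

# Running the sync again does not change the value, unlike an additive update.
sync_cost_absolute("s1", session_estimated_cost_usd)
second = read_cost("s1")

print(first, second)
```

One design consideration: an absolute write trusts the in-memory counter completely, so it is only correct if that counter covers the whole session at the moment of the sync.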

Option C: Both (belt-and-suspenders)

Do Option A for real-time accuracy during long sessions, and Option B as a safety net at session close. This is the most thorough approach and protects against future in-memory-only cost mutations.

Impact

  • Severity: P2 — causes incorrect billing analytics but does not affect correctness of agent behavior
  • Affected users: Anyone who uses delegate_task and relies on hermes insights for cost tracking
  • Silent failure: No errors or warnings — the cost just quietly disagrees with the provider's billing

Environment

  • Hermes Agent running on Linux (Pop!_OS 22.04)
  • OpenRouter provider with deepseek/deepseek-v4-pro, z-ai/glm-5.1, moonshotai/kimi-k2.6, x-ai/grok-4.3
  • display.show_cost: show enabled
  • delegation config active with subagent spawns
