hermes: Custom endpoint pricing can overestimate Crof qwen3.5-9b cost by 1,000,000x [1 pull request]

Hermes recorded an impossible estimated cost for a Crof custom-endpoint session using qwen3.5-9b: 13,233 total tokens were stored as $555.17 estimated cost.

This appears to be a pricing-unit mismatch. Crof model metadata exposes cost.input / cost.output values such as 0.04 and 0.15 for qwen3.5-9b. Hermes appears to normalize those as if they were dollars-per-token, multiplying by 1,000,000 before cost calculation. For Crof/catalog-style metadata these values already look like per-million-token rates, so the resulting estimate is inflated by 1,000,000x.

Fix Action

Fixed

Evidence from local Hermes state

Affected session:

session_id: cron_d2c8a0c8d33e_20260511_090015
model: qwen3.5-9b
billing_provider: custom
billing_base_url: https://crof.ai/v1
input_tokens: 12998
output_tokens: 235
estimated_cost_usd: 555.17
cost_status: estimated
cost_source: provider_models_api

The session was running through the custom Crof endpoint, not a metered OpenAI route:

Auxiliary auto-detect: using main provider custom (qwen3.5-9b)

The model catalog entry for Crof has:

{
  "id": "qwen3.5-9b",
  "cost": {
    "input": 0.04,
    "output": 0.15,
    "cache_read": 0.008
  }
}

The recorded $555.17 matches treating those rates as per-token prices:

12998 * 0.04 + 235 * 0.15 = 555.17

If the values are per-million-token rates, the same usage should be:

(12998 * 0.04 + 235 * 0.15) / 1,000,000 = 0.00055517

For a free/included custom route, it should likely be $0 or unknown/included, but it should not be $555.17.
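
The two interpretations above can be checked with a few lines of Python. This is only a worked version of the arithmetic, using the token counts and rates quoted in this issue; the function names are made up for illustration and are not Hermes code.

```python
# Token counts from the affected session and rates from the Crof catalog entry.
INPUT_TOKENS = 12_998
OUTPUT_TOKENS = 235
RATE_INPUT = 0.04   # cost.input
RATE_OUTPUT = 0.15  # cost.output


def cost_if_per_token(input_tokens: int, output_tokens: int) -> float:
    """Treat the catalog rates as dollars per token (the buggy reading)."""
    return input_tokens * RATE_INPUT + output_tokens * RATE_OUTPUT


def cost_if_per_million(input_tokens: int, output_tokens: int) -> float:
    """Treat the catalog rates as dollars per million tokens."""
    return cost_if_per_token(input_tokens, output_tokens) / 1_000_000


per_token = cost_if_per_token(INPUT_TOKENS, OUTPUT_TOKENS)
per_million = cost_if_per_million(INPUT_TOKENS, OUTPUT_TOKENS)

print(f"per-token reading:   ${per_token:.2f}")      # $555.17
print(f"per-million reading: ${per_million:.8f}")    # $0.00055517
```

The ratio between the two results is exactly the 1,000,000x factor described in the summary.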

Likely code path

agent/model_metadata.py maps endpoint/catalog pricing aliases like input and output into prompt / completion pricing.

agent/usage_pricing.py then converts provider model pricing by multiplying by 1,000,000 in _pricing_entry_from_metadata():

return value * _ONE_MILLION

That conversion is correct for OpenRouter-style pricing.prompt / pricing.completion values that are dollars per token, but it is wrong for provider/catalog metadata where input / output are already per-million-token rates.
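
A minimal sketch of unit-aware normalization, assuming the unit can be inferred from the original field name. Only `_ONE_MILLION` is taken from the issue (its value is assumed here); `PriceUnit`, `unit_for_field`, and `to_per_million` are hypothetical names, not the actual contents of `agent/usage_pricing.py`. Since `agent/model_metadata.py` reportedly maps `input` / `output` onto prompt / completion pricing, the unit would probably have to be recorded before that alias mapping discards the original field name.

```python
from enum import Enum

_ONE_MILLION = 1_000_000  # constant name from the issue; value assumed


class PriceUnit(Enum):
    """Hypothetical tag recording which unit a price field uses."""
    PER_TOKEN = "per_token"
    PER_MILLION = "per_million"


def unit_for_field(field_name: str) -> PriceUnit:
    """Infer the unit from the metadata field name.

    Assumption based on this issue: OpenRouter-style `prompt` / `completion`
    are dollars per token, while catalog-style `input` / `output` /
    `cache_read` are dollars per million tokens.
    """
    if field_name in ("prompt", "completion"):
        return PriceUnit.PER_TOKEN
    if field_name in ("input", "output", "cache_read"):
        return PriceUnit.PER_MILLION
    raise ValueError(f"unknown pricing field: {field_name!r}")


def to_per_million(value: float, unit: PriceUnit) -> float:
    """Normalize a price to dollars per million tokens."""
    if unit is PriceUnit.PER_TOKEN:
        return value * _ONE_MILLION
    return value  # already per-million: do not scale again


# A per-token price is scaled up; a per-million price passes through unchanged.
openrouter_style = to_per_million(0.00000004, unit_for_field("prompt"))
crof_style = to_per_million(0.04, unit_for_field("input"))
```

With this shape, both a per-token `prompt` price of 0.00000004 and a per-million `input` price of 0.04 normalize to roughly the same per-million rate.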

Expected behavior

Hermes should either:

  • distinguish per-token fields from per-million-token fields when extracting endpoint/catalog pricing, or
  • avoid estimating cost for custom endpoints unless the unit is known, or
  • allow Crof/free-tier custom routes to be marked included/zero-cost.

At minimum, Hermes should not persist huge estimated costs for small custom-endpoint sessions when the route is custom and the metadata uses per-million-style input / output fields.
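
The second and third options could be sketched roughly as follows. This is an illustration under stated assumptions, not the merged fix: `CostEstimate` and `estimate_cost` are hypothetical names, and the `"included"` and `"unknown"` status strings are assumed (only `"estimated"` appears in the recorded session above).

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class CostEstimate:
    """Hypothetical result type: a dollar amount plus how it was derived."""
    usd: Optional[float]
    status: str  # "estimated", "included", or "unknown" (last two assumed)


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    rates_per_million: Optional[Mapping[str, float]],
    *,
    unit_known: bool,
    included: bool = False,
) -> CostEstimate:
    """Estimate cost only when the pricing unit is known."""
    if included:
        # Free-tier / included route: record zero cost explicitly.
        return CostEstimate(usd=0.0, status="included")
    if not unit_known or rates_per_million is None:
        # Refuse to guess rather than persist a possibly inflated number.
        return CostEstimate(usd=None, status="unknown")
    usd = (
        input_tokens * rates_per_million["input"]
        + output_tokens * rates_per_million["output"]
    ) / 1_000_000
    return CostEstimate(usd=usd, status="estimated")


crof_rates = {"input": 0.04, "output": 0.15}
known = estimate_cost(12_998, 235, crof_rates, unit_known=True)
unknown = estimate_cost(12_998, 235, crof_rates, unit_known=False)
free = estimate_cost(12_998, 235, crof_rates, unit_known=True, included=True)
```

Returning `None` with an explicit status keeps an unknown cost distinguishable from a real $0, which matters if session totals are later summed.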
