hermes - Fix HTTP 402 (Payment Required) incorrectly retried as transient error, causing runaway token spend [2 pull requests]

When an LLM provider returns HTTP 402 (Payment Required — out of credits), Hermes retries the request up to agent.api_max_retries times (default: 3) as if it were a transient rate-limit or overload error. This is incorrect: a 402 is a permanent, non-retriable condition — retrying it does not resolve the underlying problem and burns additional tokens against a depleted balance.

Fix Action

Fixed

Reproduction

  1. Configure Hermes to use OpenRouter with a low or exhausted credit balance
  2. Send any message that triggers an LLM call
  3. Observe: Hermes retries the request 3x before surfacing an error
  4. Each retry consumes credits (or, if the account recovers mid-retry, charges the user multiple times)
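
The steps above can be simulated without a real provider. The following is a minimal sketch, not Hermes code: ExhaustedProvider, FakeResponse, and naive_call are hypothetical names, and the sketch assumes that "retries 3x" means three attempts in total. It shows how a loop that retries every failed response turns one doomed request into three provider calls.

```python
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Hypothetical stand-in for an HTTP response object."""
    status_code: int
    text: str = ""


class ExhaustedProvider:
    """Hypothetical provider whose credit balance is depleted."""

    def __init__(self) -> None:
        self.calls = 0

    def complete(self, prompt: str) -> FakeResponse:
        self.calls += 1
        return FakeResponse(402, "insufficient credits")


def naive_call(provider: ExhaustedProvider, prompt: str, max_attempts: int = 3) -> FakeResponse:
    """Retry every non-200 response, mirroring the reported behavior."""
    last = None
    for _ in range(max_attempts):
        last = provider.complete(prompt)
        if last.status_code == 200:
            return last
    raise RuntimeError(f"gave up after {max_attempts} attempts: HTTP {last.status_code}")


provider = ExhaustedProvider()
try:
    naive_call(provider, "hello")
except RuntimeError as exc:
    print(exc)             # gave up after 3 attempts: HTTP 402
print(provider.calls)      # 3
```

The call counter is the point of the sketch: the provider is hit three times even though the first 402 already made the outcome certain.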

Impact

Real-world cost: ~$40 burned in ~48 hours (May 2026) due to this behavior, compounded by a 24/7 gateway deployment routing Telegram + Discord traffic. The retry loop amplified every failed request into 3 charges before the user was notified.

Expected Behavior

HTTP 402 should be treated as non-retriable. The retry guard in the API call path should check for 402 explicitly and surface a clear user-facing error immediately:

'Provider returned 402: insufficient credits. Please top up your balance and try again.'
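
One way to produce that message is a small formatter keyed on the status code. This is only a sketch: describe_provider_error is a hypothetical helper, not an existing Hermes function, and the fallback wording for other codes is an assumption. http.HTTPStatus comes from the Python standard library.

```python
from http import HTTPStatus


def describe_provider_error(status_code: int) -> str:
    """Turn a provider status code into a message a user can act on."""
    if status_code == HTTPStatus.PAYMENT_REQUIRED:  # 402
        return ("Provider returned 402: insufficient credits. "
                "Please top up your balance and try again.")
    try:
        phrase = HTTPStatus(status_code).phrase
    except ValueError:
        # Code is not a registered HTTP status.
        phrase = "unknown status"
    return f"Provider returned {status_code}: {phrase}."


print(describe_provider_error(402))
```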

Suggested Fix

In the retry logic (likely run_agent.py or the model routing layer), add 402 to the non-retriable status code list alongside any other permanent errors:

NON_RETRIABLE_STATUS_CODES = {400, 401, 402, 403, 404, 422}

if response.status_code in NON_RETRIABLE_STATUS_CODES:
    raise PermanentProviderError(response.status_code, response.text)
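
For context, here is how that guard might sit inside a complete retry loop. This is a self-contained sketch under stated assumptions, not the actual Hermes implementation: Response, call_with_retries, the send callback, and the exponential backoff are all hypothetical, and PermanentProviderError is defined here only because the snippet above uses the name without defining it.

```python
import time
from dataclasses import dataclass
from typing import Callable

NON_RETRIABLE_STATUS_CODES = {400, 401, 402, 403, 404, 422}


@dataclass
class Response:
    """Hypothetical stand-in for an HTTP response object."""
    status_code: int
    text: str = ""


class PermanentProviderError(Exception):
    """Raised when retrying cannot change the outcome."""

    def __init__(self, status_code: int, body: str) -> None:
        super().__init__(f"Provider returned {status_code}: {body}")
        self.status_code = status_code
        self.body = body


def call_with_retries(send: Callable[[], Response],
                      max_attempts: int = 3,
                      base_delay: float = 0.0) -> Response:
    """Retry transient failures only; fail fast on permanent ones."""
    for attempt in range(1, max_attempts + 1):
        response = send()
        if response.status_code < 400:
            return response
        if response.status_code in NON_RETRIABLE_STATUS_CODES:
            # Permanent error such as 402: surface it without another attempt.
            raise PermanentProviderError(response.status_code, response.text)
        if attempt < max_attempts:
            # Transient error (e.g. 429 or 503): back off, then try again.
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"gave up after {max_attempts} attempts: HTTP {response.status_code}")


calls = []

def always_402() -> Response:
    calls.append(1)
    return Response(402, "insufficient credits")

try:
    call_with_retries(always_402)
except PermanentProviderError as exc:
    print(exc.status_code, len(calls))  # 402 1
```

With the guard in place, a 402 costs exactly one provider call, while codes outside the set (such as 429) still go through the normal retry path.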

Environment

  • Hermes version: latest (May 2026)
  • Provider: OpenRouter
  • Platform: Windows 10, gateway mode (Telegram + Discord)
  • Config: agent.api_max_retries: 3 (default)

Notes

This is distinct from the UX issue of recommending OpenRouter without any cost disclosure, which is a model knowledge problem. The behavior reported here is a code defect in Hermes's retry logic, and it applies to any pay-per-token provider that returns 402.
