hermes: [RFC] Token budget system with auto-continuation for bounded-cost sessions

Motivation

In long autonomous sessions (delegate_task, cron jobs), per-turn token consumption can spiral. A single bad loop can burn through the entire context budget in a few turns.

Currently there is no per-turn budget control. The only safeguard is the hard context window limit, which kills the session when hit.

Proposal: Per-turn Token Budget

This RFC proposes an optional budget system that caps token consumption per turn:

  1. Configurable max_tokens_per_turn — default around 8000, overridable per task
  2. Soft cap at 80 percent — warn the model it is approaching the limit
  3. Hard cap at 100 percent — truncate the response and force a result
  4. Auto-continuation — if the model indicates it needs more, allow a follow-up turn
  5. Per-session tracking — cumulative spend across continuation turns
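
The cap logic in the list above could be sketched as follows. This is a minimal illustration, not an existing hermes API: `TurnBudget`, `BudgetState`, `soft_cap_percent`, and `budget_for_task` are hypothetical names; only `max_tokens_per_turn`, the default of around 8000, and the 80/100 percent thresholds come from the proposal.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class BudgetState(Enum):
    OK = "ok"
    SOFT_CAP = "soft_cap"  # at or above 80 percent: warn the model
    HARD_CAP = "hard_cap"  # at or above 100 percent: truncate and force a result


@dataclass(frozen=True)
class TurnBudget:
    # Hypothetical config object; field names other than
    # max_tokens_per_turn are assumptions made for this sketch.
    max_tokens_per_turn: int = 8000
    soft_cap_percent: int = 80

    @property
    def soft_cap_tokens(self) -> int:
        # Integer arithmetic avoids floating-point edge cases at the threshold.
        return self.max_tokens_per_turn * self.soft_cap_percent // 100

    def state(self, tokens_used: int) -> BudgetState:
        """Classify the tokens consumed so far in the current turn."""
        if tokens_used >= self.max_tokens_per_turn:
            return BudgetState.HARD_CAP
        if tokens_used >= self.soft_cap_tokens:
            return BudgetState.SOFT_CAP
        return BudgetState.OK


def budget_for_task(default: TurnBudget, override: Optional[int] = None) -> TurnBudget:
    """Return the default budget, or a copy with a per-task token limit."""
    if override is None:
        return default
    return TurnBudget(max_tokens_per_turn=override,
                      soft_cap_percent=default.soft_cap_percent)
```

With the default budget the soft cap falls at 6400 tokens and the hard cap at 8000; a task that passes `override=2000` would be warned at 1600 tokens instead.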

Use cases

  • Kanban workers — guarantee bounded cost per task
  • delegate_task subagents — prevent runaway spend on a single delegated job
  • Cron jobs — keep recurring tasks at predictable token consumption
  • Interactive sessions — optional cost-limiter for power users

Design notes

  • The token budget should be enforced per turn, not per session (the context window limit already bounds the session as a whole)
  • The soft-cap warning should be injected into the system prompt for that turn
  • Truncation at hard cap should be done on the assistant response, not on tool outputs
  • Auto-continuation should be opt-in: the model explicitly requests it
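
One way the hard cap, opt-in continuation, and cumulative tracking might fit together is sketched below. Everything here is an assumption made for illustration: the `CONTINUE_MARKER` token, the `run_session` function, the `generate_turn` callback standing in for a model call, and the `max_continuations` limit do not appear in the RFC. The model's response is represented as a plain list of tokens, and truncation is applied only to that assistant response.

```python
from typing import Callable, List, Tuple

# Hypothetical opt-in signal: the model ends its response with this token
# when it explicitly requests a follow-up turn.
CONTINUE_MARKER = "[[CONTINUE]]"


def run_session(
    generate_turn: Callable[[int], List[str]],
    max_tokens_per_turn: int = 8000,
    max_continuations: int = 3,  # assumed safety limit, not part of the RFC
) -> Tuple[List[str], List[int]]:
    """Run one task with a per-turn hard cap and opt-in continuation.

    Returns the combined output tokens and the per-turn spend; the sum of
    the spend list is the cumulative session spend.
    """
    output: List[str] = []
    spend: List[int] = []

    for turn_index in range(1 + max_continuations):
        tokens = generate_turn(turn_index)

        # Hard cap: truncate the assistant response at 100 percent.
        kept = tokens[:max_tokens_per_turn]
        spend.append(len(kept))

        # Continuation is opt-in: only a marker that survives truncation
        # counts as a request for another turn.
        wants_more = bool(kept) and kept[-1] == CONTINUE_MARKER
        if wants_more:
            kept = kept[:-1]  # the marker is a control signal, not output

        output.extend(kept)
        if not wants_more:
            break

    return output, spend
```

In this sketch a response that is cut off by the hard cap loses its trailing marker and therefore ends the task, which matches the "truncate and force a result" behaviour; a model that wants another turn has to ask for it before reaching the cap, which is where the soft-cap warning would help.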
