hermes - ✅ (Solved) Fix: Kanban defaults can auto-launch unbounded paid worker swarms across all boards [1 pull request, 1 participant]

GitHub stats
NousResearch/hermes-agent#29034 (fetched 2026-05-20 04:00:29)
Comments: 0 · Participants: 1 · Timeline events: 6 · Reactions: 0
Timeline (top): labeled ×4, cross-referenced ×2

The current Kanban defaults can unexpectedly launch many paid model-backed worker agents across every local board as soon as the gateway is running.

This is not only a single task retry bug. The unsafe default shape is the combination of:

  • kanban.dispatch_in_gateway defaults to true
  • the gateway-embedded dispatcher sweeps every non-archived board every tick
  • kanban.auto_decompose defaults to true
  • no safe default max_spawn / max_in_progress cap is applied on personal installs
  • workers are full hermes -p <profile> ... chat -q ... model sessions, often inheriting the user's paid default provider/model

For users on ChatGPT/Codex, Claude, Grok, or other quota-backed providers, a board state change can silently become a multi-agent paid workload.
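The fan-out described above can be illustrated with a small model of one dispatcher tick. This is a sketch under stated assumptions, not Hermes code: `workers_spawned_in_tick` is a hypothetical helper, the board counts are illustrative, and `max_spawn` is assumed to be a host-wide cap on new workers per tick (the issue does not spell out its exact semantics).

```python
# Hypothetical model of one dispatcher sweep; not taken from Hermes.
from typing import Dict, Optional


def workers_spawned_in_tick(
    ready_per_board: Dict[str, int],
    max_spawn: Optional[int] = None,
) -> int:
    """Count the workers one sweep would start across all boards.

    ready_per_board maps a board name to its number of dispatchable cards.
    max_spawn is an assumed host-wide cap on new workers per tick; None
    models the uncapped behaviour described in the issue.
    """
    total = 0
    for ready in ready_per_board.values():
        if max_spawn is None:
            # No cap: every dispatchable card becomes a worker session.
            total += ready
            continue
        remaining = max_spawn - total
        if remaining <= 0:
            break
        total += min(ready, remaining)
    return total


# Illustrative board state (counts are made up for the example).
boards = {"board-a": 8, "board-b": 7, "board-c": 3, "board-d": 2}
uncapped = workers_spawned_in_tick(boards)
capped = workers_spawned_in_tick(boards, max_spawn=1)
print(f"uncapped={uncapped} capped={capped}")
```

With no cap, the number of model sessions grows with the number of boards times the cards on each; with a cap of one, a single tick starts at most one session regardless of how many boards exist.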

Root Cause

Kanban is useful, but it changes Hermes from an interactive agent into an autonomous worker fleet. That needs a stronger default safety model than ordinary chat or cron. The current defaults make the fleet behavior too easy to activate accidentally and too hard to notice until quota or host resources are already being consumed.

Fix Action

Fix / Workaround

As a local workaround, the reporter disabled embedded dispatch and added conservative caps: kanban.dispatch_in_gateway: false, kanban.auto_decompose: false, kanban.max_spawn: 1, and kanban.max_in_progress: 1. The upstream fix, PR #29043, makes gateway-embedded dispatch and auto-decompose opt-in by default.

PR fix notes

PR #29043: fix(kanban): make gateway dispatch opt-in by default

Description (problem / solution / changelog)

Summary

  • Make gateway-embedded Kanban dispatch opt-in instead of default-on.
  • Make Kanban auto-decompose opt-in instead of default-on to avoid unexpected model-backed worker fanout.
  • Update CLI guidance, diagnostics, docs, config example, and the default-config regression test to match the safer defaults.

Verification

  • python -m py_compile hermes_cli\config.py gateway\run.py hermes_cli\kanban.py hermes_cli\kanban_diagnostics.py
  • uv run --with pytest pytest -o addopts="" tests\hermes_cli\test_kanban_core_functionality.py -k "dispatch_in_gateway or dispatcher_presence" -q -> 5 passed
  • git diff --check

Fixes #29034

Changed files

  • AGENTS.md (modified, +1104/-1104)
  • cli-config.yaml.example (modified, +1108/-1095)
  • gateway/run.py (modified, +18205/-18205)
  • hermes_cli/config.py (modified, +5545/-5545)
  • hermes_cli/kanban.py (modified, +2678/-2677)
  • hermes_cli/kanban_diagnostics.py (modified, +1058/-1058)
  • tests/hermes_cli/test_kanban_core_functionality.py (modified, +4418/-4416)
  • website/docs/reference/cli-commands.md (modified, +1265/-1265)
  • website/docs/user-guide/features/kanban.md (modified, +850/-850)


Impact

High quota / billing risk and poor operator predictability.

On a personal macOS install, the gateway-embedded dispatcher spawned workers across multiple local boards for roughly an hour. The WebUI itself had no active user run, but background Kanban workers were consuming OpenAI Codex / ChatGPT quota.

Observed patterns in local logs included per-tick batches like:

kanban dispatcher [board-a]: spawned=8 ...
kanban dispatcher [board-b]: spawned=7 ...
kanban dispatcher [board-b]: spawned=10 ...
kanban dispatcher [board-b]: spawned=12 ...
kanban dispatcher [board-c]: spawned=3 ...
kanban dispatcher [board-d]: spawned=2 ...

Worker logs showed these were real paid model calls, not lightweight queue bookkeeping:

Provider: openai-codex  Model: gpt-5.5
Endpoint: https://chatgpt.com/backend-api/codex
...
Primary model failed — switching to fallback: gpt-5.4 via openai-codex

Some workers retried, compressed long contexts many times, or fell back to another paid model before exiting. That made quota drain much faster than the number of visible cards suggested.
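To put a number on the dispatcher excerpt quoted above, the `spawned=` counts can be tallied per board. The sketch below is a minimal example, assuming each quoted line is one dispatch tick and that `spawned=N` counts newly started workers; the regular expression matches only the prefix visible in the excerpt and ignores the elided tail. `tally_spawns` is a hypothetical helper, not part of Hermes.

```python
import re
from collections import Counter

# The six lines quoted in this report, including their elided tails.
LOG_LINES = """\
kanban dispatcher [board-a]: spawned=8 ...
kanban dispatcher [board-b]: spawned=7 ...
kanban dispatcher [board-b]: spawned=10 ...
kanban dispatcher [board-b]: spawned=12 ...
kanban dispatcher [board-c]: spawned=3 ...
kanban dispatcher [board-d]: spawned=2 ...
""".splitlines()

# Matches only the visible prefix; anything after the count is ignored.
PATTERN = re.compile(
    r"kanban dispatcher \[(?P<board>[^\]]+)\]: spawned=(?P<n>\d+)"
)


def tally_spawns(lines):
    """Return the total spawned-worker count per board."""
    per_board = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            per_board[match.group("board")] += int(match.group("n"))
    return per_board


per_board = tally_spawns(LOG_LINES)
total = sum(per_board.values())
print(dict(per_board), "total:", total)
```

Even this short excerpt adds up to 42 spawned workers, 29 of them on board-b alone, before counting the retries and fallbacks mentioned above.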

Environment

  • Hermes Agent: current main at the time of report, plus v0.14-era Kanban behavior
  • Host: macOS personal install
  • Gateway: embedded dispatcher enabled through normal config
  • Config before mitigation:
kanban:
  dispatch_in_gateway: true
  dispatch_interval_seconds: 60
  failure_limit: 2

The config set no max_spawn and no max_in_progress, and did not set auto_decompose: false.
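A config like the one above can be checked mechanically for the risky combination this issue describes. The following is a minimal sketch, not a Hermes feature: `audit_kanban_config` and `REPORTED_DEFAULTS` are hypothetical names, the defaults are the pre-fix values as reported in this issue, and missing cap keys are assumed to mean "no cap". It works on an already-parsed `kanban:` mapping, so no YAML library is needed.

```python
from typing import Any, Dict, List

# Pre-fix defaults as reported in this issue (assumption: keys missing
# from the user's config fall back to these values).
REPORTED_DEFAULTS: Dict[str, Any] = {
    "dispatch_in_gateway": True,
    "auto_decompose": True,
    "max_spawn": None,        # None is used here to mean "no cap"
    "max_in_progress": None,  # None is used here to mean "no cap"
}


def audit_kanban_config(kanban: Dict[str, Any]) -> List[str]:
    """Return warnings for settings that allow unattended fan-out."""
    effective = {**REPORTED_DEFAULTS, **kanban}
    warnings: List[str] = []
    if effective["dispatch_in_gateway"]:
        warnings.append("dispatch_in_gateway is enabled")
    if effective["auto_decompose"]:
        warnings.append("auto_decompose is enabled")
    for key in ("max_spawn", "max_in_progress"):
        if effective[key] is None:
            warnings.append(f"{key} is not set")
    return warnings


# The reporter's config before mitigation.
before = {
    "dispatch_in_gateway": True,
    "dispatch_interval_seconds": 60,
    "failure_limit": 2,
}
# The reporter's mitigated config from this report.
after = {
    "dispatch_in_gateway": False,
    "dispatch_interval_seconds": 60,
    "failure_limit": 2,
    "max_spawn": 1,
    "max_in_progress": 1,
    "auto_decompose": False,
}

warnings_before = audit_kanban_config(before)
warnings_after = audit_kanban_config(after)
for warning in warnings_before:
    print("WARN:", warning)
```

The pre-mitigation config trips all four checks, while the mitigated config used by the reporter passes cleanly.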

Actual behavior

Starting the gateway is enough to start the embedded dispatcher. It sweeps all local boards, auto-decomposes triage work by default, promotes eligible work, and spawns assigned workers without a safe default per-host cap.

This means a user can think they are only running the gateway/WebUI, while Hermes is actually running many background model-backed worker sessions.

Expected behavior

The default behavior should fail safe for personal installs and paid/quota-backed providers.

At minimum, one of these should be true by default:

  • embedded Kanban dispatch is opt-in, not on by default; or
  • the default embedded dispatcher is capped to a very small concurrency, e.g. one worker total; or
  • auto-decompose is manual by default; or
  • the dashboard/gateway requires an explicit operator acknowledgement before launching paid workers; or
  • there is a global budget/quota/rate guard before spawning worker swarms.

A user should not need to discover this only after seeing provider quota drop.

Related issues

This broader default-safety issue overlaps with, but is not identical to:

  • #29014 — blocked/manual-gate tasks repeatedly respawn and drain provider quota
  • #29027 — review-required: blocks are retried and duplicate work
  • #28805 — config surface for dispatcher concurrency caps
  • #28706 — CPU spikes from uncoordinated Kanban worker parallelism
  • #28992 — proposal to use oneshot mode for workers instead of long chat -q

Those are concrete failure modes. This issue is about the higher-level product/runtime default: the gateway should not silently turn local board state into uncapped paid background model work.

Local mitigation used

Disabled embedded dispatch and added conservative caps for future re-enable:

kanban:
  dispatch_in_gateway: false
  dispatch_interval_seconds: 60
  failure_limit: 2
  max_spawn: 1
  max_in_progress: 1
  auto_decompose: false

After this, the gateway can be restarted without immediately resuming the worker swarm.

Suggested fixes

  1. Change first-run / default config to kanban.dispatch_in_gateway: false, or gate it behind explicit dashboard/CLI enablement.
  2. If embedded dispatch remains default-on, apply a safe global default cap such as kanban.max_in_progress: 1.
  3. Make kanban.auto_decompose manual by default, or require a board-level opt-in.
  4. Add a visible dashboard warning when auto-dispatch or auto-decompose is enabled and the active provider is quota-backed.
  5. Add a global per-host spawn/budget guard so one gateway cannot fan out across all boards without operator intent.
  6. Surface an emergency pause switch in the dashboard and CLI that parks dispatch without stopping the whole gateway.
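Suggestions 5 and 6 could share one mechanism: a host-wide guard that every board's dispatcher consults before spawning. The sketch below shows one possible shape, assuming a single-process dispatcher; `SpawnGuard`, `try_acquire`, and `release` are hypothetical names, not Hermes APIs, and a real implementation would need locking or a shared store if dispatch ran in several processes.

```python
from dataclasses import dataclass


@dataclass
class SpawnGuard:
    """Hypothetical host-wide guard consulted before each worker spawn."""

    max_in_progress: int = 1  # cap across all boards, not per board
    paused: bool = False      # emergency pause: parks dispatch only
    in_progress: int = 0

    def try_acquire(self) -> bool:
        """Reserve a worker slot; False means the caller must not spawn."""
        if self.paused or self.in_progress >= self.max_in_progress:
            return False
        self.in_progress += 1
        return True

    def release(self) -> None:
        """Free a slot when a worker exits."""
        self.in_progress = max(0, self.in_progress - 1)


guard = SpawnGuard(max_in_progress=1)

# Three boards each ask to spawn a worker in the same tick.
results = [guard.try_acquire() for _ in range(3)]

# The running worker exits, then the operator pauses dispatch.
guard.release()
guard.paused = True
after_pause = guard.try_acquire()
print(results, after_pause)
```

Because the pause flag lives in the guard rather than in the gateway lifecycle, dispatch can be parked while the gateway and WebUI keep running, which is the behaviour suggestion 6 asks for.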

