hermes: Kanban dispatcher repeatedly respawns blocked/manual-gate tasks, causing provider quota drain

NousResearch/hermes-agent#29014 (fetched 2026-05-20 04:00:39)

Bug Description

The embedded gateway Kanban dispatcher can repeatedly respawn workers for tasks that are already blocked/manual-gated, causing rapid provider quota drain. In this incident, the dispatcher spawned full Hermes worker agents every tick for approval-gate / governance-gate / dirty-tree-preflight tasks that immediately re-blocked themselves.

This created a loop where workers repeatedly consumed OpenAI Codex quota without making progress.

Impact

High cost / quota risk. Four workers were being spawned roughly every dispatcher tick (about every two minutes, per the gateway log timestamps) across multiple boards, even though the tasks were blocked or required explicit human approval.

Observed task attempt counts:

  • t_dd4e99d0 on trt-rebuild-mission: 139 runs
  • t_10a6f605 on kiln-phase2-1-hyperframes-borrow: 133 runs
  • t_e37e1762 on trt-010-wordpress-template-port: 50 runs
  • t_c489a7c7 on kiln-phase2-1-hyperframes-borrow: 46 runs

Environment

  • Hermes Agent: v0.14.0 (2026.5.16)
  • Commit: 64a9a199b
  • Host: WSL
  • Gateway: systemd user service, hermes_cli.main gateway run --replace
  • Kanban dispatcher: embedded gateway dispatcher
  • Config before mitigation: kanban.dispatch_in_gateway: true
  • Config after mitigation: kanban.dispatch_in_gateway: false

Actual Behavior

Gateway logs showed repeated spawning/promoting of the same blocked/manual-gate tasks:

2026-05-19 19:50:32,930 INFO gateway.run: kanban dispatcher [kiln-phase2-1-hyperframes-borrow]: spawned=1 reclaimed=0 crashed=0 timed_out=0 promoted=1 auto_blocked=0
2026-05-19 19:50:32,930 INFO gateway.run: kanban dispatcher [trt-010-wordpress-template-port]: spawned=1 reclaimed=0 crashed=0 timed_out=0 promoted=1 auto_blocked=0
2026-05-19 19:50:32,930 INFO gateway.run: kanban dispatcher [trt-rebuild-mission]: spawned=1 reclaimed=0 crashed=0 timed_out=0 promoted=1 auto_blocked=0
2026-05-19 19:52:33,252 INFO gateway.run: kanban dispatcher [kiln-phase2-1-hyperframes-borrow]: spawned=2 reclaimed=0 crashed=0 timed_out=0 promoted=2 auto_blocked=0
2026-05-19 19:54:33,579 INFO gateway.run: kanban dispatcher [kiln-phase2-1-hyperframes-borrow]: spawned=1 reclaimed=0 crashed=0 timed_out=0 promoted=1 auto_blocked=0
2026-05-19 19:54:33,580 INFO gateway.run: kanban dispatcher [trt-010-wordpress-template-port]: spawned=1 reclaimed=0 crashed=0 timed_out=0 promoted=1 auto_blocked=0
2026-05-19 19:54:33,580 INFO gateway.run: kanban dispatcher [trt-rebuild-mission]: spawned=1 reclaimed=0 crashed=0 timed_out=0 promoted=1 auto_blocked=0
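
To gauge how fast a loop like this burns through spawns, the dispatcher summary lines can be tallied per board. The following is a minimal sketch, not part of Hermes: the regular expression is inferred only from the log lines quoted above, and `tally_spawns` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern inferred from the dispatcher summary lines quoted in this report.
LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d+ INFO gateway\.run: "
    r"kanban dispatcher \[(?P<board>[^\]]+)\]: (?P<counters>.*)$"
)

def tally_spawns(lines):
    """Sum the spawned= counter per board across dispatcher log lines."""
    totals = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a per-board dispatcher summary line
        counters = dict(
            pair.split("=", 1) for pair in match.group("counters").split()
        )
        totals[match.group("board")] += int(counters.get("spawned", 0))
    return totals

PREFIX = "INFO gateway.run: kanban dispatcher"
SUFFIX = "reclaimed=0 crashed=0 timed_out=0"
sample = [
    f"2026-05-19 19:50:32,930 {PREFIX} [kiln-phase2-1-hyperframes-borrow]: spawned=1 {SUFFIX} promoted=1 auto_blocked=0",
    f"2026-05-19 19:50:32,930 {PREFIX} [trt-010-wordpress-template-port]: spawned=1 {SUFFIX} promoted=1 auto_blocked=0",
    f"2026-05-19 19:50:32,930 {PREFIX} [trt-rebuild-mission]: spawned=1 {SUFFIX} promoted=1 auto_blocked=0",
    f"2026-05-19 19:52:33,252 {PREFIX} [kiln-phase2-1-hyperframes-borrow]: spawned=2 {SUFFIX} promoted=2 auto_blocked=0",
    f"2026-05-19 19:54:33,579 {PREFIX} [kiln-phase2-1-hyperframes-borrow]: spawned=1 {SUFFIX} promoted=1 auto_blocked=0",
    f"2026-05-19 19:54:33,580 {PREFIX} [trt-010-wordpress-template-port]: spawned=1 {SUFFIX} promoted=1 auto_blocked=0",
    f"2026-05-19 19:54:33,580 {PREFIX} [trt-rebuild-mission]: spawned=1 {SUFFIX} promoted=1 auto_blocked=0",
]
totals = tally_spawns(sample)
```

On the seven lines quoted above this gives 4, 2 and 2 spawns for the three boards, i.e. eight worker spawns in roughly four minutes.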

Representative task run tails:

--- trt-rebuild-mission t_dd4e99d0
136 blocked default 25s 2026-05-19 19:55
137 blocked default 23s 2026-05-19 20:02
138 crashed default 1m 2026-05-19 20:03
139 blocked default 24s 2026-05-19 20:04

--- trt-010-wordpress-template-port t_e37e1762
47 crashed default 6m 2026-05-19 19:55
48 crashed default 2m 2026-05-19 20:02
49 reclaimed default 3m 2026-05-19 20:04
50 blocked default 0s 2026-05-19 20:07

--- kiln-phase2-1-hyperframes-borrow t_c489a7c7
43 crashed kiln-pm 6m 2026-05-19 19:55
44 crashed kiln-pm 2m 2026-05-19 20:02
45 reclaimed kiln-pm 3m 2026-05-19 20:04
46 blocked kiln-pm 0s 2026-05-19 20:07

--- kiln-phase2-1-hyperframes-borrow t_10a6f605
130 crashed kiln-mcp-dev 1m 2026-05-19 20:02
131 crashed kiln-mcp-dev 1m 2026-05-19 20:03
132 reclaimed kiln-mcp-dev 3m 2026-05-19 20:04
133 blocked kiln-mcp-dev 0s 2026-05-19 20:07

None of these runs did productive work. Each worker ended by concluding some variation of:

  • approval required before any staging write
  • governance decision required before Phase 3
  • repo dirty, block rather than overwrite unrelated work
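
Because the wording of these conclusions varies from run to run, any logic that compares block reasons would first need to collapse them into stable categories. A minimal sketch of such a normalizer follows; the category labels and keyword patterns are assumptions derived only from the three conclusions listed above, not from Hermes itself.

```python
import re

# Hypothetical categories; patterns are guessed from the three conclusions
# quoted in this report and would need tuning against real block reasons.
REASON_PATTERNS = [
    ("approval-required", re.compile(r"\bapproval\b")),
    ("governance-decision", re.compile(r"\bgovernance\b")),
    ("dirty-tree", re.compile(r"\b(dirty|uncommitted)\b")),
]

def normalize_reason(reason: str) -> str:
    """Map a free-text block reason to a coarse category label."""
    text = " ".join(reason.lower().split())  # lowercase, collapse whitespace
    for label, pattern in REASON_PATTERNS:
        if pattern.search(text):
            return label
    return "other"
```

For example, `normalize_reason("repo dirty, block rather than overwrite unrelated work")` returns `"dirty-tree"`, while a reason that matches none of the patterns falls back to `"other"`.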

Expected Behavior

Once a task is blocked, especially with a human approval / manual gate / dirty-tree preflight reason, the dispatcher should not promote or respawn it on future ticks until an explicit unblock or state-changing action occurs.

A task that repeatedly blocks with the same reason should trip a circuit breaker and stay parked instead of consuming provider quota every tick.
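
The behavior described here can be sketched as a small circuit breaker that the dispatcher would consult before spawning. This is an illustrative sketch only: `BlockCircuitBreaker`, its methods, and the default threshold of 3 are hypothetical and do not correspond to any known Hermes API.

```python
from collections import defaultdict

class BlockCircuitBreaker:
    """Parks a task after `threshold` consecutive blocks with the same reason."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self._streaks = defaultdict(int)  # (task_id, reason) -> consecutive blocks
        self._parked = set()              # task ids that must not be respawned

    @staticmethod
    def _key(task_id: str, reason: str):
        # Light normalization so trivial wording differences share a key.
        return task_id, " ".join(reason.lower().split())

    def _clear(self, task_id: str, keep=None):
        for key in [k for k in self._streaks if k[0] == task_id and k != keep]:
            del self._streaks[key]

    def record_block(self, task_id: str, reason: str) -> bool:
        """Record a blocked outcome; return True if the task is now parked."""
        key = self._key(task_id, reason)
        self._clear(task_id, keep=key)  # a different reason starts a new streak
        self._streaks[key] += 1
        if self._streaks[key] >= self.threshold:
            self._parked.add(task_id)
        return task_id in self._parked

    def may_spawn(self, task_id: str) -> bool:
        """The dispatcher would check this before promoting or spawning."""
        return task_id not in self._parked

    def unblock(self, task_id: str) -> None:
        """Explicit human unblock: clear the parked flag and all streaks."""
        self._parked.discard(task_id)
        self._clear(task_id)
```

With a threshold of 3, a task such as t_dd4e99d0 would have been parked after its third identical block rather than reaching run 139, and only an explicit `unblock` call would make it spawnable again.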

Mitigation Used

The emergency mitigation was to disable embedded gateway dispatch globally:

kanban:
  dispatch_in_gateway: false

Then restart the gateway. After restart, the log confirmed:

2026-05-19 20:06:40,289 INFO gateway.run: kanban dispatcher: disabled via config kanban.dispatch_in_gateway=false

No active work kanban task ... workers remained after parking/reclaiming the tasks.

Suspected Root Cause

Unknown, but likely one of:

  1. The dispatcher promotion logic treats certain blocked tasks as promotable every tick.
  2. Manual-gate tasks were left in a state that promote_ready / dependency recomputation turns back into ready despite their blocked status.
  3. The circuit breaker/failure limit does not apply to repeated blocked outcomes, only spawn failures/crashes/timeouts.
  4. Multi-board embedded dispatch has no global safety cap for repeated same-task spawn/block cycles.

Suggested Fixes

  • Do not auto-promote tasks from blocked unless there is an explicit unblock or approved state transition.
  • Add a repeated-block circuit breaker keyed by (task_id, normalized_reason).
  • Add a dispatcher-level safety cap, e.g. max same-task spawns per hour/day.
  • Emit a high-severity gateway warning when a task is spawned and blocked more than N times in a short window.
  • Consider treating "approval-required", "manual gate", "dirty tree", and "governance decision needed" block reasons as hard parks.
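
The dispatcher-level safety cap and the high-severity warning from the list above could be combined in a sliding-window counter. The sketch below is one possible shape under stated assumptions: `SpawnCap`, the logger name, and the limits are hypothetical, and the caller is assumed to supply a monotonic timestamp in seconds.

```python
import logging
from collections import defaultdict, deque

log = logging.getLogger("kanban.spawn_cap")  # hypothetical logger name

class SpawnCap:
    """Allows at most `max_spawns` spawns per task within a sliding window."""

    def __init__(self, max_spawns: int, window_seconds: float):
        self.max_spawns = max_spawns
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # task_id -> recent spawn timestamps

    def allow(self, task_id: str, now: float) -> bool:
        """Return True and record the spawn, or False if the cap is reached."""
        history = self._history[task_id]
        # Drop timestamps that have aged out of the window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_spawns:
            log.warning(
                "task %s hit spawn cap: %d spawns within %.0fs",
                task_id, len(history), self.window_seconds,
            )
            return False
        history.append(now)
        return True
```

A caller would pass something like `time.monotonic()` as `now`. With `SpawnCap(max_spawns=3, window_seconds=3600)` and a two-minute tick, a looping task would be refused on its fourth attempt and stay refused until the oldest spawn leaves the one-hour window.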

Workaround

Set kanban.dispatch_in_gateway: false and restart the gateway. Then manually reclaim/block the affected tasks.
