hermes kanban: workers spawned under custom profiles read profile-scoped kanban.db, not the host DB → infinite crash loop [2 comments, 2 participants]

GitHub stats
NousResearch/hermes-agent#18959 · Fetched 2026-05-03 04:53:19
Comments: 2 · Participants: 2 · Timeline events: 6 (labeled ×4, commented ×2) · Reactions: 0

Tasks created via hermes kanban create land in the host kanban DB at ~/.hermes/kanban.db. When the in-gateway dispatcher spawns a worker under a non-default profile, the worker instead reads its profile-scoped ~/.hermes/profiles/<profile>/kanban.db. That file is empty, so the worker's kanban_show <task_id> call returns "task not found" and the worker exits. The dispatcher reclaims the task as crashed, respawns the worker, and it crashes again. The loop repeats until the task is manually archived, and every cycle burns API tokens.

Error Message

┊ ⚡ kanban_sh 0.0s [error]

Root Cause

The dispatcher runs in the host context and reads and writes ~/.hermes/kanban.db. A worker spawned under a non-default profile resolves its DB path inside its own profile directory, ~/.hermes/profiles/<profile>/kanban.db, which is empty. The two processes never see the same tasks table, so every worker run ends in "task not found" and is recorded as a crash.



Environment

  • v0.12.0 (current main, head b208d03d1)
  • Linux (placer-jy01), Python venv install
  • Gateway running in-process (PID-tracked, not under systemd in this case but the embedded dispatcher behaves the same)
  • Three custom profiles on disk under ~/.hermes/profiles/{iris,hephaestus,calliope}/
  • Default delegation: google/gemini-2.5-flash via Nous (verified working via delegate_task)

Reproducer

hermes kanban init
hermes kanban create "shakedown: confirm dispatcher spawns iris" \
  --body "Reply with one line confirming receipt, then call kanban_complete with summary='ok'." \
  --assignee iris --priority 100 --max-runtime 5m
hermes kanban dispatch
# wait ~60s
hermes kanban show <task_id>

Repeated with and without --tenant minerva. Same outcome both ways, so this is not tenant-scoped — it reproduces with tenant=NULL too.

Observed event stream

[19:43] created {'assignee': 'iris', 'status': 'ready', 'tenant': 'minerva'}
[19:43] [run 1] claimed {'lock': 'hermes:4145577', ...}
[19:43] [run 1] spawned {'pid': 4145579}
[19:44] [run 1] crashed {'pid': 4145579}
[19:44] [run 2] claimed {'lock': 'hermes:4128822', ...}
[19:44] [run 2] spawned {'pid': 4148273}
[19:45] [run 2] crashed ...
# ... continues until manual archive
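
To make the loop visible without reading the stream by eye, the events can be tallied per run. This is a minimal sketch that assumes the line format shown above ("[HH:MM] [run N] event ..."); the helper name crashed_runs is made up for illustration and is not part of hermes.

```python
import re

# Event lines copied from the stream above.
EVENTS = """\
[19:43] created {'assignee': 'iris', 'status': 'ready', 'tenant': 'minerva'}
[19:43] [run 1] claimed {'lock': 'hermes:4145577', ...}
[19:43] [run 1] spawned {'pid': 4145579}
[19:44] [run 1] crashed {'pid': 4145579}
[19:44] [run 2] claimed {'lock': 'hermes:4128822', ...}
[19:44] [run 2] spawned {'pid': 4148273}
[19:45] [run 2] crashed ...
"""

# Matches "[19:44] [run 2] crashed ..." and captures the run number and event name.
LINE_RE = re.compile(r"^\[\d\d:\d\d\] \[run (\d+)\] (\w+)")


def crashed_runs(text: str) -> list[int]:
    """Return the run numbers whose last recorded event is 'crashed'."""
    last_event: dict[int, str] = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            last_event[int(match.group(1))] = match.group(2)
    return [run for run, event in sorted(last_event.items()) if event == "crashed"]


print(crashed_runs(EVENTS))  # [1, 2]
```

Every run that was claimed and spawned ends in a crash within about a minute, which is the signature of the respawn loop rather than of a one-off worker failure.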

Worker stdout (captured via hermes kanban log <id>):

Profile: iris
35 tools · 98 skills · /help for commands

Query: work kanban task t_faa08ca4
Initializing agent...

  ┊ 📋 preparing kanban_show…
  ┊ ⚡ kanban_sh   0.0s [error]

╭─ ⚕ Hermes ───────────────────────────────────────────────────────────────────╮
    Task t_faa08ca4 not found in the kanban board. It may have been archived,
    deleted, or the ID may be incorrect.

    Could you double-check the task ID? You can list current tasks with:

      hermes kanban list
╰──────────────────────────────────────────────────────────────────────────────╯

Smoking gun — two databases

import sqlite3
for p in ["/home/hermes/.hermes/kanban.db",
          "/home/hermes/.hermes/profiles/iris/kanban.db"]:
    print("==", p)
    c = sqlite3.connect(p)
    for r in c.cursor().execute(
        "select id,title,tenant,status from tasks order by created_at desc limit 5"):
        print(r)

# Output:
# == /home/hermes/.hermes/kanban.db
# ('t_625a5f76', 'shakedown 2: no tenant', None, 'running')
# ('t_faa08ca4', 'shakedown: confirm dispatcher spawns iris', 'minerva', 'archived')
# == /home/hermes/.hermes/profiles/iris/kanban.db
# (empty)

The dispatcher (host context) writes/reads the host DB. The worker (profile context) reads its profile-scoped DB. They never meet.
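
The split can be illustrated with a toy path resolver. This is a sketch of the suspected behaviour, not the actual hermes code: both function names are hypothetical, and the only facts taken from the issue are the two file locations.

```python
from pathlib import Path
from typing import Optional


def profile_scoped_db(hermes_home: Path, profile: Optional[str]) -> Path:
    """Suspected current behaviour: the DB lives under the active profile's directory."""
    if profile is None:  # default profile, i.e. the host context
        return hermes_home / "kanban.db"
    return hermes_home / "profiles" / profile / "kanban.db"


def canonical_db(hermes_home: Path, profile: Optional[str]) -> Path:
    """One DB per installation: the profile is deliberately ignored."""
    return hermes_home / "kanban.db"


home = Path("/home/hermes/.hermes")

# Dispatcher (host context) and worker (profile iris) resolve different files.
print(profile_scoped_db(home, None))    # /home/hermes/.hermes/kanban.db
print(profile_scoped_db(home, "iris"))  # /home/hermes/.hermes/profiles/iris/kanban.db

# With a canonical resolver both sides agree.
print(canonical_db(home, None) == canonical_db(home, "iris"))  # True
```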

Expected behavior

A worker spawned for a kanban task should read the same kanban.db the dispatcher wrote the task to. Either:

  1. One canonical DB for the whole installation — workers, regardless of profile, point at ~/.hermes/kanban.db.
  2. Per-profile DBs are intentional, in which case the dispatcher should write to the correct one when it claims the task, or the worker spawn should pass the host DB path explicitly (env var, CLI arg, etc.).

Either is fine; today's behavior (silent split-brain) is not.
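
For the second option, passing the path explicitly could look roughly like the sketch below. The environment variable name KANBAN_DB_PATH and the spawn_worker helper are assumptions for illustration; the issue does not say how hermes spawns workers or whether it reads any such variable. A child Python process stands in for the real worker.

```python
import os
import subprocess
import sys

HOST_DB = os.path.expanduser("~/.hermes/kanban.db")
ENV_VAR = "KANBAN_DB_PATH"  # hypothetical name, not a confirmed hermes setting


def spawn_worker(cmd: list[str], host_db: str) -> subprocess.CompletedProcess:
    """Run a worker command with the host DB path injected into its environment."""
    env = dict(os.environ)  # copy, so the dispatcher's own environment is untouched
    env[ENV_VAR] = host_db
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)


# Stand-in worker: reports which DB path it was told to use.
child = [sys.executable, "-c", f"import os; print(os.environ['{ENV_VAR}'])"]
result = spawn_worker(child, HOST_DB)
print(result.stdout.strip())
```

The point of the design is that the dispatcher, which already knows which DB it wrote the task to, decides the path, and the worker's profile no longer influences it.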

Severity

High for anyone using kanban with custom profiles (which is the primary use case — kanban's value prop is per-profile workers). The infinite respawn loop also means a single misrouted task is a token-burn footgun.

Workaround

For now, run kanban only with the default profile, or replace ~/.hermes/profiles/<profile>/kanban.db with a symlink or hard link to ~/.hermes/kanban.db before spawning workers (untested; flagged for posterity). A plain copy would not pick up tasks created after the copy was made.
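
A minimal sketch of the symlink variant follows. It is as untested against a real hermes install as the workaround itself; the helper name link_profile_db is made up, and the demo runs in a throwaway directory rather than the real ~/.hermes.

```python
import shutil
import tempfile
from pathlib import Path


def link_profile_db(hermes_home: Path, profile: str) -> Path:
    """Point a profile's kanban.db at the host DB via a symlink; return the link path."""
    host_db = hermes_home / "kanban.db"
    profile_db = hermes_home / "profiles" / profile / "kanban.db"
    if profile_db.is_symlink():
        if profile_db.resolve() == host_db.resolve():
            return profile_db  # already linked, nothing to do
        profile_db.unlink()  # stale link pointing elsewhere
    elif profile_db.exists():
        # Keep the old profile DB as a backup instead of deleting it.
        backup = profile_db.with_name(profile_db.name + ".bak")
        shutil.move(str(profile_db), str(backup))
    profile_db.parent.mkdir(parents=True, exist_ok=True)
    profile_db.symlink_to(host_db)
    return profile_db


# Demo against a temporary directory laid out like ~/.hermes.
with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp)
    (home / "kanban.db").write_bytes(b"")
    (home / "profiles" / "iris").mkdir(parents=True)
    (home / "profiles" / "iris" / "kanban.db").write_bytes(b"")
    link = link_profile_db(home, "iris")
    print(link.is_symlink(), link.resolve() == (home / "kanban.db").resolve())
```

It is probably safest to stop the gateway and any workers before swapping the file. A symlink also seems the more conservative choice than a hard link, since SQLite's documentation warns about a database file being reachable under more than one name.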

Related commits

  • c86842546 feat(kanban): durable multi-profile collaboration board (#17805) — original
  • 60e75674e feat(kanban): dispatcher runs in the gateway by default; retire standalone daemon

The DB-resolution path likely needs to bypass per-profile state for kanban specifically.

extent analysis

TL;DR

The most likely fix is to ensure that the worker reads from the same kanban database that the dispatcher writes to, either by using a single canonical database or by passing the host database path to the worker.

Guidance

  • Identify the database resolution path in the code and modify it to bypass per-profile state for kanban tasks.
  • Consider using a single canonical database for the whole installation, or pass the host database path to the worker via an environment variable or command-line argument.
  • Verify that the worker is reading from the correct database by checking the database connection string or querying the database directly.
  • Test the fix by running the reproducer with the modified code and verifying that the worker can read the task from the correct database.
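
The verification step can be sketched as a small check of which DB files actually contain a given task. It assumes only what the issue's own query shows, namely a tasks table with an id column; the helper name dbs_containing is hypothetical.

```python
import sqlite3
from pathlib import Path


def dbs_containing(task_id: str, candidates: list[Path]) -> list[Path]:
    """Return the candidate DB files that have a row for task_id in `tasks`."""
    hits = []
    for path in candidates:
        if not path.exists():
            continue  # sqlite3.connect would otherwise create an empty file
        conn = sqlite3.connect(path)
        try:
            row = conn.execute(
                "select 1 from tasks where id = ?", (task_id,)
            ).fetchone()
        except sqlite3.OperationalError:
            row = None  # e.g. this file has no `tasks` table
        finally:
            conn.close()
        if row:
            hits.append(path)
    return hits


# Check the host DB plus every profile DB under ~/.hermes.
home = Path.home() / ".hermes"
candidates = [home / "kanban.db", *sorted(home.glob("profiles/*/kanban.db"))]
print(dbs_containing("t_faa08ca4", candidates))
```

Before a fix, a freshly created task should show up only in the host DB. After a fix, whichever DB the worker opens should be in the returned list.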

Example

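# Example of passing the host database path to the worker.
# KANBAN_DB_PATH is an illustrative name, not a confirmed hermes setting;
# it must be set in the environment the worker process is spawned with.
import os
os.environ['KANBAN_DB_PATH'] = '/home/hermes/.hermes/kanban.db'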

Notes

The fix may require changes to the database resolution path in the code, and may involve modifying the dispatcher to write to the correct database or passing the host database path to the worker. The exact changes will depend on the specific implementation details of the kanban system.

Recommendation

Until a permanent fix lands, run kanban only with the default profile, or point the profile's kanban.db at the host DB with a symlink (untested, per the issue). Either should avoid the infinite respawn loop and the token burn. The first option gives up per-profile workers, which the issue describes as kanban's primary use case.
