hermes - ✅ (Solved) Fix: kanban workers fabricate card IDs in complete() — no verification that claimed cards exist [2 pull requests, 1 participant]

GitHub stats
NousResearch/hermes-agent#20017 (fetched 2026-05-06 06:39:12)
Comments: 0, Participants: 1, Timeline events: 8, Reactions: 0
Timeline (top): labeled ×3, cross-referenced ×2, referenced ×2, closed ×1

Kanban workers completing verification tasks routinely claim to have created remediation cards that do not exist in the database. Workers call kanban_complete() naming card IDs (e.g., "created remediation cards t_X, t_Y, t_Z") that were never actually created. The kernel accepts kanban_complete() with no validation that the claimed IDs exist.

Root Cause

kanban_complete() has no validation layer. A worker can say "created cards t_A, t_B, t_C" in their summary/metadata and the kernel accepts it unconditionally. There's no cross-check against task_links or the task table itself.

Evidence

In a single 2026-05-03 dispatcher session, 5 of 6 kanban verification workers reported creating remediation cards with 11 specific task IDs. Zero of those IDs existed in the database. The kanban_create() calls either:

  • Silently failed (tool layer error, credential issue, etc.)
  • Were hallucinated by the worker (LLM invented plausible-looking IDs)
  • Were described in prose but never actually executed

PR fix notes

PR #20022: fix(kanban): validate created card handoffs

Description (problem / solution / changelog)

Fixes #20017

Summary

  • validate optional metadata.created_cards before completing a Kanban task
  • reject claimed card IDs that do not exist, or that are neither linked children nor created by the completing worker profile
  • document the structured created_cards field in the kanban_complete tool schema

Scope

This only validates explicit structured claims in metadata.created_cards. Existing summaries/results without that field continue to complete normally.
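
The check described in the summary above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name find_invalid_created_cards, the task_links column names (parent_id, child_id), and the demo rows are assumptions made for the example. Only the tasks table, its created_by column, and task_links are named in the source.

```python
import sqlite3


def find_invalid_created_cards(conn, task_id, worker_profile, created_cards):
    """Return claimed IDs that are missing, or that are neither linked
    children of task_id nor created by worker_profile.

    Hypothetical helper: the schema below is assumed for illustration.
    """
    invalid = []
    for card_id in created_cards:
        row = conn.execute(
            "SELECT created_by FROM tasks WHERE id = ?", (card_id,)
        ).fetchone()
        if row is None:
            invalid.append(card_id)  # the card was never created
            continue
        linked = conn.execute(
            "SELECT 1 FROM task_links WHERE parent_id = ? AND child_id = ?",
            (task_id, card_id),
        ).fetchone()
        if linked is None and row[0] != worker_profile:
            invalid.append(card_id)  # exists, but unrelated to this worker
    return invalid


# Demo against an in-memory database with an assumed schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tasks (id TEXT PRIMARY KEY, created_by TEXT);
    CREATE TABLE task_links (parent_id TEXT, child_id TEXT);
    INSERT INTO tasks VALUES
        ('t_parent', 'orchestrator'),
        ('t_child', 'orchestrator'),
        ('t_own', 'verifier'),
        ('t_other', 'someone_else');
    INSERT INTO task_links VALUES ('t_parent', 't_child');
    """
)
claimed = ["t_child", "t_own", "t_other", "t_ghost"]
invalid = find_invalid_created_cards(conn, "t_parent", "verifier", claimed)
print(invalid)
```

In this demo the linked child and the worker's own card pass, while the unrelated card and the nonexistent one are reported. A completion with an empty or absent created_cards list never reaches the check, which matches the stated scope.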

Verification

  • scripts/run_tests.sh tests/hermes_cli/test_kanban_core_functionality.py::test_complete_rejects_missing_created_cards_metadata tests/hermes_cli/test_kanban_core_functionality.py::test_complete_accepts_created_cards_linked_to_parent tests/hermes_cli/test_kanban_core_functionality.py::test_complete_rejects_unrelated_created_cards_metadata tests/hermes_cli/test_kanban_core_functionality.py::test_complete_accepts_created_cards_created_by_worker tests/hermes_cli/test_kanban_core_functionality.py::test_completed_event_payload_carries_summary -> 5 passed
  • scripts/run_tests.sh tests/tools/test_kanban_tools.py tests/hermes_cli/test_kanban_core_functionality.py::test_complete_rejects_missing_created_cards_metadata tests/hermes_cli/test_kanban_core_functionality.py::test_complete_accepts_created_cards_linked_to_parent tests/hermes_cli/test_kanban_core_functionality.py::test_complete_rejects_unrelated_created_cards_metadata tests/hermes_cli/test_kanban_core_functionality.py::test_complete_accepts_created_cards_created_by_worker -> 40 passed
  • git diff --check

Changed files

  • hermes_cli/kanban_db.py (modified, +63/-0)
  • tests/hermes_cli/test_kanban_core_functionality.py (modified, +93/-0)
  • tools/kanban_tools.py (modified, +4/-2)

PR #20232: feat(kanban): hallucination gate + recovery UX for worker-created-card claims (closes #20017)

Description (problem / solution / changelog)

Workers claiming cards they created on kanban_complete now have those claims verified by the kernel, so phantom/hallucinated ids can't leak into downstream automation. Dashboard surfaces flagged tasks with a ⚠ badge + attention strip, and every task drawer gains a Recovery section with Reclaim / Reassign / Change-profile-model actions so operators can act on stuck workers without waiting for TTL.

Closes #20017.

What changed

Kernel gate (hermes_cli/kanban_db.py):

  • New created_cards: list[str] param on complete_task. Each id is checked against tasks.created_by matching the completing task's assignee profile.
  • Phantom id → raise HallucinatedCardsError (subclass of ValueError so existing tool-error handlers treat it as recoverable), and emit a completion_blocked_hallucination event so the rejected attempt is auditable. Task stays in its prior state.
  • All-verified → record the manifest as verified_cards on the completed event payload.
  • Prose scan: after successful completion, regex-scan summary/result for t_<hex> patterns; any id that doesn't resolve emits a non-blocking suspected_hallucinated_references event.
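
A rough sketch of the blocking gate, using plain dicts and lists in place of the real database. The names HallucinatedCardsError, completion_blocked_hallucination, and verified_cards come from the PR description. The in-memory data layout, the phantom_ids payload key, and the "done" status value are assumptions for illustration.

```python
class HallucinatedCardsError(ValueError):
    """Raised when a completion claims cards the worker did not create."""

    def __init__(self, phantom_ids):
        self.phantom_ids = list(phantom_ids)
        super().__init__("unverified card ids: " + ", ".join(self.phantom_ids))


def complete_task(tasks, events, task_id, created_cards=()):
    """Complete task_id only if every claimed card was created by its assignee.

    tasks:  dict of id -> {"status", "assignee", "created_by"} (assumed shape)
    events: list used as an append-only audit log (assumed shape)
    """
    task = tasks[task_id]
    phantom = [
        cid
        for cid in created_cards
        if cid not in tasks or tasks[cid]["created_by"] != task["assignee"]
    ]
    if phantom:
        # Record the rejected attempt, then leave the task state untouched.
        events.append({
            "task_id": task_id,
            "kind": "completion_blocked_hallucination",
            "payload": {"phantom_ids": phantom},
        })
        raise HallucinatedCardsError(phantom)
    task["status"] = "done"
    events.append({
        "task_id": task_id,
        "kind": "completed",
        "payload": {"verified_cards": list(created_cards)},
    })


tasks = {
    "t_1": {"status": "running", "assignee": "verifier", "created_by": "orchestrator"},
    "t_2": {"status": "ready", "assignee": None, "created_by": "verifier"},
}
events = []
try:
    complete_task(tasks, events, "t_1", ["t_2", "t_dead"])
except HallucinatedCardsError as exc:
    blocked = exc.phantom_ids
status_after_block = tasks["t_1"]["status"]
complete_task(tasks, events, "t_1", ["t_2"])
```

Because the exception subclasses ValueError, a caller that already handles ValueError as a recoverable tool error needs no new branch to survive a rejected completion.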

Recovery helpers (hermes_cli/kanban_db.py):

  • reclaim_task(conn, task_id, reason=...) — release active claim immediately (unlike release_stale_claims which only acts post-TTL). Emits reclaimed event with manual: True payload.
  • reassign_task(conn, task_id, profile, reclaim_first=..., reason=...) — switch a task's profile, optionally releasing a live claim in the same op.
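
The two helpers might look something like the sketch below. The signatures follow the PR description. The table layout (status, assignee, claimed_by columns, an events table), the "reassigned" event name, and the boolean return value are assumptions, and the behaviour when a live claim exists but reclaim_first is false is not specified in the source.

```python
import json
import sqlite3


def _emit(conn, task_id, kind, payload):
    # Hypothetical audit-log writer; the events schema is assumed.
    conn.execute(
        "INSERT INTO events (task_id, kind, payload) VALUES (?, ?, ?)",
        (task_id, kind, json.dumps(payload)),
    )


def reclaim_task(conn, task_id, reason=None):
    """Release an active claim immediately; return False if none was held."""
    cur = conn.execute(
        "UPDATE tasks SET status = 'ready', claimed_by = NULL "
        "WHERE id = ? AND claimed_by IS NOT NULL",
        (task_id,),
    )
    if cur.rowcount == 0:
        return False
    _emit(conn, task_id, "reclaimed", {"manual": True, "reason": reason})
    return True


def reassign_task(conn, task_id, profile, reclaim_first=False, reason=None):
    """Switch the task's profile, optionally releasing a live claim first."""
    if reclaim_first:
        reclaim_task(conn, task_id, reason=reason)
    conn.execute("UPDATE tasks SET assignee = ? WHERE id = ?", (profile, task_id))
    _emit(conn, task_id, "reassigned", {"profile": profile, "reason": reason})


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT, assignee TEXT, claimed_by TEXT);
    CREATE TABLE events (task_id TEXT, kind TEXT, payload TEXT);
    INSERT INTO tasks VALUES ('t_1', 'running', 'verifier', 'worker-7');
    """
)
first = reclaim_task(conn, "t_1", reason="worker stuck")
second = reclaim_task(conn, "t_1")  # nothing left to release
reassign_task(conn, "t_1", "reviewer", reclaim_first=True, reason="switch profile")
```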

Tool (tools/kanban_tools.py):

  • kanban_complete accepts created_cards; tool-layer catches HallucinatedCardsError and returns a structured tool_error with phantom ids listed so the worker can retry with a correct list.
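
As an illustration of the tool-layer behaviour, the sketch below catches the error and turns it into a structured response. The response keys (ok, error, phantom_ids, hint), the wrapper name kanban_complete_tool, and the injected fake_complete_task are assumptions; the real tool_error format is not shown in the source.

```python
import json


class HallucinatedCardsError(ValueError):
    def __init__(self, phantom_ids):
        self.phantom_ids = list(phantom_ids)
        super().__init__("unverified card ids: " + ", ".join(self.phantom_ids))


def kanban_complete_tool(complete_task, task_id, summary, created_cards=None):
    """Call the kernel and translate a rejected claim into a retryable error."""
    try:
        complete_task(task_id, summary=summary, created_cards=list(created_cards or []))
    except HallucinatedCardsError as exc:
        return json.dumps({
            "ok": False,
            "error": "hallucinated_cards",
            "phantom_ids": exc.phantom_ids,
            "hint": "Retry with only the ids returned by kanban_create.",
        })
    return json.dumps({"ok": True, "task_id": task_id})


def fake_complete_task(task_id, summary, created_cards):
    # Stand-in for the kernel: only t_0a1b is known to exist.
    phantom = [c for c in created_cards if c not in {"t_0a1b"}]
    if phantom:
        raise HallucinatedCardsError(phantom)


rejected = json.loads(
    kanban_complete_tool(fake_complete_task, "t_9f00", "done", ["t_0a1b", "t_dead"])
)
accepted = json.loads(
    kanban_complete_tool(fake_complete_task, "t_9f00", "done", ["t_0a1b"])
)
```

Listing the phantom ids in the response gives the worker enough information to drop them and retry, rather than failing the whole task.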

CLI (hermes_cli/kanban.py):

  • hermes kanban reclaim <task_id> [--reason ...]
  • hermes kanban reassign <task_id> <profile> [--reclaim] [--reason ...]
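
For readers unfamiliar with nested subcommands, the two command shapes above could be parsed with argparse along these lines. This is only a sketch of the argument grammar; how hermes_cli/kanban.py actually builds its parser is not shown in the source, and build_parser is a hypothetical name.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="hermes")
    top = parser.add_subparsers(dest="command", required=True)
    kanban = top.add_parser("kanban")
    actions = kanban.add_subparsers(dest="action", required=True)

    # hermes kanban reclaim <task_id> [--reason ...]
    reclaim = actions.add_parser("reclaim")
    reclaim.add_argument("task_id")
    reclaim.add_argument("--reason", default=None)

    # hermes kanban reassign <task_id> <profile> [--reclaim] [--reason ...]
    reassign = actions.add_parser("reassign")
    reassign.add_argument("task_id")
    reassign.add_argument("profile")
    reassign.add_argument("--reclaim", action="store_true")
    reassign.add_argument("--reason", default=None)
    return parser


args = build_parser().parse_args(
    ["kanban", "reassign", "t_1a2b", "reviewer", "--reclaim", "--reason", "stuck"]
)
```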

API (plugins/kanban/dashboard/plugin_api.py):

  • POST /api/plugins/kanban/tasks/{id}/reclaim (body: {reason})
  • POST /api/plugins/kanban/tasks/{id}/reassign (body: {profile, reclaim_first, reason})
  • _compute_warnings_for_tasks() helper reused by /board (all tasks) and /tasks/{id} (single). Returns {count, kinds, latest_at} per task.
  • Active-vs-stale rule: a completed or edited event AFTER the hallucination event clears the warning. Events persist for audit, but the badge doesn't permanently stigmatise.
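
The active-vs-stale rule can be sketched as a pure function over one task's events. The event kinds and the {count, kinds, latest_at} shape come from the PR description. The function name compute_warnings, the created_at field, and the use of integers as timestamps are assumptions.

```python
HALLUCINATION_KINDS = {
    "completion_blocked_hallucination",
    "suspected_hallucinated_references",
}
CLEARING_KINDS = {"completed", "edited"}


def compute_warnings(events):
    """Summarise the hallucination events that are still active for a task.

    A completed or edited event clears every hallucination event before it;
    the events themselves are kept by the caller for audit.
    """
    active = []
    for event in sorted(events, key=lambda e: e["created_at"]):
        if event["kind"] in CLEARING_KINDS:
            active = []
        elif event["kind"] in HALLUCINATION_KINDS:
            active.append(event)
    return {
        "count": len(active),
        "kinds": sorted({e["kind"] for e in active}),
        "latest_at": active[-1]["created_at"] if active else None,
    }


flagged = compute_warnings([
    {"kind": "completion_blocked_hallucination", "created_at": 10},
    {"kind": "completion_blocked_hallucination", "created_at": 15},
])
cleared = compute_warnings([
    {"kind": "completion_blocked_hallucination", "created_at": 10},
    {"kind": "completed", "created_at": 20},
])
```

Deriving the badge from the event log at read time, instead of storing a flag on the task, is what lets the warning disappear after a clean re-completion while the audit trail stays intact.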

Dashboard UI (plugins/kanban/dashboard/dist/index.js + style.css):

  • ⚠ badge on cards with active warnings.
  • AttentionStrip — dismissible strip at top of board listing flagged tasks with Open buttons.
  • Events-tab callout — hallucination events render with red left border, amber icon, phantom ids as styled red chips.
  • RecoverySection in task drawer — keyed by task id so state doesn't leak between drawers. Auto-opens when the task has warnings. Three actions: Reclaim (button + reason input), Reassign (profile picker + reclaim-first checkbox), Change profile model (copy-to-clipboard CLI hint since profile config lives on disk).

Skills:

  • skills/devops/kanban-worker/SKILL.md — new "Claiming cards you actually created" section with good/bad examples.
  • skills/devops/kanban-orchestrator/SKILL.md — new "Recovering stuck workers" section.

Validation

  • Full kanban test suite -> 359/359 pass (test_kanban_{db,cli,boards,core_functionality} + dashboard + tools)
  • E2E: phantom id blocked, audit event lands, task stays in prior state
  • E2E: prose scan fires on phantom prose refs, doesn't block
  • E2E: cross-worker card rejection (card exists but different created_by)
  • Live dashboard: attention strip, card badges, events callout render -> ✓ (screenshots in PR comments)
  • Live dashboard: Reassign action flips assignee + refreshes drawer
  • Live dashboard: Reclaim on running task flips status to ready + emits manual reclaimed event
  • Recovery section doesn't leak state between task drawers -> ✓ (keyed by task id)
  • Warnings cleared after clean re-completion, events retained for audit

Honest caveat on scope

This ships the behavioral contract + visible recovery path. It does NOT include:

  • Fleet-health roll-up per-profile (can come later if manual intervention is slow in practice).
  • Circuit breaker that auto-disables a profile after N hallucination-blocks (deliberately deferred — the blast radius risk of auto-disabling outweighs the benefit until we have data).

Both are straightforward follow-ons once we see how often this fires in the wild.

Changed files

  • hermes_cli/kanban.py (modified, +70/-0)
  • hermes_cli/kanban_db.py (modified, +280/-6)
  • plugins/kanban/dashboard/dist/index.js (modified, +376/-5)
  • plugins/kanban/dashboard/dist/style.css (modified, +253/-0)
  • plugins/kanban/dashboard/plugin_api.py (modified, +162/-1)
  • skills/devops/kanban-orchestrator/SKILL.md (modified, +10/-0)
  • skills/devops/kanban-worker/SKILL.md (modified, +26/-0)
  • tests/hermes_cli/test_kanban_cli.py (modified, +78/-0)
  • tests/hermes_cli/test_kanban_core_functionality.py (modified, +266/-0)
  • tests/plugins/test_kanban_dashboard_plugin.py (modified, +218/-0)
  • tools/kanban_tools.py (modified, +52/-5)
Original issue text

Summary

Kanban workers completing verification tasks routinely claim to have created remediation cards that do not exist in the database. Workers call kanban_complete() naming card IDs (e.g., "created remediation cards t_X, t_Y, t_Z") that were never actually created. The kernel accepts kanban_complete() with no validation that the claimed IDs exist.

Evidence

In a single 2026-05-03 dispatcher session, 5 of 6 kanban verification workers reported creating remediation cards with 11 specific task IDs. Zero of those IDs existed in the database. The kanban_create() calls either:

  • Silently failed (tool layer error, credential issue, etc.)
  • Were hallucinated by the worker (LLM invented plausible-looking IDs)
  • Were described in prose but never actually executed

Root cause

kanban_complete() has no validation layer. A worker can say "created cards t_A, t_B, t_C" in their summary/metadata and the kernel accepts it unconditionally. There's no cross-check against task_links or the task table itself.

Impact

Downstream orchestrators and humans see completion summaries claiming remediation work that never happened. Tasks stay broken while looking like they were addressed. This undermines the multi-agent reliability model — you can't trust worker handoffs.

Suggested Fixes (in order of preference)

  1. Add optional created_cards field on kanban_complete() metadata. If a worker claims "created_cards": ["t_A", "t_B"], the kernel verifies each ID exists in the DB and has the completing worker as parent/creator before accepting the completion. Reject with an error message naming the missing IDs.

  2. Post-completion validation hook. After completing, if the summary or metadata contains task ID patterns (regex: t_[a-f0-9]+), verify those IDs exist. If any are missing, auto-comment on the task warning the human.

  3. Tighten the kanban-worker skill (skills/devops/kanban-worker/SKILL.md): require workers to verify kanban_create() return values before naming card IDs in completion. Add a rule: "Never claim you created a card unless you have a successful return value containing the ID."

Option 1 is the strongest — structural verification beats prose promises.
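
Option 2 is easy to prototype. The sketch below applies the regex from the issue (with word boundaries added, which is an assumption) to a summary and returns the IDs that do not resolve. The function name find_unresolved_refs and the set-based lookup are illustrative; a real hook would query the task table.

```python
import re

# Pattern from the issue text, anchored on word boundaries.
TASK_ID_RE = re.compile(r"\bt_[a-f0-9]+\b")


def find_unresolved_refs(text, existing_ids):
    """Return task IDs mentioned in text that are not in existing_ids,
    in order of first appearance and without duplicates."""
    unresolved = []
    for task_id in TASK_ID_RE.findall(text):
        if task_id not in existing_ids and task_id not in unresolved:
            unresolved.append(task_id)
    return unresolved


summary = "Created remediation cards t_ab12 and t_ff00; see also t_ab12."
missing = find_unresolved_refs(summary, existing_ids={"t_ff00"})
print(missing)
```

A scan like this can only warn, not block: a summary may legitimately mention IDs the worker did not create, and a worker can describe phantom work without naming any ID at all.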

Environment

Hermes Agent v0.11.0 (a7fb79efb)

extent analysis

TL;DR

Implementing validation for kanban_complete() calls to verify the existence of claimed remediation card IDs in the database is the most likely fix.

Guidance

  • Add an optional created_cards field to kanban_complete() metadata to verify each claimed ID exists in the database before accepting completion.
  • Consider implementing a post-completion validation hook to check for task ID patterns in the summary or metadata and verify their existence.
  • Review and tighten the kanban-worker skill to require verification of kanban_create() return values before claiming card creation.

Example

{
  "metadata": {
    "created_cards": ["t_A", "t_B"]
  }
}

This example shows how the created_cards field could be added to the kanban_complete() metadata to enable verification of claimed card IDs.

Notes

The suggested fixes assume that the kanban_create() function returns a unique ID for each created card, and that this ID can be used to verify the card's existence in the database.
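
On the worker side, that assumption suggests building the created_cards list only from successful return values. A minimal sketch, assuming kanban_create is a callable that returns a dict with an "id" key (the real return shape is not shown in the source) and using a hypothetical fake_kanban_create for the demo:

```python
def create_and_collect(kanban_create, specs):
    """Create one card per spec and keep only IDs that were actually returned."""
    created, failures = [], []
    for spec in specs:
        try:
            result = kanban_create(**spec)
        except Exception as exc:  # a tool-layer failure must not become a claim
            failures.append((spec["title"], str(exc)))
            continue
        card_id = result.get("id") if isinstance(result, dict) else None
        if card_id:
            created.append(card_id)
        else:
            failures.append((spec["title"], "no id in return value"))
    return created, failures


def fake_kanban_create(title):
    # Stand-in tool: one success, one empty response, one raised error.
    if title == "fix login":
        return {"id": "t_0a1b"}
    if title == "fix logout":
        return {}
    raise RuntimeError("credential issue")


specs = [{"title": "fix login"}, {"title": "fix logout"}, {"title": "fix signup"}]
created, failures = create_and_collect(fake_kanban_create, specs)
completion_metadata = {"created_cards": created}
```

Only the one confirmed ID ends up in the metadata, and the two failures are kept for the worker to report rather than being silently dropped.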

Recommendation

Implement the first suggested fix: add an optional created_cards field to the kanban_complete() metadata and have the kernel verify that each claimed ID exists in the database before accepting the completion. This gives structural verification and is the strongest of the suggested fixes.
