hermes: [Bug] kanban.db corrupted on WSL2 with multiple gateway processes


Environment

  • OS: WSL2 (Windows 11), Ubuntu
  • hermes-agent: v0.14.0 (latest main branch)
  • SQLite: 3.37.2
  • Filesystem: ext4 on WSL2 virtual disk (VHDX)

Problem

kanban.db gets corrupted when multiple gateway processes run simultaneously.

Error messages:

  • sqlite3.DatabaseError: database disk image is malformed
  • sqlite3.OperationalError: disk I/O error
  • ERROR kanban dispatcher: board default database is not a valid SQLite database

Root Cause

  1. Multiple gateway processes (12+) open the same ~/.hermes/kanban.db file simultaneously
  2. Each process holds 2-6 file descriptors to the database file
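  3. WAL mode's -shm (shared memory) file appears to have synchronization issues on WSL2. Note that 9p backs only Windows-mounted paths such as /mnt/c, while the environment above reports ext4 on the WSL2 virtual disk, so the filesystem actually involved still needs to be confirmed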
  4. Even with flock-based serialization in write_txn(), corruption still occurs
  5. Corruption happens at the filesystem level, not SQLite protocol level
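
To check points 1 and 2 on a running system, the open descriptors can be counted per process by walking /proc. The following is a minimal sketch, not part of hermes: `count_db_fds` and its `proc_root` parameter are hypothetical names, and counting the -wal and -shm side files alongside the main database file is an assumption about what the "2-6 descriptors" figure covers.

```python
import os
from collections import Counter

def count_db_fds(db_path, proc_root="/proc"):
    """Map pid -> number of open descriptors on db_path or its -wal/-shm files.

    Hypothetical diagnostic helper; it relies on the Linux /proc/<pid>/fd layout.
    """
    target = os.path.realpath(db_path)
    targets = {target, target + "-wal", target + "-shm"}
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or we lack permission to inspect it
        for fd in fds:
            try:
                link = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed while we were scanning
            if link in targets:
                counts[int(entry)] += 1
    return dict(counts)
```

Calling `count_db_fds(os.path.expanduser("~/.hermes/kanban.db"))` while the gateways are running should show one entry per process that has the database open. Processes owned by other users are silently skipped unless the script runs with enough privileges.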

Attempted Fixes

  1. Added flock (LOCK_EX/LOCK_UN) in write_txn(): works in isolated tests, but corruption still occurs in production
  2. Added PRAGMA busy_timeout=5000: reduces but doesn't eliminate corruption
  3. Tried tmpfs for kanban.db: works, but data is lost on WSL shutdown
  4. Current workaround: a watchdog script auto-rebuilds the database on corruption
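
For reference, attempts 1 and 2 combined look roughly like the sketch below. This is not the actual hermes `write_txn()`; the signature, the separate `.lock` file, and the use of `BEGIN IMMEDIATE` are assumptions made for illustration. As noted above, this pattern did not stop the corruption in production, so it shows what was tried rather than a fix.

```python
import fcntl
import sqlite3
from contextlib import contextmanager

@contextmanager
def write_txn(db_path, lock_path=None, busy_timeout_ms=5000):
    """Serialize writers with an advisory flock, then run one SQLite transaction.

    Hypothetical sketch: the lock lives in a separate file (assumed name
    <db_path>.lock) so it does not touch the database file itself.
    """
    lock_path = lock_path or db_path + ".lock"
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until no other writer holds it
        try:
            # isolation_level=None puts the connection in autocommit mode so
            # that BEGIN / COMMIT / ROLLBACK are issued explicitly below.
            conn = sqlite3.connect(db_path, isolation_level=None)
            try:
                conn.execute(f"PRAGMA busy_timeout={int(busy_timeout_ms)}")
                conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
                try:
                    yield conn
                except BaseException:
                    conn.execute("ROLLBACK")
                    raise
                else:
                    conn.execute("COMMIT")
            finally:
                conn.close()
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Usage would be `with write_txn(path) as conn: conn.execute(...)`. The flock is advisory, so it only serializes processes that go through this same helper; any code path that opens kanban.db directly bypasses it.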

Reproduction

  1. Start 12+ gateway processes (multiple profiles)
  2. Dispatch 5+ concurrent kanban tasks
  3. Within minutes, database corruption occurs
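
A standalone stress test can help separate SQLite-on-WSL2 behaviour from anything hermes-specific. The sketch below is an assumption-laden stand-in for the steps above: it does not start real gateway processes, the `tasks` table and the `stress` and `integrity_ok` names are hypothetical, and each writer is a plain Python subprocess inserting rows into a WAL-mode database. The `integrity_ok` check is also roughly what a watchdog could run before deciding to rebuild.

```python
import sqlite3
import subprocess
import sys

# Inline script run by each writer process (stand-in for a gateway process).
WRITER = """
import sqlite3, sys
db_path, worker, n_rows = sys.argv[1], int(sys.argv[2]), int(sys.argv[3])
conn = sqlite3.connect(db_path, timeout=30.0)
for seq in range(n_rows):
    with conn:  # one transaction per row: commit on success, rollback on error
        conn.execute("INSERT INTO tasks(worker, seq) VALUES (?, ?)", (worker, seq))
conn.close()
"""

def integrity_ok(db_path):
    """Return True only if PRAGMA integrity_check reports a single 'ok' row."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError:
        return False  # e.g. malformed disk image, or file is not a database
    finally:
        conn.close()
    return rows == [("ok",)]

def stress(db_path, n_procs=12, n_rows=200):
    """Create a WAL-mode database, hammer it from n_procs processes, then check it."""
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("CREATE TABLE IF NOT EXISTS tasks(worker INTEGER, seq INTEGER)")
    conn.commit()
    conn.close()
    procs = [
        subprocess.Popen([sys.executable, "-c", WRITER, db_path, str(w), str(n_rows)])
        for w in range(n_procs)
    ]
    for p in procs:
        p.wait()
    return integrity_ok(db_path)
```

Running `stress()` once against a path under ~/.hermes and once against a path under /mnt/c would show whether the failure follows the filesystem. If it never fails on ext4, the cause is more likely in how the gateway processes open or share the database.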

Request

Is there a recommended configuration or fix for running multiple gateway processes with a shared kanban.db on WSL2?

Related: microsoft/WSL#2395 (sqlite write locks aren't respected in WSL)
