hermes - WAL file grows to 90+ MB due to long-running read transactions blocking auto-checkpoint

Issue

The SQLite WAL file (state.db-wal) grows continuously to 90+ MB, even though wal_autocheckpoint is at its default of 1000 pages (about 4 MB at the 4096-byte page size).

Root Cause

The gateway process holds long-running read transactions that prevent checkpoints from running to completion. A checkpoint can only copy WAL frames back into the database up to the oldest snapshot still in use, and SQLite cannot restart the WAL from the beginning while any reader is still using it, so the file keeps growing until that reader ends its transaction.
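The blocking effect can be reproduced outside the gateway. The following is a minimal sketch, not the gateway's code: it uses Python's standard sqlite3 module, a hypothetical table named t, and whatever throwaway database path is passed in. A reader pins a snapshot, a writer appends frames, and a PASSIVE checkpoint reports how many frames it was able to copy back.

```python
import sqlite3


def demo_blocked_checkpoint(db_path):
    """Return PASSIVE checkpoint results with a reader open, then after it commits."""
    # isolation_level=None selects autocommit mode, so the sqlite3 module
    # never opens a transaction behind our back.
    writer = sqlite3.connect(db_path, isolation_level=None)
    writer.execute("PRAGMA journal_mode=WAL")
    writer.execute("PRAGMA wal_autocheckpoint=0")  # disable auto-checkpoint for the demo
    writer.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")  # hypothetical table
    writer.execute("INSERT INTO t VALUES (0)")

    reader = sqlite3.connect(db_path, isolation_level=None)
    reader.execute("BEGIN")
    reader.execute("SELECT count(*) FROM t").fetchone()  # pins the reader's snapshot

    for i in range(1, 200):
        writer.execute("INSERT INTO t VALUES (?)", (i,))

    # Each result row is (busy, wal_frames, checkpointed_frames).
    blocked = writer.execute("PRAGMA wal_checkpoint(PASSIVE)").fetchone()

    reader.execute("COMMIT")  # release the snapshot
    unblocked = writer.execute("PRAGMA wal_checkpoint(PASSIVE)").fetchone()

    reader.close()
    writer.close()
    return blocked, unblocked
```

While the reader's transaction is open, the third value stays below the second; once it commits, the two match. The first value (busy) is not a useful signal for PASSIVE, since SQLite only sets it for FULL, RESTART and TRUNCATE checkpoints.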

Evidence

$ sqlite3 ~/.hermes/state.db "PRAGMA wal_autocheckpoint;"
1000
$ sqlite3 ~/.hermes/state.db "PRAGMA page_size;"
4096
# Auto-checkpoint should trigger at ~4 MB, but WAL grows to 90+ MB

Running PRAGMA wal_checkpoint(TRUNCATE) by hand immediately shrinks the WAL back to ~4 KB, which indicates the growth comes from checkpoints that cannot complete rather than from a leak.
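The ~4 MB figure can be checked with a little arithmetic. In the SQLite WAL format the file starts with a 32-byte header and each frame is a 24-byte header plus one page. The sketch below (the function and key names are illustrative, not part of Hermes) computes the expected size and compares it with the file on disk. One caveat: wal_autocheckpoint is a per-connection setting, so the value a separate sqlite3 session prints is that session's own default and not necessarily what the gateway uses.

```python
import os
import sqlite3

WAL_HEADER_BYTES = 32    # fixed header at the start of a WAL file
FRAME_HEADER_BYTES = 24  # header that precedes each page image


def wal_bytes(frames, page_size):
    """Size in bytes of a WAL file holding `frames` frames."""
    return WAL_HEADER_BYTES + frames * (FRAME_HEADER_BYTES + page_size)


def wal_report(db_path):
    """Compare the on-disk WAL size with the auto-checkpoint threshold."""
    wal_path = db_path + "-wal"
    # Measure first: closing the last connection would checkpoint and remove the WAL.
    actual = os.path.getsize(wal_path) if os.path.exists(wal_path) else 0
    conn = sqlite3.connect(db_path)
    try:
        page_size = conn.execute("PRAGMA page_size").fetchone()[0]
        # Per-connection value: this connection's setting, not the gateway's.
        threshold_frames = conn.execute("PRAGMA wal_autocheckpoint").fetchone()[0]
    finally:
        conn.close()
    return {
        "page_size": page_size,
        "threshold_frames": threshold_frames,
        "threshold_bytes": wal_bytes(threshold_frames, page_size),
        "wal_bytes": actual,
    }
```

For the values in the transcript, wal_bytes(1000, 4096) is 4,120,032 bytes, so a 90+ MB file holds more than twenty times the frames at which an auto-checkpoint is first attempted.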

Workaround

Running sqlite3 ~/.hermes/state.db "PRAGMA wal_checkpoint(TRUNCATE);" periodically (e.g. via cron) keeps the WAL in check, but this is a band-aid.
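If the periodic job is written in Python instead of calling the sqlite3 CLI, it can also set a busy timeout and check whether the checkpoint actually completed. This is a sketch under assumptions: truncate_wal is a hypothetical helper, and the 5-second timeout is an arbitrary choice.

```python
import sqlite3


def truncate_wal(db_path, timeout=5.0):
    """Run a TRUNCATE checkpoint; return (busy, wal_frames, checkpointed_frames)."""
    # timeout is the busy timeout: how long SQLite waits for readers to finish
    # before giving up on the checkpoint.
    conn = sqlite3.connect(db_path, timeout=timeout, isolation_level=None)
    try:
        return conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
    finally:
        conn.close()
```

A busy value of 1 means a reader or writer still blocked the checkpoint after the timeout, so the WAL was not truncated on this run. A cron wrapper could log that case, which would also show how often the gateway is holding a snapshot.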

Suggested Fix

The gateway code should ensure read transactions are short-lived and properly closed before starting new write batches. Specifically:

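  1. Review the session store read patterns in hermes_state.py. A SELECT whose cursor is never fully fetched or closed, or a transaction that is opened and never committed, may keep its snapshot pinned and block checkpoints past that point.
  2. Consider calling PRAGMA wal_checkpoint(PASSIVE) or PRAGMA wal_checkpoint(TRUNCATE) after each session write completes, rather than relying solely on the auto-checkpoint. PASSIVE never waits for readers and does not shrink the file; TRUNCATE waits for readers (subject to the busy timeout) and resets the WAL to zero bytes when it succeeds.
  3. Alternatively, reduce wal_autocheckpoint to a lower value (e.g. 100) so checkpoints are attempted more frequently, giving more opportunities for a checkpoint to complete while no reader is active. This setting is per connection, so the gateway has to apply it each time it opens the database.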
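The three suggestions could be combined roughly as follows. This is only a sketch: the actual structure of hermes_state.py is not shown in this issue, so the sessions table, its columns, and the function names open_state_db, load_sessions and save_session are all hypothetical.

```python
import sqlite3
from contextlib import closing


def open_state_db(db_path):
    # Autocommit mode: no transaction stays open unless we start one explicitly.
    conn = sqlite3.connect(db_path, isolation_level=None, timeout=5.0)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA wal_autocheckpoint=100")  # item 3: set per connection
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sessions (id TEXT PRIMARY KEY, data TEXT)"
    )  # hypothetical schema
    return conn


def load_sessions(conn):
    # Item 1: fetchall() drains the result set and closing() releases the
    # cursor, so the read transaction ends before this function returns.
    with closing(conn.cursor()) as cur:
        cur.execute("SELECT id, data FROM sessions ORDER BY id")
        return cur.fetchall()


def save_session(conn, session_id, data):
    conn.execute(
        "INSERT OR REPLACE INTO sessions (id, data) VALUES (?, ?)",
        (session_id, data),
    )
    # Item 2: PASSIVE never blocks; it copies back whatever no reader still needs.
    return conn.execute("PRAGMA wal_checkpoint(PASSIVE)").fetchone()
```

PASSIVE is used after the write because it cannot stall the gateway. It keeps the WAL from growing further but does not shrink a file that is already large; that still needs a successful TRUNCATE checkpoint.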

Environment

  • Hermes Agent: latest (git install)
  • OS: macOS 15 / Sequoia
  • SQLite: 3.x (system)
  • Gateway runs continuously (Telegram)
