hermes - 💡(How to fix) Fix TUI Native Memory Leak - RSS grows to 13+ GB after ~40 min active usage [1 comments, 1 participants]

NousResearch/hermes-agent#15141 (fetched 2026-04-25 06:24:21)

Since the Apr 23–24 update (commits bd929ea5 and 67bfd4b8), the TUI frontend (Node.js Ink renderer) leaks native memory at an alarming rate. The JS heap stays bounded at ~2.7 GB while RSS climbs to 13+ GB, causing the process to freeze and get killed via SIGTERM within ~1 hour of active streaming.

The gateway backend remains healthy (~98 MB) throughout. Auto-heap-dump triggers are miscalibrated because they measure JS heapUsed, which is much smaller than the actual RSS leak.

Environment

  • Hermes version: v0.11.0 / commit 34c3e671 (Apr 24 hotfix)
  • Base commit: bf196a3f (v0.11.0 tag)
  • OS: Fedora Linux 41 (Wayland, KDE)
  • Node.js: v22.22.0
  • Display: TUI (hermes --tui)
  • Config defaults: thinking: expanded, tools: expanded, activity: hidden (from 67bfd4b8)

Reproduction Steps

  1. Start Hermes TUI:
    hermes --tui
  2. Engage in normal streaming conversation with tool calls and reasoning blocks
  3. Leave thinking and tools sections expanded (default since Apr 24)
  4. Observe RSS periodically (the command below refreshes every 5 s; the attached monitor log used 30 s samples):
    watch -n 5 'ps -o pid,rss,vsz,comm -p $(pgrep -f "ui-tui/dist/entry.js")'

Expected Behavior

RSS should stay under ~1 GB regardless of session length. Occasional bumps during large streaming payloads are acceptable, but RSS should be stable between turns.

Actual Behavior

Phase             Time        RSS (MB)      Notes
Start             t+0         ~157          Baseline
Idle/light        ~10 min     ~247          Slow growth
Active streaming  ~20–40 min  ~525 → 6,066  Accelerating
Peak              ~52 min     13,978        Process unresponsive
Crash             ~53 min     -             SIGTERM, auto-restart with new PID

Two confirmed crash cycles (same day)

PID                Start  Peak RSS   Duration before crash
58836 (morning)    -      ~9.4 GB    ~20 min
76262 (afternoon)  14:03  13.978 GB  ~53 min

Diagnostic Evidence

Heap dump .diagnostics.json at peak (auto-critical)

{
  "memoryUsage": {
    "arrayBuffers": 768803,
    "external": 21238571,
    "heapTotal": 2798071808,
    "heapUsed": 2728422704,
    "rss": 9572970496
  },
  "memoryGrowthRate": {
    "mbPerHour": 17684.5
  }
}

RSS (~9.6 GB) is roughly 3.5 times heapUsed (~2.7 GB), and external adds only ~21 MB, so the bulk of the growth sits outside the V8 heap.
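To make the gap concrete, here is a minimal sketch that estimates how much resident memory V8's own counters do not explain. The function name and the subtraction used are illustrative assumptions, not part of Hermes; the result is only approximate because RSS counts resident pages while heapTotal counts committed heap.

```typescript
// Shape of the "memoryUsage" object in the diagnostics JSON above
// (the same fields that Node's process.memoryUsage() returns).
interface MemoryUsageSample {
  rss: number;
  heapTotal: number;
  heapUsed: number;
  external: number;
  arrayBuffers: number;
}

// Rough estimate of resident bytes not explained by V8's counters.
// `external` already includes `arrayBuffers`, so it is subtracted only once.
// Hypothetical helper, for illustration only.
function estimateUnaccountedBytes(m: MemoryUsageSample): number {
  return Math.max(0, m.rss - m.heapTotal - m.external);
}

// Values copied from the .diagnostics.json in this issue.
const atDump: MemoryUsageSample = {
  arrayBuffers: 768803,
  external: 21238571,
  heapTotal: 2798071808,
  heapUsed: 2728422704,
  rss: 9572970496,
};

const unaccounted = estimateUnaccountedBytes(atDump);
const share = unaccounted / atDump.rss;
console.log(
  `${(unaccounted / 1e9).toFixed(2)} GB unaccounted ` +
    `(${(share * 100).toFixed(0)}% of RSS)`,
);
```

In a live process the same function could be fed the object returned by process.memoryUsage().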

Process comparison at peak

PID     RSS      COMMAND
76262  13,978 MB  node /.../ui-tui/dist/entry.js   ← leaking
58744      98 MB  python -m hermes_cli.main gateway run  ← stable

Suspected Root Cause

Primary suspect: bd929ea5 — Ink text measurement cache

perf(ink): cache text measurements across yoga flex re-passes

File: ui-tui/packages/hermes-ink/src/ink/dom.ts

The commit added _textMeasureCache to ink-text DOM elements, keyed by ${width}|${widthMode}. While bounded to 16 entries per node (FIFO eviction), the underlying Yoga layout system is backed by C++ WASM state. When the Ink reconciler tears down a subtree via freeRecursive() / clearYogaNodeReferences(), it nulls JS references but may leave:

  • WASM text measurement buffers
  • Yoga layout node C++ instances
  • Cache generation counter objects that hold references

Each streaming update triggers markDirty() on expanded sections (default since 67bfd4b8), causing Yoga to re-measure. With continuous thinking + tools streaming, this becomes a fast leak.
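For readers unfamiliar with the pattern, the following is a minimal sketch of a per-node measurement cache with a 16-entry FIFO bound, keyed the way the issue describes. It is not the hermes-ink implementation: the class name, the Measurement shape, and the hit/miss counters are assumptions added for illustration.

```typescript
// Hypothetical result of a text measurement.
interface Measurement {
  width: number;
  height: number;
}

// Illustrative stand-in for a per-node _textMeasureCache.
class TextMeasureCache {
  private readonly entries = new Map<string, Measurement>();
  private readonly maxEntries: number;
  hits = 0;
  misses = 0;

  constructor(maxEntries: number = 16) {
    this.maxEntries = maxEntries;
  }

  getOrMeasure(
    width: number,
    widthMode: string,
    measure: () => Measurement,
  ): Measurement {
    const key = `${width}|${widthMode}`;
    const cached = this.entries.get(key);
    if (cached !== undefined) {
      this.hits++;
      return cached;
    }
    this.misses++;
    const result = measure();
    if (this.entries.size >= this.maxEntries) {
      // A Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, result);
    return result;
  }

  get size(): number {
    return this.entries.size;
  }
}

// Seventeen distinct width probes overflow the bound and evict width 0.
const cache = new TextMeasureCache();
const fakeMeasure = (): Measurement => ({ width: 1, height: 1 });
for (let w = 0; w <= 16; w++) cache.getOrMeasure(w, "exactly", fakeMeasure);
cache.getOrMeasure(0, "exactly", fakeMeasure); // evicted earlier, so a miss
cache.getOrMeasure(16, "exactly", fakeMeasure); // still cached, so a hit
console.log({ size: cache.size, hits: cache.hits, misses: cache.misses });
```

The sketch shows that the JS side of such a cache stays bounded even when probes churn, which is consistent with the report that the growth is outside the JS heap.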

Amplifier: 67bfd4b8 - expanded sections by default

From Apr 24, thinking: expanded and tools: expanded dramatically increase the number of Yoga measure/re-layout cycles per frame compared to the previous collapsed-by-default UI.

Additional Context

Heap dump misfire

memoryMonitor.ts triggers on JS heapUsed (high = 1.5 GB, critical = 2.5 GB). Because this leak is native, a process can climb to 13+ GB RSS while the JS heap sits at 2.7 GB. The monitor repeatedly writes 2.5 GB .heapsnapshot files that have no diagnostic value for this bug, and disk usage in ~/.hermes/heapdumps/ grows to 24+ GB.
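A minimal sketch of what an RSS-aware check could look like follows. The heap thresholds are the ones quoted above; the RSS thresholds, the native-leak ratio, the binary-gigabyte unit, and all names are hypothetical and not taken from memoryMonitor.ts.

```typescript
type Level = "ok" | "high" | "critical";

interface Thresholds {
  high: number;
  critical: number;
}

const GB = 1024 ** 3; // assumption: thresholds are binary gigabytes

// Heap thresholds as quoted in the issue.
const HEAP_LIMITS: Thresholds = { high: 1.5 * GB, critical: 2.5 * GB };
// Hypothetical RSS thresholds, chosen only for illustration.
const RSS_LIMITS: Thresholds = { high: 2 * GB, critical: 4 * GB };
// Hypothetical ratio above which RSS growth is treated as likely native.
const NATIVE_RATIO = 2;

const RANK: Record<Level, number> = { ok: 0, high: 1, critical: 2 };

function levelFor(value: number, limits: Thresholds): Level {
  if (value >= limits.critical) return "critical";
  if (value >= limits.high) return "high";
  return "ok";
}

interface Verdict {
  level: Level;
  heapLevel: Level;
  rssLevel: Level;
  nativeSuspected: boolean;
}

function classify(sample: { rss: number; heapUsed: number }): Verdict {
  const heapLevel = levelFor(sample.heapUsed, HEAP_LIMITS);
  const rssLevel = levelFor(sample.rss, RSS_LIMITS);
  // Report the more severe of the two signals.
  const level = RANK[rssLevel] > RANK[heapLevel] ? rssLevel : heapLevel;
  const nativeSuspected =
    rssLevel !== "ok" && sample.rss > NATIVE_RATIO * sample.heapUsed;
  return { level, heapLevel, rssLevel, nativeSuspected };
}

// Heap stays small while RSS balloons: a heap-only monitor would report "ok".
console.log(classify({ rss: 5 * GB, heapUsed: 0.5 * GB }));
// Values from the .diagnostics.json in this issue.
console.log(classify({ rss: 9572970496, heapUsed: 2728422704 }));
```

When nativeSuspected is true, a monitor might log the memoryUsage fields instead of writing another multi-gigabyte heap snapshot, since the snapshot cannot show native allocations.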

Gateway unaffected, crash log confirms TUI death

~/.hermes/logs/tui_gateway_crash.log shows the Python bridge still alive in its sys.stdin loop when SIGTERM was delivered. The Node parent dies first; the Python subprocess is orphaned.

Related PRs checked

  • PR #12822 (commit 904f20d6, "idle queue OOM fix") is already present. It fixed a JS-heap idle-time loop — not this native leak. The two bugs have different signatures and triggers.

Possible Fixes (for discussion)

  1. Investigate clearYogaNodeReferences — ensure all WASM nodes are explicitly freed before nulling. Check if yoga-layout WASM bindings need explicit free() calls.
  2. Invalidate _textMeasureCache before clearYogaNodeReferences — the cache is cleared in clearYogaNodeReferences via _textMeasureCache = undefined, but if the Map retains entries referenced by the WASM side, this doesn't help.
  3. Cap _textMeasureCache.entries growth — already 16 entries, but keyed by ${width}|${widthMode}. If width probes are sparse, the cache may churn without actually hitting. Consider a global/shared cache with TTL.
  4. Monitor RSS in memoryMonitor.ts — add rss alongside heapUsed to detect native leaks earlier.
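As a starting point for fixes 1 and 2, here is a minimal sketch of a teardown order that releases native state before dropping the JS references. It assumes the Yoga binding exposes unsetMeasureFunc() and freeRecursive() on nodes (as the yoga-layout JS API does); the DomNodeLike shape and both function names are hypothetical and do not mirror the actual dom.ts code.

```typescript
// Minimal subset of a Yoga node that this sketch relies on.
interface YogaNodeLike {
  unsetMeasureFunc(): void;
  freeRecursive(): void;
}

// Hypothetical DOM element shape, for illustration only.
interface DomNodeLike {
  yogaNode: YogaNodeLike | undefined;
  _textMeasureCache: Map<string, unknown> | undefined;
  childNodes: DomNodeLike[];
}

// Detach callbacks and caches on every node, then drop the JS references.
function detachJsState(node: DomNodeLike): void {
  node.yogaNode?.unsetMeasureFunc();
  node._textMeasureCache?.clear();
  node._textMeasureCache = undefined;
  for (const child of node.childNodes) detachJsState(child);
  node.yogaNode = undefined;
}

function releaseSubtree(root: DomNodeLike): void {
  const rootYoga = root.yogaNode; // capture before the reference is nulled
  detachJsState(root);
  // Free the native subtree exactly once, from the root, to avoid double frees.
  rootYoga?.freeRecursive();
}

// Demo with counting fakes instead of real Yoga nodes.
let freedCount = 0;
let unsetCount = 0;
const fakeYoga = (): YogaNodeLike => ({
  unsetMeasureFunc() {
    unsetCount++;
  },
  freeRecursive() {
    freedCount++;
  },
});
const childNode: DomNodeLike = {
  yogaNode: fakeYoga(),
  _textMeasureCache: undefined,
  childNodes: [],
};
const tree: DomNodeLike = {
  yogaNode: fakeYoga(),
  _textMeasureCache: new Map<string, unknown>([["80|exactly", 1]]),
  childNodes: [childNode],
};
releaseSubtree(tree);
console.log({ freedCount, unsetCount });
```

Whether the real leak is addressed by this ordering is unverified; the sketch only pins down what "explicitly freed before nulling" could mean in code.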

Data Available

Full monitor log at ~/.hermes/logs/tui-rss-monitor.log:

  • 30-second RSS samples across two crash cycles
  • Format: pid,time,rss_kb,rss_mb,vsz_kb,command
  • Captures transition from PID 76262 (peak 13,978 MB) to new PID 81703
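For anyone analysing a log in this format, a minimal sketch of a parser and growth-rate calculation follows. It assumes the time column is an ISO 8601 timestamp, which the issue does not state; the sample rows and PIDs are synthetic, not taken from the real log, and the function names are hypothetical.

```typescript
interface RssSample {
  pid: number;
  timeMs: number;
  rssKb: number;
}

// Parses one "pid,time,rss_kb,rss_mb,vsz_kb,command" row.
// Returns null for headers or malformed rows.
function parseLine(line: string): RssSample | null {
  const parts = line.split(",");
  if (parts.length < 6) return null;
  const [pidStr = "", timeStr = "", rssKbStr = ""] = parts;
  if (pidStr === "" || rssKbStr === "") return null;
  const pid = Number(pidStr);
  const timeMs = Date.parse(timeStr); // assumption: ISO 8601 timestamps
  const rssKb = Number(rssKbStr);
  if (!Number.isFinite(pid) || Number.isNaN(timeMs) || !Number.isFinite(rssKb)) {
    return null;
  }
  return { pid, timeMs, rssKb };
}

// Average RSS growth in MB per hour between the first and last sample of a PID.
function growthMbPerHour(samples: RssSample[], pid: number): number | null {
  const own = samples.filter((s) => s.pid === pid);
  if (own.length < 2) return null;
  const first = own[0];
  const last = own[own.length - 1];
  if (first === undefined || last === undefined) return null;
  const elapsedMs = last.timeMs - first.timeMs;
  if (elapsedMs <= 0) return null;
  const deltaMb = (last.rssKb - first.rssKb) / 1024;
  return (deltaMb * 3_600_000) / elapsedMs;
}

// Synthetic rows for illustration only.
const logLines = [
  "pid,time,rss_kb,rss_mb,vsz_kb,command",
  "1000,2026-04-24T14:03:00Z,102400,100,500000,node",
  "1000,2026-04-24T14:03:30Z,153600,150,500000,node",
  "1000,2026-04-24T14:04:00Z,204800,200,500000,node",
  "2000,2026-04-24T14:56:00Z,160768,157,400000,node",
];
const samples = logLines
  .map(parseLine)
  .filter((s): s is RssSample => s !== null);
console.log(growthMbPerHour(samples, 1000));
```

Computing the rate per PID keeps the restart (a new PID with a low baseline) from being read as a sudden drop in the same process.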

Checklist

  • Confirmed on latest main (34c3e671)
  • Confirmed with default config (no custom sections overrides)
  • Reproduced in normal usage (not a stress test)
  • Gateway unaffected

extent analysis

TL;DR

Investigate and modify the clearYogaNodeReferences function to ensure proper deallocation of WASM nodes and text measurement buffers to fix the native memory leak.

Guidance

  • Review the ink/dom.ts file and the _textMeasureCache implementation to understand how text measurements are cached and cleared.
  • Verify that the clearYogaNodeReferences function properly frees all WASM nodes and text measurement buffers before nulling JS references.
  • Consider adding a global/shared cache with a TTL to cap _textMeasureCache growth and prevent churn.
  • Modify the memoryMonitor.ts to monitor RSS alongside heapUsed to detect native leaks earlier.

Example

No code snippet is provided as the issue requires investigation and modification of the existing codebase.

Notes

The native memory leak is suspected, though not yet confirmed, to come from incomplete deallocation of WASM nodes and text measurement buffers in the clearYogaNodeReferences function. The expanded-sections default amplifies it by increasing the number of Yoga measure/re-layout cycles per frame.

Recommendation

As a candidate fix, modify the clearYogaNodeReferences function so that WASM nodes and text measurement buffers are deallocated before the JS references are nulled. If the suspected root cause is correct, this should reduce the native memory growth until a permanent fix is implemented.
