hermes - Bug: slash_worker lifecycle gaps and system fragility observed during intensive Dashboard usage (#21370)



During high-frequency interaction tests with hermes dashboard on macOS, we observed a failure state that highlights critical gaps in subprocess lifecycle management. While #21370 identifies the leak, our experience suggests that the leaked processes also make the system fragile: they were present when an update failed and they complicated recovery afterwards.

Observations

  1. Accumulation: Intensive UI usage leaves dozens of slash_worker processes orphaned (reparented to PID 1).
  2. Update Collision: A routine "hermes update" was run while orphans were present. The update got stuck and left the environment corrupted.
  3. Failure State: Subsequent commands failed with .../venv/bin/python3: No module named pip.
  4. Recovery Issue: The orphans were not cleaned up by the atexit hooks and stayed active, which complicated manual recovery and environment sync.
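
To check whether a machine is in the state described above, a small script can list processes whose parent is PID 1 and whose command line mentions the worker. This is a minimal sketch, not part of hermes: the helper names find_orphans and list_orphans are hypothetical, and it assumes the worker's command line contains the string "slash_worker".

```python
import subprocess
from typing import List, Tuple


def find_orphans(ps_output: str, needle: str = "slash_worker") -> List[Tuple[int, str]]:
    """Parse 'pid ppid command' lines and return (pid, command) for matches parented to PID 1."""
    orphans = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 2)
        # Skip blank lines, header lines, and anything that is not "int int text".
        if len(parts) < 3 or not parts[0].isdigit() or not parts[1].isdigit():
            continue
        pid, ppid, command = int(parts[0]), int(parts[1]), parts[2]
        if ppid == 1 and needle in command:
            orphans.append((pid, command))
    return orphans


def list_orphans() -> List[Tuple[int, str]]:
    """Run ps for all processes (header suppressed) and filter for orphaned workers."""
    result = subprocess.run(
        ["ps", "-A", "-o", "pid=", "-o", "ppid=", "-o", "command="],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_orphans(result.stdout)


if __name__ == "__main__":
    for pid, command in list_orphans():
        print(pid, command)
```

A parent PID of 1 is only a heuristic: processes started deliberately as daemons also have PID 1 as their parent, so the command-line filter matters.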

Architectural Gaps

  • Unmanaged Resource: slash_worker is spawned outside tools.process_registry, making it invisible to global teardown logic.
  • Fragile Cleanup: Reliance on atexit is insufficient for update-induced restarts or hard crashes.
  • Lack of Self-Termination: The worker cannot detect when its parent has died.
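
The fragility of atexit can be reproduced in isolation. Python does not run atexit handlers when the process is killed by a signal it does not handle, or when os._exit() is called. The sketch below is a standalone demonstration, not hermes code: it starts a child that registers an atexit handler writing a marker file, then either lets it exit normally or kills it. The function name atexit_ran is hypothetical.

```python
import os
import subprocess
import sys
import tempfile

# Child program: registers a cleanup handler, signals readiness, then waits on stdin.
CHILD = r"""
import atexit, pathlib, sys
marker = pathlib.Path(sys.argv[1])
atexit.register(lambda: marker.write_text("cleaned up"))
print("ready", flush=True)
sys.stdin.readline()
"""


def atexit_ran(hard_kill: bool) -> bool:
    """Return True if the child's atexit handler created the marker file."""
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "marker")
        with subprocess.Popen(
            [sys.executable, "-c", CHILD, marker],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        ) as proc:
            proc.stdout.readline()  # wait until the handler is registered
            if hard_kill:
                proc.kill()  # SIGKILL on POSIX: the interpreter gets no chance to clean up
            else:
                proc.stdin.close()  # EOF lets the child finish normally
            proc.wait(timeout=10)
            return os.path.exists(marker)


if __name__ == "__main__":
    print("normal exit ran atexit:", atexit_ran(hard_kill=False))
    print("hard kill ran atexit:", atexit_ran(hard_kill=True))
```

Any cleanup that must survive a hard kill of the parent therefore has to live somewhere other than the parent's own atexit hooks, for example in the child itself.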

Proposed Fix

I have verified a defense-in-depth fix on macOS, following the guidelines in AGENTS.md:

  1. Unified Management: Extended ProcessRegistry with register_host_process to track these workers.
  2. Fingerprinted Watchdog: Added a thread to slash_worker.py that monitors the parent's PID + create_time, so the worker can detect that its parent has died.
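
The watchdog idea can be sketched as follows. This is an illustrative approximation, not the patch itself: parent_fingerprint and start_parent_watchdog are hypothetical names, the polling interval is arbitrary, and the creation time is read through the third-party psutil package only if it happens to be installed (otherwise the fingerprint falls back to the parent PID alone).

```python
import os
import threading
from typing import Callable, Optional, Tuple

Fingerprint = Tuple[int, Optional[float]]


def parent_fingerprint() -> Fingerprint:
    """Return (parent PID, parent creation time or None) for the current process."""
    ppid = os.getppid()
    try:
        import psutil  # third-party and optional in this sketch

        return ppid, psutil.Process(ppid).create_time()
    except Exception:
        # psutil missing, or the parent vanished between the two calls.
        return ppid, None


def start_parent_watchdog(
    on_parent_gone: Callable[[], None],
    probe: Callable[[], Fingerprint] = parent_fingerprint,
    interval: float = 1.0,
) -> threading.Event:
    """Poll the parent fingerprint in a daemon thread and call on_parent_gone once it changes."""
    original = probe()
    stop = threading.Event()

    def watch() -> None:
        # stop.wait() returns False on timeout, so the loop polls every `interval` seconds.
        while not stop.wait(interval):
            if probe() != original:
                on_parent_gone()
                return

    threading.Thread(target=watch, name="parent-watchdog", daemon=True).start()
    return stop  # set this event to cancel the watchdog
```

In a worker, on_parent_gone might be something like lambda: os._exit(0), so that the process ends even if the main thread is blocked. Including the creation time in the fingerprint guards against mistaking a new, unrelated process that reuses the old PID for the original parent.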

I have the patch ready for PR. Please let me know if you would like me to submit it.
