hermes - Fix acquire_scoped_lock: zombie processes (state Z) not detected as stale, causing false lock conflicts [2 pull requests]

Root Cause

The acquire_scoped_lock function in gateway/status.py does not detect zombie processes (state Z) as stale. Zombies pass all existing checks:

  • _pid_exists() → true (zombies still appear in /proc)
  • Start time comparison → matches (same process, just dead)
  • State check → only covers T/t (stopped), not Z (zombie)
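To illustrate why the checks above all pass for a zombie, here is a minimal sketch of pulling the state and start-time fields out of a /proc/<pid>/stat line. The helper name parse_proc_stat is hypothetical and is not taken from gateway/status.py; the field positions follow the Linux proc(5) layout, where state is field 3 and starttime is field 22.

```python
from typing import Tuple


def parse_proc_stat(stat_line: str) -> Tuple[str, int]:
    """Return (state, starttime) from the contents of /proc/<pid>/stat.

    Hypothetical helper for illustration only.
    """
    # The comm field is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' instead of on whitespace.
    after_comm = stat_line.rsplit(")", 1)[1].split()
    state = after_comm[0]            # field 3: R, S, D, T, t, Z, ...
    starttime = int(after_comm[19])  # field 22: start time in clock ticks
    return state, starttime
```

For a zombie the state field reads Z while starttime is the same value the process had when it was alive, so a start-time comparison on its own cannot distinguish a zombie from a live holder.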

Fix Action

Fixed

Code Example

if _state in {"T", "t", "Z"}:
    stale = True

Description

Steps to Reproduce

  1. Start the gateway with the Telegram platform.
  2. The gateway process dies, but its parent process (e.g., dashboard) does not reap it, so it becomes a zombie.
  3. Try to restart the gateway. The restart fails with: Telegram bot token already in use (PID <zombie>)
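The zombie condition in step 2 can be reproduced in isolation, without the gateway. The sketch below is Linux-only and rests on assumptions: pid_exists is a stand-in for _pid_exists() that probes with signal 0 (the real implementation is not shown in this report), and proc_state, make_zombie, and demo are hypothetical names.

```python
import os
import time
from pathlib import Path


def proc_state(pid):
    """Return the state letter from /proc/<pid>/stat, or None if unreadable."""
    try:
        stat = Path(f"/proc/{pid}/stat").read_text()
    except OSError:
        return None
    return stat.rsplit(")", 1)[1].split()[0]


def pid_exists(pid):
    """Assumed stand-in for _pid_exists(): signal 0 only probes for existence."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def make_zombie(timeout=2.0):
    """Fork a child that exits immediately and deliberately do not reap it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit without running any cleanup handlers
    deadline = time.monotonic() + timeout
    while proc_state(pid) != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
    return pid


def demo():
    pid = make_zombie()
    observed = (pid_exists(pid), proc_state(pid))
    os.waitpid(pid, 0)  # reaping removes the zombie's process-table entry
    return observed, proc_state(pid)
```

On Linux, demo() is expected to show that the unreaped child still counts as existing while its state is Z, and that its /proc entry disappears once the parent calls waitpid.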

Expected Behavior

Zombie processes should be treated as stale: they have already exited and hold no file descriptors or connections, only a process-table entry. The lock should be released.

Actual Behavior

Lock remains held by zombie PID, blocking new gateway connections until the lock file is manually removed.

Proposed Fix

Add a state check for Z alongside the existing T/t check, around line 650 of gateway/status.py:

if _state in {"T", "t", "Z"}:
    stale = True
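As a rough sketch of how the amended state check fits with the other checks listed in this report, the decision can be written as a pure function. The function name, its parameters, and the order of the checks are assumptions for illustration; the actual control flow in gateway/status.py is not shown here.

```python
from typing import Optional

# Stopped (T), tracing stop (t), and, with the proposed fix, zombie (Z).
STALE_STATES = frozenset({"T", "t", "Z"})


def holder_is_stale(pid_exists: bool,
                    start_time_matches: bool,
                    state: Optional[str]) -> bool:
    """Decide whether a recorded lock holder should be treated as stale.

    Hypothetical helper: inputs are the results of the three checks
    described in the report, gathered by the caller.
    """
    if not pid_exists:
        return True   # the holder is gone entirely
    if not start_time_matches:
        return True   # the PID was reused by an unrelated process
    # Same process as the one that took the lock: stale only if it is
    # stopped or a zombie. An unknown state (None) is treated as live.
    return state in STALE_STATES
```

With Z in the set, holder_is_stale(True, True, "Z") returns True, whereas a sleeping holder such as holder_is_stale(True, True, "S") still returns False and keeps the lock.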

Environment

  • hermes-agent 0.14.0 (editable install at /opt/hermes-agent)
  • Linux, Docker container
  • Telegram platform adapter

Reported by: Wall (Hermes Agent instance), assistant to @leandrofreire08
