hermes: macOS gateway eventually hits [Errno 24] Too many open files and needs restart [1 comment, 1 participant]

GitHub stats
NousResearch/hermes-agent#14209 · Fetched 2026-04-23 07:46:05
Comments: 1 · Participants: 1 · Timeline events: 7 · Reactions: 0
Timeline (top): labeled ×4, closed ×1, commented ×1, cross-referenced ×1

Summary

Hermes gateway on macOS hit OSError: [Errno 24] Too many open files and eventually became unable to process Telegram messages, cron jobs, .env loads, dynamic imports, and outbound LLM/API requests. Restarting the launch agent temporarily recovers the service, but the failure suggests a file-descriptor leak or repeated resource retention under normal runtime load.

Environment

  • OS: macOS (Apple Silicon)
  • Runtime: launchd LaunchAgent (ai.hermes.gateway)
  • Hermes command:
    • /Users/qinxian/.hermes/hermes-agent/venv/bin/python -m hermes_cli.main gateway run --replace
  • Hermes home:
    • /Users/qinxian/.hermes
  • Repo:
    • NousResearch/hermes-agent

Symptoms

After running for a while, Hermes starts failing broadly with [Errno 24] Too many open files, including:

  • Telegram handling failures for inbound DM sessions
  • Cron scheduler failures opening temp files and .env
  • gh CLI helper/tool invocations failing with the same error
  • OpenAI/httpx connection errors caused by FD exhaustion
  • Python import machinery failing to scan the agent/ package directory

Representative failing paths observed:

  • /Users/qinxian/.hermes/.env
  • /Users/qinxian/.hermes/cron/.jobs_5dllpqbb.tmp
  • /Users/qinxian/.hermes/.channel_directory_ok1gaxt8.tmp
  • /Users/qinxian/.hermes/hermes-agent/agent

Representative stack traces

Gateway / import failure

OSError: [Errno 24] Too many open files: '/Users/qinxian/.hermes/hermes-agent/agent'
  File "/Users/qinxian/.hermes/hermes-agent/gateway/run.py", line 2920, in _handle_message_with_agent
  File "/Users/qinxian/.hermes/hermes-agent/gateway/run.py", line 7179, in _run_agent
  File "/Users/qinxian/.hermes/hermes-agent/gateway/run.py", line 6718, in run_sync
  File "/Users/qinxian/.hermes/hermes-agent/run_agent.py", line 757, in __init__
  File "<frozen importlib._bootstrap_external>", line 1662, in _fill_cache

Cron / dotenv failure

OSError: [Errno 24] Too many open files: '/Users/qinxian/.hermes/.env'
  File "/Users/qinxian/.hermes/hermes-agent/cron/scheduler.py", line 559, in run_job
  File "/Users/qinxian/.hermes/hermes-agent/venv/lib/python3.11/site-packages/dotenv/main.py", line 63, in _get_stream

OpenAI/httpx failure under FD exhaustion

openai.APIConnectionError: Connection error.
httpx.ConnectError: [Errno 24] Too many open files
  File "/Users/qinxian/.hermes/hermes-agent/tools/session_search_tool.py", line 155, in _summarize_session
  File "/Users/qinxian/.hermes/hermes-agent/agent/auxiliary_client.py", line 2289, in async_call_llm

Additional observations

At the time of failure, the Hermes process had a high FD count and many repeated opens around SQLite-related files:

  • ~/.hermes/state.db
  • ~/.hermes/state.db-wal
  • ~/.hermes/response_store.db
  • ~/.hermes/response_store.db-wal

There were also socket entries such as:

  • 127.0.0.1:54467->127.0.0.1:7897 (CLOSE_WAIT)

This may indicate one or both of:

  1. Repeated DB handle creation without timely close/reuse
  2. Network/client/socket leakage (e.g. lingering CLOSE_WAIT connections)
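
To tell the two hypotheses apart, it may help to tally which paths and sockets dominate the descriptor table. The following is a minimal sketch, assuming lsof is installed and the gateway PID is known; the helper names tally_open_names and top_open_names are hypothetical and not part of Hermes.

```python
import subprocess
from collections import Counter


def tally_open_names(lsof_field_output: str) -> Counter:
    """Count how often each file or socket name appears in `lsof -F n` output.

    In lsof field output every line starts with a one-letter field tag;
    lines tagged "n" carry the file name or socket address.
    """
    names = Counter()
    for line in lsof_field_output.splitlines():
        if line.startswith("n"):
            names[line[1:]] += 1
    return names


def top_open_names(pid: int, limit: int = 10):
    """Run lsof against one process and return its most frequently opened names."""
    result = subprocess.run(
        ["lsof", "-p", str(pid), "-F", "n"],
        capture_output=True,
        text=True,
        check=False,  # lsof may exit non-zero even when it printed useful output
    )
    return tally_open_names(result.stdout).most_common(limit)
```

If state.db or response_store.db shows up dozens of times, hypothesis 1 is the more likely one; many socket entries instead would point at hypothesis 2.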

Recovery

A full restart of the launch agent recovers Hermes immediately:

  • unload/bootout current ai.hermes.gateway
  • bootstrap the LaunchAgent again

After restart, the new Hermes process came up healthy with a low FD count (~42 open files), which supports the theory that the process accumulates descriptors over time rather than starting high.
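
The two recovery steps can be scripted. This is a sketch only: it assumes the agent runs in the per-user gui domain and that its plist lives at ~/Library/LaunchAgents/ai.hermes.gateway.plist, neither of which is stated in the report. The function names are hypothetical.

```python
import os
import subprocess
from pathlib import Path

LABEL = "ai.hermes.gateway"
# Assumed plist location; adjust to wherever the LaunchAgent is actually installed.
PLIST = Path.home() / "Library" / "LaunchAgents" / f"{LABEL}.plist"


def restart_commands(label=LABEL, plist=PLIST, uid=None):
    """Build the launchctl invocations for a bootout followed by a bootstrap."""
    uid = os.getuid() if uid is None else uid
    domain = f"gui/{uid}"  # per-user GUI domain (assumption)
    return [
        ["launchctl", "bootout", f"{domain}/{label}"],
        ["launchctl", "bootstrap", domain, str(plist)],
    ]


def restart_gateway():
    """Run both steps and report each exit code."""
    for argv in restart_commands():
        # bootout exits non-zero when the service is not loaded, so do not raise.
        done = subprocess.run(argv, check=False)
        print(" ".join(argv), "->", done.returncode)
```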

Why this matters

Once this state is reached, Hermes effectively degrades across multiple subsystems at once:

  • messaging
  • cron jobs
  • session summarization
  • tool execution
  • import/loading logic

So the impact is broad, not isolated.

Request

Please help investigate potential file descriptor leaks in the gateway runtime, especially around:

  • agent/auxiliary_client.py
  • tools/session_search_tool.py
  • cron dotenv loading
  • repeated SQLite handle creation without reuse (response_store.db, state.db)
  • lingering network connections / CLOSE_WAIT

If useful, I can provide more logs or test a diagnostic patch.

extent analysis

TL;DR

The Hermes gateway on macOS appears to leak file descriptors over time, most likely through repeated SQLite handle creation without timely close or reuse, through leaked network sockets (CLOSE_WAIT), or both. The process accumulates descriptors until calls fail with OSError: [Errno 24] Too many open files. The root cause is not yet confirmed.

Guidance

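  • Audit agent/auxiliary_client.py and tools/session_search_tool.py for clients or handles that are created per call and never closed. Both files appear in the httpx [Errno 24] trace, so HTTP client lifetime is the first thing to check there.
  • Check where connections to state.db and response_store.db are opened, and confirm each one is either closed after use or reused.
  • Review the cron dotenv loading path (cron/scheduler.py, run_job) to confirm the .env stream is closed after every load.
  • Check for sockets stuck in CLOSE_WAIT. That state means the peer has already closed the connection and the local process has not yet closed its own socket, so the fix is to close (or reuse) the owning client inside Hermes rather than to wait for a timeout.
  • Consider logging the process's open descriptor count against its RLIMIT_NOFILE soft limit so that growth is noticed before the process fails. Such a check detects the leak; it does not fix it.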
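
A descriptor check of this kind could look like the sketch below. It is an assumption-laden illustration, not Hermes code: it counts entries under /dev/fd (present on macOS and most Linux systems), and the names open_fd_count and fd_headroom and the 0.8 threshold are hypothetical.

```python
import os
import resource


def open_fd_count() -> int:
    """Approximate number of descriptors open in the current process (POSIX only)."""
    for fd_dir in ("/dev/fd", "/proc/self/fd"):
        try:
            # Listing the directory briefly opens one descriptor of its own,
            # which shows up in the listing; subtract it.
            return len(os.listdir(fd_dir)) - 1
        except OSError:
            continue
    raise RuntimeError("no descriptor directory available on this platform")


def fd_headroom(warn_fraction: float = 0.8):
    """Return (used, soft_limit, near_limit) for the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    used = open_fd_count()
    near_limit = soft != resource.RLIM_INFINITY and used >= warn_fraction * soft
    return used, soft, near_limit
```

Calling fd_headroom() from a periodic task and logging the result would show whether the count climbs steadily between restarts, which is the pattern the report describes.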

Example

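import sqlite3
from contextlib import closing

# sqlite3's own context manager only commits or rolls back the transaction;
# it does not close the connection, so wrap the connection in closing().
def execute_query(db_path, query, params=()):
    with closing(sqlite3.connect(db_path)) as conn:
        with conn:  # commit on success, roll back on error
            return conn.execute(query, params).fetchall()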
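
Where queries are frequent, an alternative is to keep one long-lived connection per database file instead of opening a handle per call, which keeps the descriptor count flat. The sketch below is illustrative only: SharedConnection is a hypothetical name, and whether Hermes can share a handle across its threads this way is an assumption.

```python
import sqlite3
import threading


class SharedConnection:
    """One reusable SQLite connection guarded by a lock."""

    def __init__(self, db_path):
        # check_same_thread=False allows use from several threads;
        # the lock below serializes access to the single handle.
        self._conn = sqlite3.connect(db_path, check_same_thread=False)
        self._lock = threading.Lock()

    def query(self, sql, params=()):
        # `with self._conn` commits on success and rolls back on error.
        with self._lock, self._conn:
            return self._conn.execute(sql, params).fetchall()

    def close(self):
        # Release the descriptor explicitly at shutdown.
        with self._lock:
            self._conn.close()
```

The trade-off is that all queries against one file are serialized through a single handle, which may or may not suit the gateway's workload.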

Notes

The provided stack traces and observations suggest that the issue is related to file descriptor leaks, but further investigation is needed to determine the root cause. The fact that a full restart of the launch agent recovers the process suggests that the leak is cumulative over time.

Recommendation

Until the leak is located, monitor the gateway's open descriptor count and restart the LaunchAgent before the count reaches the limit. For the fix itself, close or reuse SQLite connections and HTTP clients deterministically (for example with context managers) rather than relying on garbage collection to release them.
