hermes - 💡(How to fix) Fix double-.hermes path mismatch, the HOME env var leak, and the fallback-notification UX problem


Error Message

{'logged_in': False, 'auth_store': '/opt/data/auth.json', 'error': 'No Codex credentials stored. Run hermes auth to authenticate.'}

ERROR gateway.run: ✗ telegram error: [Errno 13] Permission denied: '/root/.local/state/hermes/gateway-locks/telegram-bot-token-<hash>.lock'

TypeError: 'NoneType' object is not iterable


Docker setup: wizard writes config + auth.json to $HERMES_HOME/.hermes/ but runtime reads $HERMES_HOME/ (double-.hermes mismatch)

TL;DR. When running hermes setup in the Docker image, the wizard reports paths under $HERMES_HOME/.hermes/ and persists auth.json, config.yaml, and other state there. The runtime (hermes config show, hermes profile show, the gateway) instead reads from $HERMES_HOME/, one directory shallower. As a result, the OAuth-completed openai-codex provider appears not logged in, the user's model selections are ignored, and the only way to get the runtime to see the wizard's choices is hermes config set ... (which writes to the runtime-canonical path).

Environment

Image: nousresearch/hermes-agent:main (digest sha256:7a47d19ed1d4fa98f178756fd33772c914d9853e414e8366c268773f55517944, built 2026-05-26)
Host: macOS / Docker Desktop, M-series Apple Silicon
Run command:

docker run -it --rm \
  -v ~/.hermes:/opt/data \
  nousresearch/hermes-agent:main \
  hermes setup

(After completing the wizard, the same issue surfaces when the persistent container is started detached with -e HERMES_DASHBOARD=1 and sleep infinity as CMD.)

Reproduction

1. Run the command above from an empty ~/.hermes.
2. In the wizard, pick OpenAI Codex as the provider and complete the device-code OAuth flow (browser sign-in returns Login successful).
3. The wizard prints its summary screen:

   📁 All your files are in ~/.hermes/:
   Settings: /opt/data/.hermes/config.yaml
   API Keys: /opt/data/.hermes/.env
   Data: /opt/data/.hermes/cron/, sessions/, logs/

4. The wizard exits; the container exits.
5. Inspect the host bind mount. Both paths now exist:

   ~/.hermes/config.yaml (57 KB, image template, top level)
   ~/.hermes/.hermes/config.yaml (10 KB, wizard-written, inner)
   ~/.hermes/.env (23 KB, top level)
   ~/.hermes/.hermes/auth.json (~5 KB, wizard-written, inner; codex OAuth tokens)

6. Start the persistent container and inspect the runtime view:

docker exec hermes-agent hermes config path

→ /opt/data/config.yaml (top level — NOT the wizard's file)

docker exec hermes-agent hermes config show | grep -A1 Model

→ Model: {'default': 'anthropic/claude-opus-4.6', 'provider': 'auto', ...}

(i.e. defaults; user's openai-codex/gpt-5.4 selection NOT visible)

docker exec hermes-agent hermes profile show default

→ Model: anthropic/claude-opus-4.6 (auto)

docker exec hermes-agent python -c "from hermes_cli.auth import get_codex_auth_status; print(get_codex_auth_status())"

→ {'logged_in': False, 'auth_store': '/opt/data/auth.json',

'error': 'No Codex credentials stored. Run hermes auth to authenticate.'}

get_hermes_home() / "auth.json" resolves to /opt/data/auth.json (top level), but the actual file is at /opt/data/.hermes/auth.json.

Workaround that fixes it

Move auth.json to where the runtime expects it:

docker exec hermes-agent cp /opt/data/.hermes/auth.json /opt/data/auth.json
docker exec hermes-agent chown hermes:hermes /opt/data/auth.json
docker exec hermes-agent chmod 600 /opt/data/auth.json

Re-apply model selection via the runtime API (which writes to the canonical path):

docker exec hermes-agent hermes config set model.provider openai-codex
docker exec hermes-agent hermes config set model.default gpt-5.3-codex
docker exec hermes-agent hermes config set model.base_url https://chatgpt.com/backend-api/codex

After both steps, hermes profile show default reports Model: gpt-5.3-codex (openai-codex) and get_codex_auth_status() returns logged_in: True.
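For anyone scripting the copy step of this workaround, here is a minimal Python sketch. The helper name promote_wizard_files is hypothetical and not part of hermes_cli; it only assumes the layout described in this issue (wizard files under <data root>/.hermes/, runtime files directly under the data root). It does not chown, since that needs root inside the container, and by default it refuses to overwrite an existing top-level file such as the image's template config.yaml.

```python
import os
import shutil
import tempfile
from pathlib import Path


def promote_wizard_files(data_root, names=("auth.json",), overwrite=False):
    """Copy wizard-written files from <data_root>/.hermes/ up to <data_root>/.

    Hypothetical helper for illustration only; not a Hermes API.
    Returns the list of destination paths that were actually written.
    """
    data_root = Path(data_root)
    inner = data_root / ".hermes"
    copied = []
    for name in names:
        src = inner / name
        dst = data_root / name
        if not src.is_file():
            continue  # the wizard never wrote this file
        if dst.exists() and not overwrite:
            continue  # keep whatever the runtime already reads
        shutil.copy2(src, dst)
        os.chmod(dst, 0o600)  # same mode as the manual chmod 600 step
        copied.append(dst)
    return copied
```

Run against the bind mount on the host, this would be called as promote_wizard_files(Path.home() / ".hermes"). The model settings still need hermes config set, because the top-level config.yaml already exists and is skipped unless overwrite=True.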

Related: HOME env var leaks as empty or /root for the gateway process

A secondary issue I hit while diagnosing the above: the gateway service runs as the hermes user (UID 10000) via s6-setuidgid hermes in main-wrapper.sh. However, HOME does not appear in the gateway process's environment:

tr '\0' '\n' < /proc/$PID/environ | grep HOME

returns nothing. Python's os.path.expanduser('~') then falls back to /root (root's home per /etc/passwd lookup), and Hermes attempts to write its gateway-lock file at:

/root/.local/state/hermes/gateway-locks/telegram-bot-token-<hash>.lock

/root is mode 0700 inside the image, so the hermes user can't even traverse it. Result:

ERROR gateway.run: ✗ telegram error: [Errno 13] Permission denied: '/root/.local/state/hermes/gateway-locks/telegram-bot-token-<hash>.lock'
WARNING gateway.run: Gateway started with no connected platforms — 1 platform(s) queued for retry: telegram

The gateway retries with exponential backoff forever. No Telegram delivery happens.

The hermes user's passwd entry already declares /opt/data as its home:

hermes:x:10000:10000::/opt/data:/bin/sh

So adding -e HOME=/opt/data to the docker run invocation fixes this.

Suggested upstream fix: have main-wrapper.sh (or the cont-init scripts) export HOME=/opt/data before exec s6-setuidgid hermes ..., or have the s6 service definition do it. Docker users would then not need to know about the HOME quirk.
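To check how a given HOME value changes the resolved path, the following sketch calls os.path.expanduser under a temporarily patched environment. The helper names (home_env, expand_with_home, passwd_home) are made up for this example. One detail worth knowing while diagnosing: per the Python documentation, when HOME is unset on Unix, expanduser looks up the current UID in the password database, so a /root result may point to HOME=/root being inherited somewhere or to the path being computed before privileges are dropped. That is an assumption to verify, not something confirmed against the Hermes code.

```python
import os
import pwd
from contextlib import contextmanager


@contextmanager
def home_env(value):
    """Temporarily set HOME to `value`, or unset it when `value` is None."""
    saved = os.environ.get("HOME")
    try:
        if value is None:
            os.environ.pop("HOME", None)
        else:
            os.environ["HOME"] = value
        yield
    finally:
        # Restore the caller's HOME exactly as it was.
        if saved is None:
            os.environ.pop("HOME", None)
        else:
            os.environ["HOME"] = saved


def expand_with_home(path, home):
    """Return os.path.expanduser(path) as if HOME were `home`."""
    with home_env(home):
        return os.path.expanduser(path)


def passwd_home():
    """Home directory of the current UID per the password database, if any."""
    try:
        return pwd.getpwuid(os.getuid()).pw_dir
    except KeyError:
        return None  # UID has no passwd entry
```

With HOME=/opt/data, expand_with_home("~/.local/state", "/opt/data") yields a path under the bind mount that the hermes user can write to, which matches the effect of passing -e HOME=/opt/data.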

User-facing UX consequence when fallback handles the case

Independent of the above, a UX note: when the primary provider's call returns a malformed response and triggers TypeError: 'NoneType' object is not iterable (which the agent loop classifies as Non-retryable client error), Telegram users see two error notifications before the fallback's actual reply:

⚠️ The model provider failed after retries. I kept raw provider details out of chat; check gateway logs for diagnostics.
🔄 Primary model failed — switching to fallback: <model> via <provider>
<actual response from fallback>

The first message in particular feels misplaced: the fallback successfully handled the request, so the user shouldn't see a "failed after retries" notice. Two possible improvements:

1. Suppress the gateway-level "model provider failed after retries" notice when a fallback chain is configured and will be attempted on the same turn.
2. Or expose a config knob (e.g. display.fallback_notice.style: verbose|minimal|silent) so operators can choose how to surface the switch. A minimal marker like a single character / emoji would be enough to tell users the response came from a fallback model.

(I worked around this locally by patching both message strings. Happy to send a PR if it'd be useful.)
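The two proposals could look roughly like this in code. This is only a sketch of the suggested behavior: FallbackNoticeStyle, render_fallback_notice, and should_send_failure_notice are hypothetical names, the config key display.fallback_notice.style does not exist in Hermes today as far as this report shows, and the verbose text is placeholder wording rather than the gateway's actual string.

```python
from enum import Enum


class FallbackNoticeStyle(str, Enum):
    """Proposed values for a display.fallback_notice.style setting (hypothetical)."""
    VERBOSE = "verbose"
    MINIMAL = "minimal"
    SILENT = "silent"


# Placeholder wording for illustration, not the gateway's real message.
VERBOSE_TEMPLATE = "🔄 Switching to fallback: {model} via {provider}"


def render_fallback_notice(style, model, provider):
    """Return the chat notice for a fallback switch, or None to send nothing."""
    style = FallbackNoticeStyle(style)  # raises ValueError on unknown styles
    if style is FallbackNoticeStyle.SILENT:
        return None
    if style is FallbackNoticeStyle.MINIMAL:
        return "🔄"  # single-character marker that a fallback answered
    return VERBOSE_TEMPLATE.format(model=model, provider=provider)


def should_send_failure_notice(fallback_chain):
    """Send "failed after retries" only when no fallback will be attempted."""
    return len(fallback_chain) == 0
```

With style set to minimal, a Telegram user would see one marker followed by the fallback's reply instead of two error messages.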

What I think is going on

Best guess: HERMES_HOME and the wizard's notion of "the .hermes data dir" diverged at some point. The Docker image sets HERMES_HOME=/opt/data in the Dockerfile (visible via docker image inspect), so get_hermes_home() returns /opt/data. But the wizard appears to compute its own "data dir" via expanduser("~/.hermes") or similar, which under Docker (with HOME empty/root) resolves to /root/.hermes, but the bind mount makes the visible path /opt/data/.hermes. So writes land in a .hermes subdirectory under the bind-mounted data root, one level below where the runtime reads. The two HOME-related bugs may be the same underlying issue surfacing in different code paths.
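One way the observed layout could arise is sketched below. Both functions are hypothetical stand-ins, not the real Hermes implementations: runtime_home assumes the runtime trusts HERMES_HOME as-is, and wizard_home assumes the wizard appends ".hermes" to a home directory taken from HOME or, failing that, from the user's passwd entry. Under those assumptions, a home of /opt/data (the hermes user's passwd home) reproduces the extra .hermes level exactly.

```python
from pathlib import PurePosixPath


def runtime_home(env):
    """Assumed runtime behavior: use HERMES_HOME directly."""
    return PurePosixPath(env["HERMES_HOME"])


def wizard_home(env, passwd_home):
    """Assumed wizard behavior: <home>/.hermes, ignoring HERMES_HOME.

    `passwd_home` stands in for the passwd-database fallback used when
    HOME is not set in the environment.
    """
    home = env.get("HOME", passwd_home)
    return PurePosixPath(home) / ".hermes"


# Environment as described for the Docker image: HERMES_HOME set, HOME absent.
env = {"HERMES_HOME": "/opt/data"}
runtime_auth = runtime_home(env) / "auth.json"
wizard_auth = wizard_home(env, passwd_home="/opt/data") / "auth.json"
```

Here runtime_auth is /opt/data/auth.json and wizard_auth is /opt/data/.hermes/auth.json, the same pair of paths seen in the reproduction. If this is what happens, having the wizard call the same home-resolution function as the runtime would remove the divergence.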

Happy to provide additional diagnostics, gateway/agent logs, or test specific fixes.
