hermes - ✅(Solved) Fix [Bug]: Docker-deployed Hermes can create root-owned config.yaml when CLI is run as root, causing gateway permission errors [1 pull requests, 1 participants]

GitHub stats
NousResearch/hermes-agent#16480
Fetched 2026-04-28 06:53:02
Comments: 0
Participants: 1
Timeline: 6
Reactions: 0
Timeline (top): labeled ×4, cross-referenced ×1, unsubscribed ×1

Error Message

Operating System

Ubuntu 24.04.1 LTS WSL

Python Version

3.13.5

Hermes Version

Hermes Agent v0.11.0 (2026.4.23), clone from main branch

Additional Logs / Traceback (optional)

Root Cause

After restarting the gateway, the service may fail with file permission / ownership errors because files under $HERMES_HOME have become root-owned.

Fix Action

Fixed

PR fix notes

PR #16502: fix(cli): refuse-or-drop privileges when launched as root in Docker (#16480)

Description (problem / solution / changelog)

What does this PR do?

When a user enters a running Hermes container via docker exec -it … bash and launches hermes interactively, the Docker entrypoint's privilege drop is bypassed and the CLI runs as root. Any files it writes under $HERMES_HOME (config.yaml, logs, cache, state DBs, …) become root-owned, and the long-running gateway running as the hermes user later hits Permission denied / Operation not permitted errors — exactly the symptom in #16480 and the related #15865 / #16096.

This PR adds a small startup-time guard in the CLI that detects "running as root inside a container with a declared non-root runtime user" and either transparently drops privileges (re-execs via gosu / su-exec) or refuses with an actionable error. It is purely additive and a no-op outside that exact situation.

Related Issue

Fixes #16480.

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • hermes_cli/docker_guard.py — new module. Exposes enforce_docker_non_root(). Triggers only when all of the following hold:

    1. Running on Linux.
    2. os.geteuid() == 0.
    3. Inside a container — detected via /.dockerenv, /run/.containerenv, container hints in /proc/1/cgroup (docker / containerd / kubepods / podman / crio), or HERMES_DOCKER=1.
    4. The container declared a non-root runtime user via HERMES_UID (and optionally HERMES_GID).

    When triggered:

    • Transparently re-execs the same argv via gosu / su-exec as HERMES_UID:HERMES_GID (preferred — invisible to the user, same mechanism the entrypoint already uses).
    • Falls back to refusing with an actionable error if no privilege-dropper is installed, telling the user how to relaunch (gosu … / docker exec -u …).
    • HERMES_DOCKER_GUARD_REEXEC=1 loop-guard prevents infinite re-exec.
    • hermes version / --version / --help are exempt because they don't write state.
    • Opt-out via HERMES_DISABLE_DOCKER_ROOT_GUARD=1 for maintenance / debugging.
  • hermes_cli/main.py — call enforce_docker_non_root() at the very top of main(), wrapped in a broad try / except so an unforeseen guard failure can never break CLI startup.

  • tests/hermes_cli/test_docker_guard.py — 15 new tests.
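The trigger conditions listed above can be sketched roughly as follows. This is a minimal illustration, not the contents of hermes_cli/docker_guard.py: the function names `in_container`, `should_guard`, and `parse_id` and the injectable `root` parameter are hypothetical, chosen so the logic can be exercised without a real container.

```python
import sys
from pathlib import Path

# Substrings the PR description lists as container hints in /proc/1/cgroup.
CGROUP_HINTS = ("docker", "containerd", "kubepods", "podman", "crio")


def parse_id(value):
    """Return a non-negative int, or None if the value is missing or malformed."""
    try:
        number = int(value)
    except (TypeError, ValueError):
        return None
    return number if number >= 0 else None


def in_container(env, root=Path("/")):
    """Return True if any container marker named in the PR description is present."""
    if env.get("HERMES_DOCKER") == "1":
        return True
    if (root / ".dockerenv").exists() or (root / "run" / ".containerenv").exists():
        return True
    try:
        cgroup = (root / "proc" / "1" / "cgroup").read_text()
    except OSError:
        return False
    return any(hint in cgroup for hint in CGROUP_HINTS)


def should_guard(env, euid, platform=sys.platform, root=Path("/")):
    """The guard triggers only when all four conditions hold."""
    if env.get("HERMES_DISABLE_DOCKER_ROOT_GUARD") == "1":
        return False  # explicit opt-out
    if not platform.startswith("linux") or euid != 0:
        return False
    if not in_container(env, root):
        return False
    uid = parse_id(env.get("HERMES_UID"))
    return uid is not None and uid != 0
```

In real use the caller would pass `os.environ` and `os.geteuid()`; taking them as parameters here only keeps the sketch testable on any host.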

How to Test

Reproduce the original bug:

  1. docker compose -p hermes-agent-dev up -d --build
  2. docker exec -it hermes-agent-dev bash (lands you in the container as root).
  3. hermes — before this PR: runs as root, can write root-owned files under $HERMES_HOME.
  4. After this PR: the CLI either re-execs itself as hermes (UID 10000) via gosu and continues, or, if no gosu / su-exec is present, exits with a clear instruction to relaunch via docker exec -u hermes ….

Verify the guard's no-op cases:

  • HERMES_DISABLE_DOCKER_ROOT_GUARD=1 hermes → no-op.
  • hermes --version / hermes --help → no-op (writes no state).
  • Non-Linux host, non-root user, outside container, or HERMES_UID unset → no-op.
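One way the read-only exemption could be expressed is a check on the first CLI argument. The PR description does not say how the exemption is actually matched, so the helper name `is_exempt` and the first-argument rule below are assumptions made for illustration.

```python
# Subcommands and flags the PR description lists as exempt because they write no state.
EXEMPT_ARGS = frozenset({"version", "--version", "--help"})


def is_exempt(argv):
    """Return True when the first argument after the program name is read-only."""
    return len(argv) > 1 and argv[1] in EXEMPT_ARGS
```

Under this assumption `is_exempt(["hermes", "--version"])` is true, while a bare `hermes` or `hermes setup` still goes through the guard.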

Unit tests:

$ .venv/bin/python -m pytest tests/hermes_cli/test_docker_guard.py -q
...............  15 passed in 0.90s

Tests cover: no-op on non-Linux / non-root / non-container / missing HERMES_UID / HERMES_UID=0; disable env override; version / --version / --help bypass; re-exec loop marker breaks recursion; gosu re-exec argv shape (/usr/local/bin/gosu, 10000:10000, hermes, chat) and env propagation; refuse path: exit code 1 + actionable stderr + mention of the disable env; malformed HERMES_UID ignored; malformed HERMES_GID falls back to UID; container detection via /.dockerenv, HERMES_DOCKER=1, and bare-host (cgroup) negative case.
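The re-exec behaviour those tests describe (argv shape, GID fallback, loop marker) might look like the sketch below. The helper names `resolve_target` and `build_reexec` are hypothetical; only the environment variable names and the `gosu UID:GID command…` argv shape come from the PR description.

```python
def parse_id(value):
    """Return a non-negative int, or None if the value is missing or malformed."""
    try:
        number = int(value)
    except (TypeError, ValueError):
        return None
    return number if number >= 0 else None


def resolve_target(env):
    """Return (uid, gid) for the runtime user, or None if HERMES_UID is unusable."""
    uid = parse_id(env.get("HERMES_UID"))
    if uid is None or uid == 0:
        return None
    gid = parse_id(env.get("HERMES_GID"))
    if gid is None:
        gid = uid  # a malformed or missing HERMES_GID falls back to the UID
    return uid, gid


def build_reexec(argv, env, dropper_path):
    """Return (command, new_env) for the re-exec, or None when no re-exec should happen."""
    if env.get("HERMES_DOCKER_GUARD_REEXEC") == "1":
        return None  # loop guard: this process is already the re-exec'd one
    target = resolve_target(env)
    if target is None:
        return None
    uid, gid = target
    new_env = dict(env, HERMES_DOCKER_GUARD_REEXEC="1")
    return [dropper_path, f"{uid}:{gid}", *argv], new_env
```

A caller would typically locate the privilege dropper with `shutil.which("gosu")` or `shutil.which("su-exec")` and hand the result to `os.execve(command[0], command, new_env)`, which replaces the current process rather than spawning a child.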

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(cli): …)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix (no unrelated commits)
  • I've run pytest tests/hermes_cli/test_docker_guard.py -q and all tests pass
  • I've added tests for my changes
  • I've tested on my platform: macOS 26.5 (Apple Silicon) — Linux/Docker behaviour exercised via unit tests that simulate geteuid, /.dockerenv, /proc/1/cgroup, etc.

Documentation & Housekeeping

  • I've updated relevant documentation — N/A (internal startup guard; behaviour is invisible in the success path and prints actionable guidance in the refuse path).
  • I've updated cli-config.yaml.example if I added/changed config keys — N/A (no config keys; opt-out via env var only).
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — N/A.
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — guard short-circuits to no-op on non-Linux platforms; covered by tests.
  • I've updated tool descriptions/schemas if I changed tool behavior — N/A.

Screenshots / Logs

$ .venv/bin/python -m pytest tests/hermes_cli/test_docker_guard.py -q
...............
15 passed in 0.90s

Refuse path stderr (when gosu / su-exec are absent):

hermes: refusing to run as root inside a Docker container.
Detected: euid=0, container=yes, HERMES_UID=10000.
Re-launch as the unprivileged user, e.g.:
    docker exec -u hermes <container> hermes
or install `gosu` / `su-exec` in the image so Hermes can drop privileges automatically.
To bypass this guard (not recommended), set HERMES_DISABLE_DOCKER_ROOT_GUARD=1.

Risk / rollback

  • The guard is wrapped in a broad try / except inside main() — even an unforeseen failure cannot break CLI startup.
  • If a deployment ever needs the old behaviour, set HERMES_DISABLE_DOCKER_ROOT_GUARD=1.
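The "cannot break startup" property can be illustrated with a small wrapper. This is a sketch under assumptions: `run_cli` and its `guard` parameter are hypothetical stand-ins for main() and enforce_docker_non_root(), and it assumes the refuse path exits via `sys.exit`, which the PR description implies (exit code 1) but does not show.

```python
import sys


def run_cli(guard, argv):
    """Run the guard first, then continue startup even if the guard misbehaves."""
    try:
        guard()
    except Exception as exc:
        # Unexpected guard failures are reported and ignored. SystemExit derives
        # from BaseException, not Exception, so a deliberate refusal still exits.
        print(f"hermes: docker guard skipped: {exc}", file=sys.stderr)
    return "started: " + " ".join(argv)
```

The design point is that `except Exception` swallows bugs in the guard while leaving an intentional `sys.exit(1)` untouched, so the two promises in the PR (never break startup, refuse when needed) do not conflict.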

Changed files

  • hermes_cli/docker_guard.py (added, +202/-0)
  • hermes_cli/main.py (modified, +11/-0)
  • tests/hermes_cli/test_docker_guard.py (added, +167/-0)

Code Example

chown: changing ownership of '/opt/data/config.yaml': Operation not permitted


Bug Description

In Docker deployments, Hermes normally starts correctly: the entrypoint runs as root, fixes ownership, and then drops privileges to the hermes user before running hermes gateway run.

The problem occurs when users enter the running container as root via docker exec and manually run hermes. This bypasses the entrypoint privilege-dropping logic, so the CLI runs as root and may create or modify $HERMES_HOME/config.yaml and other state files as root-owned.

For example, the following screenshot shows Hermes running as root even though HERMES_UID and HERMES_GID are set to 10000:

<img width="600" alt="Image" src="https://github.com/user-attachments/assets/8460a768-d655-4365-9920-0a5f6d755654" />

This demonstrates that:

  • The CLI is executing as root
  • The environment variables (HERMES_UID, HERMES_GID) are present but not enforced
  • No privilege drop occurs in this execution path

Later, when the gateway runs normally as the hermes user, it can no longer access those files, causing permission errors.

The issue is not that the official gateway startup fails to drop privileges, but that interactive root execution is not guarded against and can silently poison the file ownership under $HERMES_HOME.

This may be related to previous permission-related reports such as #15865, where the reported symptom was:

chown: changing ownership of '/opt/data/config.yaml': Operation not permitted

PR #16096 fixed one specific entrypoint-side issue by moving the config.yaml chown/chmod handling into the root section before the gosu privilege drop, and by guarding the chown failure path.

However, the case described here is a different execution path: the user manually enters the running container as root and launches hermes, bypassing the entrypoint entirely. In that path, the fix from #16096 may repair config.yaml in some cases after gateway restart, but other files touched by the root-launched CLI or agent tools may still become root-owned.

This is also related in spirit to #4426 / #7357, where subprocess/tool environment isolation was improved by injecting a persistent HOME under $HERMES_HOME. That fix helps subprocesses write tool configuration into the persistent volume, but it does not prevent the main Hermes CLI process itself from being launched as root inside the container.

Therefore, the remaining issue is broader than config.yaml: any file under $HERMES_HOME created or modified by a root-launched Hermes CLI session may later become inaccessible to the normal gateway process running as the hermes user.


In later testing, restarting the gateway appeared to repair the ownership issue for config.yaml in some cases. However, permission errors may still occur for other files under $HERMES_HOME that were previously created or modified by the root-launched CLI or agent tools. For example, files related to history, memory, cache, logs, or state may also become root-owned and later fail to be read by the normal gateway process running as the hermes user.

Therefore, config.yaml is only one concrete example of the issue. The broader problem is that any file touched by a root-launched Hermes CLI session can become inaccessible to the normal non-root runtime.
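To see how far the ownership problem has spread in a given volume, a small scan along these lines may help. The function name `find_foreign_owned` is hypothetical; the sketch only reports entries and does not change ownership.

```python
import os


def find_foreign_owned(home, expected_uid):
    """Return sorted paths under `home` whose owner UID differs from expected_uid."""
    foreign = []
    for dirpath, dirnames, filenames in os.walk(home):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat() inspects symlinks themselves instead of following them.
                if os.lstat(path).st_uid != expected_uid:
                    foreign.append(path)
            except OSError:
                continue  # entry vanished or is unreadable; skip it
    return sorted(foreign)
```

For the setup in this report, a call such as `find_foreign_owned("/opt/data", 10000)` would list everything under the Hermes home that the hermes user (UID 10000) does not own.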

Steps to Reproduce

  1. Build and start Hermes in a Docker dev environment: docker compose -p hermes-agent-dev up -d --build

  2. Enter the running container as root: docker exec -it hermes-agent-dev bash

  3. Launch Hermes setup from this root shell: hermes setup

  4. Configure Hermes in the root-launched setup flow and trigger the config backup/update path.

  5. Verify file ownership with ls -l config.yaml; the file is now owned by root.

  6. Launch hermes: hermes

  7. Observe permission-related errors when the gateway runs as the hermes user, for example: chown: changing ownership of '/opt/data/config.yaml': Operation not permitted

Expected Behavior

When Hermes CLI is launched inside a Docker container, it should not execute any agent/tool command as root.

If the CLI is started from a root shell inside the container, Hermes should either:

  1. drop privileges to the hermes user before running any agent/tool command, or
  2. refuse to continue and instruct the user to run Hermes as the hermes user.

After restarting the gateway, it should continue to run normally as the hermes user without failing due to root-owned files under $HERMES_HOME.

Actual Behavior

After restarting the gateway, the service may fail with file permission / ownership errors because files under $HERMES_HOME have become root-owned.

config.yaml is one concrete example, but the same issue can also affect logs, cache files, state databases, or any other files created/modified by the root-launched CLI or agent tools.

Example error:

chown: changing ownership of '/opt/data/config.yaml': Operation not permitted
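The [Errno 13] Permission denied warnings in the gateway log follow from the ordinary Unix permission check. The sketch below is a simplified model (it ignores ACLs and capabilities), and the 0600 mode used in the usage note is an assumption, since the report does not state the file's mode.

```python
import stat


def can_read(mode, owner_uid, owner_gid, uid, gids):
    """Simplified Unix read check: exactly one permission class applies to a process."""
    if uid == 0:
        return True  # root bypasses the read-permission check
    if uid == owner_uid:
        return bool(mode & stat.S_IRUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)
```

With a root-owned file of mode 0600, `can_read(0o600, 0, 0, 10000, {10000})` is false, which matches the gateway's failure to read config.yaml as UID 10000. The chown error is a separate rule: an unprivileged process cannot change the owner of a file.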

### Affected Component

Gateway (Telegram/Discord/Slack/WhatsApp), CLI (interactive chat), Configuration (config.yaml, .env, hermes setup)

### Messaging Platform (if gateway-related)

N/A (CLI only)

### Debug Report

```shell
--- hermes dump ---
version:          0.11.0 (2026.4.23) [(unknown)]
os:               Linux 6.6.87.2-microsoft-standard-WSL2 x86_64
python:           3.13.5
openai_sdk:       2.32.0
profile:          default
hermes_home:      /opt/data
model:            deepseek-v4-flash
provider:         deepseek
terminal:         local

api_keys:
  openrouter           not set
  openai               not set
  anthropic            not set
  anthropic_token      not set
  nous                 not set
  google/gemini        not set
  gemini               not set
  glm/zai              not set
  zai                  not set
  kimi                 not set
  minimax              not set
  deepseek             set
  dashscope            not set
  huggingface          not set
  nvidia               not set
  ai_gateway           not set
  opencode_zen         not set
  opencode_go          not set
  kilocode             not set
  firecrawl            not set
  tavily               not set
  browserbase          not set
  fal                  not set
  elevenlabs           not set
  github               not set

features:
  toolsets:           hermes-cli
  mcp_servers:        0
  memory_provider:    built-in
  gateway:            running (docker (foreground), pid 7)
  platforms:          none
  cron_jobs:          0
  skills:             81

config_overrides:
  display.streaming: True
--- end dump ---


--- agent.log (last 200 lines) ---
  in "/opt/data/config.yaml", line 400, column 1
2026-04-27 09:04:42,447 INFO run_agent: Loaded environment variables from /opt/data/.env
2026-04-27 09:04:59,887 INFO hermes_cli.plugins: Plugin 'openai' registered image_gen provider: openai
2026-04-27 09:04:59,888 INFO hermes_cli.plugins: Plugin 'openai-codex' registered image_gen provider: openai-codex
2026-04-27 09:04:59,919 INFO hermes_cli.plugins: Plugin 'xai' registered image_gen provider: xai
2026-04-27 09:04:59,956 INFO hermes_cli.plugins: Plugin discovery complete: 5 found, 4 enabled
2026-04-27 09:05:00,588 INFO run_agent: Loaded environment variables from /opt/data/.env
2026-04-27 09:05:01,337 INFO agent.auxiliary_client: Vision auto-detect: using main provider deepseek (deepseek-v4-flash)
2026-04-27 09:05:02,221 WARNING gateway.config: Failed to process config.yaml — falling back to .env / gateway.json values. Check /opt/data/config.yaml for syntax errors. Error: [Errno 13] Permission denied: '/opt/data/config.yaml'
2026-04-27 09:05:02,268 WARNING gateway.config: Failed to process config.yaml — falling back to .env / gateway.json values. Check /opt/data/config.yaml for syntax errors. Error: [Errno 13] Permission denied: '/opt/data/config.yaml'
2026-04-27 09:05:03,042 INFO agent.auxiliary_client: Vision auto-detect: using main provider deepseek (deepseek-v4-flash)
2026-04-27 09:05:03,274 INFO agent.auxiliary_client: Vision auto-detect: using main provider deepseek (deepseek-v4-flash)
2026-04-27 09:05:06,946 INFO agent.auxiliary_client: Vision auto-detect: using main provider deepseek (deepseek-v4-flash)
2026-04-27 09:05:08,270 INFO agent.auxiliary_client: Auxiliary auto-detect: using main provider deepseek (deepseek-v4-flash)
2026-04-27 09:05:10,413 INFO agent.auxiliary_client: Auxiliary auto-detect: using main provider deepseek (deepseek-v4-flash)
2026-04-27 09:05:10,413 INFO agent.auxiliary_client: Auxiliary title_generation: using auto (deepseek-v4-flash) at https://api.deepseek.com
2026-04-27 09:06:02,203 WARNING gateway.config: Failed to process config.yaml — falling back to .env / gateway.json values. Check /opt/data/config.yaml for syntax errors. Error: [Errno 13] Permission denied: '/opt/data/config.yaml'
2026-04-27 09:06:02,244 WARNING gateway.config: Failed to process config.yaml — falling back to .env / gateway.json values. Check /opt/data/config.yaml for syntax errors. Error: [Errno 13] Permission denied: '/opt/data/config.yaml'
2026-04-27 09:06:19,116 INFO gateway.run: Received SIGTERM/SIGINT — initiating shutdown
2026-04-27 09:06:19,124 WARNING gateway.run: Shutdown diagnostic — other hermes processes running:
  root           1  0.0  0.0   2564  1280 ?        Ss   08:04   0:00 /usr/bin/tini -g -- /opt/hermes/docker/entrypoint.sh gateway run
  hermes        29  0.0  0.0   4320  3520 pts/0    Ss+  08:04   0:00 bash
  hermes       324  0.0  0.0   6792  3680 ?        R    09:06   0:00 ps aux
2026-04-27 09:06:19,125 INFO gateway.run: Stopping gateway...
2026-04-27 09:06:19,128 INFO gateway.platforms.feishu: [Feishu] Disconnected
2026-04-27 09:06:19,128 INFO gateway.run: ✓ feishu disconnected
2026-04-27 09:06:19,130 INFO gateway.run: Gateway stopped
2026-04-27 09:06:19,130 INFO gateway.run: Cron ticker stopped
2026-04-27 09:06:19,131 INFO gateway.run: Exiting with code 1 (signal-initiated shutdown without restart request) so systemd Restart=on-failure can revive the gateway.
2026-04-27 09:06:21,479 INFO hermes_cli.plugins: Plugin 'openai' registered image_gen provider: openai
2026-04-27 09:06:21,479 INFO hermes_cli.plugins: Plugin 'openai-codex' registered image_gen provider: openai-codex
2026-04-27 09:06:21,540 INFO hermes_cli.plugins: Plugin 'xai' registered image_gen provider: xai
2026-04-27 09:06:21,602 INFO hermes_cli.plugins: Plugin discovery complete: 5 found, 4 enabled
2026-04-27 09:06:22,356 INFO gateway.run: Starting Hermes Gateway...
2026-04-27 09:06:22,356 INFO gateway.run: Session storage: /opt/data/sessions
```

Root Cause Analysis (optional)

The normal Docker startup path appears to be correct: the entrypoint starts as root, fixes ownership, and then runs the gateway as the hermes user.

The issue occurs when users enter the running container as root and manually launch hermes. This bypasses the entrypoint, so the CLI and any agent/tool commands run as root.

In hermes_cli/main.py, the CLI startup path loads environment/config state and initializes logging early. It also reads config.yaml for settings such as security.redact_secrets, but there is no guard that prevents root execution inside Docker or switches the process to the hermes user before agent/tool commands run.

The screenshot shows the CLI running as root while HERMES_UID=10000 and HERMES_GID=10000 are present. This suggests that these variables are not enforced in the manual CLI path; they only apply when the Docker entrypoint runs.

Therefore, files created or rewritten under $HERMES_HOME by the root-launched CLI can become root-owned. config.yaml is one concrete example, but logs, cache files, state databases, or other tool-modified files may be affected as well.

Proposed Fix (optional)

Add a Docker-root execution guard in hermes_cli/main.py.

When Hermes CLI starts inside a Docker container as root, it should detect this before running any agent/tool command. If detected, Hermes should either:

  1. drop privileges to the hermes user, or
  2. refuse to continue and instruct the user to run Hermes as the hermes user.

This would prevent root-launched CLI sessions from creating root-owned files under $HERMES_HOME, such as config.yaml, logs, cache files, or state databases.

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR

extent analysis

TL;DR

To fix the issue, add a Docker-root execution guard in hermes_cli/main.py to drop privileges to the hermes user or refuse to continue when Hermes CLI is launched as root inside a Docker container.

Guidance

  • Detect the container environment (for example via /.dockerenv or HERMES_DOCKER=1) and check whether the process is running as root to decide whether the guard should trigger.
  • Use os.geteuid() to read the effective user ID; the guard applies when it is 0 and HERMES_UID names a non-root user. The pwd module can map that UID to a user name.
  • If running as root, drop privileges before executing any agent/tool commands. With os.setuid(), call os.setgroups() and os.setgid() first, because a process that has already given up root can no longer change its groups. Re-executing through gosu / su-exec, as PR #16502 does, avoids this ordering problem.
  • Alternatively, print an error message and exit, instructing the user to run Hermes as the hermes user.

Example

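import os

# Check if running in a container (marker file or HERMES_DOCKER=1) and as root
in_container = os.path.exists('/.dockerenv') or os.environ.get('HERMES_DOCKER') == '1'
raw_uid = os.environ.get('HERMES_UID', '')
if in_container and os.geteuid() == 0 and raw_uid.isdecimal() and int(raw_uid) != 0:
    hermes_uid = int(raw_uid)
    raw_gid = os.environ.get('HERMES_GID', '')
    hermes_gid = int(raw_gid) if raw_gid.isdecimal() else hermes_uid
    # Drop supplementary groups and the group ID first; setuid() must come last
    os.setgroups([])
    os.setgid(hermes_gid)
    os.setuid(hermes_uid)
    # Continue with the rest of the CLI startup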

Notes

This fix assumes that the HERMES_UID environment variable is set and accurate. Additional error handling may be necessary to ensure the fix is robust.

Recommendation

Apply the proposed fix by adding a Docker-root execution guard in hermes_cli/main.py to prevent root-launched CLI sessions from creating root-owned files under $HERMES_HOME.
