hermes - 💡(How to fix) Fix Multi-peer agent dispatch via proxy + Tailnet IPs as a complement to #9295

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

PR #9295 adds gateway.tailscale_serve for the "expose one machine's API server over Tailscale" case. A neighboring deployment shape benefits from a complementary pattern: N peers, each running its own hermes-agent, fronted by a small auth proxy and coordinated by a separate process. Filing this issue as design input alongside #9295, not as competition with it.

This is the agent-side companion to a WebUI-side issue at nesquena/hermes-webui that covers the same pattern from the composer perspective. Cross-link added once both posts are live.

Root Cause

  1. Loading .env deterministically from ~/.hermes/.env. A related issue I observed: API_SERVER_KEY set in ~/.hermes/.env was visible to hermes config show but NOT visible to the hermes gateway subprocess at start time on Windows — diagnostics suggest a load-order issue in env_loader.load_hermes_dotenv vs. config.py's env-var reads. This shows up at NousResearch/hermes-agent#31144 already. Workaround: also set the variable at User scope so the gateway picks it up via os.environ. Mentioning here because the multi-peer pattern hits this trap the first time someone tries to deploy the proxy on a fresh machine.

Fix Action

Fix / Workaround

  1. Loading .env deterministically from ~/.hermes/.env. A related issue I observed: API_SERVER_KEY set in ~/.hermes/.env was visible to hermes config show but NOT visible to the hermes gateway subprocess at start time on Windows — diagnostics suggest a load-order issue in env_loader.load_hermes_dotenv vs. config.py's env-var reads. This shows up at NousResearch/hermes-agent#31144 already. Workaround: also set the variable at User scope so the gateway picks it up via os.environ. Mentioning here because the multi-peer pattern hits this trap the first time someone tries to deploy the proxy on a fresh machine.

  2. Multi-machine deployment guide section in docs/. Even a one-paragraph "If you want to dispatch across several peers, here's the shape" — pointing at #9295 for the single-machine case and noting the proxy alternative for multi-peer — would help users get oriented. Today the docs assume a single-machine deployment.

Code Example

client  --[Tailnet ACL]-->  proxy@<tailscale-ip>:9121  --[Basic Auth]-->  hermes-agent@127.0.0.1
                                                                          (untouched)

---

2026-05-23: POST /api/remote-agent/invoke (target: peer-A) returned 200 with "pong" in 33s

---

# from the coordinator host, against the peer's Tailnet IP
curl -u coordinator:<32-char-password> \
     http://<peer-tailscale-ip>:9121/v1/models
# 200 OK -> the proxy is up AND the upstream API server has API_SERVER_KEY set

# Chat Completions round-trip
curl -u coordinator:<32-char-password> \
     -X POST http://<peer-tailscale-ip>:9121/v1/chat/completions \
     -H "Content-Type: application/json" \
     -d "{\"model\":\"<profile-default>\",\"messages\":[{\"role\":\"user\",\"content\":\"ping\"}],\"stream\":false}"
# 200 OK -> assistant reply streams back; SSE works the same way with stream=true
RAW_BUFFERClick to expand / collapse

Summary

PR #9295 adds gateway.tailscale_serve for the "expose one machine's API server over Tailscale" case. A neighboring deployment shape benefits from a complementary pattern: N peers, each running its own hermes-agent, fronted by a small auth proxy and coordinated by a separate process. Filing this issue as design input alongside #9295, not as competition with it.

This is the agent-side companion to a WebUI-side issue at nesquena/hermes-webui that covers the same pattern from the composer perspective. Cross-link added once both posts are live.

Why a separate proxy instead of --host 0.0.0.0

The constraint that led to this shape: as of v0.14.0, the agent has no built-in WebUI auth, and the only non-loopback bind path (hermes dashboard --insecure --host 0.0.0.0) exposes the dashboard on every local interface, not just the Tailnet — anything on the same LAN/Wi-Fi could hit it. --help flags this with a literal "DANGEROUS" warning. The API_SERVER_KEY Bearer covers /v1/* but the dashboard surface has no comparable gate.

So the pattern keeps the agent at its safe loopback default and adds two perimeter layers in front of it:

client  --[Tailnet ACL]-->  proxy@<tailscale-ip>:9121  --[Basic Auth]-->  [email protected]
                                                                          (untouched)

The proxy is ~300 lines of stdlib Python (http.server.ThreadingHTTPServer + urllib.request), no pip install on the peer. It:

  • Binds to a specific Tailscale interface IP (passed as argv), not 0.0.0.0.
  • Terminates HTTP Basic Auth from the coordinator (single user, 32-char password from env).
  • For /v1/* requests: strips inbound Authorization, injects Authorization: Bearer ${API_SERVER_KEY} from the peer's env, forwards to 127.0.0.1:8642.
  • For everything else (/, dashboard, static, SSE): forwards to 127.0.0.1:9119 with no inner auth.
  • Streams chunked responses for SSE without buffering.

The agent never moves off 127.0.0.1 — the local dashboard shortcut on the peer still works untouched.

How this differs from #9295

Axis#9295 (tailscale_serve)This pattern
ScopeOne machine exposes itself over TailscaleN peers fronted by a small auth proxy each
Dependencytailscale CLI present + authenticatedTailnet membership; no managed Serve
AuthAPI_SERVER_KEY Bearer (Tailscale Serve handles HTTPS)Outer Basic per-peer (auto-injected Bearer inside)
Lifecyclehermes gateway owns the Tailscale Serve listenerPeer-local Startup-folder shortcut runs the proxy
Best forQuick remote access to one machineCoordinator fan-out across several peers

These are not in competition. #9295 is the right answer when the goal is "make my one machine reachable." This pattern is the right answer when the goal is "my coordinator picks among several peers per message."

What would be nice from hermes-agent for this pattern

Not asking for any of these in #9295. Listing the rough edges in case any of them line up with separate work the maintainers are already doing:

  1. A documented bind=<tailnet-ip> knob with built-in WebUI auth. Closely related to #10567 (--host + CORS) and #15731 (dashboard chat with --host). Both are open. The end-state would be: the dashboard can bind to a non-loopback IP with first-class HTTP auth, removing the need for a side-car proxy. PR #9295 covers a different cut (Tailscale Serve does the binding, not the agent) — both shapes can coexist.

  2. Loading .env deterministically from ~/.hermes/.env. A related issue I observed: API_SERVER_KEY set in ~/.hermes/.env was visible to hermes config show but NOT visible to the hermes gateway subprocess at start time on Windows — diagnostics suggest a load-order issue in env_loader.load_hermes_dotenv vs. config.py's env-var reads. This shows up at NousResearch/hermes-agent#31144 already. Workaround: also set the variable at User scope so the gateway picks it up via os.environ. Mentioning here because the multi-peer pattern hits this trap the first time someone tries to deploy the proxy on a fresh machine.

  3. Multi-machine deployment guide section in docs/. Even a one-paragraph "If you want to dispatch across several peers, here's the shape" — pointing at #9295 for the single-machine case and noting the proxy alternative for multi-peer — would help users get oriented. Today the docs assume a single-machine deployment.

Verification (honest)

The pattern is wired end-to-end on a coordinator host. Both peers in the local Tailnet are showing offline as of this filing (8h and 1d since last contact respectively), so a fresh live transcript is not available right now. The most recent successful round-trip is preserved in the coordinator's tools/remote-agent/README.md:

2026-05-23: POST /api/remote-agent/invoke (target: peer-A) returned 200 with "pong" in 33s

Wire shape that produced it (Basic-only outbound; the proxy injects Bearer upstream for /v1/*):

# from the coordinator host, against the peer's Tailnet IP
curl -u coordinator:<32-char-password> \
     http://<peer-tailscale-ip>:9121/v1/models
# 200 OK -> the proxy is up AND the upstream API server has API_SERVER_KEY set

# Chat Completions round-trip
curl -u coordinator:<32-char-password> \
     -X POST http://<peer-tailscale-ip>:9121/v1/chat/completions \
     -H "Content-Type: application/json" \
     -d "{\"model\":\"<profile-default>\",\"messages\":[{\"role\":\"user\",\"content\":\"ping\"}],\"stream\":false}"
# 200 OK -> assistant reply streams back; SSE works the same way with stream=true

If this issue picks up traction, happy to extract a generic reference implementation of the proxy (it's stdlib-only and ~300 lines) into a gist for anyone wanting to try the same pattern.

Why no PR

The two "useful from the agent" pieces (bind + native auth; multi-peer doc page) are decisions the maintainers are best positioned to make. PR #9295 already lands one approach. This issue documents an alternative shape so the design surface is visible — it's not a PR proposal.

If this doesn't fit, no action needed. The pattern works fine as an out-of-tree convention.

If this is a direction the agent wants to take, happy to contribute the implementation -- the --host=<tailnet-ip> + native auth path on the agent side, and the multi-machine doc page. The local proxy is stdlib-only and ~300 lines: easy to either generalize into a reference or replace outright once the agent provides the native shape. Happy to work with maintainers on which path fits and on the test/verification harness for either.

Related

  • #9295 — Tailscale Serve PR by Dusk1e (the issue's companion)
  • #9269 — parent issue PR #9295 closes
  • #10567 — --host + CORS config for dashboard (open)
  • #15731 — dashboard chat tab + --host non-localhost (open)
  • #31144 — env_loader.load_hermes_dotenv vs. hermes config show disagreement
  • Companion WebUI issue: nesquena/hermes-webui#2962

AI Usage Disclosure

This issue was drafted with Claude (Opus 4.7) assistance and reviewed by a human contributor before posting. Architecture description, smoke-test transcript, and cross-issue links were verified by the human reviewer. The proxy implementation referenced is stdlib-only and ~300 lines; happy to extract a generic reference for the issue if useful, or replace it outright once the agent provides the native shape.

Edits: cross-link to companion issue nesquena/hermes-webui#2962 added; explicit offer to contribute the implementation upstream if maintainers want this direction added. Softened the framing in the AI disclosure to reflect the offer.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING