hermes - 💡(How to fix) Fix Design discussion: multi-profile deployments in a single gateway process


Design discussion: multi-profile deployments in a single gateway process

Summary

Today a hermes gateway run --profile <name> process is bound to exactly one Hermes profile, which means one model, one system prompt, one MEMORY.md, one set of skills/tools. For deployments where an operator wants to run multiple agent personalities in the same Hermes installation — each backed by a different model or prompt, talking to its own platform endpoint — the canonical answer today is "run N gateway processes, one per profile."

This works (and we're doing it), but it scales linearly in operational cost (N supervisors, N ports, N tunnels, N memory footprints). At small N it's fine; past ~5–10 it starts to bite.

This issue isn't asking for a PR — it's asking whether multi-profile-per-gateway is on the roadmap, what shape you'd want it to take if so, or whether "N gateway processes" is the canonical answer indefinitely.

Use case

We're shipping a Hermes integration with Microsoft Agent 365 (Hermes-A365). An A365 "blueprint" is Microsoft's term for an agent definition: an Entra app + service principal + bot messaging endpoint + permissions, registered in a tenant. Each blueprint has its own identity and its own purpose.

A realistic enterprise deployment looks like:

| Blueprint | Backing Hermes agent | Hermes profile config |
| --- | --- | --- |
| "Inbox Triage" | Sonnet 4.6, email-tool surface | profile: inbox-triage, port 3978 |
| "Calendar Concierge" | Sonnet 4.6, calendar-tool surface | profile: calendar-concierge, port 3979 |
| "Security Review Buddy" | Opus 4.7, file-system + code-search tools | profile: security-buddy, port 3980 |
| "Onboarding Helper" | Haiku 4.5, narrow read-only skills | profile: onboarding-helper, port 3981 |

Today this is 4 hermes gateway run processes. Each binds a port, each is fronted by a tunnel (cloudflared) or reverse proxy, each is its own process-supervisor entry.
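To make the per-process cost concrete, here is a minimal launcher sketch for the four-profile case. Only the hermes gateway run --profile <name> command comes from this issue; the profile names mirror the table above, and the build_commands and launch_all helpers are hypothetical stand-ins for whatever process supervisor an operator actually uses. How each process picks up its port is assumed to live in the profile's own config and is not shown.

```python
import subprocess

# Profile names from the table above; one gateway process per profile.
PROFILES = [
    "inbox-triage",
    "calendar-concierge",
    "security-buddy",
    "onboarding-helper",
]


def build_commands(profiles):
    """Return one `hermes gateway run --profile <name>` argv per profile."""
    return [["hermes", "gateway", "run", "--profile", name] for name in profiles]


def launch_all(profiles):
    """Start every gateway as a child process (hypothetical supervisor stand-in)."""
    return [subprocess.Popen(argv) for argv in build_commands(profiles)]
```

Calling launch_all(PROFILES) would start four independent processes, and each one still needs its own tunnel or reverse-proxy route in front of it, which is the linear cost described above.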

Adapter-level multi-instance is already in Hermes

For reference: the framework already has adapters that handle multiple instances of the same platform internally. gateway/platforms/slack.py:507 documents the multi-workspace pattern (comma-separated bot tokens, internal routing); weixin.py keys state by account_id; and matrix.py does something similar.

That pattern works great for "same personality, many endpoints": one Slack adapter, one Hermes agent, N workspaces feeding into it.

But it doesn't help with "different personalities, many endpoints", which is the multi-blueprint A365 case above and (we suspect) a class of use cases for other platforms too: agencies running differentiated bots for different clients in the same Hermes install, MSPs running per-tenant bots, etc.

Three deployment shapes (today's space)

  1. N gateway processes, one profile each — today's canonical path. Works at small N. Cost: N supervisor entries, N ports, N tunnels, N memory footprints.

  2. One gateway, multi-adapter for one platform — adapter-internal multi-instance (Slack pattern). One Hermes agent backs all instances. Useful for "same personality, many endpoints" but not the multi-blueprint case above.

  3. One gateway, multi-profile — what this issue is about. One process, N profiles co-resident, each with its own agent loop. Adapter selects the right profile based on incoming activity metadata (BF aaInstanceId, Slack workspace ID, etc.).

Path 3 is the one that isn't possible today as far as we can tell. If we've missed an existing mechanism, please redirect.
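As a rough illustration of what path 3 implies, the sketch below routes inbound activity to one of several co-resident profiles by a routing key. Everything here is hypothetical: ProfileRuntime, ProfileRouter, the routing keys, and the model labels are invented for illustration and are not Hermes APIs. The routing key stands in for whatever inbound metadata the adapter exposes, such as an instance ID or a workspace ID.

```python
from dataclasses import dataclass, field


@dataclass
class ProfileRuntime:
    """Hypothetical stand-in for one co-resident profile and its agent loop."""
    name: str
    model: str
    handled: list = field(default_factory=list)

    def handle(self, text: str) -> str:
        # A real agent loop would call the model; this only records and echoes.
        self.handled.append(text)
        return f"[{self.name}/{self.model}] {text}"


class ProfileRouter:
    """Maps an inbound routing key to exactly one profile runtime."""

    def __init__(self):
        self._by_key = {}

    def register(self, routing_key: str, runtime: ProfileRuntime) -> None:
        if routing_key in self._by_key:
            raise ValueError(f"routing key already bound: {routing_key}")
        self._by_key[routing_key] = runtime

    def dispatch(self, routing_key: str, text: str) -> str:
        runtime = self._by_key.get(routing_key)
        if runtime is None:
            # Fail closed: unknown endpoints never fall through to a default profile.
            raise KeyError(f"no profile bound to routing key: {routing_key}")
        return runtime.handle(text)


router = ProfileRouter()
router.register("instance-a", ProfileRuntime("inbox-triage", "sonnet"))
router.register("instance-b", ProfileRuntime("security-buddy", "opus"))
reply = router.dispatch("instance-b", "review this diff")
```

Rejecting unknown keys rather than falling back to a default profile is a design choice in this sketch, made so that a misconfigured endpoint cannot reach the wrong personality.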

What we're asking

  • Is multi-profile-in-one-gateway on Hermes' roadmap, or is "N processes" the canonical answer indefinitely?
  • If on the roadmap: what shape would you want it to take? Some axes:
    • ~/.hermes/config.yaml representing multiple profiles inline, vs. one config file per profile (today's shape) with the gateway loading N of them.
    • Profile selection: keyed by inbound metadata, by adapter, by routing rule, or operator-pinned per platform?
    • Session-store / memory isolation: presumably already correct since SessionSource includes platform + chat_id, but worth confirming for cross-profile guarantees.
    • Plugin / hook scoping: do pre_tool_call hooks fire per-profile or globally? Backwards compatibility?
  • If not on the roadmap: we'll lean further into path 1 + reverse proxy fronting N bridges, and skip the design work. Useful to know.
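On the isolation axis, one way to picture a cross-profile guarantee is a session key that carries the profile name alongside platform and chat_id. The sketch below is an assumption for discussion only: ScopedSessionKey and SessionStore are hypothetical and do not describe how Hermes' SessionSource or session store actually work, and the platform and chat identifiers are placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedSessionKey:
    """Hypothetical key: profile added on top of platform + chat_id."""
    profile: str
    platform: str
    chat_id: str


class SessionStore:
    """In-memory history keyed by the full scoped key."""

    def __init__(self):
        self._history = defaultdict(list)

    def append(self, key: ScopedSessionKey, message: str) -> None:
        self._history[key].append(message)

    def history(self, key: ScopedSessionKey) -> list:
        # Return a copy so callers cannot mutate stored history.
        return list(self._history.get(key, []))


store = SessionStore()
triage = ScopedSessionKey("inbox-triage", "example-platform", "chat-1")
security = ScopedSessionKey("security-buddy", "example-platform", "chat-1")
store.append(triage, "hello")
```

Here the two keys share a platform and chat_id but differ by profile, so the message appended under triage is not visible under security. If two profiles could ever see the same platform and chat_id pair, a key without the profile component would not separate them.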

Why we're asking now (not later)

We just shipped v0.1.x of Hermes-A365 to PyPI. The wrapper's register → publish → cleanup loop works end-to-end against a live tenant from a fresh pip install. The 1-blueprint case is fully supported.

Path 1's operational tax starts to land with the first operator who runs a 5+ blueprint deployment. We'd rather have the design conversation asynchronously now than scramble when that operator surfaces. We're not asking for a PR or a commitment to ship, just a read on whether this is a direction Hermes wants to grow in, and what shape would be acceptable if we eventually contribute it.

Happy to spec further or open a discussion in whatever forum makes most sense (issue, discussion, RFC doc).

— from the Hermes-A365 maintainers
