Today a `hermes gateway run --profile <name>` process is bound to exactly one Hermes profile, which means one model, one system prompt, one MEMORY.md, and one set of skills/tools. For deployments where an operator wants to run multiple agent personalities in the same Hermes installation (each backed by a different model or prompt, talking to its own platform endpoint), the canonical answer today is "run N gateway processes, one per profile."

This works (and we're doing it), but it scales linearly in operational cost (N supervisors, N ports, N tunnels, N memory footprints). At small N it's fine; past ~5–10 it starts to bite.

This issue isn't asking for a PR — it's asking whether multi-profile-per-gateway is on the roadmap, what shape you'd want it to take if so, or whether "N gateway processes" is the canonical answer indefinitely.

Design discussion: multi-profile deployments in a single gateway process

StepCodex · 2026-05-11T11:02:24Z

Use case

We're shipping a Hermes integration with Microsoft Agent 365 (Hermes-A365, https://github.com/satscryption/Hermes-A365). An A365 "blueprint" is Microsoft's term for an agent definition: an Entra app + service principal + bot messaging endpoint + permissions, registered in a tenant. Each blueprint has its own identity and its own purpose.

A realistic enterprise deployment looks like:

| Blueprint | Backing Hermes agent | Hermes profile config |
|---|---|---|
| "Inbox Triage" | Sonnet 4.6, email-tool surface | profile: `inbox-triage`, port 3978 |
| "Calendar Concierge" | Sonnet 4.6, calendar-tool surface | profile: `calendar-concierge`, port 3979 |
| "Security Review Buddy" | Opus 4.7, file-system + code-search tools | profile: `security-buddy`, port 3980 |
| "Onboarding Helper" | Haiku 4.5, narrow read-only skills | profile: `onboarding-helper`, port 3981 |

Today this is 4 `hermes gateway run` processes. Each binds a port, each is fronted by a tunnel (cloudflared) or reverse proxy, and each is its own process-supervisor entry.
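To make the linear cost concrete, here is a minimal sketch of what the one-process-per-profile setup means for a supervisor: one command line per profile. Only the `hermes gateway run --profile <name>` invocation comes from this issue. The `BLUEPRINTS` mapping and the `supervisor_commands` helper are hypothetical, and how each process learns its port is deliberately left out because that lives in the profile's own config.

```python
import shlex

# Profile names and ports copied from the table above; the mapping itself
# is an illustration, not a Hermes data structure.
BLUEPRINTS = {
    "inbox-triage": 3978,
    "calendar-concierge": 3979,
    "security-buddy": 3980,
    "onboarding-helper": 3981,
}


def supervisor_commands(blueprints):
    """Return one `hermes gateway run` command line per profile."""
    return [
        shlex.join(["hermes", "gateway", "run", "--profile", name])
        for name in blueprints
    ]


if __name__ == "__main__":
    for cmd in supervisor_commands(BLUEPRINTS):
        print(cmd)
```

Every new blueprint adds one more entry to this list, plus the matching port and tunnel, which is the linear growth described above.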

Adapter-level multi-instance is already in Hermes

For reference: the framework already has adapters that handle multiple instances of the same platform internally. `gateway/platforms/slack.py:507` documents the multi-workspace pattern (comma-separated bot tokens, internal routing); `weixin.py` keys state by `account_id`; `matrix.py` does something similar.

That pattern works great for "same personality, many endpoints": one Slack adapter, one Hermes agent, N workspaces feeding into it.

But it doesn't help with "different personalities, many endpoints", which is the multi-blueprint A365 case above and (we suspect) a class of use cases for other platforms too: agencies running differentiated bots for different clients in the same Hermes install, MSPs running per-tenant bots, etc.
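For contrast, the adapter-internal pattern can be sketched roughly as follows. This is not the code in `slack.py`; `SharedAgent`, `MultiWorkspaceAdapter`, and the token strings are hypothetical stand-ins, and the only detail taken from above is that the tokens arrive as a comma-separated string and every endpoint feeds one agent.

```python
from dataclasses import dataclass


def parse_tokens(raw: str) -> list[str]:
    """Split a comma-separated token string, dropping blanks."""
    return [tok.strip() for tok in raw.split(",") if tok.strip()]


@dataclass
class SharedAgent:
    """Hypothetical stand-in for the one agent (one profile) behind the adapter."""
    profile: str

    def reply(self, text: str) -> str:
        return f"[{self.profile}] {text}"


class MultiWorkspaceAdapter:
    """One adapter, many endpoints, a single shared agent."""

    def __init__(self, raw_tokens: str, agent: SharedAgent):
        self.tokens = parse_tokens(raw_tokens)
        self.agent = agent

    def handle(self, endpoint_index: int, text: str) -> str:
        # Whichever endpoint the message arrived on, the same agent answers.
        if not 0 <= endpoint_index < len(self.tokens):
            raise IndexError("unknown endpoint")
        return self.agent.reply(text)


if __name__ == "__main__":
    adapter = MultiWorkspaceAdapter("tok-a, tok-b", SharedAgent("default"))
    print(adapter.handle(0, "hi"))
    print(adapter.handle(1, "hi"))
```

Both calls are answered by the same profile, which is exactly why this shape cannot express one personality per endpoint.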

Three deployment shapes (today's space)

1. N gateway processes, one profile each: today's canonical path. Works at small N. Cost: N supervisor entries, N ports, N tunnels, N memory footprints.
2. One gateway, multi-adapter for one platform: adapter-internal multi-instance (Slack pattern). One Hermes agent backs all instances. Useful for "same personality, many endpoints" but not the multi-blueprint case above.
3. One gateway, multi-profile: what this issue is about. One process, N profiles co-resident, each with its own agent loop. Adapter selects the right profile based on incoming activity metadata (BF `aaInstanceId`, Slack workspace ID, etc.).

Path 3 is the one that isn't possible today as far as we can tell. If we've missed an existing mechanism, please redirect.
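To show the kind of mechanism path 3 implies, here is a minimal sketch of profile selection keyed by inbound metadata. Nothing here exists in Hermes as far as we know: `Profile`, `ProfileRouter`, and the `routing_key` metadata field are all hypothetical, and a real adapter would derive the key from something like the instance ID or workspace ID mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical profile record: a name plus the model backing it."""
    name: str
    model: str


class ProfileRouter:
    """Maps a routing key from inbound metadata to a co-resident profile."""

    def __init__(self):
        self._routes = {}

    def register(self, routing_key: str, profile: Profile) -> None:
        if routing_key in self._routes:
            raise ValueError(f"duplicate routing key: {routing_key}")
        self._routes[routing_key] = profile

    def select(self, metadata: dict) -> Profile:
        # "routing_key" is an assumed field name used only in this sketch.
        key = metadata.get("routing_key")
        try:
            return self._routes[key]
        except KeyError:
            raise LookupError(f"no profile registered for {key!r}") from None


if __name__ == "__main__":
    router = ProfileRouter()
    router.register("blueprint-inbox", Profile("inbox-triage", "Sonnet 4.6"))
    router.register("blueprint-security", Profile("security-buddy", "Opus 4.7"))
    print(router.select({"routing_key": "blueprint-security"}).name)
```

Failing loudly on an unknown key is a deliberate choice in this sketch: silently falling back to a default profile would let one blueprint's traffic reach another blueprint's agent.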

What we're asking

- Is multi-profile-in-one-gateway on Hermes' roadmap, or is "N processes" the canonical answer indefinitely?
- If on the roadmap: what shape would you want it to take? Some axes:
  - `~/.hermes/config.yaml` representing multiple profiles inline, vs. one config file per profile (today's shape) with the gateway loading N of them.
  - Profile selection: keyed by inbound metadata, by adapter, by routing rule, or operator-pinned per platform?
  - Session-store / memory isolation: presumably already correct since `SessionSource` includes platform + chat_id, but worth confirming for cross-profile guarantees.
  - Plugin / hook scoping: do `pre_tool_call` hooks fire per-profile or globally? Backwards compatibility?
- If not on the roadmap: we'll lean further into path 1 + reverse proxy fronting N bridges, and skip the design work. Useful to know.
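The session-isolation question above can be made concrete with a small sketch. It assumes, hypothetically, that two co-resident profiles could see the same platform and chat_id. The key classes below are illustrations only and do not mirror Hermes' actual `SessionSource`; the "teams" platform string is likewise just a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceKey:
    """Hypothetical session key built from platform + chat_id only."""
    platform: str
    chat_id: str


@dataclass(frozen=True)
class ProfileScopedKey:
    """The same key with the profile name added (hypothetical)."""
    profile: str
    platform: str
    chat_id: str


def distinct_sessions(keys) -> int:
    """Number of separate session buckets a store keyed this way would hold."""
    return len(set(keys))


if __name__ == "__main__":
    # Two profiles receive a message with the same platform and chat_id.
    unscoped = [SourceKey("teams", "chat-1"), SourceKey("teams", "chat-1")]
    scoped = [
        ProfileScopedKey("inbox-triage", "teams", "chat-1"),
        ProfileScopedKey("security-buddy", "teams", "chat-1"),
    ]
    print(distinct_sessions(unscoped))
    print(distinct_sessions(scoped))
```

Under that assumption the unscoped keys collapse into one bucket while the profile-scoped keys stay separate, so whether the profile needs to be part of the key depends on whether chat IDs can ever repeat across profiles on one platform.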

Why we're asking now (not later)

We just shipped v0.1.x of Hermes-A365 to PyPI. The wrapper's register → publish → cleanup loop works end-to-end against a live tenant from a fresh pip install. The 1-blueprint case is fully supported.

The first operator with a 5+ blueprint deployment is when path 1's operational tax starts to land. We'd rather have the design conversation asynchronously now than scramble when that operator surfaces. We're not asking for a PR or commitment to ship — just a read on whether this is a direction Hermes wants to grow in, and what shape would be acceptable if we eventually contribute it.

Happy to spec further or open a discussion in whatever forum makes most sense (issue, discussion, RFC doc).

— from the Hermes-A365 maintainers
