hermes: Fix Matrix gateway: slash command + free-text question in a single multi-line message is rejected, blocking inline orchestration patterns

Context — how we use Hermès

Our deployment runs Hermès Agent as a long-lived profile (hermes profile create) with the Matrix gateway enabled, connected to a private Synapse 1.151.0 homeserver. An operations team interacts with Hermès in a single dedicated Matrix room via Element clients.

Behind Hermès, we run a custom OpenAI-compatible LLM dispatcher that mediates every chat completions call the agent makes. Its purpose is to enable concurrent use of local and cloud LLMs, routed automatically per request according to rules we define (admission policy, queue priorities, context-size thresholds, fallbacks). Default routing is local Ollama; cloud is selected when rules say so.

In this setup, slash commands typed by a user in the Matrix room are the natural in-band lever for per-message overrides on top of those automatic rules — for instance: "for this specific, harder question, force a more capable cloud model". The most natural way to express such an override and the question it applies to is a single multi-line Matrix message:

/model ollama-cloud/glm-5.1
Translate the following document into FR: <attached>

Element supports inline newlines via Shift+Enter, so this is a one-message, single-send interaction. The user types both lines, presses Send once, and expects the gateway to process the slash command first and then handle the remaining text as a normal user message answered by the just-selected model.

That's the use case we want to support. Today we cannot.

What we observe

When a Matrix message starts with /model <model_name> followed by a newline and additional text, the gateway's slash command parser does not stop at the newline. It treats everything after /model (including the newline and the trailing question) as the model name argument. The model name validation then sees "<model>\n<rest of message>", detects whitespace inside it, and rejects the whole message with:

Model names cannot contain spaces

The rejection is a problem on two counts:

  1. The error message is misleading. The model name itself contains no space; it is followed by a newline, and the parser folds the newline and the question into the argument. Users who try the obvious workarounds (URL-encoding a space, removing punctuation) hit a wall before realising the parser path itself is the problem.
  2. Both the model switch and the question fail. The question portion of the message is dropped silently: Hermès does not answer it after the rejection.
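
The behaviour is consistent with a parser that takes everything after the command name as the argument. The following is a minimal sketch of that suspected failure mode, not the actual gateway code: the function naive_parse_model_command and its whitespace check are assumptions made purely for illustration.

```python
def naive_parse_model_command(text: str) -> str:
    """Hypothetical parser: everything after '/model ' is the model name."""
    command, _, arg = text.partition(" ")
    if command != "/model":
        raise ValueError("not a /model command")
    # str.isspace() is true for "\n" as well as " ", so a newline trips a
    # check whose error message only mentions spaces.
    if any(ch.isspace() for ch in arg):
        raise ValueError("Model names cannot contain spaces")
    return arg


# The argument below contains a newline but not a single space character.
message = "/model ollama-cloud/glm-5.1\nOK"
error = None
try:
    naive_parse_model_command(message)
except ValueError as exc:
    error = str(exc)
```

With this input the sketch raises the "spaces" error even though the only whitespace after the model name is the newline, which matches what we see in the room.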

Reproduced on tag v2026.5.7 (latest stable as of 2026-05-09).

The same parser path serves both /model X (session-only) and /model X --global, so this rejection affects both forms identically. The bug is at the slash command parser layer, before the override store/resolve logic that's the subject of related Issue #22714.

Reproduction

In a Matrix room where a Hermès Matrix gateway is connected, send a single message containing exactly:

/model ollama-cloud/glm-5.1
Bonjour test, repond OK

(In Element: type /model ollama-cloud/glm-5.1, press Shift+Enter to insert a newline within the same message, type Bonjour test, repond OK, then send.)

Expected: gateway processes the slash command first (with ollama-cloud/glm-5.1 as the model argument), then handles Bonjour test, repond OK as a normal user question — replied to by the just-selected model.

Actual: gateway replies Model names cannot contain spaces (or equivalent rejection — exact wording may differ slightly between versions). The model is not switched. The question is not answered.

Why this matters independently of #22714

This issue is solvable on its own, but it's also a structural blocker for any in-band orchestration pattern Hermès might support in the future.

Issue #22714 requests a documented mechanism for users (and plugins) to pass per-message orchestration hints to a downstream dispatcher — for example via metadata or extra_body on the OpenAI payload. Whatever shape that mechanism takes, users will want to express the override and the question in the same Matrix message — exactly the multi-line pattern this parser bug forbids today.

In other words: even if #22714 ships a clean in-band channel, users will still hit this parser at the entry point of every chat-driven slash command. Fixing this issue is a prerequisite for any natural multi-line slash + payload UX going forward.

It also matters today on its own:

  • Multi-line messages are normal in chat clients: they are the obvious way to attach context, paste code, or combine intent and payload. Forbidding multi-line slash commands is a UX cliff.
  • The error message lies: it talks about spaces and never mentions newlines. Users waste time on the wrong workarounds.
  • No recovery: the question is dropped, not answered. The rejection consumes the whole message.
  • Cross-command suspicion: if /model exhibits this multi-line parsing quirk, other slash commands likely share the same handler path and may behave the same way.

What we tried locally

We wrote a local Hermès plugin (hermes-sovepro) that automates a "replay" pattern: when a transient cloud error happens, store the user's last question for 15 minutes (indexed by Matrix room). On the next /model X from the user, the plugin rewrites the incoming event.text to:

/model <new model>
<stored question>

…so the gateway should switch model and re-run the question in one shot. The plugin's pytest suite passes (8/8 cases, including TTL expiry and multi-room isolation), but live testing immediately fails: the gateway parser rejects the rewritten event with the very "Model names cannot contain spaces" error this issue describes. We currently have the plugin disabled in our runtime, waiting on a parser fix here (and on the broader in-band channel discussed in #22714).
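
For readers who want to picture the replay pattern, here is a simplified sketch of the idea, not the plugin's actual code. ReplayStore, rewrite_model_command, and the injectable clock are hypothetical names chosen for this example; only the 15-minute retention and the per-room indexing come from the description above.

```python
import time
from typing import Callable, Dict, Optional, Tuple

REPLAY_TTL_SECONDS = 15 * 60  # 15-minute retention, as described above


class ReplayStore:
    """Keeps the last failed question per Matrix room for a limited time."""

    def __init__(self, ttl: float = REPLAY_TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock  # injectable so TTL expiry can be tested
        self._pending: Dict[str, Tuple[float, str]] = {}

    def remember(self, room_id: str, question: str) -> None:
        self._pending[room_id] = (self._clock(), question)

    def pop_fresh(self, room_id: str) -> Optional[str]:
        """Return and forget the stored question, or None if absent or expired."""
        entry = self._pending.pop(room_id, None)
        if entry is None:
            return None
        stored_at, question = entry
        if self._clock() - stored_at > self._ttl:
            return None
        return question


def rewrite_model_command(store: ReplayStore, room_id: str, text: str) -> str:
    """Append the stored question to a single-line /model command."""
    if not text.startswith("/model ") or "\n" in text:
        return text  # not a bare /model command: leave the event untouched
    question = store.pop_fresh(room_id)
    if question is None:
        return text
    return f"{text.strip()}\n{question}"
```

The rewritten text is exactly the two-line shape shown above, which is why the pattern depends on the gateway accepting a newline after the model name.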

Suggested fix (no pressure to take this approach)

The simplest behavior that would unblock the use case is to split the input on the first newline before parsing the slash command argument:

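# sketch of the slash command entry point (function name is illustrative)
def split_slash_command(text):
    # only the first line belongs to the slash command; strip() also drops a trailing "\r"
    first_line, _, rest = text.partition("\n")
    command, _, args = first_line.strip().partition(" ")
    # `args` is the slash command argument; if `rest` is non-empty,
    # route it as a follow-up user message in the same context
    return command, args.strip(), rest.strip()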

This pattern would generalise cleanly to any other slash command that may currently exhibit the same quirk.
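
As a rough illustration of the routing step, the sketch below runs the command found on the first line and then forwards the remaining lines as an ordinary user message. It assumes a simple registry of handlers keyed by command name; handle_incoming, the handlers mapping, and the on_user_message callback are hypothetical and do not reflect the gateway's real internals.

```python
from typing import Callable, Dict, List

Handler = Callable[[str], str]


def handle_incoming(text: str, handlers: Dict[str, Handler],
                    on_user_message: Handler) -> List[str]:
    """Run the slash command on the first line, then route the rest."""
    stripped = text.strip()
    if not stripped.startswith("/"):
        return [on_user_message(text)]

    # splitlines() treats "\n" and "\r\n" alike (and other line boundaries).
    lines = stripped.splitlines()
    command, _, args = lines[0].partition(" ")
    follow_up = "\n".join(lines[1:]).strip()

    handler = handlers.get(command)
    if handler is None:
        return [f"Unknown command: {command}"]

    # The command runs first so the follow-up sees its effect
    # (for /model: the newly selected model).
    replies = [handler(args.strip())]
    if follow_up:
        replies.append(on_user_message(follow_up))
    return replies
```

Because the split happens before any command-specific code runs, every handler registered this way would receive only its own first-line argument.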

If for some reason multi-line slash messages should remain forbidden, an honest error such as "Slash commands must be single-line; please send /model X first, then your question separately" would at least not lie about whitespace.
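
If the single-line restriction is kept, the check could look something like the sketch below. The function name validate_single_line_command is hypothetical, and the message wording simply follows the suggestion above.

```python
def validate_single_line_command(text: str) -> None:
    """Reject multi-line slash messages with an accurate explanation."""
    first_line, newline, rest = text.partition("\n")
    # A trailing newline with nothing after it is harmless, so only
    # complain when real content follows the first line.
    if newline and rest.strip():
        raise ValueError(
            "Slash commands must be single-line; please send "
            f"{first_line.strip()} first, then your question separately"
        )
```

Called on the reproduction message, this raises an error that names the real constraint and echoes the command the user should resend on its own.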

Environment

  • hermes-agent at tag v2026.5.7.
  • Matrix gateway profile, deployed via hermes profile create + custom config.yaml.
  • Synapse 1.151.0 backend, mautrix==0.21.0.
  • Element X client (and other Matrix clients tested) — the issue is server-side, not client-side.

Related issue

  • Issue #22714: broader request for an in-band channel to drive per-message LLM orchestration in a downstream dispatcher. This issue (#22716) is independent but a prerequisite for any multi-line natural UX in any future in-band pattern.

Happy to provide gateway logs or test variants if that helps narrow down the parser path on your side.
