hermes - ✅(Solved) Fix platforms: split storage from LLM-invocation gate (group-chat 'observe but don't invoke' mode) [1 pull request, 1 participant]

GitHub stats
NousResearch/hermes-agent#15621 · Fetched 2026-04-26 05:26:07
Comments: 0 · Participants: 1 · Timeline: 8 · Reactions: 0
Timeline (top): referenced ×4, labeled ×3, cross-referenced ×1

Group-chat platforms (WhatsApp, Slack, Telegram, Discord) today have a single gate (require_mention / equivalent) that conflates two semantically distinct concerns:

  1. Should this message be persisted as conversation history?
  2. Should the LLM be invoked for this turn?

When require_mention=true, untagged messages are dropped entirely — they never reach state.db. When the bot is later @-mentioned, the agent has no context of the conversation that preceded the mention. When require_mention=false, the LLM runs on every message — expensive in tokens, wider prompt-injection surface, often unwanted because the user explicitly does NOT want the bot replying to ambient banter.

The natural mode for group chats is observe always, invoke on tag:

  • Untagged group messages → store in conversation history, agent has context for next mention
  • @-mentioned messages → store + invoke LLM as today

Neither current setting expresses this. Filing this issue to discuss the design before putting up a PR.

PR fix notes

PR #15633: fix(auxiliary): generalize unsupported-parameter detector and harden max_tokens retry

Description (problem / solution / changelog)

Summary

Generalizes the temperature-specific 400 retry that landed in PR #15621 so the same reactive strategy covers any rejected request parameter, and fixes a latent bug in the pre-existing max_tokens → max_completion_tokens retry branch.

Credit @nicholasrae (PR #15416) for the generalization pattern. His PR also proposed the temperature retry, which landed independently via #15621 + #15623.

Changes

  • agent/auxiliary_client.py:
    • New _is_unsupported_parameter_error(exc, param) — matches the same phrasings as the old temperature detector plus unrecognized parameter / invalid parameter, against any named param.
    • _is_unsupported_temperature_error is now a back-compat wrapper (existing imports/tests unchanged).
    • max_tokens retry branch now gates on max_tokens is not None (was silently assigning max_completion_tokens = None on the retry) and ALSO matches via the generic helper, so phrasings like Unknown parameter: max_tokens no longer slip through.
  • tests/agent/test_unsupported_parameter_retry.py: 18 new tests.
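The two changes listed above can be sketched roughly as follows. This is a minimal illustration, not the code from agent/auxiliary_client.py: the function names looks_like_unsupported_parameter and swap_max_tokens and the phrase list are assumptions made for the example, and the real detector may match different wording.

```python
# Hypothetical phrase list; the real detector's phrasings are not shown in the PR notes.
_REJECTION_PHRASES = (
    "unsupported parameter",
    "unknown parameter",
    "unrecognized parameter",
    "invalid parameter",
)


def looks_like_unsupported_parameter(exc: Exception, param: str) -> bool:
    """True when the error text names `param` and uses a rejection phrasing."""
    message = str(exc).lower()
    return param.lower() in message and any(p in message for p in _REJECTION_PHRASES)


def swap_max_tokens(kwargs: dict) -> dict:
    """Build retry kwargs, moving max_tokens to max_completion_tokens.

    The swap only happens when max_tokens is not None, so the retry never
    sends max_completion_tokens=None.
    """
    retry = dict(kwargs)
    value = retry.pop("max_tokens", None)
    if value is not None:
        retry["max_completion_tokens"] = value
    return retry
```

Keeping the detector parameterized by name is what would let a temperature-specific helper remain as a thin wrapper around it.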

Validation

Test file(s)                                                                          Before    After
tests/agent/test_unsupported_temperature_retry.py                                     19 pass   19 pass
tests/agent/test_unsupported_parameter_retry.py                                       n/a       18 pass
tests/run_agent/test_flush_memories_codex.py + tests/agent/test_auxiliary_client.py   86 pass   86 pass

No behavior change for the reported bug (that's fixed by #15621 + #15623 on main). This PR only hardens the surrounding retry ladder for future provider quirks and the latent None-max_tokens edge case.

Changed files

  • agent/auxiliary_client.py (modified, +40/-15)
  • tests/agent/test_unsupported_parameter_retry.py (added, +201/-0)

Proposed shape

Tri-state gate in gateway/platforms/<adapter>.py:

from enum import Enum

class HandlingMode(Enum):
    DROP = "drop"               # filtered (allowlist fail, bot's own echo, etc.)
    STORE_ONLY = "store_only"   # append to history, DON'T invoke agent loop
    PROCESS = "process"         # full path: store + invoke

# Method on the platform adapter class (WhatsApp shown here).
def _classify(self, data) -> HandlingMode:
    if not self._is_allowed(data):
        return HandlingMode.DROP
    if not self._is_group(data):
        return HandlingMode.PROCESS     # direct messages always go to the agent
    if data["chatId"] in self._whatsapp_free_response_chats():
        return HandlingMode.PROCESS
    if self._whatsapp_require_mention():
        if self._message_addresses_bot(data):
            return HandlingMode.PROCESS
        return HandlingMode.STORE_ONLY    # ← NEW: today this is DROP
    return HandlingMode.PROCESS

The dispatcher branches on the result. For STORE_ONLY:

  • Append [<sender_name>] <body> to the session's conversation_history (using the existing sender-prefix patch from #15413)
  • Persist to state.db/messages via the same SessionState write path the agent loop uses
  • Skip agent.run / LLM invocation entirely
  • No SSE event flows to clients (no reply being computed)
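A rough sketch of the dispatcher branch described above. SessionSketch is a hypothetical stand-in for the real SessionState write path, and dispatch and run_agent are illustrative names rather than the gateway's actual API. The point is only that storage happens for both STORE_ONLY and PROCESS, while the agent call happens for PROCESS alone.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional


class HandlingMode(Enum):
    DROP = "drop"
    STORE_ONLY = "store_only"
    PROCESS = "process"


@dataclass
class SessionSketch:
    """Hypothetical stand-in for the session; not the project's SessionState."""
    conversation_history: List[str] = field(default_factory=list)

    def persist(self, line: str) -> None:
        # Assumption: the real write path would also write to state.db/messages.
        self.conversation_history.append(line)


def dispatch(
    mode: HandlingMode,
    sender_name: str,
    body: str,
    session: SessionSketch,
    run_agent: Callable[[List[str]], str],
) -> Optional[str]:
    """Return the agent's reply, or None when no reply is computed."""
    if mode is HandlingMode.DROP:
        return None
    # Both remaining modes store the sender-prefixed message.
    session.persist(f"[{sender_name}] {body}")
    if mode is HandlingMode.STORE_ONLY:
        return None  # no agent run, so nothing to stream to clients
    return run_agent(session.conversation_history)
```

With this shape, a later @-mention reaches the agent with the earlier untagged messages already in its history.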

The same abstraction applies cleanly to Slack, Telegram, and Discord: they all have the same should_process shape today and would benefit identically.
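One way to make that cross-platform reuse concrete is to separate the platform-specific fact gathering from the decision itself. In the sketch below, MessageFacts and classify are hypothetical names; each adapter would fill in the five booleans from its own payload and config, and the decision logic mirrors the WhatsApp _classify proposal.

```python
from dataclasses import dataclass
from enum import Enum


class HandlingMode(Enum):
    DROP = "drop"
    STORE_ONLY = "store_only"
    PROCESS = "process"


@dataclass(frozen=True)
class MessageFacts:
    """Hypothetical per-message facts an adapter would compute."""
    allowed: bool             # passed allowlist / not the bot's own echo
    is_group: bool            # group chat rather than a direct message
    free_response_chat: bool  # chat configured for always-respond
    require_mention: bool     # platform-level require_mention setting
    addresses_bot: bool       # message @-mentions or replies to the bot


def classify(facts: MessageFacts) -> HandlingMode:
    """Platform-agnostic version of the tri-state decision."""
    if not facts.allowed:
        return HandlingMode.DROP
    if not facts.is_group or facts.free_response_chat:
        return HandlingMode.PROCESS
    if facts.require_mention and not facts.addresses_bot:
        return HandlingMode.STORE_ONLY
    return HandlingMode.PROCESS
```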

Why this matters cross-platform

Every group-chat bot deployment hits this. Discord bots, Slack bots, even IRC bots have the same problem — you want context awareness across the channel, but you don't want the model running on every message. The current binary is the wrong abstraction.

Backward compatibility

  • Existing require_mention=false behavior unchanged — every message still triggers PROCESS.
  • Existing require_mention=true behavior CHANGES — instead of dropping untagged, store them. New WHATSAPP_DROP_UNTAGGED=true (default false) preserves the old drop-entirely behavior for users who explicitly want that (cost minimization, privacy, etc.).

If maintainers prefer not to flip the default, an explicit WHATSAPP_GROUP_MODE: drop|store|process enum is also fine.
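The two configuration options could resolve to a mode for untagged group messages roughly as below. The issue presents WHATSAPP_DROP_UNTAGGED and WHATSAPP_GROUP_MODE as alternatives, so supporting both, letting the explicit enum win, and the set of accepted truthy strings are all assumptions of this sketch. The function name untagged_group_mode is also hypothetical.

```python
from typing import Mapping

_TRUTHY = {"1", "true", "yes", "on"}  # assumed truthy spellings
_MODES = {"drop", "store", "process"}


def untagged_group_mode(require_mention: bool, env: Mapping[str, str]) -> str:
    """Return 'drop', 'store', or 'process' for an untagged group message."""
    explicit = env.get("WHATSAPP_GROUP_MODE", "").strip().lower()
    if explicit:
        if explicit not in _MODES:
            raise ValueError(f"WHATSAPP_GROUP_MODE must be one of {sorted(_MODES)}")
        return explicit  # assumption: the explicit enum takes precedence
    if not require_mention:
        return "process"  # unchanged: every message triggers PROCESS
    drop = env.get("WHATSAPP_DROP_UNTAGGED", "false").strip().lower() in _TRUTHY
    return "drop" if drop else "store"  # new default stores untagged messages
```

Passing the environment in as a mapping, for example os.environ, keeps the function easy to test.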

PR coming

I'll put up a PR shortly with the implementation. Wanted to file the issue first so the design is debatable before code. The reference deployment is my own setup — group chat with two friends; today I have to choose between "agent has no context of any banter" (require_mention=true) and "agent runs on every troll message" (require_mention=false). The store-only mode resolves it cleanly.

extent analysis

TL;DR

Implement a tri-state gate in the gateway to handle messages as "drop", "store_only", or "process" to address the issue of conflating conversation history and LLM invocation.

Guidance

  • Introduce a new HandlingMode enum with "drop", "store_only", and "process" values to replace the binary drop-or-process decision that require_mention drives today.
  • Update the _classify method to return the new HandlingMode based on the message data and platform-specific rules.
  • Implement a dispatcher branch to handle "store_only" messages by appending to conversation history and persisting to state.db without invoking the LLM.
  • Consider adding a new configuration option, such as WHATSAPP_GROUP_MODE, to allow users to choose between the new behavior and the old "drop" behavior.

Example

class HandlingMode(Enum):
    DROP = "drop"
    STORE_ONLY = "store_only"
    PROCESS = "process"

def _classify(self, data) -> HandlingMode:
    # ...
    if self._whatsapp_require_mention():
        if self._message_addresses_bot(data):
            return HandlingMode.PROCESS
        return HandlingMode.STORE_ONLY
    # ...

Notes

The proposed solution assumes that the existing sender-prefix patch from #15413 can be reused to append sender information to the conversation history.

Recommendation

Apply the proposed tri-state gate to address the issue: it gives finer-grained control over message handling and separates storing conversation history from invoking the LLM.
