hermes: Stabilize messaging gateway tests and clarify Kakao-agent OSS direction

hermes, 2026-05-25 10:36:42

This issue records the current debugging pass around the messaging-gateway / Kakao-agent-adjacent open-source direction, plus what was changed locally, what failed during full-suite verification, and what the project should prioritize next.

The immediate theme: the gateway layer is powerful, but test behavior is currently too dependent on the operator's live environment and import order. To make the Kakao/Discord-style always-on agent story solid as OSS, we need stricter test isolation, clearer adapter contracts, and a product direction that treats chat-platform agents as first-class runtimes.

Summary

Local changes made in this pass

Current working tree summary at time of issue creation:

15 files changed, 248 insertions(+), 16 deletions(-)

Files touched:

agent/agent_init.py
gateway/platforms/api_server.py
gateway/platforms/discord.py
tests/agent/test_anthropic_adapter.py
tests/agent/test_auxiliary_client.py
tests/e2e/conftest.py
tests/gateway/conftest.py
tests/gateway/test_discord_allowed_mentions.py
tests/gateway/test_discord_channel_controls.py
tests/gateway/test_discord_clarify_buttons.py
tests/gateway/test_discord_roles_dm_scope.py
tests/gateway/test_discord_slash_commands.py
tests/tools/test_file_tools.py
tools/file_tools.py
tools/memory_tool.py
tests/tools/test_memory_scope.py  # new, untracked locally

Discord/gateway-specific fixes completed

Discord mock completeness
- Added missing fake Discord SDK pieces used by production/tests:
  - Forbidden
  - MessageType
  - Object
  - AllowedMentions
- This reduced failures caused by partial mocks diverging from production expectations.
Discord import-order stabilization
- Full-suite collection could import gateway.platforms.discord first through e2e tests using a smaller Discord mock.
- Later gateway tests expected the richer mock, but the production module had already captured the smaller mock.
- Added rebinding/redefinition logic in gateway test conftest so gateway.platforms.discord.discord is rebound to the comprehensive fake module and Discord view classes are rebuilt when needed.
Clarify button tests stabilized
- tests/gateway/test_discord_clarify_buttons.py now forces the comprehensive mock before importing/binding ClarifyChoiceView.
- Root cause was import-order-dependent UI class inheritance: the view could accidentally subclass a MagicMock-based fake instead of a deterministic fake View.
E2E Discord env isolation
- Added autouse fixture in tests/e2e/conftest.py to clear live routing env vars:
  - DISCORD_ALLOWED_CHANNELS
  - DISCORD_IGNORED_CHANNELS
  - DISCORD_FREE_RESPONSE_CHANNELS
  - DISCORD_NO_THREAD_CHANNELS
  - DISCORD_REQUIRE_MENTION
  - DISCORD_AUTO_THREAD
- This prevents tests from inheriting the developer/operator's live gateway config.
DM fake channel class alignment
- make_fake_dm_channel() now uses the current production module's discord.DMChannel reference instead of a stale locally imported fake class.
- This avoids isinstance(..., discord.DMChannel) drift after mock rebinding.
Free-response channel semantics aligned with docs
- Current docs state that free-response channels answer inline and skip auto-threading.
- Production logic was clarified so skip_thread = is_free_channel or no_thread_channels.
- Added/updated regression test: free-response channels bypass mention checks and keep replies inline.
Role/DM authorization tests isolated from channel env
- tests/gateway/test_discord_roles_dm_scope.py was testing guild-scoped role authorization, but live DISCORD_ALLOWED_CHANNELS caused slash authorization to fail earlier at the channel gate.
- Added autouse fixture clearing DISCORD_ALLOWED_CHANNELS and DISCORD_IGNORED_CHANNELS for that file.

Verification run history

Passed targeted checks

tests/gateway/test_discord_allowed_mentions.py: 19 passed
tests/gateway/test_discord_clarify_buttons.py: 14 passed

Import-order reproduction group:

tests/e2e/test_discord_adapter.py
tests/gateway/test_discord_allowed_mentions.py
tests/gateway/test_discord_clarify_buttons.py
=> 39 passed

Free-response/channel controls after fix:

tests/gateway/test_discord_free_response.py::test_discord_free_response_channel_skips_auto_thread
tests/gateway/test_discord_channel_controls.py
=> 16 passed

Discord order/free-response/channel group:

tests/e2e/test_discord_adapter.py
tests/gateway/test_discord_allowed_mentions.py
tests/gateway/test_discord_clarify_buttons.py
tests/gateway/test_discord_free_response.py
tests/gateway/test_discord_channel_controls.py
=> 88 passed

Role/slash/channel authorization group:

tests/gateway/test_discord_roles_dm_scope.py
tests/gateway/test_discord_slash_auth.py
tests/gateway/test_discord_channel_controls.py
=> 59 passed

Full-suite failures encountered and root causes

1. Clarify buttons failed only in full-suite order

Observed:

FAILED tests/gateway/test_discord_clarify_buttons.py

Root cause:

e2e tests installed/imported a smaller Discord mock earlier in collection.
gateway.platforms.discord captured that mock.
Gateway clarify tests expected the richer fake UI implementation.
Result: view children behaved like MagicMock/empty children instead of deterministic fake buttons.

Fix:

Rebind Discord mock and rebuild Discord view classes before gateway clarify tests bind ClarifyChoiceView.
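
The capture-then-rebind problem can be reproduced without Discord at all. The following is a minimal sketch, not the project's conftest code: `fake_discord_sdk`, `SmallView`, `RichView`, `build_production_module`, and `rebind` are hypothetical names used only to show why swapping a fake in `sys.modules` does not update a reference (or a subclass) that a module already created at import time.

```python
import sys
import types


class SmallView:
    """Stand-in for the smaller e2e fake: no child bookkeeping."""


class RichView:
    """Stand-in for the comprehensive fake: tracks children deterministically."""

    def __init__(self):
        self.children = []

    def add_item(self, item):
        self.children.append(item)


def make_fake_sdk(view_cls):
    sdk = types.ModuleType("fake_discord_sdk")  # hypothetical module name
    sdk.View = view_cls
    return sdk


def build_production_module(sdk):
    """Mimic a module that captures the SDK and defines a view class at import time."""
    prod = types.ModuleType("fake_production")
    prod.discord = sdk
    prod.ClarifyChoiceView = type("ClarifyChoiceView", (sdk.View,), {})
    return prod


def rebind(prod, sdk):
    """Point the module at the new SDK and rebuild the classes derived from it."""
    prod.discord = sdk
    prod.ClarifyChoiceView = type("ClarifyChoiceView", (sdk.View,), {})


small = make_fake_sdk(SmallView)
rich = make_fake_sdk(RichView)

# The e2e suite installs the small fake first; "production" captures it.
sys.modules["fake_discord_sdk"] = small
prod = build_production_module(sys.modules["fake_discord_sdk"])

# Installing the richer fake later does not touch the captured reference.
sys.modules["fake_discord_sdk"] = rich
stale = prod.discord is small
stale_base = issubclass(prod.ClarifyChoiceView, SmallView)

# Explicit rebinding fixes both the module reference and the subclass.
rebind(prod, rich)
fresh = issubclass(prod.ClarifyChoiceView, RichView)

sys.modules.pop("fake_discord_sdk", None)  # leave sys.modules clean
```

Note that rebinding the attribute alone would not be enough here: the view class has to be rebuilt as well, because its base class was fixed when it was first defined.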

2. Free-response channel skipped-auto-thread expectation failed

Observed full-suite result:

1 failed, 5685 passed, 10 skipped, 170 warnings
FAILED tests/gateway/test_discord_free_response.py::test_discord_free_response_channel_skips_auto_thread

Root cause:

Docs/new regression test expected free-response channels to respond inline.
Production comment/logic and one older channel-controls test still allowed auto-threading for free-response channels.

Fix:

Align production behavior with docs: free-response channels skip auto-thread.
Update channel-controls test expectation to group/inline behavior.

3. Slash role authorization positive control failed

Observed full-suite result:

1 failed, 5759 passed, 10 skipped, 171 warnings
FAILED tests/gateway/test_discord_roles_dm_scope.py::test_slash_authorization_allows_in_scope_guild_role

Root cause:

The test intended to validate in-scope guild role authorization.
The local live environment had DISCORD_ALLOWED_CHANNELS set.
Slash authorization evaluates channel scope before user/role scope, so the test failed with:

(False, 'channel not in DISCORD_ALLOWED_CHANNELS')

Fix:

Clear channel gating env vars for role/DM scope tests.
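
The ordering issue is easy to see in isolation. This is a simplified sketch of a two-gate check, assuming an empty channel allowlist means "no restriction"; `authorize_slash`, its parameters, and the role-denial message are illustrative and not the adapter's real signature. Only the channel-denial tuple is taken from the observed failure.

```python
def authorize_slash(channel_id, user_role_ids, allowed_channels, allowed_role_ids):
    # Gate 1: channel scope. An empty allowlist is assumed to mean "no restriction".
    if allowed_channels and channel_id not in allowed_channels:
        return (False, "channel not in DISCORD_ALLOWED_CHANNELS")
    # Gate 2: role scope. This denial message is illustrative only.
    if not set(user_role_ids) & set(allowed_role_ids):
        return (False, "role not allowed")
    return (True, "ok")


# A live allowlist leaking into the test short-circuits before roles are checked.
leaked = authorize_slash("chan-1", {"role-admin"}, {"chan-live"}, {"role-admin"})

# With the channel gate cleared, the role check is what the test actually exercises.
isolated = authorize_slash("chan-1", {"role-admin"}, set(), {"role-admin"})
```

With the leaked allowlist the in-scope role is never evaluated, which is why a positive-control test for roles fails with a channel message.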

4. Final full-suite run was intentionally stopped

The user asked to stop repeating full-suite attempts.

Observed stopped run:

process exit code: 143
progress reached ~13%

This was an intentional kill, not a newly diagnosed test failure.
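
Exit code 143 is consistent with that reading: shells conventionally report 128 + N when a process is terminated by signal N, and 143 - 128 = 15, which is SIGTERM. A small helper (hypothetical name `describe_exit_code`) makes the arithmetic explicit, assuming the code was reported by a shell that follows this convention:

```python
import signal


def describe_exit_code(code):
    """Return the signal name for shell-style 128+N exit codes, else None."""
    if code > 128:
        try:
            return signal.Signals(code - 128).name
        except ValueError:
            return None  # 128+N where N is not a known signal on this platform
    return None


killed_by = describe_exit_code(143)
```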

Current risks / unresolved items

Full suite has not completed green after the latest fixes
- Targeted groups pass.
- Last full run was stopped by request at ~13%.
Tests still depend on live operator environment unless explicitly isolated
- The Discord issues likely point to a broader pattern.
- Gateway tests should not inherit live DISCORD_*, TELEGRAM_*, KAKAO_*, SLACK_*, or profile runtime env unless the test explicitly opts in.
Discord mock strategy is duplicated and order-sensitive
- e2e and gateway suites each define Discord fakes.
- The production module captures module-level SDK references, making mock replacement fragile.
Messaging platform semantics are not fully centralized
- Free-response, auto-thread, no-thread, allowed/ignored channel rules are scattered across docs, tests, and adapter code.
- The docs were the clearest source of truth in this pass, but tests/code had drift.
Kakao-agent OSS story is not yet carved out as a clean platform-runtime product
- Current Hermes gateway supports many platforms, but Kakao-style always-on personal/relationship agent automation needs explicit packaging, contracts, and safety boundaries.

Recommended direction for Kakao-agent / messaging-agent OSS

1. Treat chat-platform agents as first-class runtimes

Instead of positioning Kakao/Discord/Telegram as thin notification channels, define them as runtime surfaces with consistent contracts:

- identity model: user, room, thread/topic, guild/server
- session scoping model: per user, per room, per thread, shared/private
- reply model: inline, thread, DM fallback, media attachments
- permission model: allow users, roles, rooms, admin-only commands
- lifecycle model: startup, reconnect, dedup, replay, graceful shutdown

2. Split platform adapter logic from policy logic

Current adapter logic mixes SDK event parsing, authorization, routing, threading, and product UX.

Recommended extraction:

gateway/policy/channel_scope.py
gateway/policy/authorization.py
gateway/policy/threading.py
gateway/policy/session_scope.py

Then adapters become mostly translators:

Platform SDK event -> normalized MessageEvent/Interaction -> policy engine -> send action

This would make Kakao, Discord, Telegram, and Slack behavior easier to test consistently.
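
As a rough illustration of that pipeline, the sketch below keeps the adapter to a pure translation step and puts the decision in a platform-agnostic function. Everything here is an assumption for illustration: the `MessageEvent` and `SendAction` fields, the payload keys in `normalize_discord`, and the echo reply are not taken from the Hermes codebase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MessageEvent:
    """Normalized inbound message (hypothetical field set)."""
    platform: str
    user_id: str
    room_id: str
    thread_id: Optional[str]
    text: str


@dataclass(frozen=True)
class SendAction:
    """Normalized outbound action (hypothetical field set)."""
    room_id: str
    thread_id: Optional[str]
    text: str


def normalize_discord(payload: dict) -> MessageEvent:
    # Adapter step: translate an SDK-shaped payload; the keys are illustrative.
    return MessageEvent(
        platform="discord",
        user_id=str(payload["author_id"]),
        room_id=str(payload["channel_id"]),
        thread_id=payload.get("thread_id"),
        text=payload["content"],
    )


def apply_policy(event: MessageEvent, allowed_users: set) -> Optional[SendAction]:
    # Policy step: knows nothing about the platform SDK.
    if event.user_id not in allowed_users:
        return None
    return SendAction(event.room_id, event.thread_id, f"echo: {event.text}")


event = normalize_discord({"author_id": 42, "channel_id": 7, "content": "hi"})
action = apply_policy(event, allowed_users={"42"})
denied = apply_policy(event, allowed_users={"99"})
```

Because `apply_policy` only sees normalized events, the same unit tests could cover every platform once each adapter has its own `normalize_*` function.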

3. Centralize fake platform SDKs for tests

Create shared test fakes instead of per-suite mocks:

tests/fakes/discord_sdk.py
tests/fakes/kakao_sdk.py
tests/fakes/platform_events.py

Requirements:

- deterministic classes for DMChannel, Thread, MessageType, UI View/Button, etc.
- no MagicMock as a base class for important SDK types
- explicit support for import-order rebinding or, better, dependency injection
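
The "no MagicMock" requirement matters because a MagicMock silently accepts any call and reports empty containers, which matches the "empty children" symptom seen in the clarify-button tests. A minimal comparison, where `FakeView` and `FakeButton` are hypothetical deterministic fakes rather than the project's actual classes:

```python
from unittest.mock import MagicMock


class FakeButton:
    def __init__(self, label):
        self.label = label


class FakeView:
    """Deterministic fake: add_item really stores the item."""

    def __init__(self):
        self.children = []

    def add_item(self, item):
        self.children.append(item)
        return self


# MagicMock accepts add_item() but records nothing in .children;
# iterating a MagicMock yields nothing by default.
mock_view = MagicMock()
mock_view.add_item(FakeButton("Yes"))
mock_children = list(mock_view.children)

# The deterministic fake behaves like a real container.
fake_view = FakeView()
fake_view.add_item(FakeButton("Yes")).add_item(FakeButton("No"))
fake_labels = [button.label for button in fake_view.children]
```

A test asserting on button labels passes vacuously or fails confusingly against the mock, while the deterministic fake gives a result that depends only on what the code under test did.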

4. Add global test env isolation

A root pytest fixture should clear platform runtime env by default:

DISCORD_*
TELEGRAM_*
SLACK_*
KAKAO_*
HERMES_GATEWAY_*
HERMES_SESSION_*

Tests that need env should opt in with local monkeypatch.setenv(...).

This is especially important because Hermes agents are often tested on the same machine that is running real gateways.
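
One way to sketch the selection logic for such a fixture, using only the prefixes listed above. The helper names are hypothetical; in a root `tests/conftest.py` an autouse fixture could loop over `platform_env_keys(os.environ)` and call `monkeypatch.delenv(key, raising=False)` for each key, so that pytest restores the values after the test.

```python
PLATFORM_ENV_PREFIXES = (
    "DISCORD_",
    "TELEGRAM_",
    "SLACK_",
    "KAKAO_",
    "HERMES_GATEWAY_",
    "HERMES_SESSION_",
)


def platform_env_keys(environ):
    """Return the keys a root fixture would clear, in stable order."""
    return sorted(key for key in environ if key.startswith(PLATFORM_ENV_PREFIXES))


def scrubbed_env(environ):
    """Return a copy of environ without any platform runtime variables."""
    return {
        key: value
        for key, value in environ.items()
        if not key.startswith(PLATFORM_ENV_PREFIXES)
    }


# Example with a fake environment instead of os.environ.
live_env = {
    "PATH": "/usr/bin",
    "DISCORD_ALLOWED_CHANNELS": "123",
    "HERMES_SESSION_SCOPE": "room",
    "HERMES_HOME": "/opt/hermes",  # not a gateway/session var, so it is kept
}
to_clear = platform_env_keys(live_env)
clean_env = scrubbed_env(live_env)
```

Matching on prefixes rather than an explicit list of names means newly added `DISCORD_*` settings are isolated by default, at the cost of also clearing any unrelated variable that happens to share a prefix.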

5. Define product-level modes for Kakao-agent

Kakao-agent OSS should probably expose clear modes rather than asking users to infer behavior from config flags:

- Personal assistant mode: only owner-authorized chats, conservative responses, memory enabled.
- Family/group helper mode: room-scoped behavior, stricter safety/consent, no surprise proactive messages.
- Business support mode: CRM-ish memory, templates, audit logging.
- Automation bridge mode: command execution gated, no conversational memory by default.

Each mode should document:

- what messages are read
- what is stored
- who can trigger actions
- what external side effects require confirmation
- how to pause/disable quickly

6. Make safety and privacy a feature, not an afterthought

For open-source adoption, Kakao-agent needs a strong trust story:

- room allowlist and denylist
- per-room memory on/off
- per-contact consent / visibility rules
- redaction of secrets and phone numbers in logs
- dry-run mode for sends/actions
- admin audit feed
- easy kill switch
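
To make the log-redaction item concrete, here is a deliberately narrow sketch. Both patterns are assumptions: `SECRET_RE` only catches `name=value` or `name: value` pairs whose name contains a few common keywords, and `PHONE_RE` only targets Korean mobile numbers of the form 01X-XXXX-XXXX. A real deployment would need a broader and audited rule set.

```python
import re

# Assumed patterns, for illustration only.
SECRET_RE = re.compile(r"(?i)(token|secret|api[_-]?key|password)(\s*[=:]\s*)\S+")
PHONE_RE = re.compile(r"\b01[016789]-?\d{3,4}-?\d{4}\b")


def redact(line):
    """Mask secret values and mobile numbers before a line reaches the log."""
    line = SECRET_RE.sub(r"\1\2[REDACTED]", line)
    return PHONE_RE.sub("[REDACTED_PHONE]", line)


redacted = redact("KAKAO_API_KEY=abc call 010-1234-5678 now")
untouched = redact("gateway started")
```

Applying the redaction at the logging boundary, for example in a `logging.Filter`, keeps individual adapters from having to remember it.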

7. Add a small conformance suite for every platform adapter

Each platform adapter should pass the same behavioral tests:

- ignores unauthorized user
- allows authorized user
- respects room allowlist/denylist
- handles DM vs group vs thread/session scope
- deduplicates replayed messages
- refuses unsafe commands without approval
- delivers media attachments consistently

This would prevent Kakao-specific fixes from regressing Discord/Telegram and vice versa.
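
A conformance suite of this kind could start as one function that takes an adapter factory and returns named pass/fail results. The sketch below covers only three of the behaviors listed, and it assumes a hypothetical adapter contract: `handle(event)` takes a dict with `message_id`, `user_id`, and `text`, and returns a reply string or `None`. `InMemoryAdapter` is a toy implementation used to exercise the checks.

```python
class InMemoryAdapter:
    """Toy adapter implementing the assumed handle(event) contract."""

    def __init__(self, allowed_users):
        self.allowed_users = set(allowed_users)
        self._seen_message_ids = set()

    def handle(self, event):
        if event["user_id"] not in self.allowed_users:
            return None  # unauthorized users are ignored
        if event["message_id"] in self._seen_message_ids:
            return None  # replayed message, already handled
        self._seen_message_ids.add(event["message_id"])
        return f"reply to {event['text']}"


def check_conformance(make_adapter):
    """Run the shared behavioral checks against any adapter factory."""
    adapter = make_adapter(allowed_users={"owner"})
    stranger_msg = {"message_id": "m1", "user_id": "stranger", "text": "hi"}
    owner_msg = {"message_id": "m2", "user_id": "owner", "text": "hi"}
    return {
        "ignores unauthorized user": adapter.handle(stranger_msg) is None,
        "allows authorized user": adapter.handle(owner_msg) is not None,
        "deduplicates replayed messages": adapter.handle(owner_msg) is None,
    }


report = check_conformance(InMemoryAdapter)
```

Under pytest, the same checks could be parametrized over one factory per platform so that every adapter is held to an identical list.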

8. Document the canonical UX decisions

The free-response/auto-thread mismatch came from docs/tests/code drift.

Recommendation:

- Keep a short canonical behavior table under docs.
- Tests should reference those product rules.
- Adapter comments should link to the docs or a policy module.

Example:

free-response channel => mention bypass + inline reply + no auto-thread
normal mentioned channel => auto-thread if enabled
thread => same thread, mention policy depends on thread_require_mention
DM => always direct, role auth only with explicit trusted guild opt-in
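
The same table can be expressed as two small pure functions that tests and adapters both call, which is one way to keep docs, tests, and code from drifting. This is a sketch under stated assumptions: the `Surface` enum and function names are hypothetical, a normal channel with auto-thread disabled is assumed to reply inline, and DMs are assumed not to require a mention. Role authorization for DMs is out of scope here.

```python
from enum import Enum


class Surface(Enum):
    FREE_RESPONSE_CHANNEL = "free_response_channel"
    CHANNEL = "channel"
    THREAD = "thread"
    DM = "dm"


def reply_mode(surface, auto_thread_enabled):
    """Where the reply goes, following the behavior table."""
    if surface is Surface.DM:
        return "direct"
    if surface is Surface.THREAD:
        return "same_thread"
    if surface is Surface.FREE_RESPONSE_CHANNEL:
        return "inline"  # free-response channels never auto-thread
    # Assumption: with auto-thread disabled, a normal channel replies inline.
    return "new_thread" if auto_thread_enabled else "inline"


def mention_required(surface, require_mention, thread_require_mention):
    """Whether the bot must be mentioned before it responds."""
    if surface is Surface.FREE_RESPONSE_CHANNEL:
        return False  # mention bypass
    if surface is Surface.THREAD:
        return thread_require_mention
    if surface is Surface.DM:
        return False  # assumption: DMs are always addressed to the bot
    return require_mention


free_mode = reply_mode(Surface.FREE_RESPONSE_CHANNEL, auto_thread_enabled=True)
normal_mode = reply_mode(Surface.CHANNEL, auto_thread_enabled=True)
```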

Proposed next engineering steps

1. Add a root tests/conftest.py env-isolation fixture for all platform/gateway env vars.
2. Move the Discord fake SDK into a shared, reusable fake module.
3. Audit all gateway tests for live env assumptions.
4. Extract authorization/channel/thread policy helpers and write pure unit tests for them.
5. Add a Kakao-agent conformance test skeleton using normalized gateway events.
6. Re-run the full suite in a clean, CI-like environment.
7. Only then split this working tree into focused PRs:
   - test isolation / fake SDK cleanup
   - Discord free-response semantics
   - memory/file/API changes currently also present in the diff
   - Kakao-agent OSS runtime direction docs

Notes

This issue intentionally records both technical fixes and product direction because the failures are symptoms of the same underlying problem: messaging agents need deterministic runtime policy, isolated tests, and clear platform UX contracts before they can be reliable as open-source infrastructure.
