hermes - 💡(How to fix) Fix refactor(agent/cli/gateway): decompose runtime god objects and unify command architecture

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Latest audit was run against origin/main at commit 3034eee38ec516109566c00975be4d0276747c34 using a detached worktree, so findings are not polluted by local uncommitted changes in the primary checkout.

The current codebase has a small set of high-leverage structural smells that show up repeatedly across the agent, gateway, and CLI layers:

  1. runtime god objects
  2. giant transaction-script methods on hot paths
  3. split-brain CLI entry architecture
  4. core runtime depending directly on CLI package code
  5. command-definition drift despite an intended central registry

These are not just style issues. They increase regression risk, slow down feature work, make targeted tests harder to write, and encourage more bug-fix patches to land in already overloaded files.

Error Message

  • Each class owns orchestration, policy, config access, I/O, formatting, error handling, and lifecycle concerns simultaneously.

Root Cause

  • The hottest execution paths live inside very large methods/classes, so even small changes carry high blast radius.
  • Architectural boundaries are blurry, especially between core runtime and CLI code, which makes reuse and testing harder.
  • The repo already contains evidence of drift and local workaround patterns rather than stable seams.
  • Several open issues appear as symptom-fixes around these hotspots; reducing structural complexity should lower future bug volume.

Fix Action

Fix / Workaround

These are not just style issues. They increase regression risk, slow down feature work, make targeted tests harder to write, and encourage more bug-fix patches to land in already overloaded files.

  • The hottest execution paths live inside very large methods/classes, so even small changes carry high blast radius.
  • Architectural boundaries are blurry, especially between core runtime and CLI code, which makes reuse and testing harder.
  • The repo already contains evidence of drift and local workaround patterns rather than stable seams.
  • Several open issues appear as symptom-fixes around these hotspots; reducing structural complexity should lower future bug volume.

Evidence

  • AIAgent.run_conversation mixes history hydration, runtime setup, memory behavior, tool execution, fallback/retry policy, streaming, persistence, and output assembly in one method.
  • GatewayRunner._run_agent mixes config resolution, proxy/runtime selection, callback wiring, delivery behavior, queue/progress management, and cleanup policy in one method.
  • HermesCLI.process_command mixes alias resolution, UX rendering, confirmation flow, session/title handling, model switching, and command dispatch in one method.

Code Example

flowchart TD
    A[User entrypoints] --> B[AIAgent]
    A --> C[GatewayRunner]
    A --> D[HermesCLI]

    B --> B1[run_conversation 3935 LOC]
    B --> B2[183 methods]
    C --> C1[_run_agent 2051 LOC]
    C --> C2[203 methods]
    D --> D1[run 2305 LOC]
    D --> D2[175 methods]

    B --> E[CLI package helpers]
    C --> E
    D --> F[commands and parser wiring]
    G[hermes_cli main] --> D
    G --> F
    F --> H[Manual sync points]

    B1 --> I[High blast radius]
    C1 --> I
    D1 --> I
    E --> J[Layering leak]
    H --> K[Drift risk]

---

flowchart LR
    AIA[AIAgent facade] --> CE[ConversationEngine]
    AIA --> TEC[ToolExecutionCoordinator]
    AIA --> TH[TurnHistoryLoader]
    AIA --> PR[ProviderRuntime]

    GR[GatewayRunner facade] --> GMP[GatewayMessagePipeline]
    GR --> AR[AgentRunCoordinator]
    GR --> DD[DeliveryDispatcher]

    CLI[HermesCLI facade] --> CD[CLICommandDispatcher]
    CLI --> UX[CLIInteractionFlow]
    CLI --> PS[ParserSpec from registry]

    Core[core runtime modules] --> Shared[shared config env timeout helpers]
    CLI --> Shared
    GR --> Shared
    AIA --> Shared
RAW_BUFFERClick to expand / collapse

Summary

Latest audit was run against origin/main at commit 3034eee38ec516109566c00975be4d0276747c34 using a detached worktree, so findings are not polluted by local uncommitted changes in the primary checkout.

The current codebase has a small set of high-leverage structural smells that show up repeatedly across the agent, gateway, and CLI layers:

  1. runtime god objects
  2. giant transaction-script methods on hot paths
  3. split-brain CLI entry architecture
  4. core runtime depending directly on CLI package code
  5. command-definition drift despite an intended central registry

These are not just style issues. They increase regression risk, slow down feature work, make targeted tests harder to write, and encourage more bug-fix patches to land in already overloaded files.

Why this matters

  • The hottest execution paths live inside very large methods/classes, so even small changes carry high blast radius.
  • Architectural boundaries are blurry, especially between core runtime and CLI code, which makes reuse and testing harder.
  • The repo already contains evidence of drift and local workaround patterns rather than stable seams.
  • Several open issues appear as symptom-fixes around these hotspots; reducing structural complexity should lower future bug volume.

Audit snapshot

  • Audit target: origin/main
  • Commit audited: 3034eee38ec516109566c00975be4d0276747c34
  • Detached worktree used for isolation
  • Repo scan sample:
    • run_agent.py: 16,408 LOC
    • gateway/run.py: 17,073 LOC
    • cli.py: 14,166 LOC
    • hermes_cli/main.py: 12,408 LOC
  • Large runtime classes:
    • AIAgent: 183 methods
    • GatewayRunner: 203 methods
    • HermesCLI: 175 methods

Organized smell map

1. God objects in core runtime entrypoints

Evidence

  • run_agent.py:1113class AIAgent
    • run_agent.py:1136-2527AIAgent.__init__ is ~1,392 LOC
    • run_agent.py:12094-16028AIAgent.run_conversation is ~3,935 LOC
  • gateway/run.py:1175class GatewayRunner
    • gateway/run.py:14519-16569GatewayRunner._run_agent is ~2,051 LOC
    • gateway/run.py:5777-6867GatewayRunner._handle_message is ~1,091 LOC
    • gateway/run.py:7121-8127GatewayRunner._handle_message_with_agent is ~1,007 LOC
  • cli.py:2502class HermesCLI
    • cli.py:11539-13843HermesCLI.run is ~2,305 LOC
    • cli.py:7634-8084HermesCLI.process_command is ~451 LOC

Why this is a smell

  • Each class owns orchestration, policy, config access, I/O, formatting, error handling, and lifecycle concerns simultaneously.
  • There are too few stable seams for isolated tests.
  • Changes to one concern are likely to touch unrelated behavior in the same file/class.

2. Hot paths are giant transaction scripts

Evidence

  • AIAgent.run_conversation mixes history hydration, runtime setup, memory behavior, tool execution, fallback/retry policy, streaming, persistence, and output assembly in one method.
  • GatewayRunner._run_agent mixes config resolution, proxy/runtime selection, callback wiring, delivery behavior, queue/progress management, and cleanup policy in one method.
  • HermesCLI.process_command mixes alias resolution, UX rendering, confirmation flow, session/title handling, model switching, and command dispatch in one method.

Why this is a smell

  • These methods are difficult to reason about, hard to test in pieces, and encourage patch-on-patch growth.
  • Bug fixes tend to land inside the same overloaded functions instead of at dedicated abstractions.

3. CLI ownership is split across two oversized roots

Evidence

  • hermes_cli/main.py:1439 imports from cli import main as cli_main
  • hermes_cli/main.py also maintains its own huge main() at 9699-12404 (~2,706 LOC)
  • cli.py maintains its own large orchestration loop and command dispatcher
  • hermes_cli/main.py:9596-9599 explicitly says _BUILTIN_SUBCOMMANDS must be kept in sync with subparsers.add_parser(...)
  • hermes_cli/main.py contains 161 add_parser(...) calls

Why this is a smell

  • There is no single obvious composition root for CLI behavior.
  • Parser definitions, command metadata, and runtime dispatch are spread across multiple giant modules.
  • This invites drift and makes future command additions harder to keep consistent.

4. Core runtime depends directly on CLI package code

Evidence

  • run_agent.py:105-106 imports hermes_cli.env_loader and hermes_cli.timeouts
  • run_agent.py:193 imports from hermes_cli.config import cfg_get
  • gateway/run.py:55 imports from hermes_cli.config import cfg_get
  • gateway/run.py:80 imports from hermes_cli.commands import _sanitize_telegram_name
  • gateway/run.py:386 imports from hermes_cli.env_loader import load_hermes_dotenv

Why this is a smell

  • The layering is backwards: core runtime and gateway code should not need to depend on CLI package internals.
  • Shared config/runtime helpers likely belong in a neutral runtime/core module.
  • This coupling makes import behavior and dependency direction harder to maintain cleanly.

5. Registry intent exists, but command plumbing still duplicates knowledge

Evidence

  • hermes_cli/commands.py:61-64 declares COMMAND_REGISTRY as the central single source of truth
  • hermes_cli/main.py:9596-9599 still documents a second list that must be manually kept in sync
  • cli.py:7634-8084 still uses a large hand-maintained conditional dispatcher path

Why this is a smell

  • Command definitions are not truly centralized in practice.
  • Parser creation, routing, and metadata are still partially duplicated.
  • This creates maintenance drag and increases risk of inconsistent behavior across CLI/gateway/help/autocomplete surfaces.

Mermaid: current structural problem map

flowchart TD
    A[User entrypoints] --> B[AIAgent]
    A --> C[GatewayRunner]
    A --> D[HermesCLI]

    B --> B1[run_conversation 3935 LOC]
    B --> B2[183 methods]
    C --> C1[_run_agent 2051 LOC]
    C --> C2[203 methods]
    D --> D1[run 2305 LOC]
    D --> D2[175 methods]

    B --> E[CLI package helpers]
    C --> E
    D --> F[commands and parser wiring]
    G[hermes_cli main] --> D
    G --> F
    F --> H[Manual sync points]

    B1 --> I[High blast radius]
    C1 --> I
    D1 --> I
    E --> J[Layering leak]
    H --> K[Drift risk]

Mermaid: target decomposition direction

flowchart LR
    AIA[AIAgent facade] --> CE[ConversationEngine]
    AIA --> TEC[ToolExecutionCoordinator]
    AIA --> TH[TurnHistoryLoader]
    AIA --> PR[ProviderRuntime]

    GR[GatewayRunner facade] --> GMP[GatewayMessagePipeline]
    GR --> AR[AgentRunCoordinator]
    GR --> DD[DeliveryDispatcher]

    CLI[HermesCLI facade] --> CD[CLICommandDispatcher]
    CLI --> UX[CLIInteractionFlow]
    CLI --> PS[ParserSpec from registry]

    Core[core runtime modules] --> Shared[shared config env timeout helpers]
    CLI --> Shared
    GR --> Shared
    AIA --> Shared

Scope

In scope

  • Define a staged refactor plan for the runtime monolith hotspots
  • Extract the first set of focused collaborators from the largest execution paths
  • Move shared config/env/runtime helpers out of hermes_cli into a neutral module
  • Make command metadata the actual source of truth for parser + dispatch wiring
  • Add tests around extracted seams to preserve behavior during decomposition

Out of scope

  • Behavior changes to user-facing agent features unless required to preserve correctness during extraction
  • Platform-specific cleanup projects that already have dedicated issues
  • Replacing all print() calls in the repo as a standalone effort
  • Provider-specific bug fixes not directly caused by the structural refactor work

Proposed execution plan

Phase 1 — establish seams and inventory

  • Identify the smallest stable boundaries inside:
    • AIAgent.run_conversation
    • GatewayRunner._run_agent
    • HermesCLI.run
  • Introduce thin helper objects/modules without behavior changes first
  • Add characterization tests around current behavior before moving logic

Phase 2 — extract core agent orchestration

  • Keep AIAgent as façade/composition root
  • Extract at minimum:
    • turn setup / runtime context builder
    • conversation loop engine
    • tool execution coordination
    • post-turn persistence / cleanup

Phase 3 — fix layering direction

  • Move shared config/env/timeout/runtime helpers out of hermes_cli
  • Replace core imports of hermes_cli.* with neutral shared modules
  • Keep CLI-specific UI/rendering logic in CLI-only packages

Phase 4 — unify command source of truth

  • Generate parser specs and dispatch tables from COMMAND_REGISTRY
  • Remove or minimize manual sync lists such as _BUILTIN_SUBCOMMANDS
  • Reduce if/elif dispatch footprint in cli.py

Phase 5 — shrink façade classes

  • Reduce public façade classes to orchestration only
  • Push domain logic into extracted modules with focused tests
  • Track post-refactor metrics for method counts and largest function sizes

Acceptance criteria

  • AIAgent, GatewayRunner, and HermesCLI each lose meaningful responsibility through extracted modules, not just code movement into equally giant helpers
  • AIAgent.run_conversation, GatewayRunner._run_agent, and HermesCLI.run are materially smaller and easier to test in isolation
  • Core runtime no longer imports CLI-only modules for shared config/env/runtime behavior
  • Command metadata is used as the real source of truth for parser/dispatch wiring, with reduced manual sync burden
  • Characterization/regression tests cover the extracted seams so behavior remains stable
  • This issue can be split into follow-up implementation issues/PRs with clear boundaries

Risks and rollback

  • Risk: accidental behavior drift during extraction
    • Mitigation: characterization tests before structural moves; extract behind stable façades
  • Risk: moving shared helpers breaks imports in scripts/plugins/tests
    • Mitigation: add compatibility shims during transition, then remove in follow-up
  • Risk: refactor becomes cosmetic file-shuffling without true ownership cleanup
    • Mitigation: require measurable shrink targets for the largest methods/classes

Related / not duplicated by

  • #24688 — print logging cleanup in run_agent.py
  • #24325 — Discord adapter plugin migration
  • #13473 — provider transport refactor

This issue is broader and focused on the runtime architecture and decomposition seams that cut across agent, gateway, and CLI layers.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

hermes - 💡(How to fix) Fix refactor(agent/cli/gateway): decompose runtime god objects and unify command architecture