hermes - 💡(How to fix) Fix [Feature]: Expose model_switch as an agent-callable tool for autonomous task-complexity-based routing [1 participants]

GitHub stats
NousResearch/hermes-agent#16525 · Fetched 2026-04-28 06:52:47
Comments: 0 · Participants: 1 · Timeline events: 5 (labeled ×5) · Reactions: 0


[Feature]: Expose model_switch as an agent-callable tool for autonomous task-complexity-based routing

Problem / Use Case

Hermes currently has three model-routing mechanisms, none of which solves autonomous self-routing by the agent:

  1. /model <slug> slash command — handled by gateway/run.py:5794:_handle_model_command. Triggered only when a user message starts with /model. The agent cannot trigger it (writing /model gpt-5.5 in its response is just text — the gateway does not re-inject bot messages as user commands).
  2. smart_model_routing config schema — keys exist in DEFAULT_CONFIG (max_simple_chars, max_simple_words, cheap_model) but no Python implementation is found in ~/.hermes/hermes-agent/agent/, hermes_cli/, or tools/ (only traces in .git/objects/pack/). The flag does nothing when enabled.
  3. auxiliary task models — route specific sub-tasks (compression, vision, search) to specialized models, not the main agent.

There is no way for the agent itself to look at a user prompt, classify its complexity, and upgrade its own model accordingly — even though agent.switch_model() exists in core (cli.py:5226, tui_gateway/server.py:756) and the gateway already implements per-session model overrides (_session_model_overrides[session_key] in gateway/run.py).
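
The first limitation can be illustrated with a minimal sketch. It assumes, as described above, that only inbound user messages beginning with /model are treated as commands; `route_message` and its `sender` parameter are hypothetical names for illustration, not Hermes APIs.

```python
from typing import Optional


def route_message(sender: str, text: str) -> Optional[str]:
    """Return the requested model slug if the message is a /model command.

    Hypothetical stand-in for the gateway's dispatch logic: only messages
    sent by the user are inspected for slash commands.
    """
    if sender != "user":
        # Agent output is delivered as plain text and is never re-parsed.
        return None
    parts = text.strip().split()
    if len(parts) == 2 and parts[0] == "/model":
        return parts[1]
    return None
```

Under this assumption, the same text switches the model when the user sends it and does nothing when the agent writes it, which is the gap the proposed tool is meant to close.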

Related issues:

  • #157 — User-Configurable Multi-Model Routing with Capability Categories (broader framework, tool-declared needs)
  • #5997 — LLM model switch by skill (skill-level frontmatter declaration)

This issue is more focused: let the agent itself call a tool to switch its own model based on classification logic that lives in the agent's system prompt (SOUL.md) or in skills.

Concrete use case (driving this request)

A self-hosted Hermes deployment with a single OAuth ChatGPT Plus subscription. The user wants their agent (@rza_hermes_bot) to default to gpt-5.4-mini for routine ops (status checks, log reads, YAML edits) but auto-upgrade to gpt-5.4 for medium-complexity tasks (multi-file refactors, debugging Python with cross-module stack traces) and to gpt-5.5 for code-heavy work (script generation >100 lines, architecture review, long-context research). All three models are accessible via the same OAuth, so the only constraint is the weekly quota: erring toward the lower tier when rating complexity preserves quota for other profiles sharing the OAuth.

A SOUL.md classification rule (A/B/C tiers) was attempted, but the agent literally cannot execute /model — it has no tool to call. Workaround currently in place: bot detects complexity, prefixes its response with [⚠️ Recommande gpt-5.X — <reason>. Type /model gpt-5.X then resend if you want to upgrade], then answers in mini. This works but adds a round-trip per complex task.

Proposed Solution

Add a built-in agent tool model_switch that wraps the existing infrastructure:

File: tools/model_switch_tool.py (new)

Tool signature (rough sketch):

from typing import Literal

# NOTE: `tool` is the tool-registration decorator assumed by this sketch;
# the issue does not show where it is imported from.
@tool(
    name="model_switch",
    description=(
        "Switch the active model for the current session. Use when the current task "
        "requires significantly more (or less) capability than the default model. "
        "The switch takes effect for the NEXT response (or the current one, if Hermes "
        "supports mid-turn re-prompting). Provide a short justification to help users "
        "understand the cost/benefit."
    ),
)
def model_switch(slug: str, reason: str, scope: Literal["session", "turn"] = "session") -> dict:
    """Switch the active model (signature sketch only; no implementation yet).

    Args:
        slug: Model identifier accepted by the current provider (e.g., gpt-5.5, claude-opus-4-7).
        reason: One-sentence justification, surfaced to the user in the response.
        scope: "session" (persists until /model reset, default) or "turn" (one-shot, revert after).

    Returns:
        {"old_model": str, "new_model": str, "provider": str, "scope": str, "applied_at": "next_turn"}
    """
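
The two `scope` values could behave roughly as in the sketch below. It is an illustration under stated assumptions, not the Hermes implementation: `ModelOverrides`, `switch`, and `model_for_next_turn` are hypothetical names, and the one-shot "turn" override is modelled as a value consumed by the next turn.

```python
from typing import Dict, Literal


class ModelOverrides:
    """Hypothetical per-session override store with session and turn scopes."""

    def __init__(self, default_model: str) -> None:
        self.default_model = default_model
        self._session: Dict[str, str] = {}  # persists until reset
        self._turn: Dict[str, str] = {}     # consumed by the next turn only

    def switch(
        self,
        session_key: str,
        slug: str,
        scope: Literal["session", "turn"] = "session",
    ) -> None:
        if scope == "turn":
            self._turn[session_key] = slug
        else:
            self._session[session_key] = slug

    def model_for_next_turn(self, session_key: str) -> str:
        # A one-shot override wins once, then is discarded (the "revert").
        one_shot = self._turn.pop(session_key, None)
        if one_shot is not None:
            return one_shot
        return self._session.get(session_key, self.default_model)
```

With this design a "turn" switch needs no explicit revert call: the override is removed as soon as it is read, so the following turn falls back to the session override or the default.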

Implementation skeleton: The tool reuses hermes_cli.model_switch.switch_model() (already imported by gateway/run.py:5805 and cli.py:5331) and updates gateway._session_model_overrides[session_key] for the running session. The agent process picks up the new model on the next turn via the existing _evict_cached_agent + override resolution path.
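
The override path described in the skeleton can be sketched with in-memory stand-ins. Everything here is an assumption for illustration: `session_model_overrides` mimics `gateway._session_model_overrides`, `resolve_model` stands in for the existing override resolution, and the real tool would delegate to `hermes_cli.model_switch.switch_model()`, whose signature is not shown in the issue. The returned dict is a reduced version of the proposed return value.

```python
from typing import Dict

# Hypothetical stand-in for gateway._session_model_overrides.
session_model_overrides: Dict[str, str] = {}
DEFAULT_MODEL = "gpt-5.4-mini"


def resolve_model(session_key: str) -> str:
    """Model the next turn would use: the override if set, else the default."""
    return session_model_overrides.get(session_key, DEFAULT_MODEL)


def model_switch(session_key: str, slug: str, reason: str) -> dict:
    """Record a per-session override and report what changed."""
    old_model = resolve_model(session_key)
    session_model_overrides[session_key] = slug
    return {
        "old_model": old_model,
        "new_model": slug,
        "reason": reason,
        "applied_at": "next_turn",
    }
```

Only the session that called the tool is affected; other sessions keep resolving to the default until they set their own override.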

Toolset registration: Add model_switch to the default toolset (or behind a config flag agent.allow_self_model_switch: true for opt-in safety).

Optional safeguards:

  • model_switch_allowlist config key listing which models the agent is allowed to switch to (defaults to all configured providers' models).
  • approvals.model_switch: manual to require user /approve before each switch (consistent with the existing dangerous_commands HITL pattern).
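
A minimal sketch of how the opt-in flag and the two safeguards might combine into a single pre-switch check. The config keys come from the proposal above; the function name `check_switch_allowed`, its parameters, and the "auto" approval mode are hypothetical (the issue only names "manual").

```python
from typing import Iterable, Optional, Tuple


def check_switch_allowed(
    slug: str,
    self_switch_enabled: bool,
    allowlist: Optional[Iterable[str]] = None,
    approval_mode: str = "auto",
    user_approved: bool = False,
) -> Tuple[bool, str]:
    """Return (allowed, message) for a requested self model switch."""
    # Mirrors agent.allow_self_model_switch: the tool is opt-in.
    if not self_switch_enabled:
        return False, "self model switch is disabled"
    # Mirrors model_switch_allowlist: None means no restriction.
    if allowlist is not None and slug not in set(allowlist):
        return False, f"{slug} is not in the allowlist"
    # Mirrors approvals.model_switch: manual requires a prior /approve.
    if approval_mode == "manual" and not user_approved:
        return False, "awaiting user approval"
    return True, "ok"
```

Checking the flag first keeps the default behaviour unchanged for deployments that never enable the tool.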

Out of scope (for this issue)

  • Auto-classification of prompt complexity (that's the agent's job via SOUL.md; this issue just enables the mechanism for the agent to act on its classification).
  • Tool-declared model requirements (covered by #157).
  • Skill-level frontmatter routing (covered by #5997).

This issue is the smallest atomic change: make the existing gateway-internal _session_model_overrides mechanism agent-callable via a new tool. SOUL.md prompts, skills, and external orchestrators can then build smarter routing policies on top.

Why this is valuable

  • Cost control without sacrificing capability: the agent self-throttles, using cheap models for routine work and expensive ones only when justified.
  • Aligns with agentic patterns in adjacent frameworks (Open Interpreter, OpenHands, sweep.dev) where agents have meta-control over their own runtime.
  • Composes with existing features: combine with dangerous_commands.mode = manual and the agent gets full HITL control over its own escalation.
  • Single-OAuth multi-tier setups (like ChatGPT Plus with mini/standard/pro models on one subscription) benefit immediately.

Workaround in production today

For users who need this now, the workaround is:

  1. SOUL.md tells the agent to detect A/B/C complexity tiers
  2. For B/C tasks, the agent prefixes its response with [⚠️ Recommande gpt-5.X — <reason>. Type /model gpt-5.X then resend if you want to upgrade]
  3. The user types /model gpt-5.X then re-sends their prompt
  4. The new turn runs on the upgraded model

Functional, but it adds a round-trip and relies on user discipline.
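
The prefix step of this workaround can be sketched as a small helper. The tier-to-model mapping is an assumption inferred from the use case (A for routine ops, B for medium tasks, C for code-heavy work), `upgrade_prefix` is a hypothetical name, and the message wording is an English adaptation of the bot's notice.

```python
# Assumed mapping from SOUL.md complexity tiers to models.
TIER_MODELS = {"A": "gpt-5.4-mini", "B": "gpt-5.4", "C": "gpt-5.5"}


def upgrade_prefix(tier: str, reason: str) -> str:
    """Build the notice the agent prepends to its answer for B/C tasks."""
    if tier == "A":
        # Routine work stays on the default model, so no notice is needed.
        return ""
    model = TIER_MODELS[tier]
    return (
        f"[⚠️ Recommend {model}: {reason}. "
        f"Type /model {model} then resend if you want to upgrade] "
    )
```

The classification itself still happens in the prompt; the helper only shows the shape of the message the user has to act on.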

Related code references

  • gateway/run.py:5794: _handle_model_command (current /model user-only path)
  • gateway/run.py:5928,6056: where _session_model_overrides[key] is set
  • cli.py:5226,5447: where agent.switch_model(...) is called
  • hermes_cli/model_switch.py: switch_model() core logic (~600 lines, well-factored)
  • agent/auxiliary_client.py:2308: get_auxiliary_extra_body (pattern for tool-level model routing, not session-level)

I'm willing to draft a PR if there's interest from maintainers — just want to confirm direction before investing.

extent analysis

TL;DR

Implement a new agent tool model_switch to enable autonomous model switching based on task complexity.

Guidance

  1. Create a new tool: Develop model_switch_tool.py with the proposed model_switch function, utilizing existing hermes_cli.model_switch.switch_model() and updating _session_model_overrides.
  2. Register the tool: Add model_switch to the default toolset or behind a config flag agent.allow_self_model_switch for opt-in safety.
  3. Implement safeguards: Consider adding model_switch_allowlist and approvals.model_switch config keys for security and control.
  4. Test and refine: Verify that a switch updates _session_model_overrides for the running session, that the next turn runs on the new model, and that a "turn"-scoped switch reverts afterward.

Example

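from typing import Literal

# NOTE: `tool` is the tool-registration decorator assumed by the proposal;
# its import path is not shown in the issue.
@tool(
    name="model_switch",
    description="Switch the active model for the current session.",
)
def model_switch(slug: str, reason: str, scope: Literal["session", "turn"] = "session") -> dict:
    # Placeholder: the body would call the existing switch logic and
    # record the per-session override.
    raise NotImplementedError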

Notes

This solution focuses on creating a mechanism for the agent to switch models based on task complexity, without addressing the complexity classification itself, which is assumed to be handled by the agent's SOUL.md prompts or skills.

Recommendation

Apply the proposed workaround until the model_switch tool is implemented, as it provides a functional, albeit less efficient, solution for users who need this feature immediately.
