hermes: [Feature]: Expose model_switch as an agent-callable tool for autonomous task-complexity-based routing (1 participant)

hermes, 2026-04-27 11:46:46


GitHub stats

NousResearch/hermes-agent#16525 • Fetched 2026-04-28 06:52:47


Author

razorglyon


Fix Action

Fix / Workaround

The driving setup is a self-hosted Hermes deployment with a single OAuth ChatGPT Plus subscription. The user wants their agent (@rza_hermes_bot) to default to gpt-5.4-mini for routine ops (status checks, log reads, YAML edits), but to auto-upgrade to gpt-5.4 for medium-complexity tasks (multi-file refactor, debugging Python with a cross-module stack) and to gpt-5.5 for code-heavy work (script generation of more than 100 lines, architecture review, long-context research). All three models are accessible via the same OAuth, so the only constraint is the weekly quota: under-estimating complexity (staying on the cheaper model when in doubt) preserves quota for other profiles sharing the OAuth.
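The tiering in this use case can be written down as a small lookup table. This is only a sketch: the A/B/C letters come from the SOUL.md rule mentioned in this issue, but which letter maps to which model is an assumption, and `TIER_MODELS`, `DEFAULT_TIER`, and `model_for_tier` are hypothetical names rather than Hermes APIs.

```python
# Hypothetical tier table; the letter-to-model assignment is an assumption.
TIER_MODELS = {
    "A": "gpt-5.4-mini",  # routine ops: status checks, log reads, YAML edits
    "B": "gpt-5.4",       # medium: multi-file refactor, cross-module debugging
    "C": "gpt-5.5",       # code-heavy: long scripts, architecture review
}

DEFAULT_TIER = "A"


def model_for_tier(tier: str) -> str:
    """Return the model slug for a tier, falling back to the cheapest tier."""
    return TIER_MODELS.get(tier.upper(), TIER_MODELS[DEFAULT_TIER])
```

Falling back to the cheapest tier for an unknown label mirrors the quota-preserving bias described above; a deployment that prefers capability over quota could fall back to a higher tier instead.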

A SOUL.md classification rule (A/B/C tiers) was attempted, but the agent literally cannot execute /model — it has no tool to call. Workaround currently in place: bot detects complexity, prefixes its response with [⚠️ Recommande gpt-5.X — <reason>. Type /model gpt-5.X then resend if you want to upgrade], then answers in mini. This works but adds a round-trip per complex task.


[Feature]: Expose `model_switch` as an agent-callable tool for autonomous task-complexity-based routing

Problem / Use Case

Hermes currently has three model-routing mechanisms, none of which solves autonomous self-routing by the agent:

1. /model <slug> slash command: handled by gateway/run.py:5794:_handle_model_command. It is triggered only when a user message starts with /model. The agent cannot trigger it (writing /model gpt-5.5 in its response is just text; the gateway does not re-inject bot messages as user commands).
2. smart_model_routing config schema: keys exist in DEFAULT_CONFIG (max_simple_chars, max_simple_words, cheap_model), but no Python implementation is found in ~/.hermes/hermes-agent/agent/, hermes_cli/, or tools/ (only traces in .git/objects/pack/). The flag does nothing when enabled.
3. auxiliary task models: these route specific sub-tasks (compression, vision, search) to specialized models, not the main agent.

There is no way for the agent itself to look at a user prompt, classify its complexity, and upgrade its own model accordingly — even though agent.switch_model() exists in core (cli.py:5226, tui_gateway/server.py:756) and the gateway already implements per-session model overrides (_session_model_overrides[session_key] in gateway/run.py).
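A toy dispatcher makes the gap concrete. It is not the code in gateway/run.py: `Message`, its `role` field, and `dispatch` are invented for illustration, and the sketch assumes only what this issue states, namely that the command handler runs for user messages starting with /model and that bot output is never fed back in as a command.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str  # "user" or "assistant" (hypothetical field)
    text: str


def dispatch(msg: Message, overrides: dict, session_key: str) -> str:
    """Toy stand-in for the gateway's command routing (not Hermes code)."""
    parts = msg.text.split(maxsplit=1)
    if msg.role == "user" and len(parts) == 2 and parts[0] == "/model":
        # Only a user-authored "/model <slug>" updates the per-session override.
        overrides[session_key] = parts[1].strip()
        return f"model set to {overrides[session_key]}"
    # Assistant output is plain text, even when it looks like a command.
    return "no command"
```

With this shape, the same string has an effect only when its role is "user", which is why a tool call is needed for the agent to reach the override path.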

Related issues:

#157 — User-Configurable Multi-Model Routing with Capability Categories (broader framework, tool-declared needs)
#5997 — LLM model switch by skill (skill-level frontmatter declaration)

This issue is more focused: let the agent itself call a tool to switch its own model based on classification logic that lives in the agent's system prompt (SOUL.md) or in skills.

Concrete use case (driving this request)

Proposed Solution

Add a built-in agent tool model_switch that wraps the existing infrastructure:

File: tools/model_switch_tool.py (new)

Tool signature (rough sketch):

from typing import Literal

# `tool` is assumed to be Hermes's tool-registration decorator; its import path is not shown in this issue.
@tool(
    name="model_switch",
    description=(
        "Switch the active model for the current session. Use when the current task "
        "requires significantly more (or less) capability than the default model. "
        "The switch takes effect for the NEXT response (or the current one, if Hermes "
        "supports mid-turn re-prompting). Provide a short justification to help users "
        "understand the cost/benefit."
    ),
)
def model_switch(slug: str, reason: str, scope: Literal["session", "turn"] = "session") -> dict:
    """
    Args:
        slug: Model identifier accepted by the current provider (e.g., gpt-5.5, claude-opus-4-7).
        reason: 1-sentence justification, surfaced to the user in the response.
        scope: "session" (persists until /model reset, default) or "turn" (one-shot, revert after).

    Returns:
        {"old_model": str, "new_model": str, "provider": str, "scope": str, "applied_at": "next_turn"}
    """

Implementation skeleton: The tool reuses hermes_cli.model_switch.switch_model() (already imported by gateway/run.py:5805 and cli.py:5331) and updates gateway._session_model_overrides[session_key] for the running session. The agent process picks up the new model on the next turn via the existing _evict_cached_agent + override resolution path.
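The override mechanics described here could look roughly like the following. This is a sketch under stated assumptions, not the proposed implementation: it uses a plain in-memory dict in place of gateway._session_model_overrides, it does not call hermes_cli.model_switch.switch_model() (whose signature is not shown in this issue), it omits the "provider" field, and `resolve_model`, `end_turn`, `_turn_restore`, `DEFAULT_MODEL`, and the explicit `session_key` parameter are hypothetical.

```python
from typing import Dict, Literal, Optional

# Stand-in for gateway._session_model_overrides; the real structure may differ.
_session_model_overrides: Dict[str, str] = {}
# For one-shot switches: the override to restore afterwards (None = no prior override).
_turn_restore: Dict[str, Optional[str]] = {}

DEFAULT_MODEL = "gpt-5.4-mini"  # assumed default for this sketch


def resolve_model(session_key: str) -> str:
    """Model that the next turn would run on."""
    return _session_model_overrides.get(session_key, DEFAULT_MODEL)


def model_switch(session_key: str, slug: str, reason: str,
                 scope: Literal["session", "turn"] = "session") -> dict:
    old = resolve_model(session_key)
    if scope == "turn":
        # Remember the pre-switch override once, so repeated one-shot calls
        # still restore the original state.
        _turn_restore.setdefault(session_key, _session_model_overrides.get(session_key))
    else:
        # A session-scoped switch cancels any pending one-shot restore.
        _turn_restore.pop(session_key, None)
    _session_model_overrides[session_key] = slug
    return {"old_model": old, "new_model": slug, "scope": scope,
            "reason": reason, "applied_at": "next_turn"}


def end_turn(session_key: str) -> None:
    """Call after the upgraded turn completes to undo a one-shot switch."""
    if session_key not in _turn_restore:
        return
    previous = _turn_restore.pop(session_key)
    if previous is None:
        _session_model_overrides.pop(session_key, None)
    else:
        _session_model_overrides[session_key] = previous
```

In a real tool the session key would presumably come from the tool's execution context rather than from an argument the model supplies; it is a parameter here only to keep the sketch self-contained.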

Toolset registration: Add model_switch to the default toolset (or behind a config flag agent.allow_self_model_switch: true for opt-in safety).

Optional safeguards:

model_switch_allowlist config key listing which models the agent is allowed to switch to (defaults to all configured providers' models).
approvals.model_switch: manual to require user /approve before each switch (consistent with the existing dangerous_commands HITL pattern).
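The opt-in flag and the two safeguards can be combined into a single pre-switch check, sketched below. `SwitchPolicy` and `check_switch` are hypothetical names; the "auto" approval value and the convention that an empty allowlist means "any configured model" are assumptions made for this example, since the issue only names the config keys.

```python
from dataclasses import dataclass, field


@dataclass
class SwitchPolicy:
    """Hypothetical view of the config keys proposed in this issue."""
    allow_self_model_switch: bool = False          # agent.allow_self_model_switch
    allowlist: set = field(default_factory=set)    # model_switch_allowlist; empty = any model (assumed)
    approval: str = "auto"                         # approvals.model_switch: "auto" (assumed) or "manual"


def check_switch(policy: SwitchPolicy, slug: str) -> str:
    """Return "deny", "needs_approval", or "allow" for a requested switch."""
    if not policy.allow_self_model_switch:
        return "deny"
    if policy.allowlist and slug not in policy.allowlist:
        return "deny"
    if policy.approval == "manual":
        return "needs_approval"
    return "allow"
```

Checking the opt-in flag first keeps the default behaviour unchanged for deployments that never enable the tool.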

Out of scope (for this issue)

Auto-classification of prompt complexity (that's the agent's job via SOUL.md; this issue just enables the mechanism for the agent to act on its classification).
Tool-declared model requirements (covered by #157).
Skill-level frontmatter routing (covered by #5997).

This issue is the smallest atomic change: make the existing gateway-internal _session_model_overrides mechanism agent-callable via a new tool. SOUL.md prompts, skills, and external orchestrators can then build smarter routing policies on top.

Why this is valuable

Cost control without sacrificing capability: the agent self-throttles, using cheap models for routine work and expensive ones only when justified.
Aligns with agentic patterns in adjacent frameworks (Open Interpreter, OpenHands, sweep.dev) where agents have meta-control over their own runtime.
Composes with existing features: combine with dangerous_commands.mode = manual and the agent gets full HITL control over its own escalation.
Single-OAuth multi-tier setups (like ChatGPT Plus with mini/standard/pro models on one subscription) benefit immediately.

Workaround in production today

For users who need this now, the workaround is:

SOUL.md tells the agent to detect A/B/C complexity tiers
For B/C tasks, the agent prefixes its response with [⚠️ Recommande gpt-5.X — <reason>. Type /model gpt-5.X then resend if you want to upgrade]
The user types /model gpt-5.X then re-sends their prompt
The new turn runs on the upgraded model

This is functional, but it adds a round-trip and relies on user discipline.
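The advisory step of the workaround amounts to string formatting, sketched here. The helper names are hypothetical, the banner wording is adapted into English rather than copied from the production bot, and treating tiers "B" and "C" as the ones that trigger the hint follows the steps listed above.

```python
def upgrade_banner(slug: str, reason: str) -> str:
    """Build the advisory prefix shown while the bot still answers on the default model."""
    return (f"[Recommend {slug}: {reason}. "
            f"Type /model {slug} then resend if you want to upgrade]")


def with_upgrade_hint(answer: str, tier: str, slug: str, reason: str) -> str:
    """Prefix the answer with the banner for B/C tasks; leave A-tier answers untouched."""
    if tier in ("B", "C"):
        return f"{upgrade_banner(slug, reason)}\n\n{answer}"
    return answer
```

Because the banner only advises, the answer that follows it is still produced by the default model; the upgrade happens only if the user sends /model and repeats the prompt.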

Related code references

gateway/run.py:5794 — _handle_model_command (current /model user-only path)
gateway/run.py:5928,6056 — where _session_model_overrides[key] is set
cli.py:5226,5447 — where agent.switch_model(...) is called
hermes_cli/model_switch.py — switch_model() core logic (~600 lines, well-factored)
agent/auxiliary_client.py:2308 — get_auxiliary_extra_body (pattern for tool-level model routing, not session-level)

I'm willing to draft a PR if there's interest from maintainers — just want to confirm direction before investing.

extent analysis

TL;DR

Implement a new agent tool model_switch to enable autonomous model switching based on task complexity.

Guidance

Create a new tool: Develop model_switch_tool.py with the proposed model_switch function, utilizing existing hermes_cli.model_switch.switch_model() and updating _session_model_overrides.
Register the tool: Add model_switch to the default toolset or behind a config flag agent.allow_self_model_switch for opt-in safety.
Implement safeguards: Consider adding model_switch_allowlist and approvals.model_switch config keys for security and control.
Test and refine: Verify the tool's functionality and adjust as needed to ensure seamless integration with existing features.

Example

@tool(
    name="model_switch",
    description="Switch the active model for the current session.",
)
def model_switch(slug: str, reason: str, scope: Literal["session", "turn"] = "session") -> dict:
    # Implementation details
    pass

Notes

This solution focuses on creating a mechanism for the agent to switch models based on task complexity, without addressing the complexity classification itself, which is assumed to be handled by the agent's SOUL.md prompts or skills.

Recommendation

Apply the proposed workaround until the model_switch tool is implemented, as it provides a functional, albeit less efficient, solution for users who need this feature immediately.
