hermes - [Feature]: Per-auxiliary reasoning effort configuration

hermes · 2026-05-26 21:07:24

Problem or Use Case

I use deepseek-v4-flash as the model for most auxiliary subsystems (compression, session_search, web_extract, approval, mcp, title_generation, curator) via the opencode-go provider. With agent.reasoning_effort: xhigh set globally, every aux task now sends reasoning_effort: max on every API call — including cheap, high-frequency tasks like compression and session_search. These tasks would benefit from low or no reasoning (faster, cheaper) while the main conversation model still uses high reasoning for quality.

Proposed Solution

Allow each subsystem section under the existing auxiliary: config to accept an optional reasoning_effort key. When set, it overrides the global agent.reasoning_effort for that subsystem only. When unset, the subsystem inherits the global default.

auxiliary:
  compression:
    provider: opencode-go
    model: deepseek-v4-flash
    reasoning_effort: none   # new — override global for this aux only
  session_search:
    provider: opencode-go
    model: deepseek-v4-flash
    reasoning_effort: low
  vision:
    provider: opencode-go
    model: glm-5.1
    # inherits global (irrelevant — non-thinking model)
  web_extract:
    provider: opencode-go
    model: deepseek-v4-flash
    reasoning_effort: none
  approval:
    provider: opencode-go
    model: deepseek-v4-flash
    # inherits global
  mcp:
    provider: opencode-go
    model: deepseek-v4-flash
    # inherits global
  title_generation:
    provider: opencode-go
    model: deepseek-v4-flash
    # inherits global
  triage_specifier:
    provider: auto
    model: deepseek-v4-flash
    # inherits global
  curator:
    provider: opencode-go
    model: deepseek-v4-flash
    # inherits global

Internally, agent/auxiliary_client.py would read the per-subsystem reasoning_effort value (if present) or fall back to agent.reasoning_effort, then pass it into the reasoning_config kwarg when building API kwargs for that subsystem.
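
The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of agent/auxiliary_client.py: the function names resolve_reasoning_effort and build_api_kwargs are hypothetical, the config is assumed to be an already-parsed dict, and the {"effort": ...} shape of reasoning_config is an assumption.

```python
from typing import Any, Dict, Optional


def resolve_reasoning_effort(config: Dict[str, Any], subsystem: str) -> Optional[str]:
    """Return the effort for one auxiliary subsystem, else the global value."""
    aux_section = (config.get("auxiliary") or {}).get(subsystem) or {}
    if "reasoning_effort" in aux_section:
        # An explicit per-subsystem value wins, including the string "none".
        return aux_section["reasoning_effort"]
    # Fall back to the global setting; None if that is unset as well.
    return (config.get("agent") or {}).get("reasoning_effort")


def build_api_kwargs(config: Dict[str, Any], subsystem: str) -> Dict[str, Any]:
    """Build request kwargs for a subsystem (hypothetical kwarg layout)."""
    aux_section = (config.get("auxiliary") or {}).get(subsystem) or {}
    kwargs: Dict[str, Any] = {"model": aux_section.get("model")}
    effort = resolve_reasoning_effort(config, subsystem)
    if effort is not None:
        # Assumed shape of reasoning_config; the real structure may differ.
        kwargs["reasoning_config"] = {"effort": effort}
    return kwargs
```

With agent.reasoning_effort set to xhigh, a compression section that sets reasoning_effort: none would resolve to "none", while a curator section without the key would still resolve to "xhigh".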

Alternatives Considered

Set aux tasks to a non-thinking model: discards the user's choice of deepseek-v4-flash for those slots.
Global reasoning_effort: low: degrades the main model's output quality.
Global auxiliary.reasoning_effort: low: too coarse; curator and vision might want reasoning while compression and session_search don't.
Do nothing: acceptable but wasteful; every aux call pays the thinking-token overhead.

Feature Type

Configuration option

Scope

Small (single file, < 50 lines): agent/auxiliary_client.py would read and propagate the per-subsystem value. A config schema update in hermes_cli/config.py is needed only if strict validation is desired.
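
If strict validation is wanted, the schema check could look something like the sketch below. Everything here is hypothetical: validate_auxiliary_efforts is not an existing function in hermes_cli/config.py, and the allowed value set is an assumption (only none, low, and xhigh appear in this issue; medium and high are guesses).

```python
from typing import Any, Dict, List

# Assumed value set: only "none", "low", and "xhigh" appear in the issue text.
ALLOWED_EFFORTS = frozenset({"none", "low", "medium", "high", "xhigh"})


def validate_auxiliary_efforts(auxiliary: Dict[str, Any]) -> List[str]:
    """Return one error message per invalid auxiliary.<name>.reasoning_effort."""
    errors: List[str] = []
    for name, section in auxiliary.items():
        # Sections without the key simply inherit the global value.
        if not isinstance(section, dict) or "reasoning_effort" not in section:
            continue
        value = section["reasoning_effort"]
        if not isinstance(value, str) or value not in ALLOWED_EFFORTS:
            errors.append(
                f"auxiliary.{name}.reasoning_effort: unsupported value {value!r}"
            )
    return errors
```

Returning a list of messages rather than raising on the first problem lets the caller report every misconfigured subsystem at once.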
