hermes - feat(config): support per-model reasoning_effort defaults via custom_providers [1 pull request]

Fix Action

Fixed

Problem

reasoning_effort is currently a flat global setting — it can only be set at agent.reasoning_effort (main agent) or delegation.reasoning_effort (subagents). There is no mechanism to set a different default reasoning effort per model.

This is a gap because:

  • Different models have very different reasoning characteristics. Some models (e.g. claude-opus-4.6, gpt-5.5, deepseek-v4-pro) benefit from xhigh reasoning effort, while others (e.g. gemini-flash, deepseek-v4-flash) should run at medium or none.
  • Switching models via /model does not change the reasoning effort — the global setting persists, which can mean expensive reasoning effort on a cheap/fast model, or insufficient reasoning on a premium model.
  • The custom_providers mechanism already supports per-provider overrides for context_length and arbitrary extra_body fields, but reasoning_effort bypasses this entirely and always resolves to the global value.

Proposed Solution

Extend the custom_providers config schema to accept a reasoning_effort key at the provider level and, optionally, under a per-model entry in that provider's models map:

model:
  default: openrouter/anthropic/claude-sonnet-4-6
  provider: openrouter

custom_providers:
  - name: openrouter
    base_url: https://openrouter.ai/api/v1
    api_key: ${OPENROUTER_API_KEY}
    reasoning_effort: high          # default for all models from this provider

  - name: deepseek
    base_url: https://api.deepseek.com
    api_key: ${DEEPSEEK_API_KEY}
    models:
      deepseek-v4-pro:
        reasoning_effort: xhigh
      deepseek-v4-flash:
        reasoning_effort: none

The resolution chain should be:

  1. Per-model override (if custom_providers[N].models.<model>.reasoning_effort is set)
  2. Provider-level default (if custom_providers[N].reasoning_effort is set)
  3. Global agent.reasoning_effort (existing fallback)

The existing delegation.reasoning_effort remains unchanged — it only governs subagent defaults, not the main agent.
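
The resolution chain above can be sketched roughly as follows. This is only an illustration, not the project's actual code: the function name resolve_reasoning_effort, the EXAMPLE_CONFIG dict, and the global value "medium" are assumptions made for the example, and the config is assumed to be already parsed from YAML into plain dicts and lists.

```python
from typing import Any, Optional


def resolve_reasoning_effort(
    config: dict[str, Any], provider_name: str, model_name: str
) -> Optional[str]:
    """Return the effective reasoning effort for the main agent.

    Hypothetical helper. Lookup order: per-model override, then
    provider-level default, then the global agent.reasoning_effort.
    """
    for provider in config.get("custom_providers") or []:
        if provider.get("name") != provider_name:
            continue
        # 1. Per-model override under custom_providers[N].models.<model>
        model_cfg = (provider.get("models") or {}).get(model_name) or {}
        if model_cfg.get("reasoning_effort") is not None:
            return model_cfg["reasoning_effort"]
        # 2. Provider-level default under custom_providers[N]
        if provider.get("reasoning_effort") is not None:
            return provider["reasoning_effort"]
        break  # matching provider found, but it sets nothing
    # 3. Existing global fallback
    return (config.get("agent") or {}).get("reasoning_effort")


# Mirrors the proposed YAML; the global "medium" is an assumed value.
EXAMPLE_CONFIG: dict[str, Any] = {
    "agent": {"reasoning_effort": "medium"},
    "custom_providers": [
        {"name": "openrouter", "reasoning_effort": "high"},
        {
            "name": "deepseek",
            "models": {
                "deepseek-v4-pro": {"reasoning_effort": "xhigh"},
                "deepseek-v4-flash": {"reasoning_effort": "none"},
            },
        },
    ],
}

if __name__ == "__main__":
    for provider, model in [
        ("deepseek", "deepseek-v4-pro"),
        ("deepseek", "deepseek-v4-flash"),
        ("openrouter", "anthropic/claude-sonnet-4-6"),
        ("deepseek", "some-other-model"),
    ]:
        print(provider, model, resolve_reasoning_effort(EXAMPLE_CONFIG, provider, model))
```

In this sketch the string "none" counts as an explicit setting, so a per-model none is returned rather than falling through to the provider or global value. Only a missing or null key is treated as unset.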

Alternatives Considered

Model Presets (#20249) — that proposal is complementary (per-turn model escalation) rather than a static per-model default. This request is narrower: a static default reasoning effort per model that persists across all sessions and turns.

Per-session /reasoning override — exists today and is useful for one-off changes, but requires manual toggling every time the model changes and does not survive session boundaries.

References

  • Bug: #20576 (thinking_token_budget not forwarded to vLLM/custom providers — related to incomplete custom_providers reasoning plumbing)
  • PR: #18737 (propagating custom_providers thinking config to DeepSeek API — same pattern as this request)
  • Feature: #20249 (Model Presets — complementary, not overlapping)

Labels

area/config, comp/agent, type/feature
