hermes - Support per-model reasoning_effort in custom_providers[].models

GitHub stats
NousResearch/hermes-agent#15511 (fetched 2026-04-26 05:26:58)
Comments: 0 · Participants: 1 · Timeline events: 7 · Reactions: 0
Timeline (top): labeled ×4, closed ×2, reopened ×1


Problem Statement

Currently, reasoning_effort is a global setting under agent.reasoning_effort that applies to all models equally. This creates a mismatch when switching between models with different reasoning capabilities:

Model | Reasoning Support | Desired Setting
DeepSeek V4 Pro | ✅ 3 levels (low/high/max) | high or xhigh
DeepSeek V4 Flash | ✅ 3 levels | medium
Qwen3.6 Plus | ✅ Binary (on/off) | medium
GLM-5 / GLM-5.1 | ❌ Not supported | none (ignored)
MiniMax M2.7 | ❌ Not supported | none (ignored)

Current behavior: Users must manually run /reasoning <level> after /model <name> to match the model's capabilities, or create separate profiles for each model.

Desired behavior: When switching models, the appropriate reasoning_effort should be automatically applied based on per-model configuration.


Proposed Solution

Extend custom_providers[].models schema to support per-model reasoning_effort:

custom_providers:
  - name: my-provider
    base_url: https://api.example.com/v1
    api_key: ${API_KEY}
    models:
      deepseek-v4-pro:
        context_length: 1000000
        reasoning_effort: high      # NEW: per-model reasoning
      deepseek-v4-flash:
        context_length: 1000000
        reasoning_effort: medium
      qwen3-6-plus:
        context_length: 1000000
        reasoning_effort: medium    # Reasoning model, medium triggers CoT
      glm-5:
        context_length: 200000
        reasoning_effort: none      # Explicitly disable for unsupported models
      minimax-m2-7:
        context_length: 204800
        reasoning_effort: none

Implementation Details

1. Config Schema Extension

Extend _normalize_custom_provider_entry() in hermes_cli/config.py:

# Around line 2139 in config.py
reasoning_effort = entry.get("reasoning_effort")
if isinstance(reasoning_effort, str) and reasoning_effort.strip():
    # Normalize once, then validate against the known levels
    from hermes_constants import VALID_REASONING_EFFORTS
    level = reasoning_effort.strip().lower()
    if level in VALID_REASONING_EFFORTS:
        normalized["reasoning_effort"] = level
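A self-contained sketch of how the same validation might be applied to a single per-model entry. The contents of VALID_REASONING_EFFORTS are not shown in this issue, so the set below is an assumed stand-in, and normalize_model_entry is a hypothetical helper name rather than an existing hermes function.

```python
# Stand-in for hermes_constants.VALID_REASONING_EFFORTS. The real set is not
# shown in the issue, so these values are an assumption for illustration.
VALID_REASONING_EFFORTS = {"none", "low", "medium", "high", "xhigh", "max"}


def normalize_model_entry(entry: dict) -> dict:
    """Return a copy of a per-model entry with reasoning_effort cleaned up.

    Hypothetical helper: non-string or unknown values are dropped, so the
    caller can fall back to the global agent.reasoning_effort.
    """
    # Copy every other key (e.g. context_length) unchanged.
    normalized = {k: v for k, v in entry.items() if k != "reasoning_effort"}
    raw = entry.get("reasoning_effort")
    if isinstance(raw, str):
        level = raw.strip().lower()
        if level in VALID_REASONING_EFFORTS:
            normalized["reasoning_effort"] = level
    return normalized
```

Dropping an invalid value instead of raising keeps existing configs loadable; whether a warning should be logged in that case is left open here.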

2. Runtime Resolution

In AIAgent.__init__() or run_conversation():

def _get_model_specific_reasoning(model: str, config: dict) -> str | None:
    """Look up per-model reasoning_effort from custom_providers."""
    # "or" guards against keys that are present but empty (None) in the YAML
    for provider in config.get("custom_providers") or []:
        models = provider.get("models") or {}
        if model in models:
            return (models[model] or {}).get("reasoning_effort")
    return None

# Priority: per-model > global agent.reasoning_effort > default (medium)
per_model_reasoning = _get_model_specific_reasoning(self.model, config)
if per_model_reasoning:
    self.reasoning_effort = per_model_reasoning
elif not self.reasoning_effort:  # Not already set from agent.reasoning_effort
    self.reasoning_effort = "medium"  # Default
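The priority order above can also be written as a pure function, which makes it easy to test outside the agent. This is a minimal sketch: resolve_reasoning_effort is a hypothetical name, and it assumes the global setting lives at config["agent"]["reasoning_effort"] and that each provider's models value is a mapping.

```python
from typing import Optional

DEFAULT_REASONING_EFFORT = "medium"  # default named in the issue


def get_model_specific_reasoning(model: str, config: dict) -> Optional[str]:
    """Return the per-model reasoning_effort, or None if not configured."""
    for provider in config.get("custom_providers") or []:
        models = provider.get("models") or {}
        if isinstance(models, dict) and isinstance(models.get(model), dict):
            return models[model].get("reasoning_effort")
    return None


def resolve_reasoning_effort(model: str, config: dict) -> str:
    """Apply the priority: per-model > global agent setting > default."""
    per_model = get_model_specific_reasoning(model, config)
    if per_model:
        return per_model
    # Assumed location of the global setting: agent.reasoning_effort
    global_effort = (config.get("agent") or {}).get("reasoning_effort")
    return global_effort or DEFAULT_REASONING_EFFORT


if __name__ == "__main__":
    cfg = {
        "agent": {"reasoning_effort": "low"},
        "custom_providers": [
            {"models": {"deepseek-v4-pro": {"reasoning_effort": "high"}}}
        ],
    }
    print(resolve_reasoning_effort("deepseek-v4-pro", cfg))
    print(resolve_reasoning_effort("some-other-model", cfg))
```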

3. /model Command Integration

When user runs /model <name> in-session, automatically apply the model's reasoning:

# In process_command() for /model
new_model = args.strip()
new_reasoning = _get_model_specific_reasoning(new_model, config)
if new_reasoning:
    self.reasoning_config = parse_reasoning_effort(new_reasoning)
    # Optionally notify user
    print(f"Model switched to {new_model}, reasoning set to {new_reasoning}")
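One detail the snippet above leaves open: if the new model has no per-model setting, the previous model's reasoning level would stay in effect. A possible approach is to re-resolve the full priority chain on every switch. The sketch below uses a hypothetical Session class as a stand-in for the real session state, and again assumes models is a mapping and the global setting is at agent.reasoning_effort.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    """Hypothetical stand-in for the in-session state that /model mutates."""
    config: dict
    model: str = ""
    reasoning_effort: str = "medium"

    def _per_model_effort(self, model: str) -> Optional[str]:
        # Look the model up under every custom provider's models mapping.
        for provider in self.config.get("custom_providers") or []:
            entry = (provider.get("models") or {}).get(model)
            if isinstance(entry, dict):
                return entry.get("reasoning_effort")
        return None

    def switch_model(self, name: str) -> str:
        """Switch models and re-resolve reasoning from scratch."""
        name = name.strip()
        per_model = self._per_model_effort(name)
        global_effort = (self.config.get("agent") or {}).get("reasoning_effort")
        self.model = name
        # Re-resolving here means a stale per-model level never carries over.
        self.reasoning_effort = per_model or global_effort or "medium"
        origin = "auto-applied from config" if per_model else "global or default"
        return f"Model: {name}\nReasoning: {self.reasoning_effort} ({origin})"
```

With this shape, switching from a model configured as high to one with no entry returns to the global level instead of silently keeping high.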

Backward Compatibility

  • If reasoning_effort is not specified for a model, fall back to global agent.reasoning_effort
  • Existing configs without per-model settings work unchanged
  • An empty or none reasoning_effort (for models that do not support the parameter) means no extra API parameters are sent
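The last rule can be sketched as a small request builder that omits the parameter entirely when reasoning is empty or none. build_request_kwargs is a hypothetical helper, and the request shape (model, messages, reasoning_effort) is an assumption about an OpenAI-compatible endpoint, not a confirmed hermes code path.

```python
from typing import Optional


def build_request_kwargs(
    model: str, messages: list, reasoning_effort: Optional[str] = None
) -> dict:
    """Build request arguments, adding reasoning_effort only when meaningful.

    Hypothetical helper: an empty, missing, or "none" level results in no
    reasoning_effort key at all, so unsupported models never see it.
    """
    kwargs = {"model": model, "messages": messages}
    level = (reasoning_effort or "").strip().lower()
    if level and level != "none":
        kwargs["reasoning_effort"] = level
    return kwargs
```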

Benefits

  1. One-command model switch: /model deepseek-v4-pro automatically sets optimal reasoning
  2. No profile proliferation: Single config file manages all model-specific behavior
  3. Explicit documentation: Config clearly shows which models support reasoning
  4. Prompt caching preserved: Reasoning is resolved when a model is selected (at session start or on /model), not changed between turns of a conversation

Alternatives Considered

Alternative | Pros | Cons
Multiple profiles | Already works | Profile management overhead
Manual /reasoning after /model | Simple | Two-step, easy to forget
Auto-detect from model name | No config needed | Fragile naming patterns, hard to maintain

Use Case Example

# User's config.yaml
custom_providers:
  - name: my-custom-provider
    base_url: https://my-api-server.example.com/v1
    models:
      deepseek-v4-pro-high:
        context_length: 1000000
        reasoning_effort: high
      glm-5:
        context_length: 128000
        reasoning_effort: none

# In session
> /model deepseek-v4-pro-high
Model: deepseek-v4-pro-high
Reasoning: high (auto-applied from config)

> /model glm-5  
Model: glm-5
Reasoning: none (auto-applied from config, no API overhead)

Additional Context

  • DeepSeek V4 Pro/Flash support reasoning_effort parameter: low, high, max
  • Qwen3.6 Plus is a reasoning model that either uses CoT or doesn't
  • GLM-5, MiniMax M2.7 do not support reasoning_effort API parameter
  • Current workaround: hermes profile create <name> for each model type

Would you be willing to submit a PR for this feature? Yes, I can contribute the implementation if the design is approved.

extent analysis

TL;DR

Implement per-model reasoning_effort configuration to automatically apply the appropriate reasoning setting when switching between models.

Guidance

  1. Extend the custom_providers[].models schema: Add a reasoning_effort field to each model's configuration to specify the desired reasoning level.
  2. Update the config normalization: Modify _normalize_custom_provider_entry() to validate and normalize the reasoning_effort field for each model.
  3. Implement runtime resolution: Use the _get_model_specific_reasoning() function to determine the per-model reasoning effort and apply it when switching models.
  4. Integrate with the /model command: Automatically apply the model's reasoning setting when the user runs /model <name> in-session.
  5. Test and verify: Ensure that the implementation works as expected for different models and reasoning settings.

Example

# Example config.yaml
custom_providers:
  - name: my-provider
    base_url: https://api.example.com/v1
    models:
      deepseek-v4-pro:
        context_length: 1000000
        reasoning_effort: high
      glm-5:
        context_length: 200000
        reasoning_effort: none

Notes

The implementation should handle cases where reasoning_effort is not specified for a model, falling back to the global agent.reasoning_effort setting. Additionally, the solution should preserve existing configs without per-model settings and ensure that empty or none reasoning effort for unsupported models does not send extra API parameters.

Recommendation

Apply the proposed solution by extending the custom_providers[].models schema and implementing the necessary changes to automatically apply per-model reasoning settings. This approach provides a flexible and maintainable solution that meets the requirements and use cases outlined in the issue.
