hermes - feat: automatic in-provider reasoning fallback when API rejects thinking/reasoning_effort parameters



Feature Request

Problem

When a user selects a reasoning effort level (e.g., /reasoning high) or switches to a model that has different reasoning semantics within the same provider, Hermes often receives an HTTP 400 because the chosen reasoning_effort / thinking / reasoning configuration is incompatible with the target model or endpoint. Examples:

  • OpenCode Go + kimi-k2.6: the provider profile sends both extra_body.thinking and top-level reasoning_effort; the upstream rejects "cannot specify both 'thinking' and 'reasoning_effort'" (#32040, #32327).
  • xAI Grok fast models: return "does not support parameter reasoningEffort" (#23088).
  • Cerebras / GLM / custom endpoints: reject unknown reasoning fields outright.
  • User mistake: typing /reasoning minimal on a model that only accepts low|medium|high.

Currently Hermes classifies most of these as generic format_error (non-retryable 400) and immediately aborts or falls back to a different provider. This is heavy-handed: the user often just wants the conversation to continue on the same provider with reasoning disabled or clamped to a safe default.

Desired Behavior

Add an in-provider reasoning fallback:

  1. When an API call fails with HTTP 400 and the error message indicates an invalid/unsupported reasoning or thinking parameter, Hermes should retry once on the same provider/model after stripping or correcting the reasoning configuration.
  2. The recovery order should be:
    • Attempt A (original request with user reasoning config)
    • Attempt B (same provider/model, reasoning config sanitized → None / {"enabled": false} / provider-safe default)
    • Only if Attempt B also fails → proceed to normal provider fallback.

This mirrors the existing recovery patterns already present in the conversation loop:

  • thinking_signature → strip reasoning_details and retry
  • invalid_encrypted_content → disable replay and retry
  • multimodal_tool_content_unsupported → downgrade to text and retry
  • llama_cpp_grammar_pattern → strip patterns and retry
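
The recovery order above (Attempt A, then Attempt B on the same provider, then normal provider fallback) can be sketched roughly as follows. This is a minimal illustration, not Hermes code: `ReasoningRejected`, `Attempt`, `call_with_reasoning_fallback`, and the `send` callback are hypothetical names, and the sketch assumes the safe default is simply `None`.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class ReasoningRejected(Exception):
    """Hypothetical stand-in for an HTTP 400 caused by the reasoning config."""


@dataclass
class Attempt:
    provider: str
    reasoning: Optional[dict]


def call_with_reasoning_fallback(
    send: Callable[[str, Optional[dict]], str],
    providers: list[str],
    reasoning: Optional[dict],
    log: list[Attempt],
) -> str:
    """Try the user's config, retry once sanitized, then move to the next provider."""
    last_error: Optional[Exception] = None
    for provider in providers:
        config = reasoning          # Attempt A uses the user's reasoning config
        retried = False
        while True:
            log.append(Attempt(provider, config))
            try:
                return send(provider, config)
            except ReasoningRejected as exc:
                last_error = exc
                if retried or config is None:
                    break           # Attempt B failed too: normal provider fallback
                config, retried = None, True  # Attempt B: same provider, reasoning off
    raise RuntimeError("all providers failed") from last_error
```

Note that the sketch restores the user's original reasoning config when it moves to the next provider; whether a fallback provider should instead inherit the sanitized config is a design decision the proposal leaves open.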

Concrete Scenarios

| Scenario | Error signal | Retry action |
| --- | --- | --- |
| OpenCode Go kimi-k2.6 with both thinking + reasoning_effort | "cannot specify both 'thinking' and 'reasoning_effort'" | Drop reasoning_effort (keep thinking toggle) or vice versa, depending on provider profile |
| xAI Grok fast | "does not support parameter reasoningEffort" | Strip reasoning_effort entirely |
| Unsupported reasoning level | "invalid reasoning_effort" or "reasoning_effort must be one of..." | Clamp to medium or low, or disable reasoning |
| Model rejects thinking field | "unknown parameter: thinking" | Strip extra_body.thinking |
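
As a rough illustration of the scenarios above, the retry action could be derived from the error text with a small pure function. This is only a sketch under assumptions: `sanitize_reasoning_kwargs` and `ALLOWED_EFFORTS` are hypothetical names, the request is modelled as a plain dict with a top-level `reasoning_effort` and an `extra_body.thinking` entry, and the choice to keep `thinking` in the dual-parameter case is one of the two options the issue allows.

```python
import copy
from typing import Any

# Assumption: taken from the "low|medium|high" example in this issue.
ALLOWED_EFFORTS = ("low", "medium", "high")


def sanitize_reasoning_kwargs(kwargs: dict[str, Any], error_message: str) -> dict[str, Any]:
    """Return an adjusted copy of the request kwargs; the input is not mutated."""
    fixed = copy.deepcopy(kwargs)
    msg = error_message.lower()
    extra = fixed.get("extra_body", {})

    if "cannot specify both" in msg:
        # Dual-parameter conflict: keep the thinking toggle, drop the effort level.
        fixed.pop("reasoning_effort", None)
    elif "does not support parameter" in msg and "reasoningeffort" in msg:
        # Model has no effort knob at all: strip it entirely.
        fixed.pop("reasoning_effort", None)
    elif "unknown parameter" in msg and "thinking" in msg:
        # Endpoint does not know the thinking field.
        extra.pop("thinking", None)
    elif "reasoning_effort" in msg and ("invalid" in msg or "must be one of" in msg):
        if fixed.get("reasoning_effort") not in ALLOWED_EFFORTS:
            fixed["reasoning_effort"] = "medium"   # clamp an unknown level
        else:
            fixed.pop("reasoning_effort", None)    # known level still rejected: disable
    return fixed
```

For example, a request carrying `reasoning_effort="minimal"` that is rejected with "reasoning_effort must be one of..." comes back with the level clamped to `"medium"`, while the original kwargs stay untouched so the caller can still log what the user asked for.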

Suggested Implementation

  1. Error classifier (agent/error_classifier.py)

    • Add FailoverReason.invalid_reasoning_config
    • Detect patterns:
      • "cannot specify both 'thinking' and 'reasoning_effort'"
      • "does not support parameter reasoningEffort"
      • "reasoning_effort" + "invalid" / "unsupported" / "unknown parameter"
      • "thinking" + "invalid" / "unsupported" / "unknown parameter"
  2. Conversation loop (agent/conversation_loop.py)

    • Add a reasoning_fallback_retry_attempted flag.
    • Branch: if classified.reason == FailoverReason.invalid_reasoning_config and not yet retried:
      • Set agent.reasoning_config = None (or provider-profile-specific safe default)
      • Rebuild API kwargs via build_api_kwargs
      • Log: "Provider rejected reasoning config — retrying with reasoning disabled..."
      • continue (retry)
  3. Provider profile hook (optional, preferred)

    • Add a method to ProviderProfile (e.g., safe_reasoning_fallback(reasoning_config) -> dict | None).
    • This lets each provider choose its own safe default: OpenCode Go can return "thinking" only (no reasoning_effort) for kimi-k2, while xAI can return None, etc.
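
A minimal sketch of steps 1 and 3, assuming nothing about the real contents of agent/error_classifier.py or the actual ProviderProfile class. The enum below is a two-member stand-in for the existing FailoverReason, `classify_400` and `OpenCodeGoProfile` are hypothetical names, and the shape of `reasoning_config` (a dict that may contain `thinking` and `reasoning_effort` keys) is assumed for illustration.

```python
import re
from enum import Enum
from typing import Optional


class FailoverReason(Enum):
    # Stand-in: the real enum has more members.
    format_error = "format_error"
    invalid_reasoning_config = "invalid_reasoning_config"


# A reasoning-related parameter name and a complaint, in either order.
_PARAM = r"(reasoning_effort|reasoningeffort|thinking)"
_COMPLAINT = (
    r"(invalid|unsupported|unknown parameter|does not support"
    r"|cannot specify both|must be one of)"
)
_REASONING_400 = re.compile(rf"{_PARAM}.*{_COMPLAINT}|{_COMPLAINT}.*{_PARAM}", re.DOTALL)


def classify_400(status: int, message: str) -> Optional[FailoverReason]:
    """Classify an HTTP 400; any other status is out of scope and returns None."""
    if status != 400:
        return None
    if _REASONING_400.search(message.lower()):
        return FailoverReason.invalid_reasoning_config
    return FailoverReason.format_error


class ProviderProfile:
    """Hypothetical base class: the default fallback disables reasoning."""

    def safe_reasoning_fallback(self, reasoning_config: Optional[dict]) -> Optional[dict]:
        return None


class OpenCodeGoProfile(ProviderProfile):
    """Hypothetical override: keep only the thinking toggle, drop reasoning_effort."""

    def safe_reasoning_fallback(self, reasoning_config: Optional[dict]) -> Optional[dict]:
        kept = {k: v for k, v in (reasoning_config or {}).items() if k == "thinking"}
        return kept or None
```

The pattern is deliberately loose and would also match an unrelated 400 whose text happens to contain both "thinking" and "invalid". Because the retry is limited to a single attempt, such a false positive costs one extra request before the normal fallback path takes over.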

Benefits

  • Better UX: a wrong /reasoning level or a model switch no longer hard-aborts the session.
  • Safer model switching: users can switch between reasoning and non-reasoning variants inside the same provider without manually tweaking /reasoning none first.
  • Less provider-fallback noise: many 400s are resolved inside the current provider, so fallback_providers is only used for genuine provider outages or auth issues.

Related Issues

  • #32040 — OpenCode Go kimi-k2.6 dual-parameter 400
  • #32327 — OpenCode provider sends both thinking and reasoning_effort
  • #23088 — xAI Grok fast does not support reasoningEffort
  • #31589 — Provider-scoped reasoning overrides (complementary)

Would you be open to a PR implementing this? Yes — I can prepare a patch against error_classifier.py, conversation_loop.py, and the provider profile base class if the maintainers agree with the direction.
