hermes - 💡(How to fix) Fix feat: automatic in-provider reasoning fallback when API rejects thinking/reasoning_effort parameters

StepCodex · 2026-05-29T18:54:45Z


Root Cause

When a user selects a reasoning effort level (e.g., `/reasoning high`) or switches to a model with different reasoning semantics within the same provider, Hermes often receives an HTTP 400 because the chosen `reasoning_effort` / `thinking` / `reasoning` configuration is incompatible with the target model or endpoint.

Fix Action

Fix / Workaround

The proposed in-provider reasoning fallback mirrors the recovery patterns already present in the conversation loop:

- `thinking_signature` → strip `reasoning_details` and retry
- `invalid_encrypted_content` → disable replay and retry
- `multimodal_tool_content_unsupported` → downgrade to text and retry
- `llama_cpp_grammar_pattern` → strip patterns and retry

Would you be open to a PR implementing this? Yes. I can prepare a patch against `error_classifier.py`, `conversation_loop.py`, and the provider profile base class if the maintainers agree with the direction.

Feature Request

Problem

Examples of configurations that trigger the HTTP 400:

- OpenCode Go + kimi-k2.6: the provider profile sends both `extra_body.thinking` and top-level `reasoning_effort`; the upstream rejects the request with `"cannot specify both 'thinking' and 'reasoning_effort'"` (#32040, #32327).
- xAI Grok fast models: return `"does not support parameter reasoningEffort"` (#23088).
- Cerebras / GLM / custom endpoints: reject unknown reasoning fields outright.
- User mistake: typing `/reasoning minimal` on a model that only accepts `low|medium|high`.

Currently Hermes classifies most of these as generic format_error (non-retryable 400) and immediately aborts or falls back to a different provider. This is heavy-handed: the user often just wants the conversation to continue on the same provider with reasoning disabled or clamped to a safe default.

Desired Behavior

Add an in-provider reasoning fallback:

1. When an API call fails with HTTP 400 and the error message indicates an invalid/unsupported reasoning or thinking parameter, Hermes should retry once on the same provider/model after stripping or correcting the reasoning configuration.
2. The recovery order should be:
   - Attempt A (original request with user reasoning config)
   - Attempt B (same provider/model, reasoning config sanitized → `None` / `{"enabled": false}` / provider-safe default)
   - Only if Attempt B also fails → proceed to normal provider fallback.
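The recovery order above can be sketched as follows. This is a minimal illustration, not Hermes code: `run_with_reasoning_fallback`, `send`, and `is_reasoning_rejection` are hypothetical names, and treating every other error as an immediate move to the next provider is a simplifying assumption.

```python
from typing import Any, Callable, Optional, Sequence


def run_with_reasoning_fallback(
    providers: Sequence[str],
    reasoning_config: Optional[dict],
    send: Callable[[str, Optional[dict]], Any],
    is_reasoning_rejection: Callable[[Exception], bool],
) -> Any:
    """Try Attempt A, then Attempt B on the same provider, then the next provider."""
    last_error: Optional[Exception] = None
    for provider in providers:
        # Attempt A: original request with the user's reasoning config.
        try:
            return send(provider, reasoning_config)
        except Exception as err:
            last_error = err
            if not is_reasoning_rejection(err):
                continue  # unrelated failure: go straight to provider fallback
        # Attempt B: same provider, reasoning config sanitized to None.
        try:
            return send(provider, None)
        except Exception as err:
            last_error = err  # Attempt B failed too: fall through to next provider
    if last_error is None:
        raise ValueError("no providers configured")
    raise last_error
```

The key property is that Attempt B runs at most once per provider, so a provider that rejects even the sanitized request cannot cause a retry loop.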

This mirrors the existing recovery patterns already present in the conversation loop:

- `thinking_signature` → strip `reasoning_details` and retry
- `invalid_encrypted_content` → disable replay and retry
- `multimodal_tool_content_unsupported` → downgrade to text and retry
- `llama_cpp_grammar_pattern` → strip patterns and retry

Concrete Scenarios

| Scenario | Error signal | Retry action |
|----------|--------------|--------------|
| OpenCode Go kimi-k2.6 with both `thinking` + `reasoning_effort` | `"cannot specify both 'thinking' and 'reasoning_effort'"` | Drop `reasoning_effort` (keep `thinking` toggle) or vice-versa depending on provider profile |
| xAI Grok fast | `"does not support parameter reasoningEffort"` | Strip `reasoning_effort` entirely |
| Unsupported reasoning level | `"invalid reasoning_effort"` or `"reasoning_effort must be one of..."` | Clamp to `medium` or `low`, or disable reasoning |
| Model rejects `thinking` field | `"unknown parameter: thinking"` | Strip `extra_body.thinking` |
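To make the scenarios concrete, here is a hedged sketch that maps each error signal to its retry action on a request-kwargs dict. The function name `sanitize_reasoning_kwargs` is hypothetical, the kwargs layout (top-level `reasoning_effort`, nested `extra_body.thinking`) is taken from the OpenCode Go description in this issue, and choosing `medium` as the clamp target is one of the options listed rather than a decided default.

```python
import copy


def sanitize_reasoning_kwargs(api_kwargs: dict, error_message: str) -> dict:
    """Return a corrected copy of api_kwargs based on the provider's 400 message."""
    msg = error_message.lower()
    fixed = copy.deepcopy(api_kwargs)  # never mutate the caller's kwargs

    if ("cannot specify both" in msg
            or "does not support parameter reasoningeffort" in msg):
        # Dual-parameter conflict or unsupported effort: drop reasoning_effort,
        # keeping any thinking toggle that is present.
        fixed.pop("reasoning_effort", None)
    elif "unknown parameter: thinking" in msg:
        # Model rejects the thinking field: strip extra_body.thinking.
        fixed.get("extra_body", {}).pop("thinking", None)
    elif ("invalid reasoning_effort" in msg
            or "reasoning_effort must be one of" in msg):
        # Unsupported level: clamp to a level assumed to be widely accepted.
        fixed["reasoning_effort"] = "medium"
    return fixed


# Usage: the payload values here are made up for illustration.
request = {
    "model": "kimi-k2.6",
    "reasoning_effort": "high",
    "extra_body": {"thinking": {"type": "enabled"}},
}
retry_request = sanitize_reasoning_kwargs(
    request, "cannot specify both 'thinking' and 'reasoning_effort'"
)
```

A provider profile could override the first branch to drop `thinking` instead, which is the "vice-versa" case in the table.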

Suggested Implementation

Error classifier (`agent/error_classifier.py`)
- Add `FailoverReason.invalid_reasoning_config`
- Detect patterns:
  - `"cannot specify both 'thinking' and 'reasoning_effort'"`
  - `"does not support parameter reasoningEffort"`
  - `"reasoning_effort"` together with any of `"invalid"` / `"unsupported"` / `"unknown parameter"`
  - `"thinking"` together with any of `"invalid"` / `"unsupported"` / `"unknown parameter"`
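The detection rules above could look roughly like this. It is a standalone sketch: the `FailoverReason` enum here is a two-member stand-in for the real one, and `classify_http_400` is a hypothetical function name, not the existing classifier's API.

```python
from enum import Enum


class FailoverReason(Enum):
    # Stand-in enum; the real FailoverReason has more members.
    format_error = "format_error"
    invalid_reasoning_config = "invalid_reasoning_config"


# Phrases that identify a reasoning-config rejection on their own (lowercased).
_EXACT_PHRASES = (
    "cannot specify both 'thinking' and 'reasoning_effort'",
    "does not support parameter reasoningeffort",
)
# A parameter name plus a complaint word must both appear in the message.
_PARAM_NAMES = ("reasoning_effort", "thinking")
_COMPLAINTS = ("invalid", "unsupported", "unknown parameter")


def classify_http_400(message: str) -> FailoverReason:
    """Classify an HTTP 400 error message by case-insensitive substring match."""
    text = message.lower()
    if any(phrase in text for phrase in _EXACT_PHRASES):
        return FailoverReason.invalid_reasoning_config
    names_param = any(name in text for name in _PARAM_NAMES)
    complains = any(word in text for word in _COMPLAINTS)
    if names_param and complains:
        return FailoverReason.invalid_reasoning_config
    return FailoverReason.format_error
```

Two caveats with this sketch. The substring `"thinking"` is broad, so in a real classifier this check would probably need to run after the existing `thinking_signature` detection to avoid shadowing it. Also, the `"reasoning_effort must be one of..."` signal from the scenarios table is not covered by the listed patterns and would need its own entry.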
Conversation loop (`agent/conversation_loop.py`)
- Add a `reasoning_fallback_retry_attempted` flag.
- Branch: if `classified.reason == FailoverReason.invalid_reasoning_config` and not yet retried:
  - Set `agent.reasoning_config = None` (or provider-profile-specific safe default)
  - Rebuild API kwargs via `build_api_kwargs`
  - Log: "Provider rejected reasoning config — retrying with reasoning disabled..."
  - continue (retry)
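A minimal sketch of that branch, assuming a simplified loop: `Agent`, `Classified`, `run_turn`, and the injected `build_api_kwargs` / `call_api` / `classify` callables are all hypothetical stand-ins, since the real signatures in `conversation_loop.py` are not shown in this issue.

```python
import logging
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Optional

logger = logging.getLogger("reasoning_fallback_sketch")


class FailoverReason(Enum):
    format_error = "format_error"
    invalid_reasoning_config = "invalid_reasoning_config"


@dataclass
class Classified:
    reason: FailoverReason


@dataclass
class Agent:
    reasoning_config: Optional[dict]


def run_turn(
    agent: Agent,
    build_api_kwargs: Callable[[Agent], dict],
    call_api: Callable[[dict], Any],
    classify: Callable[[Exception], Classified],
) -> Any:
    reasoning_fallback_retry_attempted = False
    while True:
        api_kwargs = build_api_kwargs(agent)  # rebuilt on every attempt
        try:
            return call_api(api_kwargs)
        except Exception as err:
            classified = classify(err)
            if (classified.reason == FailoverReason.invalid_reasoning_config
                    and not reasoning_fallback_retry_attempted):
                reasoning_fallback_retry_attempted = True
                agent.reasoning_config = None  # or a provider-safe default
                logger.warning(
                    "Provider rejected reasoning config; "
                    "retrying with reasoning disabled..."
                )
                continue  # retry once on the same provider/model
            raise  # anything else goes to the normal failure path
```

Note that this sketch leaves `agent.reasoning_config` set to `None` after the retry, so later turns also run without reasoning. Whether the user's original setting should be restored afterwards is a design decision the issue leaves open.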
Provider profile hook (optional, preferred)
- Add a method to `ProviderProfile` (e.g., `safe_reasoning_fallback(reasoning_config) -> dict | None`)
- This lets OpenCode Go return `thinking` only (no `reasoning_effort`) for kimi-k2, while xAI returns `None`, and so on.
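One possible shape for the hook is sketched below. The base class is a stand-in for the real `ProviderProfile`, the subclass names are invented, and the `reasoning_config` layout (a dict with `thinking` and `reasoning_effort` keys) is an assumption made for illustration only.

```python
from typing import Optional


class ProviderProfile:
    """Stand-in base class; the real ProviderProfile has many more members."""

    def safe_reasoning_fallback(
        self, reasoning_config: Optional[dict]
    ) -> Optional[dict]:
        # Default: disable reasoning entirely on retry.
        return None


class OpenCodeGoProfile(ProviderProfile):
    """Hypothetical profile: keep only the thinking toggle, drop reasoning_effort."""

    def safe_reasoning_fallback(
        self, reasoning_config: Optional[dict]
    ) -> Optional[dict]:
        if not reasoning_config:
            return None
        kept = {k: v for k, v in reasoning_config.items() if k == "thinking"}
        return kept or None


class XaiProfile(ProviderProfile):
    """Hypothetical profile: inherits the default and returns None."""


user_config = {"thinking": {"enabled": True}, "reasoning_effort": "high"}
opencode_retry = OpenCodeGoProfile().safe_reasoning_fallback(user_config)
xai_retry = XaiProfile().safe_reasoning_fallback(user_config)
```

With a hook like this, the conversation loop would not need provider-specific branches: it would ask the active profile for the fallback config and rebuild the request from the result.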

Benefits

- Better UX: a wrong `/reasoning` level or a model switch no longer hard-aborts the session.
- Safer model switching: users can switch between reasoning and non-reasoning variants inside the same provider without manually tweaking `/reasoning none` first.
- Less provider-fallback noise: many 400s are resolved inside the current provider, so `fallback_providers` is only used for genuine provider outages or auth issues.

Related Issues

#32040 — OpenCode Go kimi-k2.6 dual-parameter 400
#32327 — OpenCode provider sends both thinking and reasoning_effort
#23088 — xAI Grok fast does not support reasoningEffort
#31589 — Provider-scoped reasoning overrides (complementary)
