hermes - 💡(How to fix) Fix RFC: User-Centric Reasoning Architecture with Intelligent Adaptation

RFC: User-Centric Reasoning Architecture with Intelligent Adaptation

Problem Statement

Currently, when users request enhanced reasoning (e.g., /reasoning high), they expect "smarter responses". They don't know, and shouldn't need to know, about the technical differences between thinking toggles and reasoning_effort parameters across providers.

Current reactive approach:

User: /reasoning high
System: Send both thinking + reasoning_effort params
Provider: 400 error "cannot specify both..."
System: Parse error → Retry with adaptation
User: Finally gets response (with degraded experience)

Problems:

  • ❌ Reactive — errors happen before adaptation
  • ❌ Implementation details leak to users (error messages mention API params)
  • ❌ Recovery logic scattered across conversation_loop.py (violates SRP)
  • ❌ No capability awareness — same request sent to all providers
  • ❌ Poor UX — users see confusing warnings instead of transparent adaptation

Proposed Architecture

1. Three-Layer Separation (SOLID)

from dataclasses import dataclass
from typing import Literal

# Layer 1: User Intent (what user wants)
@dataclass
class ReasoningIntent:
    enabled: bool
    strength: Literal["minimal", "low", "medium", "high", "xhigh"]
    
    @classmethod
    def from_user_command(cls, cmd: str) -> "ReasoningIntent":
        """Parse /reasoning high, /thinking on, etc."""
        # ...

# Layer 2: Provider Capability (what provider supports)
@dataclass  
class ReasoningCapability:
    supports_thinking_toggle: bool
    supports_effort_level: bool
    preferred_param: Literal["thinking", "effort", "both", "none"]
    
    @classmethod
    def discover(cls, provider: str, model: str) -> "ReasoningCapability":
        """Detect capability via metadata or probe request"""
        # ...

# Layer 3: Adapter (translates intent → provider config)
class ReasoningAdapter:
    @staticmethod  # adapt() uses no instance state, so it takes no self
    def adapt(
        intent: ReasoningIntent,
        capability: ReasoningCapability,
    ) -> dict:
        """
        Translate user intent to provider-specific config.
        
        Examples:
        - User wants "high reasoning", provider supports only thinking
          → {"thinking": {"type": "enabled"}}
          
        - User wants "high reasoning", provider supports only effort
          → {"reasoning_effort": "high"}
          
        - User wants "high reasoning", provider supports both
          → {"thinking": {...}, "reasoning_effort": "high"}
          
        - User wants "high reasoning", provider supports neither
          → {} + warning message
        """
        # ...
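
The four docstring cases can be sketched as a small pure function. This is a minimal illustration under stated assumptions, not the proposed implementation: the `AdaptedConfig` wrapper and its `warning` field are hypothetical, the payload keys are copied from the docstring above, and sending nothing when reasoning is disabled is a guess. Because at least one provider rejects both parameters together, the sketch only sends both when `preferred_param` says so.

```python
from dataclasses import dataclass, field
from typing import Literal, Optional

Strength = Literal["minimal", "low", "medium", "high", "xhigh"]


@dataclass
class ReasoningIntent:
    enabled: bool
    strength: Strength = "medium"


@dataclass
class ReasoningCapability:
    supports_thinking_toggle: bool
    supports_effort_level: bool
    preferred_param: Literal["thinking", "effort", "both", "none"]


@dataclass
class AdaptedConfig:
    """Hypothetical result type: request params plus an optional user-facing note."""
    params: dict = field(default_factory=dict)
    warning: Optional[str] = None


def adapt(intent: ReasoningIntent, capability: ReasoningCapability) -> AdaptedConfig:
    if not intent.enabled:
        # Assumption: a disabled intent sends no reasoning params at all.
        return AdaptedConfig()

    thinking = {"thinking": {"type": "enabled"}}
    effort = {"reasoning_effort": intent.strength}
    can_think = capability.supports_thinking_toggle
    can_effort = capability.supports_effort_level

    if can_think and can_effort:
        # Only combine the two params when the provider is known to accept both.
        if capability.preferred_param == "both":
            return AdaptedConfig({**thinking, **effort})
        if capability.preferred_param == "thinking":
            return AdaptedConfig(thinking)
        return AdaptedConfig(effort)
    if can_effort:
        return AdaptedConfig(effort)
    if can_think:
        return AdaptedConfig(
            thinking,
            warning="Effort level not supported by this provider; thinking mode enabled.",
        )
    return AdaptedConfig(warning="This provider does not support reasoning controls.")
```

Returning the warning alongside the params keeps the adapter free of I/O, so the CLI layer can decide how (or whether) to show it.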

2. Proactive + Reactive Strategy

┌─────────────────────────────────────────┐
│ User Request: /reasoning high           │
└────────────┬────────────────────────────┘
             ↓
┌─────────────────────────────────────────┐
│ Check Capability Cache                  │
├────────────┬────────────────────────────┤
│ Cache Hit  │ Cache Miss                 │
│     ↓      │   ↓                        │
│ Adapt      │ Probe Provider             │
│ Immediately│   ↓                        │
│            │ Cache Result               │
│            │   ↓                        │
│            │ Adapt + Send               │
└────────────┬────────────────────────────┘
             ↓
┌─────────────────────────────────────────┐
│ Send Adapted Request                    │
│ (correct params from the start)         │
└────────────┬────────────────────────────┘
             ↓ (if 400 error - edge case)
┌─────────────────────────────────────────┐
│ Fallback: Parse error → Update cache    │
│ → Retry with corrected params           │
└─────────────────────────────────────────┘
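
The flow in the diagram can be sketched as a cache in front of a probe, with a single corrective retry. Everything named here is an assumption for illustration: `CapabilityCache`, `ProviderRejected` (a stand-in for the provider's 400 response), and the `probe` and `send` callables. Real error parsing is provider-specific; this sketch simply assumes a rejection means the effort parameter must be dropped.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class ReasoningCapability:
    supports_thinking_toggle: bool
    supports_effort_level: bool


class ProviderRejected(Exception):
    """Hypothetical stand-in for a provider's 400 response."""


Key = Tuple[str, str]  # (provider, model)


class CapabilityCache:
    def __init__(self, probe: Callable[[str, str], ReasoningCapability]):
        self._probe = probe
        self._entries: Dict[Key, ReasoningCapability] = {}

    def get(self, provider: str, model: str) -> ReasoningCapability:
        key = (provider, model)
        if key not in self._entries:  # cache miss: probe once, then store
            self._entries[key] = self._probe(provider, model)
        return self._entries[key]

    def update(self, provider: str, model: str, cap: ReasoningCapability) -> None:
        self._entries[(provider, model)] = cap


def build_params(cap: ReasoningCapability, strength: str) -> dict:
    params = {}
    if cap.supports_thinking_toggle:
        params["thinking"] = {"type": "enabled"}
    if cap.supports_effort_level:
        params["reasoning_effort"] = strength
    return params


def send_with_adaptation(cache, provider, model, strength, send):
    cap = cache.get(provider, model)
    try:
        return send(build_params(cap, strength))
    except ProviderRejected:
        # Edge case: the cached capability was wrong. Assumption: the fix is
        # to drop the effort level, record that, and retry exactly once.
        corrected = ReasoningCapability(cap.supports_thinking_toggle, False)
        cache.update(provider, model, corrected)
        return send(build_params(corrected, strength))
```

After one corrected request, later calls for the same provider and model hit the updated cache entry and send the right parameters on the first attempt.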

3. Transparent UX

Startup (proactive):

✓ Reasoning Mode: HIGH
  Provider: OpenCode Go (kimi-k2)
  Adapted: Thinking mode enabled (effort level not supported by this provider)
  Expected: Enhanced step-by-step reasoning

On capability mismatch (educational):

ℹ️ Provider Adaptation
  Your request: High-intensity reasoning
  Provider capability: Extended thinking mode (no granular effort control)
  Result: Using maximum available reasoning depth
  Tip: Switch to provider with effort-level support for fine-grained control

New commands:

  • /reasoning status — show current adaptation
  • /reasoning capabilities — list provider support matrix
  • /reasoning adapt <mode> — force specific adaptation strategy
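
A parser for these commands might look like the sketch below. `ReasoningCommand` and `parse_reasoning_command` are hypothetical names, and because the RFC does not list the valid values for `<mode>`, the sketch passes that argument through unvalidated.

```python
from dataclasses import dataclass
from typing import Optional

STRENGTHS = ("minimal", "low", "medium", "high", "xhigh")


@dataclass
class ReasoningCommand:
    action: str  # "set", "status", "capabilities", or "adapt"
    argument: Optional[str] = None


def parse_reasoning_command(text: str) -> ReasoningCommand:
    parts = text.strip().split()
    if len(parts) < 2 or parts[0] != "/reasoning":
        raise ValueError("usage: /reasoning <level|status|capabilities|adapt <mode>>")

    head = parts[1].lower()
    if head in STRENGTHS:
        return ReasoningCommand("set", head)
    if head in ("status", "capabilities"):
        return ReasoningCommand(head)
    if head == "adapt":
        if len(parts) < 3:
            raise ValueError("usage: /reasoning adapt <mode>")
        # The set of valid modes is not specified, so it is not checked here.
        return ReasoningCommand("adapt", parts[2].lower())
    raise ValueError(f"unknown /reasoning argument: {parts[1]!r}")
```

For example, `parse_reasoning_command("/reasoning adapt thinking")` yields an `adapt` action with the argument `thinking`, while an unrecognized argument raises `ValueError` so the CLI can print a usage hint.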

Implementation Plan

Phase 1: Extract Capability Layer

  • Create ReasoningCapability dataclass
  • Add capability metadata to provider profiles
  • Implement capability discovery (probe request)
  • Cache results in session state
  • Files: agent/reasoning_capability.py, provider profile updates
  • Tests: Unit tests for capability detection
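
One way the Phase 1 pieces could fit together is sketched below: read static metadata from the provider profile when it exists, fall back to a probe when it does not, and store the result in session state. The profile key `reasoning_capability`, its fields, and the `probe` callable are all assumptions, not an existing profile schema.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, MutableMapping, Optional


@dataclass(frozen=True)
class ReasoningCapability:
    supports_thinking_toggle: bool
    supports_effort_level: bool


def capability_from_profile(profile: Mapping) -> Optional[ReasoningCapability]:
    """Return declared capability, or None for profiles that declare nothing."""
    meta = profile.get("reasoning_capability")  # hypothetical profile key
    if meta is None:
        return None
    return ReasoningCapability(
        supports_thinking_toggle=bool(meta.get("thinking_toggle", False)),
        supports_effort_level=bool(meta.get("effort_level", False)),
    )


def discover(
    profile: Mapping,
    probe: Callable[[Mapping], ReasoningCapability],
    session_state: MutableMapping,
) -> ReasoningCapability:
    key = profile["name"]
    cached = session_state.get(key)
    if cached is not None:
        return cached
    cap = capability_from_profile(profile)
    if cap is None:
        # Profile predates capability metadata: probe the provider instead.
        cap = probe(profile)
    session_state[key] = cap
    return cap
```

Treating a missing key as "unknown" rather than "unsupported" means existing profiles keep working without edits; they just pay for one probe per session.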

Phase 2: Implement Adapter Pattern

  • Create ReasoningAdapter with strategy pattern
  • Implement adapters for each provider family
  • Move logic from conversation_loop.py to adapter
  • Files: agent/reasoning_adapter.py, agent/adapters/*.py
  • Tests: Matrix of intent × capability combinations
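
The intent × capability matrix could be covered with `unittest` subtests, as in this sketch. The `adapt` function here is a tiny stand-in, not the real adapter; the point is the shape of the test, which checks one invariant (never send a parameter the provider does not support) across every combination.

```python
import itertools
import unittest

STRENGTHS = ("minimal", "low", "medium", "high", "xhigh")


def adapt(strength: str, supports_thinking: bool, supports_effort: bool) -> dict:
    """Hypothetical stand-in for the adapter under test."""
    params = {}
    if supports_thinking:
        params["thinking"] = {"type": "enabled"}
    if supports_effort:
        params["reasoning_effort"] = strength
    return params


class AdapterMatrixTest(unittest.TestCase):
    def test_never_sends_unsupported_params(self):
        # 5 strengths x 2 thinking flags x 2 effort flags = 20 combinations.
        matrix = itertools.product(STRENGTHS, (False, True), (False, True))
        for strength, thinking, effort in matrix:
            with self.subTest(strength=strength, thinking=thinking, effort=effort):
                params = adapt(strength, thinking, effort)
                self.assertEqual("thinking" in params, thinking)
                expected_effort = strength if effort else None
                self.assertEqual(params.get("reasoning_effort"), expected_effort)
```

Saved in a test module, this runs with `python -m unittest`; a failing combination is reported with its own `strength`, `thinking` and `effort` values instead of aborting the whole loop.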

Phase 3: UX Integration

  • Add capability check to session startup
  • Implement /reasoning status command
  • Add transparent adaptation messages
  • Update error messages to be user-friendly
  • Files: cli.py, agent/conversation_loop.py
  • Tests: Integration tests for UX flows

Phase 4: Learning System (Optional Enhancement)

  • Store successful configurations in user profile
  • Suggest optimal settings per provider
  • Auto-adapt when switching providers
  • Files: agent/reasoning_learning.py
  • Tests: Persistence and recommendation tests

Benefits

User-Centric: Users think in "I want smart responses", not API parameters
Proactive: Most cases resolved before errors occur
Testable: Adapter logic easily tested in isolation
Extensible: New providers = new adapter strategies, no core changes
Educational: Users understand what they got vs what they asked for
SOLID Compliant: Single responsibility, open-closed, dependency inversion

Comparison with Current PR #34794

Current PR (Reactive Only):

  • Detects errors after they happen
  • Adapts within error recovery loop
  • Shows warnings about degradation
  • Good for edge cases, but not optimal UX

This Proposal (Proactive + Reactive):

  • Predicts capabilities before requests
  • Adapts transparently from the start
  • Educates users about provider differences
  • Eliminates most errors before they occur

Recommendation: Merge PR #34794 as an immediate bugfix, then implement this architecture as the comprehensive solution.

Open Questions for Discussion

  1. Scope: Is this too large for a single PR? Should it be split into phases?

  2. Backward Compatibility: How to preserve compatibility with existing provider profiles that don't declare capabilities?

  3. Capability Discovery Strategy:

    • Option A: Probe request at session start (adds latency, but accurate)
    • Option B: Lazy discovery on first use (no startup overhead)
    • Option C: Static metadata in provider profiles (fast, but requires maintenance)
  4. Cache Strategy:

    • Session-level (fresh each session, but repeated discovery)
    • User profile (persistent across sessions, but stale if provider changes)
    • Global cache (shared across users, but requires invalidation strategy)
  5. UX Granularity:

    • Minimal: Only show when adaptation happens
    • Moderate: Show adaptation at session start
    • Verbose: Detailed capability matrix and adaptation reasoning
  6. Migration Path: How to gradually adopt this architecture without breaking existing users?
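
For the cache strategy question, one possible invalidation scheme is a time-to-live on each entry, sketched below. This is only one option among those listed, and `ExpiringCache` is a hypothetical name. The clock is injectable so the expiry logic can be tested without sleeping; a cache persisted across sessions would need wall-clock timestamps rather than `time.monotonic`.

```python
import time
from typing import Callable, Dict, Generic, Optional, Tuple, TypeVar

T = TypeVar("T")


class ExpiringCache(Generic[T]):
    """Hypothetical in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, T]] = {}

    def put(self, key: str, value: T) -> None:
        self._entries[key] = (self._clock(), value)

    def get(self, key: str) -> Optional[T]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at > self._ttl:
            # Stale entry: drop it so the caller rediscovers the capability.
            del self._entries[key]
            return None
        return value
```

A short TTL behaves like the session-level option, while a long one approaches the persistent options and inherits their staleness risk.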

References

  • Issue #34786: Original feature request for reasoning fallback
  • PR #34794: Current reactive implementation (proposed for merge)
  • Issue #32040: OpenCode Go + kimi-k2 dual parameter bug
  • Issue #32327: Related reasoning parameter conflicts

Next Steps:

  1. Gather feedback from maintainers and community
  2. Prioritize phases based on complexity vs value
  3. Assign ownership for implementation
  4. Create tracking issues for each phase

Ready to discuss details and begin implementation if this approach aligns with the project vision.
