hermes - RFC: User-Centric Reasoning Architecture with Intelligent Adaptation

hermes, 2026-05-29 21:50:11




RFC: User-Centric Reasoning Architecture with Intelligent Adaptation

Problem Statement

Currently, when users request enhanced reasoning (e.g., /reasoning high), they expect "smarter responses" — they don't know (and shouldn't need to know) about technical differences between thinking toggles and reasoning_effort parameters across providers.

Current reactive approach:

User: /reasoning high
  ↓
System: Send both thinking + reasoning_effort params
  ↓
Provider: 400 error "cannot specify both..."
  ↓
System: Parse error → Retry with adaptation
  ↓
User: Finally gets response (with degraded experience)

Problems:

❌ Reactive — errors happen before adaptation
❌ Implementation details leak to users (error messages mention API params)
❌ Recovery logic scattered across conversation_loop.py (violates SRP)
❌ No capability awareness — same request sent to all providers
❌ Poor UX — users see confusing warnings instead of transparent adaptation

Proposed Architecture

1. Three-Layer Separation (SOLID)

from dataclasses import dataclass
from typing import Literal

# Layer 1: User Intent (what the user wants)
@dataclass
class ReasoningIntent:
    enabled: bool
    strength: Literal["minimal", "low", "medium", "high", "xhigh"]

    @classmethod
    def from_user_command(cls, cmd: str) -> "ReasoningIntent":
        """Parse /reasoning high, /thinking on, etc."""
        # ...

# Layer 2: Provider Capability (what the provider supports)
@dataclass
class ReasoningCapability:
    supports_thinking_toggle: bool
    supports_effort_level: bool
    preferred_param: Literal["thinking", "effort", "both", "none"]

    @classmethod
    def discover(cls, provider: str, model: str) -> "ReasoningCapability":
        """Detect capability via metadata or a probe request."""
        # ...

# Layer 3: Adapter (translates intent → provider config)
class ReasoningAdapter:
    @staticmethod  # stateless translation, so no instance is required
    def adapt(
        intent: ReasoningIntent,
        capability: ReasoningCapability,
    ) -> dict:
        """
        Translate user intent to provider-specific config.

        Examples:
        - User wants "high reasoning", provider supports only thinking
          → {"thinking": {"type": "enabled"}}

        - User wants "high reasoning", provider supports only effort
          → {"reasoning_effort": "high"}

        - User wants "high reasoning", provider supports both
          → {"thinking": {...}, "reasoning_effort": "high"}

        - User wants "high reasoning", provider supports neither
          → {} + warning message
        """
        # ...
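
The four cases listed in the adapter docstring could be implemented roughly as follows. This is a minimal sketch under stated assumptions, not the RFC's implementation: the `AdaptedRequest` wrapper, the default values, and the way `preferred_param` narrows the output when a provider supports both mechanisms are all hypothetical.

```python
from dataclasses import dataclass
from typing import Literal, Optional

Strength = Literal["minimal", "low", "medium", "high", "xhigh"]


@dataclass(frozen=True)
class ReasoningIntent:
    enabled: bool
    strength: Strength = "medium"


@dataclass(frozen=True)
class ReasoningCapability:
    supports_thinking_toggle: bool
    supports_effort_level: bool
    preferred_param: Literal["thinking", "effort", "both", "none"] = "none"


@dataclass(frozen=True)
class AdaptedRequest:
    """Hypothetical result type: request params plus an optional user notice."""
    params: dict
    notice: Optional[str] = None


def adapt(intent: ReasoningIntent, cap: ReasoningCapability) -> AdaptedRequest:
    """Map a provider-agnostic intent onto parameters the provider accepts."""
    if not intent.enabled:
        return AdaptedRequest({})

    thinking = {"thinking": {"type": "enabled"}}
    effort = {"reasoning_effort": intent.strength}

    if cap.supports_thinking_toggle and cap.supports_effort_level:
        # Assumption: preferred_param says which parameter to send when the
        # provider knows both but may reject them together.
        if cap.preferred_param == "thinking":
            return AdaptedRequest(thinking)
        if cap.preferred_param == "effort":
            return AdaptedRequest(effort)
        return AdaptedRequest({**thinking, **effort})
    if cap.supports_thinking_toggle:
        return AdaptedRequest(
            thinking,
            "Thinking mode enabled (effort level not supported by this provider)",
        )
    if cap.supports_effort_level:
        return AdaptedRequest(effort)
    return AdaptedRequest({}, "This provider does not expose reasoning controls")
```

Because `adapt` is a pure function of its two inputs, each intent and capability combination can be asserted directly without a network call.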

2. Proactive + Reactive Strategy

┌─────────────────────────────────────────┐
│ User Request: /reasoning high           │
└────────────┬────────────────────────────┘
             ↓
┌─────────────────────────────────────────┐
│ Check Capability Cache                  │
├────────────┬────────────────────────────┤
│ Cache Hit  │ Cache Miss                 │
│     ↓      │     ↓                      │
│ Adapt      │ Probe Provider             │
│ Immediately│   ↓                        │
│            │ Cache Result               │
│            │   ↓                        │
│            │ Adapt + Send               │
└────────────┴────────────────────────────┘
             ↓
┌─────────────────────────────────────────┐
│ Send Adapted Request                    │
│ (correct params from the start)         │
└────────────┬────────────────────────────┘
             ↓ (if 400 error - edge case)
┌─────────────────────────────────────────┐
│ Fallback: Parse error → Update cache    │
│ → Retry with corrected params           │
└─────────────────────────────────────────┘
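
The flow in the diagram can be sketched as a single function: consult the cache, probe on a miss, send adapted parameters, and fall back to error parsing only when a 400 still occurs. Everything named here (`ProviderError`, `build_params`, `send_with_adaptation`, the dict-based capability shape) is hypothetical, and the choice to drop the effort parameter on a conflict is an assumption made for the example.

```python
from typing import Callable, Dict, Tuple

Capability = Dict[str, bool]  # e.g. {"thinking": True, "effort": False}
CacheKey = Tuple[str, str]    # (provider, model)


class ProviderError(Exception):
    """Hypothetical error type carrying an HTTP status and message."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(message)
        self.status = status
        self.message = message


def build_params(cap: Capability, strength: str) -> dict:
    """Build request params from what the provider is believed to support."""
    params: dict = {}
    if cap.get("thinking"):
        params["thinking"] = {"type": "enabled"}
    if cap.get("effort"):
        params["reasoning_effort"] = strength
    return params


def send_with_adaptation(
    key: CacheKey,
    strength: str,
    cache: Dict[CacheKey, Capability],
    probe: Callable[[CacheKey], Capability],
    send: Callable[[dict], str],
) -> str:
    # Proactive path: use the cached capability, probing once on a miss.
    if key not in cache:
        cache[key] = probe(key)
    try:
        return send(build_params(cache[key], strength))
    except ProviderError as err:
        # Reactive fallback: only handle the known parameter conflict.
        if err.status != 400 or "cannot specify both" not in err.message:
            raise
        # Assumption: keep the thinking toggle and drop the effort level.
        corrected = {**cache[key], "effort": False}
        cache[key] = corrected  # later requests start from corrected knowledge
        return send(build_params(corrected, strength))
```

The retry happens at most once per call; a second failure propagates to the caller instead of looping.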

3. Transparent UX

Startup (proactive):

✓ Reasoning Mode: HIGH
  Provider: OpenCode Go (kimi-k2)
  Adapted: Thinking mode enabled (effort level not supported by this provider)
  Expected: Enhanced step-by-step reasoning

On capability mismatch (educational):

ℹ️ Provider Adaptation
  Your request: High-intensity reasoning
  Provider capability: Extended thinking mode (no granular effort control)
  Result: Using maximum available reasoning depth
  Tip: Switch to a provider with effort-level support for fine-grained control

New commands:

/reasoning status — show current adaptation
/reasoning capabilities — list provider support matrix
/reasoning adapt <mode> — force specific adaptation strategy
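
One way the command surface above might be parsed is sketched below. The `ReasoningCommand` type, the `parse_reasoning_command` function, and the rule that a bare `/reasoning` shows the status are assumptions for illustration; the RFC does not specify the accepted values for `<mode>`, so the sketch passes it through unvalidated.

```python
from dataclasses import dataclass
from typing import Optional

STRENGTHS = ("minimal", "low", "medium", "high", "xhigh")


@dataclass(frozen=True)
class ReasoningCommand:
    action: str                     # "set", "status", "capabilities", or "adapt"
    strength: Optional[str] = None  # only used when action == "set"
    mode: Optional[str] = None      # only used when action == "adapt"


def parse_reasoning_command(text: str) -> ReasoningCommand:
    """Parse a /reasoning command line into a structured command."""
    parts = text.strip().split()
    if not parts or parts[0] != "/reasoning":
        raise ValueError(f"not a /reasoning command: {text!r}")
    args = parts[1:]
    if not args:
        # Assumption: the bare command behaves like "/reasoning status".
        return ReasoningCommand("status")
    head = args[0].lower()
    if head in STRENGTHS and len(args) == 1:
        return ReasoningCommand("set", strength=head)
    if head in ("status", "capabilities") and len(args) == 1:
        return ReasoningCommand(head)
    if head == "adapt" and len(args) == 2:
        return ReasoningCommand("adapt", mode=args[1].lower())
    raise ValueError(f"unrecognized /reasoning arguments: {' '.join(args)}")
```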

Implementation Plan

Phase 1: Extract Capability Layer

Create ReasoningCapability dataclass
Add capability metadata to provider profiles
Implement capability discovery (probe request)
Cache results in session state
Files: agent/reasoning_capability.py, provider profile updates
Tests: Unit tests for capability detection
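
Reading capability metadata from a provider profile might look like the sketch below. The profile shape is an assumption: the `reasoning` key and its `thinking_toggle` and `effort_level` fields are hypothetical, and `preferred_param` is omitted for brevity. Returning `None` for profiles that declare nothing keeps older profiles working, since the caller can then fall back to a probe request or to the existing reactive behavior.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class ReasoningCapability:
    supports_thinking_toggle: bool
    supports_effort_level: bool


def capability_from_profile(profile: Dict[str, Any]) -> Optional[ReasoningCapability]:
    """Return the declared capability, or None when nothing is declared."""
    declared = profile.get("reasoning")  # hypothetical metadata key
    if not isinstance(declared, dict):
        return None  # unknown: the caller decides whether to probe
    return ReasoningCapability(
        supports_thinking_toggle=bool(declared.get("thinking_toggle", False)),
        supports_effort_level=bool(declared.get("effort_level", False)),
    )
```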

Phase 2: Implement Adapter Pattern

Create ReasoningAdapter with strategy pattern
Implement adapters for each provider family
Move logic from conversation_loop.py to adapter
Files: agent/reasoning_adapter.py, agent/adapters/*.py
Tests: Matrix of intent × capability combinations
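
The strategy pattern in Phase 2 could be realized with a small registry keyed by provider family, so that adding a provider means registering one function rather than editing core logic. This is a sketch only: the family names `thinking-only` and `effort-only`, the `register` decorator, and `adapt_for_family` are hypothetical.

```python
from typing import Callable, Dict

Strategy = Callable[[str], dict]  # maps a strength level to request params

_STRATEGIES: Dict[str, Strategy] = {}


def register(family: str) -> Callable[[Strategy], Strategy]:
    """Decorator that registers a strategy for one provider family."""
    def decorator(fn: Strategy) -> Strategy:
        _STRATEGIES[family] = fn
        return fn
    return decorator


@register("thinking-only")
def _thinking_only(strength: str) -> dict:
    # The strength level is ignored because only an on/off toggle exists.
    return {"thinking": {"type": "enabled"}}


@register("effort-only")
def _effort_only(strength: str) -> dict:
    return {"reasoning_effort": strength}


def adapt_for_family(family: str, strength: str) -> dict:
    """Look up the family's strategy; unknown families get no reasoning params."""
    strategy = _STRATEGIES.get(family)
    if strategy is None:
        return {}
    return strategy(strength)
```

A test matrix can then iterate over every registered family and every strength level and assert on the returned dict.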

Phase 3: UX Integration

Add capability check to session startup
Implement /reasoning status command
Add transparent adaptation messages
Update error messages to be user-friendly
Files: cli.py, agent/conversation_loop.py
Tests: Integration tests for UX flows

Phase 4: Learning System (Optional Enhancement)

Store successful configurations in user profile
Suggest optimal settings per provider
Auto-adapt when switching providers
Files: agent/reasoning_learning.py
Tests: Persistence and recommendation tests
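
Storing successful configurations could be as simple as a JSON file keyed by provider and model, as sketched here. The file layout, the `provider/model` key format, and the function names are assumptions; the RFC does not say where the user profile lives or how it is structured.

```python
import json
from pathlib import Path
from typing import Dict, Optional


def _load(path: Path) -> Dict[str, dict]:
    """Read the store, treating a missing file as an empty store."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def remember_success(path: Path, provider: str, model: str, params: dict) -> None:
    """Record params that produced a successful response for this pair."""
    store = _load(path)
    store[f"{provider}/{model}"] = params
    path.write_text(json.dumps(store, indent=2, sort_keys=True), encoding="utf-8")


def recall(path: Path, provider: str, model: str) -> Optional[dict]:
    """Return the last known-good params, or None if none were recorded."""
    return _load(path).get(f"{provider}/{model}")
```

This sketch does no file locking and rewrites the whole file on each update, so concurrent sessions would need additional care.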

Benefits

✅ User-Centric: Users think in "I want smart responses", not API parameters
✅ Proactive: Most cases resolved before errors occur
✅ Testable: Adapter logic easily tested in isolation
✅ Extensible: New providers = new adapter strategies, no core changes
✅ Educational: Users understand what they got vs what they asked for
✅ SOLID Compliant: Single responsibility, open-closed, dependency inversion

Comparison with Current PR #34794

Current PR (Reactive Only):

Detects errors after they happen
Adapts within error recovery loop
Shows warnings about degradation
Good for edge cases, but not optimal UX

This Proposal (Proactive + Reactive):

Predicts capabilities before requests
Adapts transparently from the start
Educates users about provider differences
Eliminates most errors before they occur

Recommendation: Merge PR #34794 as the immediate bugfix, then implement this architecture as the comprehensive solution.

Open Questions for Discussion

Scope: Is this too large for a single PR? Should it be split into phases?
Backward Compatibility: How to preserve compatibility with existing provider profiles that don't declare capabilities?
Capability Discovery Strategy:
  - Option A: Probe request at session start (adds latency, but accurate)
  - Option B: Lazy discovery on first use (no startup overhead)
  - Option C: Static metadata in provider profiles (fast, but requires maintenance)
Cache Strategy:
  - Session-level (fresh each session, but repeated discovery)
  - User profile (persistent across sessions, but stale if provider changes)
  - Global cache (shared across users, but requires invalidation strategy)
UX Granularity:
  - Minimal: Only show when adaptation happens
  - Moderate: Show adaptation at session start
  - Verbose: Detailed capability matrix and adaptation reasoning
Migration Path: How to gradually adopt this architecture without breaking existing users?
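
The staleness concern in the cache strategy question could be bounded with a time-to-live, whichever scope is chosen. The sketch below is one possible approach and not something the RFC proposes: `TTLCache` is a hypothetical helper, and the injectable `clock` exists only to make expiry testable.

```python
import time
from typing import Callable, Dict, Generic, Hashable, Optional, Tuple, TypeVar

V = TypeVar("V")


class TTLCache(Generic[V]):
    """In-memory cache whose entries expire after a fixed number of seconds."""

    def __init__(
        self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, V]] = {}

    def put(self, key: Hashable, value: V) -> None:
        self._entries[key] = (self._clock(), value)

    def get(self, key: Hashable) -> Optional[V]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[key]  # expired: the caller should rediscover
            return None
        return value
```

`time.monotonic` is only meaningful within one process, so this form suits the session-level option; a cache persisted in the user profile would need wall-clock timestamps instead.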

References

Issue #34786: Original feature request for reasoning fallback
PR #34794: Current reactive implementation (proposed for merge)
Issue #32040: OpenCode Go + kimi-k2 dual parameter bug
Issue #32327: Related reasoning parameter conflicts

Next Steps:

Gather feedback from maintainers and community
Prioritize phases based on complexity vs value
Assign ownership for implementation
Create tracking issues for each phase

Ready to discuss details and begin implementation if this approach aligns with the project vision.
