hermes - ✅(Solved) Fix tracking: provider modules refactor — Cycle 2 of transport/provider infrastructure [2 pull requests, 1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
NousResearch/hermes-agent#14418Fetched 2026-04-24 06:17:26
View on GitHub
Comments
0
Participants
1
Timeline
32
Reactions
0
Participants
Timeline (top)
referenced ×22labeled ×5cross-referenced ×4unlabeled ×1

Cycle 2 of the provider infrastructure refactor. Cycle 1 (#13473, PRs 1-7 merged) extracted format conversion into agent/transports/. This cycle consolidates per-provider quirks from 5+ files into single-file provider modules.

Problem: Adding or modifying a provider touches auth.py + runtime_provider.py + models.py + auxiliary_client.py + run_agent.py + transport. Kimi and Copilot each touch 7 files despite having ~130-200 lines of quirk code. ChatCompletionsTransport takes 20+ boolean flag params because every provider's quirks are passed individually.

Solution: providers/<name>.py modules — each declares auth, endpoints, headers, temperature, max_tokens, message preprocessing, and extra_body in one place. Transport receives a provider object instead of flags.

Root Cause

Problem: Adding or modifying a provider touches auth.py + runtime_provider.py + models.py + auxiliary_client.py + run_agent.py + transport. Kimi and Copilot each touch 7 files despite having ~130-200 lines of quirk code. ChatCompletionsTransport takes 20+ boolean flag params because every provider's quirks are passed individually.

Fix Action

Fix / Workaround

  • Collapse adapter v1 normalize functions to return NormalizedResponse directly (eliminates 2-layer transport → v1 → NR mapping chain) — PR #14459
  • Migrate auxiliary_client.py Anthropic path (L575-612) to use transport instead of calling build_anthropic_kwargs / normalize_anthropic_response directly — PR #14459
  • Remove bedrock dispatch-site normalize_converse_response() — flush_memories guard changed to api_mode check — PR #14459 (dispatch-site itself stays: validate_response reads response.choices)
  • Remove _nr_to_assistant_message() shim — ToolCall + NR have backward-compat properties — PR #14459

What Stays on AIAgent (NOT moving to providers)

  • Client lifecycle (construction, interrupt, rebuild)
  • Streaming orchestration
  • Credential rotation / fallback
  • Prompt caching
  • Message history / _build_assistant_message
  • Tool dispatch

PR fix notes

PR #14424: feat: add provider modules — ProviderProfile ABC + 7 providers

Description (problem / solution / changelog)

Summary

Cycle 2 PR 1 (#14418). Introduces providers/ package with ProviderProfile ABC and wires first two providers live.

What this PR delivers

  1. ProviderProfile ABC (providers/base.py) — dataclass declaring auth, endpoints, quirks, hooks
  2. Provider registry (providers/__init__.py) — auto-discovery via pkgutil.iter_modules, get_provider_profile() with alias resolution
  3. 8 provider modules: NVIDIA, Kimi + Kimi-CN, OpenRouter, Nous, DeepSeek, Qwen
  4. Transport single-path (ChatCompletionsTransport._build_kwargs_from_profile()) — replaces 28 flag params with one profile object
  5. run_agent.py wiring — NVIDIA and DeepSeek activated via _PROFILE_ACTIVE_PROVIDERS allowlist
  6. 78 tests: 30 profile declarations + 19 transport parity + 22 profile wiring + 2 override parity + 5 E2E wiring

Provider activation strategy

Providers are activated incrementally via an explicit allowlist in _build_api_kwargs:

_PROFILE_ACTIVE_PROVIDERS = frozenset({
    "nvidia", "nvidia-nim",
    "deepseek", "deepseek-chat",
})

These two have zero special params — just default_max_tokens (NVIDIA) and baseline (DeepSeek). Remaining providers (OpenRouter, Nous, Kimi, Qwen) need additional agent-level params (anthropic_max_output, supports_reasoning, qwen_session_metadata) before activation. See #14515.

Test plan

  • 78 provider tests pass (profiles + parity + wiring + E2E)
  • Full agent/run_agent suite: 2683 passed, 3 pre-existing failures

Changed files

  • agent/transports/__init__.py (modified, +15/-5)
  • agent/transports/anthropic.py (modified, +10/-10)
  • agent/transports/base.py (modified, +7/-7)
  • agent/transports/bedrock.py (modified, +11/-7)
  • agent/transports/chat_completions.py (modified, +146/-29)
  • agent/transports/codex.py (modified, +26/-21)
  • agent/transports/types.py (modified, +16/-15)
  • providers/__init__.py (added, +66/-0)
  • providers/base.py (added, +77/-0)
  • providers/deepseek.py (added, +13/-0)
  • providers/kimi.py (added, +69/-0)
  • providers/nous.py (added, +46/-0)
  • providers/nvidia.py (added, +14/-0)
  • providers/openrouter.py (added, +45/-0)
  • providers/qwen.py (added, +80/-0)
  • pyproject.toml (modified, +44/-5)
  • run_agent.py (modified, +33/-1)
  • tests/providers/__init__.py (added, +0/-0)
  • tests/providers/test_e2e_wiring.py (added, +87/-0)
  • tests/providers/test_profile_wiring.py (added, +293/-0)
  • tests/providers/test_provider_profiles.py (added, +203/-0)
  • tests/providers/test_transport_parity.py (added, +250/-0)

PR #14459: refactor: complete WS1 transport cleanup (all 4 items)

Description (problem / solution / changelog)

Summary

All 4 WS1 items of Cycle 2 (#14418) — complete transport cleanup.

Commit 1: Migrate auxiliary_client to transport

Replaces the last direct normalize_anthropic_response() call outside the transport.

Commit 2: Collapse v1 normalize to return NormalizedResponse directly

Adapter returns NR, transport becomes 1-line passthrough. -22 lines from transport.

Commit 3: Remove _nr_to_assistant_message() shim + fix flush_memories guard

Add backward-compat properties to ToolCall/NormalizedResponse so agent loop reads them directly. -35 line shim deleted, 4 call sites simplified. Flush guard changed from shape check to api_mode check.

Net result across all 3 commits

  • 3-layer normalize chain → 1 layer: transport → v2 → v1transport → adapter (returns NR directly)
  • Shim eliminated: _nr_to_assistant_message() deleted, NR passes through to agent loop
  • auxiliary_client uses transport: no more direct adapter calls
  • ToolCall duck-types as old shape: .function.name, .function.arguments, .type, .call_id, .response_item_id all work
  • NormalizedResponse duck-types as old shape: .reasoning_content, .reasoning_details, .codex_reasoning_items from provider_data
  • flush_memories guard: hasattr(response, "choices")self.api_mode in (...)

What stays for later

  • Bedrock dispatch-site normalize_converse_response() stays — validate_response reads response.choices between dispatch and normalize
  • auxiliary_client still calls build_anthropic_kwargs directly (transport only owns normalize, not build)

Files changed

FileChange
agent/anthropic_adapter.pyv1 returns NR directly, uses ToolCall
agent/transports/anthropic.py1-line passthrough (-22 lines)
agent/transports/types.pyToolCall + NR backward-compat properties
agent/auxiliary_client.pyUses get_transport() for normalize
run_agent.pyShim deleted, 4 sites simplified, flush guard fixed
tests/agent/test_anthropic_adapter.pyUpdated for NR fields
tests/run_agent/test_anthropic_truncation_continuation.pyUpdated for NR fields

Test plan

  • 2647 passed, 3 pre-existing failures (same as main)

Changed files

  • agent/anthropic_adapter.py (modified, +0/-67)
  • agent/auxiliary_client.py (modified, +15/-2)
  • agent/transports/anthropic.py (modified, +38/-17)
  • agent/transports/types.py (modified, +42/-0)
  • run_agent.py (modified, +32/-63)
  • tests/agent/test_anthropic_adapter.py (modified, +31/-31)
  • tests/agent/transports/test_types.py (modified, +92/-0)
  • tests/run_agent/test_anthropic_truncation_continuation.py (modified, +15/-15)

Code Example

# providers/base.py
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

@dataclass
class ProviderProfile:
    """Declares everything about a single provider in one place."""
    name: str
    api_mode: str = "chat_completions"
    aliases: tuple = ()
    
    # Auth
    env_vars: tuple = ()               # ("KIMI_API_KEY", "MOONSHOT_API_KEY")
    base_url: str = ""
    auth_type: str = "api_key"          # api_key, oauth, copilot, aws
    
    # Client quirks
    default_headers: dict = field(default_factory=dict)
    
    # Request quirks  
    fixed_temperature: float | None = None      # None = use default
    omit_temperature: bool = False               # Kimi: server manages it
    default_max_tokens: int | None = None        # NVIDIA: 16384, Qwen: 65536
    
    # Hooks (override in subclass)
    def prepare_messages(self, messages: list) -> list:
        """Provider-specific message preprocessing (Qwen normalization, etc.)."""
        return messages
    
    def extra_body(self, **context) -> dict:
        """Provider-specific extra_body fields (tags, reasoning, metadata)."""
        return {}
    
    def extra_headers(self, **context) -> dict:
        """Provider-specific HTTP headers (xAI conv-id, Kimi User-Agent)."""
        return {}
RAW_BUFFERClick to expand / collapse

Overview

Cycle 2 of the provider infrastructure refactor. Cycle 1 (#13473, PRs 1-7 merged) extracted format conversion into agent/transports/. This cycle consolidates per-provider quirks from 5+ files into single-file provider modules.

Problem: Adding or modifying a provider touches auth.py + runtime_provider.py + models.py + auxiliary_client.py + run_agent.py + transport. Kimi and Copilot each touch 7 files despite having ~130-200 lines of quirk code. ChatCompletionsTransport takes 20+ boolean flag params because every provider's quirks are passed individually.

Solution: providers/<name>.py modules — each declares auth, endpoints, headers, temperature, max_tokens, message preprocessing, and extra_body in one place. Transport receives a provider object instead of flags.

Workstreams

WS1: Transport Cleanup (from Cycle 1 gaps)

Prerequisite work before provider modules can absorb the adapters.

  • Collapse adapter v1 normalize functions to return NormalizedResponse directly (eliminates 2-layer transport → v1 → NR mapping chain) — PR #14459
  • Migrate auxiliary_client.py Anthropic path (L575-612) to use transport instead of calling build_anthropic_kwargs / normalize_anthropic_response directly — PR #14459
  • Remove bedrock dispatch-site normalize_converse_response() — flush_memories guard changed to api_mode check — PR #14459 (dispatch-site itself stays: validate_response reads response.choices)
  • Remove _nr_to_assistant_message() shim — ToolCall + NR have backward-compat properties — PR #14459

WS2: Provider Module ABC + Registry

Design the provider interface and migrate providers.

  • Design ProviderProfile ABC in providers/base.pyPR #14424
  • Create provider registry (auto-discovery from providers/*.py) — PR #14424
  • Wire ChatCompletionsTransport _build_kwargs_from_profile() single-path — PR #14424
  • Wire run_agent.py for NVIDIA + DeepSeek (incremental allowlist) — PR #14424 (remaining providers: #14515)
  • Remove legacy flag params from transport after run_agent.py is wired
  • Migrate auth.py PROVIDER_REGISTRY to read from provider modules
  • Migrate runtime_provider.py api_mode resolution to provider registry

WS3: Provider Migrations (ordered by scatter/complexity)

Migrate actual providers, most-scattered first.

  • NVIDIA (14 lines) — PR #14424
  • Kimi/Moonshot + Kimi-CN (separate profiles, correct endpoints/env_vars) — PR #14424
  • OpenRouter (provider_preferences, full reasoning passthrough) — PR #14424
  • Nous Portal (tags, reasoning with disabled omission) — PR #14424
  • Qwen Portal (message normalization + cache_control, vl_high_resolution, metadata top-level) — PR #14424
  • DeepSeek (minimal) — PR #14424
  • Copilot/GitHub Models (7 files, most scattered)
  • OpenAI Codex (6 files, 940 lines)
  • Custom/Ollama (4 files)
  • Remaining simple providers (ZAI, MiniMax, Alibaba, HF — mostly auth+models only)

Provider Quirk Scatter (Current State)

ProviderFiles touchedApprox linesKey quirks
Anthropic6~1510OAuth identity spoof, model-gated features, thinking modes, prompt caching
Bedrock5~1000boto3 client, dual api_mode (Converse vs Anthropic), region/guardrail
OpenAI Codex6~940Responses API, Cloudflare headers, encrypted reasoning
Nous Portal6~875OAuth device flow, agent keys, rate guard, tags
OpenRouter6~340HTTP headers, provider preferences, reasoning extra_body
Copilot/GitHub7~205Editor headers, dynamic api_mode per model, reasoning
Kimi/Moonshot7~130Dual endpoint, User-Agent spoof, temp OMIT, thinking/effort
Custom/Ollama4~110Auto-detect model, num_ctx, think=false
Qwen Portal4~95OAuth, message preprocessing, vl_high_resolution
ZAI/GLM3~854-endpoint probing, billing plan detection
MiniMax4~75Anthropic-compat, Bearer auth, beta header strip
xAI/Grok4~45Responses API, encrypted reasoning, conv headers
NVIDIA3~30max_tokens 16384
Alibaba2~25DashScope URL
HuggingFace2~25Case-sensitive model IDs
DeepSeek3~20reasoning_content field

ProviderProfile Design (Draft)

# providers/base.py
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

@dataclass
class ProviderProfile:
    """Declares everything about a single provider in one place."""
    name: str
    api_mode: str = "chat_completions"
    aliases: tuple = ()
    
    # Auth
    env_vars: tuple = ()               # ("KIMI_API_KEY", "MOONSHOT_API_KEY")
    base_url: str = ""
    auth_type: str = "api_key"          # api_key, oauth, copilot, aws
    
    # Client quirks
    default_headers: dict = field(default_factory=dict)
    
    # Request quirks  
    fixed_temperature: float | None = None      # None = use default
    omit_temperature: bool = False               # Kimi: server manages it
    default_max_tokens: int | None = None        # NVIDIA: 16384, Qwen: 65536
    
    # Hooks (override in subclass)
    def prepare_messages(self, messages: list) -> list:
        """Provider-specific message preprocessing (Qwen normalization, etc.)."""
        return messages
    
    def extra_body(self, **context) -> dict:
        """Provider-specific extra_body fields (tags, reasoning, metadata)."""
        return {}
    
    def extra_headers(self, **context) -> dict:
        """Provider-specific HTTP headers (xAI conv-id, Kimi User-Agent)."""
        return {}

What Stays on AIAgent (NOT moving to providers)

  • Client lifecycle (construction, interrupt, rebuild)
  • Streaming orchestration
  • Credential rotation / fallback
  • Prompt caching
  • Message history / _build_assistant_message
  • Tool dispatch

Relationship to Cycle 1

  • Tracking issue: #13473
  • Cycle 1 PRs: #12975, #13347, #13366, #13430, #13447, #13467, #13862 (all merged)
  • Cycle 1 delivered: agent/transports/ with 4 transports, shared types, unified _get_transport()
  • Cycle 2 builds on top: provider modules feed INTO transports

extent analysis

TL;DR

Migrate the remaining providers to the new provider module system to reduce code scatter and simplify transport configuration.

Guidance

  1. Complete the migration of Copilot/GitHub Models and OpenAI Codex: These providers have the most scattered code and will benefit the most from the new provider module system.
  2. Remove legacy flag params from transport: Once all providers are migrated, remove the legacy flag parameters from the transport to simplify its configuration.
  3. Verify the provider registry: Ensure that the provider registry is correctly auto-discovering provider modules and that the ProviderProfile ABC is being used correctly.
  4. Test the new provider modules: Thoroughly test each provider module to ensure that they are working as expected and that the transport is correctly configured.
  5. Monitor for issues: Keep an eye on the system for any issues that may arise after the migration and be prepared to make adjustments as needed.

Example

# providers/copilot.py
from providers.base import ProviderProfile

class CopilotProfile(ProviderProfile):
    name = "Copilot"
    api_mode = "chat_completions"
    env_vars = ("COPILOT_API_KEY",)
    base_url = "https://api.github.com"
    auth_type = "copilot"

    def prepare_messages(self, messages: list) -> list:
        # Copilot-specific message preprocessing
        return messages

    def extra_body(self, **context) -> dict:
        # Copilot-specific extra_body fields
        return {}

Notes

The migration to the new provider module system is a significant change and may require additional testing and validation to ensure that it is working correctly.

Recommendation

Apply the workaround of migrating the remaining providers to the new provider module system to reduce code scatter and simplify transport configuration. This will allow for a more scalable and maintainable system.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

hermes - ✅(Solved) Fix tracking: provider modules refactor — Cycle 2 of transport/provider infrastructure [2 pull requests, 1 participants]