hermes - ✅(Solved) Fix tracking: provider transport refactor (agent/transports/) [3 pull requests, 1 participants]

kshitijk4poor · 2026-04-21T10:52:06Z

[hermes] This is a two-cycle refactor of hermes-agent's provider infrastructure. Cycle 1 this issue : Transport layer — Extract format conversion and response… This is a two-cycle refactor of hermes-agent's provider infrastructure. **Cycle 1 (this issue): Transport layer** — Extract format conversion and response normalization from `run_agent.py` into `agent/transports/`. Each transport owns `convert_messages`, `convert_tools`, `build_kwargs`, `normalize_response`. Client lifecycle, streaming, credentials, and prompt caching stay on `AIAgent`. **Cycle 2 (future): Provider modules** — Consolidate per-provider quirks (currently scattered across 5+ files) into single-file provider definitions under `providers/`. Each provider module declares its auth, endpoints, client headers, temperature behavior, max_tokens defaults, message preprocessing, and extra_body construction in one place. Transports become generic — they read from the provider object instead of checking boolean flags. See [Cycle 2 Design](#cycle-2-provider-modules-next) below. **Principle:** Every PR wires its code to real production paths in the same PR. No dormant abstractions. --- # PR #13805: feat: add ChatCompletionsTransport + wire all default paths - Repository: NousResearch/hermes-agent - Author: teknium1 - State: closed | merged: True - Link: https://github.com/NousResearch/hermes-agent/pull/13805 ## Description (problem / solution / changelog) Salvages #13447 with regression fixes and Kimi port. Third transport — handles the default `chat_completions` api_mode used by ~16 OpenAI-compatible providers. Closes the main PR 5 of the transport refactor series (issue #13473). ## Changes vs #13447 - Preserve `tool_call.extra_content` (Gemini thought_signature) via `ToolCall.provider_data` — the original shim stripped it, causing 400 errors on multi-turn Gemini 3 thinking. - Preserve `reasoning_content` distinctly from `reasoning` (DeepSeek/Moonshot) so the thinking-prefill retry check still triggers. - Port Kimi/Moonshot quirks that landed on main after the original PR (32000 max_tokens default, top-level `reasoning_effort`, `extra_body.thinking`). - Skip the SimpleNamespace shim in the main normalize loop — for chat_completions, `response.choices[0].message` is already the right shape. ## Impact run_agent.py: **-239 lines** in `_build_api_kwargs` default branch. ## Transport coverage | api_mode | Transport | build_kwargs | normalize | validate | |---|---|:---:|:---:|:---:| | anthropic_messages | AnthropicTransport | ✅ | ✅ | ✅ | | codex_responses | ResponsesApiTransport | ✅ | ✅ | ✅ | | **chat_completions** | **ChatCompletionsTransport** | **✅** | **✅** | **✅** | | bedrock_converse | — (PR #13467) | | | | ## Validation | | Result | |---|---| | New transport tests | 39 pass (8 build_kwargs, 5 Kimi, 4 validate, 4 normalize, 3 cache, 3 basic) | | `tests/run_agent/` | 885/885 pass (+ 15 skipped; the single `test_concurrent_interrupt` failure is a pre-existing flake on `origin/main`) | | E2E — Gemini extra_content | Live check with real `openai.types.chat.ChatCompletionMessageToolCall`: `provider_data["extra_content"]` preserved ✅ | | E2E — Kimi build_kwargs | max_tokens=32000, reasoning_effort=high, extra_body.thinking={"type":"enabled"} ✅ | | E2E — Kimi thinking-off | reasoning_effort omitted, thinking={"type":"disabled"} ✅ | | E2E — reasoning_content | preserved separately in provider_data ✅ | Closes #13447 (merging this credits @kshitijk4poor's original work). ## Changed files - `agent/transports/__init__.py` (modified, +4/-0) - `agent/transports/chat_completions.py` (added, +387/-0) - `run_agent.py` (modified, +91/-239) - `tests/agent/transports/test_chat_completions.py` (added, +349/-0) --- # PR #13814: feat: add BedrockTransport + wire all Bedrock transport paths - Repository: NousResearch/hermes-agent - Author: teknium1 - State: closed | merged: True - Link: https://github.com/NousResearch/hermes-agent/pull/13814 ## Description (problem / solution / changelog) Salvages #13467. Fourth and final transport — completes the transport layer with all four api_modes covered (issue #13473, Cycle 1). ## Changes vs #13467 One adjustment from the original: the main normalize loop does NOT add a `bedrock_converse` branch to invoke `normalize_response` on the response. Bedrock's `normalize_converse_response` runs at the dispatch site (`run_agent.py:5189`), so the response already has the OpenAI-compatible `.choices[0].message` shape by the time the main loop sees it. Falling through to the chat_completions else branch is correct and sidesteps a redundant NormalizedResponse rebuild. Everything else is preserved: `build_kwargs`, `validate_response`, `finish_reason` branch, `normalize_response` (still usable by direct callers), `map_finish_reason`. ## Transport coverage — COMPLETE | api_mode | Transport | build_kwargs | normalize | validate | |---|---|:---:|:---:|:---:| | anthropic_messages | AnthropicTransport | ✅ | ✅ | ✅ | | codex_responses | ResponsesApiTran

hermes2026-04-21 10:52:06

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

NousResearch/hermes-agent#13473•Fetched 2026-04-22 08:06:15

View on GitHub

Comments

Participants

Timeline

Reactions

Author

kshitijk4poor

Participants

kshitijk4poor

Timeline (top)

cross-referenced ×4labeled ×1subscribed ×1

This is a two-cycle refactor of hermes-agent's provider infrastructure.

Cycle 1 (this issue): Transport layer — Extract format conversion and response normalization from run_agent.py into agent/transports/. Each transport owns convert_messages, convert_tools, build_kwargs, normalize_response. Client lifecycle, streaming, credentials, and prompt caching stay on AIAgent.

Cycle 2 (future): Provider modules — Consolidate per-provider quirks (currently scattered across 5+ files) into single-file provider definitions under providers/. Each provider module declares its auth, endpoints, client headers, temperature behavior, max_tokens defaults, message preprocessing, and extra_body construction in one place. Transports become generic — they read from the provider object instead of checking boolean flags. See Cycle 2 Design below.

Principle: Every PR wires its code to real production paths in the same PR. No dormant abstractions.

Root Cause

Problem Cycle 1 leaves behind: Provider quirks are still scattered across auth.py, runtime_provider.py, models.py, auxiliary_client.py, run_agent.py, and the transports themselves. Adding a new provider requires touching 5+ files. The ChatCompletionsTransport takes 20+ boolean params because each provider's quirks are passed as flags.

Fix Action

Fix / Workaround

PR	Status	What it does	Lines
PR 1 #12975	✅ Merged	Extract 10 Codex Responses API functions into `agent/codex_responses_adapter.py`	-565 from run_agent.py
PR 2 #13347	✅ Merged	Add `agent/transports/types.py` (NormalizedResponse, ToolCall, Usage) + migrate Anthropic normalize path	+554
PR 3 #13366	✅ Merged	Add ProviderTransport ABC + AnthropicTransport, wire all Anthropic paths (9 sites)	+539/-45
PR 4 #13430	🔄 Open	Add ResponsesApiTransport, wire all Codex paths, remove 7 dead wrappers	+590/-169
PR 5 #13447	🔄 Open	Add ChatCompletionsTransport, wire all default paths (210-line kwargs block extracted)	+640/-227
PR 6 #13467	🔄 Open	Add BedrockTransport, wire all Bedrock paths	+383/-13
PR 7	📋 Planned	Unify dispatch — remove dead api_mode branches, collapse normalize shims	Dep: 4,5,6
PR 8	📋 Planned	Simplify `runtime_provider.py` — transport registry replaces manual api_mode routing	Dep: 7
PR 9	📋 Planned	Documentation — architecture guide, transport authoring guide	Dep: 8

reasoning_content vs reasoning — two distinct fields downstream, transport merges them into reasoning. The thinking-prefill check reads reasoning_content separately.
Prompt caching runs between convert and build_kwargs — apply_anthropic_cache_control mutates messages after conversion. Transport can't produce final API-ready messages alone.
ChatCompletionsTransport has 13 provider conditionals — flags passed as explicit params. Works but the param list is long. This is the primary motivation for Cycle 2.
flush_memories and iteration_limit_summary have their own normalize dispatch — wired through transports now but still have separate code paths.
Bedrock normalizes at dispatch, not at the main loop — the transport handles both shapes (raw boto3 dict + already-normalized SimpleNamespace).
_ephemeral_max_output_tokens is consumed by both Anthropic and chat_completions branches — shared agent state that both transports need.

PR fix notes

PR #13805: feat: add ChatCompletionsTransport + wire all default paths

Repository: NousResearch/hermes-agent
Author: teknium1
State: closed | merged: True
Link: https://github.com/NousResearch/hermes-agent/pull/13805

Description (problem / solution / changelog)

Salvages #13447 with regression fixes and Kimi port.

Third transport — handles the default chat_completions api_mode used by ~16 OpenAI-compatible providers. Closes the main PR 5 of the transport refactor series (issue #13473).

Changes vs #13447

Preserve tool_call.extra_content (Gemini thought_signature) via ToolCall.provider_data — the original shim stripped it, causing 400 errors on multi-turn Gemini 3 thinking.
Preserve reasoning_content distinctly from reasoning (DeepSeek/Moonshot) so the thinking-prefill retry check still triggers.
Port Kimi/Moonshot quirks that landed on main after the original PR (32000 max_tokens default, top-level reasoning_effort, extra_body.thinking).
Skip the SimpleNamespace shim in the main normalize loop — for chat_completions, response.choices[0].message is already the right shape.

Impact

run_agent.py: -239 lines in _build_api_kwargs default branch.

Transport coverage

api_mode	Transport	build_kwargs	normalize	validate
anthropic_messages	AnthropicTransport	✅	✅	✅
codex_responses	ResponsesApiTransport	✅	✅	✅
chat_completions	ChatCompletionsTransport	✅	✅	✅
bedrock_converse	— (PR #13467)

Validation

	Result
New transport tests	39 pass (8 build_kwargs, 5 Kimi, 4 validate, 4 normalize, 3 cache, 3 basic)
`tests/run_agent/`	885/885 pass (+ 15 skipped; the single `test_concurrent_interrupt` failure is a pre-existing flake on `origin/main`)
E2E — Gemini extra_content	Live check with real `openai.types.chat.ChatCompletionMessageToolCall`: `provider_data["extra_content"]` preserved ✅
E2E — Kimi build_kwargs	max_tokens=32000, reasoning_effort=high, extra_body.thinking={"type":"enabled"} ✅
E2E — Kimi thinking-off	reasoning_effort omitted, thinking={"type":"disabled"} ✅
E2E — reasoning_content	preserved separately in provider_data ✅

Closes #13447 (merging this credits @kshitijk4poor's original work).

Changed files

agent/transports/__init__.py (modified, +4/-0)
agent/transports/chat_completions.py (added, +387/-0)
run_agent.py (modified, +91/-239)
tests/agent/transports/test_chat_completions.py (added, +349/-0)

PR #13814: feat: add BedrockTransport + wire all Bedrock transport paths

Repository: NousResearch/hermes-agent
Author: teknium1
State: closed | merged: True
Link: https://github.com/NousResearch/hermes-agent/pull/13814

Description (problem / solution / changelog)

Salvages #13467. Fourth and final transport — completes the transport layer with all four api_modes covered (issue #13473, Cycle 1).

Changes vs #13467

One adjustment from the original: the main normalize loop does NOT add a bedrock_converse branch to invoke normalize_response on the response. Bedrock's normalize_converse_response runs at the dispatch site (run_agent.py:5189), so the response already has the OpenAI-compatible .choices[0].message shape by the time the main loop sees it. Falling through to the chat_completions else branch is correct and sidesteps a redundant NormalizedResponse rebuild.

Everything else is preserved: build_kwargs, validate_response, finish_reason branch, normalize_response (still usable by direct callers), map_finish_reason.

Transport coverage — COMPLETE

api_mode	Transport	build_kwargs	normalize	validate
anthropic_messages	AnthropicTransport	✅	✅	✅
codex_responses	ResponsesApiTransport	✅	✅	✅
chat_completions	ChatCompletionsTransport	✅	✅	✅
bedrock_converse	BedrockTransport	✅	✅	✅

Validation

	Result
BedrockTransport tests	18 pass
All transport tests	117 pass
All bedrock/converse tests across tests/agent/	160 pass
`tests/run_agent/`	885/885 pass (+ 15 skipped; the single `test_concurrent_interrupt` failure is pre-existing on `origin/main`)
E2E — build_kwargs	model, region, max_tokens, guardrail ✅
E2E — validate_response	raw dict + normalized SimpleNamespace ✅
E2E — normalize_response	text + tool call ✅
E2E — `_build_api_kwargs` integration	✅

Closes #13467 (merging this credits @kshitijk4poor's original work).

Changed files

agent/transports/__init__.py (modified, +4/-0)
agent/transports/bedrock.py (added, +154/-0)
run_agent.py (modified, +30/-13)
tests/agent/transports/test_bedrock_transport.py (added, +164/-0)

PR #13862: refactor: unify transport dispatch + collapse normalize shims

Repository: NousResearch/hermes-agent
Author: kshitijk4poor
State: open | merged: False
Link: https://github.com/NousResearch/hermes-agent/pull/13862

Description (problem / solution / changelog)

Summary

PR 7 of the provider transport refactor (#13473). PRs 1-6 wired all 4 transports to production paths. This PR consolidates the wiring.

What this PR does

1. Consolidate 4 transport helpers → 1

Replace _get_anthropic_transport(), _get_codex_transport(), _get_chat_completions_transport(), _get_bedrock_transport() with one generic _get_transport(api_mode=None) that uses a shared dict cache. 22 call sites updated — no behavioral change, just less boilerplate.

2. Collapse 65-line main normalize block → 7 lines

Before (3 branches, each with its own SimpleNamespace construction):

if self.api_mode == "codex_responses":
    _ct = self._get_transport()
    _cnr = _ct.normalize_response(response)
    # ... 35 lines of SimpleNamespace shim with codex-specific fields ...
elif self.api_mode == "anthropic_messages":
    _transport = self._get_transport()
    _nr = _transport.normalize_response(response, strip_tool_prefix=...)
    # ... 26 lines of SimpleNamespace shim ...
else:
    assistant_message = response.choices[0].message

After:

_transport = self._get_transport()
_normalize_kwargs = {}
if self.api_mode == "anthropic_messages":
    _normalize_kwargs["strip_tool_prefix"] = self._is_anthropic_oauth
_nr = _transport.normalize_response(response, **_normalize_kwargs)
assistant_message = self._nr_to_assistant_message(_nr)
finish_reason = _nr.finish_reason

The shared _nr_to_assistant_message() static method handles all 4 api_modes — extracts provider_data fields (codex_reasoning_items, reasoning_details, call_id, response_item_id) into the SimpleNamespace shape downstream expects.

3. Wire chat_completions + bedrock normalize through transports

These were previously falling through to the raw response.choices[0].message else branch. Now all 4 api_modes go through transport.normalize_response().

4. Remove 8 dead codex adapter imports

_chat_content_to_responses_parts, _codex_chat_messages_to_responses_input, _codex_normalize_codex_response, _codex_preflight_codex_api_kwargs, _codex_preflight_codex_input_items, _codex_responses_tools, _codex_extract_responses_message_text, _codex_extract_responses_reasoning_text — all have zero callers after PRs 1-6.

Impact

run_agent.py: -46 lines (11,988 → 11,941)
1 test file updated (import path change)

What stays as-is (per plan)

flush_memories and _iteration_limit_summary secondary dispatch — low-traffic auxiliary paths, follow-up issue
The _nr_to_assistant_message shim itself — removed when downstream code migrates to read NormalizedResponse directly

Test plan

2,605 run_agent + agent tests pass (4 pre-existing failures)
Zero regressions from our changes
6K+ errors in other dirs are pre-existing environment issue (same on main)

Changed files

run_agent.py (modified, +76/-123)
tests/run_agent/test_run_agent_multimodal_prologue.py (modified, +2/-1)

Code Example

@dataclass
class ToolCall:
    id: str | None          # Protocol's canonical ID (call_XXXX, toolu_XXXX, etc.)
    name: str
    arguments: str          # JSON string
    provider_data: dict | None = None   # Per-tool-call protocol metadata

@dataclass
class NormalizedResponse:
    content: str | None
    tool_calls: list[ToolCall] | None
    finish_reason: str                  # "stop", "tool_calls", "length", "content_filter"
    reasoning: str | None = None        # Cross-provider (Anthropic, Codex, DeepSeek, Gemini)
    usage: Usage | None = None
    provider_data: dict | None = None   # Response-level protocol state

---

PR1 ──→ PR4
PR2 ──→ PR3 ──→ PR4
              ──→ PR5
              ──→ PR6
                    PR4+5+6 ──→ PR7 ──→ PR8 ──→ PR9

---

# providers/kimi.py
class KimiProvider:
    name = "kimi-coding"
    aliases = ["kimi", "moonshot"]
    api_mode = "chat_completions"
    
    # Auth (currently in hermes_cli/auth.py)
    env_vars = ["KIMI_API_KEY", "MOONSHOT_API_KEY"]
    base_url = "https://api.kimi.com/v1"
    
    # Client quirks (currently in run_agent.py __init__)
    default_headers = {"User-Agent": "hermes-agent/1.0"}
    
    # Request quirks (currently in auxiliary_client.py)
    fixed_temperature = 0.6
    default_max_tokens = None

---

# providers/nvidia.py
class NvidiaProvider:
    name = "nvidia"
    api_mode = "chat_completions"
    env_vars = ["NVIDIA_API_KEY"]
    base_url = "https://integrate.api.nvidia.com/v1"
    default_max_tokens = 16384  # GLM-4.7 thinking exhaust fix

---

# providers/qwen.py
class QwenPortalProvider:
    name = "qwen-portal"
    api_mode = "chat_completions"
    env_vars = ["QWEN_API_KEY"]
    base_url = "https://portal.qwen.ai/api/v1"
    default_max_tokens = 65536
    
    def prepare_messages(self, messages):
        """Normalize content to list-of-dicts, inject cache_control."""
        ...
    
    def extra_body(self, session_id):
        return {
            "metadata": {"sessionId": session_id},
            "vl_high_resolution_images": True,
        }

RAW_BUFFERClick to expand / collapse

Overview

This is a two-cycle refactor of hermes-agent's provider infrastructure.

Principle: Every PR wires its code to real production paths in the same PR. No dormant abstractions.

Shared Types (`agent/transports/types.py`)

@dataclass
class ToolCall:
    id: str | None          # Protocol's canonical ID (call_XXXX, toolu_XXXX, etc.)
    name: str
    arguments: str          # JSON string
    provider_data: dict | None = None   # Per-tool-call protocol metadata

@dataclass
class NormalizedResponse:
    content: str | None
    tool_calls: list[ToolCall] | None
    finish_reason: str                  # "stop", "tool_calls", "length", "content_filter"
    reasoning: str | None = None        # Cross-provider (Anthropic, Codex, DeepSeek, Gemini)
    usage: Usage | None = None
    provider_data: dict | None = None   # Response-level protocol state

Cycle 1: PR Tracker

PR	Status	What it does	Lines
PR 1 #12975	✅ Merged	Extract 10 Codex Responses API functions into `agent/codex_responses_adapter.py`	-565 from run_agent.py
PR 2 #13347	✅ Merged	Add `agent/transports/types.py` (NormalizedResponse, ToolCall, Usage) + migrate Anthropic normalize path	+554
PR 3 #13366	✅ Merged	Add ProviderTransport ABC + AnthropicTransport, wire all Anthropic paths (9 sites)	+539/-45
PR 4 #13430	🔄 Open	Add ResponsesApiTransport, wire all Codex paths, remove 7 dead wrappers	+590/-169
PR 5 #13447	🔄 Open	Add ChatCompletionsTransport, wire all default paths (210-line kwargs block extracted)	+640/-227
PR 6 #13467	🔄 Open	Add BedrockTransport, wire all Bedrock paths	+383/-13
PR 7	📋 Planned	Unify dispatch — remove dead api_mode branches, collapse normalize shims	Dep: 4,5,6
PR 8	📋 Planned	Simplify `runtime_provider.py` — transport registry replaces manual api_mode routing	Dep: 7
PR 9	📋 Planned	Documentation — architecture guide, transport authoring guide	Dep: 8

Dependency Graph

PR1 ──→ PR4
PR2 ──→ PR3 ──→ PR4
              ──→ PR5
              ──→ PR6
                    PR4+5+6 ──→ PR7 ──→ PR8 ──→ PR9

What the Transport Owns vs What Stays on AIAgent

Transport owns	AIAgent keeps
`convert_messages()` — OpenAI msgs → provider format	Client construction (`build_anthropic_client`, etc.)
`convert_tools()` — OpenAI tools → provider format	Client rebuild/teardown on interrupt
`build_kwargs()` — assemble full API call kwargs	Credential refresh/rotation
`normalize_response()` → NormalizedResponse	Streaming (`_call_anthropic`, `_run_codex_stream`)
`validate_response()` — structural check	Prompt caching policy
`extract_cache_stats()` — provider-specific cache tokens	Retry/interrupt threading
`map_finish_reason()` — provider stop reason → OpenAI	Fallback provider routing

Transport Coverage

api_mode	Transport	build_kwargs	normalize	validate	cache_stats	finish_reason
`anthropic_messages`	`AnthropicTransport`	✅	✅	✅	✅	✅
`codex_responses`	`ResponsesApiTransport`	✅	✅	✅	—	✅
`chat_completions`	`ChatCompletionsTransport`	✅	✅	✅	✅	—
`bedrock_converse`	`BedrockTransport`	✅	✅	✅	—	✅

Abort Points

Each PR delivers standalone value. Safe stopping points:

After PR 3 — one transport proven end-to-end, types established
After PR 6 — all 4 transports wired, transport layer complete
After PR 8 — runtime simplified, full Cycle 1 done
After PR 9 — documented, ready for Cycle 2

Known Gaps (from codebase stress test)

reasoning_content vs reasoning — two distinct fields downstream, transport merges them into reasoning. The thinking-prefill check reads reasoning_content separately.
Prompt caching runs between convert and build_kwargs — apply_anthropic_cache_control mutates messages after conversion. Transport can't produce final API-ready messages alone.
ChatCompletionsTransport has 13 provider conditionals — flags passed as explicit params. Works but the param list is long. This is the primary motivation for Cycle 2.
flush_memories and iteration_limit_summary have their own normalize dispatch — wired through transports now but still have separate code paths.
Bedrock normalizes at dispatch, not at the main loop — the transport handles both shapes (raw boto3 dict + already-normalized SimpleNamespace).
_ephemeral_max_output_tokens is consumed by both Anthropic and chat_completions branches — shared agent state that both transports need.

Cycle 2: Provider Modules (Next)

Solution: Consolidate per-provider quirks into single-file provider modules under providers/. Each module declares everything about that provider in one place:

# providers/kimi.py
class KimiProvider:
    name = "kimi-coding"
    aliases = ["kimi", "moonshot"]
    api_mode = "chat_completions"
    
    # Auth (currently in hermes_cli/auth.py)
    env_vars = ["KIMI_API_KEY", "MOONSHOT_API_KEY"]
    base_url = "https://api.kimi.com/v1"
    
    # Client quirks (currently in run_agent.py __init__)
    default_headers = {"User-Agent": "hermes-agent/1.0"}
    
    # Request quirks (currently in auxiliary_client.py)
    fixed_temperature = 0.6
    default_max_tokens = None

# providers/nvidia.py
class NvidiaProvider:
    name = "nvidia"
    api_mode = "chat_completions"
    env_vars = ["NVIDIA_API_KEY"]
    base_url = "https://integrate.api.nvidia.com/v1"
    default_max_tokens = 16384  # GLM-4.7 thinking exhaust fix

# providers/qwen.py
class QwenPortalProvider:
    name = "qwen-portal"
    api_mode = "chat_completions"
    env_vars = ["QWEN_API_KEY"]
    base_url = "https://portal.qwen.ai/api/v1"
    default_max_tokens = 65536
    
    def prepare_messages(self, messages):
        """Normalize content to list-of-dicts, inject cache_control."""
        ...
    
    def extra_body(self, session_id):
        return {
            "metadata": {"sessionId": session_id},
            "vl_high_resolution_images": True,
        }

What changes:

Transport's build_kwargs receives a provider object instead of 20 flags
hermes_cli/auth.py reads ProviderConfig from provider modules
hermes_cli/runtime_provider.py resolves api_mode from provider registry
hermes_cli/models.py reads model lists from provider modules
auxiliary_client.py reads temperature/aux config from provider modules

What this enables:

Adding a new OpenAI-compatible provider = one file (providers/newprovider.py)
Each provider's behavior is testable in isolation
No more "search 5 files to understand how Kimi works"

Current quirk distribution (what Cycle 2 consolidates)

Quirk	Provider	Currently in	Moves to
Fixed temperature 0.6	Kimi	`auxiliary_client.py`	`providers/kimi.py`
User-Agent header	Kimi	`run_agent.py` client init	`providers/kimi.py`
Default max_tokens 16384	NVIDIA	`ChatCompletionsTransport`	`providers/nvidia.py`
Default max_tokens 65536	Qwen	`ChatCompletionsTransport`	`providers/qwen.py`
Message normalization	Qwen	`run_agent.py` + transport	`providers/qwen.py`
`vl_high_resolution_images`	Qwen	`ChatCompletionsTransport`	`providers/qwen.py`
Developer role swap	GPT-5/Codex	`ChatCompletionsTransport`	`providers/openai_codex.py`
`think=false` suppression	Ollama/custom	`ChatCompletionsTransport`	`providers/custom.py`
`num_ctx` override	Ollama	`ChatCompletionsTransport`	`providers/custom.py`
Provider preferences	OpenRouter	`ChatCompletionsTransport`	`providers/openrouter.py`
Product attribution tags	Nous	`ChatCompletionsTransport`	`providers/nous.py`
Reasoning extra_body	OR/Nous/GitHub	`ChatCompletionsTransport`	each provider module
xAI conv headers	xAI/Grok	`ResponsesApiTransport`	`providers/xai.py`
Thinking signatures	Anthropic	`AnthropicTransport` → adapter	`providers/anthropic.py`
Guardrail config	Bedrock	`BedrockTransport`	`providers/bedrock.py`
OAuth identity transform	Anthropic	adapter	`providers/anthropic.py`
Encrypted reasoning	Codex/xAI	`ResponsesApiTransport`	each provider module

extent analysis

TL;DR

The primary motivation for Cycle 2 is to consolidate per-provider quirks into single-file provider modules, making it easier to add new providers and reduce code duplication.

Guidance

Review the current quirk distribution: Examine the table in the issue body to understand how provider quirks are currently scattered across multiple files and how they will be consolidated in Cycle 2.
Create provider modules: Start creating provider modules under providers/ for each provider, declaring their auth, endpoints, client headers, temperature behavior, and other quirks in one place.
Update transports to use provider objects: Modify the transports to receive a provider object instead of multiple flags, allowing them to read the necessary information from the provider module.
Refactor affected files: Update files like hermes_cli/auth.py, hermes_cli/runtime_provider.py, hermes_cli/models.py, and auxiliary_client.py to read configuration from the provider modules.

Example

# providers/kimi.py
class KimiProvider:
    name = "kimi-coding"
    aliases = ["kimi", "moonshot"]
    api_mode = "chat_completions"
    env_vars = ["KIMI_API_KEY", "MOONSHOT_API_KEY"]
    base_url = "https://api.kimi.com/v1"
    default_headers = {"User-Agent": "hermes-agent/1.0"}
    fixed_temperature = 0.6
    default_max_tokens = None

Notes

The solution involves significant refactoring, and it's essential to ensure that all provider quirks are correctly consolidated and that the transports are updated to use the new provider modules.

Recommendation

Apply the workaround by creating provider modules and updating the transports to use them, as this will simplify the codebase and make it easier to add new providers.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #dependency conflict #environment setup #docker error #permission error

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

hermes - ✅(Solved) Fix tracking: provider transport refactor (agent/transports/) [3 pull requests, 1 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Fix Action

Fix / Workaround

PR fix notes

PR #13805: feat: add ChatCompletionsTransport + wire all default paths

Description (problem / solution / changelog)

Changes vs #13447

Impact

Transport coverage

Validation

Changed files

PR #13814: feat: add BedrockTransport + wire all Bedrock transport paths

Description (problem / solution / changelog)

Changes vs #13467

Transport coverage — COMPLETE

Validation

Changed files

PR #13862: refactor: unify transport dispatch + collapse normalize shims

Description (problem / solution / changelog)

Summary

What this PR does

Impact

What stays as-is (per plan)

Test plan

Changed files

Code Example

Overview

Shared Types (agent/transports/types.py)

Cycle 1: PR Tracker

Dependency Graph

What the Transport Owns vs What Stays on AIAgent

Transport Coverage

Abort Points

Known Gaps (from codebase stress test)

Cycle 2: Provider Modules (Next)

Current quirk distribution (what Cycle 2 consolidates)

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING

Shared Types (`agent/transports/types.py`)