hermes - [Bug] Auxiliary auto-detect ignores named custom providers, routes all unconfigured tasks to main provider



Bug Description

When using a named custom provider (defined in config.yaml providers or custom_providers) as the main model provider, auxiliary tasks that don't have explicit auxiliary.<task> config entries fall through to auto-detect, which skips the named custom provider and routes directly to the main provider's default model.

This causes silent misconfiguration: the auxiliary task uses the main model (often a large, slow model) instead of a suitable auxiliary model. Because the initial request and the fallback both go to the same endpoint, a timeout on the first is likely to repeat on the second.

Environment

  • Hermes Agent v0.10.x
  • macOS / Linux
  • Provider: named custom provider (ollama-launch) pointing to Ollama Cloud at http://127.0.0.1:11434/v1

Config

model:
  default: glm-5.1:cloud
  provider: ollama-launch

providers:
  ollama-launch:
    api: http://127.0.0.1:11434/v1
    default_model: kimi-k2.6:cloud

# Only compression and vision are explicitly configured:
auxiliary:
  compression:
    model: kimi-k2.6:cloud
    provider: custom
    base_url: http://127.0.0.1:11434/v1
    timeout: 300
  vision:
    model: kimi-k2.6:cloud
    provider: custom
    base_url: http://127.0.0.1:11434/v1
    timeout: 120

What Happens

Tasks without an explicit auxiliary.<task> config entry (e.g., session_search, title_generation) go through _resolve_task_provider_model(), which returns "auto", and then through _resolve_auto() in resolve_provider_client():

  1. _resolve_auto() reads main_provider = runtime_provider or _read_main_provider() → "ollama-launch"
  2. It calls resolve_provider_client("ollama-launch", main_model)
  3. resolve_provider_client finds ollama-launch via _get_named_custom_provider() — this part works
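  4. But: the model used is the main model, not a lightweight auxiliary model. Per the config that is glm-5.1:cloud, although the auto-detect line under Log Evidence reports kimi-k2.6:cloud, the provider's default_model.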

Even for tasks WITH explicit config (like compression), if the user writes provider: ollama-launch instead of provider: custom, _resolve_task_provider_model returns ("ollama-launch", model, base_url, None), then resolve_provider_client enters the named custom provider branch. This actually works for routing, but there's a secondary problem:

When the named custom provider's default_model differs from what the user specified in auxiliary.<task>.model, the resolution may pick default_model instead, depending on code path.
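
The fall-through described above can be modelled in a few lines. This is a simplified sketch based only on the behaviour reported here, not the Hermes source: resolve_task and the CONFIG dict are hypothetical names, and the real _resolve_task_provider_model() and _resolve_auto() may take other inputs into account.

```python
from typing import Optional, Tuple

# Hypothetical stand-in for the parsed config.yaml from the Config section.
CONFIG = {
    "model": {"default": "glm-5.1:cloud", "provider": "ollama-launch"},
    "providers": {
        "ollama-launch": {
            "api": "http://127.0.0.1:11434/v1",
            "default_model": "kimi-k2.6:cloud",
        }
    },
    "auxiliary": {
        "compression": {
            "model": "kimi-k2.6:cloud",
            "provider": "custom",
            "base_url": "http://127.0.0.1:11434/v1",
            "timeout": 300,
        }
    },
}


def resolve_task(config: dict, task: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Return (provider, model, base_url) for an auxiliary task.

    Models the reported behaviour only; it is not the actual resolver.
    """
    entry = config.get("auxiliary", {}).get(task)
    if entry and entry.get("provider"):
        # Explicit auxiliary.<task> entry: use it as written.
        return entry["provider"], entry.get("model"), entry.get("base_url")

    # No entry: "auto" resolves to the main provider and the main model.
    main = config["model"]
    provider_cfg = config.get("providers", {}).get(main["provider"], {})
    return main["provider"], main["default"], provider_cfg.get("api")


print(resolve_task(CONFIG, "compression"))
print(resolve_task(CONFIG, "session_search"))
```

In this model, session_search has no entry and therefore inherits the main provider and main model, which is the behaviour Issue 1 below objects to.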

The Real Problem: Two Separate Issues

Issue 1: Unconfigured auxiliary tasks use the main model

session_search, title_generation, and any future auxiliary task default to "auto", which resolves to the main provider + main model. For large models (e.g., glm-5.1:cloud, kimi-k2.6:cloud), this means:

  • Expensive auxiliary calls (session summarization doesn't need a 100B+ parameter model)
  • Timeout risk for tasks like compression that need fast responses
  • No way to set a global auxiliary default without configuring every individual task

Issue 2: provider: custom vs named custom providers for auxiliary tasks

Users must use provider: custom + explicit base_url for auxiliary tasks. Using the named provider (e.g., provider: ollama-launch) technically works for routing, but:

  • It triggers a WARNING about OPENAI_BASE_URL mismatch
  • The base_url must be duplicated in both providers.ollama-launch.api and auxiliary.<task>.base_url
  • There's no documentation that provider: custom is required for auxiliary tasks

This is related to #17488 (named custom providers being overwritten by aliases) and #14744 (vision auto-detection failing on Ollama Cloud).

Log Evidence

# Compression with provider: ollama-launch (before fix)
2026-05-08 23:15:50 Auxiliary compression: using ollama-launch (qwen3.5:cloud) at http://127.0.0.1:11434/v1/

# session_search with no explicit config → auto-detect → uses main model
2026-05-08 23:29:19 WARNING: OPENAI_BASE_URL is set (http://127.0.0.1:11434/v1) but model.provider is 'ollama-launch'. Auxiliary clients may route to the wrong endpoint.
2026-05-08 23:29:19 Auxiliary auto-detect: using main provider ollama-launch (kimi-k2.6:cloud)

# session_search timeout (same endpoint, same model, same timeout)
2026-05-08 23:30:50 Auxiliary session_search (async): connection error on auto (Request timed out.), trying fallback

# After fix (provider: custom):
2026-05-09 10:22:17 Auxiliary compression: using custom (kimi-k2.6:cloud) at http://127.0.0.1:11434/v1/
# Works correctly, completes in ~15 seconds
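
To check which provider and model each auxiliary task actually resolved to, lines like the ones above can be parsed with a small script. This is a sketch that assumes the "Auxiliary <task>: using <provider> (<model>) at <url>" layout seen in these excerpts; parse_aux_line is a hypothetical helper, and other Hermes versions may log a different format.

```python
import re
from typing import Dict, Optional

# Matches e.g. "Auxiliary compression: using custom (kimi-k2.6:cloud) at http://..."
AUX_LINE = re.compile(
    r"Auxiliary (?P<task>[\w-]+): using (?P<provider>\S+) "
    r"\((?P<model>[^)]+)\) at (?P<url>\S+)"
)


def parse_aux_line(line: str) -> Optional[Dict[str, str]]:
    """Extract task, provider, model and url from one log line, or None."""
    match = AUX_LINE.search(line)
    return match.groupdict() if match else None


before = ("2026-05-08 23:15:50 Auxiliary compression: using ollama-launch "
          "(qwen3.5:cloud) at http://127.0.0.1:11434/v1/")
after = ("2026-05-09 10:22:17 Auxiliary compression: using custom "
         "(kimi-k2.6:cloud) at http://127.0.0.1:11434/v1/")

for line in (before, after):
    print(parse_aux_line(line))
```

The auto-detect line ("using main provider ...") has a different shape and is deliberately not matched by this pattern.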

Suggested Fixes

  1. Add a global auxiliary default: Allow auxiliary.default.provider and auxiliary.default.model as a fallback for unconfigured tasks, instead of always defaulting to "auto" (which picks the main model).

  2. Detect named custom providers in auto-detect: When _resolve_auto() encounters a named custom provider (e.g., ollama-launch), it should check if that provider has a lightweight model suitable for auxiliary tasks and prefer it.

  3. Document the provider: custom requirement: The hermes-agent skill doc and official docs should explicitly state that auxiliary tasks require provider: custom + base_url when using non-standard providers, and that named custom providers in auxiliary.*.provider may not behave as expected.

  4. Default timeout: The hardcoded _DEFAULT_AUX_TIMEOUT = 30.0 is too short for any cloud model. Consider raising it to 120s or making it configurable via auxiliary.default.timeout.
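
Suggestions 1 and 4 could be combined into a single merge step. The sketch below is illustrative only: the auxiliary.default block is the proposal from this report and does not exist in Hermes today, effective_aux_settings is a hypothetical name, and the 30.0 constant simply mirrors the _DEFAULT_AUX_TIMEOUT value quoted above.

```python
from typing import Any, Dict

DEFAULT_AUX_TIMEOUT = 30.0  # value of _DEFAULT_AUX_TIMEOUT as quoted in this report


def effective_aux_settings(auxiliary: Dict[str, Dict[str, Any]], task: str) -> Dict[str, Any]:
    """Merge built-in defaults, the proposed auxiliary.default, and auxiliary.<task>.

    Later layers override earlier ones, so a task entry only needs to list
    the keys that differ from the global default.
    """
    merged: Dict[str, Any] = {
        "provider": "auto",
        "model": None,
        "base_url": None,
        "timeout": DEFAULT_AUX_TIMEOUT,
    }
    merged.update(auxiliary.get("default", {}))  # proposed global fallback
    merged.update(auxiliary.get(task, {}))       # per-task overrides
    return merged


aux = {
    "default": {
        "provider": "custom",
        "model": "kimi-k2.6:cloud",
        "base_url": "http://127.0.0.1:11434/v1",
        "timeout": 120,
    },
    "compression": {"timeout": 300},
}

print(effective_aux_settings(aux, "session_search"))
print(effective_aux_settings(aux, "compression"))
```

With this layering, an unconfigured task such as session_search would pick up the global provider, model and timeout instead of falling back to "auto", and the base_url would only need to be written once. One design caveat: "default" would become a reserved key and could no longer be used as a task name.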

Workaround

Current working configuration:

auxiliary:
  compression:
    model: kimi-k2.6:cloud
    provider: custom          # MUST be custom, NOT ollama-launch
    base_url: http://127.0.0.1:11434/v1
    timeout: 300              # MUST be explicit (default is only 30s)
  vision:
    model: kimi-k2.6:cloud
    provider: custom
    base_url: http://127.0.0.1:11434/v1
    timeout: 120
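
Until a fix lands, a quick check of the parsed config can catch the two "MUST" rules in the workaround before they show up as timeouts. This is a rough sketch under stated assumptions: lint_auxiliary is a hypothetical helper, KNOWN_TASKS contains only the task names mentioned in this report, and providers is assumed to be a mapping keyed by provider name as in the Config section.

```python
from typing import Any, Dict, List

# Only the task names mentioned in this report; the real list may be longer.
KNOWN_TASKS = ("compression", "vision", "session_search", "title_generation")


def lint_auxiliary(config: Dict[str, Any]) -> List[str]:
    """Return human-readable warnings for auxiliary entries that break the workaround."""
    warnings: List[str] = []
    named = set(config.get("providers", {}))
    auxiliary = config.get("auxiliary", {})

    for task in KNOWN_TASKS:
        entry = auxiliary.get(task)
        if entry is None:
            warnings.append(f"{task}: no explicit entry, falls through to auto-detect")
            continue
        provider = entry.get("provider")
        if provider in named:
            warnings.append(f"{task}: named provider {provider!r}, workaround needs provider: custom")
        if provider == "custom" and not entry.get("base_url"):
            warnings.append(f"{task}: provider: custom without base_url")
        if "timeout" not in entry:
            warnings.append(f"{task}: no explicit timeout, the 30s default applies")
    return warnings


config = {
    "providers": {"ollama-launch": {"api": "http://127.0.0.1:11434/v1"}},
    "auxiliary": {
        "compression": {"model": "kimi-k2.6:cloud", "provider": "ollama-launch"},
        "vision": {
            "model": "kimi-k2.6:cloud",
            "provider": "custom",
            "base_url": "http://127.0.0.1:11434/v1",
            "timeout": 120,
        },
    },
}

for warning in lint_auxiliary(config):
    print(warning)
```

The dict can be produced from config.yaml with any YAML loader; the sample above flags compression twice (named provider, missing timeout) and reports session_search and title_generation as unconfigured.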
