hermes - ✅(Solved) Fix Bug: /model switch to named custom provider ignores custom_providers model context_length [2 pull requests, 2 comments, 3 participants]

GitHub stats
NousResearch/hermes-agent#15779 · Fetched 2026-04-26 05:25:04
Comments: 2 · Participants: 3 · Timeline events: 12 · Reactions: 0
Timeline (top): labeled ×4, referenced ×3, commented ×2, cross-referenced ×2

Error Message

No traceback. The issue is in the resolved context window after /model.

Fix Action

Fixed

PR fix notes

PR #15787: fix(model-switch): honor custom_providers per-model context_length on /model switch (#15779)

Description (problem / solution / changelog)

Summary

  • Adds _lookup_custom_provider_context_length(model, base_url) helper in hermes_cli.model_switch that reads custom_providers[].models.<model>.context_length for a given (model, base_url) pair
  • resolve_display_context_length() now forwards the per-model override as config_context_length to the resolver, so /model confirmation shows the correct window
  • run_agent.switch_model() refreshes self._config_context_length on each switch so context compression uses the correct window after the switch

The bug

When a live session switched to a named custom provider via /model, Hermes ignored per-model context_length configured under custom_providers[].models.<model>.context_length in two independent code paths:

  1. Display (resolve_display_context_length): called get_model_context_length() without config_context_length, so the per-model override never reached the resolver and the confirmation message always reported the auto-detected default (128 K instead of the configured 1,050,000).

  2. Runtime (run_agent.switch_model): forwarded self._config_context_length, which was computed at startup for the original model and never refreshed for the new target. Context compression after the switch operated on the wrong window size.
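
The runtime half of the bug is a stale cached value. A minimal sketch of that shape, assuming a hypothetical ToyAgent class and an OVERRIDES table standing in for the real config lookup (neither is Hermes code):

```python
# Hypothetical stand-in for per-model overrides read from config; not Hermes code.
OVERRIDES = {"gpt-5.5": 1_050_000}
DEFAULT_CONTEXT = 128_000  # fallback window reported in the issue


class ToyAgent:
    def __init__(self, model: str) -> None:
        self.model = model
        # Computed once at startup, for the original model only.
        self._config_context_length = OVERRIDES.get(model)

    def context_length(self) -> int:
        return self._config_context_length or DEFAULT_CONTEXT

    def switch_model_stale(self, model: str) -> None:
        # Buggy shape: the cached override is never refreshed.
        self.model = model

    def switch_model_refreshed(self, model: str) -> None:
        # Fixed shape: re-resolve the override for the new target.
        self.model = model
        self._config_context_length = OVERRIDES.get(model)


agent = ToyAgent("MiniMax-M2.7")
agent.switch_model_stale("gpt-5.5")
stale_window = agent.context_length()

agent.switch_model_refreshed("gpt-5.5")
fresh_window = agent.context_length()
```

With the stale path the window stays at the fallback after the switch; refreshing on every switch picks up the configured value.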

The fix

A new _lookup_custom_provider_context_length(model, base_url) helper in hermes_cli.model_switch looks up the per-model context_length from get_compatible_custom_providers() by matching base_url (trailing-slash insensitive, lowercase). Both broken sites call this helper so the correct context window is used for display and compression after every model switch.
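
The lookup described above can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: the function name lookup_context_length is hypothetical, and the provider list is passed in explicitly rather than read from get_compatible_custom_providers().

```python
from typing import Any, Optional


def _normalize_url(url: str) -> str:
    # Trailing-slash insensitive, lowercase comparison, as the PR describes.
    return url.strip().rstrip("/").lower()


def lookup_context_length(
    providers: list[dict[str, Any]], model: str, base_url: str
) -> Optional[int]:
    """Return custom_providers[].models.<model>.context_length, or None."""
    target = _normalize_url(base_url)
    for entry in providers:
        if _normalize_url(str(entry.get("base_url", ""))) != target:
            continue
        models = entry.get("models")
        if not isinstance(models, dict):
            continue
        model_cfg = models.get(model)
        if not isinstance(model_cfg, dict):
            continue
        value = model_cfg.get("context_length")
        # Accept only positive integers (bool is excluded explicitly).
        if isinstance(value, int) and not isinstance(value, bool) and value > 0:
            return value
    return None


# Provider entry mirroring the repro config from the issue.
providers = [
    {
        "name": "my-custom-endpoint",
        "base_url": "https://example.invalid/v1",
        "models": {"gpt-5.5": {"context_length": 1050000}},
    }
]
```

Returning None on any miss lets the caller fall through to its existing detection and default logic unchanged.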

Test plan

  • Before: tests fail with ImportError (function doesn't exist on stock main) — proves tests are not tautological
  • After: 13/13 new tests pass covering positive, negative, edge, and idempotency cases
  • Regression guard: stashed fix → observed ImportError on import of _lookup_custom_provider_context_length; restored → 13/13 pass
  • Adjacent suites (test_model_switch_context_display, test_model_switch_custom_providers, test_user_providers_model_switch) unchanged — 36 pass, 6 pre-existing baseline failures in test_custom_provider_model_switch.py also present on clean origin/main

Related

  • Fixes #15779

🤖 Generated with Claude Code

Changed files

  • hermes_cli/model_switch.py (modified, +47/-0)
  • run_agent.py (modified, +11/-1)
  • tests/hermes_cli/test_model_switch_custom_provider_context_length.py (added, +291/-0)
  • tests/run_agent/test_switch_model_context.py (modified, +46/-13)

PR #15844: fix(context): honor custom_providers context_length on /model switch + bump probe tier to 256K

Description (problem / solution / changelog)

Summary

/model <model> --provider custom:<name> now reports the context window the user configured in custom_providers[].models.<id>.context_length instead of falling back to 128K. Closes #15779. Also bumps the top context-probe tier (and default fallback) from 128K to 256K.

Root cause: the per-model override was read only in AIAgent.__init__. switch_model(), resolve_display_context_length(), and _format_session_info() all ignored it and fell through to the probe-down fallback.

Changes

  • hermes_cli/config.py: new get_custom_provider_context_length() helper — single source of truth for the per-model override lookup, trailing-slash-insensitive
  • agent/model_metadata.py: get_model_context_length() gains custom_providers= kwarg (step 0b — after explicit config_context_length, before every probe); CONTEXT_PROBE_TIERS prefixed with 256K so DEFAULT_FALLBACK_CONTEXT = 256K; stale 128000 literal in OR metadata-miss path replaced with DEFAULT_FALLBACK_CONTEXT
  • run_agent.py: startup path refactored to use the helper (dedups inline loop, preserves invalid-value warning); AIAgent.switch_model() re-reads custom_providers from live config so mid-session switches resolve the override
  • hermes_cli/model_switch.py: resolve_display_context_length() accepts + forwards custom_providers
  • gateway/run.py: /model confirmation (picker callback + text path) and _format_session_info thread custom_providers through

Validation

Scenario | Before | After
/model gpt-5.5 --provider custom:my-endpoint with context_length: 1050000 | Context: 128,000 | Context: 1,050,000
Unknown-model default (no detection succeeds) | 128,000 | 256,000
CONTEXT_PROBE_TIERS[0] | 128,000 | 256,000

Targeted tests: tests/agent/test_model_metadata.py, tests/hermes_cli/test_custom_provider_context_length.py, tests/hermes_cli/test_model_switch_context_display.py, tests/hermes_cli/test_model_switch_custom_providers.py, tests/hermes_cli/test_custom_provider_model_switch.py, tests/run_agent/test_invalid_context_length_warning.py, tests/run_agent/test_switch_model_context.py, tests/agent/test_model_metadata_local_ctx.py, tests/gateway/test_session_info.py. Result: 171/171 pass. Broader tests/gateway/ tests/run_agent/ delta: 47 failing on origin/main → 46 failing on branch (same pre-existing failures, none new).

E2E via execute_code: exact #15779 repro config + helper edge cases (trailing slash, wrong url/model, invalid value types, zero/negative, config_context_length precedence) all green.
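
The value edge cases listed above (invalid value types, zero/negative) imply some validation of the raw config value. A minimal sketch, assuming a hypothetical coerce_context_length function; whether the real helper accepts numeric strings is not stated in the PR and is an assumption here.

```python
from typing import Any, Optional


def coerce_context_length(raw: Any) -> Optional[int]:
    """Return a positive int, or None if the configured value is unusable."""
    if isinstance(raw, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        return None
    if isinstance(raw, int):
        return raw if raw > 0 else None
    if isinstance(raw, str):
        # Assumption: tolerate digit strings with "_" or "," separators.
        text = raw.strip().replace("_", "").replace(",", "")
        if text.isascii() and text.isdigit():
            value = int(text)
            return value if value > 0 else None
    # Floats, None, lists and anything else are treated as invalid.
    return None
```

A caller would treat None as "no override" and continue with the next resolution step, optionally logging a warning for values that were present but invalid.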

New tests

  • tests/hermes_cli/test_custom_provider_context_length.py — 19 tests: helper unit + step 0b integration + CONTEXT_PROBE_TIERS invariants
  • tests/hermes_cli/test_model_switch_context_display.py — added regression tests for #15779 through the display resolver

Closes #15779.

Changed files

  • agent/model_metadata.py (modified, +23/-3)
  • gateway/run.py (modified, +9/-0)
  • hermes_cli/config.py (modified, +65/-0)
  • hermes_cli/model_switch.py (modified, +7/-0)
  • run_agent.py (modified, +67/-34)
  • tests/agent/test_model_metadata.py (modified, +10/-6)
  • tests/gateway/test_session_info.py (modified, +1/-1)
  • tests/hermes_cli/test_custom_provider_context_length.py (added, +240/-0)
  • tests/hermes_cli/test_model_switch_context_display.py (modified, +58/-0)


Bug Description

When a running gateway session switches to a named custom provider via /model, Hermes ignores the per-model context_length configured under custom_providers[].models.<model>.context_length and falls back to the default 128,000 token window.

This is reproducible even when the same config is already sufficient for the startup/session-reset path to resolve the correct context window.

Related but not identical: #5089.

Steps to Reproduce

  1. Configure a named custom provider in ~/.hermes/config.yaml with a per-model context override:

model:
  default: MiniMax-M2.7
  provider: minimax-cn
  base_url: https://api.minimaxi.com/v1

custom_providers:
  - name: my-custom-endpoint
    base_url: https://example.invalid/v1
    api_key: <redacted>
    model: gpt-5.5
    models:
      gpt-5.5:
        context_length: 1050000

  2. Start a fresh gateway chat session with /new.
  3. In that running session, switch to the named custom provider:

/model gpt-5.5 --provider custom:my-custom-endpoint

(Equivalent triple syntax also reproduces it: /model custom:my-custom-endpoint:gpt-5.5)

  4. Observe the /model switch confirmation.

Expected Behavior

After the switch, Hermes should use the configured per-model context window from custom_providers and report something equivalent to:

Model switched to gpt-5.5
Provider: my-custom-endpoint
Context: 1,050,000 tokens

Actual Behavior

Hermes instead reports the fallback window:

Model switched to gpt-5.5
Provider: my-custom-endpoint
Context: 128,000 tokens
(session only -- add --global to persist)

Affected Component

Gateway model switching (/model), custom provider context resolution

Messaging Platform (if gateway-related)

Feishu

Debug Report

Report https://paste.rs/cXlQQ agent.log https://paste.rs/i86Gq gateway.log https://paste.rs/0ESpk

Operating System

macOS

Python Version

No response

Hermes Version

Observed on v2026.4.23 (commit bf196a3fc0fd1f79353369e8732051db275c6276)

Additional Logs / Traceback (optional)

No traceback. The issue is in the resolved context window after `/model`.

Root Cause Analysis (optional)

There appear to be two different context-resolution paths:

  1. Startup / session reset path
    • run_agent.py reads per-model context_length from custom_providers during agent initialization.
  2. Mid-session /model switch path
    • gateway/run.py builds the /model switch confirmation by calling get_model_context_length(...) without carrying the same per-model override.
    • run_agent.py's model-switch update path similarly forwards only _config_context_length, which comes from top-level model.context_length, not from custom_providers[].models.<id>.context_length.

Relevant locations observed while debugging:

  • run_agent.py:1619-1640 (startup path reads custom_providers per-model context_length)
  • run_agent.py:1994-2004 (model-switch path recalculates context using only _config_context_length)
  • gateway/run.py:5753-5760 (/model response path calls get_model_context_length(...) directly)

This makes the behavior diverge:

  • startup/reset can use the configured custom-provider context window
  • /model switch falls back to 128,000

Proposed Fix (optional)

Make the /model switch path reuse the same custom-provider override resolution that startup already uses.

Concretely, when the top-level model: block sets no context_length (so config_context_length is absent) and the active provider/base URL matches an entry in custom_providers, the switch path should look up:

custom_providers[].models.<resolved_model>.context_length

before falling back to endpoint probing or the default 128,000 window.

One way to do this would be to centralize that lookup so both startup and /model switch paths share the same resolution helper instead of duplicating partial logic.
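
A rough sketch of that centralization, assuming a hypothetical resolve_window helper and a parsed config dict that mirrors the repro above. Function names and the precedence of a top-level model.context_length follow this proposal; they are not taken from the Hermes code base.

```python
from typing import Optional

DEFAULT_CONTEXT = 128_000  # fallback window reported in this issue

# Hypothetical parsed config mirroring the repro above.
CONFIG = {
    "model": {"default": "MiniMax-M2.7", "base_url": "https://api.minimaxi.com/v1"},
    "custom_providers": [
        {
            "name": "my-custom-endpoint",
            "base_url": "https://example.invalid/v1",
            "models": {"gpt-5.5": {"context_length": 1050000}},
        }
    ],
}


def resolve_window(config: dict, model: str, base_url: str) -> int:
    """Single resolution helper shared by startup and /model switch."""
    top_level: Optional[int] = config.get("model", {}).get("context_length")
    if top_level:
        return top_level
    wanted = base_url.rstrip("/").lower()
    for entry in config.get("custom_providers", []):
        if str(entry.get("base_url", "")).rstrip("/").lower() != wanted:
            continue
        override = entry.get("models", {}).get(model, {}).get("context_length")
        if isinstance(override, int) and override > 0:
            return override
    return DEFAULT_CONTEXT


def startup_window(config: dict) -> int:
    # Startup path: resolve for the default model and its base URL.
    model_cfg = config["model"]
    return resolve_window(config, model_cfg["default"], model_cfg["base_url"])


def switch_confirmation(config: dict, model: str, provider_name: str) -> str:
    # Switch path: same helper, keyed by the target provider's base URL.
    entry = next(p for p in config["custom_providers"] if p["name"] == provider_name)
    window = resolve_window(config, model, entry["base_url"])
    return (
        f"Model switched to {model}\n"
        f"Provider: {provider_name}\n"
        f"Context: {window:,} tokens"
    )


message = switch_confirmation(CONFIG, "gpt-5.5", "my-custom-endpoint")
```

Because both paths go through one function, any later change to precedence or validation applies to startup and /model switch at the same time.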

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR

extent analysis

TL;DR

To fix the issue, update the /model switch path in gateway/run.py and run_agent.py to reuse the custom-provider override resolution used during startup, ensuring that the per-model context_length from custom_providers is applied.

Guidance

  1. Review the proposed fix: Understand the difference in context-resolution paths between startup/session reset and mid-session /model switch, as outlined in the Root Cause Analysis.
  2. Centralize the lookup logic: Create a shared helper function that resolves the context_length from custom_providers[].models.<resolved_model>.context_length when the top-level model: block sets no context_length and the active provider matches a custom_providers entry.
  3. Apply the fix to both paths: Modify run_agent.py and gateway/run.py to use this centralized helper for both startup and /model switch logic, ensuring consistency in context window resolution.
  4. Test the changes: Verify that after applying the fix, switching models with /model correctly applies the per-model context_length from custom_providers, as expected.

Example

No code example is provided due to the complexity and specificity of the changes required, but the proposed fix suggests centralizing the lookup logic for context_length resolution.

Notes

The fix requires careful consideration of the interaction between run_agent.py and gateway/run.py, ensuring that the centralized helper function correctly resolves the context_length for both startup and /model switch scenarios.

Recommendation

Implement the proposed fix: centralize the context_length resolution logic so that both the startup and /model switch paths apply the per-model overrides from custom_providers. This addresses the root cause rather than working around it, and makes context window resolution consistent across both paths.
