hermes - ✅(Solved) Fix get_model_context_length step 2 short-circuits to 128K default; resolve_display_context_length skips config_context_length [1 pull requests, 1 comments, 2 participants]

GitHub stats
NousResearch/hermes-agent#15563 · Fetched 2026-04-26 05:26:37
Comments: 1 · Participants: 2 · Timeline events: 6 · Reactions: 0
Timeline (top): labeled ×4, commented ×1, cross-referenced ×1

get_model_context_length() resolution step 2 (_is_custom_endpoint branch in agent/model_metadata.py) returns DEFAULT_FALLBACK_CONTEXT (128K) when the endpoint is not a known provider and its /models API response lacks context_length, without falling through to later resolution steps (hardcoded DEFAULT_CONTEXT_LENGTHS, models.dev, OpenRouter). This means models that genuinely support 1M context via custom/compat endpoints are capped at 128K.

Additionally, resolve_display_context_length() in hermes_cli/model_switch.py does not pass config_context_length, so the explicit model.context_length or per-model context_length in config.yaml (step 0) is skipped. Even if the config has correct values, the /model switch confirmation still shows 128K.

Root Cause

Issue A: In agent/model_metadata.py, the resolution at step 2 (lines 1261–1296):

if _is_custom_endpoint(base_url) and not _is_known_provider_base_url(base_url):
    endpoint_metadata = fetch_endpoint_model_metadata(...)
    ...
    if not _is_known_provider_base_url(base_url):
        if is_local_endpoint(base_url):
            local_ctx = ...
        # Falls through to DEFAULT_FALLBACK_CONTEXT without reaching steps 4–8
        return DEFAULT_FALLBACK_CONTEXT

The return DEFAULT_FALLBACK_CONTEXT short-circuits the entire resolution chain. Steps 4–8 (Anthropic API, provider-aware lookups, models.dev, hardcoded defaults, OpenRouter) are never reached.
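The difference between the short-circuit and a fall-through chain can be illustrated with a minimal sketch. This is not the Hermes implementation: `probe_endpoint`, `lookup_hardcoded`, and `DEMO_CONTEXT_LENGTHS` are hypothetical stand-ins, and only two of the resolution steps are modelled. The 202752 figure is the one quoted in the PR notes for zai-org/GLM-5-TEE.

```python
from typing import Callable, Optional

DEFAULT_FALLBACK_CONTEXT = 128_000

# Hypothetical stand-in for the hardcoded defaults table.
DEMO_CONTEXT_LENGTHS = {"zai-org/GLM-5-TEE": 202_752}


def probe_endpoint(model: str) -> Optional[int]:
    """Stand-in for the /models probe: this endpoint reports no context_length."""
    return None


def lookup_hardcoded(model: str) -> Optional[int]:
    """Stand-in for a later resolution step (hardcoded defaults)."""
    return DEMO_CONTEXT_LENGTHS.get(model)


def resolve_short_circuit(model: str) -> int:
    """Buggy shape: a probe miss returns the default immediately."""
    ctx = probe_endpoint(model)
    if ctx is not None:
        return ctx
    return DEFAULT_FALLBACK_CONTEXT  # later steps are unreachable


def resolve_fall_through(model: str) -> int:
    """Fixed shape: a miss moves on to the next step; the default comes last."""
    steps: list[Callable[[str], Optional[int]]] = [probe_endpoint, lookup_hardcoded]
    for step in steps:
        ctx = step(model)
        if ctx is not None:
            return ctx
    return DEFAULT_FALLBACK_CONTEXT
```

With the same probe miss, `resolve_short_circuit("zai-org/GLM-5-TEE")` yields the 128K default, while `resolve_fall_through` reaches the table entry and only uses the default for models that no step recognises.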

Issue B: resolve_display_context_length() omits config_context_length:

ctx = get_model_context_length(
    model,
    base_url=base_url or "",
    api_key=api_key or "",
    provider=provider or None,
    # config_context_length is MISSING here
)

Fix Action

Workaround

Users can convert their provider models list from flat list to dict format with per-model context_length:

providers:
  my-provider:
    models:
      some-model:
        context_length: 1000000

But this only fixes the agent runtime — the /model confirmation display still shows 128K due to issue B above.
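Why the dict format matters can be shown with a small sketch over an already-parsed config. The helper name `per_model_context_length` is hypothetical and the way Hermes actually reads `providers.<name>.models` is not shown in this issue, so treat this as an illustration of the data shapes only.

```python
from typing import Any, Optional


def per_model_context_length(models: Any, model: str) -> Optional[int]:
    """Return the configured context_length for `model`, if any.

    A flat list (["some-model"]) has nowhere to store per-model settings,
    so only the dict form can yield a value.
    """
    if not isinstance(models, dict):
        return None
    entry = models.get(model)
    if not isinstance(entry, dict):
        return None
    value = entry.get("context_length")
    # Reject booleans explicitly: bool is a subclass of int in Python.
    if isinstance(value, int) and not isinstance(value, bool) and value > 0:
        return value
    return None


flat_models = ["some-model"]
dict_models = {"some-model": {"context_length": 1_000_000}}
```

Here `per_model_context_length(flat_models, "some-model")` has nothing to return, whereas the dict form carries the explicit 1000000 value.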

PR fix notes

PR #15732: fix(context-length): fall through to catalog when endpoint probe misses; pass config_context_length to display resolver

Description (problem / solution / changelog)

Summary

  • Remove early return DEFAULT_FALLBACK_CONTEXT in get_model_context_length() step 2 so custom endpoints fall through to hardcoded defaults and models.dev when their /models API returns no context_length field
  • Add config_context_length parameter to resolve_display_context_length() and thread it from cli.py so the explicit model.context_length config override is honoured in the /model confirmation display

The bug

Issue A (agent/model_metadata.py:1283-1296): When _is_custom_endpoint(base_url) and the endpoint's /models API returns no context_length metadata, the resolver returned DEFAULT_FALLBACK_CONTEXT (128K) immediately — skipping steps 4–8 (hardcoded DEFAULT_CONTEXT_LENGTHS, models.dev, OpenRouter). A model like zai-org/GLM-5-TEE proxied via a custom URL has an entry in the hardcoded table (202752) that was never reached.

Issue B (hermes_cli/model_switch.py:550-556): resolve_display_context_length() called get_model_context_length() without config_context_length, so the step-0 config override (model.context_length in config.yaml) was skipped in the /model display path. Users who set model.context_length: 400000 still saw "Context: 128K tokens".

The fix

A: Replace the early return DEFAULT_FALLBACK_CONTEXT with a logger.debug(...) and fall through to the remaining resolution steps. The log message is downgraded from info to debug since it is no longer the final word.

B: Add config_context_length: Optional[int] = None to resolve_display_context_length(), pass it to get_model_context_length(), and update the cli.py callsite to extract model.context_length from self.config and forward it.
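Fix B can be sketched as follows. The real signatures are only partly visible in this issue, so the parameter lists below are assumptions, and `get_model_context_length` is reduced to a stand-in that models only the step-0 config override and the final default.

```python
from typing import Optional

DEFAULT_FALLBACK_CONTEXT = 128_000


def get_model_context_length(
    model: str,
    base_url: str = "",
    api_key: str = "",
    provider: Optional[str] = None,
    config_context_length: Optional[int] = None,
) -> int:
    """Simplified stand-in: step 0 honours an explicit config value."""
    if config_context_length:
        return config_context_length
    return DEFAULT_FALLBACK_CONTEXT  # all other steps elided in this sketch


def resolve_display_context_length(
    model: str,
    base_url: Optional[str] = None,
    api_key: Optional[str] = None,
    provider: Optional[str] = None,
    config_context_length: Optional[int] = None,  # parameter added by the fix
) -> int:
    """Display-path resolver that now forwards the config override."""
    return get_model_context_length(
        model,
        base_url=base_url or "",
        api_key=api_key or "",
        provider=provider or None,
        config_context_length=config_context_length,
    )
```

Because the new parameter defaults to `None`, existing callers that do not pass it keep their current behaviour; only callers that forward `model.context_length` see the override applied.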

Test plan

  • Before (A): get_model_context_length("zai-org/GLM-5-TEE", base_url="https://llm.chutes.ai/v1", ...) with mocked empty endpoint metadata returned 128000
  • After (A): Same call returns 202752 (the hardcoded default for the GLM-5 family)
  • Regression guard (A): reverted the fall-through change → test_custom_endpoint_without_metadata_falls_through_to_catalog fails with assert 128000 == 202752; restored → passes
  • Before (B): resolve_display_context_length(config_context_length=1_000_000) did not forward the parameter
  • After (B): test_config_context_length_passed_through_to_resolver confirms the kwarg reaches get_model_context_length
  • Adjacent suites unchanged: tests/hermes_cli/test_model_switch_context_display.py (6 tests), tests/agent/test_model_metadata.py (96 tests), tests/hermes_cli/test_copilot_context.py, tests/hermes_cli/test_gemini_provider.py — 176 passed total
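The pass-through check in the test plan can be approximated with `unittest.mock`. This is a sketch, not the PR's test: the resolver is injected as an argument here for self-containment (the real test presumably patches the imported function), and the function names are hypothetical.

```python
from typing import Callable, Optional
from unittest.mock import Mock


def resolve_display_context_length(
    model: str,
    resolver: Callable[..., int],
    config_context_length: Optional[int] = None,
) -> int:
    """Hypothetical display resolver with the underlying resolver injected."""
    return resolver(model, config_context_length=config_context_length)


def check_config_context_length_is_forwarded() -> int:
    fake_resolver = Mock(return_value=1_000_000)
    result = resolve_display_context_length(
        "some-model", fake_resolver, config_context_length=1_000_000
    )
    # Raises AssertionError if the kwarg is dropped on the way to the resolver.
    fake_resolver.assert_called_once_with(
        "some-model", config_context_length=1_000_000
    )
    return result
```

Asserting on the call arguments, rather than on the returned number alone, is what makes the test fail when the kwarg is silently omitted.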

Related

  • Fixes #15563

🤖 Generated with Claude Code

Changed files

  • agent/model_metadata.py (modified, +25/-6)
  • cli.py (modified, +15/-0)
  • gateway/run.py (modified, +9/-0)
  • hermes_cli/model_switch.py (modified, +2/-0)
  • tests/agent/test_model_metadata.py (modified, +26/-2)
  • tests/hermes_cli/test_model_switch_context_display.py (modified, +21/-0)

Steps to Reproduce

  1. Configure a custom provider that proxies models via a compatible API endpoint
  2. Set model.context_length: 400000 or per-model context_length: 1000000 in config.yaml
  3. Run /model <model> --provider <custom>
  4. Observe the confirmation message shows "Context: 128K tokens" regardless of actual model capability

Expected Behavior

  1. Step 2 should fall through to later resolution steps instead of returning the default
  2. resolve_display_context_length() should pass config_context_length so explicit config overrides are honored in the display

Version

Hermes Agent v0.11.0 (observed on commit from release branch)

extent analysis

TL;DR

Modify get_model_context_length() in agent/model_metadata.py so that step 2 falls through to the later resolution steps when a custom endpoint's /models response lacks context_length, and update resolve_display_context_length() in hermes_cli/model_switch.py to pass config_context_length.

Guidance

  • Review the agent/model_metadata.py file, specifically lines 1261-1296, to ensure that the function does not return DEFAULT_FALLBACK_CONTEXT prematurely.
  • Update the resolve_display_context_length() function in hermes_cli/model_switch.py to include config_context_length as an argument to get_model_context_length().
  • Verify that the config.yaml file contains the correct model.context_length or per-model context_length values.
  • Test the changes by running the /model command with a custom provider and observing the confirmation message.

Example

# get_model_context_length(), step 2: drop the early return
if _is_custom_endpoint(base_url) and not _is_known_provider_base_url(base_url):
    endpoint_metadata = fetch_endpoint_model_metadata(...)
    # ... (rest of the branch unchanged)
    # On a probe miss, log at debug level and fall through to the later
    # steps instead of returning DEFAULT_FALLBACK_CONTEXT here.

# resolve_display_context_length(): forward the config override
ctx = get_model_context_length(
    model,
    base_url=base_url or "",
    api_key=api_key or "",
    provider=provider or None,
    config_context_length=config_context_length,  # newly forwarded argument
)

Notes

The provided workaround of converting the provider models list to a dict format with per-model context_length only partially addresses the issue, as it does not fix the display problem.

Recommendation

Apply both code changes: the fall-through in get_model_context_length() and the config_context_length argument in resolve_display_context_length(). Together they correct both the runtime value and the /model confirmation display. Until they land, the config.yaml workaround corrects the agent runtime only.
