hermes - ✅(Solved) Fix auxiliary compression model does not read context_length from providers config [3 pull requests, 3 comments, 2 participants]

GitHub stats
NousResearch/hermes-agent#13807 (fetched 2026-04-23 07:48:50)
Comments: 3
Participants: 2
Timeline: 10
Reactions: 0
Timeline (top): labeled ×4, commented ×3, cross-referenced ×3

Root Cause

In run_agent.py, _aux_compression_context_length_config is only populated from auxiliary.compression.context_length. The same providers lookup that exists for the main model is missing for the auxiliary model.

Fix Action

Fixed

PR fix notes

PR #13813: fix(run_agent): read context_length from providers/custom_providers for compression model

Description (problem / solution / changelog)

Summary

When auxiliary.compression.model uses a custom provider with per-model context_length configured under providers/<key>/models/<model>/context_length, the compression feasibility check now looks up that value — mirroring the main model's existing providers lookup in __init__.

Root cause: _aux_compression_context_length_config was only populated from auxiliary.compression.context_length (explicit override). The same providers / custom_providers per-model context_length lookup that exists for the main model was missing for the auxiliary model.

Fix: In _prepare_compression(), added a providers/custom_providers context_length lookup for the auxiliary compression model (mirroring the main model lookup in __init__).
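The per-model lookup described above can be sketched as follows. This is a minimal illustration, assuming the config has already been parsed into a plain dict shaped like the providers block in this issue. The helper name lookup_provider_context_length is hypothetical and is not taken from run_agent.py.

```python
from typing import Any, Mapping, Optional


def lookup_provider_context_length(
    providers: Mapping[str, Any], model: str
) -> Optional[int]:
    """Return providers.<key>.models.<model>.context_length, or None.

    Hypothetical helper: scans every provider entry and returns the first
    positive integer context_length configured for the given model.
    """
    for entry in providers.values():
        if not isinstance(entry, Mapping):
            continue
        models = entry.get("models")
        if not isinstance(models, Mapping):
            continue
        model_cfg = models.get(model)
        if not isinstance(model_cfg, Mapping):
            continue
        value = model_cfg.get("context_length")
        if isinstance(value, int) and value > 0:
            return value
    return None


# Config from the issue, written as the dict a YAML loader would produce.
config = {
    "providers": {
        "local-kimi": {
            "base_url": "http://127.0.0.1:8080/v1",
            "model": "kimi-code/kimi-code",
            "models": {"kimi-code/kimi-code": {"context_length": 262141}},
        }
    }
}

aux_context = lookup_provider_context_length(
    config["providers"], "kimi-code/kimi-code"
)
print(aux_context)  # 262141
```

Returning None for a missing or malformed entry lets the caller fall through to endpoint detection instead of raising on a partially filled config.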

Testing

  1. Configure auxiliary.compression.provider: custom + auxiliary.compression.model pointing to a local endpoint with context_length: 262141 in providers.local-kimi.models.kimi-code/kimi-code.context_length
  2. Start a session — the compression threshold should use 262141 instead of the endpoint's auto-detected 128000

Closes #13807

Changed files

  • agent/prompt_builder.py (modified, +7/-1)
  • hermes_cli/model_normalize.py (modified, +17/-0)
  • hermes_cli/tools_config.py (modified, +9/-1)
  • run_agent.py (modified, +48/-1)
  • tests/agent/test_prompt_builder.py (modified, +18/-0)
  • ui-tui/src/components/appChrome.tsx (modified, +4/-1)

PR #14023: fix(agent): detect custom provider context length for auto-mode compression

Description (problem / solution / changelog)

  • Added _get_custom_provider_context_length() helper to query custom_providers config for context length
  • Integrated step 0b in get_model_context_length() to consult custom_providers before probe tiers
  • Preserves auto flexibility (no hardcoded model names) by matching on base_url
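A rough sketch of base_url matching, to illustrate the idea in the bullets above. It assumes custom_providers is a list of dicts that each carry a base_url and a models mapping; that layout, and the function name context_length_for_base_url, are assumptions made for this example and may differ from the real helper in agent/model_metadata.py.

```python
from typing import Any, Mapping, Optional, Sequence


def _normalize_url(url: str) -> str:
    # Ignore surrounding whitespace and trailing slashes when comparing.
    return url.strip().rstrip("/")


def context_length_for_base_url(
    custom_providers: Sequence[Mapping[str, Any]], base_url: str, model: str
) -> Optional[int]:
    """Find a configured context_length by endpoint URL, not by model name.

    Hypothetical helper: only entries whose base_url matches are consulted,
    so no model names need to be hardcoded in the resolver itself.
    """
    target = _normalize_url(base_url)
    for entry in custom_providers:
        if _normalize_url(str(entry.get("base_url", ""))) != target:
            continue
        models = entry.get("models") or {}
        value = (models.get(model) or {}).get("context_length")
        if isinstance(value, int) and value > 0:
            return value
    return None


# Assumed list-shaped custom_providers config (illustrative only).
custom_providers = [
    {
        "name": "local-kimi",
        "base_url": "http://127.0.0.1:8080/v1/",
        "models": {"kimi-code/kimi-code": {"context_length": 262141}},
    }
]

found = context_length_for_base_url(
    custom_providers, "http://127.0.0.1:8080/v1", "kimi-code/kimi-code"
)
print(found)  # 262141
```

Matching on the endpoint keeps the lookup independent of which model name the user picked for auto mode; a non-matching URL returns None so the probe tiers still run.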

Changed files

  • agent/model_metadata.py (modified, +64/-0)
  • tests/agent/test_model_metadata.py (modified, +124/-0)

PR #14119: fix(agent): honor provider context for aux compression

Description (problem / solution / changelog)

Summary

  • reuse providers / compatible custom-provider model context hints for auxiliary.compression
  • keep explicit auxiliary.compression.context_length overrides taking precedence
  • add an init-level regression test covering provider-backed auxiliary context resolution
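The precedence described in these bullets can be illustrated with a small resolver and a few regression-style checks. This is a sketch under stated assumptions: resolve_aux_context_length is a hypothetical name, and the 200000 override is an invented value used only to show that an explicit setting wins.

```python
from typing import Optional


def resolve_aux_context_length(
    explicit_override: Optional[int],
    provider_hint: Optional[int],
    endpoint_detected: Optional[int],
) -> Optional[int]:
    """Return the first usable value in precedence order.

    Order assumed here: explicit auxiliary.compression.context_length,
    then the provider-configured per-model value, then endpoint detection.
    """
    for candidate in (explicit_override, provider_hint, endpoint_detected):
        if candidate is not None and candidate > 0:
            return candidate
    return None


# Explicit override takes precedence over the provider hint.
assert resolve_aux_context_length(200000, 262141, 128000) == 200000
# Without an override, the provider-configured value is used.
assert resolve_aux_context_length(None, 262141, 128000) == 262141
# With neither, the endpoint-reported value remains the fallback.
assert resolve_aux_context_length(None, None, 128000) == 128000
```

The second assertion corresponds to the case reported in issue #13807: before the fix, resolution skipped the provider hint and landed on the endpoint value.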

Testing

  • pytest -o addopts= tests/run_agent/test_compression_feasibility.py

Closes #13807

Changed files

  • run_agent.py (modified, +42/-0)
  • tests/run_agent/test_compression_feasibility.py (modified, +70/-0)

Code Example

providers:
  local-kimi:
    base_url: http://127.0.0.1:8080/v1
    model: kimi-code/kimi-code
    models:
      kimi-code/kimi-code:
        context_length: 262141

auxiliary:
  compression:
    provider: custom
    model: kimi-code/kimi-code
    base_url: http://127.0.0.1:8080/v1

---

Compression model (kimi-code/kimi-code) context is 128,000 tokens, but the main model's compression threshold was 209,712 tokens. Auto-lowered this session's threshold to 128,000 tokens so compression can run.
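The auto-lowering behind this warning can be sketched as a simple comparison. The function below is an illustrative assumption and not the actual feasibility check in run_agent.py; it only reproduces the numbers quoted in the message.

```python
from typing import Tuple


def effective_compression_threshold(
    main_threshold: int, aux_context: int
) -> Tuple[int, bool]:
    """Return (threshold, lowered).

    If the compression model's context window is smaller than the main
    model's compression threshold, the threshold is capped at that window
    so the compression request can still fit.
    """
    if aux_context < main_threshold:
        return aux_context, True
    return main_threshold, False


# With the endpoint-reported 128,000 the threshold is lowered (the bug).
print(effective_compression_threshold(209712, 128000))  # (128000, True)

# With the configured 262,141 no lowering is needed (the expected result).
print(effective_compression_threshold(209712, 262141))  # (209712, False)
```

The 209,712 figure is consistent with 80% of 262,141 rounded down, but the source does not state the ratio, so treat that as an inference.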

Bug Description

The auxiliary compression model fails to read context_length from the providers / custom_providers config section, causing it to fall back to endpoint-reported values (e.g. 128K) instead of the user-configured value.

Repro Steps

  1. Configure a custom provider with per-model context_length in config.yaml:
providers:
  local-kimi:
    base_url: http://127.0.0.1:8080/v1
    model: kimi-code/kimi-code
    models:
      kimi-code/kimi-code:
        context_length: 262141

auxiliary:
  compression:
    provider: custom
    model: kimi-code/kimi-code
    base_url: http://127.0.0.1:8080/v1
  2. Start a session.
  3. Observe the warning:
⚠ Compression model (kimi-code/kimi-code) context is 128,000 tokens, but the main model's compression threshold was 209,712 tokens. Auto-lowered this session's threshold to 128,000 tokens so compression can run.

Expected Behavior

The compression model should resolve context_length: 262141 from providers.local-kimi.models.kimi-code/kimi-code, matching how the main model already does.

Actual Behavior

get_model_context_length() for the compression model only checks:

  1. auxiliary.compression.context_length direct override
  2. Endpoint auto-detection (which returns 128K for this local endpoint)

It never looks up providers / custom_providers per-model config, unlike the main model which does (see run_agent.py:1587-1623).

Root Cause

In run_agent.py, _aux_compression_context_length_config is only populated from auxiliary.compression.context_length. The same providers lookup that exists for the main model is missing for the auxiliary model.

Suggested Fix

Add the same providers / custom_providers per-model context_length lookup for the auxiliary compression model before falling back to endpoint detection.


Labels: bug, context-compression

extent analysis

TL;DR

The auxiliary compression model can be fixed by adding a providers lookup for per-model context_length configuration before falling back to endpoint detection.

Guidance

  • Review the run_agent.py file, specifically lines 1587-1623, to understand how the main model resolves context_length from providers / custom_providers.
  • Modify the _aux_compression_context_length_config population in run_agent.py to include the same providers lookup as the main model.
  • Verify that the context_length value is correctly read from the providers / custom_providers config section for the auxiliary compression model.
  • Test the updated code with the provided repro steps to ensure the warning message is no longer displayed.

Example

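# Sketch only: names are illustrative. Per-model values live at providers.<key>.models.<model>.context_length.
def provider_context_length(providers, model):
    for entry in (providers or {}).values():
        value = entry.get("models", {}).get(model, {}).get("context_length")
        if value:
            return value
    return None

_aux_compression_context_length_config = (
    aux_compression_cfg.get("context_length")  # explicit override wins
    or provider_context_length(providers, model)
    or provider_context_length(custom_providers, model)  # assumes the same dict shape
    or endpoint_detected_context_length
)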

Notes

The suggested fix assumes that the providers and custom_providers data structures are already populated with the necessary model configurations. If this is not the case, additional code changes may be required to load the provider configurations.

Recommendation

Apply the suggested fix to add the providers lookup for the auxiliary compression model, as this will allow the model to correctly resolve the context_length value from the user-configured settings.
