hermes - 💡(How to fix) Fix [Bug]: Compression model context length does not respect custom provider context length


Error Message

Operating System

Ubuntu 22.04.2 LTS

Python Version

3.10.9

Hermes Version

Hermes Agent v0.13.0 (2026.5.7)

Additional Logs / Traceback (optional)

Root Cause

Because custom_providers is not passed, the compression model may not get the correct context length from the custom provider config.

Fix / Workaround

The agent reports the mismatch with this warning:

    Compression model <compression-model> (<custom-provider>) context is 256,000 tokens, but the main model <main-model> (custom)'s compression threshold was 840,000 tokens. Auto-lowered this session's threshold to 256,000 tokens so compression can run.

A local patch that fixes the issue is to load compatible custom providers into _aux_custom_providers before the compression model context lookup, and pass them to get_model_context_length():

```python
aux_context = get_model_context_length(
    aux_model,
    base_url=aux_base_url,
    api_key=aux_api_key,
    config_context_length=getattr(self, "_aux_compression_context_length_config", None),
    provider=(_aux_cfg_provider if _aux_cfg_provider and _aux_cfg_provider != "auto" else getattr(self, "provider", "")),
    custom_providers=_aux_custom_providers,
)
```

After applying this patch, the compression model context length should be resolved through the same custom provider logic as the main model.


Bug Description

When using a custom provider for the compression model, the compression model's context length is not correctly resolved from custom_providers.

For example, the compression model supports 256,000 tokens, but the main model had a compression threshold of 840,000 tokens.

The agent detected the mismatch and printed:

    Compression model <compression-model> (<custom-provider>) context is 256,000 tokens, but the main model <main-model> (custom)'s compression threshold was 840,000 tokens. Auto-lowered this session's threshold to 256,000 tokens so compression can run.

This suggests that the compression model context length is not being resolved using the configured custom provider metadata.
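To make the auto-lowering in that warning concrete, here is a minimal sketch of the capping step. It is not Hermes' implementation: `effective_compression_threshold` is a hypothetical helper, and the only values taken from the report are the two token counts.

```python
def effective_compression_threshold(main_threshold: int, aux_context: int) -> int:
    """Cap the main model's compression threshold at the compression
    model's context window so the compression request can fit.

    Hypothetical helper for illustration only.
    """
    if aux_context <= 0:
        raise ValueError("aux_context must be positive")
    return min(main_threshold, aux_context)


# Token counts quoted in the warning from the report.
lowered = effective_compression_threshold(840_000, 256_000)
print(f"Threshold used for this session: {lowered:,} tokens")
```

The sketch shows why the resolved compression context matters: whatever value the lookup returns becomes the ceiling for the session's threshold, so a wrong lookup result changes when compression runs.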

Steps to Reproduce


  1. Configure a custom provider in the Hermes Agent config.

  2. Set the main model to a custom model from that provider.

  3. Set a separate compression model from the same or another custom provider.

  4. Configure the compression model with a known context length in the custom provider config, for example 256000.

  5. Start a session where the main model's compression threshold is higher than the compression model's configured context length.

  6. Trigger or approach the compression threshold so that _check_compression_feasibility() runs.

  7. Observe that the agent reports a mismatch and auto-lowers the compression threshold, for example:

    Compression model <compression-model> (<custom-provider>) context is 256,000 tokens, but the main model <main-model> (custom)'s compression threshold was 840,000 tokens. Auto-lowered this session's threshold to 256,000 tokens so compression can run.

Check the implementation in run_agent.py: _check_compression_feasibility() calls get_model_context_length() without passing custom_providers, unlike the main model logic in switch_model().

Reproducibility

This should reproduce whenever the compression model relies on custom_providers metadata to resolve its context length.

Expected Behavior

The compression model should resolve its context length using the same custom provider configuration logic as the main model.

If the compression model is configured through a custom provider, get_model_context_length() should receive custom_providers, so it can correctly determine the model's context length.

Actual Behavior

In run_agent.py, inside _check_compression_feasibility(), the call to get_model_context_length() does not pass custom_providers.

Current code:

```python
aux_context = get_model_context_length(
    aux_model,
    base_url=aux_base_url,
    api_key=aux_api_key,
    config_context_length=getattr(self, "_aux_compression_context_length_config", None),
    provider=(_aux_cfg_provider if _aux_cfg_provider and _aux_cfg_provider != "auto" else getattr(self, "provider", "")),
)
```

Because custom_providers is not passed, the compression model may not get the correct context length from the custom provider config.

Affected Component

Agent Core (conversation loop, context compression, memory)

Messaging Platform (if gateway-related)

No response

Debug Report

The issue appears to be in run_agent.py, inside the _check_compression_feasibility() method.

The compression model context length is resolved with:

```python
aux_context = get_model_context_length(
    aux_model,
    base_url=aux_base_url,
    api_key=aux_api_key,
    config_context_length=getattr(self, "_aux_compression_context_length_config", None),
    provider=(_aux_cfg_provider if _aux_cfg_provider and _aux_cfg_provider != "auto" else getattr(self, "provider", "")),
)
```

However, this call does not pass custom_providers.

By comparison, the main model path in switch_model() appears to load compatible custom providers and pass them to get_model_context_length(). As a result, the main model can correctly resolve context length from custom provider metadata, while the compression model path cannot.
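The difference between the two paths can be illustrated with a toy resolver. This is a sketch under stated assumptions, not the real get_model_context_length(): the function name `lookup_context_length`, the provider dictionary layout, and the `DEFAULT_CONTEXT` fallback value are all hypothetical and chosen only to show how omitting custom_providers changes the result.

```python
from typing import Optional

# Hypothetical fallback; Hermes' actual default is not stated in the report.
DEFAULT_CONTEXT = 128_000


def lookup_context_length(
    model: str,
    custom_providers: Optional[list[dict]] = None,
) -> int:
    """Toy resolver: prefer custom provider metadata, else fall back."""
    for provider in custom_providers or []:
        models = provider.get("models", {})
        if model in models:
            return models[model]["context_length"]
    return DEFAULT_CONTEXT


# Hypothetical provider metadata with a configured context length.
providers = [
    {"name": "my-provider", "models": {"my-compressor": {"context_length": 256_000}}}
]

# Main-model style call: metadata is forwarded, so the configured value wins.
with_metadata = lookup_context_length("my-compressor", custom_providers=providers)

# Compression-model style call from the report: metadata is never consulted.
without_metadata = lookup_context_length("my-compressor")

print(with_metadata, without_metadata)
```

In this toy version the second call can only return the fallback, which mirrors the reported inconsistency between the main model path and the compression model path.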

This can cause the compression model context length to be resolved incorrectly or inconsistently, which then triggers the warning:

```text
Compression model <compression-model> (<custom-provider>) context is 256,000 tokens, but the main model <main-model> (custom)'s compression threshold was 840,000 tokens. Auto-lowered this session's threshold to 256,000 tokens so compression can run.
```

A local patch that fixes the issue is to load compatible custom providers before the compression model context lookup:

```python
_aux_custom_providers = None
try:
    from hermes_cli.config import load_config, get_compatible_custom_providers
    _aux_cfg = load_config()
    _aux_custom_providers = get_compatible_custom_providers(_aux_cfg)
except Exception:
    # Fall back to no custom providers if the config cannot be loaded.
    _aux_custom_providers = None
```

Then pass them into get_model_context_length():

```python
aux_context = get_model_context_length(
    aux_model,
    base_url=aux_base_url,
    api_key=aux_api_key,
    config_context_length=getattr(self, "_aux_compression_context_length_config", None),
    provider=(_aux_cfg_provider if _aux_cfg_provider and _aux_cfg_provider != "auto" else getattr(self, "provider", "")),
    custom_providers=_aux_custom_providers,
)
```

After applying this patch, the compression model context length should be resolved through the same custom provider logic as the main model.


Root Cause Analysis (optional)

No response

Proposed Fix (optional)

The logic should match the main model path in switch_model().

Before calling get_model_context_length(), load compatible custom providers:

```python
_aux_custom_providers = None
try:
    from hermes_cli.config import load_config, get_compatible_custom_providers
    _aux_cfg = load_config()
    _aux_custom_providers = get_compatible_custom_providers(_aux_cfg)
except Exception:
    _aux_custom_providers = None
```

Then pass custom_providers into get_model_context_length():

```python
aux_context = get_model_context_length(
    aux_model,
    base_url=aux_base_url,
    api_key=aux_api_key,
    config_context_length=getattr(self, "_aux_compression_context_length_config", None),
    provider=(_aux_cfg_provider if _aux_cfg_provider and _aux_cfg_provider != "auto" else getattr(self, "provider", "")),
    custom_providers=_aux_custom_providers,
)
```
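One way to guard the proposed change against regressions is a test that checks the keyword is actually forwarded. The sketch below is self-contained and does not import Hermes: `get_model_context_length` and `check_compression_feasibility` here are simplified stand-ins with hypothetical signatures, and the injected `resolver` parameter is an assumption made so the example can run on its own.

```python
from unittest import mock


def get_model_context_length(model, custom_providers=None):
    """Stand-in for the real resolver; only the keyword name matters here."""
    return 0


def check_compression_feasibility(model, custom_providers, resolver=get_model_context_length):
    """Stand-in for the patched call site: forwards custom_providers."""
    return resolver(model, custom_providers=custom_providers)


# Replace the resolver with a mock so the forwarded arguments can be inspected.
fake_resolver = mock.Mock(return_value=256_000)
providers = [{"name": "my-provider"}]  # hypothetical provider entry

result = check_compression_feasibility("my-compressor", providers, resolver=fake_resolver)

# Raises AssertionError if custom_providers was dropped on the way.
fake_resolver.assert_called_once_with("my-compressor", custom_providers=providers)
```

In the real code base the same idea would likely be expressed by patching get_model_context_length where run_agent.py looks it up, but the exact patch target depends on how that module imports it.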

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR
