Hermes: kimi-k2.6 on Ollama Cloud detected as 32K context despite API reporting 256K


Error Message

ValueError: Model kimi-k2.6 has a context window of 32,768 tokens, below the minimum 64,000 required by Hermes Agent.

Bug Description

Hermes Agent rejects kimi-k2.6 on Ollama Cloud with the following error:

ValueError: Model kimi-k2.6 has a context window of 32,768 tokens,
below the minimum 64,000 required by Hermes Agent.

However, the Ollama Cloud API correctly reports a context length of 262,144 (256K), and DEFAULT_CONTEXT_LENGTHS["kimi"] in the Hermes source code is also set to 262144.

Evidence

1. Ollama Cloud API returns the correct value

Endpoint: POST https://ollama.com/api/show (model name in the request body)

The response includes:

"model_info": {
  "kimi-k2.context_length": 262144
}
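As a rough illustration of how a client could read this value, the sketch below pulls the first key ending in ".context_length" out of a model_info mapping. The function name extract_context_length and the sample payload are hypothetical; this is not Hermes' actual code, and only the "kimi-k2.context_length" key is taken from the response quoted above.

```python
from typing import Any, Mapping, Optional


def extract_context_length(show_response: Mapping[str, Any]) -> Optional[int]:
    """Return the context length from an /api/show-style payload, if any.

    Hypothetical helper: scans model_info for a key ending in
    ".context_length" (the prefix varies by model family).
    """
    model_info = show_response.get("model_info") or {}
    for key, value in model_info.items():
        if key.endswith(".context_length") and isinstance(value, int):
            return value
    return None


# Sample payload trimmed to the single field quoted in this report.
sample = {"model_info": {"kimi-k2.context_length": 262144}}
print(extract_context_length(sample))
```

Matching on the key suffix rather than a fixed key avoids hardcoding the "kimi-k2" prefix, which is an assumption about how such a lookup might be made robust across model families.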

2. Server type detection succeeds

Endpoint: GET https://ollama.com/api/tags

Returns a valid model list. detect_local_server_type() should therefore identify the provider as "ollama".

3. Hermes source already knows the correct value

DEFAULT_CONTEXT_LENGTHS in model_metadata.py contains:

"kimi": 262144
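How Hermes matches a model name against this table is not shown in the report. The sketch below assumes a simple substring match, under which "kimi-k2.6" would resolve to the "kimi" entry. The function lookup_default_context is hypothetical, and the table holds only the one entry quoted above.

```python
from typing import Optional

# Only the "kimi" entry comes from the report; the rest of the table is elided.
DEFAULT_CONTEXT_LENGTHS = {"kimi": 262144}


def lookup_default_context(model: str) -> Optional[int]:
    """Hypothetical lookup: the longest table key contained in the model name wins."""
    name = model.lower()
    matches = [key for key in DEFAULT_CONTEXT_LENGTHS if key in name]
    if not matches:
        return None
    return DEFAULT_CONTEXT_LENGTHS[max(matches, key=len)]


print(lookup_default_context("kimi-k2.6"))
```

If the real lookup behaves like this, the table alone would be enough to clear the 64,000 minimum, which suggests the 32,768 value is chosen before the table is consulted.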

4. Despite the above, run_agent.py throws

Lines ~2000–2011 raise:

ValueError: Model kimi-k2.6 has a context window of 32,768 tokens...

Root Cause Hypothesis

detect_local_server_type() may fail to identify https://ollama.com/v1 as an "ollama" provider because it is a remote/cloud endpoint rather than a local server. Alternatively, query_ollama_num_ctx() may not be called for remote Ollama instances at all.

A hardcoded fallback of 32,768 appears somewhere in the context-resolution chain. This value appears neither in DEFAULT_CONTEXT_LENGTHS nor in the API response, so its origin is unclear.
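To make the first hypothesis concrete, the sketch below shows one way a detection step could treat the configured base URL. Both helpers are hypothetical and do not reflect the real detect_local_server_type(): native_ollama_root strips a trailing "/v1" segment to reach the root under which /api/tags and /api/show live, and looks_local is an example of a local-only gate that would skip a cloud endpoint entirely.

```python
from urllib.parse import urlsplit, urlunsplit


def native_ollama_root(base_url: str) -> str:
    """Strip a trailing OpenAI-compatible "/v1" segment from a base URL."""
    parts = urlsplit(base_url)
    path = parts.path.rstrip("/")
    if path.endswith("/v1"):
        path = path[: -len("/v1")]
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def looks_local(base_url: str) -> bool:
    """Hypothetical local-only check that would exclude a cloud endpoint."""
    host = urlsplit(base_url).hostname or ""
    return host in {"localhost", "127.0.0.1", "::1"}


root = native_ollama_root("https://ollama.com/v1")
print(root + "/api/tags")                    # probe URL for server-type detection
print(looks_local("https://ollama.com/v1"))  # a local-only gate would skip the probe
```

If detection is gated on something like looks_local, or if the probe is sent to a path under /v1 instead of the root, the Ollama-specific context query would never run for https://ollama.com/v1.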

Workaround

Add the following to config.yaml under the model section:

model:
  context_length: 262144

This bypasses automatic detection and allows the session to start normally.

Environment

Hermes Agent: ~0.11.0
Provider: Ollama Cloud (https://ollama.com/v1)
Model: kimi-k2.6
OS: macOS 26.3
Affected configs: Global (~/.hermes/config.yaml) and profile (~/.hermes/profiles/<profile>/config.yaml)

Suggested Fix

Investigate the context-length resolution path in model_metadata.py for remote Ollama providers. Ensure query_ollama_num_ctx() is called and its result is used, rather than silently falling back to 32,768.
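One possible shape for such a resolution path is sketched below. It is an assumption, not the existing Hermes logic: the function resolve_context_length, its parameters, and the priority order (config override, then provider-reported value, then defaults table, then fallback) are all hypothetical. The two constants are the values quoted in the error message.

```python
import logging
from typing import Optional

logger = logging.getLogger("context_resolution")

FALLBACK_CONTEXT_LENGTH = 32768  # value observed in the error message
MINIMUM_CONTEXT_LENGTH = 64000   # minimum quoted in the error message


def resolve_context_length(
    config_override: Optional[int],
    provider_reported: Optional[int],
    table_default: Optional[int],
) -> int:
    """Hypothetical order: config, then provider, then table, then fallback."""
    for source, value in (
        ("config", config_override),
        ("provider", provider_reported),
        ("defaults table", table_default),
    ):
        if value:
            logger.debug("context length %d taken from %s", value, source)
            return value
    logger.warning(
        "no context length found; falling back to %d", FALLBACK_CONTEXT_LENGTH
    )
    return FALLBACK_CONTEXT_LENGTH


# With the provider value available, the fallback is never reached.
print(resolve_context_length(None, 262144, 262144))
# With every source missing, the fallback is used and a warning is logged.
print(resolve_context_length(None, None, None))
```

Logging a warning on the fallback branch means a case like this one would at least show which source was missing, instead of surfacing only as a ValueError about the minimum context window.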
