hermes - ✅(Solved) Fix Bug: Gemma 4 models hardcoded with 8K context instead of 256K [1 pull request]

hermes | 2026-04-20 10:53:32

The DEFAULT_CONTEXT_LENGTHS dict in agent/model_metadata.py resolves Gemma 4 models to the wrong context length. The more specific entry "gemma-4-31b": 256000 never matches the Ollama-style name gemma4:31b-cloud, so the generic entry "gemma": 8192 is used instead.

Error Message

ValueError: Auxiliary compression model gemma4:31b-cloud has a context window of 8,192 tokens, which is below the minimum 64,000 required by Hermes Agent.
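The message implies a minimum-context check of roughly the following shape. This is a minimal sketch, not the Hermes Agent source: `MIN_CONTEXT_LENGTH` and `check_auxiliary_context` are hypothetical names, and only the 64,000 threshold and the message wording come from the error above.

```python
# Hypothetical sketch of a minimum-context check; the names are illustrative
# and are not taken from the Hermes Agent source.
MIN_CONTEXT_LENGTH = 64_000  # threshold quoted in the error message


def check_auxiliary_context(model: str, context_length: int) -> None:
    """Raise ValueError if the model's context window is below the minimum."""
    if context_length < MIN_CONTEXT_LENGTH:
        raise ValueError(
            f"Auxiliary compression model {model} has a context window of "
            f"{context_length:,} tokens, which is below the minimum "
            f"{MIN_CONTEXT_LENGTH:,} required by Hermes Agent."
        )


# The wrongly resolved value (8192) fails the check.
try:
    check_auxiliary_context("gemma4:31b-cloud", 8192)
except ValueError as exc:
    message = str(exc)
    print(message)

# The intended value (262144) passes silently.
check_auxiliary_context("gemma4:31b-cloud", 262144)
```

The point of the sketch is that the check itself behaves correctly; the failure comes from the context length fed into it.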

Root Cause

File: agent/model_metadata.py, line ~127-129

The hardcoded defaults don't account for Ollama Cloud's naming convention. The model is named gemma4:31b-cloud but the only Gemma 4 entry is "gemma-4-31b": 256000, which doesn't match. The generic "gemma": 8192 fallback catches it instead.

Fix Action

Workaround

Add to ~/.hermes/context_length_cache.yaml:

context_lengths:
  gemma4:31b-cloud@https://ollama.com/v1: 262144
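The key in this entry appears to join the model name and the endpoint base URL with `@`. The sketch below renders the same fragment for an arbitrary model; `cache_key` and `cache_entry_yaml` are hypothetical helpers written for illustration, and the key shape is inferred only from the entry above.

```python
# Hypothetical helpers; they mirror the "<model>@<base_url>" key shape seen in
# the workaround entry and are not part of Hermes Agent.
def cache_key(model: str, base_url: str) -> str:
    """Build a cache key in the shape used by the workaround entry."""
    return f"{model}@{base_url}"


def cache_entry_yaml(model: str, base_url: str, context_length: int) -> str:
    """Render the two-line fragment for context_length_cache.yaml as text."""
    return (
        "context_lengths:\n"
        f"  {cache_key(model, base_url)}: {context_length}\n"
    )


snippet = cache_entry_yaml("gemma4:31b-cloud", "https://ollama.com/v1", 262144)
print(snippet)
```

If the cache file already exists, the new line belongs under its existing context_lengths key rather than in a second top-level block.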

PR fix notes

PR #13000: fix(model_metadata): add Ollama-style naming for Gemma 4 context length

Repository: NousResearch/hermes-agent
Author: MestreY0d4-Uninter
State: open | merged: False
Link: https://github.com/NousResearch/hermes-agent/pull/13000

Description (problem / solution / changelog)

Summary

Fixes #12976 - Gemma 4 models were incorrectly resolving to 8K context instead of 256K when used via Ollama Cloud.

Problem

The DEFAULT_CONTEXT_LENGTHS dict in agent/model_metadata.py uses substring matching to find context lengths. Ollama Cloud names Gemma 4 as gemma4:31b-cloud, which:

Does NOT match "gemma-4-31b" (the key is hyphenated; the Ollama name is not)
DOES match "gemma" (generic fallback for older models)
Therefore resolves to 8192 instead of 262144
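The mismatch can be reproduced with plain substring checks. The sketch below is an illustration under assumptions: `OLD_DEFAULTS` holds only the two pre-fix keys named in this report, and `lookup` is a hypothetical first-match helper, since the actual matching order in agent/model_metadata.py is not shown here.

```python
from typing import Dict, Optional

# The two pre-fix entries named in the report (the real dict has more keys).
OLD_DEFAULTS: Dict[str, int] = {
    "gemma-4-31b": 256000,
    "gemma": 8192,
}


def lookup(model: str, table: Dict[str, int]) -> Optional[int]:
    """Return the value of the first key that is a substring of the model name.

    Assumption: first match in insertion order; the real lookup may differ.
    """
    name = model.lower()
    for key, length in table.items():
        if key in name:
            return length
    return None


model = "gemma4:31b-cloud"
print("gemma-4-31b" in model)        # False: the key has hyphens, the name has none
print("gemma" in model)              # True: the generic key matches
print(lookup(model, OLD_DEFAULTS))   # 8192
```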

Solution

Added Ollama-style naming entries:

"gemma4": 262144 - matches gemma4:31b-cloud, gemma4:27b
"gemma-4-27b": 262144 - explicit 27B variant
"gemma3": 131072 - Ollama-style for Gemma 3

Also corrected the Gemma 4 value from 256000 to 262144 (256K = 256 * 1024)
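The corrected numbers are binary multiples rather than decimal thousands, which a few lines of arithmetic confirm. The constant name `KIB` is only a label chosen for this sketch.

```python
KIB = 1024  # "K" in the binary sense used by the corrected values

gemma4_tokens = 256 * KIB
gemma3_tokens = 128 * KIB

print(gemma4_tokens)             # 262144
print(gemma4_tokens - 256_000)   # 6144 tokens missing from the old decimal value
print(gemma3_tokens)             # 131072
```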

Testing

Updated existing test to expect correct 262144 value
Added new test test_gemma4_ollama_style_context for Ollama naming
All 3 context length tests pass locally

Files Changed

agent/model_metadata.py - Add Gemma 4/3 Ollama-style keys
tests/hermes_cli/test_gemini_provider.py - Update tests

Changed files

agent/model_metadata.py (modified, +10/-4)
tests/hermes_cli/test_gemini_provider.py (modified, +8/-1)


Bug Report: Gemma 4 Models Hardcoded with 8K Context Instead of 256K

Description

Reproduction

1. Start from a fresh Hermes install (no context_length_cache.yaml)
2. Configure an Ollama Cloud auxiliary model:

auxiliary:
  compression:
    provider: ollama-cloud
    model: gemma4:31b-cloud

3. Run any task requiring context compression

Expected Behavior

gemma4:31b-cloud resolves to 262144 tokens (256K context)

Actual Behavior

Model resolves to 8192 tokens, triggering:

ValueError: Auxiliary compression model gemma4:31b-cloud has a context window of 8,192 tokens, which is below the minimum 64,000 required by Hermes Agent.

Root Cause

File: agent/model_metadata.py, line ~127-129

Impact

All Ollama Cloud auxiliary functions fail with gemma4:31b-cloud on fresh installs
Users must manually add cache entries or patch source code
Breakage after updates if cache is cleared

Workaround

Add to ~/.hermes/context_length_cache.yaml:

context_lengths:
  gemma4:31b-cloud@https://ollama.com/v1: 262144

Suggested Fix

Add explicit entries for Ollama-style naming in agent/model_metadata.py:

"gemma-4": 262144,    # Gemma 4 family
"gemma4": 262144,     # Ollama-style naming (e.g., gemma4:31b-cloud)
"gemma-3": 131072,
"gemma": 8192,        # fallback for older gemma models only

extent analysis

TL;DR

Update the DEFAULT_CONTEXT_LENGTHS dictionary in agent/model_metadata.py to include explicit entries for Ollama-style naming conventions to fix the context length resolution issue for Gemma 4 models.

Guidance

Confirm the issue first: check that ~/.hermes/context_length_cache.yaml contains no manual override for gemma4:31b-cloud, and that the DEFAULT_CONTEXT_LENGTHS dictionary in agent/model_metadata.py has no key that matches the Ollama-style name other than the generic "gemma": 8192 fallback.
Apply the suggested fix by adding explicit entries for Ollama-style naming in agent/model_metadata.py, such as "gemma-4": 262144 and "gemma4": 262144.
As a temporary workaround, add a manual entry to ~/.hermes/context_length_cache.yaml to override the context length for gemma4:31b-cloud.
Test the fix by running a task that requires context compression with the gemma4:31b-cloud model and verifying that it resolves to the correct context length of 262144 tokens.

Example

DEFAULT_CONTEXT_LENGTHS = {
    # ...
    "gemma-4": 262144,    # Gemma 4 family
    "gemma4": 262144,     # Ollama-style naming (e.g., gemma4:31b-cloud)
    "gemma-3": 131072,
    "gemma": 8192,        # fallback for older gemma models only
}
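To see how these entries resolve, here is a minimal sketch that pairs them with a longest-key-first substring match. The function `resolve_context_length` and its matching strategy are assumptions made for illustration; the real lookup in agent/model_metadata.py may iterate differently.

```python
from typing import Dict, Optional

# Entries from the example above (the real dict contains other models too).
DEFAULT_CONTEXT_LENGTHS: Dict[str, int] = {
    "gemma-4": 262144,    # Gemma 4 family
    "gemma4": 262144,     # Ollama-style naming (e.g., gemma4:31b-cloud)
    "gemma-3": 131072,
    "gemma": 8192,        # fallback for older gemma models only
}


def resolve_context_length(model: str) -> Optional[int]:
    """Longest-key-first substring match (an assumption, not Hermes' code)."""
    name = model.lower()
    for key in sorted(DEFAULT_CONTEXT_LENGTHS, key=len, reverse=True):
        if key in name:
            return DEFAULT_CONTEXT_LENGTHS[key]
    return None


for candidate in ("gemma4:31b-cloud", "gemma-4-31b", "gemma-3-27b", "gemma2:9b"):
    print(candidate, resolve_context_length(candidate))
```

Sorting keys by length makes the result independent of dict order. If the lookup instead returns the first match in insertion order, the generic "gemma" key has to stay after the more specific ones, as it does in the example.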

Notes

The suggested fix assumes that the Ollama-style naming convention is consistent and can be accounted for with explicit entries in the DEFAULT_CONTEXT_LENGTHS dictionary. If the naming convention is subject to change, a more dynamic solution may be necessary.
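One shape such a dynamic solution could take is to normalize model names before the lookup, so that hyphenated and Ollama-style spellings collapse to the same family key. The sketch below is hypothetical throughout: `normalize_model_name`, `FAMILY_CONTEXT_LENGTHS`, and `resolve_by_family` are not part of Hermes Agent, and the normalization rules are an assumption that would need checking against other providers' naming.

```python
import re
from typing import Dict, Optional

# Hypothetical family table keyed by normalized names.
FAMILY_CONTEXT_LENGTHS: Dict[str, int] = {
    "gemma4": 262144,
    "gemma3": 131072,
    "gemma": 8192,
}


def normalize_model_name(model: str) -> str:
    """Lower-case, drop the endpoint suffix and tag, and remove separators."""
    name = model.lower().split("@", 1)[0]   # drop "@https://..." if present
    name = name.split(":", 1)[0]            # drop an Ollama tag such as ":31b-cloud"
    return re.sub(r"[-_.\s]", "", name)     # "gemma-4-31b" -> "gemma431b"


def resolve_by_family(model: str) -> Optional[int]:
    """Match the normalized name against family prefixes, longest first."""
    name = normalize_model_name(model)
    for key in sorted(FAMILY_CONTEXT_LENGTHS, key=len, reverse=True):
        if name.startswith(key):
            return FAMILY_CONTEXT_LENGTHS[key]
    return None


print(resolve_by_family("gemma4:31b-cloud"))  # 262144
print(resolve_by_family("gemma-4-31b"))       # 262144
```

The trade-off is that aggressive normalization can merge names that should stay distinct, so explicit entries remain the more conservative choice.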

Recommendation

Until the fix in PR #13000 is merged (it is listed above as open and not merged), apply the workaround by adding a manual entry to ~/.hermes/context_length_cache.yaml. This lets users keep using gemma4:31b-cloud without hitting the 8K context-length error.
