hermes - ✅(Solved) Fix Bug: Gemma 4 models hardcoded with 8K context instead of 256K [1 pull request]


The DEFAULT_CONTEXT_LENGTHS dict in agent/model_metadata.py resolves Gemma 4 models to the wrong fallback. The name gemma4:31b-cloud does not contain the more specific key "gemma-4-31b" (256000), so the generic entry "gemma": 8192 matches it instead.

Error Message

ValueError: Auxiliary compression model gemma4:31b-cloud has a context window of 8,192 tokens, which is below the minimum 64,000 required by Hermes Agent.

Root Cause

File: agent/model_metadata.py, lines ~127-129

The hardcoded defaults don't account for Ollama Cloud's naming convention. The model is named gemma4:31b-cloud but the only Gemma 4 entry is "gemma-4-31b": 256000, which doesn't match. The generic "gemma": 8192 fallback catches it instead.

Fix Action

Workaround

Add to ~/.hermes/context_length_cache.yaml:

context_lengths:
  gemma4:31b-cloud@https://ollama.com/v1: 262144

PR fix notes

PR #13000: fix(model_metadata): add Ollama-style naming for Gemma 4 context length

Description (problem / solution / changelog)

Summary

Fixes #12976 - Gemma 4 models were incorrectly resolving to 8K context instead of 256K when used via Ollama Cloud.

Problem

The DEFAULT_CONTEXT_LENGTHS dict in agent/model_metadata.py uses substring matching to find context lengths. Ollama Cloud names Gemma 4 as gemma4:31b-cloud, which:

  1. Does NOT match "gemma-4-31b" (the key is hyphenated; the Ollama name is not)
  2. DOES match "gemma" (generic fallback for older models)
  3. Returns 8192 instead of 262144
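
The mismatch can be reproduced with a minimal sketch. The lookup function below is a hypothetical stand-in, not the code in agent/model_metadata.py; it assumes plain substring matching over the dict in insertion order and uses only the two Gemma entries the issue describes.

```python
# Hypothetical stand-in for the pre-fix table; only the two Gemma entries
# named in the issue are reproduced here.
DEFAULT_CONTEXT_LENGTHS = {
    "gemma-4-31b": 256000,
    "gemma": 8192,
}


def lookup_context_length(model_name, table):
    """Return the value of the first key that is a substring of model_name.

    Assumption: the real resolver's matching order is not shown in the issue;
    this sketch walks the dict in insertion order.
    """
    name = model_name.lower()
    for key, length in table.items():
        if key in name:
            return length
    return None


ollama_name = "gemma4:31b-cloud"
print("gemma-4-31b" in ollama_name)  # False: the Ollama name has no hyphens
print("gemma" in ollama_name)        # True: the generic key matches
print(lookup_context_length(ollama_name, DEFAULT_CONTEXT_LENGTHS))  # 8192
```

Whatever order the keys are checked in, the hyphenated key can never match the Ollama name, so the generic entry is the only candidate.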

Solution

Added Ollama-style naming entries:

  • "gemma4": 262144 - matches gemma4:31b-cloud, gemma4:27b
  • "gemma-4-27b": 262144 - explicit 27B variant
  • "gemma3": 131072 - Ollama-style for Gemma 3

Also fixed incorrect value: 256000 -> 262144 (actual 256K = 256 * 1024)
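
A sketch of how the added keys might behave. The resolver here is hypothetical and tries the longest key first, since both "gemma4" and "gemma" are substrings of gemma4:31b-cloud and the specific key has to win; the real resolver may order its checks differently. Applying the corrected 262144 value to "gemma-4-31b" is an assumption, and the model names other than gemma4:31b-cloud are illustrative.

```python
# Entries named in the PR description; the rest of the real table is omitted.
DEFAULT_CONTEXT_LENGTHS = {
    "gemma-4-31b": 262144,  # assumed to be the entry corrected from 256000
    "gemma-4-27b": 262144,
    "gemma4": 262144,       # Ollama-style naming (e.g. gemma4:31b-cloud)
    "gemma3": 131072,       # Ollama-style naming for Gemma 3
    "gemma": 8192,          # generic fallback for older models
}


def resolve_context_length(model_name, table=DEFAULT_CONTEXT_LENGTHS):
    """Hypothetical resolver: the longest matching substring key wins."""
    name = model_name.lower()
    for key in sorted(table, key=len, reverse=True):
        if key in name:
            return table[key]
    return None


assert 256 * 1024 == 262144  # 256K in binary units, hence the corrected value

print(resolve_context_length("gemma4:31b-cloud"))  # 262144
print(resolve_context_length("gemma3:12b"))        # 131072
print(resolve_context_length("gemma:7b"))          # 8192
```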

Testing

  • Updated existing test to expect correct 262144 value
  • Added new test test_gemma4_ollama_style_context for Ollama naming
  • All 3 context length tests pass locally

Files Changed

  • agent/model_metadata.py - Add Gemma 4/3 Ollama-style keys
  • tests/hermes_cli/test_gemini_provider.py - Update tests

Changed files

  • agent/model_metadata.py (modified, +10/-4)
  • tests/hermes_cli/test_gemini_provider.py (modified, +8/-1)


Bug Report: Gemma 4 Models Hardcoded with 8K Context Instead of 256K

Description

The DEFAULT_CONTEXT_LENGTHS dict in agent/model_metadata.py resolves Gemma 4 models to the wrong fallback. The name gemma4:31b-cloud does not contain the more specific key "gemma-4-31b" (256000), so the generic entry "gemma": 8192 matches it instead.

Reproduction

  1. Fresh Hermes install (no context_length_cache.yaml)
  2. Configure Ollama Cloud auxiliary model:

     auxiliary:
       compression:
         provider: ollama-cloud
         model: gemma4:31b-cloud

  3. Run any task requiring context compression

Expected Behavior

gemma4:31b-cloud resolves to 262144 tokens (256K context)

Actual Behavior

Model resolves to 8192 tokens, triggering:

ValueError: Auxiliary compression model gemma4:31b-cloud has a context window of 8,192 tokens, which is below the minimum 64,000 required by Hermes Agent.
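
The error comes from a minimum-context check on the auxiliary model. The sketch below shows one way such a guard could be written; the function name and constant are hypothetical, and only the message text and the 64,000 minimum are taken from the error above.

```python
MINIMUM_CONTEXT_LENGTH = 64_000  # minimum quoted in the error message


def check_auxiliary_context(model, context_length, minimum=MINIMUM_CONTEXT_LENGTH):
    """Raise ValueError when an auxiliary model's context window is too small.

    Hypothetical helper: the real check in Hermes Agent is not shown in the issue.
    """
    if context_length < minimum:
        raise ValueError(
            f"Auxiliary compression model {model} has a context window of "
            f"{context_length:,} tokens, which is below the minimum "
            f"{minimum:,} required by Hermes Agent."
        )
    return context_length


check_auxiliary_context("gemma4:31b-cloud", 262144)  # passes with the correct value

try:
    check_auxiliary_context("gemma4:31b-cloud", 8192)  # the mis-resolved value
except ValueError as exc:
    print(exc)
```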

Root Cause

File: agent/model_metadata.py, lines ~127-129

The hardcoded defaults don't account for Ollama Cloud's naming convention. The model is named gemma4:31b-cloud but the only Gemma 4 entry is "gemma-4-31b": 256000, which doesn't match. The generic "gemma": 8192 fallback catches it instead.

Impact

  • All Ollama Cloud auxiliary functions fail with gemma4:31b-cloud on fresh installs
  • Users must manually add cache entries or patch source code
  • The breakage returns after an update if the cache is cleared

Workaround

Add to ~/.hermes/context_length_cache.yaml:

context_lengths:
  gemma4:31b-cloud@https://ollama.com/v1: 262144
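
The cache key in this entry appears to be the model name and the provider base URL joined by "@". That format is inferred from this single example and is not documented in the issue. A small hypothetical helper that prints the fragment for pasting:

```python
def cache_entry(model, base_url, context_length):
    """Format one context_lengths line in the shape of the entry above.

    Assumption: keys follow <model>@<base_url>, inferred from the one example.
    """
    return f"  {model}@{base_url}: {context_length}"


fragment = "\n".join([
    "context_lengths:",
    cache_entry("gemma4:31b-cloud", "https://ollama.com/v1", 262144),
])
print(fragment)
```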

Suggested Fix

Add explicit entries for Ollama-style naming in agent/model_metadata.py:

"gemma-4": 262144,    # Gemma 4 family
"gemma4": 262144,     # Ollama-style naming (e.g., gemma4:31b-cloud)
"gemma-3": 131072,
"gemma": 8192,        # fallback for older gemma models only

extent analysis

TL;DR

Update the DEFAULT_CONTEXT_LENGTHS dictionary in agent/model_metadata.py to include explicit entries for Ollama-style naming conventions to fix the context length resolution issue for Gemma 4 models.

Guidance

  • Confirm the issue: check that ~/.hermes/context_length_cache.yaml has no manual override for gemma4:31b-cloud, and that DEFAULT_CONTEXT_LENGTHS in agent/model_metadata.py has no Ollama-style "gemma4" key, so the name falls through to "gemma": 8192.
  • Apply the suggested fix by adding explicit entries for Ollama-style naming in agent/model_metadata.py, such as "gemma-4": 262144 and "gemma4": 262144.
  • As a temporary workaround, add a manual entry to ~/.hermes/context_length_cache.yaml to override the context length for gemma4:31b-cloud.
  • Test the fix by running a task that requires context compression with the gemma4:31b-cloud model and verifying that it resolves to the correct context length of 262144 tokens.

Example

DEFAULT_CONTEXT_LENGTHS = {
    # ...
    "gemma-4": 262144,    # Gemma 4 family
    "gemma4": 262144,     # Ollama-style naming (e.g., gemma4:31b-cloud)
    "gemma-3": 131072,
    "gemma": 8192,        # fallback for older gemma models only
}

Notes

The suggested fix assumes that the Ollama-style naming convention is consistent and can be accounted for with explicit entries in the DEFAULT_CONTEXT_LENGTHS dictionary. If the naming convention is subject to change, a more dynamic solution may be necessary.
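
One possible shape for a more dynamic approach is to normalize separators before matching, so that a single "gemma-4" key covers both gemma-4-31b and gemma4:31b-cloud. This is only a sketch under that assumption; the function names are hypothetical and nothing here is taken from the Hermes codebase.

```python
CONTEXT_LENGTHS = {
    "gemma-4": 262144,
    "gemma-3": 131072,
    "gemma": 8192,
}


def normalize(name):
    """Lowercase and drop hyphens/underscores so 'gemma-4' aligns with 'gemma4'."""
    return name.lower().replace("-", "").replace("_", "")


def resolve(model_name, table=CONTEXT_LENGTHS):
    """Match normalized keys against the normalized name, longest key first."""
    name = normalize(model_name)
    for key in sorted(table, key=lambda k: len(normalize(k)), reverse=True):
        if normalize(key) in name:
            return table[key]
    return None


print(resolve("gemma4:31b-cloud"))  # 262144
print(resolve("gemma-4-31b"))       # 262144
print(resolve("gemma:7b"))          # 8192
```

The trade-off is that normalization can create false matches: a hypothetical name such as gemma-4b would collapse to gemma4b and be treated as Gemma 4, so explicit entries may still be the safer choice.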

Recommendation

Apply the workaround by adding a manual entry to ~/.hermes/context_length_cache.yaml until the suggested fix can be implemented and verified. This will allow users to continue using the gemma4:31b-cloud model without encountering the context length resolution issue.
