hermes - [Bug]: Hardcoded MINIMUM_CONTEXT_LENGTH = 64_000 deadlocks auto-compression and causes infinite tool loops on high-context models [2 pull requests]


Fix Action

Fixed

Bug Description

During long conversations where context size grows beyond ~60,000 tokens on models with large physical windows (e.g. Gemini 3.5 Flash via direct API or OpenRouter), the background review agent (review_agent) frequently fails to persist new memories.

Because MINIMUM_CONTEXT_LENGTH is hardcoded to 64,000, the context compressor cannot trigger while the history is below 64k tokens. However, the model's structured output degrades before the context reaches 64k: it emits truncated JSON tool calls for fact_store, which are replaced with an empty object after an "Unrepairable tool_call arguments" warning. The memory is never written, so the agent retries it on every subsequent turn, producing an infinite, token-wasting loop.

Steps to Reproduce

  1. Run a session using a model with a massive context window (e.g., gemini-3.5-flash with a 1M token window).
  2. Accumulate conversation history to ~60,000 tokens.
  3. Introduce a new high-priority preference (e.g., "My workstation name is GOD").
  4. Complete the turn and monitor agent.log for the warning.
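To confirm step 4 without reading the log by hand, the warning can be counted per tool. The following is a minimal sketch: the regular expression is derived only from the warning text quoted in this report, and the helper name count_unrepairable is hypothetical.

```python
import re
from collections import Counter

# Pattern derived from the warning quoted in this report; other Hermes
# versions may format the line differently (assumption).
WARNING_RE = re.compile(r"Unrepairable tool_call arguments for (\w+)")


def count_unrepairable(lines):
    """Count 'Unrepairable tool_call arguments' warnings per tool name."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines abbreviated from the warning in this report; the second
# line is a made-up, non-matching entry for illustration.
sample = [
    "WARNING run_agent: Unrepairable tool_call arguments for fact_store",
    "INFO example: unrelated log line",
    "WARNING run_agent: Unrepairable tool_call arguments for fact_store",
]
print(count_unrepairable(sample))
```

Against a real session, the same function could be fed an open file handle for agent.log (the path depends on the deployment). A count that grows by one on every turn would match the loop described in this report.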

Expected Behavior

The system should either:

  1. Automatically trigger context compression once structured tool calling becomes unreliable, even when the context is still below the 64,000-token floor.
  2. Escalate a malformed JSON tool call parsing error to a ContextLengthError so the agent recovers, instead of silently discarding the tool arguments and repeating the call indefinitely.

Actual Behavior

The background review agent gets trapped in a loop, wasting tokens. The log shows:

WARNING run_agent: Unrepairable tool_call arguments for fact_store — replaced with empty object (was: {"action": "add", "category": "user_pref", "content": "Dieter...")

Affected Component

Agent Core (conversation loop, context compression, memory)

Messaging Platform (if gateway-related)

Telegram

Debug Report

N/A - Dynamic Docker NAS Environment (Standard build logs supplied below)

Operating System

Linux (Synology DSM 7.x Docker container)

Python Version

3.13 (Native to latest nousresearch/hermes-agent:latest container)

Hermes Version

v0.14.0

Additional Logs / Traceback (optional)

WARNING run_agent: Unrepairable tool_call arguments for fact_store — replaced with empty object (was: {"action": "add", "category": "user_pref", "content": "User...")

Root Cause Analysis (optional)

  1. Hardcoded Safety Floor: In agent/model_metadata.py, MINIMUM_CONTEXT_LENGTH is pinned to 64_000. The floor takes precedence over custom compression thresholds and overridden model context sizes, because the threshold is computed as: threshold_tokens = max(int(model_context * threshold), MINIMUM_CONTEXT_LENGTH)
  2. Silent Parser Failure: In run_agent.py, when a tool call fails parsing due to truncation under high context, the parser logs a warning and returns {} instead of raising a catchable exception.
  3. The Deadlock: The compressor is blocked from running below 64k, but tool parsing consistently fails at ~60k, creating a loop where the same memory is retried with progressively larger contexts on every subsequent turn.
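The effect of the floor in point 1 can be checked with a small worked example. This sketch mirrors only the expression quoted above; the function name compression_threshold and its floor parameter are hypothetical and are not taken from the Hermes source.

```python
MINIMUM_CONTEXT_LENGTH = 64_000  # value quoted in this report


def compression_threshold(model_context, threshold, floor=MINIMUM_CONTEXT_LENGTH):
    """Mirror the quoted max(...) expression; `floor` is a hypothetical parameter."""
    return max(int(model_context * threshold), floor)


# Model context overridden to 100k, compression requested at 50%:
# the user expects 50_000, but the hardcoded floor wins.
print(compression_threshold(100_000, 0.5))                # 64000

# With a 1M-token window the floor is irrelevant and the threshold is far
# above the ~60k point where tool calls start failing.
print(compression_threshold(1_000_000, 0.5))              # 500000

# With a lower floor, the custom threshold would be respected.
print(compression_threshold(100_000, 0.5, floor=16_000))  # 50000
```

In the first two cases the compressor cannot fire before roughly 60k tokens, which is where the report observes tool call parsing starting to fail.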

Proposed Fix (optional)

  1. Configuration Parameter: Move MINIMUM_CONTEXT_LENGTH out of model_metadata.py constants and make it a configurable property in config.yaml (e.g. compression.minimum_context_floor, allowing overrides down to 16_000).
  2. Truncation-Aware Exception: In run_agent.py, if a tool call argument exhibits truncation signatures (ends with ..., contains open quotes/braces, or is otherwise unrepairable), raise a ValueError("Truncated tool call arguments due to context length limits"), whose message contains 'context length', to proactively trigger the ContextCompressor instead of returning {}.
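The truncation-aware exception in point 2 could look roughly like the sketch below. It is not the run_agent.py implementation: the helper names looks_truncated and parse_tool_arguments are hypothetical, and the assumption that the compressor reacts to an error message containing 'context length' comes from the proposal itself.

```python
import json


def looks_truncated(raw):
    """Heuristic: trailing ellipsis, an unterminated string, or unclosed braces/brackets."""
    text = raw.rstrip()
    if text.endswith("..."):
        return True
    depth = 0
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False          # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False        # closing quote
        elif ch == '"':
            in_string = True             # opening quote
        elif ch in "{[":
            depth += 1
        elif ch in "}]":
            depth -= 1
    return in_string or depth > 0


def parse_tool_arguments(raw):
    """Parse tool call arguments; escalate truncation instead of returning {}."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        if looks_truncated(raw):
            # Message text taken from the proposal; it contains 'context length'.
            raise ValueError(
                "Truncated tool call arguments due to context length limits"
            ) from exc
        raise  # malformed but not truncated: leave the original error intact
```

The check only runs after json.loads has already failed, so valid arguments whose string values happen to contain an ellipsis are not affected. This sketch re-raises non-truncated parse errors unchanged, whereas the proposal would also escalate otherwise unrepairable arguments.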

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR
