hermes - 💡(How to fix) Fix [Bug]: Context compression failure uses static placeholder instead of preserved message tail — context permanently lost [1 participants]

WwNeXst · 2026-04-18T11:33:19Z

[hermes] Bug Description When a conversation triggers context compression and the summarisation call fails HTTP 529/500/timeout, etc. , Hermes injects a static… **Bug Description** When a conversation triggers context compression and the summarisation call fails (HTTP 529/500/timeout, etc.), Hermes injects a static fallback text into the conversation instead of the genuinely preserved message tail. The model then completely loses the current task context and responds to stale topics from several rounds ago. **Specific symptoms** - User is working on a task/topic - Conversation hits compression trigger; Hermes calls the LLM for summarisation - Summarisation call fails (observed: MiniMax HTTP 529) - Model receives static placeholder text (\"Summary generation was unavailable. N conversation turns were removed...\") instead of the actual preserved message tail - User asks model to continue the interrupted task - Model starts responding to content from several rounds ago, not the interrupted task - User must re-explain the task from scratch to recover **Contrast with OpenClaw** OpenClaw using the same MiniMax API does NOT have this problem with the same 529 scenario. This suggests the issue is in Hermes fallback handling, not the API provider itself. **Root Cause (preliminary)** Two-part failure: 1. Summarisation LLM call fails due to provider error (529/500/timeout) 2. Fallback mechanism injects a static placeholder message (\"context lost, continue from recent messages\") instead of concatenating the genuinely preserved message tail — model cannot recover the interrupted task from this **Additional Observations** - 529 errors appear to affect summarisation calls specifically (primary model calls may succeed while summarisation fails) - OpenClaw does NOT have this problem with the same MiniMax API key — suggesting Hermes fallback logic is the root cause - After compression failure, the model next response is consistently off-topic - User must re-explain the task from scratch to recover **Environment** - Hermes agent with MiniMax-M2.7 as primary model (provider: minimax-cn) - Context compression enabled - Same MiniMax API key used by OpenClaw (OAuth auth) without this issue - Issue reproduced multiple times in today's sessions **Expected vs Actual Behavior** - Expected: After context compression failure, model should continue naturally from the preserved message tail with full awareness of the in-progress task - Actual: Model loses the thread entirely and responds to stale topics **Proposed Fix** 1. Compare context compression fallback logic between OpenClaw and Hermes — how does OpenClaw preserve context on summarisation failure? 2. Change fallback to concatenate actual preserved messages (the message tail that was explicitly kept) rather than static placeholder text 3. Ensure the preserved message tail is always accessible even when summarisation LLM call fails **Related Issues** - #11914 (same issue, different reporter, MiniMax provider) - #12028 (related: token accounting fallback for reasoning models) - #11821 (related: compression Pass 3 JSON safety fix) - #12072 (related: streaming stall recovery, merged)

hermes2026-04-18 11:33:19

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

NousResearch/hermes-agent#12131•Fetched 2026-04-19 15:25:35

View on GitHub

Comments

Participants

Timeline

Reactions

Author

WwNeXst

Participants

WwNeXst

Error Message

Summarisation LLM call fails due to provider error (529/500/timeout)

Root Cause

Root Cause (preliminary) Two-part failure:

Summarisation LLM call fails due to provider error (529/500/timeout)
Fallback mechanism injects a static placeholder message ("context lost, continue from recent messages") instead of concatenating the genuinely preserved message tail — model cannot recover the interrupted task from this

RAW_BUFFERClick to expand / collapse

Bug Description When a conversation triggers context compression and the summarisation call fails (HTTP 529/500/timeout, etc.), Hermes injects a static fallback text into the conversation instead of the genuinely preserved message tail. The model then completely loses the current task context and responds to stale topics from several rounds ago.

Specific symptoms

User is working on a task/topic
Conversation hits compression trigger; Hermes calls the LLM for summarisation
Summarisation call fails (observed: MiniMax HTTP 529)
Model receives static placeholder text ("Summary generation was unavailable. N conversation turns were removed...") instead of the actual preserved message tail
User asks model to continue the interrupted task
Model starts responding to content from several rounds ago, not the interrupted task
User must re-explain the task from scratch to recover

Contrast with OpenClaw OpenClaw using the same MiniMax API does NOT have this problem with the same 529 scenario. This suggests the issue is in Hermes fallback handling, not the API provider itself.

Root Cause (preliminary) Two-part failure:

Summarisation LLM call fails due to provider error (529/500/timeout)
Fallback mechanism injects a static placeholder message ("context lost, continue from recent messages") instead of concatenating the genuinely preserved message tail — model cannot recover the interrupted task from this

Additional Observations

529 errors appear to affect summarisation calls specifically (primary model calls may succeed while summarisation fails)
OpenClaw does NOT have this problem with the same MiniMax API key — suggesting Hermes fallback logic is the root cause
After compression failure, the model next response is consistently off-topic
User must re-explain the task from scratch to recover

Environment

Hermes agent with MiniMax-M2.7 as primary model (provider: minimax-cn)
Context compression enabled
Same MiniMax API key used by OpenClaw (OAuth auth) without this issue
Issue reproduced multiple times in today's sessions

Expected vs Actual Behavior

Expected: After context compression failure, model should continue naturally from the preserved message tail with full awareness of the in-progress task
Actual: Model loses the thread entirely and responds to stale topics

Proposed Fix

Compare context compression fallback logic between OpenClaw and Hermes — how does OpenClaw preserve context on summarisation failure?
Change fallback to concatenate actual preserved messages (the message tail that was explicitly kept) rather than static placeholder text
Ensure the preserved message tail is always accessible even when summarisation LLM call fails

Related Issues

#11914 (same issue, different reporter, MiniMax provider)
#12028 (related: token accounting fallback for reasoning models)
#11821 (related: compression Pass 3 JSON safety fix)
#12072 (related: streaming stall recovery, merged)

extent analysis

TL;DR

Modify the Hermes fallback mechanism to concatenate the preserved message tail instead of injecting a static placeholder text when the summarisation call fails.

Guidance

Compare the context compression fallback logic between OpenClaw and Hermes to identify differences in handling summarisation failures.
Update the fallback mechanism in Hermes to use the actual preserved message tail, ensuring the model can recover the interrupted task context.
Verify that the preserved message tail is accessible even when the summarisation LLM call fails, to prevent context loss.
Review related issues (#11914, #12028, #11821, #12072) for potential insights into similar problems and their solutions.

Example

No specific code snippet is provided, but the fix involves modifying the fallback logic to use the preserved message tail, e.g., by replacing the static placeholder text with the actual preserved messages.

Notes

The issue seems to be specific to the Hermes fallback handling, as OpenClaw using the same MiniMax API does not exhibit this problem. The proposed fix focuses on modifying the fallback mechanism to preserve the context correctly.

Recommendation

Apply the workaround by modifying the Hermes fallback mechanism to concatenate the preserved message tail, as this directly addresses the identified root cause of the issue.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #batch processing #GPU compatibility #latency issue #model loading

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

hermes - 💡(How to fix) Fix [Bug]: Context compression failure uses static placeholder instead of preserved message tail — context permanently lost [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

TRENDING

hermes - 💡(How to fix) Fix [Bug]: Context compression failure uses static placeholder instead of preserved message tail — context permanently lost [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING