hermes - Context window changes to 256K after interrupted compaction and resume [1 pull request]

Fix Action

Fixed


Bug Description

After a Hermes CLI session is interrupted during context compaction and later resumed/continued, the effective context window shown in the status bar can change from the configured 1M tokens to 256K. In the same failure path, compression summary generation can fail with HTTP 502 and Hermes inserts a fallback context marker, which may degrade the resumed session's accuracy.

Steps to Reproduce

  1. Configure a custom OpenAI-compatible provider with a 1M context length:
model:
  default: gpt-5.5
  provider: custom
  base_url: https://redacted.example/openai/v1
  context_length: 1000000

custom_providers:
  - name: private_codex
    base_url: https://redacted.example/openai/v1
    model: gpt-5.5
    models:
      gpt-5.5:
        context_length: 1000000

compression:
  enabled: true
  threshold: 0.8
  target_ratio: 0.2
  2. Start a long Hermes CLI session and let it approach or trigger context compaction.
  3. Have the custom endpoint return HTTP 502 during compression summary generation.
  4. Continue or resume the interrupted session.
  5. Observe the CLI status bar and compaction warnings.
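
To make the HTTP 502 step reproducible without relying on a flaky upstream, a local stub can stand in for the custom endpoint. The following is a minimal sketch using only the Python standard library. It assumes the client sends POST requests to an OpenAI-style `/chat/completions` path under the configured `base_url`; the handler name, helper names, and response body are hypothetical and not part of Hermes.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class Always502Handler(BaseHTTPRequestHandler):
    """Answers every POST with HTTP 502, imitating a failing upstream gateway."""

    def do_POST(self):
        # Drain the request body before replying.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        body = b'{"error": {"message": "Error code: 502"}}'
        self.send_response(502)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def start_stub(port=0):
    """Start the stub on a background thread; port=0 lets the OS pick a free port."""
    server = HTTPServer(("127.0.0.1", port), Always502Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://127.0.0.1:{server.server_port}/openai/v1"


def probe(base_url):
    """POST once to the stub and return the HTTP status code it answers with."""
    request = urllib.request.Request(
        base_url + "/chat/completions",  # assumed OpenAI-compatible path
        data=b"{}",
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    # Bypass any proxy settings from the environment for the local call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(request, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code
```

Pointing `base_url` in the config at the URL returned by `start_stub()` just before compaction triggers should exercise the same retry and fallback path as the failing provider in the report.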

Expected Behavior

  • The resumed/continued session should preserve the configured context length (1000000, displayed as roughly 1M).
  • If summary generation fails, Hermes should not silently insert a fallback marker that drops important middle turns. At minimum, it should make the session state and recovery options explicit.
  • The configured context length should remain stable across interruption, compaction, resume, and continue flows.

Actual Behavior

  • The status bar showed 97.9K/256K even though the active config has model.context_length: 1000000 and the custom provider entry also sets context_length: 1000000.
  • The CLI reported:
compression summary failed: Error code: 502. Inserted a fallback context marker.
Session compressed 3 times — accuracy may degrade. Consider /new to start fresh.
API call failed (attempt 1/3): InternalServerError [HTTP 502]
Provider: custom  Model: gpt-5.5
Endpoint: https://redacted.example/openai/v1
Error: HTTP 502: Error code: 502
...
Max retries (3) exhausted — trying fallback...
API failed after 3 retries — HTTP 502: Error code: 502
Final error: HTTP 502: Error code: 502
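
The effect of the wrong window size can be made concrete with a little arithmetic. The sketch below assumes that compaction triggers once usage reaches `threshold × context_length`, and that the 256K shown in the status bar means 256,000 tokens; neither is confirmed by the report, and the function names are hypothetical.

```python
def compaction_trigger(context_length: int, threshold: float) -> int:
    """Token count at which compaction would start (assumed: threshold * window)."""
    return round(context_length * threshold)


def usage_ratio(used_tokens: int, context_length: int) -> float:
    """Fraction of the window the status bar would consider occupied."""
    return used_tokens / context_length


configured = 1_000_000   # model.context_length from the config
observed = 256_000       # assumed meaning of the 256K shown in the status bar
used = 97_900            # the 97.9K shown in the status bar

print(compaction_trigger(configured, 0.8))  # 800000
print(compaction_trigger(observed, 0.8))    # 204800
print(f"{usage_ratio(used, configured):.1%} vs {usage_ratio(used, observed):.1%}")
```

Under these assumptions, the same session is treated as roughly 38% full instead of about 10% full, and compaction would start near 205K tokens rather than 800K, which is consistent with the session having been compressed three times.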

Additional Observations

  • Running hermes config after the incident confirms the config still contains context_length: 1000000.
  • Directly calling Hermes' model metadata resolver with config_context_length=1000000 returns 1000000, so a fresh initialization should resolve to 1M.
  • The 256K value appears to come from the running/resumed agent's ContextCompressor.context_length, not from the current config file.
  • This suggests a resume/continue or compression failure path may restore or retain a stale/default context length instead of the configured value.
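
A check along the lines of these observations could compare the configured value with whatever the running agent holds. The following is a sketch only: it works on a plain dict shaped like the config in this report, and the lookup order (top-level `model.context_length` first, then the matching `custom_providers` entry) is an assumption, not Hermes' documented resolution order. Both function names are hypothetical.

```python
from typing import Optional

CONFIG = {
    "model": {"default": "gpt-5.5", "provider": "custom", "context_length": 1000000},
    "custom_providers": [
        {
            "name": "private_codex",
            "model": "gpt-5.5",
            "models": {"gpt-5.5": {"context_length": 1000000}},
        }
    ],
}


def configured_context_length(config: dict) -> Optional[int]:
    """Return the context length the config asks for, or None if unset."""
    model = config.get("model", {})
    if model.get("context_length"):
        return int(model["context_length"])
    # Fall back to a per-model entry under custom_providers (assumed order).
    name = model.get("default")
    for provider in config.get("custom_providers", []):
        entry = provider.get("models", {}).get(name, {})
        if entry.get("context_length"):
            return int(entry["context_length"])
    return None


def context_length_drift(config: dict, runtime_length: int) -> Optional[str]:
    """Return a warning string when the runtime value differs from the config."""
    expected = configured_context_length(config)
    if expected is None or expected == runtime_length:
        return None
    return (
        f"runtime context length {runtime_length} differs from "
        f"configured context length {expected}"
    )


# 256000 stands in for the value read from the resumed agent's compressor.
print(context_length_drift(CONFIG, 256000))
```

Running such a check right after resume would distinguish a stale runtime value from a config problem, since a fresh resolution already returns 1000000.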

Environment

  • OS: Windows 10
  • Shell: Git Bash/MSYS via Hermes terminal backend
  • Hermes profile: default
  • Provider: custom OpenAI-compatible endpoint
  • Model: gpt-5.5
  • Configured context length: 1000000
  • Observed status bar context length after interruption/continue: 256K

Possible Fix Direction

  • Ensure resumed/continued sessions rehydrate ContextCompressor.context_length from the active model/custom provider config, not from stale runtime state or fallback metadata.
  • Consider making compression.abort_on_summary_failure default to safer behavior for transient provider failures, or avoid inserting a fallback marker that drops middle context when the summary request fails.
  • Surface a clearer warning when the runtime context length differs from the configured model.context_length for the active provider/model.
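
The first two directions can be sketched together. The code below is illustrative only and does not reflect Hermes' actual classes: `CompressorSketch` is a hypothetical stand-in for `ContextCompressor`, `resume_session` and `compact` are invented names, and the head/tail split is a simplification. It shows a resume path that takes the window size from the active config, and a compaction step that leaves history untouched when the summary call fails.

```python
from dataclasses import dataclass, field
from typing import Callable, List


class SummaryError(Exception):
    """Raised by the summarizer when the provider call fails (e.g. HTTP 502)."""


@dataclass
class CompressorSketch:
    # Hypothetical stand-in for ContextCompressor; only the fields needed here.
    context_length: int
    messages: List[str] = field(default_factory=list)


def resume_session(saved_state: dict, configured_length: int) -> CompressorSketch:
    """Rebuild the compressor, taking context_length from the active config."""
    # Any context_length persisted in saved_state is deliberately ignored.
    return CompressorSketch(
        context_length=configured_length,
        messages=list(saved_state.get("messages", [])),
    )


def compact(
    compressor: CompressorSketch,
    summarize: Callable[[List[str]], str],
    keep_head: int = 1,
    keep_tail: int = 2,
    abort_on_summary_failure: bool = True,
) -> bool:
    """Replace the middle turns with a summary; return True if history changed."""
    msgs = compressor.messages
    if len(msgs) <= keep_head + keep_tail:
        return False  # nothing in the middle to summarize
    split = len(msgs) - keep_tail
    head, middle, tail = msgs[:keep_head], msgs[keep_head:split], msgs[split:]
    try:
        summary = summarize(middle)
    except SummaryError:
        if abort_on_summary_failure:
            return False  # keep the full history so a later retry can succeed
        summary = "[fallback context marker]"  # lossy path described in the report
    compressor.messages = head + [summary] + tail
    return True
```

With `abort_on_summary_failure=True`, a transient 502 costs one skipped compaction rather than the middle of the conversation; the trade-off is that the session stays near its limit until the provider recovers.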
