hermes - Fix: Invalid skill availability state after context compression can corrupt active task execution

Hermes context compression currently summarizes skill_view() outputs as metadata-only entries, which can create a false skill availability state after pruning.

This can cause:

  • incorrect reuse of pruned skills;
  • repeated reload loops;
  • reasoning contamination during active task execution.

I have written a detailed architectural proposal covering:

  • explicit [SKILL_PRUNED] markers;
  • runtime-maintained skill state tracking;
  • task-aware skill recovery;
  • execution-loop invalidation after mid-task compression.

The full proposal is attached below.

Proposal: Fix Invalid Skill Availability State & Execution Safety After Context Compression

Summary

Hermes Agent currently compresses old tool results during long sessions. When the compressed tool result comes from skill_view(), the original skill content is removed and replaced by a compact metadata line such as: [skill_view] name=docker-management (12,345 chars)

This creates a critical ambiguity: the conversation history shows that a skill was previously loaded, but the actual skill instructions are no longer available in the model's active context.

The real issue: The runtime/context system exposes a false availability state to the model after compression. The model sees historical evidence that a skill was loaded, while the actual operational instructions no longer exist in context.

This produces a "Ghost Skill" effect (informally) or, more accurately: Invalid Skill Availability State.

Problem

In agent/context_compressor.py, skill-related tool outputs are currently summarized like normal logs. The summary keeps only the tool name, skill name, and character count.

Example current behavior: [skill_view] name=docker-management (12,345 chars)

The compressed representation does not clearly signal that the skill content is no longer usable. The model may infer:

"I already loaded this skill earlier, so I can continue relying on it."

But in reality, the skill manual has been removed from the active context.

Critical Edge Case: Partial Skill Preservation

The current pruning may also leave fragments of skill content after multiple compression cycles:

  • Beginning preserved but end truncated
  • Procedural instructions cut mid-section
  • Successive summaries that cover the skill only partially

This is particularly dangerous because the model may believe the skill is still complete while only fragments remain. Partial skill retention must be treated as pruned.
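One way the runtime could decide whether a skill is still complete is to fingerprint the full content at load time and compare it later. The following is a minimal sketch under that assumption; `SkillFingerprint`, `fingerprint`, and `skill_state` are hypothetical names and not part of Hermes.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SkillFingerprint:
    """Digest and length of a skill's full content, recorded at load time."""
    sha256: str
    length: int


def fingerprint(content: str) -> SkillFingerprint:
    """Hash the complete skill text so later copies can be compared exactly."""
    data = content.encode("utf-8")
    return SkillFingerprint(hashlib.sha256(data).hexdigest(), len(content))


def skill_state(loaded: SkillFingerprint, content_in_context: Optional[str]) -> str:
    """Return "available" only if the in-context text matches the loaded skill.

    Anything else, including a truncated fragment, collapses into "pruned".
    """
    if content_in_context is None:
        return "pruned"
    return "available" if fingerprint(content_in_context) == loaded else "pruned"
```

With this check, a skill whose beginning survives but whose end was cut off compares unequal to the recorded fingerprint and is reported as pruned, which is the behavior the edge case above asks for.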

Critical Edge Case: Mid-Execution Compression (Reasoning Contamination)

Compression can trigger during an active execution loop or reasoning chain:

  1. skill_view(docker-management)
  2. Reasoning phase based on skill
  3. Automatic compression triggers → skill becomes pruned
  4. Reasoning continues → additional task actions execute from incomplete context

This creates Reasoning Contamination: the model generates incorrect procedural assumptions from pruned context. Even if the skill is reloaded later, the reasoning chain remains corrupted.

Impact

This is especially risky during long-running tasks.

Conversation mode

The agent may answer from general memory instead of using the exact skill instructions.

Task mode (Critical)

If compression happens mid-operation (Docker deployment, file manipulation, environment setup), the agent may continue "blind" without its manual. This can lead to:

  • Incorrect execution steps
  • Missed project-specific constraints
  • Use of generic knowledge instead of skill-specific instructions
  • Failed tasks
  • Potential damage to the user environment in high-impact workflows

P0 — Required Safety Fix: Explicit Pruned Marker

The compressed representation of a pruned skill_view() result must explicitly state that the skill content was removed and must be reloaded before use.

Current behavior

[skill_view] name=docker-management (12,345 chars)

Proposed behavior

[skill_view] name=docker-management (12,345 chars) [SKILL_PRUNED: content lost in compression; reload with skill_view before relying on it]

This makes the state unambiguous for the model.

Suggested Code Change

In agent/context_compressor.py, inside _summarize_tool_result, apply the strict pruning marker specifically to skill_view():

# skill_view carries the actual skill manual, so its loss is flagged explicitly.
if tool_name == "skill_view":
    name = args.get("name", "?")
    return (
        f"[{tool_name}] name={name} ({content_len:,} chars) "
        "[SKILL_PRUNED: content lost in compression; "
        "reload with skill_view before relying on it]"
    )

# skills_list and skill_manage carry no manual content; keep the plain summary.
if tool_name in {"skills_list", "skill_manage"}:
    name = args.get("name", "?")
    return f"[{tool_name}] name={name} ({content_len:,} chars)"

Important: This stricter marker applies to skill_view because its result contains the actual procedural/manual content. skills_list and skill_manage keep the plain metadata summary.
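To try the marker outside the codebase, the fragment can be wrapped in a standalone function. This is only a sketch: `summarize_skill_tool_result` and its signature are hypothetical, and the real `_summarize_tool_result` may receive its inputs differently.

```python
def summarize_skill_tool_result(tool_name: str, args: dict, content: str) -> str:
    """Hypothetical standalone version of the skill branch of the summarizer.

    Only the three skill tools named in the proposal are handled here.
    """
    content_len = len(content)
    name = args.get("name", "?")
    summary = f"[{tool_name}] name={name} ({content_len:,} chars)"
    if tool_name == "skill_view":
        # Only skill_view holds the manual, so only it gets the pruned marker.
        summary += (
            " [SKILL_PRUNED: content lost in compression; "
            "reload with skill_view before relying on it]"
        )
    return summary


print(summarize_skill_tool_result("skill_view", {"name": "docker-management"}, "x" * 12345))
```

For a 12,345-character result this prints the same line shown under "Proposed behavior", while skills_list and skill_manage results keep the shorter metadata-only form.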

P1 — Required Behavior Rule

Add this rule to the system prompt:

A summarized skill_view result is not an active or usable skill.

If a skill_view result contains [SKILL_PRUNED] or only metadata (name/character count), treat the skill content as unavailable.

If an in-progress task requires that skill, reload it with skill_view() before continuing. If no task is in progress, do not reload automatically; wait until the skill is needed.
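The same rule can be mirrored on the runtime side so that the check does not depend on the model alone. The sketch below assumes the summary format shown in this proposal; `skill_content_unavailable` is a hypothetical helper.

```python
import re

# Matches a metadata-only summary such as
# "[skill_view] name=docker-management (12,345 chars)".
_METADATA_ONLY = re.compile(r"^\[skill_view\] name=\S+ \([\d,]+ chars\)\s*$")


def skill_content_unavailable(tool_result: str) -> bool:
    """Return True if a skill_view result is pruned or contains only metadata."""
    text = tool_result.strip()
    return "[SKILL_PRUNED" in text or bool(_METADATA_ONLY.match(text))
```

A result that still holds the full manual matches neither condition and is treated as usable. Note that this is a plain text check, so a skill whose own content quotes the marker would be misclassified.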

P2 — Required Efficiency Fix: Runtime-Maintained Skill State Tracking

A marker alone fixes the safety issue, but does not solve repeated reload behavior after multiple compression cycles.

The compression summary should include a compact skill state section:

## Skills Loaded This Session

- docker-management: pruned — reload before use if required by active task
- hermes-agent: available in current context

Critical Architecture: Runtime Truth > Model Memory

The skill-state section must be generated and updated by the compression/runtime layer, NOT inferred by the LLM.

LLMs are probabilistic systems and should not be treated as authoritative state managers. The source of truth must come from the runtime layer.

Implementation approach: The compressor should scan messages after pruning and automatically generate the tracking section.

State Semantics (Binary Model)

Only two states should exist from the model perspective:

  • available — Full skill_view() content is present in active context
  • pruned — Content was lost (fully or partially)

Partial preservation must collapse into pruned unless the runtime can guarantee full content integrity.
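A rough sketch of how the compressor might generate this section after pruning is shown below. The message shape (dicts with `tool_name`, `args`, and an `intact` flag set by the runtime) and the function name `build_skill_state_section` are assumptions made for illustration, not the actual Hermes data model.

```python
# "\u2014" is the dash used in the proposal's example section.
PRUNED_NOTE = "pruned \u2014 reload before use if required by active task"
AVAILABLE_NOTE = "available in current context"


def build_skill_state_section(messages: list[dict]) -> str:
    """Scan tool results after pruning and render the skill-state section."""
    states: dict[str, str] = {}
    for msg in messages:
        if msg.get("tool_name") != "skill_view":
            continue
        name = msg.get("args", {}).get("name", "?")
        # The runtime decides the state; a later skill_view for the same
        # skill (a reload) overrides the earlier entry.
        states[name] = "available" if msg.get("intact") else "pruned"

    lines = ["## Skills Loaded This Session", ""]
    for name, state in states.items():
        note = AVAILABLE_NOTE if state == "available" else PRUNED_NOTE
        lines.append(f"- {name}: {note}")
    return "\n".join(lines)
```

Because the state is computed from what the runtime knows about each message, the model never has to infer whether a skill is still loaded, and only the two states described above can appear.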

P3 — Task-Aware Skill Recovery

For long-running tasks, the compression summary should also track which skills are required by the current task:

## Required Skills For Current Task

- docker-management
- hermes-agent

Then, after compression, the runtime can automatically detect:

  • Required skills for the active task
  • Skills marked as pruned
  • Skills still available in current context

If a required skill is marked as pruned, the agent should be instructed to reload it before taking further task actions.
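The detection step amounts to comparing the required list against the tracked states. A minimal sketch follows, with hypothetical helper names; it also assumes that a required skill with no recorded state should be handled like a pruned one.

```python
from typing import Optional


def skills_to_reload(required: list[str], states: dict[str, str]) -> list[str]:
    """Return required skills that are not currently available, in task order."""
    # Assumption: a required skill with no recorded state is treated as pruned.
    return [name for name in required if states.get(name) != "available"]


def reload_instruction(required: list[str], states: dict[str, str]) -> Optional[str]:
    """Build the instruction for the agent, or None if nothing must be reloaded."""
    missing = skills_to_reload(required, states)
    if not missing:
        return None
    calls = ", ".join(f'skill_view(name="{name}")' for name in missing)
    return f"Required skills are pruned. Reload before taking further task actions: {calls}"
```

Returning None when every required skill is available keeps the agent from reloading skills it does not need, which matters for avoiding repeated reloads across compression cycles.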

P4 — Execution Loop Invalidation (Critical Runtime Safety)

Reloading a skill (P3) is not sufficient if earlier reasoning steps were already generated from incomplete skill context. The reasoning process itself must be refreshed to prevent "Reasoning Contamination".

Required behavior

If compression prunes a skill required by the currently active reasoning/task chain, the current execution loop must be considered invalid.

Required recovery sequence

  1. Detect skill invalidation during active execution loop
  2. Reload required skill(s)
  3. Re-evaluate reasoning/task state (Force refresh of assumptions)
  4. Resume execution only after reasoning refresh
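The recovery sequence above could be wired into a compression hook roughly as follows. This is a sketch under stated assumptions: `ExecutionLoop`, `on_compression`, and the `reload_skill` and `replan` callbacks are hypothetical and stand in for whatever the runtime actually uses to reload skills and refresh reasoning.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ExecutionLoop:
    """Hypothetical model of one active reasoning/task chain."""
    required_skills: set[str]
    valid: bool = True
    log: list[str] = field(default_factory=list)


def on_compression(
    loop: ExecutionLoop,
    pruned: set[str],
    reload_skill: Callable[[str], None],
    replan: Callable[[], None],
) -> list[str]:
    """Run the recovery sequence if compression pruned a required skill.

    Returns the names of the required skills that were lost and reloaded.
    """
    lost = sorted(loop.required_skills & pruned)
    if not lost:
        return []  # nothing required was lost; the loop stays valid

    loop.valid = False  # 1. detect invalidation
    loop.log.append("invalidated")
    for name in lost:  # 2. reload required skill(s)
        reload_skill(name)
        loop.log.append(f"reloaded:{name}")
    replan()  # 3. re-evaluate reasoning/task state
    loop.log.append("replanned")
    loop.valid = True  # 4. resume only after the refresh
    return lost
```

The point of the ordering is that the loop is marked valid again only after `replan` has run, so a reload alone never lets execution continue on assumptions made without the skill.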

Why this matters

Without execution-loop invalidation, the runtime may allow: invalid reasoning → valid reload → invalid continuation

This creates a subtle but dangerous failure mode where the skill technically exists again, but the reasoning chain remains corrupted. This is critical for Docker orchestration, deployment workflows, and infrastructure modification.

Recommended Priority

  1. P0 — Safety (Required): Add the [SKILL_PRUNED] marker to compressed skill_view() results.
  2. P1 — Behavior (Required): Add a system prompt rule stating that summarized/pruned outputs are not usable.
  3. P2 — Efficiency (Required): Add a compact "Skills Loaded This Session" section maintained by the runtime.
  4. P3 — Task Recovery (Required): Track required skills for active tasks.
  5. P4 — Execution Safety (Critical): Invalidate reasoning chains generated from incomplete skill context.

Acceptance Criteria

A fix can be considered successful if:

  1. Compressed skill_view() outputs clearly indicate that the skill content is no longer available.
  2. The agent does not rely on a pruned skill from memory during an active task.
  3. If a task is in progress and requires a pruned skill, the agent reloads the skill before continuing.
  4. In normal conversation, the agent does not reload skills unnecessarily.
  5. Multiple compression cycles do not cause uncontrolled reload loops.
  6. A skill is only marked available if its full content is present in active context.
  7. Partial skill preservation is treated as pruned unless full integrity can be guaranteed.
  8. The "Skills Loaded This Session" section is maintained by the runtime.
  9. Reasoning chains generated from incomplete skill context are invalidated and refreshed.

Conclusion

The current behavior creates a critical ambiguity between:

  • "this skill was loaded earlier" (historical evidence)
  • "this skill is still usable now" (operational reality)

A summarized skill_view() entry is historical evidence. It is NOT proof that the skill remains operationally available.

The safest fix is to make compressed skill_view() results explicitly state that the content has been pruned and must be reloaded before use.

The full robust solution requires runtime-maintained skill-state tracking and execution loop invalidation.

Core principle: Skill recovery must restore both skill availability and reasoning validity. The runtime must treat skill loss during active execution as a reasoning-integrity event, not merely a missing-context event.

final_github_issue_skill_compression.md
