openclaw - Fix skills.entries.*.enabled: false silently ignored on existing sessions due to stale skillsSnapshot (cache-invalidation gap)

openclaw/openclaw#73765 (fetched 2026-04-29 06:15:22)


Summary

skills.entries[name].enabled: false is silently ignored for existing long-lived sessions if the user adds the override after the session's skillsSnapshot was first persisted. The disabled skills keep appearing in the model's <available_skills> block until the session expires past the freshness window, the user runs /new, or the gateway happens to be running and watching the config when the user edits it.

This is the same root-cause family as #22517 (closed stale, never fixed) and #50468 (chokidar watcher), but the user-facing symptom here is the inverse: instead of new skills not showing up, disabled skills don't go away. The fix surface is the same: skill snapshot invalidation relies only on an in-process globalVersion counter that resets to 0 on gateway restart and is never persisted.

Documentation states unambiguously (docs/tools/skills.md):

enabled: false disables the skill even if it's bundled/installed.

…but in practice this only holds for sessions created (or rebuilt) after the config edit, in the same gateway process that observed the edit.

Reproduction

  1. Use a sticky chat session (e.g. Telegram) that has been running for longer than the session reset window. Send a message — a skillsSnapshot is persisted to ~/.openclaw/agents/<id>/sessions/sessions.json.
  2. Confirm a discoverable skill (e.g. anything under ~/.agents/skills/) appears in skillsSnapshot.skills for that session entry.
  3. Edit ~/.openclaw/openclaw.json and add:
    "skills": {
      "entries": {
        "<skill-name>": { "enabled": false }
      }
    }
  4. Restart the gateway (or make the config edit while the gateway is stopped, in which case the in-memory bump in gateway/config-reload.ts never fires).
  5. Send another message on the same session within the freshness window.
  6. Re-inspect sessions.json for that session entry — the skillsSnapshot.prompt <available_skills> block still lists <skill-name>. The model also still sees it in the system prompt.
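
Step 6 can be scripted rather than checked by eye. The following is a minimal sketch, not project code: it assumes sessions.json parses to an object keyed by session key and that each entry carries skillsSnapshot.prompt with an <available_skills> block, which is inferred from the field names in this report. findLeakedSkills and the interfaces are hypothetical names.

```typescript
// Assumed shape of sessions.json, inferred from the field names in this report.
interface SkillsSnapshot {
  version?: number;
  prompt?: string;
}
interface SessionEntry {
  skillsSnapshot?: SkillsSnapshot;
}
type SessionsFile = Record<string, SessionEntry>;

// Escape a skill name so it can be embedded in a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: for each session, list the disabled skills that still
// appear inside the <available_skills> block of the persisted prompt.
function findLeakedSkills(
  sessions: SessionsFile,
  disabledSkills: string[],
): Record<string, string[]> {
  const leaks: Record<string, string[]> = {};
  for (const [sessionKey, entry] of Object.entries(sessions)) {
    const prompt = entry.skillsSnapshot?.prompt ?? "";
    const block =
      /<available_skills>[\s\S]*?<\/available_skills>/.exec(prompt)?.[0] ?? "";
    const leaked = disabledSkills.filter((name) =>
      // Match the name as a whole token so "trello" does not match "trello-pro".
      new RegExp(`(^|[^\\w-])${escapeRegExp(name)}([^\\w-]|$)`).test(block),
    );
    if (leaked.length > 0) {
      leaks[sessionKey] = leaked;
    }
  }
  return leaks;
}
```

To run it against a real file, parse ~/.openclaw/agents/<id>/sessions/sessions.json with JSON.parse and pass the names that have enabled: false in openclaw.json. Any non-empty result means a stale snapshot is still being served.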

Empirical evidence from one of my live sessions.json files:

Session entry | Snapshot built (UTC) | Has now-disabled baoyu-* skills?
agent:main:telegram:default:direct:… | 2026-04-28 17:05 (before config edit) | YES: baoyu-slide-deck, baoyu-markdown-to-html, baoyu-danger-x-to-markdown, baoyu-danger-gemini-web, acp-router, all explicitly enabled:false
agent:main:main | 2026-04-28 19:13 (after config edit) | NO: filter applied correctly

Both snapshots have version: 0 in sessions.json, same machine, same config, same code — the only difference is the rebuild timestamp. So the filter logic itself works; the bug is purely stale-cache.

Root cause

The filter at src/agents/skills/config.ts:73-105 correctly returns false for any entry where skills.entries[skillKey].enabled === false, and buildWorkspaceSkillSnapshot() is invoked with the live config (src/agents/agent-command.ts:599-614). I verified by replaying shouldIncludeSkill() standalone against the user's config and the leaking skills' frontmatter — they're correctly rejected.

The bug is in the cache-invalidation path:

// src/agents/agent-command.ts:580-587
const skillsSnapshotVersion = getSkillsSnapshotVersion(workspaceDir);
const currentSkillsSnapshot = sessionEntry?.skillsSnapshot;
const shouldRefreshSkillsSnapshot =
  !currentSkillsSnapshot ||
  shouldRefreshSnapshotForVersion(currentSkillsSnapshot.version, skillsSnapshotVersion) ||
  !matchesSkillFilter(currentSkillsSnapshot.skillFilter, skillFilter);
const needsSkillsSnapshot = isNewSession || shouldRefreshSkillsSnapshot;

// src/agents/skills/refresh-state.ts:65-72
export function shouldRefreshSnapshotForVersion(cachedVersion?, nextVersion?): boolean {
  const cached = typeof cachedVersion === "number" ? cachedVersion : 0;
  const next   = typeof nextVersion   === "number" ? nextVersion   : 0;
  return next === 0 ? cached > 0 : cached < next;
}

globalVersion lives in module-scope state in refresh-state.ts:9 and resets to 0 on every restart. Persisted SkillSnapshot.version is also 0 for every entry on my disk. So shouldRefreshSnapshotForVersion(0, 0) returns false and the cached snapshot is reused — regardless of how much the underlying openclaw.json has changed.
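
The comparison is easy to check in isolation. The sketch below restates the function quoted above with explicit parameter types (an assumption, so that it compiles standalone) and prints the result for a few version pairs. The nonzero values are illustrative only; this report does not say what values a bump actually produces.

```typescript
// Same comparison as quoted above, restated with explicit parameter types
// (assumed) so the snippet runs on its own.
function shouldRefreshSnapshotForVersion(
  cachedVersion?: number,
  nextVersion?: number,
): boolean {
  const cached = typeof cachedVersion === "number" ? cachedVersion : 0;
  const next = typeof nextVersion === "number" ? nextVersion : 0;
  return next === 0 ? cached > 0 : cached < next;
}

// [cached snapshot version, in-memory version, what the case represents]
const versionCases: Array<[number | undefined, number, string]> = [
  [0, 0, "after restart, snapshot persisted with version 0"],
  [undefined, 0, "after restart, snapshot persisted without a version"],
  [0, 1, "watcher bumped the version in this process"],
  [1, 1, "snapshot already rebuilt after the bump"],
  [1, 0, "after restart, snapshot persisted with a bumped version"],
];

for (const [cached, next, label] of versionCases) {
  const refresh = shouldRefreshSnapshotForVersion(cached, next);
  console.log(`${label}: refresh=${refresh}`);
}
```

Only the first two rows describe the failure reported here: with both sides at 0 the function returns false, so nothing in the version check can notice a config edit made before the restart.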

The only invalidation mechanism left is bumpSkillsSnapshotVersion, called from gateway/config-reload.ts:268-276 when the watcher sees a skills.* path change. That path fires only while the gateway is up and watching, and the bumped value is in-memory only, so it is lost the next time the gateway restarts.

There is no fallback (config hash, meta.lastTouchedAt comparison, mtime check) that would catch stale snapshots after restart. Confirmed by searching the repo for configHash/skillsConfigHash/mtime in the skill snapshot path — nothing relevant exists.

Why some enabled: false skills appear and others don't

A user observing this bug may notice that only some of their disabled skills leak. In my case, the leaking ones (baoyu-*, acp-router) are all on the active loader path (~/.agents/skills/ and bundled extensions); the apparently-correctly-filtered ones (wacli, voice-call, trello, etc.) simply don't exist on any active root — they're only in a stale ~/.openclaw/extensions/node_modules/openclaw.stale-… backup directory, so they were never loaded into any snapshot in the first place. This is unrelated to enabled: false and is easy to mistake for "partial filter behavior."

Suggested fix

Either one would close the gap; combined is best:

  1. Persist a content hash with the snapshot. Compute a stable hash over config.skills plus the resolved skill roots' enumeration, store it as SkillSnapshot.inputsHash. In agent-command.ts, refresh whenever the recomputed hash differs. Survives restarts; doesn't depend on the watcher.
  2. Persist bumpSkillsSnapshotVersion to disk. Save current globalVersion/workspaceVersions to a tiny JSON next to sessions.json on bump; load it at startup. Simpler change but only catches edits observed by the watcher.
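
Option 1 could be sketched roughly as follows. Everything here is hypothetical: computeSkillsInputsHash, stableStringify, and shouldRefreshForInputs do not exist in the codebase as far as this report shows, and SkillSnapshot.inputsHash is the field name proposed above. The sketch assumes the caller can supply config.skills and a list of skill directories found under the resolved roots.

```typescript
import { createHash } from "node:crypto";

// Serialize with object keys sorted so that key order does not change the hash.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((item) => stableStringify(item)).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const parts = Object.keys(record)
      .filter((key) => record[key] !== undefined)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(record[key])}`);
    return `{${parts.join(",")}}`;
  }
  return JSON.stringify(value) ?? "null";
}

// Hypothetical helper: hash everything that determines the snapshot contents.
// `skillsConfig` stands for config.skills; `skillDirs` stands for the
// enumeration of skill directories under the resolved roots.
function computeSkillsInputsHash(skillsConfig: unknown, skillDirs: string[]): string {
  const payload = stableStringify({
    skills: skillsConfig ?? null,
    dirs: [...skillDirs].sort(),
  });
  return createHash("sha256").update(payload).digest("hex");
}

// Hypothetical subset of SkillSnapshot with the proposed field.
interface SnapshotWithHash {
  inputsHash?: string;
}

// Refresh when there is no snapshot, or when it was built from different inputs.
function shouldRefreshForInputs(
  snapshot: SnapshotWithHash | undefined,
  currentHash: string,
): boolean {
  return !snapshot || snapshot.inputsHash !== currentHash;
}
```

Snapshots persisted before the field exists have no inputsHash, so they would be rebuilt once on the next message and carry a hash from then on. Sorting keys and directory names keeps the hash stable when the config is merely reordered.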

A quick partial mitigation: include meta.lastTouchedAt from openclaw.json in the snapshot, refresh when it differs. Closes most of this bug for ~5 LOC.
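
A sketch of that mitigation, under stated assumptions: meta.lastTouchedAt is treated as a string, and configTouchedAt is a hypothetical field name for the copy stored on the snapshot. Neither the field type nor the helper name comes from the codebase.

```typescript
// Assumed subset of openclaw.json; the type of lastTouchedAt is a guess.
interface ConfigWithMeta {
  meta?: { lastTouchedAt?: string };
}

// Hypothetical subset of SkillSnapshot; configTouchedAt is an invented field
// that would be copied from meta.lastTouchedAt when the snapshot is built.
interface SnapshotWithTouchedAt {
  version?: number;
  configTouchedAt?: string;
}

// Returns true when the snapshot was built against a different config revision.
function isSnapshotStaleForConfig(
  snapshot: SnapshotWithTouchedAt | undefined,
  config: ConfigWithMeta,
): boolean {
  if (!snapshot) {
    return true;
  }
  const current = config.meta?.lastTouchedAt;
  if (current === undefined) {
    // Without a timestamp this check cannot decide; defer to the other checks.
    return false;
  }
  return snapshot.configTouchedAt !== current;
}
```

The result would be OR-ed into shouldRefreshSkillsSnapshot alongside the existing version and filter checks. It cannot see skills added or removed on disk under the skill roots, which is one reason it is only a partial fix compared with a content hash.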

Environment

  • OpenClaw: rebased PR #70071 onto v2026.4.24 + 172 commits (commit 1081067476), but the same code path exists on main: src/agents/skills/refresh-state.ts and src/agents/agent-command.ts are unchanged from upstream.
  • Platform: macOS (Darwin 25.4.0)
  • Channel: Telegram (long-lived sticky session)

Related

  • #22517 (closed stale) — same root cause, opposite symptom (new skills don't show up). PRs #22525 and #22568 attempted fixes; both closed unmerged.
  • #50468 — chokidar v5 glob bug stops the watcher from ever firing, amplifying this bug.
  • #39719 — stale workspace paths in sessions.json, same family of cache-invalidation gap.

extent analysis

TL;DR

To stop disabled skills from lingering in the <available_skills> block of existing long-lived sessions, persist a content hash with the snapshot, or persist the version bumped by bumpSkillsSnapshotVersion to disk.

Guidance

  • The persisted skillsSnapshot is reused after a gateway restart because the version comparison sees 0 on both sides, so an edit to openclaw.json that sets enabled: false on a skill never invalidates it.
  • To address this, consider implementing one of the suggested fixes:
    1. Persist a content hash with the snapshot by computing a stable hash over config.skills and the resolved skill roots' enumeration.
    2. Persist bumpSkillsSnapshotVersion to disk by saving the current globalVersion/workspaceVersions to a tiny JSON next to sessions.json on bump.
  • A quick partial mitigation is to include meta.lastTouchedAt from openclaw.json in the snapshot and refresh when it differs.

Example

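// Example of persisting a content hash with the snapshot.
// computeConfigHash and SkillSnapshot.inputsHash are hypothetical; neither exists yet.
const skillsSnapshotVersion = getSkillsSnapshotVersion(workspaceDir);
const currentSkillsSnapshot = sessionEntry?.skillsSnapshot;
const configHash = computeConfigHash(config.skills);
const shouldRefreshSkillsSnapshot =
  !currentSkillsSnapshot ||
  // Keep the existing version check so watcher-driven bumps still invalidate.
  shouldRefreshSnapshotForVersion(currentSkillsSnapshot.version, skillsSnapshotVersion) ||
  // New: refresh when the snapshot was built from a different skills config.
  currentSkillsSnapshot.inputsHash !== configHash ||
  !matchesSkillFilter(currentSkillsSnapshot.skillFilter, skillFilter);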
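
For comparison, the second option (persisting the version counters) might look like the sketch below. The file location, the SkillsVersionState shape, and both function names are assumptions made for illustration; only the names globalVersion and workspaceVersions come from this report.

```typescript
import * as fs from "node:fs";

// Assumed shape: one global version plus per-workspace versions, mirroring the
// globalVersion/workspaceVersions names used in this report.
interface SkillsVersionState {
  globalVersion: number;
  workspaceVersions: Record<string, number>;
}

// Load persisted versions at startup. A missing or unreadable file falls back
// to zeros, which matches today's behavior after a restart.
function loadVersionState(filePath: string): SkillsVersionState {
  try {
    const raw = fs.readFileSync(filePath, "utf8");
    const parsed = JSON.parse(raw) as Partial<SkillsVersionState>;
    return {
      globalVersion:
        typeof parsed.globalVersion === "number" ? parsed.globalVersion : 0,
      workspaceVersions: { ...(parsed.workspaceVersions ?? {}) },
    };
  } catch {
    return { globalVersion: 0, workspaceVersions: {} };
  }
}

// Save on every bump. Write to a temp file and rename it into place so a crash
// mid-write does not leave a truncated JSON file behind.
function saveVersionState(filePath: string, state: SkillsVersionState): void {
  const tmpPath = `${filePath}.tmp`;
  fs.writeFileSync(tmpPath, JSON.stringify(state), "utf8");
  fs.renameSync(tmpPath, filePath);
}
```

As the issue notes, this only preserves bumps the watcher actually observed. An edit made while the gateway is stopped still goes unnoticed, which is the case the content hash covers.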

Notes

  • Both fixes make changes to openclaw.json show up in the <available_skills> block of existing long-lived sessions instead of only in new ones.
  • The content hash survives restarts and does not depend on the watcher. Persisting the version counters is a simpler change, but it only catches edits the watcher observed.

Recommendation

Apply the first suggested fix: persist a content hash with the snapshot. It is the more robust option because it survives restarts and does not depend on the watcher.
