openclaw - ✅ (Solved) Fix Bug: skillsSnapshot never refreshes after gateway restart (version counter resets to 0) [1 pull request, 1 comment, 1 participant]

GitHub stats
openclaw/openclaw#55489 (fetched 2026-04-08 01:38:58)
Comments: 1 · Participants: 1 · Timeline events: 4 · Reactions: 0
Timeline (top): commented ×1, cross-referenced ×1, referenced ×1, subscribed ×1

Long-lived sessions (e.g., the main Telegram session) retain stale skill entries indefinitely — even across gateway restarts. Skills that were renamed or deleted days ago still appear in the available_skills prompt, and new/replacement skills are missing.

Root Cause

The skills snapshot version counters (globalVersion and the workspaceVersions Map in skills-remote-*.js) are held in memory and initialized to 0, so every gateway restart resets them to 0.

The refresh check in session-updates-*.js is:

const shouldRefreshSnapshot = snapshotVersion > 0 
    && (nextEntry?.skillsSnapshot?.version ?? 0) < snapshotVersion;

After a restart:

  • snapshotVersion = getSkillsSnapshotVersion(workspaceDir) = Math.max(0, 0) = 0
  • shouldRefreshSnapshot = 0 > 0 && ... = false ← always false

The persisted snapshot also has version: 0, so the comparison never triggers a rebuild. The watcher works within a single gateway process lifetime (bumps version on SKILL.md add/change/unlink), but that state is lost on restart.
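
The failure mode can be reproduced in isolation. The following is a minimal sketch, not the OpenClaw source: shouldRefreshSnapshot mirrors the check quoted above, while createVersionCounter and its bump method are hypothetical stand-ins for the in-memory counter and the watcher.

```typescript
// Minimal model of the refresh check; not the actual OpenClaw implementation.
// `createVersionCounter` is a hypothetical stand-in for the in-memory counter.
interface VersionCounter {
  current(): number;
  bump(): number;
}

// A new counter models a new gateway process: state starts at `initial`.
function createVersionCounter(initial: number): VersionCounter {
  let version = initial;
  return {
    current: () => version,
    // Assumption: the watcher bumps to a timestamp-like, strictly increasing value.
    bump: () => {
      version = Math.max(version + 1, Date.now());
      return version;
    },
  };
}

// Same shape as the check quoted above.
function shouldRefreshSnapshot(
  storedVersion: number | undefined,
  snapshotVersion: number,
): boolean {
  return snapshotVersion > 0 && (storedVersion ?? 0) < snapshotVersion;
}

// The session entry survives the restart on disk with version 0.
const persistedVersion = 0;

// Gateway restart: the counter is recreated and starts at 0 again.
const counterAfterRestart = createVersionCounter(0);
const refreshAfterRestart = shouldRefreshSnapshot(
  persistedVersion,
  counterAfterRestart.current(),
); // false: 0 > 0 fails

// Within one process lifetime, a watcher bump does trigger a refresh.
counterAfterRestart.bump();
const refreshAfterBump = shouldRefreshSnapshot(
  persistedVersion,
  counterAfterRestart.current(),
); // true

console.log({ refreshAfterRestart, refreshAfterBump });
```

In this model the stale snapshot is only replaced if a SKILL.md change happens while the gateway is running, which matches the behavior described above.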

Fix Action

Fixed

PR fix notes

PR #55495: fix: initialize globalVersion to 1 so skills snapshot refreshes after gateway restart

Description (problem / solution / changelog)

Problem

Long-lived sessions store a skillsSnapshot with version: 0. After a gateway restart, the in-memory globalVersion resets to 0, so the refresh check:

const shouldRefreshSnapshot = snapshotVersion > 0 
    && (nextEntry?.skillsSnapshot?.version ?? 0) < snapshotVersion;

always evaluates to false (0 > 0 is false). The snapshot is never rebuilt, and skills that were added, renamed, or deleted between restarts remain stale in the agent prompt indefinitely.

Fix

Initialize globalVersion to 1 instead of 0 in src/agents/skills/refresh.ts.

This ensures that after any gateway restart, sessions with the default version (0) trigger shouldRefreshSnapshot = true on their first turn, causing a single buildWorkspaceSkillSnapshot() call that picks up the current state of skills on disk.

Verified

Reproduced and fixed on a live system:

Before fix: Main Telegram session showed amazon-shop and walmart-shop (deleted days ago), missing online-shop (the replacement). Multiple gateway restarts did not help. Snapshot version stuck at 0.

After fix: Gateway restart → main session rebuilt snapshot on first turn → all stale entries removed, all current entries present, version bumped to 1.

Side effects

The only change: every session rebuilds its skill snapshot once on the first turn after each gateway restart (stored version 0 < globalVersion 1). This is:

  • Cheap: buildWorkspaceSkillSnapshot reads SKILL.md frontmatter from ~20 files
  • Once per session per restart, not every turn (subsequent turns see version 1 = 1, no refresh)
  • No impact on watcher: bumpVersion still returns Date.now() since Date.now() > 1
  • No race conditions: updateSessionStore uses async lock
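
The "once per session per restart" behavior can be illustrated with a small simulation. This is a sketch under stated assumptions: needsRefresh follows the quoted check, while SessionEntry, runTurn, and the skills array are hypothetical simplifications of the real session store and of buildWorkspaceSkillSnapshot().

```typescript
// Sketch of the post-fix behavior; names other than the quoted check are hypothetical.
interface SessionEntry {
  skillsSnapshot?: { version: number; skills: string[] };
}

const INITIAL_GLOBAL_VERSION = 1; // the fix: was 0

function needsRefresh(entry: SessionEntry, snapshotVersion: number): boolean {
  return snapshotVersion > 0 && (entry.skillsSnapshot?.version ?? 0) < snapshotVersion;
}

// Simulates one turn: rebuilds the snapshot only when the check says so.
// `skillsOnDisk` stands in for whatever buildWorkspaceSkillSnapshot() would read.
function runTurn(
  entry: SessionEntry,
  snapshotVersion: number,
  skillsOnDisk: string[],
): boolean {
  if (!needsRefresh(entry, snapshotVersion)) return false;
  entry.skillsSnapshot = { version: snapshotVersion, skills: [...skillsOnDisk] };
  return true;
}

// Session persisted before the restart, holding a stale skill list.
const mainSession: SessionEntry = {
  skillsSnapshot: { version: 0, skills: ["amazon-shop", "walmart-shop"] },
};

// First turn after restart rebuilds (0 < 1); the second turn does not (1 < 1 fails).
const rebuiltOnFirstTurn = runTurn(mainSession, INITIAL_GLOBAL_VERSION, ["online-shop"]);
const rebuiltOnSecondTurn = runTurn(mainSession, INITIAL_GLOBAL_VERSION, ["online-shop"]);

// A watcher bump to Date.now() is still larger than 1, so it triggers a rebuild.
const rebuiltAfterWatcherBump = runTurn(mainSession, Date.now(), [
  "online-shop",
  "my-new-skill",
]);

console.log({ rebuiltOnFirstTurn, rebuiltOnSecondTurn, rebuiltAfterWatcherBump });
```

In this simplified model the rebuild depends on the stored version being lower than the restarted counter, so an entry already stored at version 1 or higher would compare as up to date after a later restart. Whether the real code resets the stored version is not shown in the source.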

Fixes #55489

Changed files

  • src/agents/skills/refresh.ts (modified, +6/-1)


Reproduction

  1. Create a custom skill ~/.openclaw/skills/my-skill/SKILL.md
  2. Start the gateway, send a message (creates session with skillsSnapshot.version: 0)
  3. Rename the skill folder to my-new-skill
  4. Restart the gateway (openclaw gateway restart)
  5. Send another message in the same session
  6. The agent still sees my-skill in available_skills and does NOT see my-new-skill

Subagent sessions are unaffected because they get fresh sessions each time.

Evidence

From a real deployment — sessions.json shows:

Session                    | Skills count | Has stale amazon-shop | Has correct online-shop
agent:main:main (Telegram) | 16           | ✅ Yes (stale)        | ❌ No
Recent subagents           | 17           | ❌ No                 | ✅ Yes

The amazon-shop skill was consolidated into online-shop days earlier. Multiple gateway restarts occurred in between. The gateway's live skill scanner (openclaw skills list) correctly shows online-shop, but the main session's snapshot was never updated.

Impact

  • Agent tries to load SKILL.md files that don't exist → skill matching fails silently
  • Agent misses the replacement skill → can't perform tasks it should be able to
  • Affects the longest-lived, most important sessions (main agent sessions)
  • The skillsSnapshot also accounts for ~83% of sessions.json storage (407 KB / 490 KB) due to duplication across sessions

Suggested Fix

Simplest (one-line): Initialize globalVersion to 1 instead of 0 in skills-remote-*.js:

let globalVersion = 1;  // was: 0

This ensures that after any gateway restart, snapshotVersion = 1, and any session with the default version: 0 triggers shouldRefreshSnapshot = true, forcing a fresh buildWorkspaceSkillSnapshot().

Alternative fixes:

  • Persist the version counter to disk (survives restarts)
  • Always rebuild on first turn after gateway start (compare gatewayStartedAt vs snapshot creation time)
  • Validate snapshot entries against disk on load (stat check that SKILL.md paths exist)
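
Two of these alternatives can be sketched as pure functions. Everything here is an assumption made for illustration: the SkillEntry and SkillsSnapshot shapes, the createdAt and skillFilePath fields, and both function names are hypothetical and not taken from the OpenClaw code.

```typescript
// Sketches of two of the alternatives above. All names and shapes are hypothetical.
interface SkillEntry {
  name: string;
  skillFilePath: string; // assumed: path to the skill's SKILL.md
}

interface SkillsSnapshot {
  version: number;
  createdAt: number; // assumed: epoch milliseconds when the snapshot was built
  skills: SkillEntry[];
}

// Alternative: rebuild when the snapshot predates the current gateway process.
function isOlderThanGateway(snapshot: SkillsSnapshot, gatewayStartedAt: number): boolean {
  return snapshot.createdAt < gatewayStartedAt;
}

// Alternative: validate entries against disk on load. `exists` is injected so the
// function stays testable; in Node it could be fs.existsSync.
function findMissingSkills(
  snapshot: SkillsSnapshot,
  exists: (path: string) => boolean,
): string[] {
  return snapshot.skills
    .filter((skill) => !exists(skill.skillFilePath))
    .map((skill) => skill.name);
}

// Example: a snapshot that still lists a skill whose SKILL.md is gone.
const staleSnapshot: SkillsSnapshot = {
  version: 0,
  createdAt: 1_000,
  skills: [
    { name: "amazon-shop", skillFilePath: "/skills/amazon-shop/SKILL.md" },
    { name: "online-shop", skillFilePath: "/skills/online-shop/SKILL.md" },
  ],
};

// Fake "disk" containing only the replacement skill.
const filesOnDisk = new Set<string>(["/skills/online-shop/SKILL.md"]);

const missingSkills = findMissingSkills(staleSnapshot, (path) => filesOnDisk.has(path));
const predatesGateway = isOlderThanGateway(staleSnapshot, 2_000);

console.log({ missingSkills, predatesGateway });
```

Either result could serve as an extra rebuild trigger next to the version comparison. The disk check has the advantage of not depending on any counter surviving a restart, at the cost of one existence check per skill on load.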

Environment

  • OpenClaw version: openclaw --version output
  • macOS (arm64), Node v24.13.0
  • Gateway mode: LaunchAgent

extent analysis

Fix Plan

To resolve the issue of stale skill entries in long-lived sessions, we will implement the simplest suggested fix: initializing globalVersion to 1 instead of 0 in skills-remote-*.js. This ensures that after any gateway restart, snapshotVersion will be 1, triggering a refresh of the skills snapshot for sessions with the default version: 0.

Step-by-Step Solution:

  1. Update globalVersion initialization:

let globalVersion = 1; // was: 0

  2. Verify the change:
     • Restart the gateway.
     • Send a message in the same session.
     • Check if the agent sees the updated skills in available_skills.

Verification

To verify that the fix worked:

  • Check the sessions.json file for the updated skills count and the presence of new skills.
  • Test the agent's ability to perform tasks with the updated skills.
  • Validate that the skillsSnapshot version is updated correctly after a gateway restart.

Extra Tips

  • Consider implementing one of the alternative fixes for a more robust solution, such as persisting the version counter to disk or validating snapshot entries against disk on load.
  • Regularly review and update the skills and their versions to ensure the agent has the most up-to-date information.
  • Monitor the sessions.json file size and consider optimizing the skills snapshot storage to reduce storage usage.
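
The version check in the verification steps can be scripted. The sketch below assumes that sessions.json parses to a map from session key to an entry with an optional skillsSnapshot.version, which is consistent with the quoted check but is not confirmed by the source. SessionStore and sessionsBelowVersion are hypothetical names, and the subagent key is made up for the example.

```typescript
// Hypothetical shape: a map from session key to an entry with an optional snapshot.
type SessionStore = Record<string, { skillsSnapshot?: { version?: number } }>;

// Returns the keys of sessions whose stored snapshot version is below `minVersion`.
function sessionsBelowVersion(store: SessionStore, minVersion: number): string[] {
  return Object.keys(store).filter(
    (key) => (store[key].skillsSnapshot?.version ?? 0) < minVersion,
  );
}

// Example data standing in for JSON.parse of the sessions file.
const exampleStore: SessionStore = {
  "agent:main:main": { skillsSnapshot: { version: 0 } },
  "subagent:example": { skillsSnapshot: { version: 1 } },
};

const notYetRefreshed = sessionsBelowVersion(exampleStore, 1);
console.log(notYetRefreshed);
```

After a restart with the fix applied and one message sent in each session, the returned list would be expected to be empty for those sessions.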
