openclaw: Fix memory-wiki lint: wikilink extractor is not markdown-aware; resolver never matches frontmatter titles

openclaw/openclaw#70986 (fetched 2026-04-24 10:37:04)
Summary

openclaw wiki lint produces two classes of false or unfixable "Broken wikilink target" warnings that are rooted in the wikilink extractor and link resolver implementations themselves, not in vault content. In a live vault audit (~150 pages, 84 warnings), nearly all warnings fall into these two buckets:

  1. Parser is markdown-blind — literal [[...]] inside prose/code/transcripts is extracted as a wikilink target, even when it is clearly documentation syntax, an OpenClaw directive tag, or a quoted transcript.
  2. Resolver only indexes slugified path + basename, never frontmatter title — natural-title wikilinks like [[Codex harness plugin 配置清单]] can never resolve, because the resolver's validTargets set only contains post-slugification forms.

These are upstream bugs in the memory-wiki plugin, independent of #64696 (which proposes tag-specific sanitization for [[reply_to_*]]). The fix scope proposed below targets the root causes, which would also cover #64696 without per-tag bandaids.

Installed version: openclaw 2026.4.22 (commit 00bd2cf).

Evidence

Class 1 — parser false positives on literal [[...]]

Sample warnings from Wiki/reports/lint.md:

  • Broken wikilink target `tts:text`.
  • Broken wikilink target `reply_to:<id>`.
  • Broken wikilink target `reply_to_current`.
  • Broken wikilink target `wikilinks`.
  • Broken wikilink target `...`.

Concrete example — reply_to_current occurs 8 times in this one bridged source page, always in plain prose inside transcript text, not as a real wikilink:

Wiki/sources/bridge-workspace-system-architect-1ca28ce0-memory-2026-04-19-provider-name-c3ee55a4 2.md
line 34:  assistant: [[reply_to_current]]
line 124: assistant: [[reply_to_current]]
line 218: assistant: [[reply_to_current]]
...

These are literal OpenClaw reply directives that were captured verbatim when the session transcript was bridged into the vault. Similar patterns occur for [[tts:text]], [[reply_to:<id>]], and for markdown meta-discussions where users literally write [[wikilinks]] or [[...]] as documentation syntax. [[Omitted long matching line]] (system truncation hint) also gets extracted.
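
The false positives described above can be reproduced in isolation. The following is a minimal sketch that runs the OBSIDIAN_LINK_PATTERN quoted from the bundle over a made-up page body; the sample text is an assumption for illustration and is not taken from the audited vault.

```javascript
// Pattern copied from the issue's excerpt of the installed bundle.
const OBSIDIAN_LINK_PATTERN = /\[\[([^\]|]+)(?:\|[^\]]+)?\]\]/g;

// Hypothetical page body: a bridged transcript line plus some meta-discussion.
const sample = [
  "assistant: [[reply_to_current]]",
  "Use `[[tts:text]]` to speak a reply.",
  "Obsidian pages link to each other with [[wikilinks]].",
].join("\n");

// Same extraction loop as the quoted extractWikiLinks,
// minus the related-block strip (that pattern is not shown in the issue).
const targets = [];
for (const match of sample.matchAll(OBSIDIAN_LINK_PATTERN)) {
  const target = match[1]?.trim();
  if (target) targets.push(target);
}

console.log(targets); // [ 'reply_to_current', 'tts:text', 'wikilinks' ]
```

None of the three extracted strings is a real link, yet each would be handed to the resolver and reported as broken.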

Class 2 — resolver cannot match wikilinks written by natural title

All four of the following are real, on-disk, indexed source pages whose frontmatter title exactly matches the [[...]] target — but lint flags them as broken:

| Wikilink in body | Target file that exists | Frontmatter title |
| --- | --- | --- |
| [[Codex harness plugin 配置清单]] | Wiki/sources/codex-harness-plugin-配置清单.md (bridged from OpenClaw/GPT 5 系列/Codex harness plugin 配置清单.md) | Codex harness plugin 配置清单 |
| [[OpenClaw Agent 体系深度分析 — openclaw-agent-system]] | Wiki/sources/openclaw-agent-体系深度分析.md | OpenClaw Agent 体系深度分析 |
| [[深度解析 OpenClaw Prompt Context Harness]] | Wiki/sources/深度解析-openclaw-prompt-context-harness.md (verified on-disk) | 深度解析 OpenClaw Prompt Context Harness |
| [[OpenClaw × Claude Code × Codex 共享 Memory 方案]] | slugified filename drops the × characters | OpenClaw × Claude Code × Codex 共享 Memory 方案 |

Frontmatter of the first target, verified via head:

---
pageType: source
id: source.codex-harness-plugin-配置清单
title: Codex harness plugin 配置清单
sourceType: local-file
sourcePath: /…/OpenClaw/GPT 5 系列/Codex harness plugin 配置清单.md
---

Users naturally write [[Codex harness plugin 配置清单]]. The resolver never matches, because its validTargets set only contains sources/codex-harness-plugin-配置清单 and codex-harness-plugin-配置清单 — the slugified forms.

Root cause (verified in installed source)

1. Parser is not markdown-aware

/opt/homebrew/lib/node_modules/openclaw/dist/cli-BF_4kd_o.js:

// line 180
const OBSIDIAN_LINK_PATTERN = /\[\[([^\]|]+)(?:\|[^\]]+)?\]\]/g;

// line 273
function extractWikiLinks(markdown) {
  const searchable = markdown.replace(RELATED_BLOCK_PATTERN, "");
  const links = [];
  for (const match of searchable.matchAll(OBSIDIAN_LINK_PATTERN)) {
    const target = match[1]?.trim();
    if (target) links.push(target);
  }
}

The regex is run directly against the raw document minus only the openclaw:wiki:related managed block. It does not skip:

  • fenced code blocks (triple backticks or ~~~)
  • inline code (`…`)
  • YAML frontmatter
  • HTML comments
  • Obsidian callouts with code examples

Any document that talks about OpenClaw directives, Obsidian wikilink syntax, or session transcripts will emit spurious targets. Known Issue tts-parser-markdown-blind (our internal tracker) describes the exact same class of bug in the TTS directive parser — this is the wiki parser's sibling.

2. Resolver only knows slugified paths

Same bundle, lines ~2286–2300:

function collectBrokenLinkIssues(pages) {
  const validTargets = new Set();
  for (const page of pages) {
    const withoutExtension = page.relativePath.replace(/\.md$/i, "");
    validTargets.add(withoutExtension);                // "sources/codex-harness-plugin-配置清单"
    validTargets.add(path.basename(withoutExtension)); // "codex-harness-plugin-配置清单"
  }
  const issues = [];
  for (const page of pages)
    for (const linkTarget of page.linkTargets)
      if (!validTargets.has(linkTarget))
        issues.push({ severity: "warning", category: "links", code: "broken-wikilink",});
  return issues;
}

validTargets never contains page.title, so [[<natural title>]] is guaranteed to miss whenever the stored file has been slugified (which is the default for ingested Chinese-title content, any filename with special characters like ×, or long-title files). formatWikiLink() emits Obsidian-style [[slug|Title]] aliases, which the extractor correctly strips via its (?:\|[^\]]+)? group — but anything the user or Librarian writes as [[Title]] bare is unresolvable.
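
To make the asymmetry concrete, here is a small sketch that indexes one page the way the quoted collectBrokenLinkIssues does and then checks an aliased link and a bare-title link against it. The page record shape and the baseName helper are assumptions for illustration; baseName stands in for path.basename so the snippet needs no imports.

```javascript
// Hypothetical page record using the field names mentioned in the issue.
const pages = [
  {
    relativePath: "sources/codex-harness-plugin-配置清单.md",
    title: "Codex harness plugin 配置清单",
  },
];

// Stand-in for path.basename (assumption, keeps the sketch import-free).
const baseName = (p) => p.split("/").pop();

// Same indexing as the quoted resolver: path without extension, plus basename.
const validTargets = new Set();
for (const page of pages) {
  const withoutExtension = page.relativePath.replace(/\.md$/i, "");
  validTargets.add(withoutExtension);
  validTargets.add(baseName(withoutExtension));
}

const OBSIDIAN_LINK_PATTERN = /\[\[([^\]|]+)(?:\|[^\]]+)?\]\]/g;
const body =
  "[[codex-harness-plugin-配置清单|Codex harness plugin 配置清单]] and " +
  "[[Codex harness plugin 配置清单]]";

// The alias group is dropped, so only the part before "|" is looked up.
const results = [...body.matchAll(OBSIDIAN_LINK_PATTERN)].map((m) => ({
  target: m[1].trim(),
  resolves: validTargets.has(m[1].trim()),
}));

console.log(results);
// The aliased link resolves via its slug; the bare-title link does not.
```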

Expected behavior

  1. The wikilink extractor should respect Markdown / MDX structure and not extract [[...]] tokens that appear inside:
    • fenced code blocks (triple backticks and ~~~)
    • inline code (`…`)
    • YAML frontmatter
    • HTML comments
  2. The resolver should additionally accept:
    • a match against page.title (case-insensitive, or case-sensitive + normalized whitespace)
    • optionally, a match after the user's stored slug rule is applied to the extracted target (so both [[Codex harness plugin 配置清单]] and [[codex-harness-plugin-配置清单]] resolve to the same page)
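
One possible shape for the extractor side of this behavior is sketched below. It is regex-based rather than a real Markdown parser, stripNonLinkRegions is a hypothetical helper name, and the RELATED_BLOCK_PATTERN strip from the original function is omitted because that pattern is not shown in the issue. Unclosed fences and multi-backtick inline spans are not handled.

```javascript
// Pattern copied from the issue's excerpt of the installed bundle.
const OBSIDIAN_LINK_PATTERN = /\[\[([^\]|]+)(?:\|[^\]]+)?\]\]/g;

// Hypothetical helper: remove regions where [[...]] is not a real link.
function stripNonLinkRegions(markdown) {
  return markdown
    // Leading YAML frontmatter (only at the very start of the document).
    .replace(/^---\r?\n[\s\S]*?\r?\n---[ \t]*(?:\r?\n|$)/, "")
    // Fenced code blocks opened and closed by the same backtick or tilde run.
    .replace(/^[ \t]*(`{3,}|~{3,})[^\n]*\n[\s\S]*?^[ \t]*\1[ \t]*$/gm, "")
    // HTML comments.
    .replace(/<!--[\s\S]*?-->/g, "")
    // Single-backtick inline code spans.
    .replace(/`[^`\n]+`/g, "");
}

function extractWikiLinks(markdown) {
  const links = [];
  for (const match of stripNonLinkRegions(markdown).matchAll(OBSIDIAN_LINK_PATTERN)) {
    const target = match[1]?.trim();
    if (target) links.push(target);
  }
  return links;
}

// The fence marker is built at runtime so this example's own fence stays intact.
const fence = "`".repeat(3);
const page = [
  "---",
  "title: Example",
  'note: "[[in-frontmatter]]"',
  "---",
  "See [[real-page]] and `[[inline-code]]`.",
  "<!-- [[in-comment]] -->",
  fence + "text",
  "assistant: [[reply_to_current]]",
  fence,
  "Also [[second-page|Second page]].",
].join("\n");

console.log(extractWikiLinks(page)); // [ 'real-page', 'second-page' ]
```

Stripping fences before inline code matters: otherwise the inline-code pattern could consume part of a fence marker and leave the block body exposed to the wikilink regex.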

Proposed fix scope

  • Parser: strip fenced code blocks, inline code, and HTML comments from searchable before the wikilink regex runs. The YAML frontmatter is already addressed upstream; fenced / inline code are the main holes.
  • Resolver: build validTargets as a union of {relativePath sans .md, basename, frontmatter.title, slugify(frontmatter.title), any declared aliases}; compare extracted target against all of them using a normalization function (lowercase, collapse whitespace, unify /-).
  • Docs: if wikilinks by slug remain a blessed form, note it explicitly; but title-based resolution is what every existing Obsidian-vault user expects and what OpenClaw's own formatWikiLink alias rendering implies.
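
The resolver bullet above could be sketched roughly as follows. This is an illustration under stated assumptions, not the plugin's implementation: normalizeTarget, buildValidTargets, the slugify stand-in, the page.aliases field, and the target field on the issue object are all hypothetical, and the "unify /-" step is left out because the issue does not spell it out.

```javascript
// Hypothetical normalization: trim, lowercase, collapse whitespace runs.
function normalizeTarget(value) {
  return value.trim().toLowerCase().replace(/\s+/g, " ");
}

// Stand-in slug rule (assumption): the plugin's real rule is not shown in the issue.
function slugify(title) {
  return title
    .toLowerCase()
    .replace(/\s+/g, "-")
    .replace(/[^\p{L}\p{N}-]+/gu, "")
    .replace(/-{2,}/g, "-");
}

// Union of path, basename, title, slugified title, and declared aliases.
function buildValidTargets(pages) {
  const validTargets = new Set();
  for (const page of pages) {
    const withoutExtension = page.relativePath.replace(/\.md$/i, "");
    const candidates = [
      withoutExtension,
      withoutExtension.split("/").pop(), // basename, import-free
      page.title,
      page.title && slugify(page.title),
      ...(page.aliases ?? []),
    ];
    for (const candidate of candidates) {
      if (candidate) validTargets.add(normalizeTarget(candidate));
    }
  }
  return validTargets;
}

function collectBrokenLinkIssues(pages) {
  const validTargets = buildValidTargets(pages);
  const issues = [];
  for (const page of pages) {
    for (const linkTarget of page.linkTargets ?? []) {
      if (!validTargets.has(normalizeTarget(linkTarget))) {
        // "target" is added here for the demo; the real issue fields are elided in the source.
        issues.push({ severity: "warning", category: "links", code: "broken-wikilink", target: linkTarget });
      }
    }
  }
  return issues;
}

const pages = [
  { relativePath: "sources/codex-harness-plugin-配置清单.md", title: "Codex harness plugin 配置清单", linkTargets: [] },
  {
    relativePath: "notes/index.md",
    title: "Index",
    linkTargets: ["Codex harness plugin 配置清单", "codex-harness-plugin-配置清单", "reply_to_current"],
  },
];

console.log(collectBrokenLinkIssues(pages).map((issue) => issue.target)); // [ 'reply_to_current' ]
```

Normalizing both sides of the comparison keeps the lookup a single Set membership test, so the cost per link stays constant as more target forms are indexed.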

Related

  • #64696 proposes tag-specific sanitization (reply_to_current, reply_to:<id>) at render time. That is a narrower fix for Class 1 and does not address:
    • future OpenClaw tags with the same shape ([[tts:text]], [[embed …]], [[media …]], etc.)
    • general Markdown meta-discussion that literally contains [[wikilinks]]
    • Class 2 at all.

Markdown-awareness in the extractor is the root-cause fix that subsumes the per-tag approach.

Reproduction

  1. In a bridge-mode vault, bridge a session transcript that contains the string [[reply_to_current]] literally in assistant output.
  2. Create sources/example-配置清单.md with frontmatter title: Example 配置清单.
  3. In another page, link it with [[Example 配置清单]].
  4. Run openclaw wiki lint.

Both the transcript directive and the title-based link will be reported as Broken wikilink target ….

Acceptance criteria

  • openclaw wiki lint does not flag [[…]] that sits inside a fenced code block or inline code span.
  • openclaw wiki lint resolves [[<title>]] when any indexed page's frontmatter title equals the target (normalization policy documented).
  • Vaults upgrading from the current implementation stop seeing false-positive Broken wikilink target for OpenClaw directive tags that appear in bridged transcripts, without requiring a tag-specific allowlist.

Environment

  • OpenClaw: 2026.4.22 (00bd2cf), macOS 15.7.3
  • Plugin: dist/extensions/memory-wiki (bundled)
  • Vault: bridge mode, Obsidian render mode, 150 pages

extent analysis

TL;DR

The most likely fix involves modifying the wikilink extractor to be markdown-aware and updating the resolver to include frontmatter titles in its valid targets.

Guidance

  • Modify the wikilink extractor to strip fenced code blocks, inline code, and HTML comments before applying the regex to prevent false positives.
  • Update the resolver to build validTargets as a union of relative path, basename, frontmatter title, and slugified frontmatter title to enable resolution by natural title.
  • Implement a normalization function for comparing extracted targets against valid targets to handle case sensitivity and whitespace differences.
  • Consider documenting the expected behavior for wikilink resolution, including whether slug-based resolution is supported and how title-based resolution works.

Example

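// Example of a modified extractWikiLinks function.
// RELATED_BLOCK_PATTERN and OBSIDIAN_LINK_PATTERN are the constants quoted in the issue.
function extractWikiLinks(markdown) {
  const searchable = markdown
    .replace(RELATED_BLOCK_PATTERN, "")
    .replace(/```[\s\S]*?```/g, "")  // Remove backtick-fenced code blocks
    .replace(/~~~[\s\S]*?~~~/g, "")  // Remove tilde-fenced code blocks
    .replace(/<!--[\s\S]*?-->/g, "") // Remove HTML comments
    .replace(/`[^`\n]+`/g, "");      // Remove inline code spans
  const links = [];
  for (const match of searchable.matchAll(OBSIDIAN_LINK_PATTERN)) {
    const target = match[1]?.trim();
    if (target) links.push(target);
  }
  return links;
}

// Example of an updated collectBrokenLinkIssues function.
// Assumes page.title holds the frontmatter title (as the issue text implies) and
// that a slugify() helper matching the vault's slug rule is in scope.
function collectBrokenLinkIssues(pages) {
  const validTargets = new Set();
  for (const page of pages) {
    const withoutExtension = page.relativePath.replace(/\.md$/i, "");
    validTargets.add(withoutExtension);
    validTargets.add(path.basename(withoutExtension));
    if (page.title) { // Skip pages without a frontmatter title
      validTargets.add(page.title);
      validTargets.add(slugify(page.title));
    }
  }
  // ...
}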

Notes

The provided examples are based on the code snippets included in the issue and may require adjustments to fit the actual implementation. Additionally, the normalization function for comparing targets is not shown in the examples.
