openclaw - Fix Dreaming: REM scorer promotes line-number references and filenames as themes

GitHub stats
openclaw/openclaw#68595 (fetched 2026-04-19 15:09:49)
Comments: 0
Participants: 1
Timeline: 0
Reactions: 0


Summary

The REM sleep phase of the dreaming system is pattern-matching on citation metadata (line number ranges like 28-31, 13-16, and file paths like memory/2026-04-12.md) and promoting them as meaningful recurring themes.

Observed behavior

After the first successful dreaming run (2026-04-18), two agents (a coding agent and a math/tutoring agent) both had entries like these promoted into MEMORY.md:

Theme: `28-31` kept surfacing across 39 memories.
  - confidence: 1.00
  - evidence: memory/2026-04-12.md:43-46, ...
  - note: reflection

Theme: `13-16` kept surfacing across 41 memories.
  - confidence: 1.00
  - evidence: memory/2026-04-12.md:43-46, ...
  - note: reflection

Theme: `memory/2026-04-12.md` kept surfacing across 56 memories.
  - confidence: 1.00
  - ...

These strings are line ranges and file paths taken from the citation format of the evidence field (e.g. source=memory/2026-04-12.md:28-31), not actual conceptual themes from session content.
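To make the citation format concrete, here is a minimal sketch that splits a string like the one above into its path and line-range parts. The `Citation` type, the `parse_citation` helper, and the assumption that the `source=` prefix is optional are all hypothetical; the issue only shows the example strings, not the real parser.

```python
import re
from typing import NamedTuple, Optional


class Citation(NamedTuple):
    path: str
    start: int
    end: int


# Assumed shape: optional "source=" prefix, a path without spaces or colons,
# then ":<start>-<end>". Inferred from the examples in the issue only.
CITATION = re.compile(r"^(?:source=)?(?P<path>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_citation(s: str) -> Optional[Citation]:
    """Return the parsed citation, or None if the string is not a citation."""
    m = CITATION.match(s.strip())
    if m is None:
        return None
    return Citation(m["path"], int(m["start"]), int(m["end"]))


print(parse_citation("source=memory/2026-04-12.md:28-31"))
# Citation(path='memory/2026-04-12.md', start=28, end=31)
```

Both promoted "themes" (`28-31` and `memory/2026-04-12.md`) are exactly the components this parse yields, which is consistent with them being metadata rather than content.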

Root cause (suspected)

The REM scorer appears to tokenize or extract "themes" from the full evidence/source string rather than from the semantic content of the candidate text itself. When candidates reference the same source file repeatedly (as expected when dreaming re-processes its own prior dream outputs), the line range suffixes and filenames become the most frequently occurring tokens and get ranked as themes.
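The suspected mechanism can be illustrated with a toy frequency count. This is not the REM scorer's code; the candidate records, the `tokenize` rule, and the `token_counts` helper are invented for illustration. It only shows how citation tokens can outrank everything else once the source string is tokenized along with the text.

```python
import re
from collections import Counter

# Hypothetical candidates: unrelated text, but overlapping citation metadata.
CANDIDATES = [
    {"text": "User prefers short proofs by induction", "source": "memory/2026-04-12.md:28-31"},
    {"text": "Refactor parser before adding features", "source": "memory/2026-04-12.md:28-31"},
    {"text": "Explain recursion with worked example", "source": "memory/2026-04-12.md:13-16"},
]


def tokenize(s):
    # Split on whitespace, ':' and '=' so paths and line ranges stay whole tokens.
    return [t for t in re.split(r"[\s:=]+", s.lower()) if t]


def token_counts(candidates, include_source):
    counts = Counter()
    for c in candidates:
        counts.update(tokenize(c["text"]))
        if include_source:
            counts.update(tokenize(c["source"]))
    return counts


print(token_counts(CANDIDATES, include_source=True).most_common(2))
# [('memory/2026-04-12.md', 3), ('28-31', 2)]
```

With `include_source=False`, every token in this sample occurs once, so no citation fragment can rise to the top.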

Impact

  • Noise entries are permanently promoted into MEMORY.md with confidence 1.00
  • Affects agents with lower session volume (e.g. a coding agent, a math/tutoring agent) more severely — their small session corpus means self-referential dreaming output dominates the token frequency
  • Creates a compounding problem: once these are in MEMORY.md, they become part of the corpus for future dreaming sweeps

Expected behavior

The REM scorer should:

  1. Filter out tokens that match line-range patterns (e.g. /^\d+-\d+$/) from theme extraction
  2. Filter out file path tokens (e.g. strings containing / or ending in .md) from theme extraction
  3. Ideally, extract themes from the candidate text content rather than from the citation/evidence metadata
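Filters 1 and 2 above could look roughly like the following. This is a sketch under stated assumptions, not OpenClaw's implementation: `is_citation_token` and `filter_theme_tokens` are hypothetical names, and the rules are taken literally from the list (line-range pattern, contains `/`, ends in `.md`).

```python
import re

# Same pattern as /^\d+-\d+$/ in the issue, applied with fullmatch.
LINE_RANGE = re.compile(r"\d+-\d+")


def is_citation_token(token: str) -> bool:
    """True for tokens that look like citation metadata rather than themes."""
    t = token.strip("`")
    return bool(LINE_RANGE.fullmatch(t)) or "/" in t or t.endswith(".md")


def filter_theme_tokens(tokens):
    """Drop citation-like tokens before theme ranking."""
    return [t for t in tokens if not is_citation_token(t)]


print(filter_theme_tokens(["28-31", "induction", "memory/2026-04-12.md", "parser"]))
# ['induction', 'parser']
```

These rules are deliberately blunt. The `/` check would also drop a legitimate token such as `TCP/IP`, and the line-range pattern would drop a year range such as `2020-2021`, which is one reason item 3 (extracting themes from the candidate text only) is the more robust fix.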

Environment

  • OpenClaw: v2026.4.12
  • Dreaming config: minScore: 0.8, minRecallCount: 3, minUniqueQueries: 3
  • Memory backend: QMD bridge mode, BM25-only (no vector embeddings — ARM host, no Vulkan)
  • Agents affected most severely: low-volume agents (coding agent, math/tutoring agent); also seen in a research agent

extent analysis

TL;DR

The REM scorer should be modified to filter out line-range patterns and file path tokens from theme extraction to prevent noise entries in MEMORY.md.

Guidance

  • Review the REM scorer's tokenization process to ensure it extracts themes from the candidate text content rather than citation metadata.
  • Implement a filter to exclude tokens matching line-range patterns (e.g., /^\d+-\d+$/) from theme extraction.
  • Consider adding a filter to exclude file path tokens (e.g., strings containing / or ending in .md) from theme extraction.
  • Verify the effectiveness of these filters by monitoring the themes extracted by the REM scorer and checking for noise entries in MEMORY.md.
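For the verification step, one option is to scan MEMORY.md for theme entries that still look like citation metadata. The sketch below assumes the entry format shown in the issue ("Theme: `...` kept surfacing across N memories."); `find_noise_themes` is a hypothetical helper and the `induction proofs` entry is made-up sample data.

```python
import re

# Matches the theme header line format quoted in the issue.
THEME_LINE = re.compile(r"^Theme: `([^`]+)` kept surfacing across (\d+) memories\.")


def find_noise_themes(memory_text):
    """Return (theme, count) pairs whose theme looks like a line range or file path."""
    noise = []
    for line in memory_text.splitlines():
        m = THEME_LINE.match(line.strip())
        if m is None:
            continue
        theme, count = m.group(1), int(m.group(2))
        if re.fullmatch(r"\d+-\d+", theme) or "/" in theme or theme.endswith(".md"):
            noise.append((theme, count))
    return noise


sample = """Theme: `28-31` kept surfacing across 39 memories.
  - confidence: 1.00
Theme: `induction proofs` kept surfacing across 5 memories.
Theme: `memory/2026-04-12.md` kept surfacing across 56 memories.
"""
print(find_noise_themes(sample))
# [('28-31', 39), ('memory/2026-04-12.md', 56)]
```

Running this against MEMORY.md before and after a dreaming sweep gives a quick signal of whether new noise entries are still being promoted.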

Example

No code snippet is provided as the issue does not contain specific code references.

Notes

The solution may require modifications to the REM scorer's algorithm and tokenization process. The filters suggested may need to be adjusted based on the specific requirements of the dreaming system.

Recommendation

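Modify the REM scorer to filter line-range patterns and file path tokens out of theme extraction. This is the most direct way to address the issue and prevent new noise entries in MEMORY.md. Entries that were already promoted stay in MEMORY.md and, per the Impact section, feed back into future dreaming sweeps, so they need separate attention.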
