openclaw - ✅ (Solved) Fix [Bug]: memory-core dreaming pollutes MEMORY.md and vector store with session-corpus data [1 pull request, 1 comment, 2 participants]

GitHub stats
openclaw/openclaw#77831 (fetched 2026-05-06 06:20:37)
Comments: 1 · Participants: 2 · Timeline events: 4 · Reactions: 2
Timeline (top): commented ×1, cross-referenced ×1, labeled ×1, subscribed ×1

The memory-core plugin's dreaming subsystem writes session transcripts to memory/.dreams/session-corpus/ — a directory inside the memory/ tree. The memory search indexer scans the entire memory/ directory, causing these session-corpus files to be indexed as regular memories. Over time this accounts for ~83% of recall store entries (1345/1623), and their content — raw JSON metadata, REM phase markers (<!-- openclaw:dreaming:rem:end -->), confidence: / evidence: patterns — gets promoted into MEMORY.md, polluting it with unreadable noise.

Root Cause

Root cause locations in source code:

  • dreaming-phases-DW9aQqXD.js:796: hardcoded path memory/.dreams/session-corpus
  • short-term-promotion-DZVrVqhT.js:397: SHORT_TERM_SESSION_CORPUS_RE explicitly includes this path
  • short-term-promotion-DZVrVqhT.js:839: isShortTermMemoryPath() returns true for session-corpus files

Fix Action

Fix / Workaround

Current workaround (manual, must be repeated after each dreaming run):

  • Remove session-corpus entries from memory/.dreams/short-term-recall.json
  • Strip <!-- openclaw:dreaming:(light|rem):start/end --> blocks from memory/YYYY-MM-DD.md daily files
  • Run openclaw memory index --force to rebuild vector index

Suggested fixes (in priority order):

  • Change session-corpus output path to outside the memory/ tree (e.g., .openclaw/dreams/session-corpus/). This is the cleanest fix and fully prevents the issue.
  • Add memorySearch.extraExcludePaths config option so users can exclude memory/.dreams/session-corpus from indexing.
  • Add path filtering in short-term-promotion to skip session-corpus sources even when isShortTermMemoryPath() currently returns true for them.
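The marker-stripping step of the workaround could be scripted along the following lines. This is a minimal sketch, not an OpenClaw tool: `stripDreamingBlocks` and `DREAMING_BLOCK_RE` are hypothetical names, and the pattern assumes every block is delimited by a matching pair of the start/end comment markers quoted in the issue.

```typescript
// Hypothetical helper (not part of OpenClaw): removes dreaming reflection
// blocks from the text of a daily memory file.
// Assumption: each block opens with "<!-- openclaw:dreaming:<phase>:start -->"
// and closes with the matching "<!-- openclaw:dreaming:<phase>:end -->".
const DREAMING_BLOCK_RE =
  /<!-- openclaw:dreaming:(light|rem):start -->[\s\S]*?<!-- openclaw:dreaming:\1:end -->\n?/g;

function stripDreamingBlocks(markdown: string): string {
  // The lazy quantifier stops at the first matching end marker, so
  // several blocks in one file are removed independently.
  return markdown.replace(DREAMING_BLOCK_RE, "");
}

const sampleDailyLog = [
  "# 2026-04-15 Daily Log",
  "<!-- openclaw:dreaming:rem:start -->",
  "confidence: 0.5",
  "<!-- openclaw:dreaming:rem:end -->",
  "Real note",
].join("\n");

// Prints the heading and "Real note" with the REM block removed.
console.log(stripDreamingBlocks(sampleDailyLog));
```

An unpaired marker, such as the lone rem:end marker in the corrupted MEMORY.md entry shown in the issue, is left in place by this pattern and would still need manual cleanup.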

PR fix notes

PR #78130: fix(memory-core): exclude session-corpus files from short-term promotion (#77831)

Description (problem / solution / changelog)

Summary

  • Dreaming writes session transcripts to memory/.dreams/session-corpus/ inside the memory tree. The short-term promotion system tracked these paths in the recall store correctly (so dreaming phases can measure organic recall frequency), but also allowed them to pass through both promotion gates — causing 83% of recall-store entries to become session-corpus noise that polluted MEMORY.md with raw JSON, REM phase markers, and evidence lines.
  • Added isSessionCorpusPath() helper that matches memory/.dreams/session-corpus/*.txt|md paths.
  • Applied the guard in two places: (1) recordShortTermRecalls — corpus snippets are now skipped before entering the recall store as promotion candidates; (2) rankShortTermPromotionCandidates — existing corpus entries are skipped during candidate ranking.
  • isShortTermMemoryPath() is unchanged so dreaming-phase signal tracking continues to work correctly.
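The guard described in the summary can be illustrated with a small sketch. Only the name isSessionCorpusPath() and the matched path shape come from the PR description; the regular expression, the `RecallCandidate` type, and `rankPromotionCandidates` are assumptions made for illustration and may differ from the actual code in short-term-promotion.ts.

```typescript
// Illustrative sketch only; the real helper in PR #78130 may differ.
// Matches files directly under memory/.dreams/session-corpus/ ending in .txt or .md.
const SESSION_CORPUS_PATH_RE =
  /(^|\/)memory\/\.dreams\/session-corpus\/[^/]+\.(txt|md)$/;

function isSessionCorpusPath(filePath: string): boolean {
  // Normalize Windows separators so one pattern covers both styles.
  return SESSION_CORPUS_PATH_RE.test(filePath.replace(/\\/g, "/"));
}

// Hypothetical shape of a recall-store entry, reduced to what the sketch needs.
interface RecallCandidate {
  path: string;
  score: number;
}

// Hypothetical stand-in for the ranking step: corpus entries are skipped,
// everything else is ordered by descending score.
function rankPromotionCandidates(entries: RecallCandidate[]): RecallCandidate[] {
  return entries
    .filter((entry) => !isSessionCorpusPath(entry.path))
    .sort((a, b) => b.score - a.score);
}

const ranked = rankPromotionCandidates([
  { path: "memory/.dreams/session-corpus/2026-04-15.txt", score: 0.9 },
  { path: "memory/2026-04-15.md", score: 0.6 },
]);
console.log(ranked.map((entry) => entry.path));
```

Keeping the check in a separate predicate, rather than changing isShortTermMemoryPath(), mirrors the PR's stated goal of leaving dreaming-phase signal tracking untouched.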

Closes #77831

Testing

pnpm vitest run extensions/memory-core/src/short-term-promotion.test.ts
 Test Files  1 passed (1)
      Tests  47 passed (47)
   Start at  19:36:25
   Duration  1.56s

Real behavior proof

  • Behavior: rankShortTermPromotionCandidates returned memory/.dreams/session-corpus/YYYY-MM-DD.txt entries as promotion candidates despite those files being internal corpus data; the new isSessionCorpusPath() guard excludes them while keeping daily memory notes eligible.
  • Tested via targeted vitest test added in this PR. No live OpenClaw runtime available — please apply maintainer proof: override or advise on evidence format.

Changed files

  • extensions/memory-core/src/short-term-promotion.test.ts (modified, +92/-0)
  • extensions/memory-core/src/short-term-promotion.ts (modified, +21/-0)

Code Example

Recall store contamination (before cleanup):
Recall store: 1623 entries · session-corpus: 1345 (83%)
Vector store: 487 chunks (bloated)
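The 83% figure is 1345 / 1623 rounded to the nearest percent. A minimal sketch of that calculation follows; `corpusShare` is a hypothetical helper and the path list is synthetic, since the real layout of short-term-recall.json is not shown in the issue.

```typescript
// Hypothetical helper: counts how many recall-store source paths point into
// the session-corpus directory and reports the share as a rounded percentage.
function corpusShare(paths: string[]): { total: number; corpus: number; percent: number } {
  const corpus = paths.filter((p) => p.includes("memory/.dreams/session-corpus/")).length;
  const percent = paths.length === 0 ? 0 : Math.round((corpus / paths.length) * 100);
  return { total: paths.length, corpus, percent };
}

// Synthetic data reproducing the counts reported in the issue:
// 1345 session-corpus entries plus 278 other entries = 1623 total.
const recallPaths = [
  ...Array.from({ length: 1345 }, (_, i) => `memory/.dreams/session-corpus/s${i}.txt`),
  ...Array.from({ length: 278 }, (_, i) => `memory/note-${i}.md`),
];

console.log(corpusShare(recallPaths)); // { total: 1623, corpus: 1345, percent: 83 }
```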

Example corrupted entry in MEMORY.md:
## Promoted From Short-Term Memory (2026-04-16)
<!-- openclaw-memory-promotion:memory:memory/2026-04-15.md:196:233 -->
- <!-- openclaw:dreaming:rem:end --> --- # 2026-04-15 Daily Log...
  [score=0.594 recalls=4 avg=0.376 source=memory/2026-04-15.md:196-233]

Root cause locations in source code:
dreaming-phases-DW9aQqXD.js:796: hardcoded path memory/.dreams/session-corpus
short-term-promotion-DZVrVqhT.js:397: SHORT_TERM_SESSION_CORPUS_RE explicitly includes this path
short-term-promotion-DZVrVqhT.js:839: isShortTermMemoryPath() returns true for session-corpus files

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Steps to reproduce

  • Let the default dreaming cron (0 5 * * *) run for several days
  • Observe memory/.dreams/session-corpus/ accumulating .txt session transcript files
  • Run openclaw memory status: the recall store shows hundreds of session-corpus entries
  • Inspect MEMORY.md: entries with <!-- openclaw:dreaming:rem:end -->, confidence:, evidence: memory/.dreams/session-corpus/... markers appear at the bottom
  • Check vector index size (grows disproportionately to useful memory content)

Expected behavior

  • Session-corpus files should not be indexed as memories
  • MEMORY.md should only contain meaningful semantic memories, not REM reflection metadata
  • short-term-recall.json entries should be < 300, not 1600+

Actual behavior

  • Session-corpus files at memory/.dreams/session-corpus/ are indexed by the memory search backend
  • short-term-recall.json contains ~1345 session-corpus entries out of 1623 total
  • MEMORY.md gets promoted entries containing raw REM metadata
  • Daily log files (memory/YYYY-MM-DD.md) are also polluted with <!-- openclaw:dreaming:rem:start --> / <!-- openclaw:dreaming:rem:end --> reflection blocks

OpenClaw version

2026.5.3-1 (commit 2eae30e)

Operating system

Ubuntu / cloud desktop (Linux 5.15.0)

Install method

npm global (openclaw update)

Model

MiniMax-M2.7-highspeed (primary), MiniMax-M2.7 (fallback)

Provider / routing chain

  • active-memory → MiniMax-M2.7-highspeed (blocking sub-agent)
  • memory-search → DashScope text-embedding-v4 (OpenAI-compatible endpoint)

Additional provider/model setup details

"agents.defaults.memorySearch": {
  "enabled": true,
  "provider": "openai",
  "model": "text-embedding-v4",
  "remote": {
    "baseUrl": "https://dashscope.aliyuncs.com/compatible-mode/v1"
  }
}
"plugins.entries.memory-core.config.dreaming": {
  "enabled": true,
  "frequency": "0 5 * * *",
  "timezone": "UTC",
  "phases": {
    "light": { "lookbackDays": 3, "limit": 50, "dedupeSimilarity": 0.85 },
    "rem": { "lookbackDays": 7, "limit": 30, "minPatternStrength": 0.5 },
    "deep": { "limit": 20, "minScore": 0.3, "minRecallCount": 2 }
  }
}

Impact and severity

  • Memory quality: 83% of recall store is noise; useful memories buried
  • MEMORY.md: Polluted with REM metadata, unreadable by humans
  • Active-memory: Retrieval surfaces irrelevant session metadata
  • Vector store: Wastes embedding quota and storage on noise
  • Severity: Medium. Does not break functionality but degrades all memory-dependent features

extent analysis

TL;DR

Change the session-corpus output path to outside the memory/ tree to prevent indexing of session-corpus files as memories.

Guidance

  • Identify the current session-corpus output path in the dreaming-phases-DW9aQqXD.js file and consider changing it to a location outside the memory/ tree, such as .openclaw/dreams/session-corpus/.
  • Review the short-term-promotion-DZVrVqhT.js file to see how isShortTermMemoryPath() classifies session-corpus paths as short-term memory. PR #78130 leaves that function unchanged so dreaming-phase signal tracking keeps working, and instead adds a separate isSessionCorpusPath() guard to the promotion steps.
  • If modifying the code is not feasible, consider using the current workaround of manually removing session-corpus entries and rebuilding the vector index after each dreaming run.
  • Evaluate the potential impact of adding a memorySearch.extraExcludePaths config option to allow users to exclude specific paths from indexing.

Example

No code example is provided as the suggested fix involves modifying the output path or adding a config option, which requires a deeper understanding of the codebase.
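That said, the path-exclusion idea behind the proposed memorySearch.extraExcludePaths option can be illustrated independently of the codebase. The sketch below is hypothetical: the option does not exist in OpenClaw according to the issue, and `filterIndexablePaths` and `normalizeDir` are invented names showing one way a directory-prefix exclusion could behave.

```typescript
// Hypothetical sketch of a directory-prefix exclusion filter.
// Neither function exists in OpenClaw; the option name comes from the
// issue's suggestion and its semantics here are an assumption.
function normalizeDir(dir: string): string {
  // Use forward slashes and drop trailing slashes for stable comparison.
  return dir.replace(/\\/g, "/").replace(/\/+$/, "");
}

function filterIndexablePaths(files: string[], extraExcludePaths: string[]): string[] {
  const excluded = extraExcludePaths.map(normalizeDir);
  return files.filter((file) => {
    const normalized = file.replace(/\\/g, "/");
    // Exclude the directory itself and anything beneath it, but not
    // siblings that merely share a name prefix.
    return !excluded.some((dir) => normalized === dir || normalized.startsWith(dir + "/"));
  });
}

const indexable = filterIndexablePaths(
  [
    "memory/2026-04-15.md",
    "memory/.dreams/session-corpus/2026-04-15.txt",
    "memory/.dreams/session-corpus-notes.md",
  ],
  ["memory/.dreams/session-corpus/"],
);
console.log(indexable);
```

Matching on `dir + "/"` rather than on the bare prefix keeps a sibling file such as session-corpus-notes.md indexable.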

Notes

The suggested fixes prioritize changing the session-corpus output path, which is considered the cleanest fix. However, adding a config option or modifying the isShortTermMemoryPath() function may also be viable solutions. The current workaround is manual and repetitive, making a permanent fix desirable.

Recommendation

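Moving the session-corpus output path outside the memory/ tree is the fix the issue ranks first, because it stops session-corpus files from being indexed as memories at all. The linked PR #78130 takes the narrower third option instead: it filters session-corpus paths out of short-term promotion and leaves the output path unchanged.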
