openclaw - 💡(How to fix) Fix Memory dreaming: corpus pre-filtering, weighted scoring, and session scan stall [1 participants]

openclaw · 2026-04-25 16:30:12


GitHub stats

openclaw/openclaw#71656 • Fetched 2026-04-26 05:10:14


Author

Lead2Legacy

Participants

Lead2Legacy

The memory-core dreaming pipeline runs nightly but has three scaling/quality issues that prevent meaningful promotion from light sleep to REM as the corpus grows.

Root Cause

There's no weighting by content type. A decision ("we're switching the auth middleware") scores identically to a heartbeat ping. Lowering minPatternStrength from 0.6 to 0.45 had no effect because the pattern detection itself isn't differentiating.



Summary

The memory-core dreaming pipeline runs nightly but has three scaling/quality issues that prevent meaningful promotion from light sleep to REM as the corpus grows.

Environment

OpenClaw 2026.4.22 (00bd2cf)
memory-core plugin, dreaming enabled
19 successful nightly runs since deployment

Issues

1. No corpus pre-filtering — noise drowns signal (~60% of ingested content is noise)

Session corpus files (memory/.dreams/session-corpus/*.txt) ingest raw session lines without any filtering. This includes:

Repeated HEARTBEAT_OK ping/pong exchanges (zero signal)
Duplicate context blocks (same content refinement prompt re-dumped 8+ times)
MC diagnostic one-word test messages
Repeated full task-state dumps (same 50+ task list pasted across multiple chat turns)
Subagent boilerplate context blocks (build instructions repeated verbatim)

Expected: Pre-filter or deduplicate corpus entries before scoring. At minimum, strip heartbeat pings and identical repeated blocks.
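One way the pre-filter could look is sketched below. This is not memory-core code: `NOISE_PATTERNS`, `normalize`, and `prefilter` are hypothetical names, and the sketch assumes the corpus has already been split into blocks of text. It drops heartbeat pings and keeps only the first copy of any block that repeats after whitespace normalization.

```python
import hashlib
import re

# Hypothetical noise patterns; tune these against your own session corpus.
NOISE_PATTERNS = [
    re.compile(r"^\s*HEARTBEAT_OK\s*$"),
]


def normalize(block: str) -> str:
    """Collapse whitespace so trivially different copies hash the same."""
    return " ".join(block.split())


def prefilter(blocks):
    """Drop noise blocks and exact repeats, keeping first occurrences in order."""
    seen = set()
    kept = []
    for block in blocks:
        text = normalize(block)
        if not text or any(p.match(text) for p in NOISE_PATTERNS):
            continue  # empty block or heartbeat ping
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # identical block already kept
        seen.add(digest)
        kept.append(block)
    return kept


sample = [
    "HEARTBEAT_OK",
    "We're switching the auth middleware.",
    "HEARTBEAT_OK",
    "We're switching   the auth middleware.",
    "Task list: A, B, C",
]
print(prefilter(sample))
```

Hashing the normalized text keeps memory use flat even when the same 50+ task list is pasted many times. Near-duplicates that differ by a word or a timestamp would still pass, so a fuzzier comparison may be needed for those.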

2. Flat scoring — all entries get the same score regardless of content quality

The scoring engine assigns uniform scores:

0.58 to all session corpus entries (1,094 entries)
0.62 to all daily log entries (222 entries)
Only entries that already reached REM have differentiated scores (2.48, 1.86, etc.)

Expected: Content-type-aware scoring — decisions, architectural changes, user preferences, and time-sensitive context should score higher than status pings. Recency should also factor in.
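A rough sketch of content-type-aware scoring with a recency factor follows. Everything in it is an assumption made for illustration: the weights, the keyword cues, and the 14-day half-life are invented, and a real classifier would need more than substring matching.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical weights and keyword cues; none of these come from memory-core.
TYPE_WEIGHTS = {"decision": 1.0, "preference": 0.8, "status": 0.4, "ping": 0.0}
CUES = {
    "decision": ("we're switching", "decided", "we will"),
    "preference": ("i prefer", "always use", "never use"),
}
HALF_LIFE_DAYS = 14.0  # assumed: the recency factor halves every two weeks


def classify(text: str) -> str:
    """Return a coarse content type from simple keyword cues."""
    lowered = text.lower()
    if lowered.strip() == "heartbeat_ok":
        return "ping"
    for label, cues in CUES.items():
        if any(cue in lowered for cue in cues):
            return label
    return "status"


def score(text: str, written_at: datetime, now: datetime) -> float:
    """Content-type weight multiplied by an exponential recency decay."""
    age_days = max((now - written_at).total_seconds() / 86400.0, 0.0)
    recency = 0.5 ** (age_days / HALF_LIFE_DAYS)
    return TYPE_WEIGHTS[classify(text)] * recency


now = datetime(2026, 4, 25, tzinfo=timezone.utc)
print(score("We're switching the auth middleware", now, now))
print(score("HEARTBEAT_OK", now, now))
print(score("We're switching the auth middleware", now - timedelta(days=14), now))
```

With these assumed weights, a fresh decision scores 1.0, a heartbeat scores 0.0, and the same decision written two weeks earlier scores 0.5, so entries no longer collapse onto a single value.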

3. Session corpus scan stalled after April 9th

All 377 session corpus entries have exactly 3 light hits, all last touched on 2026-04-09. After that date, the nightly cycle continues scanning memory/*.md daily logs but never re-visits the session corpus. This appears to be a scan budget or pagination issue — as the daily log pool grew, it consumed the entire scan window.

Expected: The scan should rotate across all corpus sources, or prioritize unscored/under-scored entries over re-scanning entries with 100+ light hits.
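The rotation described above could be sketched as follows. The function `plan_scan` and the shape of its input are hypothetical; the real scan budget mechanism in memory-core is not shown in this issue. The idea is to take one entry per source per round, least-scanned entries first, so no single source can consume the whole budget.

```python
from itertools import zip_longest


def plan_scan(sources, budget):
    """Pick up to `budget` entry keys, rotating across sources.

    `sources` maps a source name to {entry_key: light_hits}. Within each
    source, entries with the fewest light hits are queued first.
    """
    queues = [
        sorted(entries, key=lambda k: (entries[k], k))
        for _, entries in sorted(sources.items())
    ]
    picked = []
    # Each round takes one entry from every source that still has entries.
    for round_ in zip_longest(*queues):
        for key in round_:
            if key is None:
                continue  # this source is exhausted
            if len(picked) == budget:
                return picked
            picked.append(key)
    return picked


sources = {
    "daily-log": {"log-1": 120, "log-2": 104, "log-3": 0},
    "session-corpus": {"sess-1": 3, "sess-2": 3},
}
print(plan_scan(sources, 4))
```

In this made-up example the two session corpus entries are scheduled even though the daily log pool is larger, and the daily log entry with 120 light hits is the one left out when the budget runs short.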

Reproduction

# Check phase signals — session corpus entries frozen at 3 light hits
cat memory/.dreams/phase-signals.json | python3 -c "
import json, sys
d = json.load(sys.stdin)
session = [e for k, e in d['entries'].items() if 'session-corpus' in k]
print(f'Session corpus entries: {len(session)}')
print(f'Sample light hits: {session[0][\"lightHits\"] if session else \"N/A\"}')
print(f'Sample lastLightAt: {session[0][\"lastLightAt\"] if session else \"N/A\"}')
"
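To check every entry rather than one sample, a small script can group session corpus entries by the day they were last touched. It reuses the `entries`, `lightHits`-style layout and the `lastLightAt` field from the command above, but it assumes `lastLightAt` is an ISO-8601 string whose first 10 characters are the date; adjust the slicing if the file stores timestamps differently. `last_light_histogram` is a hypothetical helper name.

```python
import json
from collections import Counter
from pathlib import Path


def last_light_histogram(data, source_marker="session-corpus"):
    """Count entries per lastLightAt day for keys containing source_marker."""
    days = Counter()
    for key, entry in data.get("entries", {}).items():
        if source_marker not in key:
            continue
        # Assumption: lastLightAt is an ISO-8601 string, so the first
        # 10 characters are the calendar date.
        days[str(entry.get("lastLightAt", "unknown"))[:10]] += 1
    return days


if __name__ == "__main__":
    path = Path("memory/.dreams/phase-signals.json")
    if path.exists():
        histogram = last_light_histogram(json.loads(path.read_text()))
        for day, count in sorted(histogram.items()):
            print(day, count)
```

If the stall described in issue 3 is present, the output should show all session corpus entries under a single date.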

Suggested improvements

Corpus pre-filtering — strip noise before indexing (heartbeats, duplicate blocks, test messages)
Weighted scoring — differentiate by content type (decisions > status > pings)
Recency bias — recent entries should get priority scan budget over old high-hit entries
Scan rotation — ensure all sources get scanned each cycle, not just whichever fills the budget first

Extended analysis

TL;DR

Implement corpus pre-filtering and weighted scoring to improve the quality of the memory-core dreaming pipeline.

Guidance

Apply corpus pre-filtering to remove noise from session corpus entries, such as heartbeat pings, duplicate blocks, and test messages.
Introduce weighted scoring to differentiate between content types, assigning higher scores to decisions, architectural changes, and user preferences.
Consider implementing recency bias to prioritize recent entries in the scan budget.
Review the scan rotation mechanism to ensure all corpus sources are scanned each cycle.

Example

import json

# Load phase signals
with open('memory/.dreams/phase-signals.json') as f:
    data = json.load(f)

# Filter out noise from session corpus entries.
# Assumption: each entry carries a 'content' string; .get() avoids a KeyError if it does not.
session_corpus = [
    e for k, e in data['entries'].items()
    if 'session-corpus' in k
    and not e.get('content', '').startswith('HEARTBEAT_OK')
]

# Assign weighted scores based on content type (case-insensitive keyword match)
for entry in session_corpus:
    content = entry.get('content', '').lower()
    if 'decision' in content:
        entry['score'] = 1.0
    elif 'architectural change' in content:
        entry['score'] = 0.8
    else:
        entry['score'] = 0.2

Notes

The provided example is a simplified illustration of corpus pre-filtering and weighted scoring. The actual implementation may require more complex logic and fine-tuning of scoring weights.

Recommendation

Apply the suggested improvements, starting with corpus pre-filtering and weighted scoring, to address the scaling and quality issues in the memory-core dreaming pipeline.
