openclaw - [Behavior]: Dreaming deep phase promotes stale troubleshooting logs to MEMORY.md without verifying current applicability

StepCodex · 2026-05-08T06:41:57Z



The dreaming deep phase promotes raw verbatim daily-log snippets to MEMORY.md that describe troubleshooting conclusions which are factually outdated at promotion time. The system has no mechanism to verify whether a promoted entry still reflects the current configuration or operational state.


Fix / Workaround

The promoted entry describes a deepseek-v4-flash workaround from April 27 that is no longer in effect. The model was changed weeks ago. The entry is factually incorrect as a long-term memory item — it tells the agent "use deepseek-v4-flash for active-memory" when the actual config uses gpt-oss-120b.


Concrete case

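On 2026-05-08, the deep phase promoted 2 entries from daily memory files. The first one, quoted verbatim:

```
## Promoted From Short-Term Memory (2026-05-08)

<!-- openclaw-memory-promotion:memory:memory/2026-04-27.md:1:4 -->
- 2026-04-27 09:20 CST：排查 active-memory 召回问题，确认 daily memory 文件已纳入 recall/index 范围...
- active-memory 在 cpam/gemini-3-flash 下沿 runEmbeddedPiAgent 链路连续超时...说明问题不在模型裸调用或 embedding provider
- 将 active-memory 调整为 model=deepseek/deepseek-v4-flash 后，active-memory 从持续 timeout 变为可完成
- queryMode=message 对"长期偏好"召回测试没有优于 recent，复测后已改回 recent

[score=0.961 recalls=10 avg=0.468]
```

In English, the four bullets read:

- 2026-04-27 09:20 CST: Investigated the active-memory recall problem; confirmed that daily memory files are included in the recall/index scope...
- Under cpam/gemini-3-flash, active-memory timed out repeatedly along the runEmbeddedPiAgent path... which shows the problem is not in the bare model call or the embedding provider
- After switching active-memory to model=deepseek/deepseek-v4-flash, active-memory went from persistent timeouts to completing
- queryMode=message did not outperform recent in the "long-term preference" recall test; after retesting, it was changed back to recent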

**Problem:** At the time of promotion (2026-05-08), the actual `active-memory` configuration was:

```json
{
  "model": "cpam/gpt-oss-120b",
  "modelFallback": "Open-Router/openai/gpt-oss-120b:free",
  "queryMode": "recent"
}
```

Why this happens

Two compounding issues:

1. No staleness check on promotion candidates

The deep ranking algorithm scores candidates on 6 signals:

| Signal | Weight |
|--------|--------|
| Frequency | 0.24 |
| Relevance | 0.30 |
| Query diversity | 0.15 |
| Recency | 0.15 |
| Consolidation | 0.10 |
| Conceptual richness | 0.06 |

A troubleshooting log that was queried 10 times during an active debugging session scores high on frequency (weight 0.24) and relevance (weight 0.30), which together carry 0.54 of the total weight, even if the underlying issue was resolved and the config changed weeks ago. At a weight of 0.15, recency alone cannot filter out stale operational logs.
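To make the weighting argument concrete, here is a minimal sketch. It assumes the final score is a plain weighted sum of signals normalized to [0, 1], and that recency decays exponentially with `recencyHalfLifeDays`. Neither assumption is confirmed by the report, and the function names are hypothetical, so this is an illustration of the arithmetic rather than openclaw's actual ranking code.

```python
# Weights as listed in the table above.
WEIGHTS = {
    "frequency": 0.24,
    "relevance": 0.30,
    "query_diversity": 0.15,
    "recency": 0.15,
    "consolidation": 0.10,
    "conceptual_richness": 0.06,
}


def weighted_score(signals):
    """Assumed scoring model: weighted sum of signals in [0, 1]."""
    return sum(WEIGHTS[name] * signals[name] for name in WEIGHTS)


def recency_signal(age_days, half_life_days=14.0):
    """Assumed decay model: the signal halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


# A candidate that maxes out every signal except recency, which is zero.
signals = {name: 1.0 for name in WEIGHTS}
signals["recency"] = 0.0
worst_case = weighted_score(signals)

# Recency signal of an 11-day-old entry under the assumed decay model.
recency_at_11_days = recency_signal(11)
```

Under these assumptions, a candidate with a recency signal of zero can still reach about 0.85, which clears the default `minScore=0.8`. An 11-day-old entry would also keep roughly 0.58 of its recency signal with a 14-day half-life, so age barely dents the total.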

2. Raw verbatim copy with no distillation

As noted in #67363, the promoted snippet is copied as-is. There is no step that:

- Summarizes the insight ("active-memory model config was changed from X to Y")
- Strips timestamps and context-specific details
- Asks "is this still true?"

The snippet from memory/2026-04-27.md:1-4 is a 4-paragraph troubleshooting log with timestamps, specific model names, and debugging steps. It reads like a JIRA ticket, not a durable memory insight.
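As an illustration of the smallest of those missing steps, the sketch below strips a leading timestamp from a daily-log bullet. The regular expression and the `strip_timestamp` helper are hypothetical and only cover the `YYYY-MM-DD HH:MM TZ:` prefix seen in the promoted snippet. Real distillation would also need summarization and a validity check, which this does not attempt.

```python
import re

# Optional bullet marker, a date, optional time, optional timezone
# abbreviation, then an ASCII or full-width colon.
TIMESTAMP_PREFIX = re.compile(
    r"^\s*(?:-\s*)?\d{4}-\d{2}-\d{2}(?:\s+\d{2}:\d{2})?(?:\s+[A-Z]{2,5})?\s*[:：]\s*"
)


def strip_timestamp(line: str) -> str:
    """Remove a leading timestamp (and its bullet marker) if one is present."""
    return TIMESTAMP_PREFIX.sub("", line, count=1)


stamped = "- 2026-04-27 09:20 CST：排查 active-memory 召回问题"
unstamped = "- active-memory 在 cpam/gemini-3-flash 下沿 runEmbeddedPiAgent 链路连续超时"

cleaned = strip_timestamp(stamped)
untouched = strip_timestamp(unstamped)
```

Lines without a timestamp prefix are returned unchanged, so the helper can be applied to every bullet of a snippet.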

Scoring data for the promoted entry

```
score=0.961  recalls=10  avg=0.468  source=memory/2026-04-27.md:1-4
age=11 days  consolidate=1.00  conceptual=1.00
```

10 recalls during the debugging session inflated the frequency score. The entry passed all thresholds (minScore=0.8, minRecallCount=3) despite being 11 days old and no longer reflecting current state.

Additional context

- `openclaw memory status --deep` shows the dreaming thresholds are at defaults: `minScore=0.8 · minRecallCount=3 · minUniqueQueries=3 · recencyHalfLifeDays=14 · maxAgeDays=30`
- The `maxAgeDays=30` cap did not help here since 11 days < 30
- The second promoted entry (from 2026-04-09, score=0.835) was also a stale troubleshooting log about an exec approval issue that has since been resolved
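The threshold behavior described above can be sketched as a simple gate. The class and function names are hypothetical, and `minUniqueQueries` is left out because the report does not state the entry's unique-query count. The point of the sketch is that every check looks at usage statistics or age, and none looks at what the snippet says.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Default values reported by `openclaw memory status --deep`.
    min_score: float = 0.8
    min_recall_count: int = 3
    max_age_days: int = 30


@dataclass(frozen=True)
class Candidate:
    score: float
    recalls: int
    age_days: int


def passes_thresholds(c: Candidate, t: Thresholds = Thresholds()) -> bool:
    """Assumed gate: all checks are numeric; snippet content is never inspected."""
    return (
        c.score >= t.min_score
        and c.recalls >= t.min_recall_count
        and c.age_days <= t.max_age_days
    )


# Values taken from the scoring data of the first promoted entry.
first_entry = Candidate(score=0.961, recalls=10, age_days=11)
promoted = passes_thresholds(first_entry)
```

With the reported values, the first entry passes every check, even though the configuration it describes is no longer in use.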

Suggested fixes

1. **Staleness-aware scoring**: When a candidate snippet references specific config values (model names, feature flags, tool settings), cross-check against current `openclaw.json`. If the referenced config no longer matches, apply a heavy penalty or reject outright.
2. **Increase recency weight**: Bump recency from 0.15 to at least 0.25. Operational troubleshooting logs should decay faster than behavioral preferences.
3. **Snippet-type classifier**: Troubleshooting logs (timestamps, error messages, model names, "排查"/"修复" keywords) should be tagged as `ephemeral` and excluded from promotion, or at minimum require a higher score threshold.
4. **Config-aware promotion gate**: Before writing to `MEMORY.md`, check if the snippet describes a configuration that still exists. If `active-memory.model` changed from X to Y, reject entries that say "use X for active-memory".
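Fixes 1 and 4 could look roughly like the sketch below. It is a naive illustration: the `stale_references` helper is hypothetical, it only recognizes `key=value` mentions, and the shape of the config dictionary is an assumption, since the report shows only the `active-memory` values and not how they are nested inside `openclaw.json`.

```python
import re

# Assumed shape: the active-memory settings quoted in the report.
CURRENT_ACTIVE_MEMORY = {
    "model": "cpam/gpt-oss-120b",
    "modelFallback": "Open-Router/openai/gpt-oss-120b:free",
    "queryMode": "recent",
}

# Matches "model=<value>" or "queryMode=<value>" inside a snippet.
# re.ASCII keeps \w from swallowing adjacent non-ASCII text.
ASSIGNMENT = re.compile(r"\b(model|queryMode)=([\w./:-]+)", re.ASCII)


def stale_references(snippet: str, current: dict) -> list:
    """Return (key, referenced_value) pairs that disagree with the current config."""
    stale = []
    for key, value in ASSIGNMENT.findall(snippet):
        if key in current and current[key] != value:
            stale.append((key, value))
    return stale


line = "将 active-memory 调整为 model=deepseek/deepseek-v4-flash 后，active-memory 从持续 timeout 变为可完成"
mismatches = stale_references(line, CURRENT_ACTIVE_MEMORY)
```

A promotion gate could reject or penalize a candidate whenever the returned list is non-empty. A matcher this simple cannot tell a recommendation from a value that the log itself says was reverted, so it would work better as one input to scoring than as the only check.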

Related

- #67363: raw verbatim promotion without distillation (same root cause, different failure mode)
- #72021: signalCount mixes daily/session signals with real recalls (inflates frequency for troubleshooting logs)
