[Bug] memory-core dreaming: stable narrative session is archived by post-completion cleanup before host plugin can call getSessionMessages → narrative text lost

openclaw · 2026-05-27 05:40:02


After #84802 (stable narrative session keys + bounded cleanup, released in 2026.5.22), a race between the gateway's post-completion subagent cleanup and the memory-core host plugin's narrative extraction causes the model-generated narrative text to be silently discarded in a fraction of dreaming-narrative-* runs.

The fallback writer added in #85821 (released in 2026.5.24) papers over the symptom by appending a snippet-based fallback entry to DREAMS.md, but the actual narrative text the model just generated is still lost — only the phase snippets land in the diary.

Error Message

`memory-core` logs a `narrative generation produced no text` warning (at `01:01:25.587` UTC in the traced failing attempt).

Root Cause

In one observed cron sweep, 16 of ~18 narrative attempts failed with `produced no text`, even though the underlying model runs all completed successfully with 200-300+ output tokens of valid diary text. The first workspace usually succeeds; subsequent workspaces then fail in cascade. `dreaming promotion complete` reports `failed=0` because the promotion step is independent of narrative extraction; only the extraction fails, and it does so silently.

Suspected Archiving Path

The archiving path appears to be subagent-registry's `completeCleanupBookkeeping` → `notifyContextEngineSubagentEnded({reason: "deleted"})` (the `reason: "deleted"` call introduced or strengthened by #84802 in pursuit of bounded cleanup). That call dispatches `sessions.delete` server-side and ultimately reaches `archiveSessionTranscriptsDetailed({reason:"deleted"})`, which renames the transcript file to `.deleted.<ISO>`. The host plugin calls `getSessionMessages` only after `waitForRun` resolves, but the cleanup is fire-and-forget and is not ordered against host-plugin extraction, so the archive can land first.
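The ordering problem can be illustrated with a small in-memory model. This is only a sketch: `SessionStoreSketch`, `simulate`, and the message shape are hypothetical names invented for illustration, not OpenClaw APIs, and the real cleanup runs asynchronously rather than in a fixed order.

```typescript
// Hypothetical in-memory model of the race; none of these names are OpenClaw APIs.
type Message = { role: string; text: string };

class SessionStoreSketch {
  private live = new Map<string, Message[]>();
  private archived = new Map<string, Message[]>();

  write(sessionKey: string, messages: Message[]): void {
    this.live.set(sessionKey, messages);
  }

  // Models the rename to `.deleted.<ISO>`: the transcript leaves the live store.
  archive(sessionKey: string): void {
    const messages = this.live.get(sessionKey);
    if (messages === undefined) return;
    this.live.delete(sessionKey);
    this.archived.set(sessionKey, messages);
  }

  isArchived(sessionKey: string): boolean {
    return this.archived.has(sessionKey);
  }

  // Models a reader that only consults the live store.
  getSessionMessages(sessionKey: string): Message[] {
    return this.live.get(sessionKey) ?? [];
  }
}

// Runs the two post-completion steps in a chosen order and reports what the host sees.
function simulate(order: "extract-first" | "cleanup-first"): { seen: Message[]; archived: boolean } {
  const store = new SessionStoreSketch();
  store.write("narrative", [{ role: "assistant", text: "diary entry" }]);
  if (order === "cleanup-first") {
    store.archive("narrative");
    return { seen: store.getSessionMessages("narrative"), archived: store.isArchived("narrative") };
  }
  const seen = store.getSessionMessages("narrative");
  store.archive("narrative");
  return { seen, archived: store.isArchived("narrative") };
}
```

In both orders the transcript ends up archived; only when extraction happens to run first does the host see the assistant message, which matches the intermittent failures described here.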

Environment

OpenClaw: 2026.5.22 (also expected on 2026.5.24+ since #85821 doesn't address the root race)
Linux x86_64, Node v22+
6 active workspaces, memory-core.config.dreaming.enabled = true, cron 0 3 * * *
Model: openai-codex/gpt-5.4-mini (provider-agnostic — race is at the session lifecycle layer, not provider-specific)

Symptom

Gateway journal during dreaming cron:

memory-core: dream diary entry written for light phase [workspace=…/workspace]      ← OK
memory-core: dream diary entry written for rem phase   [workspace=…/workspace]      ← OK
memory-core: dream diary entry written for deep phase  [workspace=…/workspace]      ← OK
memory-core: dreaming promotion complete (workspaces=6, candidates=50, applied=6, failed=0)
memory-core: dream diary entry written for light phase [workspace=…/telegram-main]  ← OK
memory-core: dream diary entry written for rem phase   [workspace=…/telegram-main]  ← OK
memory-core: narrative generation produced no text for deep phase                   ← FAIL (race)
memory-core: dream diary entry written for light phase [workspace=…/residential]    ← OK
memory-core: dream diary entry written for rem phase   [workspace=…/residential]    ← OK
memory-core: narrative generation produced no text for deep phase                   ← FAIL (race)
memory-core: narrative generation produced no text for light phase                  ← FAIL (race)
…

Root cause (with trace evidence)

For one concrete failing run on 2026-05-27 at about 01:01:22 UTC (03:01:22 in what appears to be host-local time, consistent with the `0 3 * * *` cron), runId=dreaming-narrative-rem-<workspaceHash>-1779843621170:

Trajectory file agents/main/sessions/2189132b-….trajectory.jsonl shows the run completed successfully:

{"type":"session.started","ts":"…01:01:17.326Z","provider":"openai-codex","modelId":"gpt-5.4-mini","modelApi":"openai-codex-responses",…}
{"type":"model.completed","data":{
  "usage":{"input":46266,"output":312,"total":46578},
  "aborted":false,"timedOut":false,
  "assistantTexts":["<full diary entry — ~1.5 KB of structured Russian prose>"],
  …
}}
{"type":"session.ended","data":{"status":"success",…}}

Session JSONL (path is <sessionId>.jsonl, now renamed to <sessionId>.jsonl.deleted.2026-05-27T01-01-24.361Z) was correctly written with an assistant message in exactly the shape extractNarrativeText expects:

{"type":"message","message":{
  "role":"assistant",
  "content":[{"type":"text","text":"<full diary entry text>"}],
  "api":"openai-codex-responses","provider":"openai-codex","model":"gpt-5.4-mini",
  "stopReason":"stop",
  "usage":{…}
}}
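For reference, extraction over entries of this shape could look roughly like the sketch below. The real `extractNarrativeText` implementation is not shown in this report, so `extractNarrativeTextSketch` and its types are hypothetical stand-ins that assume only the fields visible in the sample above.

```typescript
// Hypothetical types covering only the fields visible in the sample entry.
type ContentPart = { type: string; text?: string };
type SessionEntry = {
  type: string;
  message?: { role: string; content: ContentPart[] };
};

// Returns the text of the most recent assistant message, or null if there is none.
function extractNarrativeTextSketch(entries: SessionEntry[]): string | null {
  for (let i = entries.length - 1; i >= 0; i--) {
    const entry = entries[i];
    if (!entry || entry.type !== "message") continue;
    const message = entry.message;
    if (!message || message.role !== "assistant") continue;
    const text = message.content
      .filter((part) => part.type === "text" && typeof part.text === "string")
      .map((part) => part.text as string)
      .join("\n")
      .trim();
    if (text.length > 0) return text;
  }
  return null;
}
```

Under this sketch, an empty message list yields `null`, which is the condition that surfaces as `produced no text` once the live transcript has been archived.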

Timeline of one failing attempt:

| Event | Timestamp (UTC) |
|---|---|
| Session JSONL written by gateway (`model.completed`) | `01:01:22.288` |
| Session JSONL renamed to `.deleted.<ISO>` by gateway | `01:01:24.361` |
| `memory-core` logs `produced no text` warning | `01:01:25.587` |

The session was archived 1.226 seconds before memory-core logged its extraction failure — i.e. subagent.getSessionMessages({sessionKey, limit:5}) either returned [] or returned messages without the assistant text because the on-disk store was already in the .deleted.<ISO> state.
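The intervals can be checked directly from the three timestamps in the timeline. The helper name `toMillis` is only illustrative.

```typescript
// Converts "HH:MM:SS.mmm" into whole milliseconds since midnight.
function toMillis(clock: string): number {
  const [h, m, s] = clock.split(":");
  return Math.round((Number(h) * 3600 + Number(m) * 60 + Number(s)) * 1000);
}

const writtenAt = toMillis("01:01:22.288");  // session JSONL written
const archivedAt = toMillis("01:01:24.361"); // renamed to .deleted.<ISO>
const warnedAt = toMillis("01:01:25.587");   // memory-core warning logged

const writeToArchiveMs = archivedAt - writtenAt; // 2073
const archiveToWarnMs = warnedAt - archivedAt;   // 1226
```

So in this attempt the live transcript existed for roughly 2.07 s after it was written, and the warning was logged about 1.23 s after it had been archived.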

Why #85821 doesn't fully fix this

#85821 detects the empty-extraction case and writes a deterministic fallback entry to DREAMS.md so the diary is no longer silently empty. But the fallback content is just a re-formatted version of the phase snippets (pattern surfaced from recall candidates etc.) — the model-generated narrative text is still discarded. For users who rely on the diary as the dreaming surface (review, audit, downstream personal-facts curation), the loss is significant.

Reproduction

1. Use a deployment with `memory-core.config.dreaming.enabled = true` and at least 3 workspaces configured.
2. Ensure each workspace has enough phase candidates for narrative generation (≥1 promotion or snippet per phase).
3. Trigger the dreaming cron once (e.g. `openclaw cron run <dreaming-job-id> --wait`).
4. Wait for the detached narrative subagents to complete (they are queued with `DETACHED_NARRATIVE_CONCURRENCY = 3`).
5. Observe `journalctl … | grep -E "memory-core: (dream diary entry|narrative generation produced no text)"`. Typically the first 1-2 (workspace, phase) pairs succeed and subsequent ones cascade-fail with `produced no text`, despite trajectory files showing successful `model.completed` events with non-empty `assistantTexts`.
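To compare sweeps, the grepped journal output can be tallied with a few lines of code. This is a minimal sketch: `tallyNarrativeLines` is a hypothetical helper, and it assumes the two log message prefixes appear exactly as in the journal excerpt in this report.

```typescript
type SweepTally = { written: number; noText: number };

// Counts successful diary writes and "produced no text" failures in journal text.
function tallyNarrativeLines(journal: string): SweepTally {
  const tally: SweepTally = { written: 0, noText: 0 };
  for (const line of journal.split("\n")) {
    if (line.includes("memory-core: dream diary entry written")) {
      tally.written += 1;
    } else if (line.includes("memory-core: narrative generation produced no text")) {
      tally.noText += 1;
    }
  }
  return tally;
}
```

A non-zero `noText` count alongside successful `model.completed` events in the trajectory files is the signature of the race.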

Suggested fixes (in order of preference)

1. Defer the post-completion `sessions.delete` until the host plugin signals "done with messages", or until a configurable grace period after `waitForRun` resolution. Most host plugins that use `subagent.run` for one-shot narrative extraction follow the pattern run → waitForRun → getSessionMessages → deleteSession (explicit); the gateway's implicit cleanup is racing the host's explicit one.
2. Add a `subagent.run` option `keepSessionUntilExplicitDelete: true` that suppresses the post-completion auto-archive. Host plugins that need post-run extraction opt in. memory-core's narrative generator is the obvious caller; other host plugins that just run + waitForRun + deleteSession continue working unchanged.
3. Make `subagent.getSessionMessages` transparently read the archived `.deleted.<ISO>` file if the live file is missing but the session store entry resolves. Smaller blast radius, but couples the extraction path to archive naming.
4. (Already merged) Keep #85821 as a true safety net for cases where the trajectory is also unavailable, but its fallback should not be the primary correctness mechanism.
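As an illustration of the archived-file fallback suggested above, the file selection could be sketched as a pure function over a directory listing. `pickTranscriptFile` is a hypothetical name, and the sketch assumes archive names follow the `<sessionId>.jsonl.deleted.<ISO>` pattern seen in this report, so that lexicographic order matches chronological order.

```typescript
// Picks the transcript to read for a session: the live `<sessionId>.jsonl` if it
// exists, otherwise the newest `<sessionId>.jsonl.deleted.<ISO>` copy, else null.
function pickTranscriptFile(fileNames: string[], sessionId: string): string | null {
  const live = `${sessionId}.jsonl`;
  if (fileNames.includes(live)) return live;
  const prefix = `${live}.deleted.`;
  // Same-format ISO suffixes sort chronologically as plain strings.
  const archived = fileNames.filter((name) => name.startsWith(prefix)).sort();
  return archived.pop() ?? null;
}
```

This makes the coupling to archive naming explicit: if the suffix format ever changed, the prefix match and the sort assumption would both need revisiting.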

Workaround (current — for anyone affected before an upstream fix)

In extensions/memory-core/src/dreaming-narrative.ts, when extractNarrativeText(messages) returns null, fall back to reading the matching <sessionId>.trajectory.jsonl (longer-lived than the session JSONL because it's not archived by sessions.delete) and extract narrative text from the last model.completed.data.assistantTexts. The trajectory persists past the archive, so the actual model output is recoverable as long as the trajectory hasn't been pruned by a separate cleanup.
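A minimal sketch of that recovery step follows. It assumes the trajectory file holds one JSON event per line and that `model.completed` events carry `data.assistantTexts` as in the trace excerpt in this report; `recoverNarrativeFromTrajectory` is a hypothetical helper name, not code from the repository.

```typescript
// Scans trajectory JSONL text and returns the assistant texts of the last
// `model.completed` event that has any, or null if none is found.
function recoverNarrativeFromTrajectory(jsonl: string): string | null {
  let recovered: string | null = null;
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed.length === 0) continue;
    let event: unknown;
    try {
      event = JSON.parse(trimmed);
    } catch {
      continue; // skip partially written or corrupt lines
    }
    if (typeof event !== "object" || event === null) continue;
    const record = event as { type?: unknown; data?: { assistantTexts?: unknown } };
    if (record.type !== "model.completed") continue;
    const texts = record.data?.assistantTexts;
    if (!Array.isArray(texts)) continue;
    const joined = texts
      .filter((t): t is string => typeof t === "string")
      .join("\n")
      .trim();
    if (joined.length > 0) recovered = joined; // later events overwrite earlier ones
  }
  return recovered;
}
```

The function takes the file contents as a string so it can be exercised without touching disk; a caller would read `<sessionId>.trajectory.jsonl` (for example with `readFile` from `node:fs/promises`) and pass the text in only when normal extraction has returned null.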

Observed effect of this workaround on our deployment: 17 diary entries written per cron sweep (vs 2-3 before), 11 of them via trajectory recovery, and no remaining `produced no text` failures.

Happy to provide additional traces, prod-side journal excerpts, or test against a candidate fix.
