openclaw - ✅(Solved) Fix Dreaming needs configurable session/cron exclusions; isolated cron transcripts still enter session corpus [1 pull requests, 1 participants]

openclaw/openclaw#72611 · Fetched 2026-04-28 06:33:59

PR fix notes

PR #72913: fix(memory-core): keep rotated cron transcripts out of dreaming corpus + add operator filters

Description (problem / solution / changelog)

Closes #72611.

Problem

Two related leaks let isolated cron transcripts flow into the Dreaming session corpus (memory/.dreams/session-corpus/<day>.txt) on real deployments, even with sessionTarget: isolated and existing generatedByCronRun / DIRECT_CRON_PROMPT_RE filtering:

  1. Path-comparison miss after rotation. loadSessionTranscriptClassificationForSessionsDir indexes cron / dreaming-narrative transcripts by the live sessionFile absolute path read from sessions.json. As soon as that transcript is rotated to *.jsonl.deleted.<ts> (or *.trajectory.jsonl[.deleted.<ts>]) the on-disk path no longer matches the live entry, so the rotated artifact is no longer attributed to the cron session. dreaming-phases.collectSessionIngestionBatches then reads it as a normal transcript and ingests its [cron:<id> ...] content lines. Real-world recurrence in #72611 with paths like main/sessions/<uuid>.jsonl.deleted.2026-04-25T06-33-10.801Z.

  2. No operator-facing exclusion knob. dreaming.* only exposes phase-level settings. There is no way to say "skip every cron run for the main agent" or "skip session keys starting with agent:ops:" without forking memory-core. Cron runs whose key is the broader cron:<id> shape (without :run:<runId>) are not even matched by isCronRunSessionKey so the built-in classifier silently lets them through.
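The first leak above (the path-comparison miss) can be reproduced in a few lines. This is a minimal sketch, not the memory-core code: the set contents, the `/agents/...` paths, and the `isCronByPathOnly` name are hypothetical stand-ins for a path set built from sessions.json.

```typescript
// Hypothetical stand-in for the cron path set indexed from sessions.json.
const cronRunTranscriptPaths: ReadonlySet<string> = new Set([
  "/agents/main/sessions/1234.jsonl",
]);

// Path-only classification: matches the exact live path and nothing else.
function isCronByPathOnly(absPath: string): boolean {
  return cronRunTranscriptPaths.has(absPath);
}

const livePath = "/agents/main/sessions/1234.jsonl";
const rotatedPath =
  "/agents/main/sessions/1234.jsonl.deleted.2026-04-25T06-33-10.801Z";

console.log(isCronByPathOnly(livePath)); // true
console.log(isCronByPathOnly(rotatedPath)); // false: the rotated artifact is treated as a normal transcript
```

The rotated file still carries the same session id in its name, which is what the fix below keys on.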

Fix

Two complementary changes in one commit:

1. Session-id-aware classification (rotated artifacts)

SessionTranscriptClassification now also exposes session-id sets and reverse-lookup maps:

type SessionTranscriptClassification = {
  // existing path sets retained for back-compat
  dreamingNarrativeTranscriptPaths: ReadonlySet<string>;
  cronRunTranscriptPaths: ReadonlySet<string>;
  // new: sessionId-keyed sets for rotated artifacts
  dreamingNarrativeSessionIds: ReadonlySet<string>;
  cronRunSessionIds: ReadonlySet<string>;
  // new: reverse lookup so callers can resolve sessionKey for any transcript
  transcriptPathToSessionKey: ReadonlyMap<string, string>;
  sessionIdToSessionKey: ReadonlyMap<string, string>;
};

New helpers in session-files.ts (also re-exported from openclaw/plugin-sdk/memory-core-host-engine-qmd):

  • extractSessionIdFromTranscriptFileName(fileName) — handles <id>.jsonl, <id>.jsonl.deleted.<ts>, <id>.jsonl.reset.<ts>, <id>.trajectory.jsonl[.deleted.<ts> | .reset.<ts>]. Returns null for non-transcript file shapes.
  • isCronRunTranscriptPath(classification, absPath) / isDreamingNarrativeTranscriptPath(...) — try direct path lookup first, fall back to extractSessionIdFromTranscriptFileName(...) plus the new sessionId set.
  • lookupSessionKeyForTranscriptPath(classification, absPath) — resolves the owning session key for a (possibly rotated) transcript; needed by the operator filter below.
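The lookup order described above (direct path first, then session id recovered from the file name) might look roughly like the following. This is a hedged sketch, not the code from the PR: the `...Sketch` names, the reduced classification type, and the regular expression are assumptions derived only from the filename shapes listed above, and it assumes POSIX-style `/` separators.

```typescript
// Reduced shape for illustration; the real SessionTranscriptClassification has more fields.
type ClassificationSketch = {
  cronRunTranscriptPaths: ReadonlySet<string>;
  cronRunSessionIds: ReadonlySet<string>;
};

// Assumed grammar: <id>[.trajectory].jsonl[.deleted.<ts> | .reset.<ts>]
const TRANSCRIPT_FILE_RE =
  /^(.+?)(?:\.trajectory)?\.jsonl(?:\.(?:deleted|reset)\..+)?$/;

// Returns the session id for transcript-shaped file names, otherwise null.
function extractSessionIdSketch(fileName: string): string | null {
  const match = TRANSCRIPT_FILE_RE.exec(fileName);
  return match?.[1] ?? null;
}

// Direct path lookup first; fall back to the session id embedded in the file name.
function isCronRunTranscriptPathSketch(
  classification: ClassificationSketch,
  absPath: string,
): boolean {
  if (classification.cronRunTranscriptPaths.has(absPath)) return true;
  const fileName = absPath.slice(absPath.lastIndexOf("/") + 1);
  const sessionId = extractSessionIdSketch(fileName);
  return sessionId !== null && classification.cronRunSessionIds.has(sessionId);
}

const classification: ClassificationSketch = {
  cronRunTranscriptPaths: new Set(["/agents/main/sessions/1234.jsonl"]),
  cronRunSessionIds: new Set(["1234"]),
};

console.log(
  isCronRunTranscriptPathSketch(
    classification,
    "/agents/main/sessions/1234.trajectory.jsonl.deleted.2026-04-25T06-33-10.801Z",
  ),
); // true: recovered via the session id
console.log(extractSessionIdSketch("sessions.json")); // null
```

Keeping the direct path check first preserves the existing behavior for live transcripts; the session-id fallback only matters once a file has been renamed.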

dreaming-phases.collectSessionIngestionBatches is wired to use the new helpers in place of Set.has(normalizedPath).

2. Operator-facing dreaming session filters

New dreaming.sessionFilter config block in extensions/memory-core/openclaw.plugin.json:

{
  "plugins": {
    "entries": {
      "memory-core": {
        "config": {
          "dreaming": {
            "sessionFilter": {
              "excludeCronJobIds":          ["job-1", "job-2"],
              "excludeSessionKeyPrefixes":  ["agent:main:cron:", "agent:ops:"],
              "excludeAgentIds":            ["batch-runner"],
              "excludeSourcePathRegex":     ["^ops/sessions/.*\\.jsonl$"]
            }
          }
        }
      }
    }
  }
}

resolveSessionIngestionExcludePredicate(cfg, logger) builds a single predicate per ingestion sweep:

  • Returns a no-op () => false when no filter is configured (zero overhead in the common case).
  • Compiles excludeSourcePathRegex once; invalid patterns are logged as warnings and skipped, never throw.
  • Operator-driven; the built-in classifier still runs, so this is defense-in-depth for shapes the classifier misses (notably cron:<id> without :run:).

excludeSessionKeyPrefixes is the most flexible knob since it covers both cron:<id>:run:<runId> and cron:<id> shapes via a single agent:<id>:cron: prefix.
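As a rough illustration of the predicate behavior described above (no-op when unconfigured, regexes compiled once, invalid patterns warned and skipped), here is a minimal sketch. It is not the PR's `resolveSessionIngestionExcludePredicate`: the function name, the candidate shape, and the assumed session key layout `agent:<agentId>:cron:<jobId>[:run:<runId>]` are illustrative assumptions.

```typescript
type SessionFilterSketch = {
  excludeCronJobIds?: string[];
  excludeSessionKeyPrefixes?: string[];
  excludeAgentIds?: string[];
  excludeSourcePathRegex?: string[];
};

// Hypothetical per-transcript input; sessionKey may be unknown for some files.
type CandidateSketch = { sessionKey?: string; sourcePath: string };

function buildExcludePredicateSketch(
  filter: SessionFilterSketch | undefined,
  warn: (message: string) => void,
): (candidate: CandidateSketch) => boolean {
  const prefixes = filter?.excludeSessionKeyPrefixes ?? [];
  const agentIds = new Set(filter?.excludeAgentIds ?? []);
  const cronJobIds = new Set(filter?.excludeCronJobIds ?? []);

  // Compile each pattern once per sweep; bad patterns are reported and skipped.
  const regexes: RegExp[] = [];
  for (const source of filter?.excludeSourcePathRegex ?? []) {
    try {
      regexes.push(new RegExp(source));
    } catch {
      warn(`ignoring invalid excludeSourcePathRegex pattern: ${source}`);
    }
  }

  const hasRules =
    prefixes.length > 0 || agentIds.size > 0 || cronJobIds.size > 0 || regexes.length > 0;
  if (!hasRules) return () => false; // nothing configured: exclude nothing

  return (candidate) => {
    if (regexes.some((re) => re.test(candidate.sourcePath))) return true;
    const key = candidate.sessionKey;
    if (key === undefined) return false;
    if (prefixes.some((prefix) => key.startsWith(prefix))) return true;
    // Assumed key layout: agent:<agentId>:cron:<jobId>[:run:<runId>]
    const parts = key.split(":");
    const agentId = parts[0] === "agent" ? parts[1] : undefined;
    const cronJobId = parts[2] === "cron" ? parts[3] : undefined;
    if (agentId !== undefined && agentIds.has(agentId)) return true;
    return cronJobId !== undefined && cronJobIds.has(cronJobId);
  };
}

const warnings: string[] = [];
const shouldExclude = buildExcludePredicateSketch(
  {
    excludeSessionKeyPrefixes: ["agent:main:cron:"],
    excludeSourcePathRegex: ["(", "^ops/sessions/.*\\.jsonl$"],
  },
  (message) => warnings.push(message),
);

console.log(shouldExclude({ sessionKey: "agent:main:cron:job-9", sourcePath: "main/sessions/a.jsonl" })); // true
console.log(shouldExclude({ sessionKey: "agent:main:telegram:42", sourcePath: "main/sessions/b.jsonl" })); // false
console.log(warnings.length); // 1 (the "(" pattern)
```

Note that the prefix `agent:main:cron:` matches a non-run key such as `agent:main:cron:job-9`, which is the shape the PR says the built-in classifier does not catch.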

Tests

src/memory-host-sdk/host/session-files.test.ts (+10 tests, 34 total):

  • extractSessionIdFromTranscriptFileName — primary jsonl, .jsonl.deleted.<ts>, .jsonl.reset.<ts>, .trajectory.jsonl, .trajectory.jsonl.deleted.<ts>, non-transcript / null shapes.
  • isCronRunTranscriptPath / isDreamingNarrativeTranscriptPath — live path classification, rotated .deleted.<ts> recovery, rotated .trajectory.jsonl.deleted.<ts> recovery, live .trajectory.jsonl recovery, unrelated transcripts not matched.
  • lookupSessionKeyForTranscriptPath — live path, rotated path, unknown path.

extensions/memory-core/src/dreaming-phases.test.ts (+2 tests, 32 total):

  • "skips rotated cron run transcripts (.jsonl.deleted.<ts>) via session id (#72611)" — full ingestion harness, asserts session-corpus/2026-04-05.txt is not created.
  • "respects dreaming.sessionFilter.excludeSessionKeyPrefixes for operator-driven exclusion" — uses a cron:<id> (non-run) key the built-in classifier wouldn't catch; only the operator filter drops it.

Out of scope

  • Does not change isCronRunSessionKey regex — kept strict for back-compat. Operators who want to drop cron:<id> (non-run) shapes use the new excludeSessionKeyPrefixes knob.
  • Does not change disk repair (session-file-repair.ts) — only the in-memory ingestion path.
  • Live cron transcripts already in the corpus stay; this fix only stops new ingestion. A separate cleanup task (e.g. openclaw doctor --fix extension) could prune historical leaks.

Verified

  • pnpm tsgo:core
  • pnpm tsgo:extensions
  • pnpm test src/memory-host-sdk/host/session-files.test.ts → 34/34 pass
  • pnpm test extensions/memory-core/src/dreaming-phases.test.ts → 32/32 pass
  • pnpm check:changed → all gates pass (lint / typecheck / 0 import cycles / policy guards)

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • extensions/memory-core/openclaw.plugin.json (modified, +47/-0)
  • extensions/memory-core/src/dreaming-phases.test.ts (modified, +441/-0)
  • extensions/memory-core/src/dreaming-phases.ts (modified, +224/-6)
  • src/memory-host-sdk/engine-qmd.ts (modified, +4/-0)
  • src/memory-host-sdk/host/session-files.test.ts (modified, +232/-1)
  • src/memory-host-sdk/host/session-files.ts (modified, +186/-4)


Summary

Dreaming currently has no documented/configurable way to exclude specific sessions, cron jobs, groups/topics, or session-key prefixes from session transcript ingestion. On a real 2026.4.24 deployment, isolated cron runs are still appearing in memory/.dreams/session-corpus/YYYY-MM-DD.txt, even though the runtime contains generatedByCronRun / DIRECT_CRON_PROMPT_RE filtering logic.

This makes cron-heavy deployments difficult to operate without maintaining local Dreaming patches. Moving the cron delivery target to a Telegram group/topic does not help, because Dreaming scans agent session transcript files rather than the final delivery surface.

Related: #68449, but this issue is specifically about the missing operator-facing exclusion/config path and the remaining leak of isolated cron transcripts into the session corpus.

Environment

  • OpenClaw: 2026.4.24 (764822a)
  • Dreaming: enabled via plugins.entries.memory-core.config.dreaming.enabled: true
  • Memory backend: builtin
  • Affected corpus path: memory/.dreams/session-corpus/YYYY-MM-DD.txt
  • Cron job type: payload.kind=agentTurn, sessionTarget=isolated, delivery.mode=announce

What I expected

At least one of these should be true:

  1. sessionTarget: isolated cron runs are not ingested into Dreaming's session corpus, or
  2. Cron-run transcripts are reliably classified as generatedByCronRun and skipped, including deleted/rotated transcript filenames, or
  3. Operators can configure Dreaming to ignore specific cron jobs/session-key prefixes/agents/surfaces, e.g. an allowlist/blocklist such as:
    • dreaming.sessionFilter.excludeSessionKeyPrefixes
    • dreaming.sessionFilter.excludeCronJobIds
    • dreaming.sessionFilter.excludeAgents
    • dreaming.sessionFilter.excludeRegex

What happened

A cron job already configured as sessionTarget: isolated still appeared in Dreaming session corpus.

Counts from local memory/.dreams/session-corpus inspection:

  • 2026-04-24.txt: 11 entries for the cron job
  • 2026-04-25.txt: 17 entries for the cron job
  • 2026-04-26.txt: 2 entries for the cron job
  • 2026-04-27.txt: 0 entries at the time of inspection

Example corpus line shape:

[main/sessions/<uuid>.jsonl.deleted.2026-04-25T06-33-10.801Z#L5] User: [cron:<cron-id> <cron-name>] ...

The source paths are important: many contaminated entries reference deleted/rotated transcript filenames such as:

main/sessions/<uuid>.jsonl.deleted.<timestamp>
main/sessions/<uuid>.trajectory.jsonl.deleted.<timestamp>
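Per-day counts like the ones above can be gathered with a small scan over corpus text. The sketch below is an assumption-heavy illustration: the line pattern is inferred solely from the single example line in this report, the function name is hypothetical, and real corpus files may contain other line shapes that this does not match.

```typescript
// Inferred from the example line: "[<source path>#L<n>] User: [cron:<cron-id> <cron-name>] ..."
const CRON_CORPUS_LINE_RE = /^\[[^\]#]+#L\d+\] User: \[cron:([^\s\]]+)/;

// Counts corpus lines per cron id for one day's corpus text.
function countCronEntriesSketch(corpusText: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of corpusText.split("\n")) {
    const cronId = CRON_CORPUS_LINE_RE.exec(line)?.[1];
    if (cronId === undefined) continue;
    counts.set(cronId, (counts.get(cronId) ?? 0) + 1);
  }
  return counts;
}

// Made-up sample lines in the reported shape.
const sampleCorpus = [
  "[main/sessions/u1.jsonl.deleted.2026-04-25T06-33-10.801Z#L5] User: [cron:job-1 nightly] run",
  "[main/sessions/u2.jsonl#L2] User: hello",
  "[main/sessions/u3.trajectory.jsonl.deleted.2026-04-25T07-00-00.000Z#L9] User: [cron:job-1 nightly] run",
].join("\n");

console.log(countCronEntriesSketch(sampleCorpus).get("job-1")); // 2
```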

The current runtime code appears to rely on session-store classification to mark cron transcripts:

  • loadSessionTranscriptClassificationForSessionsDir(...)
  • isCronRunSessionKey(...)
  • cronRunTranscriptPaths
  • generatedByCronRun

But once transcript filenames are rotated/deleted, the path in session-corpus may no longer match the live sessions.json sessionFile path, so classification can fail. DIRECT_CRON_PROMPT_RE should still catch direct User: [cron:...] messages, but the corpus evidence shows these prompt lines were already ingested in recent runs.

Config gap checked

I checked the current config schema. memory only exposes:

  • memory.backend
  • memory.citations
  • memory.qmd.*

The memory-core Dreaming plugin config schema currently exposes only broad phase settings:

  • dreaming.enabled
  • dreaming.frequency
  • dreaming.timezone
  • dreaming.verboseLogging
  • dreaming.storage
  • dreaming.phases.light.*
  • dreaming.phases.deep.*
  • dreaming.phases.rem.*

I could not find a config option to skip a session, group/topic, cron job, or session-key prefix.

Why this matters

Cron jobs often contain operational prompts, tool stdout summaries, and repetitive status text that are useful as notifications but harmful as durable Dreaming input. In a personal-agent deployment, users should not have to fork/patch Dreaming just to keep maintenance cron transcripts out of memory.

Changing Telegram delivery target is not a workaround: the transcript is still stored under the agent and scanned by Dreaming.

Suggested fixes

A robust fix probably needs both:

  1. Make cron transcript skipping reliable

    • classify by record/session metadata where available, not only by current session-store transcript path
    • handle .jsonl.deleted.<timestamp> and .trajectory.jsonl.deleted.<timestamp> rotated transcript artifacts
    • keep direct [cron:...] prompt filtering as a defense-in-depth path
  2. Add operator-facing Dreaming session filters

    • block by cron job id/name
    • block by session key prefix, e.g. agent:main:cron:
    • block by agent id
    • block by regex over rendered source path/snippet
    • optionally allow an explicit include-only policy for session ingestion

This would let deployments keep Dreaming on for real human conversations while excluding high-volume automation/cron sessions without local patches.

extent analysis

TL;DR

To fix the issue of cron job transcripts being ingested into Dreaming's session corpus, implement reliable cron transcript skipping and add operator-facing Dreaming session filters.

Guidance

  • Review the current runtime code to ensure that session-store classification is correctly marking cron transcripts, and consider enhancing the classification to handle rotated/deleted transcript filenames.
  • Implement a defense-in-depth approach by keeping direct [cron:...] prompt filtering to catch any cron transcripts that may not be caught by session-store classification.
  • Add configuration options to allow operators to exclude specific cron jobs, session-key prefixes, agents, or surfaces from Dreaming's session corpus, such as dreaming.sessionFilter.excludeSessionKeyPrefixes or dreaming.sessionFilter.excludeCronJobIds.
  • Consider adding an explicit include-only policy for session ingestion to provide more fine-grained control over what is included in the session corpus.

Notes

The suggested fixes require modifications to the Dreaming plugin and its configuration schema. The implementation details may vary depending on the specific requirements and constraints of the deployment.

Recommendation

Apply the fix from PR #72913, which makes cron transcript skipping reliable for rotated artifacts and adds operator-facing dreaming.sessionFilter options. This allows deployments to keep Dreaming on for real human conversations while excluding high-volume automation/cron sessions without local patches.
