openclaw - 💡(How to fix) Fix [Bug]: First dreaming-narrative subagent run after gateway idle exceeds NARRATIVE_TIMEOUT_MS due to plugin-host cold-start



Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

On a gateway that has been idle long enough for the subagent runtime's plugin host to cool, the first memory-core dreaming-narrative subagent.run of the morning sweep routinely exceeds the hard-coded 60 s NARRATIVE_TIMEOUT_MS. Loading non-bundled tsx extensions consumes most of the timeout before Claude is invoked. As a result, the first workspace deterministically gets no light-phase dream-diary entry, while every subsequent workspace in the same sweep completes in ~10-15 s.

Steps to reproduce

  1. Install 4 non-bundled extensions under ~/.openclaw/extensions/*/src/index.ts (ones that tsx must compile on first load). Keep them out of plugins.allow.

  2. Enable the memory-core nightly dreaming cron:

    "plugins": {
      "entries": {
        "memory-core": {
          "enabled": true,
          "config": {
            "dreaming": { "enabled": true, "frequency": "0 5 * * *" }
          }
        }
      }
    }

  3. Configure ≥ 2 workspaces in agents.list (the one that sorts first in the sweep is the one affected).

  4. Let the gateway sit idle ≥ ~30 min before the cron fires (e.g. run the gateway overnight with no inbound messages). The subagent runtime's plugin host cools during the idle window.

  5. Wait for the dreaming cron to fire.

  6. Inspect the journal/log around the fire time.
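
For step 6, the light-phase outcome per workspace can be read off the log mechanically. The following is a minimal sketch, assuming the log lines have the wording shown in the timeline in this report; lightOutcomes and the LightOutcome type are hypothetical helpers for illustration, not part of OpenClaw.

```typescript
// Outcome of the light phase for one workspace in a sweep.
type LightOutcome = { workspace: string; status: "written" | "timeout" | "unknown" };

// Patterns copied from the log wording quoted in this issue.
const STAGED = /light dreaming staged \d+ candidate\(s\) \[workspace=([^\]]+)\]/;
const WRITTEN = /dream diary entry written for light phase \[workspace=([^\]]+)\]/;
const TIMEOUT = /narrative generation ended with status=timeout for light phase/;

function lightOutcomes(lines: string[]): LightOutcome[] {
  const outcomes: LightOutcome[] = [];
  let current: LightOutcome | undefined;
  for (const line of lines) {
    const staged = STAGED.exec(line);
    if (staged) {
      current = { workspace: staged[1], status: "unknown" };
      outcomes.push(current);
      continue;
    }
    // The timeout line carries no workspace, so it is attributed to the
    // most recently staged workspace (an assumption of this sketch).
    if (TIMEOUT.test(line) && current) {
      current.status = "timeout";
      continue;
    }
    const written = WRITTEN.exec(line);
    if (written) {
      const hit = outcomes.find((o) => o.workspace === written[1]);
      if (hit && hit.status === "unknown") hit.status = "written";
    }
  }
  return outcomes;
}
```

On an affected morning, the first entry in the result would be expected to carry status "timeout" and the later ones "written".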

Expected behavior

Every workspace's light-phase narrative completes within a similar latency envelope (~10-15 s end-to-end for 50-80 staged candidates), with the light-phase dream-diary entry written to DREAMS.md. This is what the 2nd–Nth workspaces in the same sweep do on the same run.

Actual behavior

The first-in-sweep workspace consistently hits status=timeout on light phase and gets no light-phase diary entry. REM phase recovers for that workspace because by then the plugin host has warmed. All other workspaces in the same sweep succeed.

Timeline from one affected morning (relevant lines only, timestamps verbatim, paths anonymized):

05:00:00.158 [plugins] memory-core: managed dreaming cron could not be reconciled (cron service unavailable).
05:00:01.149 [plugins] memory-core: light dreaming staged 70 candidate(s) [workspace=~/.openclaw/workspace-<first>]
05:00:34.931 [whatsapp] Web connection closed (status 408). Retry 1/12 in 2.41s…   # incidental; not related
05:00:46.875 [plugins] plugins.allow is empty; discovered non-bundled plugins may auto-load: <four non-bundled ids> (~/.openclaw/extensions/<id>/src/index.ts ...). Set plugins.allow to explicit trusted ids.
05:00:47.721 [<plugin-a>] register() called, registering hook + tools
05:00:47.735 [<plugin-b>] register() called, registering hook + tool
05:01:07.956 [plugins] memory-core: narrative generation ended with status=timeout for light phase.
05:01:25.763 [plugins] memory-core: REM dreaming wrote reflections from N recent memory trace(s) [workspace=~/.openclaw/workspace-<first>]
05:01:38.190 [plugins] memory-core: dream diary entry written for rem phase [workspace=~/.openclaw/workspace-<first>]
05:01:38.972 [plugins] memory-core: light dreaming staged 79 candidate(s) [workspace=~/.openclaw/workspace-<second>]
05:01:51.115 [plugins] memory-core: dream diary entry written for light phase [workspace=~/.openclaw/workspace-<second>]   # ← 12 s; plugin host now warm

Net: from light dreaming staged to plugins.allow is empty + register() there is ~46 s of silence (the tsx import plus plugin-host init). Only ~21 s then pass before the timeout fires, which is not enough for Claude to return. Total wall time from stage to timeout is ~67 s, which exceeds NARRATIVE_TIMEOUT_MS = 60_000.
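
The intervals above can be recomputed directly from the log timestamps. This is a small worked sketch; toMs is a helper introduced here for illustration, and the timestamps are the ones quoted in the timeline.

```typescript
// Convert an "HH:MM:SS.mmm" log timestamp to milliseconds since midnight.
function toMs(stamp: string): number {
  const match = /^(\d{2}):(\d{2}):(\d{2})\.(\d{3})$/.exec(stamp);
  if (!match) throw new Error(`unexpected timestamp: ${stamp}`);
  const [h, m, s, ms] = match.slice(1).map(Number);
  return ((h * 60 + m) * 60 + s) * 1000 + ms;
}

const NARRATIVE_TIMEOUT_MS = 60_000; // value quoted in the issue

// First workspace (cold plugin host).
const staged = toMs("05:00:01.149");        // light dreaming staged
const pluginWarning = toMs("05:00:46.875"); // plugins.allow is empty ...
const timedOut = toMs("05:01:07.956");      // status=timeout for light phase

const silentInitMs = pluginWarning - staged;  // 45_726 ms of silence
const afterInitMs = timedOut - pluginWarning; // 21_081 ms until the timeout fires
const stageToTimeoutMs = timedOut - staged;   // 66_807 ms, above the 60 s budget

// Second workspace (warm plugin host), staged to diary entry written.
const warmRunMs = toMs("05:01:51.115") - toMs("05:01:38.972"); // 12_143 ms
```

The cold run spends roughly four times the whole warm run on initialization alone, before any narrative output appears.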

Reproduced 3/4 consecutive mornings on the same install. On the 4th morning, ~25 min of inbound channel activity in the hour before the sweep kept the runtime warm, and the first workspace completed in 23 s.

The session transcript written for the timed-out run (sessionKey dreaming-narrative-light-<hash>-<nowMs>) confirms that the Claude call itself, once it actually starts, completes in ~5-13 s — the bottleneck is everything in front of it, not the model.

OpenClaw version

2026.4.14 (323493f)

Operating system

Ubuntu 24.04 LTS

Install method

npm global (/usr/lib/node_modules/openclaw)

Model

anthropic/claude-sonnet-4-6

Provider / routing chain

openclaw -> anthropic

Additional provider/model setup details

None relevant. Bug is runtime-init / plugin-loader timing, not model- or provider-dependent. Reproduces with stock anthropic provider config; prompt is the built-in NARRATIVE_SYSTEM_PROMPT in extensions/memory-core/src/dreaming-narrative.ts.

Logs, screenshots, and evidence

Source references:

  • extensions/memory-core/src/dreaming-narrative.ts:87: const NARRATIVE_TIMEOUT_MS = 60_000;
  • extensions/memory-core/src/dreaming-narrative.ts:877: the subagent.waitForRun({ runId, timeoutMs: NARRATIVE_TIMEOUT_MS }) call
  • extensions/memory-core/src/dreaming-narrative.ts, generateAndAppendDreamNarrative: the call site that emits narrative generation ended with status=timeout when the wait elapses

Log excerpts already inlined above. Session transcripts corroborate: the Claude call only starts writing to the session jsonl ~40-65 s after light dreaming staged, i.e. most or all of the NARRATIVE_TIMEOUT_MS budget is spent on tsx-loading plugins, not on the narrative generation itself.
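
To illustrate how the hard-coded constant referenced above could become overridable, here is a minimal sketch that resolves the timeout from plugin config using the narrativeTimeoutMs key suggested under Additional information. It is not the actual memory-core code: resolveNarrativeTimeoutMs and the MemoryCoreConfig shape are assumptions, and only the 60_000 default comes from the source.

```typescript
// Default taken from the constant quoted in the issue.
const DEFAULT_NARRATIVE_TIMEOUT_MS = 60_000;

// Hypothetical config shape; only the key name `narrativeTimeoutMs`
// appears in the issue, everything else here is an assumption.
interface MemoryCoreConfig {
  narrativeTimeoutMs?: unknown;
}

// Return a usable timeout: the configured value when it is a positive,
// finite number, otherwise the existing 60 s default.
function resolveNarrativeTimeoutMs(config: MemoryCoreConfig | undefined): number {
  const raw = config?.narrativeTimeoutMs;
  if (typeof raw === "number" && Number.isFinite(raw) && raw > 0) {
    return Math.floor(raw);
  }
  return DEFAULT_NARRATIVE_TIMEOUT_MS;
}
```

Falling back to the default on invalid input keeps existing installs unchanged, so the override would only take effect where someone sets it deliberately.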

Impact and severity

  • Affected: any deployment that (a) has non-bundled extensions under ~/.openclaw/extensions/ loaded from TypeScript sources via tsx, (b) runs the memory-core nightly dreaming cron, and (c) typically sits idle overnight before the cron fires. This is the default shape for any install that adds even one custom TypeScript extension and leaves the gateway up 24/7.
  • Severity: Moderate — the first workspace in the sweep silently loses one of its two daily dream-diary entries (light phase). Quality/parity regression, not a crash.
  • Frequency: Deterministic on cold mornings (3/3 observed); masked only by luck when unrelated channel activity happens to warm the runtime just before the cron fires.
  • Consequence: One workspace's light-phase narrative and its associated short-term promotions for that day never land. Asymmetric across workspaces (always the same one in the sort order), so a single user's dream-diary silently thins out over time relative to others.

Additional information

Suggested fixes (in rough order of preference, not prescriptive):

  1. Eager-load non-bundled plugins at gateway startup so the first runtime consumer doesn't pay the tsx compile cost. plugins.allow already exists as the trust boundary and could also gate eager-load. Setting plugins.allow in openclaw.json on an affected install already causes the gateway to list the expected plugins as loaded on startup ([gateway] ready (N plugins: ...; 18.7s)), so the mechanism exists; it just doesn't preempt the per-session plugin-host init on the subagent runtime path.
  2. Make NARRATIVE_TIMEOUT_MS configurable via plugins.entries["memory-core"].config.narrativeTimeoutMs (same shape as the existing dreaming.* config). This is a partial mitigation — it masks the slow cold-start instead of fixing it, but would at least make the first dream of the morning reliable for installs where eager-loading is undesirable.
  3. Warm the subagent runtime's plugin host in parallel with gateway boot, so scheduled crons that fire soon after boot also hit a hot host.
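
The warm-up idea in items 1 and 3 can be sketched as a memoized initializer: boot code starts the load without waiting on it, and the first real consumer awaits the same in-flight promise instead of starting a second load. This is a sketch under assumptions; createWarmup and PluginHostLoader are hypothetical names, and the real plugin-host init entry point is not identified in the issue.

```typescript
// Hypothetical stand-in for whatever initializes the subagent plugin host.
type PluginHostLoader = () => Promise<void>;

// Wrap a loader so that it runs at most once at a time: concurrent callers
// share the same in-flight promise, and a failed load can be retried.
function createWarmup(load: PluginHostLoader): () => Promise<void> {
  let inflight: Promise<void> | undefined;
  return () => {
    if (!inflight) {
      inflight = load().catch((err) => {
        inflight = undefined; // clear so a later call can retry
        throw err;
      });
    }
    return inflight;
  };
}
```

At gateway boot the returned function would be called without awaiting it, and the dreaming path would await it before starting its timeout clock, so a cold load would no longer be charged against the narrative budget.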

Current client-side workaround: a lightweight agentTurn cron scheduled 2 min before the dreaming cron, which forces the subagent runtime to initialize on a throwaway session. Works, but it's an ugly papering-over of a cold-start that should not be in the hot path of the built-in dreaming cron.
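
To keep the warm-up job 2 min ahead of the dreaming cron, its schedule can be derived from the dreaming frequency rather than hard-coded. The following is a minimal sketch, assuming a five-field cron expression with fixed numeric minute and hour, as in "0 5 * * *"; shiftCronEarlier is a hypothetical helper, and the agentTurn job definition itself is not shown because its config shape is not given in the issue.

```typescript
// Shift a five-field cron expression earlier by a whole number of minutes.
// Only fixed numeric minute and hour fields are supported in this sketch.
function shiftCronEarlier(expr: string, minutes: number): string {
  if (!Number.isInteger(minutes) || minutes < 0) {
    throw new Error("minutes must be a non-negative integer");
  }
  const fields = expr.trim().split(/\s+/);
  if (fields.length !== 5) throw new Error(`expected 5 cron fields, got: ${expr}`);
  const [minute, hour, ...dayFields] = fields;
  if (!/^\d+$/.test(minute) || !/^\d+$/.test(hour)) {
    throw new Error("only fixed numeric minute and hour fields are supported");
  }
  const MINUTES_PER_DAY = 24 * 60;
  const shifted = Number(hour) * 60 + Number(minute) - minutes;
  // Crossing midnight also moves the day, which is only safe when every
  // day field is a wildcard.
  if (shifted < 0 && dayFields.some((f) => f !== "*")) {
    throw new Error("shift crosses midnight but day fields are not wildcards");
  }
  const wrapped = ((shifted % MINUTES_PER_DAY) + MINUTES_PER_DAY) % MINUTES_PER_DAY;
  return [wrapped % 60, Math.floor(wrapped / 60), ...dayFields].join(" ");
}

// Dreaming cron from the repro config, warmed 2 minutes earlier.
const warmupSchedule = shiftCronEarlier("0 5 * * *", 2); // "58 4 * * *"
```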

extent analysis

TL;DR

Eager-load non-bundled plugins so the first workspace's light-phase narrative no longer pays the cold-start cost, or make the hard-coded NARRATIVE_TIMEOUT_MS configurable and raise it as a partial mitigation.

Guidance

  • Root cause: tsx compilation of non-bundled plugins plus plugin-host init consumes ~46 s of the 60 s NARRATIVE_TIMEOUT_MS budget before Claude is invoked, leaving too little time for the narrative call.
  • Raising NARRATIVE_TIMEOUT_MS is only a partial mitigation, and because the value is hard-coded it first has to be made configurable.
  • Eager-load non-bundled plugins at gateway startup, using the existing plugins.allow mechanism as the gate, so the first runtime consumer doesn't pay the tsx compile cost.
  • Alternatively, warm the subagent runtime's plugin host in parallel with gateway boot to avoid the cold start.

Example

No code snippet is provided as the issue is more related to configuration and plugin loading rather than a specific code fix.

Notes

These suggestions are based solely on the information in the issue and need testing and validation before they can be relied on. Eager-loading is the preferred fix because it addresses the root cause.

Recommendation

Prefer the eager-loading fix gated by plugins.allow, since it removes the cold start from the dreaming path and stops the first workspace from silently losing its light-phase dream-diary entry. Note that setting plugins.allow alone does not achieve this today. Until a fix lands, the only workaround reported to work is the lightweight agentTurn cron scheduled 2 min before the dreaming cron.
