openclaw - 💡(How to fix) Fix cron `systemEvent` on long-lived session re-sends full transcript per tick — recurring jobs amplify cost

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

When a recurring cron job uses `payload.kind: "systemEvent"` targeted at a long-lived session (`sessionTarget: "main"` or an explicit `sessionKey` for an existing session), each scheduled tick starts an embedded agent run that re-sends the entire prior session transcript to the model.

If the host session has been running for days and accumulated significant history, each "ping" costs on the order of a heavy turn rather than a small one. A single tick can also keep one model call open for many minutes — in my logs I have observed `durationMs ≈ 670s` (~11 min) for a single embedded run, along with repeated diagnostic lines of the form:

``` [diagnostic] stalled session: ... activeWorkKind=model_call age=...s reason=active_work_without_progress classification=stalled_agent_run recovery=none ```

Combined with any case where a recurring job runs longer than the user intended (e.g. `deleteAfterRun` not honored for `kind: "every"` — filed as a separate issue), this turns a "tiny periodic reminder" into a substantial unintended LLM spend, silently.

Root Cause

When a recurring cron job uses `payload.kind: "systemEvent"` targeted at a long-lived session (`sessionTarget: "main"` or an explicit `sessionKey` for an existing session), each scheduled tick starts an embedded agent run that re-sends the entire prior session transcript to the model.

If the host session has been running for days and accumulated significant history, each "ping" costs on the order of a heavy turn rather than a small one. A single tick can also keep one model call open for many minutes — in my logs I have observed `durationMs ≈ 670s` (~11 min) for a single embedded run, along with repeated diagnostic lines of the form:

``` [diagnostic] stalled session: ... activeWorkKind=model_call age=...s reason=active_work_without_progress classification=stalled_agent_run recovery=none ```

Combined with any case where a recurring job runs longer than the user intended (e.g. `deleteAfterRun` not honored for `kind: "every"` — filed as a separate issue), this turns a "tiny periodic reminder" into a substantial unintended LLM spend, silently.

Fix Action

Fix / Workaround

Suggested mitigations (any combination would help)

RAW_BUFFERClick to expand / collapse

Summary

When a recurring cron job uses `payload.kind: "systemEvent"` targeted at a long-lived session (`sessionTarget: "main"` or an explicit `sessionKey` for an existing session), each scheduled tick starts an embedded agent run that re-sends the entire prior session transcript to the model.

If the host session has been running for days and accumulated significant history, each "ping" costs on the order of a heavy turn rather than a small one. A single tick can also keep one model call open for many minutes — in my logs I have observed `durationMs ≈ 670s` (~11 min) for a single embedded run, along with repeated diagnostic lines of the form:

``` [diagnostic] stalled session: ... activeWorkKind=model_call age=...s reason=active_work_without_progress classification=stalled_agent_run recovery=none ```

Combined with any case where a recurring job runs longer than the user intended (e.g. `deleteAfterRun` not honored for `kind: "every"` — filed as a separate issue), this turns a "tiny periodic reminder" into a substantial unintended LLM spend, silently.

Reproduction sketch

  1. Run a `sessionTarget: "main"` agent session and let it accumulate several days of events (hundreds-to-thousands of entries in the session JSONL).
  2. Schedule a recurring cron job with `payload.kind: "systemEvent"` targeted at that session.
  3. Observe per-tick LLM token usage and per-tick wall time grow as the host session ages.

Suggested mitigations (any combination would help)

  1. Default to an ephemeral session for recurring `systemEvent` jobs, instead of attaching to the host session. Session reuse becomes opt-in for the cases where the user actually wants the host session to "see" the message.

  2. Per-job context budget / compaction trigger before the embedded run executes — cap the prefix sent to the model when the host session exceeds N events / M MB.

  3. Watchdog recovery for stalled model calls. The current `stalled_agent_run` diagnostic emits `recovery=none` for `activeWorkKind=model_call`. A non-`none` recovery path (cancel after a configurable budget) would prevent a single tick from silently burning a large amount of tokens.

Environment

  • openclaw v2026.4.8

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

openclaw - 💡(How to fix) Fix cron `systemEvent` on long-lived session re-sends full transcript per tick — recurring jobs amplify cost