openclaw: Cron startup catch-up fires duplicate when previous run was itself a late catch-up

openclaw/openclaw#60083 · Fetched 2026-04-08 02:36:36

Bug Description

The cron scheduler's startup catch-up logic (planStartupCatchup → isRunnableJob) has a structural flaw: a job that ran as a late catch-up can be incorrectly classified as "missed" on the next gateway restart, causing a duplicate execution.

Steps to Reproduce

  1. A cron job is scheduled at 0 20 * * * in Europe/Berlin (20:00 CEST = 18:00 UTC)
  2. Gateway is down at scheduled time → job doesn't fire
  3. Gateway restarts at e.g. 01:56 UTC (next day) → catch-up fires correctly (lastRunAtMs = 01:56 UTC, nextRunAtMs set to next day's slot)
  4. Gateway restarts again at e.g. 04:21 UTC the same day

Expected Behavior

The job should NOT fire: it already ran about 2.5 hours earlier (at 01:56 UTC), and nextRunAtMs points to the next day's slot.

Actual Behavior

Job fires again as a catch-up duplicate.

Root Cause

isRunnableJob() has two detection paths. Path 1 (nextRunAtMs >= nowMs) correctly says "not due." But Path 2 (startup-only, gated by allowCronMissedRunByLastRun) recomputes previousRunAtMs from the cron expression:

const previousRunAtMs = computeJobPreviousRunAtMs(job, nowMs);
const lastRunAtMs = job.state.lastRunAtMs;
return previousRunAtMs > lastRunAtMs;

At 04:21 UTC:

  • previousRunAtMs = most recent 0 20 * * * in Europe/Berlin before now = ~18:00 UTC (same day)
  • lastRunAtMs = 01:56 UTC (when catch-up actually ran)
  • 18:00 > 01:56 → TRUE → classified as missed → duplicate fires

The comparison is purely timestamp-based and cannot distinguish "this run was fulfilling yesterday's slot via late catch-up" from "this run happened before today's slot."

Suggested Fix

When applyJobResult records a successful run, it could also store the cron slot timestamp that the run was fulfilling (e.g. lastFulfilledSlotMs). Path 2 would then compare previousRunAtMs > lastFulfilledSlotMs instead of previousRunAtMs > lastRunAtMs, preventing false-positive missed-run detection for late catch-ups.
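
The first suggestion can be sketched as follows. This is a minimal illustration, not the project's actual code: the CronJobState shape, recordSuccessfulRun, and isSlotMissed are hypothetical names, and lastFulfilledSlotMs is the proposed field that does not exist yet. How never-run jobs and legacy state should be handled is also an assumption.

```typescript
// Hypothetical state shape. Only lastRunAtMs and nextRunAtMs are named in the issue;
// lastFulfilledSlotMs is the proposed new field.
interface CronJobState {
  lastRunAtMs?: number;
  nextRunAtMs?: number;
  lastFulfilledSlotMs?: number;
}

// Stand-in for what applyJobResult might record on success:
// when the run happened, and which cron slot it was fulfilling.
function recordSuccessfulRun(
  state: CronJobState,
  ranAtMs: number,
  fulfilledSlotMs: number,
): CronJobState {
  return { ...state, lastRunAtMs: ranAtMs, lastFulfilledSlotMs: fulfilledSlotMs };
}

// Path 2 check: the most recent slot counts as missed only if no run has fulfilled it.
function isSlotMissed(state: CronJobState, previousRunAtMs: number): boolean {
  // Assumption: state written before the new field existed falls back to lastRunAtMs.
  const fulfilled = state.lastFulfilledSlotMs ?? state.lastRunAtMs;
  // Assumption: jobs that never ran are left to the scheduler's existing logic.
  if (fulfilled === undefined) return false;
  return previousRunAtMs > fulfilled;
}
```

With this shape, a late catch-up records the slot it served, so a later restart that recomputes the same slot sees previousRunAtMs equal to lastFulfilledSlotMs and does not fire again.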

Alternatively, Path 2 could be skipped entirely when nextRunAtMs is set and in the future (Path 1 already covers this case correctly).
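
The alternative can be sketched as a guard in front of Path 2. The function name shouldCheckMissedByLastRun and the JobTiming shape are hypothetical; only nextRunAtMs, nowMs, and allowCronMissedRunByLastRun come from the issue.

```typescript
// Hypothetical minimal view of the job state used by the guard.
interface JobTiming {
  nextRunAtMs?: number;
  lastRunAtMs?: number;
}

// Returns true only when the startup-only "missed by last run" check (Path 2) should run.
function shouldCheckMissedByLastRun(
  timing: JobTiming,
  nowMs: number,
  allowCronMissedRunByLastRun: boolean,
): boolean {
  // Path 2 is startup-only and gated by this flag.
  if (!allowCronMissedRunByLastRun) return false;
  // If a next run is already scheduled in the future, Path 1 has decided "not due".
  const next = timing.nextRunAtMs;
  if (typeof next === "number" && next > nowMs) return false;
  return true;
}
```

This guard trusts the persisted nextRunAtMs. If that value could ever be stale or advanced without the job actually running, a genuinely missed slot would be skipped, so the trade-off needs checking against how nextRunAtMs is maintained.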

Environment

  • OpenClaw version: latest (pnpm global install)
  • Cron expression: 0 20 * * * with tz: Europe/Berlin
  • wakeMode: now (but irrelevant — Path 2 ignores wakeMode)

extent analysis

TL;DR

Update the isRunnableJob logic to compare previousRunAtMs with lastFulfilledSlotMs instead of lastRunAtMs to prevent duplicate executions.

Guidance

  • Store the cron slot timestamp that the run was fulfilling (lastFulfilledSlotMs) when recording a successful run in applyJobResult.
  • Update Path 2 in isRunnableJob to compare previousRunAtMs with lastFulfilledSlotMs instead of lastRunAtMs.
  • Alternatively, consider skipping Path 2 entirely when nextRunAtMs is set and in the future, as Path 1 already covers this case correctly.
  • Verify the fix by reproducing the steps to reproduce and checking that the job does not fire again as a catch-up duplicate.
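
A verification run could look like the simulation below. It is a simplified stand-in, not a test against the real scheduler: the slot is modeled as a fixed 18:00 UTC daily time instead of a parsed cron expression with a timezone, the dates are arbitrary, and previousSlotMs, SimState, and simulateRestart are hypothetical helpers.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;
const SLOT_OFFSET_MS = 18 * 60 * 60 * 1000; // simplified slot: 18:00 UTC every day

// Most recent slot at or before nowMs (stand-in for computeJobPreviousRunAtMs).
function previousSlotMs(nowMs: number): number {
  return Math.floor((nowMs - SLOT_OFFSET_MS) / DAY_MS) * DAY_MS + SLOT_OFFSET_MS;
}

interface SimState {
  lastRunAtMs: number;
  lastFulfilledSlotMs: number; // proposed field
}

// Simulates one gateway restart: fires a catch-up if the latest slot is unfulfilled.
function simulateRestart(state: SimState, nowMs: number): boolean {
  const slot = previousSlotMs(nowMs);
  const missed = slot > state.lastFulfilledSlotMs;
  if (missed) {
    state.lastRunAtMs = nowMs;
    state.lastFulfilledSlotMs = slot;
  }
  return missed;
}

// Last normal run: 31 March, 18:00 UTC. The gateway is then down for the 1 April slot.
const seed = Date.UTC(2026, 2, 31, 18, 0);
const state: SimState = { lastRunAtMs: seed, lastFulfilledSlotMs: seed };

const fired = [
  simulateRestart(state, Date.UTC(2026, 3, 2, 1, 56)), // late catch-up for 1 April
  simulateRestart(state, Date.UTC(2026, 3, 2, 4, 21)), // second restart, same slot
  simulateRestart(state, Date.UTC(2026, 3, 2, 18, 5)), // after the 2 April slot
];
```

Under these assumptions the first restart fires, the second does not, and a restart after the following slot fires again.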

Example

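// Path 2, comparing against the slot the last run fulfilled (lastFulfilledSlotMs is the proposed field)
const previousRunAtMs = computeJobPreviousRunAtMs(job, nowMs);
const lastFulfilledSlotMs = job.state.lastFulfilledSlotMs;
return previousRunAtMs > lastFulfilledSlotMs;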

Notes

The suggested fix assumes that the lastFulfilledSlotMs can be accurately stored and retrieved. The alternative approach of skipping Path 2 when nextRunAtMs is set and in the future may also be effective, but its implications should be carefully considered.

Recommendation

Update the isRunnableJob logic to compare previousRunAtMs with lastFulfilledSlotMs instead of lastRunAtMs. This is a fix rather than a workaround, since it addresses the root cause directly.
