openclaw - ✅(Solved) Fix Cron jobs without `id` field cause runtime state to collapse into first job (find-collision) [2 pull requests, 1 comment, 2 participants]

GitHub stats: openclaw/openclaw#72849 (fetched 2026-04-28 06:31:26)
Comments: 1 · Participants: 2 · Timeline: 3 (cross-referenced ×2, commented ×1) · Reactions: 0


PR fix notes

PR #72853: fix(cron): backfill missing job ids

Description (problem / solution / changelog)

Summary

  • generate and persist stable IDs for hand-authored cron jobs missing both id and legacy jobId
  • guard cron outcome application from invalid job IDs so runtime state cannot match an id-less row
  • document the auto-backfill behavior and add a changelog entry

Fixes #72849.

Tests

  • CI=true OPENCLAW_LOCAL_CHECK=0 npm exec [email protected] -- test src/cron/service/store.test.ts src/cron/service/timer.test.ts src/cron/service/timer.regression.test.ts
  • CI=true OPENCLAW_LOCAL_CHECK=0 npm exec [email protected] -- run check:changed

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • docs/cli/cron.md (modified, +2/-0)
  • src/cron/service/store.test.ts (modified, +74/-0)
  • src/cron/service/store.ts (modified, +22/-0)
  • src/cron/service/timer.ts (modified, +7/-0)

PR #72855: fix(cron): backfill missing job ids at load time to prevent find-collision (#72849)

Description (problem / solution / changelog)

Fixes #72849.

Summary

When `~/.openclaw/cron/jobs.json` contains jobs missing the `id` field, every job's runtime outcome (errors, `lastRunAtMs`, `nextRunAtMs`, `consecutiveErrors`) gets written into the first entry's slot. `applyOutcomeToStoredJob` does `jobs.find(entry => entry.id === result.jobId)` and when both sides are `undefined` the find returns the first entry unconditionally.

The reporter (#72849) spent hours chasing what looked like a single constantly-failing disabled job before realising the failures actually came from a different job whose outcomes were collapsing into that slot via `undefined === undefined`. The only visible tell was a `[cron:undefined]` log line.

Fix

Three layers:

  1. `normalize-job-identity.ts` — when neither `id` nor legacy `jobId` is present, generate a fresh `randomUUID()` and signal it back via a new `backfilledMissingId: boolean` field on the return value.
  2. `service/store.ts:ensureLoaded` — log a warning whenever the loader backfills, prompting the operator to run `openclaw doctor --fix` to persist the canonical shape.
  3. `service/timer.ts:applyOutcomeToStoredJob` — defensive guard refuses to apply outcomes with missing `result.jobId` instead of letting `find` silently match the first job. Belt-and-suspenders with the load-time backfill.
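The first layer can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `RawJob` shape and the `normalizeIdentity` name are hypothetical, and only the `backfilledMissingId` flag and the use of `randomUUID()` come from the description above.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical raw shape for one entry of jobs.json.
type RawJob = { id?: unknown; jobId?: unknown; name?: string };

const nonEmpty = (v: unknown): v is string =>
  typeof v === "string" && v.trim() !== "";

// Mutates `raw` so it always ends up with a usable string id, and reports
// whether that id had to be generated.
function normalizeIdentity(raw: RawJob): { backfilledMissingId: boolean } {
  const legacy = raw.jobId;
  if (!nonEmpty(raw.id) && nonEmpty(legacy)) {
    raw.id = legacy.trim(); // migrate legacy jobId -> id
  }
  delete raw.jobId;

  const backfilledMissingId = !nonEmpty(raw.id);
  if (backfilledMissingId) raw.id = randomUUID();
  return { backfilledMissingId };
}

const a: RawJob = { name: "health-snapshot" };
const b: RawJob = { name: "auto-memory-dream", id: "   " };
const c: RawJob = { name: "keep-me", id: "existing-id" };

const flags = [a, b, c].map((job) => normalizeIdentity(job).backfilledMissingId);
console.log(flags);               // [ true, true, false ]
console.log(a.id !== b.id, c.id); // true existing-id
```

Returning a flag rather than logging inside the normalizer leaves the caller, here the store loader, free to decide whether to warn or persist.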

Tests

  • 4 existing `normalize-job-identity` tests still pass.
  • 3 new regression tests cover:
    • Backfill produces distinct UUIDs across calls (so identity-based finds cannot collide on a fresh load).
    • Empty / whitespace `id` triggers backfill.
    • Pre-existing `id` is preserved (no churn).

```
Test Files  84 passed (84)
     Tests  734 passed (734)
```

Full cron suite green.

Changed files

  • CHANGELOG.md (modified, +4/-0)
  • src/cron/normalize-job-identity.test.ts (modified, +30/-0)
  • src/cron/normalize-job-identity.ts (modified, +18/-3)
  • src/cron/service/store.ts (modified, +11/-1)
  • src/cron/service/timer.ts (modified, +12/-0)


Cron jobs without id field cause runtime state to collapse into first job (find-collision)

Summary

When ~/.openclaw/cron/jobs.json contains jobs missing the id field, the scheduler's Array#find() matches the first entry every time, causing every job's runtime state (errors, lastRunAtMs, nextRunAtMs, consecutiveErrors) to be written into the first job's slot. This makes diagnostics deeply misleading — one job appears to be the source of all failures when in fact it may be uninvolved.

Version

[email protected] (npm-global install, Linux x86_64, Ubuntu 24.04.4)

Repro

  1. Create ~/.openclaw/cron/jobs.json with multiple jobs[] entries that have name set but no id field. (This is the natural state when jobs.json is hand-authored or migrated from older config that didn't require id.)
  2. Restart openclaw-gateway.
  3. Let any cron job error.

Symptom

  • The first job in the array accumulates consecutiveErrors, lastError, lastRunAtMs, nextRunAtMs from all other jobs.
  • Journal shows [cron:undefined] log lines (e.g. [cron:undefined] skipping stale delivery scheduled at <timestamp>).
  • A disabled: true job can appear to be the failing one even when it never fires, because it occupies the first slot.

Root cause

In dist/server.impl-DYBxSjUs.js, applyOutcomeToStoredJob matches by id:

const job = jobs.find((entry) => entry.id === result.jobId);

When all jobs have id === undefined, this returns the first entry unconditionally.
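The collision is easy to reproduce in isolation. The sketch below is illustrative only: the `Job` and `Outcome` shapes and the `applyOutcome` helper are hypothetical stand-ins for the openclaw internals, and only the `find` predicate is taken from the snippet above.

```typescript
// Hypothetical minimal shapes; the real openclaw types carry more fields.
type Job = { id?: string; name: string; state: { consecutiveErrors: number } };
type Outcome = { jobId: string | undefined; ok: boolean };

const first: Job = { name: "auto-memory-dream", state: { consecutiveErrors: 0 } };
const second: Job = { name: "health-snapshot", state: { consecutiveErrors: 0 } };
const jobs: Job[] = [first, second];

// Same predicate as in applyOutcomeToStoredJob, with no guard.
function applyOutcome(all: Job[], result: Outcome): Job | undefined {
  const job = all.find((entry) => entry.id === result.jobId);
  if (job && !result.ok) job.state.consecutiveErrors += 1;
  return job;
}

// health-snapshot fails, but it has no id, so its outcome carries
// jobId: undefined, and undefined === undefined matches the first entry.
const hit = applyOutcome(jobs, { jobId: second.id, ok: false });

console.log(hit?.name);                                  // auto-memory-dream
console.log(jobs.map((j) => j.state.consecutiveErrors)); // [ 1, 0 ]
```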

The constructor path (createJob in dist/jobs-CXWpCyUn.js line 526–555) already generates crypto.randomUUID(), but plain config-load via loadCronStore (dist/store-C0bwjphP.js) does not backfill. normalizeCronJobIdentityFields only migrates legacyJobId → id; with neither field present, raw.id stays undefined.

Other affected paths (verified by grep): runDueJob (server.impl line 4798), nextWakeAtMs job iteration (jobs.js line 513).

Proposed fix

Two changes:

1. Backfill UUIDs at load time in loadCronStore / ensureLoaded

import { randomUUID } from "node:crypto";

for (const job of jobs) {
    const raw = job;
    const { legacyJobIdIssue } = normalizeCronJobIdentityFields(raw);
    if (legacyJobIdIssue) { /* existing warn */ }
    if (typeof job.enabled !== "boolean") job.enabled = true;
    if (typeof job.id !== "string" || !job.id.trim()) {
        job.id = randomUUID();
        state.deps.log.warn(
          { storePath, jobName: job.name, newId: job.id },
          "cron: backfilled missing id for job; writing canonical shape"
        );
        state.requiresPersist = true;
    }
}
if (state.requiresPersist && !opts?.skipPersist) await persist(state);

2. Defensive guard against undefined jobId

In applyOutcomeToStoredJob (server.impl line 4697), runDueJob (line 4798), and the collectRunnableJobs mapping:

if (result.jobId === undefined) {
    state.deps.log.error({ result }, "cron: refusing to apply outcome with undefined jobId");
    return;
}
const job = jobs.find((entry) => entry.id === result.jobId);
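A self-contained version of this guard might look like the sketch below. It is slightly stricter than the snippet above, which is an assumption on my part: it also refuses empty or non-string ids. The `Job` and `Outcome` shapes, the `applyOutcomeGuarded` name, and the injectable `logError` callback are hypothetical.

```typescript
type Job = {
  id?: string;
  name: string;
  state: { consecutiveErrors: number; lastError?: string };
};
type Outcome = { jobId?: unknown; error?: string };

// Returns the updated job, or undefined when the outcome is refused or
// no job matches.
function applyOutcomeGuarded(
  jobs: Job[],
  result: Outcome,
  logError: (msg: string) => void = console.error,
): Job | undefined {
  const { jobId } = result;
  if (typeof jobId !== "string" || jobId.trim() === "") {
    logError("cron: refusing to apply outcome with undefined jobId");
    return undefined;
  }
  const job = jobs.find((entry) => entry.id === jobId);
  if (job && result.error !== undefined) {
    job.state.consecutiveErrors += 1;
    job.state.lastError = result.error;
  }
  return job;
}

const jobs: Job[] = [
  { name: "auto-memory-dream", state: { consecutiveErrors: 0 } }, // no id
  { id: "b2", name: "health-snapshot", state: { consecutiveErrors: 0 } },
];
const messages: string[] = [];

const refused = applyOutcomeGuarded(jobs, { error: "OOM" }, (m) => messages.push(m));
const applied = applyOutcomeGuarded(jobs, { jobId: "b2", error: "OOM" });

console.log(refused, applied?.name);                     // undefined health-snapshot
console.log(jobs.map((j) => j.state.consecutiveErrors)); // [ 0, 1 ]
```

With the guard in place, an outcome that carries no id is dropped and logged instead of being attributed to whichever id-less job happens to come first.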

User-visible workaround (no patch required)

Add explicit UUIDs to each job in ~/.openclaw/cron/jobs.json:

python3 -c 'import uuid; [print(uuid.uuid4()) for _ in range(N)]'

Replace N with the number of jobs, then paste one UUID per job as each entry's id field. Finally, reset the polluted state of the first job so accumulated errors don't influence the next run:

jq '(.jobs[] | select(.name == "<first_job_name>")).state = {}' \
  ~/.openclaw/cron/jobs.json > /tmp/jobs.json.new && \
  mv /tmp/jobs.json.new ~/.openclaw/cron/jobs.json
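The two manual steps can also be combined in a small script. The following is a sketch under stated assumptions: the top-level `jobs` array matches what the `jq` command above expects, while the `repairStore` helper and the `StoredJob` shape are hypothetical. Reading and writing `~/.openclaw/cron/jobs.json` is left out so the example runs on an in-memory copy.

```typescript
import { randomUUID } from "node:crypto";

type StoredJob = { id?: string; name?: string; state?: Record<string, unknown> };
type Store = { jobs: StoredJob[] };

// Adds a UUID to every job lacking a usable id and clears the state of the
// job that absorbed everyone else's outcomes. Returns how many ids were added.
function repairStore(store: Store, pollutedJobName: string): number {
  let added = 0;
  for (const job of store.jobs) {
    if (typeof job.id !== "string" || job.id.trim() === "") {
      job.id = randomUUID();
      added += 1;
    }
    if (job.name === pollutedJobName) job.state = {};
  }
  return added;
}

const store = JSON.parse(`{"jobs":[
  {"name":"auto-memory-dream","state":{"consecutiveErrors":46}},
  {"name":"health-snapshot"},
  {"id":"keep-me","name":"already-fine"}
]}`) as Store;

const added = repairStore(store, "auto-memory-dream");

console.log(added);                                     // 2
console.log(store.jobs.map((j) => j.id === "keep-me")); // [ false, false, true ]
console.log(JSON.stringify(store.jobs[0]?.state));      // {}
```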

Why this matters

The find-collision silently misattributes state across jobs. Operators investigating "why is job X failing 46 times" can spend hours chasing the wrong job — in our case, auto-memory-dream (disabled) appeared to be the failing one when in fact health-snapshot was OOM-ing on Ollama and its errors were being recorded in auto-memory-dream's slot. The [cron:undefined] log line is the only visible tell.

Affected files (v2026.4.20)

  • dist/jobs-CXWpCyUn.js: createJob already has UUID generation; normalizeCronJobIdentityFields is the migration entrypoint that should fall through to UUID generation for the missing-id case
  • dist/store-C0bwjphP.js: loadCronStore (where the backfill should happen)
  • dist/server.impl-DYBxSjUs.js: applyOutcomeToStoredJob (line 4697), runDueJob (line 4798), collectRunnableJobs

extent analysis

TL;DR

To fix the issue, backfill missing id fields in ~/.openclaw/cron/jobs.json with unique UUIDs and add a defensive guard against undefined jobId in the code.

Guidance

  • Add explicit UUIDs to each job in ~/.openclaw/cron/jobs.json using a tool like python3 -c 'import uuid; [print(uuid.uuid4()) for _ in range(N)]'.
  • Reset the polluted state of the first job by setting its state to an empty object using jq.
  • Consider applying the proposed fix to loadCronStore and applyOutcomeToStoredJob to prevent similar issues in the future.
  • Verify the fix by confirming that no new [cron:undefined] lines appear in the journal and that each job's state is attributed to the correct job.
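A quick consistency check on the parsed job list can complement the log inspection. The helper below is a hypothetical sketch, not part of openclaw: it reports jobs with a missing or blank id, and, as an added assumption, ids that appear more than once, since duplicates would collide in `find` in the same way.

```typescript
type JobLike = { id?: unknown; name?: string };

function findIdProblems(jobs: JobLike[]): { missing: string[]; duplicated: string[] } {
  const missing: string[] = [];
  const seen = new Map<string, number>();

  jobs.forEach((job, index) => {
    const label = job.name ?? `#${index}`; // fall back to the array position
    if (typeof job.id !== "string" || job.id.trim() === "") {
      missing.push(label);
    } else {
      seen.set(job.id, (seen.get(job.id) ?? 0) + 1);
    }
  });

  const duplicated = Array.from(seen)
    .filter(([, count]) => count > 1)
    .map(([id]) => id);
  return { missing, duplicated };
}

const report = findIdProblems([
  { id: "a1", name: "one" },
  { name: "two" },
  { id: "a1", name: "three" },
  { id: " " },
]);

console.log(report); // { missing: [ 'two', '#3' ], duplicated: [ 'a1' ] }
```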

Example

To generate the UUIDs for the missing id fields, you can use the following Python command, then paste one value into each job entry:

python3 -c 'import uuid; [print(uuid.uuid4()) for _ in range(N)]'

Replace N with the number of jobs in your jobs.json file.

Notes

The proposed fix modifies loadCronStore and applyOutcomeToStoredJob to handle missing id fields and undefined jobId values. The issue locates these functions in the built bundles (dist/jobs-CXWpCyUn.js, dist/store-C0bwjphP.js, and dist/server.impl-DYBxSjUs.js); the two PRs make the change in the sources under src/cron/ (normalize-job-identity.ts, service/store.ts, and service/timer.ts).

Recommendation

Apply the proposed fix to prevent similar issues in the future. The fix involves backfilling missing id fields and adding a defensive guard against undefined jobId values, which should prevent the silent misattribution of state across jobs.
