openclaw - ✅(Solved) Fix Cron engine crashes on startup when job has no state field [3 pull requests, 1 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#65916Fetched 2026-04-14 05:39:37
View on GitHub
Comments
1
Participants
2
Timeline
6
Reactions
0
Timeline (top)
cross-referenced ×3referenced ×2commented ×1

cron.start() crashes with TypeError: Cannot read properties of undefined (reading 'runningAtMs') when any job in jobs.json has a null or missing state field.

This prevents all cron jobs from running — not just the one with the missing state.

Root Cause

In cron-cli-BBHRXUsj.js:184, the formatStatus function accesses job.state.runningAtMs without null-checking job.state:

const formatStatus = (job) => {
    if (!job.enabled) return "disabled";
    if (job.state.runningAtMs) return "running";  // crashes here
    return job.state.lastStatus ?? "idle";
};

The cron scheduler itself handles this case correctly (line 5786-5787 of gateway-cli):

if (!job.state) {
    job.state = {};

But formatStatus is called during cron.start() before the scheduler gets a chance to initialize the state, so the entire cron system fails to start.

Fix Action

Workaround

Ensure all jobs have at least "state": {} before the gateway starts.

PR fix notes

PR #65979: fix(cron): normalize missing persisted job state

Description (problem / solution / changelog)

Closes #65916

Summary

  • Problem: persisted cron jobs with missing state or state: null crash cron.start() while reading job.state.runningAtMs.
  • Why it matters: one malformed restored/programmatic job can prevent the entire cron scheduler from starting.
  • What changed: cron job hydration now normalizes invalid runtime state to {} before service startup logic runs, and CLI list rendering uses the same state guard.
  • What did NOT change (scope boundary): this does not change cron schedule semantics, job payloads, delivery behavior, or doctor migration policy.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #65916
  • Related #
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: loadCronStore() accepts persisted JSON and ensureLoaded() hydrates entries as CronJob, but it did not restore the required runtime state object when the on-disk job omitted it or set it to null.
  • Missing detection / guardrail: startup regression coverage only covered legacy jobs with state: {}, not malformed/missing runtime state.
  • Contributing context (if known): doctor already repairs invalid cron state on disk, but runtime startup still needed to be resilient before users run doctor.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file:
    • src/cron/service.issue-regressions.test.ts
    • src/cli/cron-cli/shared.test.ts
  • Scenario the test should lock in: persisted jobs with missing state and state: null should not crash cron startup or cron list rendering.
  • Why this is the smallest reliable guardrail: the service regression test exercises the persisted store hydration path that previously crashed cron.start(), and the CLI test covers the adjacent direct rendering path.
  • Existing test that already covers this (if any): none.
  • If no new test is added, why not: N/A.

User-visible / Behavior Changes

Cron startup now tolerates restored or programmatically-created jobs that omit runtime state or contain state: null; the scheduler starts and recomputes runtime state instead of failing the whole cron service.

Diagram (if applicable)

Before:
[jobs.json has job without state] -> [cron.start()] -> [job.state.runningAtMs crash]

After:
[jobs.json has job without state] -> [hydrate job state as {}] -> [cron.start()] -> [scheduler runs]

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: local Node / pnpm
  • Model/provider: N/A
  • Integration/channel (if any): cron scheduler
  • Relevant config (redacted): temporary cron store with one job missing state and one job using state: null

Steps

  1. Write a jobs.json with a valid cron job that omits state.
  2. Write a second job with state: null.
  3. Start CronService against that store.

Expected

  • cron.start() succeeds.
  • Runtime state is initialized and nextRunAtMs is recomputed.
  • CLI list rendering does not throw on missing/null state.

Actual

  • Before this fix: startup threw TypeError: Cannot read properties of undefined/null (reading 'runningAtMs').
  • After this fix: startup succeeds for both missing and null state cases.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios:
    • pnpm test src/cron/service.issue-regressions.test.ts src/cli/cron-cli/shared.test.ts
    • A direct node --import tsx --eval ... repro confirmed state: null and missing state both start successfully.
    • ./node_modules/.bin/oxfmt --check src/cron/job-state.ts src/cron/service/store.ts src/cron/service/jobs.ts src/cron/service/timer.ts src/cli/cron-cli/shared.ts src/cron/service.issue-regressions.test.ts src/cli/cron-cli/shared.test.ts
    • git diff --check
  • Edge cases checked:
    • Missing state
    • state: null
    • CLI list rendering with missing/null state
  • What you did not verify:
    • Full pnpm format:check, because it currently fails on unrelated pre-existing formatting issues in untouched files.
    • Full pnpm tsgo, because it currently fails on unrelated Discord monitor firstHeartbeatTimeout type errors in untouched files.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: normalizing invalid runtime state could hide malformed persisted cron entries.
    • Mitigation: this only restores the runtime-only state object that cron already treats as mutable/recomputable; existing doctor repair remains the path for canonical on-disk cleanup.

Changed files

  • src/cli/cron-cli/shared.test.ts (modified, +18/-0)
  • src/cli/cron-cli/shared.ts (modified, +7/-4)
  • src/cron/job-state.ts (added, +16/-0)
  • src/cron/service.issue-regressions.test.ts (modified, +44/-0)
  • src/cron/service/jobs.ts (modified, +16/-16)
  • src/cron/service/store.ts (modified, +2/-0)
  • src/cron/service/timer.ts (modified, +11/-14)

PR #65989: fix: guard against null job.state in cron list and startup paths

Description (problem / solution / changelog)

Summary

Fixes #65916 — Cron engine crashes on startup when job has no state field.

Root cause: Three code paths access job.state.xxx without null-checking job.state:

  1. formatStatus() and printCronList() in cron-cli — crash on openclaw cron list
  2. Startup stale-marker cleanup loop in ops.ts — the actual crash site when cron.start() fails

The scheduler itself (normalizeJobTickState) already guards against null state, but it is skipped when ensureLoaded is called with skipRecompute: true — which is exactly what the startup path does.

Changes

  • src/cli/cron-cli/shared.ts — Optional chaining on all job.state accesses in formatStatus and printCronList
  • src/cron/service/ops.ts — Added job.state ??= {} guard before the stale-marker loop (the startup crash site)
  • src/cron/service/store.ts — Added job.state ??= {} in ensureLoaded hydration loop, so all paths are safe at the source

Tests

  • src/cli/cron-cli/shared.test.ts — 3 new tests: null state, undefined state, idle status fallback
  • src/cron/service.null-state-startup.test.ts (new) — 3 regression tests: startup with state: null, startup with missing state, cron list with null state

All 15 new tests pass, full build succeeds.

AI Disclosure

  • AI-assisted (Claude Code via OpenClaw)
  • Fully tested (build + test suite pass)
  • Root cause verified against source code before implementation
  • Understands what the code does — three distinct crash paths, all defended at the narrowest safe point

Changed files

  • src/cli/cron-cli/shared.test.ts (modified, +23/-0)
  • src/cli/cron-cli/shared.ts (modified, +4/-4)
  • src/cron/service.null-state-startup.test.ts (added, +110/-0)
  • src/cron/service/ops.ts (modified, +6/-3)
  • src/cron/service/store.ts (modified, +9/-38)

PR #66083: fix(cron): stop unresolved next-run refire loops

Description (problem / solution / changelog)

Summary

  • fix the cron scheduler path where computeJobNextRunAtMs returning undefined was treated as a short retry instead of an unresolved schedule
  • keep the #17821 lower-bound guard for same-second refires, but stop inventing synthetic retries for unschedulable cron runs
  • keep a periodic maintenance wake armed for enabled jobs with no nextRunAtMs so the scheduler does not go fully idle after clearing an unresolved schedule
  • add focused regression coverage for both the completion path and the cron error-backoff path

Root cause

src/cron/service/timer.ts used MIN_REFIRE_GAP_MS and backoff delays for two different meanings:

  1. lower bounds when a valid next run exists
  2. fallback schedule values when cron next-run computation returned undefined

That second meaning was wrong. An unschedulable cron run could be re-armed a few seconds later and refire forever.

Scope

In scope:

  • #66019

Explicitly out of scope:

  • #66016, #65916, #65193: missing job.state startup-crash family
  • #65981: isolated cron-agent execution / cron-tool mismatch
  • #65987: task timestamp audit noise

Validation

  • pnpm test -- src/cron/service/timer.regression.test.ts src/cron/service.armtimer-tight-loop.test.ts

Notes

  • pnpm check is currently failing on unrelated latest-main TypeScript errors outside this slice (Discord, Feishu, Nextcloud Talk, WhatsApp, MCP, wizard setup, and one existing cron isolated-agent test type issue). I did not broaden this PR into those unrelated failures.

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/cron/service.armtimer-tight-loop.test.ts (modified, +37/-0)
  • src/cron/service.issue-66019-unresolved-next-run.test.ts (added, +114/-0)
  • src/cron/service/timer.regression.test.ts (modified, +86/-3)
  • src/cron/service/timer.ts (modified, +47/-3)

Code Example

const formatStatus = (job) => {
    if (!job.enabled) return "disabled";
    if (job.state.runningAtMs) return "running";  // crashes here
    return job.state.lastStatus ?? "idle";
};

---

if (!job.state) {
    job.state = {};

---

const formatStatus = (job) => {
    if (!job.enabled) return "disabled";
    if (job.state?.runningAtMs) return "running";
    return job.state?.lastStatus ?? "idle";
};
RAW_BUFFERClick to expand / collapse

Description

cron.start() crashes with TypeError: Cannot read properties of undefined (reading 'runningAtMs') when any job in jobs.json has a null or missing state field.

This prevents all cron jobs from running — not just the one with the missing state.

Version

OpenClaw 2026.4.1 (da64a97)

Steps to reproduce

  1. Add a job to ~/.openclaw/cron/jobs.json without a state field (or with "state": null)
  2. Restart the gateway
  3. Observe in logs: [cron] failed to start: TypeError: Cannot read properties of undefined (reading 'runningAtMs')

This can happen when jobs are added programmatically or restored from a backup/reference file that doesn't include runtime state.

Root cause

In cron-cli-BBHRXUsj.js:184, the formatStatus function accesses job.state.runningAtMs without null-checking job.state:

const formatStatus = (job) => {
    if (!job.enabled) return "disabled";
    if (job.state.runningAtMs) return "running";  // crashes here
    return job.state.lastStatus ?? "idle";
};

The cron scheduler itself handles this case correctly (line 5786-5787 of gateway-cli):

if (!job.state) {
    job.state = {};

But formatStatus is called during cron.start() before the scheduler gets a chance to initialize the state, so the entire cron system fails to start.

Suggested fix

const formatStatus = (job) => {
    if (!job.enabled) return "disabled";
    if (job.state?.runningAtMs) return "running";
    return job.state?.lastStatus ?? "idle";
};

Workaround

Ensure all jobs have at least "state": {} before the gateway starts.

Impact

When this bug triggers, all cron jobs stop running (not just the affected one), since the entire cron service fails to start. No scheduled tasks, no deliveries, no monitoring — silent failure with only a single log line.

extent analysis

TL;DR

The most likely fix is to update the formatStatus function to include null-checking for job.state to prevent the TypeError.

Guidance

  • Verify the issue by checking the jobs.json file for any jobs with missing or null state fields and ensure that all jobs have at least an empty state object ("state": {}).
  • Apply the suggested fix by updating the formatStatus function to include optional chaining (job.state?.runningAtMs) to prevent the TypeError.
  • As a temporary workaround, ensure all jobs have a valid state field before starting the gateway to prevent the cron service from failing to start.
  • Review the gateway-cli code to understand how the cron scheduler handles jobs with missing state fields and ensure consistency in handling such cases.

Example

const formatStatus = (job) => {
    if (!job.enabled) return "disabled";
    if (job.state?.runningAtMs) return "running";
    return job.state?.lastStatus ?? "idle";
};

Notes

This fix assumes that the formatStatus function is the only place where job.state is accessed without null-checking. Additional checks may be necessary to ensure that all parts of the code handle jobs with missing state fields correctly.

Recommendation

Apply the workaround by ensuring all jobs have a valid state field before starting the gateway, as this is a simpler and more immediate solution that can prevent the cron service from failing to start.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING