Today's log (/tmp/openclaw/openclaw-2026-04-13.log)

Time | Level | Message
-- | -- | --
15:03:45 | WARN | Config overwrite
15:05:41 | WARN | Gateway bound to a non-loopback address
15:05:41 | WARN | dangerouslyDisableDeviceAuth=true security warning
15:05:42 | INFO | cron: started (jobs: 0, nextWakeAtMs: null)
15:06:06 | WARN | pricing bootstrap failed: TimeoutError

Current cron state: 0 jobs, so the bug in start() was not triggered.

The bug is still present in 4.11, at line 6582 of /app/dist/server.impl-CsRRyd9F.js:

for (const job of jobs) if (typeof job.state.runningAtMs === "number") {
//                                 ^^^^^^^^^^^^^^^^^^^^^ no guard

Compare the guarded site (lines 6211-6212):

if (!job.state) job.state = {}; // ← guard present
if (typeof job.state.runningAtMs === "number") return false;

Trigger condition: the store contains jobs, and at least one of them lacks the state field (created by an older version, or damaged data).

openclaw - ✅(Solved) Fix [Bug]: Cron bug TypeError: Cannot read properties of undefined (reading 'runningAtMs') [5 pull requests, 1 participants]

openclaw2026-04-13 15:29:19

openclaw/openclaw#66016•Fetched 2026-04-14 05:39:16

Author

ziyue67

Participants

ziyue67

Timeline (top)

cross-referenced ×6, referenced ×2, labeled ×1, unsubscribed ×1

Cron scheduler bug, located:

TypeError: Cannot read properties of undefined (reading 'runningAtMs')

The error occurs while cron.start() → runMissedJobs() → planStartupCatchup() is executing: the in-memory job object has been replaced by a version that lacks state. This is a bug in the cron subsystem of OpenClaw 2026.4.9.

Why the CLI cannot be used: openclaw cron add/list connects to the gateway over WebSocket and calls the cron methods via RPC. After every gateway restart, cron crashes and restarts, so the WebSocket handshake stalls and the CLI stops responding.

After upgrading to the Docker 4.11 image and re-checking, the bug is confirmed to still exist in OpenClaw 4.11.

Location: /app/dist/server.impl-CsRRyd9F.js, line 6582, inside the start() function:

for (const job of jobs) if (typeof job.state.runningAtMs === "number") {
//                                 ^^^^^^^^^^^^^^^^^^^^^ no guard

Problem: job.state.runningAtMs is accessed directly, without first checking that job.state exists. By comparison, isRunnableJob and isJobDue in the same file both have defensive checks:

// isRunnableJob, lines 6211-6212
if (!job.state) job.state = {};
if (typeof job.state.runningAtMs === "number") return false;

// isJobDue, lines 3809-3810
if (!job.state) job.state = {};
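
To make the failure mode concrete, here is a minimal sketch that reproduces the same TypeError on a job record without state and contrasts it with the guarded pattern quoted above. The CronJob and CronJobState shapes and both function names are hypothetical simplifications for illustration, not the actual OpenClaw types.

```typescript
// Hypothetical, simplified job shape; the real OpenClaw type has more fields.
interface CronJobState {
  runningAtMs?: number;
}

interface CronJob {
  id: string;
  state?: CronJobState; // legacy or damaged records may omit this entirely
}

// Mirrors the unguarded read reported at the crash site.
function isRunningUnguarded(job: CronJob): boolean {
  // The non-null assertion satisfies the compiler but does nothing at runtime.
  return typeof job.state!.runningAtMs === "number";
}

// Mirrors the defensive pattern quoted from isRunnableJob / isJobDue.
function isRunningGuarded(job: CronJob): boolean {
  if (!job.state) job.state = {};
  return typeof job.state.runningAtMs === "number";
}

const legacyJob: CronJob = { id: "legacy-1" }; // no state field

// The unguarded read throws because job.state is undefined.
let unguardedError: unknown;
try {
  isRunningUnguarded(legacyJob);
} catch (err) {
  unguardedError = err;
}

// The guarded read initializes state and reports "not running".
const guardedResult = isRunningGuarded(legacyJob);
```

The guarded version mutates the job in place, so every later read of job.state on the same object is safe as well.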

Error Message

TypeError: Cannot read properties of undefined (reading 'runningAtMs')

Root Cause

start() reads job.state.runningAtMs without first checking that job.state exists, whereas isRunnableJob and isJobDue in the same file guard the access:

// isRunnableJob, lines 6211-6212
if (!job.state) job.state = {};
if (typeof job.state.runningAtMs === "number") return false;

// isJobDue, lines 3809-3810
if (!job.state) job.state = {};

Fix Action

Fixed

Fixed by PR: fix(cron): guard against missing job.state in start() (#66016) (https://github.com/openclaw/openclaw/pull/66054)
Fixed by PR: fix(cron): guard legacy jobs without state on startup (https://github.com/openclaw/openclaw/pull/66063)
Fixed by PR: fix(cron): stop unresolved next-run refire loops (https://github.com/openclaw/openclaw/pull/66083)
Fixed by PR: fix: guard against null job.state in cron list and startup paths (https://github.com/openclaw/openclaw/pull/65989)
Fixed by PR: fix(cli): prevent process hang after gateway RPC commands (#66227) (https://github.com/openclaw/openclaw/pull/66276)

PR fix notes

PR #66054: fix(cron): guard against missing job.state in start() (#66016)

Repository: openclaw/openclaw
Author: WuKongAI-CMU
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/66054

Description (problem / solution / changelog)

Summary

The cron `start()` routine in `src/cron/service/ops.ts` scans `state.store.jobs` directly and reads `job.state.runningAtMs` without first checking whether `job.state` is defined. Jobs loaded from older store formats can lack the `state` field entirely, and the scheduler crashes on startup with:

```
TypeError: Cannot read properties of undefined (reading 'runningAtMs')
```

This is a beta-blocker regression — gateway crash-loops, WebSocket RPC handshake stalls, and `openclaw cron add/list` CLI commands hang indefinitely.

Credit to #66016 for a precise root-cause trace including the exact line and a comparison with the correctly-guarded `isRunnableJob`/`isJobDue` paths.

Scope of the fix

The downstream catch-up path goes through `isRunnableJob`, which already does `if (!job.state) job.state = {}` for exactly this reason, so `collectRunnableJobs` is safe. Only the unguarded startup scan in `start()` crashes.

This PR mirrors the existing `isRunnableJob`/`isJobDue` defensive init at the crash site — idiomatic to the surrounding code, minimal surface area.

Closes #66016.

Changes

`src/cron/service/ops.ts` — add `if (!job.state) { job.state = {}; }` guard before the `runningAtMs` read in the `start()` startup loop, with a comment explaining why and pointing at the matching pattern elsewhere

Test plan

4-line fix, all existing tests continue to pass
Mirrors the documented pattern in `isRunnableJob` / `isJobDue` — no new contract
Beta-blocker fix: addresses a hard crash on gateway startup when the cron store contains legacy jobs
Ideally this also needs a regression test that loads a job without `state` and calls `start()` — I can add it in a follow-up if maintainers want
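
The guard described under Changes, and the regression scenario suggested in the test plan, could look roughly like the sketch below. The function name clearStaleRunningMarkers, the job shape, and the decision to delete the marker inside the branch are assumptions for illustration; the PR text only states that a guard is added before the runningAtMs read in the start() loop.

```typescript
// Hypothetical, simplified shapes; not the actual types in src/cron/service/ops.ts.
interface CronJobState {
  runningAtMs?: number;
}

interface CronJob {
  id: string;
  state?: CronJobState;
}

// Startup scan: normalize missing state first, then handle leftover running markers.
function clearStaleRunningMarkers(jobs: CronJob[]): number {
  let cleared = 0;
  for (const job of jobs) {
    // Jobs loaded from older store formats can lack `state` entirely.
    if (!job.state) {
      job.state = {};
    }
    if (typeof job.state.runningAtMs === "number") {
      // Assumption: a marker that survived a restart is stale and can be dropped.
      delete job.state.runningAtMs;
      cleared += 1;
    }
  }
  return cleared;
}

// Regression scenario: a store that mixes legacy, interrupted, and idle jobs.
const storeJobs: CronJob[] = [
  { id: "legacy" },
  { id: "interrupted", state: { runningAtMs: 1_700_000_000_000 } },
  { id: "idle", state: {} },
];
const clearedCount = clearStaleRunningMarkers(storeJobs);
```

A test built on this scenario would fail with the original TypeError if the guard were removed, which is the property the follow-up regression test is meant to lock in.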

🤖 Generated with Claude Code

Changed files

src/cron/service/ops.ts (modified, +9/-0)

PR #66063: fix(cron): guard legacy jobs without state on startup

Repository: openclaw/openclaw
Author: Rohan5commit
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/66063

Description (problem / solution / changelog)

Summary

Problem: cron startup assumed every persisted job already had a state object and crashed on legacy or damaged entries missing that field
Why it matters: gateway startup could fail before cron recovery logic ran, which blocks CLI and scheduled jobs for affected installs
What changed: normalize missing job.state objects during cron startup before stale running-marker cleanup, and add a restart regression test for stores missing state
What did NOT change (scope boundary): catch-up behavior still depends on persisted scheduling metadata; this PR only hardens startup against malformed legacy job records

Linked Issue/PR

Closes #66016
Related #60495
This PR fixes a bug or regression

Root Cause (if applicable)

Root cause: cron startup dereferenced job.state.runningAtMs before normalizing legacy store entries that omitted state entirely
Missing detection / guardrail: startup had no regression test covering persisted jobs without a state object even though later cron paths already normalize it
Contributing context (if known): older or externally modified cron stores can contain jobs without state metadata

Regression Test Plan (if applicable)

Coverage level that should have caught this:
- Unit test
- Seam / integration test
- End-to-end test
- Existing coverage already sufficient
Target test or file: src/cron/service.restart-catchup.test.ts
Scenario the test should lock in: cron.start() should not crash when a persisted recurring job omits state and should recompute a safe nextRunAtMs
Why this is the smallest reliable guardrail: it exercises the exact startup path that previously dereferenced job.state before any later normalization ran
Existing test that already covers this (if any): restart catch-up tests cover stale running markers but not missing state
If no new test is added, why not:

User-visible / Behavior Changes

Gateway startup no longer crashes when cron store entries are missing state metadata.

Diagram (if applicable)

Before:
[cron start] -> [job missing state] -> [job.state.runningAtMs access] -> [TypeError / startup failure]

After:
[cron start] -> [missing state normalized to {}] -> [stale-marker cleanup + schedule recompute] -> [startup continues]
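
The "After" flow can be sketched as a small startup routine that loads persisted records, normalizes missing state, clears the stale marker, and recomputes a next run. Everything here is an assumption for illustration: the PersistedJob shape, the everyMs interval field, the startCron name, and the JSON store format are not taken from the OpenClaw source.

```typescript
// Hypothetical persisted job shape with a simple fixed-interval schedule.
interface JobState {
  runningAtMs?: number;
  nextRunAtMs?: number;
}

interface PersistedJob {
  id: string;
  enabled: boolean;
  everyMs: number;
  state?: JobState;
}

// Startup: normalize state, clean up stale markers, then recompute the schedule.
function startCron(rawStore: string, nowMs: number): PersistedJob[] {
  const jobs = JSON.parse(rawStore) as PersistedJob[];
  for (const job of jobs) {
    // Normalize before any read so legacy records cannot crash startup.
    const state = (job.state ??= {});
    // A running marker from the previous process is stale after a restart.
    delete state.runningAtMs;
    // Recompute a next run only when none was persisted.
    if (job.enabled && typeof state.nextRunAtMs !== "number") {
      state.nextRunAtMs = nowMs + job.everyMs;
    }
  }
  return jobs;
}

// A persisted recurring job that omits `state`, as in the reported scenario.
const rawStore = JSON.stringify([{ id: "legacy", enabled: true, everyMs: 60_000 }]);
const startedJobs = startCron(rawStore, 1_000);
```

Normalizing at the top of the loop keeps every later step free of null checks, which matches the order shown in the diagram.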

Security Impact (required)

New permissions/capabilities? (No)
Secrets/tokens handling changed? (No)
New/changed network calls? (No)
Command/tool execution surface changed? (No)
Data access scope changed? (No)
If any Yes, explain risk + mitigation:

Repro + Verification

Environment

OS: macOS
Runtime/container: local pnpm/vitest
Model/provider: N/A
Integration/channel (if any): cron service
Relevant config (redacted): cronEnabled=true, persisted job missing state

Steps

Persist a cron job record without a state object.
Start the cron service.
Observe startup behavior.

Expected

Cron startup completes and the job is normalized instead of crashing.

Actual

Before this fix, startup could throw while reading job.state.runningAtMs.

Evidence

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

Human Verification (required)

Verified scenarios: targeted vitest run for src/cron/service.restart-catchup.test.ts and src/cron/service/ops.test.ts
Edge cases checked: persisted recurring job missing state metadata during startup
What you did not verify: full repo-wide pnpm check/test lanes and live Docker reproduction

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

Backward compatible? (Yes)
Config/env changes? (No)
Migration needed? (No)
If yes, exact upgrade steps:

Risks and Mitigations

Risk: silently normalizing malformed state could mask deeper store corruption
- Mitigation: the change only creates the same empty state shape that other cron paths already expect, and the regression test locks that startup behavior in

AI-assisted: Yes (Codex). Testing: targeted vitest coverage listed above. I attempted codex review --base origin/main, but the local Codex CLI is currently rate-limited.

Changed files

src/cron/service.restart-catchup.test.ts (modified, +36/-0)
src/cron/service/ops.ts (modified, +3/-0)

PR #66083: fix(cron): stop unresolved next-run refire loops

Repository: openclaw/openclaw
Author: mbelinky
State: closed | merged: True
Link: https://github.com/openclaw/openclaw/pull/66083

Description (problem / solution / changelog)

Summary

fix the cron scheduler path where computeJobNextRunAtMs returning undefined was treated as a short retry instead of an unresolved schedule
keep the #17821 lower-bound guard for same-second refires, but stop inventing synthetic retries for unschedulable cron runs
keep a periodic maintenance wake armed for enabled jobs with no nextRunAtMs so the scheduler does not go fully idle after clearing an unresolved schedule
add focused regression coverage for both the completion path and the cron error-backoff path

Root cause

src/cron/service/timer.ts used MIN_REFIRE_GAP_MS and backoff delays for two different meanings:

lower bounds when a valid next run exists
fallback schedule values when cron next-run computation returned undefined

That second meaning was wrong. An unschedulable cron run could be re-armed a few seconds later and refire forever.
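
The distinction between the two meanings can be sketched as follows. The function names, the constant values, and the job shape are assumptions for illustration; only the names MIN_REFIRE_GAP_MS and nextRunAtMs come from the PR text, and the real logic in src/cron/service/timer.ts may differ.

```typescript
// Values are illustrative assumptions, not the constants used by OpenClaw.
const MIN_REFIRE_GAP_MS = 2_000;
const MAINTENANCE_WAKE_MS = 60_000;

// Apply the refire gap only as a lower bound on a valid schedule.
function resolveNextRunAtMs(
  computedNextRunAtMs: number | undefined,
  endedAtMs: number,
): number | undefined {
  if (computedNextRunAtMs === undefined) {
    // Unresolved schedule: do not invent a synthetic short retry.
    return undefined;
  }
  return Math.max(computedNextRunAtMs, endedAtMs + MIN_REFIRE_GAP_MS);
}

interface ScheduledJob {
  enabled: boolean;
  nextRunAtMs?: number;
}

// Pick the next timer wake; keep a maintenance wake armed for unresolved jobs.
function nextWakeAtMs(jobs: ScheduledJob[], nowMs: number): number | undefined {
  let earliest: number | undefined;
  let hasUnresolved = false;
  for (const job of jobs) {
    if (!job.enabled) continue;
    if (typeof job.nextRunAtMs === "number") {
      earliest = earliest === undefined ? job.nextRunAtMs : Math.min(earliest, job.nextRunAtMs);
    } else {
      hasUnresolved = true;
    }
  }
  if (hasUnresolved) {
    const maintenanceAtMs = nowMs + MAINTENANCE_WAKE_MS;
    earliest = earliest === undefined ? maintenanceAtMs : Math.min(earliest, maintenanceAtMs);
  }
  return earliest;
}
```

With this split, an unschedulable job no longer refires every few seconds, while the scheduler still wakes periodically instead of going fully idle.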

Scope

In scope:

#66019

Explicitly out of scope:

#66016, #65916, #65193: missing job.state startup-crash family
#65981: isolated cron-agent execution / cron-tool mismatch
#65987: task timestamp audit noise

Validation

pnpm test -- src/cron/service/timer.regression.test.ts src/cron/service.armtimer-tight-loop.test.ts

Notes

pnpm check is currently failing on unrelated latest-main TypeScript errors outside this slice (Discord, Feishu, Nextcloud Talk, WhatsApp, MCP, wizard setup, and one existing cron isolated-agent test type issue). I did not broaden this PR into those unrelated failures.

Changed files

CHANGELOG.md (modified, +1/-0)
src/cron/service.armtimer-tight-loop.test.ts (modified, +37/-0)
src/cron/service.issue-66019-unresolved-next-run.test.ts (added, +114/-0)
src/cron/service/timer.regression.test.ts (modified, +86/-3)
src/cron/service/timer.ts (modified, +47/-3)

PR #65989: fix: guard against null job.state in cron list and startup paths

Repository: openclaw/openclaw
Author: lml2468
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/65989

Description (problem / solution / changelog)

Summary

Fixes #65916 — Cron engine crashes on startup when job has no state field.

Root cause: Three code paths access job.state.xxx without null-checking job.state:

formatStatus() and printCronList() in cron-cli — crash on openclaw cron list
Startup stale-marker cleanup loop in ops.ts — the actual crash site when cron.start() fails

The scheduler itself (normalizeJobTickState) already guards against null state, but it is skipped when ensureLoaded is called with skipRecompute: true — which is exactly what the startup path does.

Changes

src/cli/cron-cli/shared.ts — Optional chaining on all job.state accesses in formatStatus and printCronList
src/cron/service/ops.ts — Added job.state ??= {} guard before the stale-marker loop (the startup crash site)
src/cron/service/store.ts — Added job.state ??= {} in ensureLoaded hydration loop, so all paths are safe at the source
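
The two styles of defense listed above, optional chaining at the read site and `??=` normalization at load time, can be sketched like this. The StoredJob shape, the lastStatus field, and the bodies of formatStatus and hydrateJobs are hypothetical; only the function name formatStatus and the "idle" fallback are mentioned in the PR text.

```typescript
// Hypothetical shapes; `state` may be missing or explicitly null in a damaged store.
interface JobState {
  runningAtMs?: number;
  lastStatus?: string;
}

interface StoredJob {
  id: string;
  enabled: boolean;
  state?: JobState | null;
}

// Read-site defense: optional chaining tolerates null and undefined state.
function formatStatus(job: StoredJob): string {
  if (!job.enabled) return "disabled";
  if (typeof job.state?.runningAtMs === "number") return "running";
  return job.state?.lastStatus ?? "idle";
}

// Load-time defense: normalize once so later code can assume state exists.
function hydrateJobs(jobs: StoredJob[]): StoredJob[] {
  for (const job of jobs) {
    job.state ??= {};
  }
  return jobs;
}

const hydratedJobs = hydrateJobs([
  { id: "null-state", enabled: true, state: null },
  { id: "missing-state", enabled: true },
]);
```

Optional chaining leaves the record untouched, which suits a read-only CLI listing, while normalizing during hydration fixes the data at the source for every later code path.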

Tests

src/cli/cron-cli/shared.test.ts — 3 new tests: null state, undefined state, idle status fallback
src/cron/service.null-state-startup.test.ts (new) — 3 regression tests: startup with state: null, startup with missing state, cron list with null state

All 15 new tests pass, full build succeeds.

AI Disclosure

AI-assisted (Claude Code via OpenClaw)
Fully tested (build + test suite pass)
Root cause verified against source code before implementation
Understands what the code does — three distinct crash paths, all defended at the narrowest safe point

Changed files

src/cli/cron-cli/shared.test.ts (modified, +23/-0)
src/cli/cron-cli/shared.ts (modified, +4/-4)
src/cron/service.null-state-startup.test.ts (added, +110/-0)
src/cron/service/ops.ts (modified, +6/-3)
src/cron/service/store.ts (modified, +9/-38)

PR #66276: fix(cli): prevent process hang after gateway RPC commands (#66227)

Repository: openclaw/openclaw
Author: lml2468
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/66276

Description (problem / solution / changelog)

Summary

Fixes #66227 — CLI commands (cron list, agents list, etc.) hang indefinitely after printing their result.

Root cause: executeGatewayRequestWithScopes calls client.stop() (fire-and-forget) after the RPC response arrives. This issues ws.close() but does not wait for the server's close frame. The underlying TCP socket remains ref'd until either the server responds or the 250 ms ws.terminate() grace timer fires. Since runCli() returns without process.exit(), any handle that outlives the command stalls the event loop indefinitely.

The regression in 2026.4.12 likely widened this window (Gateway-side close behavior change, tick interval change, or new ref'd handles introduced in that release).

Changes

`src/cli/run-main.ts`

Add process.exit(process.exitCode ?? 0) at the end of runCli(), after closeCliMemoryManagers() completes in the finally block.

All async cleanup runs before the call
process.once('exit', ...) handlers are synchronous and still fire normally (verified: finalizeDebugProxyCapture is sync)
Makes CLI exit deterministic regardless of which handles outlive the command

`src/gateway/client.ts`

Call tickTimer.unref() immediately after the setInterval in startTickWatch().

beginStop() already calls clearInterval(tickTimer) on the normal close path — this is unchanged
unref() is defence-in-depth for any path where clearInterval is not reached before the process would otherwise exit naturally (e.g. uncaught error before beginStop)
No effect on the Gateway server process (server has its own lifecycle management)
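
The unref() change can be illustrated with a small sketch that assumes a Node.js runtime, where setInterval returns a Timeout object with unref() and hasRef(). The UnrefableTimer interface is a hypothetical structural helper added only so the example type-checks without Node typings, and this startTickWatch is a stand-in, not the function in src/gateway/client.ts.

```typescript
// Minimal structural view of the Node.js Timeout object (helper for this sketch only).
interface UnrefableTimer {
  unref(): unknown;
  hasRef(): boolean;
}

// Start a periodic tick that does not keep the process alive on its own.
function startTickWatch(
  onTick: () => void,
  intervalMs: number,
): ReturnType<typeof setInterval> {
  const tickTimer = setInterval(onTick, intervalMs);
  // Under Node.js, unref() lets the event loop exit even if the timer is still armed.
  (tickTimer as unknown as UnrefableTimer).unref();
  return tickTimer;
}

const watchTimer = startTickWatch(() => {}, 1_000);
// hasRef() reports whether the timer would still hold the event loop open.
const keepsProcessAlive = (watchTimer as unknown as UnrefableTimer).hasRef();
// Normal shutdown path still clears the interval explicitly.
clearInterval(watchTimer);
```

Clearing the interval remains the primary cleanup; unref() only covers paths where that cleanup is never reached.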

Safety analysis

process.once('exit') in run-main.ts (line 178): calls finalizeDebugProxyCapture() — synchronous SQLite flush + fetch patch removal. Fires correctly via process.exit(). ✅
No other process.on('exit') or process.on('beforeExit') registrations found in src/cli/. ✅
stopAndWait() was considered and rejected: awaiting a close-frame-dependent promise inside onHelloOk's callback stack has awkward timing and is redundant given the process.exit() in runCli(). ✅

Testing

Manual verification: openclaw cron list and openclaw agents list exit immediately after output. No behaviour change for error paths (already called process.exit(1)).

AI Disclosure

AI-assisted (Claude Code via OpenClaw)
Root cause traced through full call chain: runCli → program.parseAsync → callGatewayFromCli → executeGatewayRequestWithScopes → GatewayClient
Exit handler audit performed before adding process.exit()
Understands what the code does — minimal targeted fix, no architecture change

Changed files

src/entry.ts (modified, +5/-1)
src/gateway/client.ts (modified, +5/-0)
src/index.ts (modified, +8/-5)
src/infra/exit-after-flush.ts (added, +28/-0)

Code Example

15:05:42 [INFO] cron: started, jobs: 0, nextWakeAtMs: null
15:14:40 [ERROR] gateway connect failed: pairing required (CLI cron list)
15:16:33 [ERROR] gateway connect failed: pairing required (CLI cron list) 
Why the CLI hangs: the gateway is bound to a LAN address and dangerouslyDisableDeviceAuth=true is set, which conflicts with pairing mode.

Bug type

Regression (worked before, now fails)

Beta release blocker

Yes

Summary

Cron scheduler bug, located:

TypeError: Cannot read properties of undefined (reading 'runningAtMs')

// isRunnableJob, lines 6211-6212
if (!job.state) job.state = {};
if (typeof job.state.runningAtMs === "number") return false;

// isJobDue, lines 3809-3810
if (!job.state) job.state = {};

Steps to reproduce

The bug is still present in 4.11, at line 6582 of /app/dist/server.impl-CsRRyd9F.js:

for (const job of jobs) if (typeof job.state.runningAtMs === "number") {
//                                 ^^^^^^^^^^^^^^^^^^^^^ no guard

Compare the guarded site (lines 6211-6212):

if (!job.state) job.state = {}; // ← guard present
if (typeof job.state.runningAtMs === "number") return false;

Expected behavior

Today's log (/tmp/openclaw/openclaw-2026-04-13.log)

for (const job of jobs) if (typeof job.state.runningAtMs === "number") {
//                                 ^^^^^^^^^^^^^^^^^^^^^ no guard

Compare the guarded site (lines 6211-6212):

if (!job.state) job.state = {}; // ← guard present
if (typeof job.state.runningAtMs === "number") return false;

Trigger condition: the store contains jobs, and at least one of them lacks the state field (created by an older version, or damaged data).

Actual behavior

Reply to 紫月: check my cron jobs. Result of the check:

Current system state:
Cron jobs: 0 (the store file /home/node/.openclaw/cron/jobs.json does not exist; not yet initialized)
CLI status: hung. As you said, after the gateway restarts the WebSocket RPC hangs and CLI commands get no response.

Confirmed problems:
openclaw cron list: the CLI hangs and cannot connect over WebSocket
The cron store directory /home/node/.openclaw/cron/ does not exist
The log shows the job count was 0 at the last cron start

Item | Status
-- | --
Cron store path | /home/node/.openclaw/cron/jobs.json
Current job count | 0
CLI connection | Failed: pairing required (WebSocket RPC requires pairing)
Bug status | start() line 6582 still lacks the if (!job.state) job.state = {} guard

15:05:42 [INFO] cron: started, jobs: 0, nextWakeAtMs: null
15:14:40 [ERROR] gateway connect failed: pairing required (CLI cron list)
15:16:33 [ERROR] gateway connect failed: pairing required (CLI cron list)
Why the CLI hangs: the gateway is bound to a LAN address and dangerouslyDisableDeviceAuth=true is set, which conflicts with pairing mode.

Cron jobs: 0 (the store file is empty or does not exist)
Bug still present in 4.11 (not fixed)
CLI unusable (WebSocket pairing problem)

Daemon script: /home/node/.openclaw/workspace/scripts/juya_ai_daily_scheduler.py
Push script: /home/node/.openclaw/workspace/scripts/juya_ai_daily.py

OpenClaw version

2026.4.11

Operating system

Debian13

Install method

Docker

Model

MinMax2.7

Provider / routing chain

Openclaw->Openclaw-gateway->mainmax2.7->feishu

Additional provider/model setup details

No response

Logs, screenshots, and evidence

15:05:42 [INFO] cron: started, jobs: 0, nextWakeAtMs: null
15:14:40 [ERROR] gateway connect failed: pairing required (CLI cron list)
15:16:33 [ERROR] gateway connect failed: pairing required (CLI cron list) 
Why the CLI hangs: the gateway is bound to a LAN address and dangerouslyDisableDeviceAuth=true is set, which conflicts with pairing mode.

Impact and severity

No response

Additional information

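The previous version, openclaw 2026.3.28, worked. The problem appeared after upgrading to openclaw 4.9: rolling back makes it go away, and upgrading again brings it back.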

extent analysis

TL;DR

The most likely fix for the cron scheduler bug is to add a guard to check if job.state exists before accessing job.state.runningAtMs in the start() function.

Guidance

Verify the bug: Check if the issue still exists in the current version of OpenClaw (2026.4.11) by running the cron scheduler and checking the logs for errors.
Add a guard: Modify the start() function in /app/dist/server.impl-CsRRyd9F.js to add a check for job.state before accessing job.state.runningAtMs, similar to the checks in isRunnableJob and isJobDue functions.
Check for job state: Ensure that all jobs in the cron store have a valid state field to prevent the error from occurring.
Resolve WebSocket pairing issue: Fix the WebSocket pairing issue that is causing the CLI to hang by resolving the conflict between Gateway binding to a LAN address, dangerouslyDisableDeviceAuth=true, and pairing mode.

Example

// Modified start() function with added guard
for (const job of jobs) {
  if (!job.state) job.state = {}; // Add this line to initialize job.state if it's missing
  if (typeof job.state.runningAtMs === "number") {
    // ...
  }
}

Notes

The bug is still present in OpenClaw version 2026.4.11.
The issue is caused by a missing state field in some jobs, which can be due to old version creation or data corruption.
The WebSocket pairing issue is a separate problem that needs to be resolved to fix the CLI hanging issue.

Recommendation

Apply the workaround by adding a guard to the start() function to check for job.state before accessing job.state.runningAtMs. This will prevent the error from occurring, but it's recommended to also investigate and fix the underlying issue causing the missing state field in some jobs.
