openclaw - ✅(Solved) Fix cron: 'skipped' status bypasses failureAlert — persistently skipped jobs go undetected [1 pull requests, 1 participants]

openclaw · 2026-04-04 11:47:56

GitHub stats

openclaw/openclaw#60846 • Fetched 2026-04-08 02:46:31

Author

slideshow-dingo

When a cron job transitions to "skipped" status (e.g., gateway-restart checks health and skips), the failureAlert is never evaluated. This means jobs that are configured to alert on failure silently skip with no notification, even when the skip itself may represent an unmet operational requirement (e.g., a health check that never succeeds, a job that has been permanently stuck in "skipped").

Error Message

The job has failureAlert configured but the alert never fires because the status is "skipped", not "error".

Root Cause

The failureAlert delivery logic only evaluates when lastStatus === "error" or lastStatus === "failed". The "skipped" status is not considered a failure condition, which is correct from a success/failure perspective — but it means there is no visibility into jobs that never actually execute.

Fix Action

Fixed

Fixed by PR: fix: cron failureAlert fires correctly on error and skipped runs (https://github.com/openclaw/openclaw/pull/60876)

PR fix notes

PR #60876: fix: cron failureAlert fires correctly on error and skipped runs

Repository: openclaw/openclaw
Author: lml2468
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/60876

Description (problem / solution / changelog)

Summary

Two related cron monitoring failures fixed in applyJobResult (src/cron/service/timer.ts).

Fix: #60845 — `failureAlert` never fires on error runs

Root cause: The local fork added an extra condition to the isBestEffort guard in applyJobResult:

// Before (broken):
const isBestEffort =
  job.delivery?.bestEffort === true ||
  (job.payload.kind === "agentTurn" && job.payload.bestEffortDeliver === true);

payload.bestEffortDeliver is a legacy field that gates output delivery, not failure alerting. Any agentTurn job that had bestEffortDeliver: true in its payload (set during migration from the old top-level format) would silently skip failureAlert forever — consecutiveErrors incremented correctly but emitFailureAlert was never called.

Fix: Revert to the upstream guard:

// After (correct):
const isBestEffort = job.delivery?.bestEffort === true;
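
To make the difference between the two guards concrete, here is a minimal TypeScript sketch. The JobLike type and the helper names isBestEffortBroken and isBestEffortUpstream are hypothetical and exist only for illustration; the field names and the two boolean expressions are taken from the PR description above. The real job type in src/cron/types.ts is not shown in this document and is likely richer.

```typescript
// Hypothetical, trimmed-down job shape: only the fields the guard reads.
type JobLike = {
  delivery?: { bestEffort?: boolean };
  payload: { kind: string; bestEffortDeliver?: boolean };
};

// Guard as it stood in the local fork (broken): it also honours the
// legacy payload-level flag, which is meant to gate output delivery only.
function isBestEffortBroken(job: JobLike): boolean {
  return (
    job.delivery?.bestEffort === true ||
    (job.payload.kind === "agentTurn" && job.payload.bestEffortDeliver === true)
  );
}

// Upstream guard restored by the fix: only the delivery-level flag counts.
function isBestEffortUpstream(job: JobLike): boolean {
  return job.delivery?.bestEffort === true;
}

// A job migrated from the old top-level format: the legacy flag is set,
// but delivery.bestEffort was never configured.
const migratedJob: JobLike = {
  payload: { kind: "agentTurn", bestEffortDeliver: true },
};

const brokenResult = isBestEffortBroken(migratedJob); // true: treated as best-effort
const upstreamResult = isBestEffortUpstream(migratedJob); // false: alerting stays active
```

With the broken guard the migrated job is classified as best-effort, which per the description above is what kept emitFailureAlert from being called; with the upstream guard it is not.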

Fix: #60846 — `failureAlert` never evaluated for "skipped" runs

Root cause: The else branch in applyJobResult handled both "ok" and "skipped" results identically — resetting consecutiveErrors to 0 and never calling resolveFailureAlert. A job that is permanently stuck in "skipped" state (e.g. gateway-restart health-check jobs, jobs with empty systemEvent text) generated zero alerts regardless of failureAlert configuration.

Fix: Split "skipped" into its own branch with:

A new consecutiveSkips counter (mirrors consecutiveErrors)
A new lastSkipAlertAtMs cooldown timestamp (mirrors lastFailureAlertAtMs)
resolveFailureAlert evaluation against consecutiveSkips >= alertConfig.after
emitFailureAlert with isSkip: true for a distinct message ("skipped N times / Reason: ...")
consecutiveErrors + lastFailureAlertAtMs still reset on "skipped" (a skip is not an error)
Both counters reset on "ok" (clean run clears all alert state)
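
The branch structure listed above can be sketched as follows. This is not the code from src/cron/service/timer.ts, which is not reproduced here. The function name applyResultSketch, the helper dueForAlert, the Alert return type, and the cooldownMs field on the alert config are assumptions made for illustration; consecutiveSkips, lastSkipAlertAtMs, consecutiveErrors, lastFailureAlertAtMs, the after threshold, and the isSkip flag come from the PR description. What the error branch does to the skip counter is not stated in the description, so the sketch leaves it untouched.

```typescript
type RunStatus = "ok" | "error" | "skipped";

// `after` is from the PR description; `cooldownMs` is an assumed field name.
type AlertConfig = { after: number; cooldownMs: number };

type JobState = {
  consecutiveErrors?: number;
  consecutiveSkips?: number;
  lastFailureAlertAtMs?: number;
  lastSkipAlertAtMs?: number;
};

// Hypothetical stand-in for whatever emitFailureAlert would receive.
type Alert = { isSkip: boolean; count: number };

// True when the streak has reached the threshold and the cooldown has elapsed.
function dueForAlert(
  count: number,
  lastAlertAtMs: number | undefined,
  cfg: AlertConfig,
  nowMs: number,
): boolean {
  if (count < cfg.after) return false;
  return lastAlertAtMs === undefined || nowMs - lastAlertAtMs >= cfg.cooldownMs;
}

function applyResultSketch(
  state: JobState,
  status: RunStatus,
  cfg: AlertConfig | undefined,
  nowMs: number,
): Alert | undefined {
  if (status === "ok") {
    // A clean run clears all alert state.
    state.consecutiveErrors = 0;
    state.consecutiveSkips = 0;
    delete state.lastFailureAlertAtMs;
    delete state.lastSkipAlertAtMs;
    return undefined;
  }

  if (status === "skipped") {
    // A skip is not an error: reset the error side, count the skip.
    state.consecutiveErrors = 0;
    delete state.lastFailureAlertAtMs;
    const skips = (state.consecutiveSkips ?? 0) + 1;
    state.consecutiveSkips = skips;
    if (cfg && dueForAlert(skips, state.lastSkipAlertAtMs, cfg, nowMs)) {
      state.lastSkipAlertAtMs = nowMs;
      return { isSkip: true, count: skips };
    }
    return undefined;
  }

  // status === "error"; skip state is left as-is (assumption, see lead-in).
  const errors = (state.consecutiveErrors ?? 0) + 1;
  state.consecutiveErrors = errors;
  if (cfg && dueForAlert(errors, state.lastFailureAlertAtMs, cfg, nowMs)) {
    state.lastFailureAlertAtMs = nowMs;
    return { isSkip: false, count: errors };
  }
  return undefined;
}
```

Keeping separate counters and separate cooldown timestamps means a job that alternates between errors and skips does not have one kind of alert suppress the other.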

Files Changed

src/cron/service/timer.ts — emitFailureAlert + applyJobResult
src/cron/types.ts — added consecutiveSkips?: number and lastSkipAlertAtMs?: number to CronJobState
src/gateway/protocol/schema/cron.ts — added consecutiveSkips and lastSkipAlertAtMs to CronJobStateSchema

Fixes openclaw/openclaw#60845 Fixes openclaw/openclaw#60846

Changed files

src/config/io.ts (modified, +49/-3)
src/cron/service/timer.ts (modified, +64/-26)
src/cron/types.ts (modified, +4/-0)
src/gateway/protocol/schema/cron.ts (modified, +2/-0)

Bug type

Bug / Silent Failure

OpenClaw version

OpenClaw 2026.4.2 (d74a122)

OS and install method

Linux 6.8.0-106-generic (x64), Node.js v24.14.1, npm global, systemd user service.

Steps to reproduce

Configure a cron job with failureAlert: { after: 1, channel: discord, to: channel:xxx }
Ensure the job's script exits with status 0 but reports "skipped" (e.g., a health check that always passes)
Observe the job state: "lastStatus": "skipped", "lastDeliveryStatus": "not-requested"
Even after many consecutive skips, no alert fires

Expected behavior

At minimum, the operator should receive feedback that a job is persistently being skipped — either via failureAlert, a warning in openclaw cron list, or a separate skipAlert configuration.

Actual behavior

Jobs that are stuck in "skipped" go completely unmonitored. No warning is logged, no alert is delivered, and openclaw cron list only shows "skipped" without indicating how many consecutive skips have occurred.

Logs and evidence

The gateway-restart cron job on a live instance has been in "skipped" state for 19+ hours:

{
  "name": "gateway-restart",
  "lastStatus": "skipped",
  "lastRunStatus": "skipped",
  "lastDeliveryStatus": "not-requested",
  "lastError": "disabled",
  "failureAlert": { "after": 1, "channel": "discord", "to": "channel:..." }
}

The job has failureAlert configured but the alert never fires because the status is "skipped", not "error".

Impact

Jobs that are permanently stuck in "skipped" (e.g., misconfigured health checks, cron expressions that fire at the wrong time) go undetected indefinitely
No way to distinguish "healthily skipping" from "permanently stuck"
The failureAlert configuration provides a false sense of monitoring coverage

Proposed fix

Add a new skipAlert configuration option (or extend failureAlert):

{
  "failureAlert": {
    "after": 3,
    "includeSkipped": true,
    "cooldownMs": 3600000
  }
}

Or alternatively, make failureAlert evaluate against any non-"ok" / non-"idle" terminal state (including "skipped").
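
The two options can be compared as predicates over the job's last status. This is a sketch only: the LastStatus union lists just the statuses named in this issue (the real set may be larger), includeSkipped is the option proposed above rather than an existing setting, and both function names are hypothetical.

```typescript
// Only the statuses mentioned in this issue; the real union may differ.
type LastStatus = "ok" | "idle" | "skipped" | "error" | "failed";

// `includeSkipped` is the proposed opt-in flag, not an existing option.
type FailureAlertOptions = { after: number; includeSkipped?: boolean };

// Option 1: skips count toward the alert only when explicitly opted in.
function countsTowardFailureAlert(
  status: LastStatus,
  opts: FailureAlertOptions,
): boolean {
  if (status === "error" || status === "failed") return true;
  if (status === "skipped") return opts.includeSkipped === true;
  return false;
}

// Option 2: every terminal state other than "ok" and "idle" counts.
function countsTowardFailureAlertBroad(status: LastStatus): boolean {
  return status !== "ok" && status !== "idle";
}
```

The opt-in variant keeps existing configurations quiet for jobs that skip by design, while the broad variant changes behavior for every job that already has failureAlert set.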

Related

#60845 — Cron failureAlert deliveryStatus always "not-requested" on error runs
#54834 — Cron isolated agentTurn announce delivery can complete with deliveryStatus: "unknown"

extent analysis

TL;DR

To address the issue of jobs silently skipping without notification, consider adding a skipAlert configuration option or extending the failureAlert to include skipped states.

Guidance

Review the failureAlert configuration to understand its current behavior and limitations.
Evaluate the proposed fix of adding a skipAlert option or extending failureAlert to include skipped states, considering the potential impact on monitoring and alerting.
Assess the current job configuration and identify potential cases where jobs may be permanently stuck in a "skipped" state, requiring additional monitoring or alerting.
Consider implementing a temporary workaround, such as manually monitoring job states or adding custom alerting logic, until a permanent fix is implemented.

Example

A potential configuration update could involve adding a skipAlert option, such as:

{
  "skipAlert": {
    "after": 3,
    "channel": "discord",
    "to": "channel:xxx"
  }
}

Alternatively, extending the failureAlert to include skipped states could be achieved by updating the configuration to:

{
  "failureAlert": {
    "after": 3,
    "includeSkipped": true,
    "channel": "discord",
    "to": "channel:xxx"
  }
}

Notes

The proposed fix requires careful consideration of the potential impact on monitoring and alerting, as well as the trade-offs between false positives and false negatives. Additionally, the implementation details of the skipAlert or extended failureAlert configuration will depend on the specific requirements and constraints of the system.

Recommendation

Apply a workaround by implementing a custom monitoring or alerting solution to detect and notify about jobs stuck in a "skipped" state, until a permanent fix is implemented. This will provide temporary visibility into potential issues and allow for more informed decision-making.
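
One possible shape for such a workaround is a small watcher that is fed periodic snapshots of job state and tracks how many consecutive snapshots showed a job as "skipped". Everything here is a hypothetical sketch: the SkipWatcher class and the ObservedJob type are invented for illustration, and how the snapshots are obtained is left open. Only the name, lastStatus, and lastError fields mirror the job state shown in this issue.

```typescript
// Subset of the job state shown in the issue; other fields are ignored.
type ObservedJob = { name: string; lastStatus: string; lastError?: string };

class SkipWatcher {
  // Consecutive snapshots in which each job was seen as "skipped".
  private readonly streaks = new Map<string, number>();
  private readonly threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  // Feed one snapshot; returns the names of jobs whose skip streak
  // has reached the threshold.
  observe(jobs: ObservedJob[]): string[] {
    const stuck: string[] = [];
    for (const job of jobs) {
      if (job.lastStatus === "skipped") {
        const streak = (this.streaks.get(job.name) ?? 0) + 1;
        this.streaks.set(job.name, streak);
        if (streak >= this.threshold) stuck.push(job.name);
      } else {
        // Any other status ends the streak.
        this.streaks.delete(job.name);
      }
    }
    return stuck;
  }
}
```

Note that this counts snapshots, not job runs, so it only approximates consecutive skips; polling at roughly the job's own schedule keeps the two close.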
