openclaw - ✅(Solved) Fix cron: failureAlert never fires — all error jobs show deliveryStatus 'not-requested' [1 pull requests, 1 participants]

Q: Expected behavior

After `consecutiveErrors >= failureAlert.after` (e.g., 1), the gateway should deliver the failure alert to the configured channel.

openclaw, 2026-04-04 11:47:34

openclaw/openclaw#60845 • Fetched 2026-04-08 02:46:33

Author

slideshow-dingo

Participants

slideshow-dingo

Timeline (top)

cross-referenced ×3, referenced ×1

Error Message

Cron job failureAlert is configured but never fires because deliveryStatus is always "not-requested" on error runs. The consecutiveErrors counter increments correctly, but the delivery path is skipped entirely. No alert reaches Discord, Slack, or any configured channel.

1. Create a cron job with failureAlert: { after: 1, channel: discord, to: channel:xxx }
2. Trigger the job to fail (script error or timeout)
3. Run openclaw cron list: the job shows error status with consecutiveErrors: 1 (or higher)
4. Query job runs with openclaw cron runs <id>: every error run shows deliveryStatus: not-requested

Every error run shows deliveryStatus: "not-requested". No delivery attempt is logged. The failureAlert.after threshold has no effect, so alerts never fire.

The failureAlert delivery path appears to be decoupled from the job runner error handling. When a cron job fails (either agentTurn timeout or script error), the consecutiveErrors counter increments but the delivery request is never triggered. This is distinct from #56521 (feature request for agent-turn alerts): this bug means even the baseline announce delivery mechanism does not fire.

Root Cause

The failureAlert delivery path appears to be decoupled from the job runner error handling. When a cron job fails (either agentTurn timeout or script error), the consecutiveErrors counter increments but the delivery request is never triggered. This is distinct from #56521 (feature request for agent-turn alerts) — this bug means even the baseline announce delivery mechanism does not fire.

Fix Action

Addressed by PR: fix: cron failureAlert fires correctly on error and skipped runs (https://github.com/openclaw/openclaw/pull/60876). At fetch time the PR was listed as open and not yet merged.

PR fix notes

PR #60876: fix: cron failureAlert fires correctly on error and skipped runs

Repository: openclaw/openclaw
Author: lml2468
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/60876

Description (problem / solution / changelog)

Summary

Two related cron monitoring failures fixed in applyJobResult (src/cron/service/timer.ts).

Fix: #60845 — `failureAlert` never fires on error runs

Root cause: The local fork added an extra condition to the isBestEffort guard in applyJobResult:

// Before (broken):
const isBestEffort =
  job.delivery?.bestEffort === true ||
  (job.payload.kind === "agentTurn" && job.payload.bestEffortDeliver === true);

payload.bestEffortDeliver is a legacy field that gates output delivery, not failure alerting. Any agentTurn job that had bestEffortDeliver: true in its payload (set during migration from the old top-level format) would silently skip failureAlert forever — consecutiveErrors incremented correctly but emitFailureAlert was never called.

Fix: Revert to the upstream guard:

// After (correct):
const isBestEffort = job.delivery?.bestEffort === true;
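
To make the effect of the two guards concrete, here is a minimal sketch. It assumes, as the PR describes, that a best-effort job skips failure alerting entirely. The names `SketchJob`, `isBestEffortBefore`, `isBestEffortAfter`, and `alertDue` are hypothetical and are not taken from the openclaw codebase; the real job type has more fields.

```typescript
// Hypothetical, simplified job shape (assumption: only the fields the PR mentions).
interface SketchJob {
  delivery?: { bestEffort?: boolean };
  payload: { kind: "agentTurn" | "systemEvent"; bestEffortDeliver?: boolean };
  failureAlert?: { after: number };
  state: { consecutiveErrors: number };
}

// Guard as in the "before" snippet: the legacy payload flag also counts.
function isBestEffortBefore(job: SketchJob): boolean {
  return (
    job.delivery?.bestEffort === true ||
    (job.payload.kind === "agentTurn" && job.payload.bestEffortDeliver === true)
  );
}

// Guard as in the "after" snippet: only delivery.bestEffort counts.
function isBestEffortAfter(job: SketchJob): boolean {
  return job.delivery?.bestEffort === true;
}

// Assumption: an alert is due when the job is not best-effort
// and the consecutive error count has reached the threshold.
function alertDue(job: SketchJob, isBestEffort: (j: SketchJob) => boolean): boolean {
  if (!job.failureAlert || isBestEffort(job)) return false;
  return job.state.consecutiveErrors >= job.failureAlert.after;
}

// A migrated agentTurn job that still carries the legacy flag.
const migratedJob: SketchJob = {
  payload: { kind: "agentTurn", bestEffortDeliver: true },
  failureAlert: { after: 1 },
  state: { consecutiveErrors: 3 },
};
```

With the "before" guard, `alertDue(migratedJob, isBestEffortBefore)` is false no matter how many errors accumulate; with the "after" guard it becomes true once the threshold is reached.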

Fix: #60846 — `failureAlert` never evaluated for "skipped" runs

Root cause: The else branch in applyJobResult handled both "ok" and "skipped" results identically — resetting consecutiveErrors to 0 and never calling resolveFailureAlert. A job that is permanently stuck in "skipped" state (e.g. gateway-restart health-check jobs, jobs with empty systemEvent text) generated zero alerts regardless of failureAlert configuration.

Fix: Split "skipped" into its own branch with:

A new consecutiveSkips counter (mirrors consecutiveErrors)
A new lastSkipAlertAtMs cooldown timestamp (mirrors lastFailureAlertAtMs)
resolveFailureAlert evaluation against consecutiveSkips >= alertConfig.after
emitFailureAlert with isSkip: true for a distinct message ("skipped N times / Reason: ...")
consecutiveErrors + lastFailureAlertAtMs still reset on "skipped" (a skip is not an error)
Both counters reset on "ok" (clean run clears all alert state)
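
The branch layout listed above could look roughly like the following sketch. It is not the PR's code: `applyResultSketch`, `cooledDown`, and the `cooldownMs` field are hypothetical names, and the source does not say whether an error run resets the skip counter, so this sketch leaves it untouched.

```typescript
type RunStatusSketch = "ok" | "error" | "skipped";

// Mirrors the state fields the PR lists; everything else is omitted.
interface JobStateSketch {
  consecutiveErrors?: number;
  lastFailureAlertAtMs?: number;
  consecutiveSkips?: number;
  lastSkipAlertAtMs?: number;
}

// `cooldownMs` is a hypothetical name for the alert cooldown window.
interface AlertConfigSketch { after: number; cooldownMs: number }

interface AlertSketch { isSkip: boolean; count: number }

function cooledDown(lastAtMs: number | undefined, nowMs: number, cooldownMs: number): boolean {
  return lastAtMs === undefined || nowMs - lastAtMs >= cooldownMs;
}

// Updates `state` in place and returns an alert to emit, if any.
function applyResultSketch(
  state: JobStateSketch,
  status: RunStatusSketch,
  cfg: AlertConfigSketch,
  nowMs: number,
): AlertSketch | undefined {
  if (status === "ok") {
    // A clean run clears all alert state.
    state.consecutiveErrors = 0;
    state.consecutiveSkips = 0;
    delete state.lastFailureAlertAtMs;
    delete state.lastSkipAlertAtMs;
    return undefined;
  }
  if (status === "skipped") {
    // A skip is not an error: reset the error-side state, count the skip.
    state.consecutiveErrors = 0;
    delete state.lastFailureAlertAtMs;
    const skips = (state.consecutiveSkips ?? 0) + 1;
    state.consecutiveSkips = skips;
    if (skips >= cfg.after && cooledDown(state.lastSkipAlertAtMs, nowMs, cfg.cooldownMs)) {
      state.lastSkipAlertAtMs = nowMs;
      return { isSkip: true, count: skips };
    }
    return undefined;
  }
  // Error run: the skip counter is left as-is (unspecified in the source).
  const errors = (state.consecutiveErrors ?? 0) + 1;
  state.consecutiveErrors = errors;
  if (errors >= cfg.after && cooledDown(state.lastFailureAlertAtMs, nowMs, cfg.cooldownMs)) {
    state.lastFailureAlertAtMs = nowMs;
    return { isSkip: false, count: errors };
  }
  return undefined;
}
```

Keeping separate counters and separate cooldown timestamps means a job that alternates between skips and errors cannot have one kind of alert suppress the other.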

Files Changed

src/cron/service/timer.ts — emitFailureAlert + applyJobResult
src/cron/types.ts — added consecutiveSkips?: number and lastSkipAlertAtMs?: number to CronJobState
src/gateway/protocol/schema/cron.ts — added consecutiveSkips and lastSkipAlertAtMs to CronJobStateSchema

Fixes openclaw/openclaw#60845
Fixes openclaw/openclaw#60846

Changed files

src/config/io.ts (modified, +49/-3)
src/cron/service/timer.ts (modified, +64/-26)
src/cron/types.ts (modified, +4/-0)
src/gateway/protocol/schema/cron.ts (modified, +2/-0)

Bug type

Bug / Silent Failure

OpenClaw version

OpenClaw 2026.4.2 (d74a122)

OS and install method

Linux 6.8.0-106-generic (x64), Node.js v24.14.1, npm global, systemd user service.

Summary

Steps to reproduce

1. Create a cron job with failureAlert: { after: 1, channel: discord, to: channel:xxx }
2. Trigger the job to fail (script error or timeout)
3. Run openclaw cron list: the job shows error status with consecutiveErrors: 1 (or higher)
4. Query job runs with openclaw cron runs <id>: every error run shows deliveryStatus: not-requested
5. Check the Discord channel: no alert message was posted

Expected behavior

After consecutiveErrors >= failureAlert.after (e.g., 1), the gateway should deliver the failure alert to the configured channel.
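
As an illustration of the expected behavior, the failureAlert object from the reproduction steps and the threshold condition can be written out as below. The interface name `FailureAlertSketch` and the helper `thresholdReached` are hypothetical; only the three field names and the `>=` comparison come from the issue text, and `channel:xxx` is the issue's own placeholder.

```typescript
// Field names taken from the reproduction steps; the interface name is an assumption.
interface FailureAlertSketch {
  after: number;   // alert once this many consecutive errors have occurred
  channel: string; // delivery channel, e.g. "discord"
  to: string;      // target within the channel (placeholder kept from the issue)
}

const failureAlertExample: FailureAlertSketch = {
  after: 1,
  channel: "discord",
  to: "channel:xxx",
};

// The condition stated in the issue: consecutiveErrors >= failureAlert.after.
function thresholdReached(consecutiveErrors: number, alert: FailureAlertSketch): boolean {
  return consecutiveErrors >= alert.after;
}
```

With `after: 1`, the very first failed run satisfies the condition, which is why the reporter expected an alert after a single error.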

Actual behavior

Every error run shows deliveryStatus: "not-requested". No delivery attempt is logged. The failureAlert.after threshold has no effect — alerts never fire.

Logs and evidence

Two jobs confirmed on a live instance running 2026.4.2:

pr47305-monitor: 28 errors today, all "not-requested"
memory-health-unified: 2 errors today, all "not-requested"

Gateway logs (/tmp/openclaw/openclaw-2026-04-04.log) show zero delivery request entries for any failureAlert-configured job.
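
A quick way to confirm the same pattern on another instance is to tally error runs by deliveryStatus. The sketch below assumes run records have already been collected into objects with `status` and `deliveryStatus` fields; that shape and the name `tallyErrorDeliveries` are hypothetical, and the actual output format of openclaw cron runs may differ.

```typescript
// Hypothetical run record shape (assumption, not the real CLI output format).
interface RunRecordSketch {
  status: "ok" | "error" | "skipped";
  deliveryStatus: string;
}

// Counts error runs per deliveryStatus value; non-error runs are ignored.
function tallyErrorDeliveries(runs: RunRecordSketch[]): Record<string, number> {
  const tally: Record<string, number> = {};
  for (const run of runs) {
    if (run.status !== "error") continue;
    tally[run.deliveryStatus] = (tally[run.deliveryStatus] ?? 0) + 1;
  }
  return tally;
}
```

If the only key in the result is "not-requested", the instance shows the behavior described in this issue.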

Root cause analysis

Impact

All cron failure alerts are silently dropped — operators receive zero notification when jobs fail, regardless of failureAlert configuration
This affects systemEvent and agentTurn jobs alike
The failureAlert.after setting is effectively a no-op

Proposed fix

Audit the cron error handling path to confirm requestDelivery() is not called on job failure
Wire up failureAlert delivery to fire when consecutiveErrors >= failureAlert.after
Add a deliveryStatus value of "failed" (delivery attempted but failed) distinct from "not-requested" (never attempted)
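
The distinction proposed in the last item could be modelled as a small status type. This is a sketch only: "not-requested" and "unknown" appear in the issue text, "failed" is the proposed addition, and "delivered" plus the function `classifyDelivery` are assumptions made for illustration.

```typescript
// "delivered" is an assumed success value; the others come from the issue text.
type DeliveryStatusSketch = "not-requested" | "delivered" | "failed" | "unknown";

// Maps what happened during a run to a status value.
// `requested`: whether a delivery was attempted at all.
// `outcome`: result of the attempt, if known.
function classifyDelivery(
  requested: boolean,
  outcome?: "ok" | "error",
): DeliveryStatusSketch {
  if (!requested) return "not-requested";
  if (outcome === "ok") return "delivered";
  if (outcome === "error") return "failed";
  return "unknown"; // attempted, but the result was never recorded
}
```

Separating "failed" from "not-requested" would let an operator tell a broken channel apart from a delivery path that never ran, which is the ambiguity this issue ran into.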

Related

#56521 — Feature: Route failure alerts as agent-turn events (this bug blocks that feature from working at all)
#54834 — Cron isolated agentTurn announce delivery can complete with deliveryStatus: "unknown"

extent analysis

TL;DR

The failureAlert delivery path needs to be coupled with the job runner error handling to trigger alerts when a cron job fails.

Guidance

Review the cron error handling code to ensure requestDelivery() is called when a job fails, specifically when consecutiveErrors >= failureAlert.after.
Verify that the deliveryStatus is updated correctly to reflect the delivery attempt, such as adding a "failed" status.
Check the gateway logs for any delivery request entries related to failureAlert-configured jobs to confirm the issue.
Test the failureAlert functionality with a simple cron job to isolate the problem and verify the fix.

Example

No code snippet is provided as the issue does not include specific code references.

Notes

The issue's proposed fix is to audit the cron error handling path and wire up failureAlert delivery. PR #60876 attributes the error-run case to the isBestEffort guard in applyJobResult (src/cron/service/timer.ts).

Recommendation

Apply a workaround by manually triggering the failureAlert delivery when a job fails, until the underlying issue is fixed. This can be done by creating a custom script that checks the job status and triggers the alert when necessary.
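
One possible shape for such a script is sketched below. It assumes job status has already been read into plain objects and that the caller supplies its own `notify` function (for example, a webhook call); `JobSnapshotSketch`, `jobsNeedingAlert`, and `runWatchdogSketch` are hypothetical names, and nothing here uses an openclaw API.

```typescript
// Hypothetical snapshot of a job's status (assumption: gathered by the operator).
interface JobSnapshotSketch {
  id: string;
  consecutiveErrors: number;
  failureAlertAfter?: number; // undefined when no failureAlert is configured
}

// Jobs whose consecutive error count has reached their configured threshold.
function jobsNeedingAlert(jobs: JobSnapshotSketch[]): JobSnapshotSketch[] {
  return jobs.filter(
    (job) => job.failureAlertAfter !== undefined && job.consecutiveErrors >= job.failureAlertAfter,
  );
}

// Sends one message per failing job through the caller-supplied notifier
// and returns how many alerts were sent.
function runWatchdogSketch(
  jobs: JobSnapshotSketch[],
  notify: (message: string) => void,
): number {
  const due = jobsNeedingAlert(jobs);
  for (const job of due) {
    notify(`cron job ${job.id} failed ${job.consecutiveErrors} time(s) in a row`);
  }
  return due.length;
}
```

This sketch has no cooldown, so running it on a schedule would repeat the alert on every pass until the job recovers; a real workaround would likely want to remember when it last alerted.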
