openclaw - 💡(How to fix) Fix cron: pre-execution watchdog formula halves effective job timeout for silent tool executions [2 pull requests]

Official PRs (…)
ON THIS PAGE

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Fix Action

Fixed

Code Example

function resolveCronAgentPreExecutionWatchdogMs(jobTimeoutMs: number): number {
  return Math.max(
    CRON_AGENT_PRE_EXECUTION_MIN_WATCHDOG_MS,
    Math.min(CRON_AGENT_PRE_EXECUTION_WATCHDOG_MS, Math.floor(jobTimeoutMs / 2)),
  );
}

---

function resolveCronAgentPreExecutionWatchdogMs(jobTimeoutMs: number): number {
  return Math.max(
    CRON_AGENT_PRE_EXECUTION_MIN_WATCHDOG_MS,
    Math.min(CRON_AGENT_PRE_EXECUTION_WATCHDOG_MS, jobTimeoutMs),
  );
}
RAW_BUFFERClick to expand / collapse

Problem

resolveCronAgentPreExecutionWatchdogMs() uses Math.floor(jobTimeoutMs / 2) as the upper bound for the stale-progress watchdog:

function resolveCronAgentPreExecutionWatchdogMs(jobTimeoutMs: number): number {
  return Math.max(
    CRON_AGENT_PRE_EXECUTION_MIN_WATCHDOG_MS,
    Math.min(CRON_AGENT_PRE_EXECUTION_WATCHDOG_MS, Math.floor(jobTimeoutMs / 2)),
  );
}

This means a job configured with timeoutSeconds: 1800 (30 min) is silently killed at 15 minutes (or at the CRON_AGENT_PRE_EXECUTION_WATCHDOG_MS cap, whichever is lower) if no intermediate phase events arrive after execution starts.

Impact

Shell commands that run silently for their entire duration — rclone sync, rsync, pg_dump, large file transfers — produce no intermediate phase events during execution. The model calls the tool, the tool runs, and eventually returns output. During that silent window the stale-progress watchdog fires at jobTimeoutMs / 2, aborting an otherwise healthy run well before the user-configured job timeout.

Reproduction

  1. Create a cron job with timeoutSeconds: 600 whose agent runs a shell command taking ~400s
  2. The job is killed at 300s by the pre-execution watchdog (half the job timeout)
  3. Increasing timeoutSeconds to 1800 only moves the kill point to 600s (capped by CRON_AGENT_PRE_EXECUTION_WATCHDOG_MS)

Suggested Fix

Change the formula to use the full jobTimeoutMs:

function resolveCronAgentPreExecutionWatchdogMs(jobTimeoutMs: number): number {
  return Math.max(
    CRON_AGENT_PRE_EXECUTION_MIN_WATCHDOG_MS,
    Math.min(CRON_AGENT_PRE_EXECUTION_WATCHDOG_MS, jobTimeoutMs),
  );
}

The job-level timeout already provides the hard upper bound for total run duration. The stale-progress detector should not impose a stricter limit — it only needs to catch genuinely stalled processes (no progress at all), not processes doing legitimate silent work.

Alternatively, tool executions could emit periodic heartbeat phase events so the watchdog knows the tool is still active.

Related

  • #83066 — makes the watchdog constants configurable and raises defaults (addresses the cap being too low)
  • #82223 — original report of isolated cron agent timeout failures

Environment

  • OpenClaw 2026.5.12 (f066dd2)
  • Raspberry Pi 5, 4GB, ARM64, Debian bookworm
  • Job: weekly rclone sync to Cloudflare R2 (~5 min transfer time)

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

openclaw - 💡(How to fix) Fix cron: pre-execution watchdog formula halves effective job timeout for silent tool executions [2 pull requests]