openclaw - ✅(Solved) Fix [Bug]: Cron exec timeout kills workspace backup step silently [1 pull requests, 4 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#57963Fetched 2026-04-08 01:55:37
View on GitHub
Comments
4
Participants
2
Timeline
9
Reactions
0
Author
Timeline (top)
commented ×4closed ×1cross-referenced ×1locked ×1

Error Message

  • Surface the timeout error to the user in the cron run report
  • This appears to be a general issue: scripts that work interactively may silently fail in cron isolation without any error surfaced

Root Cause

Additional context

  • Root cause in our case: cp -R of a workspace containing node_modules (~1.1 GB+) exceeds the default exec timeout for isolated cron sessions
  • Fix applied locally: Replaced cp -R with rsync -a --exclude='node_modules' --exclude='.cache' --exclude='.git' to avoid copying large dependency directories
  • The tarball step is unaffected because it operates on the same filesystem without crossing the exec timeout
  • This appears to be a general issue: scripts that work interactively may silently fail in cron isolation without any error surfaced

PR fix notes

PR #58247: fix(cron): surface exec timeouts in isolated cron runs

Description (problem / solution / changelog)

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: isolated cron runs can hit an exec/bash timeout, but the timeout warning is suppressed when cron is running with verbose: off, so the run can look like it failed silently.
  • Why it matters: users do not get a visible failure signal for long-running cron jobs, which makes timeouts look like unexplained cron breakage instead of a tool timeout.
  • What changed: cron session keys now surface exec/bash tool timeout errors in the embedded run payload even when verbose mode is off, and a regression test was added for that exact path.
  • What did NOT change (scope boundary): this PR does not change exec timeout configuration, cron scheduling, retry behavior, or user backup script behavior.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #57963
  • Related #20560
  • This PR fixes a bug or regression

Root Cause / Regression History (if applicable)

  • Root cause: isolated cron runs default to verbose: off, and the embedded payload builder suppresses exec/bash tool failure warnings in non-verbose mode. That suppression also hid timeout failures for cron sessions, leaving no user-facing error payload.
  • Missing detection / guardrail: there was coverage for generic non-verbose suppression, but no regression test asserting that cron sessions must still surface timeout failures.
  • Prior context (git blame, prior PR, issue, or refactor if known): the behavior appears related to the warning-policy change in #20560 after issue #20101, where exec warnings were gated behind verbose mode.
  • Why this regressed now: isolated cron runs are especially sensitive because they run without interactive context and rely on the final payload for error visibility.
  • If unknown, what was ruled out: ruled out missing timeout configurability; exec timeout configuration already exists and was not the root problem here.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/agents/pi-embedded-runner/run/payloads.test.ts
  • Scenario the test should lock in: for a cron session key with verbose: off, an exec timeout error should still produce a visible warning payload instead of returning no payload.
  • Why this is the smallest reliable guardrail: the bug is in payload-generation policy, so the most direct and stable guardrail is a unit test at the payload builder boundary.
  • Existing test that already covers this (if any): existing tests covered generic non-verbose suppression for exec, but not the cron-specific exception needed here.
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

Isolated cron runs now show exec/bash timeout failures in the cron output even when verbose mode is off, instead of appearing to fail silently.

Diagram (if applicable)

Before:
[cron run] -> [exec timeout] -> [warning suppressed by verbose=off] -> [no visible cron error]

After:
[cron run] -> [exec timeout] -> [cron session exception in payload builder] -> [visible timeout warning]

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: local repo checkout with pnpm / Vitest
  • Model/provider: N/A
  • Integration/channel (if any): isolated cron session
  • Relevant config (redacted): cron isolated run with default verbose: off; exec timeout path

Steps

  1. Run an isolated cron flow, or directly build the embedded payload for a cron session key with verbose: off.
  2. Inject or trigger an exec timeout failure.
  3. Inspect the resulting cron payload.

Expected

  • The cron run should surface a visible timeout warning so the failure is actionable.

Actual

  • Before this fix, the timeout was captured internally but no user-facing cron warning payload was emitted.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: reproduced the cron-session payload path locally before the fix, confirmed the timeout warning was suppressed in non-verbose mode, then verified the cron path now emits a visible warning payload; also ran pnpm test -- src/agents/pi-embedded-runner/run/payloads.test.ts and pnpm test -- src/cron/service.issue-regressions.test.ts -t "applies timeoutSeconds to manual cron.run isolated executions".
  • Edge cases checked: non-cron non-verbose behavior remains covered by the existing payload tests, so this change stays scoped to cron session keys plus exec/bash failures.
  • What you did not verify: a full end-to-end real backup workload under the scheduler, or changes to timeout configuration behavior.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk:
    • Cron runs may now show timeout details that were previously hidden in non-verbose mode.
  • Mitigation:
    • The change is intentionally narrow: it only applies to cron session keys and only for exec/bash tool failures; normal non-verbose suppression behavior remains unchanged outside cron.

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/agents/bash-tools.exec-types.ts (modified, +1/-0)
  • src/agents/bash-tools.exec.ts (modified, +1/-0)
  • src/agents/pi-embedded-runner/run.ts (modified, +1/-0)
  • src/agents/pi-embedded-runner/run/payloads.test-helpers.ts (modified, +1/-0)
  • src/agents/pi-embedded-runner/run/payloads.test.ts (modified, +45/-0)
  • src/agents/pi-embedded-runner/run/payloads.ts (modified, +14/-2)
  • src/agents/pi-embedded-runner/run/types.ts (modified, +1/-0)
  • src/agents/pi-embedded-subscribe.handlers.tools.test.ts (modified, +36/-0)
  • src/agents/pi-embedded-subscribe.handlers.tools.ts (modified, +2/-0)
  • src/agents/pi-embedded-subscribe.handlers.types.ts (modified, +1/-0)
  • src/agents/pi-embedded-subscribe.tools.ts (modified, +19/-0)
RAW_BUFFERClick to expand / collapse

Describe the bug When backup_openclaw_to_obsidian.sh runs as a cron job, the workspace-copy step (cp -R $SRC_ROOT/workspace/. $DST/workspace-copy/) is silently killed before completing. The tarball creation (~54s, ~1.1 GB) succeeds, but the subsequent workspace-copy step — which copies the full workspace including node_modules — exceeds the exec timeout and gets killed with no user-facing signal.

To Reproduce

  1. Configure a cron job that runs a shell script copying a large workspace (several GB including node_modules)
  2. Observe that the script runs successfully in terminal but gets silently killed when run as an isolated cron session
  3. The tarball step completes; the workspace-copy step silently dies

Expected behavior Cron exec jobs should either:

  • Surface the timeout error to the user in the cron run report
  • Or provide a configurable exec timeout for cron job shells
  • Or exclude large directories like node_modules by default when doing workspace copies

Environment

  • OpenClaw version: (run openclaw --version to get version)
  • OS: macOS (likely affects all platforms)

Additional context

  • Root cause in our case: cp -R of a workspace containing node_modules (~1.1 GB+) exceeds the default exec timeout for isolated cron sessions
  • Fix applied locally: Replaced cp -R with rsync -a --exclude='node_modules' --exclude='.cache' --exclude='.git' to avoid copying large dependency directories
  • The tarball step is unaffected because it operates on the same filesystem without crossing the exec timeout
  • This appears to be a general issue: scripts that work interactively may silently fail in cron isolation without any error surfaced

Tags

extent analysis

Fix Plan

To resolve the issue of the cp -R command being silently killed due to exceeding the exec timeout in cron jobs, we will replace cp -R with rsync and exclude large directories like node_modules.

Steps to Fix

  • Replace the cp -R command in backup_openclaw_to_obsidian.sh with rsync -a and exclude large directories:
rsync -a --exclude='node_modules' --exclude='.cache' --exclude='.git' $SRC_ROOT/workspace/. $DST/workspace-copy/
  • Alternatively, increase the cron job timeout if possible, but this may vary depending on the system configuration.
  • Consider adding error handling to the script to catch and report any errors that may occur during execution.

Verification

  • Run the modified backup_openclaw_to_obsidian.sh script as a cron job and verify that the workspace-copy step completes successfully without being killed.
  • Check the cron job logs for any error messages or warnings.

Extra Tips

  • Use rsync instead of cp -R for large directory copies to avoid timeouts and improve performance.
  • Consider excluding other large directories or files that are not necessary for the backup.
  • Test the modified script interactively before running it as a cron job to ensure it works as expected.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

openclaw - ✅(Solved) Fix [Bug]: Cron exec timeout kills workspace backup step silently [1 pull requests, 4 comments, 2 participants]