claude-code - 💡(How to fix) Fix scheduled-task sessions: intermittent silent boot failures + mid-session bash-bridge degradation

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Running two Claude Code scheduled tasks (Claude Desktop scheduled-tasks subsystem) over the past 24 hours surfaces three distinct intermittent failure modes that only occur in the scheduler plane. Interactive ("chat-plane") Cowork sessions on the same host are healthy. The pattern is the scheduler runtime itself — sometimes the scheduled session never gets a workable sandbox; sometimes it gets one but the mcp__workspace__bash bridge degrades mid-session.

After landing a 15-minute warm-clock + per-tick heartbeat (so we can see which ticks ran at all), the failure pattern becomes measurable: of 11 expected ticks between 2026-05-30T23:53Z and 2026-05-31T02:38Z, 4 ticks never appeared in the heartbeat log — a 75-min blackout 00:08Z → 01:08Z, plus two single-tick gaps. The scheduled-task launcher itself is silently failing to start sessions ~36% of the time in that window.

We have strong but anonymized internal repro evidence (see §3 below); we're filing for visibility / triage. Happy to share host log snippets, scheduled-task JSON definitions, or any other artifact that helps narrow this.

Error Message

Symptom: Scheduled task fires per cron. lastRunAt is set. No artifact of any kind: no runs.jsonl line, no inbox file, no CAPTURE-ERROR self-report, nothing in ~/Library/Logs/Claude/ recognizable as that tick. The scheduled session appears to have been launched and immediately died before its first MCP call returned. Symptom: Scheduled task fires, agent starts, gets bash partially working (one or two probe writes succeed), then everything stops — no further file writes, no errors emitted, no CAPTURE-ERROR self-report, agent exhausts elapsed cap silently. Repro: upwork-feed-dv1-request-drain at 2026-05-30T03:45:11Z. Wrote two probe files (.stage/_iso and .stage/_ts) at 03:48:50Z (~3.5 min into a 6-min cap), then silence. No subsequent file system mutations, no runs.jsonl append, no CAPTURE-ERROR. Symptom: After flipping the cron from 0 */6 * * * to */15 * * * * and adding a bash echo health-ok-{run_id} pre-flight probe plus a per-tick heartbeat append, the heartbeat JSONL surfaces a different problem: expected ticks simply don't appear at all. The scheduler subsystem appears to not launch the session — no probe runs, no CAPTURE-ERROR is written (the agent's bash echo probe can't run if the session never starts), no host-log entries for that tick.

Root Cause

Running two Claude Code scheduled tasks (Claude Desktop scheduled-tasks subsystem) over the past 24 hours surfaces three distinct intermittent failure modes that only occur in the scheduler plane. Interactive ("chat-plane") Cowork sessions on the same host are healthy. The pattern is the scheduler runtime itself — sometimes the scheduled session never gets a workable sandbox; sometimes it gets one but the mcp__workspace__bash bridge degrades mid-session.

After landing a 15-minute warm-clock + per-tick heartbeat (so we can see which ticks ran at all), the failure pattern becomes measurable: of 11 expected ticks between 2026-05-30T23:53Z and 2026-05-31T02:38Z, 4 ticks never appeared in the heartbeat log — a 75-min blackout 00:08Z → 01:08Z, plus two single-tick gaps. The scheduled-task launcher itself is silently failing to start sessions ~36% of the time in that window.

We have strong but anonymized internal repro evidence (see §3 below); we're filing for visibility / triage. Happy to share host log snippets, scheduled-task JSON definitions, or any other artifact that helps narrow this.

Fix Action

Fix / Workaround

  • Claude Desktop — macOS, Apple Silicon (M-series), Darwin 25.5.0
  • Two scheduled tasks registered via mcp__scheduled-tasks MCP:
    • upwork-feed-dispatcher — runbook prompt that drives Chrome (via the Claude in Chrome MCP), scrapes a results page, writes markdown to a sister directory
    • upwork-feed-dv1-request-drain — a one-shot drain that fulfills detail-fetch requests from a requests/ queue
  • Cron — previously 0 */6 * * *; flipped to */15 * * * * mid-investigation to (a) warm the sandbox more frequently and (b) make intermittent failures observable via a per-tick heartbeat JSONL.
  • MCP surface used in tickmcp__workspace__bash, mcp__cowork__request_cowork_directory, mcp__cowork__allow_cowork_file_delete, mcp__Claude_in_Chrome__*, mcp__scheduled-tasks__*
  • Cowork (the Anthropic-issued interactive-session product) is running in parallel on the same machine; its chat-plane sessions are healthy throughout this window. Several were verified mid-degradation:
    • 2026-05-30T04:03:38Z host log: mcp__workspace__bash returned outcome=ok in 330ms inside a Cowork chat session — meanwhile two scheduled-task sessions had silently failed within the previous hour.

Repro: upwork-feed-dispatcher at 2026-05-30T03:37:54Z. Diagnosed via filesystem absence + host-log scrape. From the agent's runbook perspective, nothing ran; from lastRunAt's perspective, it did.

Repro: upwork-feed-dispatcher at 2026-05-30T18:38Z (run run-202605301838). The agent's own debrief is in our internal communication thread; happy to share verbatim.

RAW_BUFFERClick to expand / collapse

Summary

Running two Claude Code scheduled tasks (Claude Desktop scheduled-tasks subsystem) over the past 24 hours surfaces three distinct intermittent failure modes that only occur in the scheduler plane. Interactive ("chat-plane") Cowork sessions on the same host are healthy. The pattern is the scheduler runtime itself — sometimes the scheduled session never gets a workable sandbox; sometimes it gets one but the mcp__workspace__bash bridge degrades mid-session.

After landing a 15-minute warm-clock + per-tick heartbeat (so we can see which ticks ran at all), the failure pattern becomes measurable: of 11 expected ticks between 2026-05-30T23:53Z and 2026-05-31T02:38Z, 4 ticks never appeared in the heartbeat log — a 75-min blackout 00:08Z → 01:08Z, plus two single-tick gaps. The scheduled-task launcher itself is silently failing to start sessions ~36% of the time in that window.

We have strong but anonymized internal repro evidence (see §3 below); we're filing for visibility / triage. Happy to share host log snippets, scheduled-task JSON definitions, or any other artifact that helps narrow this.

Environment

  • Claude Desktop — macOS, Apple Silicon (M-series), Darwin 25.5.0
  • Two scheduled tasks registered via mcp__scheduled-tasks MCP:
    • upwork-feed-dispatcher — runbook prompt that drives Chrome (via the Claude in Chrome MCP), scrapes a results page, writes markdown to a sister directory
    • upwork-feed-dv1-request-drain — a one-shot drain that fulfills detail-fetch requests from a requests/ queue
  • Cron — previously 0 */6 * * *; flipped to */15 * * * * mid-investigation to (a) warm the sandbox more frequently and (b) make intermittent failures observable via a per-tick heartbeat JSONL.
  • MCP surface used in tickmcp__workspace__bash, mcp__cowork__request_cowork_directory, mcp__cowork__allow_cowork_file_delete, mcp__Claude_in_Chrome__*, mcp__scheduled-tasks__*
  • Cowork (the Anthropic-issued interactive-session product) is running in parallel on the same machine; its chat-plane sessions are healthy throughout this window. Several were verified mid-degradation:
    • 2026-05-30T04:03:38Z host log: mcp__workspace__bash returned outcome=ok in 330ms inside a Cowork chat session — meanwhile two scheduled-task sessions had silently failed within the previous hour.

Failure modes observed

Failure mode 1 — never-booted (silent)

Symptom: Scheduled task fires per cron. lastRunAt is set. No artifact of any kind: no runs.jsonl line, no inbox file, no CAPTURE-ERROR self-report, nothing in ~/Library/Logs/Claude/ recognizable as that tick. The scheduled session appears to have been launched and immediately died before its first MCP call returned.

Repro: upwork-feed-dispatcher at 2026-05-30T03:37:54Z. Diagnosed via filesystem absence + host-log scrape. From the agent's runbook perspective, nothing ran; from lastRunAt's perspective, it did.

Failure mode 2 — partial-boot, mid-tick silent stall

Symptom: Scheduled task fires, agent starts, gets bash partially working (one or two probe writes succeed), then everything stops — no further file writes, no errors emitted, no CAPTURE-ERROR self-report, agent exhausts elapsed cap silently.

Repro: upwork-feed-dv1-request-drain at 2026-05-30T03:45:11Z. Wrote two probe files (.stage/_iso and .stage/_ts) at 03:48:50Z (~3.5 min into a 6-min cap), then silence. No subsequent file system mutations, no runs.jsonl append, no CAPTURE-ERROR.

Failure mode 3 — full boot, mid-tick bash-bridge degradation into multi-execution

Symptom: Scheduled task fires, agent boots clean, runs §pre-flight health probe (single bash echo → returns within 10s), proceeds through mcp__Claude_in_Chrome__* flow (50 results scraped, Cloudflare cleared, Blob snapshot downloaded to ~/Downloads successfully). Then the bash/Python render layer hits a degraded mcp__workspace__bash — every command triple-executes, stdout interleaves with echoed command text, counters become non-deterministic.

The agent's runbook reports contradictory totals (tiles_written: 50, tiles_dedupe_skipped: 50, errors: 50, against tiles_seen_raw: 50); ground-truth ls <expected_dir> | wc -l on the relevant timestamp returned 0 — no files were actually persisted. Agent aborted past the 3-min cap rather than burn more tokens on an unreliable sandbox.

Repro: upwork-feed-dispatcher at 2026-05-30T18:38Z (run run-202605301838). The agent's own debrief is in our internal communication thread; happy to share verbatim.

Failure mode 4 — scheduler-launcher itself silently not launching sessions (post-mitigation observation)

Symptom: After flipping the cron from 0 */6 * * * to */15 * * * * and adding a bash echo health-ok-{run_id} pre-flight probe plus a per-tick heartbeat append, the heartbeat JSONL surfaces a different problem: expected ticks simply don't appear at all. The scheduler subsystem appears to not launch the session — no probe runs, no CAPTURE-ERROR is written (the agent's bash echo probe can't run if the session never starts), no host-log entries for that tick.

Concrete window: between 2026-05-30T23:53Z and 2026-05-31T02:38Z:

ExpectedActual heartbeat lineNotes
23:53Z✅ presentlast entry on 30-May file
00:08Z❌ missingstart of 75-min blackout
00:23Z❌ missing
00:38Z❌ missing
00:53Z❌ missing
01:08Z✅ presentfirst entry on 31-May file — 75-min gap from 23:53
01:23Z❌ missing
01:38Z✅ present
01:53Z✅ present
02:08Z✅ present
02:23Z✅ present
02:38Z✅ present

7 of 11 ticks landed. The 4 missing ticks have no artifact anywhere we can find — not in the heartbeat JSONL, not in ~/Library/Logs/Claude/, not in the scheduled task's local working directory.

What's not the cause (already ruled out)

  • ❌ Not a permission prompt. Permissions were baked into the task; manual operator-triggered sessions of the same runbook from the same machine run end-to-end without intervention.
  • ❌ Not Cloudflare / Chrome / network. Failure mode 1 happens before any MCP call returns. Failure mode 2 happens after probe writes but before any browser nav. Failure mode 3 happens after a successful 50-tile scrape + Blob download.
  • ❌ Not the runbook prompt. Same prompt runs clean in interactive Cowork sessions on the same machine, on the same MCP surface, in the same hour.
  • ❌ Not file corruption. SKILL.md and config.json files load and parse correctly in the manual sessions immediately before/after each failure.

What helps (partial mitigations)

  • Warm-clock cron (*/15 * * * *) — sandbox warming reduces (but does not eliminate) failure mode 1. Most warm ticks succeed.
  • Pre-flight bash echo probe with 10s timeout — catches failure mode 1 reliably when the session starts at all. Doesn't help with mode 4 (session never starts) and doesn't catch mode 3 (degradation after a clean probe).
  • Per-tick heartbeat append — turns "silent failure" into "measurable miss-rate." Without this we'd have no idea modes 1 + 4 were happening at all.

What would help from your side

In rough priority order, anything that lets us narrow the scheduler-launcher path:

  1. Per-tick stderr / launcher log somewhere readable from the host CLI. We looked under ~/Library/Application Support/Claude/, ~/Library/Logs/Claude/, and inside the scheduled task's working directory — found no per-tick artifact attributable to the failed ticks.
  2. A way to detect mid-session mcp__workspace__bash bridge degradation from inside the runbook (mode 3) — currently a clean probe says nothing about whether the next bash call will return cleanly.
  3. Confirmation or denial that the */15 * * * * cron itself is a supported usage pattern for scheduled tasks (we noticed ~36% miss rate in mode 4 even on the warm cron — wondering if 15-min cadence stresses the launcher in a way 6h doesn't).

Reproducer

Self-contained reproducer is hard to share — the runbook depends on a Chrome MCP session authenticated against a third-party site (Upwork). But the runbook itself, the two scheduled-task JSON definitions, the heartbeat JSONL, host log snippets from a few of the failure events, and the cron config are all available — just point us at where to send them.

Co-authored on the operator's side by Cowork (the interactive Anthropic-issued session product); this is a collaborative investigation between Cowork's internal logs + Claude Code's host-CLI vantage. Both agree the failure is in the scheduler plane, not the runbook.


Filed by Claude Code on behalf of operator. Operator handle: borahanmirzaii. Internal tracking thread: borhan-upwork/upwork-os/comms/026, 029, 030, 031, 033, 035, 036.

<!-- draft-id: 2026-05-31-upstream-claude-code-scheduler-runtime-bug -->

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

claude-code - 💡(How to fix) Fix scheduled-task sessions: intermittent silent boot failures + mid-session bash-bridge degradation