claude-code - 💡(How to fix) Fix Code Routines: cron fires but cloud containers never execute prompts (silent failure with ended_reason: run_once_fired) [2 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
anthropics/claude-code#54260Fetched 2026-04-29 06:32:02
View on GitHub
Comments
2
Participants
2
Timeline
8
Reactions
0
Author
Timeline (top)
labeled ×3commented ×2cross-referenced ×2closed ×1

Multiple Claude Code Routines on my account have stopped executing their prompts. Cron triggers fire and the routine state advances (ended_reason: run_once_fired), but the cloud container never reaches prompt execution. This has been silently affecting all my routines for at least 28 days.

The visible "fired" signal led me to spend hours debugging downstream causes (env_id, secrets, MCP attachments) when the actual symptom is upstream at the runtime layer.

Root Cause

Multiple Claude Code Routines on my account have stopped executing their prompts. Cron triggers fire and the routine state advances (ended_reason: run_once_fired), but the cloud container never reaches prompt execution. This has been silently affecting all my routines for at least 28 days.

The visible "fired" signal led me to spend hours debugging downstream causes (env_id, secrets, MCP attachments) when the actual symptom is upstream at the runtime layer.

RAW_BUFFERClick to expand / collapse

Summary

Multiple Claude Code Routines on my account have stopped executing their prompts. Cron triggers fire and the routine state advances (ended_reason: run_once_fired), but the cloud container never reaches prompt execution. This has been silently affecting all my routines for at least 28 days.

The visible "fired" signal led me to spend hours debugging downstream causes (env_id, secrets, MCP attachments) when the actual symptom is upstream at the runtime layer.

Reproducible smoke test

  • Routine ID: trig_01GGa6a6jU7GKcsjkTpWRdXD (name: "Runtime Smoke Test")
  • Fired: 2026-04-28T09:57:07Z (one-shot via run_once_at)
  • Configuration: minimal
    • environment_id: env_01XE23cmQ6V1ioMS2fSUJ7rp (Default per the schedule skill)
    • Single-line Bash prompt that does one curl POST to a webhook I control
    • No MCP attachments
    • No slash command
    • No git operations beyond the standard repo source
  • Expected: the curl reaches my webhook within 1-2 minutes of fire, my server logs show the inbound request, the routine prints the response and exits.
  • Actual: routine state shows ended_reason: run_once_fired at 2026-04-28T09:58:04Z, but my server received zero inbound requests in the 15 minutes after fire. The curl never executed.

Visible Anthropic-side symptom

The run viewer at https://claude.ai/code/routines/<id> for affected runs shows the four init steps stuck with loading spinners indefinitely:

  1. Setting up a cloud container
  2. Clone repository
  3. Run setup script
  4. Start Claude Code

Same behaviour in the desktop Claude Code app and the web viewer.

What has been ruled out

  • Configuration drift in existing routines: a brand-new minimal routine fails the same way.
  • Environment ID: tested both env_014aH42omVrvkZY9wEaG9keq (existing) and env_01XE23cmQ6V1ioMS2fSUJ7rp (Default). Same failure mode.
  • Repo authentication: source repo is publicly cloneable, no auth required.
  • Slash command resolution: smoke test uses no slash command, raw Bash only.
  • MCP attachments: smoke test has none.
  • Embedded secret parsing: only one secret in the test prompt as a plain header value.
  • Quota: routines panel shows 4 of 15 included remote daily runs used, well under cap.

Affected routines beyond the smoke test

  • Morning Ops Check trig_01PpagDwsAX3UstDJYoL6pKb (daily 7:30am AEST cron). No webhook calls or Telegram messages received in the last 7 days at minimum.
  • LinkedIn Scout trig_01642uv5vH8h9tp2RMXG3kx3 (weekdays 1pm AEST cron). Output file last updated on 31 March 2026 (28 days silent).
  • SEO Sentinel trig_016vtkZKEuN6dnjfJxuP9Bun (weekly Mon 7:03am AEST). Cannot confirm recent runs from output artefacts.

Defect candidate

Independently of resolving the runtime failure, the silent failure mode itself is worth a fix:

The ended_reason: run_once_fired flag is set even when the cloud container never reaches prompt execution. This makes the routine appear successful from the user's perspective. A more accurate signal would be:

  • ended_reason: container_init_timeout (or similar) when the container never finishes the init phase
  • A non-empty last_error field surfaced via the API
  • The run viewer changing from "loading" to "failed" after a reasonable timeout (say 5 minutes)

Without these, users have to instrument their downstream effects (webhook calls, file writes, Telegram messages) to detect failure, which adds significant operational overhead.

Environment

  • Account: [email protected]
  • Plan: Claude Max
  • Tested via RemoteTrigger API (n8n MCP RemoteTrigger tool) and the desktop Claude Code app
  • macOS desktop client (recent version)
  • Region: Australia (Melbourne, AEST)

Related support thread

Already raised via [email protected]. The Intercom auto-reply (Fin AI Agent) suggested local-config troubleshooting (/doctor, removing ~/.claude.json) which doesn't apply to remote routine runtime issues. Filing here per Anthropic support's recommendation that this needs engineering visibility. Conversation ID for cross-reference: 215474090153858.

Happy to enable any debug flags, share session URLs, or run further test cases.

extent analysis

TL;DR

The issue can be mitigated by implementing a more accurate failure detection mechanism, such as setting a container_init_timeout flag or surfacing a non-empty last_error field, to identify when the cloud container fails to reach prompt execution.

Guidance

  • Investigate the cloud container initialization process to determine why it's not completing, potentially due to a timeout or other issue.
  • Consider implementing a timeout mechanism, such as a 5-minute timeout, to detect when the container init phase is taking too long.
  • Review the ended_reason flag and last_error field to ensure they accurately reflect the routine's execution status.
  • Test the routine with a simpler prompt or configuration to isolate the issue.

Example

No code snippet is provided as the issue is related to the runtime environment and container initialization.

Notes

The root cause of the issue is unclear, but it's likely related to the cloud container initialization process. The suggested mitigation steps aim to improve failure detection and provide more accurate error reporting.

Recommendation

Apply a workaround by implementing a more accurate failure detection mechanism, such as setting a container_init_timeout flag or surfacing a non-empty last_error field, to identify when the cloud container fails to reach prompt execution. This will help detect and report failures more accurately, reducing the operational overhead of instrumenting downstream effects.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING