openclaw - ✅(Solved) Fix [Bug]: isolated cron agentTurn can get stuck/rerun stale running state and eventually time out on lightweight ClawHub JSON summarization [1 pull requests, 1 comments, 1 participants]

openclaw · 2026-03-13 02:14:00

GitHub stats

openclaw/openclaw#44541 • Fetched 2026-04-08 00:45:28

Author: zhaoyouqi

Participants: zhaoyouqi

Timeline (top): cross-referenced ×2, commented ×1

Error Message

FailoverError: LLM request timed out.

What happened?

An isolated cron job using payload.kind = "agentTurn" appears to get stuck in a stale/running state and often times out, even for a lightweight task.

The task is intentionally simple:

1. Run clawhub explore --limit 10 --json
2. Read the first 10 items
3. Rewrite them into 10 short lines
4. Deliver to Telegram via normal cron announce

The underlying clawhub command is fast when run directly, and direct Telegram sends also work. But the same logic inside OpenClaw cron repeatedly hangs or times out.

Environment

OpenClaw: 2026.3.8
Node: v22.22.0
Session target: isolated
Payload kind: agentTurn
Delivery: announce to Telegram
Timeout: 900s
lightContext: true

Repro config

Cron job payload:

{
  "kind": "agentTurn",
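  "message": "Use the clawhub skill and do only the following:\n\n1. Run clawhub explore --limit 10 --json once\n2. Read only the first 10 entries of items in the returned JSON\n3. Based on each entry's slug and summary, write a 10-line Chinese summary\n4. Each line must use the fixed format: index. slug - short Chinese description\n5. Do not read other files, do not run other clawhub commands, do not search, do not expand the analysis\n6. Do not output reasoning, steps, tool calls, raw JSON, a title, or closing remarks\n7. Output only the final 10 lines of text; the system will deliver them to Telegram automatically",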
  "timeoutSeconds": 900,
  "lightContext": true
}

Repro steps

1. Configure an isolated cron job with the payload above
2. Trigger it manually with openclaw cron run <job-id>
3. Observe cron state, run history, and logs

Expected behavior

The job should finish within a few minutes
It should produce 10 short lines
It should be delivered through cron announce

Actual behavior

Observed repeatedly:

- cron list sometimes shows the job as running even after the previous run already finished with error
- manual runs can get queued behind a stale slot:
  - cron: queued manual run waiting for an execution slot
- old stale running markers seem to survive until gateway restart
- after restart, state may clear, but the next run can still get stuck again
- many runs end as:
  - Error: cron: job execution timed out
- logs show:
  - FailoverError: LLM request timed out.

Evidence

Timed-out runs:

runAtMs: 1773365921979, durationMs: 900004, status: error
runAtMs: 1773326195890, durationMs: 900003, status: error
runAtMs: 1773324907261, durationMs: 900004, status: error
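
As a quick sanity check on these figures, the sketch below compares each recorded durationMs with the configured 900 s timeout. The object shape simply mirrors the fields quoted in the run history; the variable names are illustrative and not part of OpenClaw.

```javascript
// Run-history entries copied from the evidence above.
const runs = [
  { runAtMs: 1773365921979, durationMs: 900004, status: "error" },
  { runAtMs: 1773326195890, durationMs: 900003, status: "error" },
  { runAtMs: 1773324907261, durationMs: 900004, status: "error" },
];

// timeoutSeconds from the cron payload, converted to milliseconds.
const timeoutMs = 900 * 1000;

const runSummary = runs.map((run) => ({
  startedAt: new Date(run.runAtMs).toISOString(), // start time in UTC
  overshootMs: run.durationMs - timeoutMs,
  hitTimeout: run.durationMs >= timeoutMs,
}));

console.table(runSummary);
```

Every run overshoots the 900000 ms limit by only 3 to 4 ms, which is consistent with the runs being cut off by the timeout rather than ending on their own.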

Example log lines:

cron: queued manual run waiting for an execution slot

FailoverError: LLM request timed out.

lane=session:agent:main:cron:<job-id> durationMs=900028 error="FailoverError: LLM request timed out."

Also observed earlier:

cron: clearing stale running marker on startup
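
When collecting more evidence of this kind, it can help to pull the fields out of the lane= log line programmatically. The following is a minimal sketch that assumes the line is a sequence of space-separated key=value pairs with optional double quotes, as in the sample above; parseLogFields is a hypothetical helper, not an OpenClaw API.

```javascript
// Parse space-separated key=value pairs; values may be double-quoted.
function parseLogFields(line) {
  const fields = {};
  const pattern = /(\w+)=(?:"([^"]*)"|(\S+))/g;
  for (const match of line.matchAll(pattern)) {
    const [, key, quoted, bare] = match;
    // Prefer the quoted capture; fall back to the bare token.
    fields[key] = quoted !== undefined ? quoted : bare;
  }
  return fields;
}

// Log line copied from the issue.
const sampleLine =
  'lane=session:agent:main:cron:<job-id> durationMs=900028 ' +
  'error="FailoverError: LLM request timed out."';

const parsed = parseLogFields(sampleLine);
console.log(parsed.lane, Number(parsed.durationMs), parsed.error);
```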

Why this seems like an OpenClaw bug

The same underlying operations succeed outside cron:

clawhub explore --limit 10 --json returns quickly when run directly
direct Telegram send via openclaw message send succeeds

So this does not look like:

ClawHub CLI slowness
Telegram delivery failure
large output size

It looks more like a cron / isolated agentTurn state-machine or execution-slot bug.
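
To make the suspected failure mode concrete, here is a minimal sketch of how a scheduler could tell a live run from a stale running marker. It is not OpenClaw's implementation: the runningAtMs and timeoutSeconds fields on the job state, the graceMs parameter, and the isStaleRunningMarker helper are all assumptions made for illustration.

```javascript
// Hypothetical job-state shape; OpenClaw's real state may differ.
// runningAtMs: when the current run was marked as started (null if idle).
function isStaleRunningMarker(jobState, nowMs, graceMs = 5000) {
  if (jobState.runningAtMs == null) return false; // not marked as running
  const deadlineMs = jobState.runningAtMs + jobState.timeoutSeconds * 1000;
  // A marker still set after the timeout plus a small grace period
  // cannot belong to a live run and could be cleared.
  return nowMs > deadlineMs + graceMs;
}

const jobState = { runningAtMs: 1773324907261, timeoutSeconds: 900 };

// 10 minutes in: the run may legitimately still be in progress.
console.log(isStaleRunningMarker(jobState, jobState.runningAtMs + 600_000));
// 20 minutes in: the 900 s timeout has long passed, so the marker is stale.
console.log(isStaleRunningMarker(jobState, jobState.runningAtMs + 1_200_000));
```

A check along these lines, run before a manual run is queued, could release the execution slot without waiting for a gateway restart.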

Additional note

A much simpler 3-line version of the same ClawHub task did succeed once, which suggests the cron lane is sensitive in a way that direct command execution is not.

extent analysis

Fix Plan

The steps below mitigate the timeout symptom. They do not by themselves clear a stale running marker; the fix recorded for this issue is PR #44573 (Cron: isolate cron run session lanes). As mitigations, we will:

Increase the job's timeoutSeconds so that a slow LLM request is less likely to be cut off
Retry the LLM request to handle temporary failures
Log the execution time of the cron job to identify potential bottlenecks

Code Changes

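The payload below doubles timeoutSeconds from 900 to 1800 and adds a retry block with 3 attempts and a 30-second delay (30000 ms). The retry key is an assumption: the issue does not show that the cron payload schema supports it, so check the OpenClaw documentation before relying on it.

{
  "kind": "agentTurn",
  "message": "...",
  "timeoutSeconds": 1800,
  "lightContext": true,
  "retry": {
    "attempts": 3,
    "delay": 30000
  }
}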
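
One consequence of combining retries with a fixed overall timeout is worth working through. The sketch below assumes, hypothetically, that all attempts and the delays between them must fit inside timeoutSeconds; the issue does not say how OpenClaw would account for retries, and attemptBudgetSeconds is an illustrative helper only.

```javascript
// Time available to each attempt if attempts and the delays between them
// all share one overall timeout (an assumption, not documented behavior).
function attemptBudgetSeconds(timeoutSeconds, attempts, delayMs) {
  const totalDelaySeconds = ((attempts - 1) * delayMs) / 1000;
  return (timeoutSeconds - totalDelaySeconds) / attempts;
}

// Values taken from the proposed payload: 1800 s, 3 attempts, 30000 ms delay.
const budget = attemptBudgetSeconds(1800, 3, 30000);
console.log(`${budget}s per attempt`);
```

Under that assumption each attempt gets (1800 - 60) / 3 = 580 s, which is less than the original 900 s, so retries alone would not give a hung request more time.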

We will also add logging to track the execution time of the cron job:

const startTime = Date.now();
try {
  // execute cron job logic here
} finally {
  // log the elapsed time even when the job throws or times out
  console.log(`Execution time: ${Date.now() - startTime}ms`);
}

Infra / Dependency Fixes

We will confirm that the installed Node.js version (v22.22.0 in the report) is supported by the installed OpenClaw release (2026.3.8).

Temporary Workaround

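As a temporary workaround, we can manually restart the gateway. This clears stale running markers (logged as cron: clearing stale running marker on startup) and lets the cron job run again, although the report notes that the next run can still get stuck.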

Verification

To verify that the fix worked, we will:

Run the cron job manually and check the execution time and output
Check the logs for any errors or timeouts
Verify that the job completes successfully and produces the expected output
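
The last check can be automated. The sketch below validates that the delivered text is exactly 10 lines in the index. slug - description format that the prompt asks for; validateSummary and the sample data are hypothetical and not part of OpenClaw or ClawHub.

```javascript
// Check that the announce text is exactly N lines of "index. slug - description".
function validateSummary(text, expectedLines = 10) {
  const lines = text.trim().split("\n");
  const problems = [];
  if (lines.length !== expectedLines) {
    problems.push(`expected ${expectedLines} lines, got ${lines.length}`);
  }
  lines.forEach((line, i) => {
    const match = /^(\d+)\. (\S+) - (.+)$/.exec(line.trim());
    if (!match) {
      problems.push(`line ${i + 1}: wrong format`);
    } else if (Number(match[1]) !== i + 1) {
      problems.push(`line ${i + 1}: wrong index`);
    }
  });
  return { ok: problems.length === 0, problems };
}

// Made-up sample output; the slugs are placeholders, not real ClawHub entries.
const sample = Array.from(
  { length: 10 },
  (_, i) => `${i + 1}. skill-${i + 1} - short description`
).join("\n");

console.log(validateSummary(sample));
```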

Extra Tips

To prevent similar issues in the future, we can:

Monitor the execution time of cron jobs and adjust timeouts accordingly
Implement retry mechanisms for critical tasks
Regularly update dependencies and ensure compatibility with the OpenClaw version

Fix Action

Fixed

PR fix notes

PR #44573: Cron: isolate cron run session lanes
