openclaw - ✅(Solved) Fix Cron runs record status=ok when model returns empty content + errorMessage (should be status=error) [1 pull requests, 1 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#69889Fetched 2026-04-22 07:46:57
View on GitHub
Comments
1
Participants
2
Timeline
2
Reactions
0
Author
Participants
Timeline (top)
commented ×1cross-referenced ×1

When a cron job's agent call returns an empty content array with an errorMessage set (e.g. Gemini returning HTTP 400 with no body due to free-tier quota exhaustion), the cron-run record still gets status: "ok", delivered: false, deliveryStatus: "not-delivered".

Downstream this looks like a successful-but-undelivered run — which reads to an operator as "delivery is broken" rather than the true "the model call itself failed silently." My own regression debugging chased the delivery layer for 5 days before realising the real fault was at the model-call layer.

Error Message

{ "content": [], "usage": { "input": 0, "output": 0 }, "stopReason": "error", "errorMessage": "400 status code (no body)" } 5. Inspect ~/.openclaw/cron/runs/<job-id>.jsonl — the run record has status: "ok" despite the assistant message carrying both stopReason:"error" and errorMessage. A cron run whose assistant message has stopReason:"error" and/or a non-empty errorMessage should record status: "error", propagating the errorMessage into the run record's error field.

Root Cause

When a cron job's agent call returns an empty content array with an errorMessage set (e.g. Gemini returning HTTP 400 with no body due to free-tier quota exhaustion), the cron-run record still gets status: "ok", delivered: false, deliveryStatus: "not-delivered".

Downstream this looks like a successful-but-undelivered run — which reads to an operator as "delivery is broken" rather than the true "the model call itself failed silently." My own regression debugging chased the delivery layer for 5 days before realising the real fault was at the model-call layer.

Fix Action

Fix / Workaround

Workaround in place

PR fix notes

PR #69911: fix(cron): record status=error on run-level error, not just isError payloads

Description (problem / solution / changelog)

Summary

Fixes #69889. When a cron job's assistant message ends with `stopReason: "error"` and a non-empty `errorMessage` (e.g. Gemini returning HTTP 400 with no body on quota exhaustion), the outbound delivery payload can be empty content with no `isError` flag. `resolveCronPayloadOutcome` inferred fatal-error status only from per-payload `isError`, so the cron run record got `status: ok, delivered: false, deliveryStatus: not-delivered`.

Downstream that reads as delivery broken rather than model call failed. The reporter chased the delivery layer for 5 days before realizing the real fault was at the model-call layer.

Fix

In `src/cron/isolated-agent/helpers.ts`:

  1. Treat `runLevelError` (already threaded in from `finalRunResult.meta?.error`) as another fatal-error trigger alongside per-payload `isError`.
  2. When no per-payload error text is available but `runLevelError` is a string, propagate it into `embeddedRunError` so the cron run record carries the real reason.

Preserved invariants

  • A successful text after an isError payload still recovers when `runLevelError` is undefined (original behavior).
  • `runLevelError` overrides that recovery path when set — a run-level error is authoritative.

Test

5 new test cases in `TestResolveCronPayloadOutcome — runLevelError handling`:

  1. runLevelError alone ⇒ fatal, embeddedRunError matches
  2. isError payload preferred over runLevelError for the error text
  3. no error anywhere ⇒ not fatal
  4. successful-payload-after-error recovery still works when runLevelError absent
  5. runLevelError overrides the recovery path

oxlint passes.

Closes #69889.

Changed files

  • src/cron/isolated-agent/helpers.test.ts (modified, +60/-0)
  • src/cron/isolated-agent/helpers.ts (modified, +14/-2)

Code Example

{ "content": [], "usage": { "input": 0, "output": 0 }, "stopReason": "error", "errorMessage": "400 status code (no body)" }
RAW_BUFFERClick to expand / collapse

Summary

When a cron job's agent call returns an empty content array with an errorMessage set (e.g. Gemini returning HTTP 400 with no body due to free-tier quota exhaustion), the cron-run record still gets status: "ok", delivered: false, deliveryStatus: "not-delivered".

Downstream this looks like a successful-but-undelivered run — which reads to an operator as "delivery is broken" rather than the true "the model call itself failed silently." My own regression debugging chased the delivery layer for 5 days before realising the real fault was at the model-call layer.

Environment

  • OpenClaw: 2026.3.13 (global npm install)
  • Node: 24.13.0
  • OS: Windows 10 Pro (10.0.19045)
  • Affected backend: google/gemini-2.0-flash via OpenAI-compat endpoint

Reproduction

Using a Gemini free-tier key that has burned through its daily input-token quota:

  1. Configure agents.defaults.model.primary = google/gemini-2.0-flash
  2. Trigger any cron with a non-trivial prompt (~20k prompt tokens)
  3. Inspect the resulting ~/.openclaw/agents/main/sessions/<session-id>.jsonl
  4. The final type:"message" with role:"assistant" has:
    { "content": [], "usage": { "input": 0, "output": 0 }, "stopReason": "error", "errorMessage": "400 status code (no body)" }
  5. Inspect ~/.openclaw/cron/runs/<job-id>.jsonl — the run record has status: "ok" despite the assistant message carrying both stopReason:"error" and errorMessage.

Expected

A cron run whose assistant message has stopReason:"error" and/or a non-empty errorMessage should record status: "error", propagating the errorMessage into the run record's error field.

Actual

Run is stored as status:"ok" with delivered:false, which looks like "delivery succeeded but nothing was sent" rather than "model call failed."

Proposed fix location

Based on a read of the bundled dist (source-map regions around src/cron/legacy-delivery.ts and the run-record writer), the run-finish handler appears to infer success from HTTP-level completion rather than inspecting the assistant message's stopReason / errorMessage.

Workaround in place

Switched primary model to groq/llama-3.1-8b-instant. Delivery visible again, but the underlying cron-runtime bug remains — any future provider outage will present the same silent-success shape.

Happy to PR

If the proposed fix location is roughly right, I'm happy to open a PR — just confirm whether the cron-run writer or the agent-call wrapper is the preferred intervention point.

extent analysis

TL;DR

The cron job's run record should be updated to reflect an error status when the agent call returns an error message, rather than incorrectly showing a successful status.

Guidance

  • The issue seems to stem from the run-finish handler inferring success based on HTTP-level completion, rather than inspecting the assistant message's stopReason and errorMessage.
  • To fix this, the run-record writer in src/cron/legacy-delivery.ts should be updated to check for stopReason:"error" and/or a non-empty errorMessage in the assistant message, and set the run record's status to "error" accordingly.
  • The error field in the run record should also be populated with the errorMessage from the assistant message.
  • Before making changes, it's essential to review the code around src/cron/legacy-delivery.ts to ensure the proposed fix location is accurate.

Example

No code snippet is provided, as the issue does not include enough context for a specific example. However, the fix would likely involve updating the logic in legacy-delivery.ts to handle error cases correctly.

Notes

The current workaround of switching to a different primary model may mask the issue but does not address the underlying problem. A proper fix is necessary to prevent similar issues in the future.

Recommendation

Apply a workaround is not recommended as it does not fix the root cause. Instead, it is recommended to update the code to correctly handle error cases, as this will provide a more robust and reliable solution. The contributor has expressed willingness to open a PR, which should be encouraged to resolve the issue.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

openclaw - ✅(Solved) Fix Cron runs record status=ok when model returns empty content + errorMessage (should be status=error) [1 pull requests, 1 comments, 2 participants]