openclaw - 💡(How to fix) Fix feat(cron): add structured job-completion record to detect partial/incomplete cron cycles [1 participants]

GitHub stats
openclaw/openclaw#57890 · Fetched 2026-04-08 01:56:29
Comments: 0 · Participants: 1 · Timeline: 0 · Reactions: 0

Problem

When a cron job modifies state files (e.g. heartbeat-state.json) but is killed before writing its completion log, there is no way for subsequent cycles to detect that the previous run was incomplete.

Observed behavior: A self-evolution cron cycle updated heartbeat-state.json (setting lastEvolveRun) but was killed before appending to the agent's evolution log. The next cycle saw the updated timestamp and assumed the prior cycle completed successfully, potentially skipping work.

Desired behavior

OpenClaw should provide a structured job-completion mechanism for cron tasks — separate from agent-managed state files — so that:

  1. Each cron invocation gets a unique job ID (e.g. cron:1ed7b400...)
  2. Upon successful session completion, OpenClaw marks the job as COMPLETE in a registry accessible to subsequent sessions
  3. Subsequent cron invocations can query: "Did job X complete, or was it interrupted?"
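The query in point 3 might be consumed by the next invocation roughly as follows. This is a minimal sketch, assuming the registry returns a dict with a `status` key, or `None` when no record exists; `previous_run_is_trustworthy` is a hypothetical helper, not an existing OpenClaw API.

```python
from typing import Optional


def previous_run_is_trustworthy(prev_record: Optional[dict]) -> bool:
    """Return True only if the previous job is known to have completed.

    Assumption: a missing record, or any status other than "completed"
    ("running", "timeout", "interrupted"), means state files written by
    that run may be partial and should not be taken at face value.
    """
    if prev_record is None:
        return False
    return prev_record.get("status") == "completed"


# Example: the state file carries a fresh timestamp, but the registry
# says the run never finished, so the next cycle should redo the work.
prev = {"cron_job_id": "cron:example", "status": "interrupted"}
if not previous_run_is_trustworthy(prev):
    print("previous cycle incomplete; re-running its work")
```

The point of the design is that the decision rests on the registry record rather than on timestamps inside agent-managed files such as heartbeat-state.json.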

Use case

Autonomous agent pipelines that maintain multi-file state (e.g. JSON state + JSONL audit logs) need a reliable way to detect partial writes. Currently, agents implement their own completion guards, but these are vulnerable to the exact timing window described above: a kill between the state-file write and the log write leaves the guard looking satisfied.

Proposed API

# In session_status or a new tool:
{
  "cron_job_id": "cron:1ed7b400-640f-4925-8e8f-2b29e46b7d95",
  "status": "running" | "completed" | "timeout" | "interrupted",
  "started_at": "2026-03-30T17:29:00Z",
  "completed_at": null
}
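The record above is schema shorthand rather than valid JSON (the `|` lists the allowed values). One way it could be modeled and validated in Python is sketched below; the class names `CronJobStatus` and `CronJobRecord` are illustrative assumptions, and only the four field names and four status values come from the proposal.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CronJobStatus(str, Enum):
    # The four values listed in the proposed API
    RUNNING = "running"
    COMPLETED = "completed"
    TIMEOUT = "timeout"
    INTERRUPTED = "interrupted"


@dataclass(frozen=True)
class CronJobRecord:
    cron_job_id: str
    status: CronJobStatus
    started_at: str                     # ISO 8601 timestamp, kept as text here
    completed_at: Optional[str] = None  # None while the job has not finished

    @classmethod
    def from_dict(cls, data: dict) -> "CronJobRecord":
        """Build a record from a decoded JSON object.

        Raises KeyError on a missing required field and ValueError on a
        status outside the four allowed values.
        """
        return cls(
            cron_job_id=data["cron_job_id"],
            status=CronJobStatus(data["status"]),
            started_at=data["started_at"],
            completed_at=data.get("completed_at"),
        )


record = CronJobRecord.from_dict({
    "cron_job_id": "cron:1ed7b400-640f-4925-8e8f-2b29e46b7d95",
    "status": "running",
    "started_at": "2026-03-30T17:29:00Z",
    "completed_at": None,
})
print(record.status.value)
```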

A lightweight alternative: expose session_status fields for the triggering cron job's completion status, so the agent can query whether its own prior run finished.

Priority

Medium: affects the reliability of any multi-step autonomous pipeline whose writes to multiple files must be all-or-nothing.

extent analysis

Fix Plan

To address the issue, we will implement a job-completion mechanism for cron tasks. Here are the steps:

  • Introduce a unique job ID for each cron invocation
  • Create a registry to store the status of each job
  • Update the registry when a job completes or is interrupted
  • Allow subsequent cron invocations to query the registry for job status

Example Code

We can use a simple database or a file-based registry to store the job status. Here's an example using a JSON file:

import json
import os
import uuid
from datetime import datetime, timezone

# Path of the file-based registry (one JSON object keyed by job ID)
registry_file = "cron_registry.json"


def _now():
    """Return the current UTC time as an ISO 8601 string, e.g. 2026-03-30T17:29:00Z."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def _load():
    """Read the registry, treating a missing file as an empty registry."""
    if not os.path.exists(registry_file):
        return {}
    with open(registry_file, "r") as f:
        return json.load(f)


def _save(registry):
    """Write to a temp file, then rename, so a kill mid-write cannot truncate the registry."""
    tmp = registry_file + ".tmp"
    with open(tmp, "w") as f:
        json.dump(registry, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, registry_file)


# Update the registry when a job starts
def start_job(job_id):
    registry = _load()
    registry[job_id] = {"status": "running", "started_at": _now(), "completed_at": None}
    _save(registry)


# Update the registry when a job completes
def complete_job(job_id):
    registry = _load()
    registry[job_id]["status"] = "completed"
    registry[job_id]["completed_at"] = _now()
    _save(registry)


# Query the registry for job status
def get_job_status(job_id):
    return _load().get(job_id, {"status": "unknown"})


# Example usage:
job_id = f"cron:{uuid.uuid4()}"  # unique ID per cron invocation
start_job(job_id)
# ... run the cron job ...
complete_job(job_id)
print(get_job_status(job_id))

Verification

To verify that the fix worked, we can test the following scenarios:

  • A cron job completes successfully and the registry is updated correctly
  • A cron job is interrupted before completing and its registry entry is never marked completed (it stays running, or is later recorded as interrupted)
  • A subsequent cron invocation can query the registry and detect the status of the previous job
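A killed process cannot record its own interruption, so the second scenario depends on a later reader inferring it. A minimal sketch of that inference follows, assuming registry records shaped like the proposed API and a caller-chosen `max_runtime`; both `reconcile_stale_jobs` and the threshold are illustrative assumptions, not part of the original proposal.

```python
from datetime import datetime, timedelta, timezone

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # matches "2026-03-30T17:29:00Z"


def reconcile_stale_jobs(registry: dict, now: datetime, max_runtime: timedelta) -> dict:
    """Return a copy of the registry in which every job that is still
    "running" after more than max_runtime is marked "interrupted".
    The input registry is left unmodified."""
    reconciled = {}
    for job_id, record in registry.items():
        record = dict(record)  # shallow copy so the caller's data is untouched
        if record.get("status") == "running":
            started = datetime.strptime(record["started_at"], TIMESTAMP_FORMAT)
            started = started.replace(tzinfo=timezone.utc)
            if now - started > max_runtime:
                record["status"] = "interrupted"
        reconciled[job_id] = record
    return reconciled


registry = {
    "cron:a": {"status": "running", "started_at": "2026-03-30T17:29:00Z"},
    "cron:b": {"status": "completed", "started_at": "2026-03-30T16:00:00Z"},
}
now = datetime(2026, 3, 30, 18, 29, tzinfo=timezone.utc)
result = reconcile_stale_jobs(registry, now, timedelta(minutes=30))
print(result["cron:a"]["status"])
```

Running this pass at the start of each cron cycle would let the third scenario distinguish a genuinely in-flight job from one that died silently, at the cost of having to pick a runtime bound.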

Extra Tips

  • Use a robust storage solution for the registry, such as a database, to ensure data consistency and durability.
  • Implement error handling and logging to ensure that the registry is updated correctly even in case of failures.
  • Consider using a distributed registry solution if the cron jobs are running on multiple machines.
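For the first tip, the standard-library `sqlite3` module is one option for a single-machine registry, since each update runs in a transaction. The sketch below is an assumption-laden illustration: the table name `cron_jobs`, its columns, and the `db_*` helper names are hypothetical and chosen to mirror the proposed record.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS cron_jobs (
    job_id       TEXT PRIMARY KEY,
    status       TEXT NOT NULL,
    started_at   TEXT NOT NULL,
    completed_at TEXT
)
"""


def open_registry(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the registry database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def db_start_job(conn: sqlite3.Connection, job_id: str, started_at: str) -> None:
    # "with conn" commits on success and rolls back if an exception is raised
    with conn:
        conn.execute(
            "INSERT INTO cron_jobs (job_id, status, started_at) VALUES (?, 'running', ?)",
            (job_id, started_at),
        )


def db_complete_job(conn: sqlite3.Connection, job_id: str, completed_at: str) -> bool:
    """Mark a running job completed; return False if no running job matched."""
    with conn:
        cur = conn.execute(
            "UPDATE cron_jobs SET status = 'completed', completed_at = ? "
            "WHERE job_id = ? AND status = 'running'",
            (completed_at, job_id),
        )
    return cur.rowcount == 1


def db_get_status(conn: sqlite3.Connection, job_id: str) -> str:
    row = conn.execute(
        "SELECT status FROM cron_jobs WHERE job_id = ?", (job_id,)
    ).fetchone()
    return row[0] if row else "unknown"


conn = open_registry()  # in-memory database, for demonstration only
db_start_job(conn, "cron:demo", "2026-03-30T17:29:00Z")
status_while_running = db_get_status(conn, "cron:demo")
first_completion = db_complete_job(conn, "cron:demo", "2026-03-30T17:30:00Z")
print(db_get_status(conn, "cron:demo"))
```

Restricting the update to rows still marked running means a late or duplicate completion call cannot overwrite a record that was already finalized.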
