openclaw - ✅(Solved) Fix [Bug]: openclaw cron list/status and openclaw health --json timeout against local gateway while scheduler still appears to run jobs [1 pull requests, 2 comments, 3 participants]

Q: Expected behavior

`openclaw cron list`, `openclaw cron status`, and `openclaw health --json` should return normally when the local gateway is running and `openclaw gateway status` reports `RPC probe: ok`. If legacy cron-store fields are the issue, `openclaw doctor --fix` should normalize them or at least improve the situation.

Parametric89 · 2026-03-21T07:59:40Z

# PR #51515: fix(health): bound gateway health snapshots and normalize legacy cron

- Repository: openclaw/openclaw
- Author: xydt-610
- State: open | merged: False
- Link: https://github.com/openclaw/openclaw/pull/51515

## Description (problem / solution / changelog)

## Summary

Describe the problem and fix in 2–5 bullets:

- **Problem**: `openclaw cron list` / `cron status` and `openclaw health --json` could hit gateway timeouts (~30s) while `gateway status` still showed a healthy RPC probe; users with many channel accounts also paid a sequential health snapshot cost (N×probe timeout). Legacy cron rows could have non-string or empty `id` values.
- **Why it matters**: Admins lose the ability to inspect cron and gateway health via CLI despite a running scheduler; multi-account setups amplified health snapshot latency.
- **What changed**: `getHealthSnapshot` now runs per-channel account probes in parallel, applies a default wall-clock budget (`DEFAULT_HEALTH_SNAPSHOT_BUDGET_MS`), and passes a per-probe budget derived from remaining time; `normalizeStoredCronJobs` coerces numeric `id` to string and assigns a UUID when `id` is missing/empty after legacy migration.
- **What did NOT change (scope boundary)**: No change to cron execution semantics, gateway auth, or unrelated channel plugins beyond health snapshot gathering and cron store normalization on load.

## Change Type (select all)

- [x] Bug fix
- [ ] Feature
- [ ] Refactor required for the fix
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra

## Scope (select all touched areas)

- [x] Gateway / orchestration
- [ ] Skills / tool execution
- [ ] Auth / tokens
- [ ] Memory / storage
- [ ] Integrations
- [x] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra

## Linked Issue/PR

- Closes #51498
- Related #

## User-visible / Behavior Changes

- `health` RPC / `openclaw health` snapshots complete within a bounded time more reliably on multi-account configs; probes for accounts under one channel run concurrently.
- Legacy cron jobs with numeric or missing `id` are normalized when the store is loaded/saved (may persist a one-time rewrite).

## Security Impact (required)

- New permissions/capabilities? (`No`)
- Secrets/tokens handling changed? (`No`)
- New/changed network calls? (`No`; same probes, different scheduling/budget)
- Command/tool execution surface changed? (`No`)
- Data access scope changed? (`No`)

## Repro + Verification

### Environment

- OS: any (fix is Node/TS in gateway + CLI)
- Runtime: OpenClaw gateway + CLI calling `health` / `cron.*` RPCs
- Relevant config: multiple channel accounts or many plugins increases snapshot work; cron store with legacy `id` / `jobId`

### Steps

1. Run `pnpm check` locally on the branch (passes).
2. (Optional) Run gateway and exercise `openclaw health --json` / `openclaw cron list` against a multi-account config.

### Expected

Health snapshot and cron admin RPCs return before client timeout in typical setups; cron store normalizes legacy ids without manual doctor intervention.

### Actual

Local `pnpm check` passed; unit coverage in `health.snapshot.test.ts` / `store-migration.test.ts`.

## Evidence

- [x] Failing pattern addressed: sequential health probes + unbounded total time; missing cron id coercion (covered by code change + tests)
- [x] `pnpm check` green locally

## Human Verification (required)

- Verified scenarios: `pnpm check` full suite; targeted tests for health snapshot and cron migration.
- Edge cases checked: parallel account probes preserve per-account error handling; budget exceeded returns structured probe error string.
- What you did **not** verify: End-to-end on the reporter’s exact Linux systemd gateway host.

## Review Conversations

- [ ] I replied to or resolved every bot review conversation I addressed in this PR.
- [ ] I left unresolved only the conversations that still need reviewer or maintainer judgment.

## Compatibility / Migration

- Backward compatible? (`Yes`)
- Config/env changes? (`No`)
- Migration needed? (`No`; cron ids are normalized on load; optional persist on next write)

## Failure Recovery (if this breaks)

- How to disable/revert: revert this commit.
- Files/config to restore: `src/commands/health.ts`, `src/cron/store-migration.ts`, tests, `CHANGELOG.md`

## Risks and Mitigations

- **Risk**: Parallel probes could increase concurrent outbound requests to providers. **Mitigation**: Same probes as before, only concurrency per channel; overall budget caps total wait.
- **Risk**: UUID assignment for empty ids changes stable ids for broken rows. **Mitigation**: Only when `id` was unusable; improves correctness.

## Changed files

- `CHANGELOG.md` (modified, +1/-0)
- `src/commands/health.ts` (modified, +97/-68)
- `src/cron/store-migration.test.
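The "parallel probes under a wall-clock budget" idea from the PR summary can be illustrated with a small sketch. This is not the PR's code (which is TypeScript in `src/commands/health.ts`); it is a Python approximation under stated assumptions. The names `probe_with_budget`, `health_snapshot`, and the 10-second default are hypothetical, and the error string is only a stand-in for the "structured probe error string" the PR mentions.

```python
import asyncio
import time

# Hypothetical default; the PR does not state the real value of
# DEFAULT_HEALTH_SNAPSHOT_BUDGET_MS.
DEFAULT_BUDGET_S = 10.0


async def probe_with_budget(name, probe, deadline):
    """Run one probe, allowing it only the time left before the shared deadline."""
    remaining = deadline - time.monotonic()
    if remaining <= 0:
        return name, {"ok": False, "error": "health snapshot budget exceeded"}
    try:
        result = await asyncio.wait_for(probe(), timeout=remaining)
        return name, {"ok": True, "result": result}
    except asyncio.TimeoutError:
        return name, {"ok": False, "error": "health snapshot budget exceeded"}
    except Exception as exc:
        # One failing account must not hide the results of the others.
        return name, {"ok": False, "error": str(exc)}


async def health_snapshot(probes, budget_s=DEFAULT_BUDGET_S):
    """Probe every account concurrently; the total wait is capped by budget_s."""
    deadline = time.monotonic() + budget_s
    tasks = [probe_with_budget(name, probe, deadline) for name, probe in probes.items()]
    return dict(await asyncio.gather(*tasks))
```

Here `probes` maps an account name to a zero-argument coroutine function. With sequential probing, N slow accounts cost roughly N × probe timeout; with this shape the snapshot returns after at most `budget_s`, and slow accounts are reported as individual errors rather than stalling the whole call.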


GitHub stats

openclaw/openclaw#51498•Fetched 2026-04-08 01:10:20


Timeline (top)

commented ×2, labeled ×2, cross-referenced ×1, referenced ×1


Error Message

Error: gateway timeout after 30000ms
Gateway target: ws://127.0.0.1:18789
Source: local loopback
Config: /home/erikadmin/.openclaw/openclaw.json
Bind: loopback

Root Cause

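PR #51515 attributes the timeouts to two factors. First, the gateway gathered health snapshots by probing channel accounts sequentially with no overall time limit (N × probe timeout), so admin calls could exceed the ~30 s client timeout even while `openclaw gateway status` reported `RPC probe: ok`. Second, legacy cron rows could carry non-string or empty `id` values. The PR was not verified end-to-end on the reporter's host.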


Bug type

Regression (worked before, now fails)

Summary

openclaw cron list/status and openclaw health --json timeout against local gateway while scheduler still appears to run jobs

Steps to reproduce

1. Run `openclaw gateway status` on a Linux host with a user-level systemd gateway.
2. Confirm the gateway reports healthy output and `RPC probe: ok`.
3. Run `openclaw cron list`.
4. Run `openclaw cron status`.
5. Run `openclaw health --json`.
6. Observe that all of the admin-facing commands time out, while existing cron jobs still appear to run and write run-history under `~/.openclaw/cron/runs/`.
7. Run `openclaw gateway install --force` and retry.
8. Run `openclaw doctor --fix` and retry.

Expected behavior

`openclaw cron list`, `openclaw cron status`, and `openclaw health --json` should return normally when the local gateway is running and `openclaw gateway status` reports `RPC probe: ok`. If legacy cron-store fields are the issue, `openclaw doctor --fix` should normalize them or at least improve the situation.

Actual behavior

`openclaw gateway status` reports a healthy local gateway with `RPC probe: ok`, but:

- `openclaw cron list` times out
- `openclaw cron status` times out
- `openclaw health --json` times out
- Existing cron jobs still appear to run and write run-history, so the scheduler seems alive while the admin/control plane is not.

I also found that jobs in `~/.openclaw/cron/jobs.json` use legacy `id` fields and have `jobId = None`. Running `openclaw gateway install --force` did not fix it. Running `openclaw doctor --fix` did not fix it.

OpenClaw version

2026.3.13 (61d171a)

Operating system

Ubuntu Linux 6.8.0-100-generic (x64), user-level systemd gateway

Install method

No response

Model

openai-codex/gpt-5.4

Provider / routing chain

OpenAI Codex OAuth / local gateway / user-level systemd service

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Error: gateway timeout after 30000ms
Gateway target: ws://127.0.0.1:18789
Source: local loopback
Config: /home/erikadmin/.openclaw/openclaw.json
Bind: loopback

Impact and severity

Affected users/systems/channels: Observed on one Linux host using a local user-level systemd Gateway. The affected subsystem is OpenClaw cron administration via the CLI (openclaw cron list, openclaw cron status) and openclaw health --json. Existing cron jobs still appear to run, so the scheduler/runtime is at least partially functioning.

Severity: Blocks workflow. It prevents safe inspection and administration of cron jobs even though the scheduler appears to remain active.

Frequency: Always, in this environment. The timeout reproduces consistently across repeated attempts before and after openclaw gateway install --force and openclaw doctor --fix.

Consequence: Cannot reliably inspect, add, modify, or remove cron jobs via the normal CLI workflow. This leaves the system in a degraded state where scheduled automations may continue to run, but the admin/control plane is effectively unavailable.

Additional information

- Gateway bind is loopback (127.0.0.1:18789)
- `openclaw gateway status` is healthy
- Existing cron jobs still appear to execute
- `jobs.json` is valid, small (~33K), version 1, and stores jobs with legacy `id` and no `jobId`
- `openclaw doctor --fix` only cleaned orphan transcript files and did not change the cron behavior
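Before changing anything, it can help to see which identifier shapes the store actually contains. The following is a read-only sketch, assuming `jobs.json` is either a bare list of jobs or an object with a `jobs` list; the issue only says the file is "version 1", so that layout is an assumption, and the helper names are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def extract_jobs(store):
    """Return the job list from a bare list or an object with a "jobs" list (assumed layouts)."""
    if isinstance(store, list):
        return store
    if isinstance(store, dict) and isinstance(store.get("jobs"), list):
        return store["jobs"]
    raise ValueError("unrecognized jobs.json layout")


def classify_id(value):
    """Label an identifier value by how usable it is as a string id."""
    if isinstance(value, str) and value.strip():
        return "string"
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return "numeric"
    return "missing-or-empty"


def summarize(store):
    """Count jobs by the shape of their `id` and `jobId` fields."""
    counts = Counter()
    for job in extract_jobs(store):
        if not isinstance(job, dict):
            counts["not-an-object"] += 1
            continue
        counts["id:" + classify_id(job.get("id"))] += 1
        counts["jobId:" + classify_id(job.get("jobId"))] += 1
    return dict(counts)


def summarize_file(path):
    """Load a jobs.json file and summarize it without modifying it."""
    return summarize(json.loads(Path(path).read_text(encoding="utf-8")))
```

Calling `summarize_file(Path.home() / ".openclaw" / "cron" / "jobs.json")` would report, for example, how many jobs have a numeric or missing `id`, which are the cases the PR description says it normalizes.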

extent analysis

Fix Plan

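To address the timeouts in `openclaw cron list`, `openclaw cron status`, and `openclaw health --json`, this plan suggests two steps. Neither was confirmed by the reporter, and both differ from what PR #51515 does.

1. Update `jobs.json` to use the `jobId` field instead of the legacy `id` field. This follows the reporter's reading of the store; the PR instead normalizes `id` on load (numeric values become strings, missing or empty values get a UUID), so treat a manual rewrite as unverified.
2. Implement a retry mechanism for the gateway connection to handle temporary timeouts. The reported timeout reproduces on every attempt, and the PR attributes it to sequential, unbounded health probes, so retries alone are unlikely to help.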

Code Changes

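import json
import shutil
from pathlib import Path

# open() does not expand "~", so resolve the home directory explicitly
path = Path.home() / ".openclaw" / "cron" / "jobs.json"
shutil.copy2(path, path.with_name(path.name + ".bak"))  # back up before editing

with path.open("r", encoding="utf-8") as f:
    store = json.load(f)

# Assumption: the store is either a bare list or an object with a "jobs" list
jobs = store if isinstance(store, list) else store.get("jobs", [])

# Move the legacy id into jobId. This assumes jobId is the current field;
# PR #51515 instead normalizes id in place, so verify against your version first.
for job in jobs:
    if "id" in job and not job.get("jobId"):
        job["jobId"] = str(job.pop("id"))

# Save the updated store, preserving any top-level fields such as the version
with path.open("w", encoding="utf-8") as f:
    json.dump(store, f, indent=2)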
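For comparison, the normalization that PR #51515 describes for `normalizeStoredCronJobs` (coerce a numeric `id` to a string, assign a UUID when `id` is missing or empty) can be sketched as follows. The real function is TypeScript and was not available here, so this Python version mirrors only the behavior stated in the PR description. How it treats whitespace-only ids, floats, or a legacy `jobId` is an assumption, and the function names are hypothetical.

```python
import uuid


def normalize_job_id(job):
    """Return (job, changed) where `id` is guaranteed to be a non-empty string."""
    raw = job.get("id")
    if isinstance(raw, str) and raw.strip():
        return job, False  # already usable; leave untouched
    fixed = dict(job)  # copy so the caller's object is not mutated
    if isinstance(raw, (int, float)) and not isinstance(raw, bool):
        fixed["id"] = str(raw)  # numeric id: coerce to string
    else:
        fixed["id"] = str(uuid.uuid4())  # missing or empty id: assign a fresh UUID
    return fixed, True


def normalize_jobs(jobs):
    """Normalize every job and report whether the store needs a one-time rewrite."""
    results = [normalize_job_id(job) for job in jobs]
    return [job for job, _ in results], any(changed for _, changed in results)
```

The `changed` flag corresponds to the PR's note that normalization "may persist a one-time rewrite": a store whose ids are already non-empty strings comes back unchanged and does not need to be written again.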

Configuration Changes

No configuration changes are required.

Infra / Dependency Fixes

No infra or dependency fixes are required.

Temporary Workarounds

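If the issue persists, try raising the gateway client timeout (30000 ms in the reported error). The issue does not name a specific setting for this in `openclaw.json`, so check the documentation for your version before editing the file.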

Verification

To verify that the fix worked:

1. Run `openclaw cron list` and check that it returns normally.
2. Run `openclaw cron status` and check that it returns normally.
3. Run `openclaw health --json` and check that it returns normally.
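The checks above can be scripted so that a hang is reported as a timeout rather than blocking the terminal. This is a minimal sketch using only the commands named in the issue; it assumes the `openclaw` CLI is on `PATH` and exits non-zero on failure, which the issue does not confirm. The 35-second limit is an arbitrary choice, set slightly above the 30000 ms client timeout seen in the error so the CLI's own timeout can fire first.

```python
import subprocess

COMMANDS = [
    ["openclaw", "gateway", "status"],
    ["openclaw", "cron", "list"],
    ["openclaw", "cron", "status"],
    ["openclaw", "health", "--json"],
]


def check(cmd, timeout_s=35.0):
    """Run one command and classify it as ok, failed, timeout, or not-found."""
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "timeout"
    except FileNotFoundError:
        return "not-found"
    # Assumption: a non-zero exit code means the command reported an error.
    return "ok" if done.returncode == 0 else "failed"


def check_all(commands=COMMANDS, timeout_s=35.0):
    """Return {command string: status} for every command."""
    return {" ".join(cmd): check(cmd, timeout_s) for cmd in commands}
```

On the reporter's host, the symptom would show up as `gateway status` returning `ok` while the other three commands report `failed` or `timeout`.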

Extra Tips

Back up `jobs.json` before making any changes.
Running `openclaw doctor --fix` again is unlikely to help on its own: in the report it only cleaned orphan transcript files and did not change the cron behavior.
Consider backing up `jobs.json` regularly so that a bad edit or migration can be rolled back.
