openclaw - ✅ (Solved) Fix [Bug] Stale running tasks after gateway restart: tasks remain in 'running' state instead of being cancelled [1 pull request, 1 comment, 2 participants]

GitHub stats

openclaw/openclaw#78463 (fetched 2026-05-07 03:36:39)
Comments: 1 · Participants: 2 · Timeline: 2 · Reactions: 2
Timeline (top): commented ×1, cross-referenced ×1

After openclaw gateway restart, tasks that were running at restart time remain in running state indefinitely. openclaw tasks audit reports them as stale_running. Manual openclaw tasks maintenance --apply is required to clean them up.

Error Message

Task error stale_running adfb3305-… running 3h10m running task appears stuck

Root Cause

A forced or timed-out gateway restart can proceed before task drain completes. Task records that were active at that moment are never updated, so they stay in running indefinitely and openclaw tasks audit reports them as stale_running.

Fix / Workaround

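Workaround: run openclaw tasks maintenance --apply after each gateway restart to clean up the stale tasks manually. The fix in PR #78575 removes the need for this by marking restart-interrupted tasks as lost.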

PR fix notes

PR #78575: fix(tasks): mark restart-interrupted tasks lost

Description (problem / solution / changelog)

Summary

  • mark active task records as lost when a forced or timed-out gateway restart proceeds before task drain completes
  • keep queued tasks untouched while clearing stale running audit leftovers from interrupted work
  • cover the registry helper and restart run-loop paths with focused regressions
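
The behavior described in the summary above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `TaskRecord` type, the status names other than `running`, `queued`, and `lost`, and the function name `markRestartInterruptedTasksLost` are all hypothetical. Only the error detail string is taken from the audit output quoted in this report.

```typescript
// Hypothetical task shapes; the real OpenClaw registry types may differ.
type TaskStatus = "queued" | "running" | "succeeded" | "failed" | "lost";

interface TaskRecord {
  id: string;
  status: TaskStatus;
  error?: string;
}

// Detail string as it appears in the audit output after the fix.
const RESTART_INTERRUPTED_DETAIL =
  "task interrupted by gateway restart before completion";

// Returns a new list in which every running task is marked lost.
// Queued tasks and tasks already in a terminal state are returned unchanged.
function markRestartInterruptedTasksLost(tasks: TaskRecord[]): TaskRecord[] {
  return tasks.map((task): TaskRecord =>
    task.status === "running"
      ? { ...task, status: "lost", error: RESTART_INTERRUPTED_DETAIL }
      : task,
  );
}
```

Under this sketch, a restart that cannot wait for drain would call the helper once before shutting down, so that the next audit sees lost records with a clear reason rather than running records with no owner.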

Fixes #78463.

Verification

  • pnpm exec oxfmt --check --threads=1 src/tasks/task-registry.maintenance.ts src/tasks/task-registry.maintenance.issue-60299.test.ts src/cli/gateway-cli/lifecycle.runtime.ts src/cli/gateway-cli/run-loop.ts src/cli/gateway-cli/run-loop.test.ts
  • git diff --check
  • pnpm test src/tasks/task-registry.maintenance.issue-60299.test.ts src/cli/gateway-cli/run-loop.test.ts -- --reporter=verbose
  • Reproduced issue #78463 on Linux 6.8 x86_64 with Node v22.22.0 and OpenClaw 2026.5.5 by seeding a stale running subagent task plus durable child session row; pnpm openclaw tasks audit --json reported stale_running: 1 and errors: 1.
  • Re-ran the same seeded-state proof on current OpenClaw 2026.5.6 after this fix; audit reported stale_running: 0, errors: 0, and one lost task with restart-interruption detail.

Real behavior proof

  • Behavior or issue addressed: Fixes issue #78463, where openclaw gateway restart could leave active background task records in running forever, causing openclaw tasks audit --json to report stale_running restart leftovers.
  • Real environment tested: Linux 6.8 x86_64, Node v22.22.0, OpenClaw 2026.5.5 before fix and OpenClaw 2026.5.6 plus this branch after fix, both with an isolated OPENCLAW_STATE_DIR.
  • Exact steps or command run after this patch: seeded the same restart-leftover running subagent task and durable child session row, invoked the restart-interruption marking path, then ran pnpm openclaw tasks audit --json against that isolated state directory.
  • Evidence after fix: copied live terminal output from pnpm openclaw tasks audit --json after this patch:
{
  "summary": {
    "total": 1,
    "warnings": 1,
    "errors": 0,
    "byCode": {
      "stale_queued": 0,
      "stale_running": 0,
      "lost": 1,
      "delivery_failed": 0,
      "missing_cleanup": 0,
      "inconsistent_timestamps": 0
    }
  },
  "findings": [
    {
      "severity": "warn",
      "code": "lost",
      "detail": "task interrupted by gateway restart before completion",
      "status": "lost",
      "task": {
        "runtime": "subagent",
        "runId": "run-issue-78463-pr-fixed",
        "status": "lost",
        "error": "task interrupted by gateway restart before completion"
      }
    }
  ]
}
  • Observed result after fix: pnpm openclaw tasks audit --json reports stale_running: 0 and errors: 0; the interrupted task is now lost with task interrupted by gateway restart before completion instead of remaining running indefinitely.
  • What was not tested: full local pnpm check:changed was not completed because it expanded to broad lanes=all in this environment and Blacksmith/Testbox is unavailable; targeted formatter, diff, regression tests, and real audit reproduction/proof were completed.
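
A check like the one described above could be automated against the JSON output. The sketch below assumes only the `summary.errors` and `summary.byCode` fields visible in the quoted output; the `AuditReport` interface and the `hasNoRestartLeftovers` helper are hypothetical names introduced for illustration, and the rest of the report shape is left untyped.

```typescript
// Shape inferred from the audit output quoted above; only the fields used here are typed.
interface AuditReport {
  summary: {
    total: number;
    warnings: number;
    errors: number;
    byCode: Record<string, number>;
  };
}

// Returns true when the audit shows no stale running tasks and no errors.
function hasNoRestartLeftovers(auditJson: string): boolean {
  const report = JSON.parse(auditJson) as AuditReport;
  // A missing code is treated as a count of zero.
  const staleRunning = report.summary.byCode["stale_running"] ?? 0;
  return staleRunning === 0 && report.summary.errors === 0;
}
```

The input would be the captured output of pnpm openclaw tasks audit --json. A lost finding is reported as a warning in the output above, so it does not cause this check to fail.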

Notes

  • pnpm check:changed expanded to broad lanes=all in the clean PR worktree and was stopped per repo guidance because Blacksmith/Testbox is unavailable in this environment; targeted touched-surface proof above passed.

Changed files

  • src/cli/gateway-cli/lifecycle.runtime.ts (modified, +4/-1)
  • src/cli/gateway-cli/run-loop.test.ts (modified, +8/-0)
  • src/cli/gateway-cli/run-loop.ts (modified, +11/-0)
  • src/tasks/task-registry.maintenance.issue-60299.test.ts (modified, +38/-0)
  • src/tasks/task-registry.maintenance.ts (modified, +30/-0)

Code Example

Tasks audit: 9 findings · 1 errors · 8 warnings
Task     error    stale_running    adfb3305-…  running  3h10m   running task appears stuck


Environment

  • OpenClaw: 2026.5.5 (b1abf9d)
  • macOS: 26.4.1 (arm64)
  • Node.js: 24.15.0

Steps to Reproduce

  1. Have any active tasks running (e.g. subagent, cron, acp)
  2. Run openclaw gateway restart
  3. Run openclaw tasks audit

Expected Behavior

Tasks running at restart time should be marked as cancelled or failed during shutdown/startup, not left as running.

Actual Behavior

Tasks audit: 9 findings · 1 errors · 8 warnings
Task     error    stale_running    adfb3305-…  running  3h10m   running task appears stuck

Workaround: openclaw tasks maintenance --apply cleans them up manually.

Impact

Requires manual cleanup after every gateway restart. openclaw status shows persistent task errors until maintenance is run.
