claude-code: FleetView buckets agents by stale text-classifier verdict instead of live session status (actively-working agents shown under Completed)


Summary

In the FleetView / background-agents list (the N working · M completed view), agents that are actively working are frequently shown under Completed. The mis-bucketing is not random: the bucket is driven by a stale, sticky text-classifier verdict persisted in each job's state.json (.state), which lags the agent's true activity. The harness already knows the real state precisely (the session process is busy), but the bucketing consults the classifier verdict instead, so a row that emitted a conclusion-looking message and then resumed work stays parked under Completed until it emits its next classifiable line.

This is the "actively-working shown as Completed" direction of the broader bucket-accuracy problem. Related: #59011 (classifier reads only assistant-message text for sentinels), #63094 (stuck running timer after return), #59518 (stuck working on plan-mode/question).

Environment

  • Claude Code v2.1.158
  • macOS (darwin), Opus 4.8
  • Multiple concurrent background (kind: bg) agent sessions

What I observed (reproducible from disk, no screenshot needed)

There are two independent signals on disk:

  1. Bucket source: the classifier verdict (sticky). This is the .state field of ~/.claude/jobs/<id>/state.json, one of working | blocked | done | failed, with a history in ~/.claude/jobs/<id>/timeline.jsonl. The sibling .detail field is a copy of the agent's last assistant message, which confirms that .state is derived from message text, not process state. The value is rewritten only when the agent emits a new classifiable message.

  2. Ground-truth liveness (live): a separate file, ~/.claude/sessions/<pid>.json, carrying .status (busy/idle) and a real OS .pid. The status is mirrored as .tempo (active/idle) inside state.json.

With 10 concurrent background agents, I cross-referenced the two. Findings:

  • Every one of the 10 jobs had .state = "done" persisted — including a job the UI itself displayed under Working. So .state is unreliable as a current-activity signal, and FleetView's bucketing does not consistently agree with it either.
  • 4 of those 10 sessions were genuinely busy with live PIDs (.status == "busy", kill -0 <pid> succeeds), i.e. actively processing a turn — yet 3 of them were rendered under Completed. (The 4th was correctly under Working.)
  • The disagreement is internal to the data the harness already has: .state (text classifier) says done while .tempo / session .status (process) says active/busy.

Illustrative shape (anonymized):

row                | bucket shown | state.json .state | session .status | pid alive | actually working?
-------------------|--------------|-------------------|-----------------|-----------|------------------
agent A            | Working      | done              | busy            | yes       | yes  (correct)
agent B            | Completed    | done              | busy            | yes       | YES  (mis-bucketed)
agent C            | Completed    | done              | busy            | yes       | YES  (mis-bucketed)
agent D            | Completed    | done              | busy            | yes       | YES  (mis-bucketed)
agent E..J         | Completed    | done              | idle            | -         | no   (correct)

Repro recipe (read-only audit)

# Step 1: print each job's classifier verdict (.state) next to its mirrored
# liveness (.tempo):
for d in ~/.claude/jobs/*/; do
  sj="${d}state.json"; [ -f "$sj" ] || continue
  jq -r '"\(.name)\tstate=\(.state)\ttempo=\(.tempo)"' "$sj"
done
# Step 2: for each job's daemonShort, look up the ~/.claude/sessions/*.json file
# whose .jobId matches, and compare its .status (busy/idle) and live pid
# against .state.

Any job where .state == "done" (→ Completed bucket) while session .status == "busy" / .tempo == "active" and the PID is alive is a false-Completed.
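The rule above can be written as a small predicate. This is a minimal sketch, not harness code: the function name is_false_completed and its argument order are made up for illustration, and it assumes you have already read .state, .status, .tempo, and .pid out of the two files yourself.

```shell
# Hypothetical helper, not part of Claude Code.
# Usage: is_false_completed STATE STATUS TEMPO PID
# Succeeds (exit 0) only when the verdict is "done" but the session is live.
is_false_completed() {
  [ "$1" = "done" ] || return 1
  # Either liveness signal qualifies: session .status or the mirrored .tempo.
  [ "$2" = "busy" ] || [ "$3" = "active" ] || return 1
  # Reject empty or non-numeric PIDs before probing.
  case "$4" in ''|*[!0-9]*) return 1 ;; esac
  # kill -0 delivers no signal; it only reports whether the process exists
  # and may be signalled by the current user.
  [ "$4" -gt 0 ] 2>/dev/null && kill -0 "$4" 2>/dev/null
}

# Example: the current shell ($$) stands in for a live agent session.
if is_false_completed done busy active "$$"; then
  echo "false-Completed"
fi
```

Because kill -0 also fails for processes owned by another user, run the check as the same user that owns the agent sessions.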

Root cause hypothesis

The bucket is computed from a lagging indicator (last-message text sentinel) rather than the authoritative one (session process status the daemon already tracks). The sequence that produces the bug:

  1. Agent finishes a turn with conclusion-looking text → classifier stamps .state = done → row moves to Completed.
  2. Agent resumes work (a /loop tick, a queued/resumed task, or a user reply) → session goes busy again.
  3. No new classifiable message has been emitted yet → .state stays frozen at done → row stays under Completed while actively working.
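The three steps can be replayed in a toy model to show why the verdict sticks. Everything here is illustrative: the handler names (on_message, on_turn_start, on_turn_end) and the "conclusion"/"progress" labels are assumptions standing in for whatever the harness and classifier actually do.

```shell
# Toy model of the two signals; all names are hypothetical.
state="working"   # classifier verdict (state.json .state)
status="busy"     # session liveness (session .status)

# The classifier runs only when the agent emits a message.
on_message() {
  case "$1" in
    conclusion) state="done" ;;
    progress)   state="working" ;;
  esac
}
on_turn_end()   { status="idle"; }
on_turn_start() { status="busy"; }   # note: never touches $state

# 1. The agent finishes a turn with conclusion-looking text.
on_message conclusion
on_turn_end
# 2. The agent resumes work (loop tick, queued task, or user reply).
on_turn_start
# 3. No new message has been emitted, so the verdict is still frozen.
echo "state=$state status=$status"   # prints: state=done status=busy
```

In this model the row leaves Completed only once on_message runs again, which matches the observed lag.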

Suggested fix

When choosing a row's bucket, treat live session activity as authoritative over the stale text verdict: if session.status == busy / tempo == active with a live PID, the row should render under Working regardless of the last-message sentinel. Equivalently, invalidate/refresh .state on the idle → busy transition instead of only on message emission. (This also fixes the inverse stale cases.)
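A sketch of the first variant, where liveness overrides the verdict. This is only an illustration under stated assumptions: bucket_for and pid_alive are hypothetical names, the real bucketing code is not shown in this report, and since the report does not say how blocked and failed are bucketed, those verdicts are passed through unchanged.

```shell
# Hypothetical helpers, not part of Claude Code.

# pid_alive PID: succeeds if PID is a positive integer naming a live process.
pid_alive() {
  case "$1" in ''|*[!0-9]*) return 1 ;; esac
  [ "$1" -gt 0 ] 2>/dev/null && kill -0 "$1" 2>/dev/null
}

# bucket_for STATE STATUS PID: prints the bucket for one row.
bucket_for() {
  # Live session activity is checked first and wins over the text verdict.
  if [ "$2" = "busy" ] && pid_alive "$3"; then
    echo "Working"
    return
  fi
  # Otherwise fall back to the classifier verdict.
  case "$1" in
    done)    echo "Completed" ;;
    working) echo "Working" ;;
    *)       echo "$1" ;;   # blocked / failed: mapping unknown, passed through
  esac
}

# Example: a row like agent B (.state=done, session busy, live PID).
bucket_for done busy "$$"   # prints: Working
```

This sketch covers only the false-Completed direction. A stale working verdict on an idle session would still need the second variant, refreshing .state on the session transition.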
