claude-code: [BUG] Task list accumulates stale tasks across sessions; spinner surfaces in_progress tasks from months ago alongside current work

StepCodex · 2026-05-17T04:38:59Z

Preflight Checklist

- [x] I have searched existing issues and this hasn't been reported yet
- [x] This is a single bug report
- [x] I am using the latest version of Claude Code

Version

Claude Code 2.1.143 (macOS Sequoia 15.x, terminal Claude Code, default settings).

What's Wrong?

The task primitive (TaskCreate / TaskUpdate / the system-reminder list that exposes them) accumulates tasks across sessions with no scoping or aging. In a project with months of conversation history, the current session ends up showing a task list dominated by stale entries from old refactor arcs that have nothing to do with current work.

This compounds three ways:

1. Stale in_progress tasks from old sessions stay flagged in_progress indefinitely. If a session ended without explicit cleanup (compaction, context overflow, abandonment), the task stays in_progress for every future session. The next session inherits a phantom 'this is what you're working on' signal that's months out of date.

2. The spinner shows the top N tasks during long tool calls. When the surfaced N includes phantom in_progress tasks (sorted by creation order, not by recency of activity), the agent appears to the user to be working on the wrong thing. Example from a real session today:

✢ Running Phase 2: Core dissection… (6m 7s · ↓ 8.1k tokens · almost done thinking)
  ⎿  ◼ Phase 2: Core dissection (8 PRs)        ← from months ago, not actually active
     ◻ Phase 5: SampleIndex dissection (4 PRs) ← months old
     ◻ Move namespace anchors to their own SPM target root  ← months old
     ◻ Ironclad sweep — Stage B (older closures sample)     ← actually active
     ◻ Ironclad sweep — Stage E (process changes)           ← actually active
      … +2 pending, 83 completed

Only 2 of the 5 visible tasks are actually relevant to current work. The 3 stale tasks dominate the visual frame and create a false impression of what the agent is doing.

3. The system-reminder injects the full task list into context periodically (the 'task tools haven't been used recently...' reminder lists every task, completed and stale alike). In a long-lived project the list can grow to 80+ tasks, and it is injected in full with every reminder. Each injection costs tokens and provides false grounding ('I have a Phase 5 SampleIndex dissection pending', when that work was months ago and is no longer scoped). Related: #46465 (system-reminder design), #53603 (task subject leak via spinner + context).

Expected

One of the following (their relative scope is compared after the list):

(a) Aging: tasks not touched for N days drop off the visible spinner list and out of the system-reminder injection. They remain queryable via TaskList if explicitly requested. Closest analogue: a TTL on updatedAt.
(b) Session scoping: tasks belong to the session that created them. Cross-session continuation requires explicit reattach (e.g. TaskList --since-session=N). Closest analogue: how shell job control works — jobs shows your jobs, not the previous user's.
(c) Spinner top-N by recency: even if the underlying store keeps every task, the spinner should sort by updatedAt DESC rather than createdAt ASC, so stale in_progress tasks fall off the visible portion.
(d) Automatic cleanup on session start: when a new session starts in a project with tasks where status == in_progress and updatedAt is more than 30 days old, flip those to a new state like abandoned so they stop reporting as 'currently being worked on'.
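
Options (a) and (c) compose naturally, so a minimal sketch may help make them concrete. The `Task` record, the `visible_tasks` helper, the 30-day default, and the sample ages below are all hypothetical; only the status, createdAt, and updatedAt field names come from this report, and nothing here reflects Claude Code's actual task store. Leaving completed tasks out of the visible list is also an assumption, based on the spinner showing them only as a count.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Task:
    # Hypothetical record; fields mirror the createdAt / updatedAt / status
    # names used in this report, not a confirmed Claude Code schema.
    subject: str
    status: str  # "pending", "in_progress", or "completed"
    created_at: datetime
    updated_at: datetime


def visible_tasks(tasks, now, max_age_days=30, limit=5):
    """Return the tasks a spinner or reminder would surface.

    Option (a): drop open tasks not touched within max_age_days.
    Option (c): order the rest by updated_at, newest first, then cut to limit.
    """
    cutoff = now - timedelta(days=max_age_days)
    fresh = [t for t in tasks if t.status != "completed" and t.updated_at >= cutoff]
    fresh.sort(key=lambda t: t.updated_at, reverse=True)
    return fresh[:limit]


now = datetime(2026, 5, 17)
tasks = [
    # Ages are invented for illustration.
    Task("Phase 2: Core dissection (8 PRs)", "in_progress",
         now - timedelta(days=120), now - timedelta(days=110)),
    Task("Ironclad sweep: Stage B", "pending",
         now - timedelta(hours=3), now - timedelta(hours=1)),
    Task("Ironclad sweep: Stage E", "pending",
         now - timedelta(hours=2), now - timedelta(hours=2)),
]
shown = [t.subject for t in visible_tasks(tasks, now)]
print(shown)
```

With this sample data the Phase 2 task is filtered out by age, and the two Ironclad tasks are listed most recently touched first. The underlying list is not modified, so an explicit TaskList query could still return everything, as option (a) proposes.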

(c) is the smallest fix and would mitigate the spinner-visual problem without losing data. (a) + (d) together would also mitigate the system-reminder context bloat. (b) is the cleanest semantically but the biggest architectural change.
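
Option (d) could look roughly like the sweep below, run once when a session starts. This is a sketch under stated assumptions: the `Task` record, the `sweep_on_session_start` name, and the sample data are hypothetical, and the abandoned status is the new state proposed in this report, not one that exists today.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Task:
    # Hypothetical record, reduced to the fields the sweep needs.
    subject: str
    status: str
    updated_at: datetime


def sweep_on_session_start(tasks, now, max_idle_days=30):
    """Return a new list in which long-idle in_progress tasks become 'abandoned'."""
    cutoff = now - timedelta(days=max_idle_days)
    swept = []
    for task in tasks:
        if task.status == "in_progress" and task.updated_at < cutoff:
            # 'abandoned' is the proposed new state, not an existing one.
            task = replace(task, status="abandoned")
        swept.append(task)
    return swept


now = datetime(2026, 5, 17)
tasks = [
    # Idle times are invented for illustration.
    Task("Phase 2: Core dissection (8 PRs)", "in_progress", now - timedelta(days=110)),
    Task("Ironclad sweep: Stage B", "in_progress", now - timedelta(hours=1)),
    Task("Phase 5: SampleIndex dissection (4 PRs)", "pending", now - timedelta(days=100)),
]
statuses = [t.status for t in sweep_on_session_start(tasks, now)]
print(statuses)
```

Note that the stale pending task is left alone, because (d) as written only targets in_progress. That gap is one reason to pair (d) with the aging rule in (a).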

Repro

1. Start a project. Create some tasks. Mark some in_progress. End the session without marking them done.
2. Wait a month, or run many other sessions in the same project.
3. Start a new session. Create a new task and run a long tool call.
4. Observe that the spinner shows old in_progress tasks from step 1 mixed with the new task from step 3. The user has to mentally filter every spinner update to find the actually-current work.

In my case the project has 80+ accumulated tasks from a multi-month refactor arc. Many of the in_progress and pending ones at the top of the list (by creation order) are from work that completed months ago in different contexts — Phase 2 / Phase 5 / namespace anchor refactor tasks that have all long since shipped. The actually-current work is in tasks created today, near the bottom of the chronological list.
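
The ordering effect can be modelled without the tool itself. The sketch below is a toy model, not a trace of Claude Code: the task names, ages, and the `top_n` helper are invented, and the only assumption taken from this report is that the spinner picks its top N by creation order.

```python
from datetime import datetime, timedelta

now = datetime(2026, 5, 17)

# Open tasks as (subject, created_at, updated_at); all values are invented.
stale = [
    (f"old refactor task {i}", now - timedelta(days=120 - i), now - timedelta(days=110 - i))
    for i in range(6)
]
current = [
    (f"current task {i}", now - timedelta(hours=3 - i), now - timedelta(minutes=30 - i))
    for i in range(2)
]
open_tasks = stale + current


def top_n(tasks, n, key_index, newest_first):
    """Return the subjects of the first n tasks under the given sort order."""
    ordered = sorted(tasks, key=lambda t: t[key_index], reverse=newest_first)
    return [t[0] for t in ordered[:n]]


# Creation order, oldest first: the behaviour this report describes.
by_creation = top_n(open_tasks, 5, key_index=1, newest_first=False)
# Last-touched order, newest first: the ordering proposed in option (c).
by_recency = top_n(open_tasks, 5, key_index=2, newest_first=True)

print(by_creation)
print(by_recency)
```

In this model the creation-order view contains only old tasks, while the recency view puts both current tasks first. The real session above was less extreme (2 of 5 visible tasks were current), but the mechanism is the same.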

Why It Matters

For users running long-lived agents on real projects:

- The spinner is the primary status surface during long tool calls. If the spinner consistently shows wrong context, users stop trusting it.
- The system-reminder repeatedly re-grounds the model on phantom tasks. Beyond token cost, this can cause the agent to "re-pick-up" a task that's no longer scoped, a class of bug observed in #39961 ("AI Groundhog Day").
- New users in a fresh project don't notice the problem; long-term users in a months-old project see it on every session.

Related

#53603 — task subjects leak via spinner UI + context re-injection (overlapping shape; this issue is the staleness dimension specifically)
#46465 — system-reminder injection design
#39961 — 'AI Groundhog Day' (agent re-doing completed work across sessions; stale task list is one contributing mechanism)
#59195 — feature request for a persistent Todo List panel (the underlying primitive is the same; this bug is about scoping discipline on it)
