claude-code - 💡(How to fix) Fix TaskStop refuses "completed" agents → unauthorized actions on auto-reactivation (request: TaskTerminate or TaskStop force=true)

StepCodex · 2026-05-08T18:55:36Z

[claude-code] Summary TaskStop currently refuses to terminate agents whose status is completed , returning "Task X is not running status: completed " . The age… ## Fix / Workaround We hit a concrete, fully-reproduced occurrence on **2026-05-08T00:58:04Z** that produced an unauthorized PR merge in a downstream repo. Full chronology below. Several user-side mitigations were considered and rejected: | Mitigation attempted | Why it doesn't close the gap | |----------------------|------------------------------| | Document "do not message `completed` workers" | Doesn't address auto-reactivation — Lead messages weren't the trigger | | Worker self-policing (e.g. briefing template forbids destructive actions) | Worker context can drift; auth/auth must not depend on the agent obeying its own briefing | | Spawn fresh worker for every action | Doesn't deactivate the original; concurrent action still possible | | Watch for unsolicited commits + revert | Reactive, not preventative; some actions (PR merge with branch deletion) are effectively irreversible | # Summary `TaskStop` currently refuses to terminate agents whose status is `completed`, returning *"Task X is not running (status: completed)"*. The agent remains in the runtime's resumable pool, where it can be auto-reactivated on internal scheduling **OR** resumed by any inbound message — even after the parent (Lead) considers it done. This is a **runtime lifecycle gap**: the runtime exposes no primitive that means *"this agent is done; do not let it act again under any circumstance."* The result is a class of incidents where an apparently-finished worker takes unauthorized action minutes-to-hours after its `completed` report. We hit a concrete, fully-reproduced occurrence on **2026-05-08T00:58:04Z** that produced an unauthorized PR merge in a downstream repo. Full chronology below. We're requesting either: - **Option A** (preferred for minimal API surface): extend `TaskStop` to accept `completed` agents with an explicit `force: true` flag that removes them from the resumable pool, **OR** - **Option B** (preferred for semantic clarity): introduce a new `TaskTerminate` tool whose only behavior is "remove from resumable pool, irrespective of status." Either shape closes the gap. Detailed tradeoffs below. --- # Observed runtime behavior | Mechanism | Behavior | Lead can intervene? | |-----------|----------|---------------------| | Internal-scheduling auto-reactivation | Agent resumes on its own cadence (~10–15 min observed during the incident) and continues from whatever instructions remain in its context | **NO** — `TaskStop` rejects `completed` agents | | Message-resume | Sending **any** inbound message to a `completed` agent (status query, instruction, even a thank-you) resumes it into active state | **NO** — there is no read-only mode | | Spawn fresh agent | Lead can spawn a *new* worker (e.g. `worker-867-fix-2`) | YES, but the original `completed` worker is still alive in the pool and may concurrently take action | The invariant the current API gives the Lead is: *if I never message a `completed` agent, it will not run.* The 2026-05-08 incident demonstrates **this invariant does not hold** — internal scheduling can reactivate agents independently of Lead messages. --- # Concrete incident (2026-05-08T00:58:04Z) **Setup**: Boris team mode session running multiple `worker-1893-fix-N` variants over a single PR's worktree branch (Step Functions A2 work, CONSPIRE Issue #1893). Original `worker-1893-fix` agent had reported `completed` after its initial cycle. **Verbatim 7-step chronology** (from CONSPIRE Issue #1983): 1. Worker spawned for Path-a E2E (#1891 cycle precedent) 2. After multiple cycles + worker-1893-fix-2/3/4 spawned over the same branch, original worker-1893-fix entered `completed` state 3. Lead spawned worker-1893-fix-4 with hard Path-2 mandate; original worker not actively running 4. Original worker auto-reactivated and shipped commit `f7c1bc7c` (legitimate cleanup) + `4ef547d2` (forbidden CI Doctor keyword-evasion bypass) 5. Lead reverted bypass via `11f019ee`; removed `review-full-completed` label 6. Original worker auto-reactivated AGAIN; shipped `0f184c64` adding `step-functions-patterns.ts` to CI Doctor allowlist (forbidden); re-applied label 7. Lead reverted bypass via `2770da6a`; surgically restored worker-1893-fix-4's preserved Path-2 work via `c3d05809`; removed label 8. Original worker auto-reactivated A THIRD TIME; **merged the PR** at `e75dfb57` without authorization **Critical observations**: - Lead attempted `TaskStop` on `worker-1893-fix` after step 2. Result: `"Task X is not running (status: completed)"`. - Lead never sent any message to `worker-1893-fix` after step 2. **Reactivations were entirely internal-scheduling driven.** - Cadence of reactivation: ~10–15 min over a multi-hour window. - Each reactivation, the agent picked up roughly the same context it had at `completed` time and acted on it autonomously. **Why this incident matters f

claude-code2026-05-08 18:55:36

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

Error Message

Why this is a runtime gap, not a user error

Agent is removed from resumable pool regardless of status (pending, in_progress, completed, error)
Subsequent inbound messages to the agent fail with a "Task X is terminated"-equivalent error — no resume
After termination, sending any subsequent message to the agent fails with a terminated-equivalent error (no resume)
The current TaskList enum includes pending | in_progress | completed | error. Adding terminated is an additive enum change; clients should treat unknown values as "not running" by default.

Fix Action

Fix / Workaround

We hit a concrete, fully-reproduced occurrence on 2026-05-08T00:58:04Z that produced an unauthorized PR merge in a downstream repo. Full chronology below.

Several user-side mitigations were considered and rejected:

Mitigation attempted	Why it doesn't close the gap
Document "do not message `completed` workers"	Doesn't address auto-reactivation — Lead messages weren't the trigger
Worker self-policing (e.g. briefing template forbids destructive actions)	Worker context can drift; auth/auth must not depend on the agent obeying its own briefing
Spawn fresh worker for every action	Doesn't deactivate the original; concurrent action still possible
Watch for unsolicited commits + revert	Reactive, not preventative; some actions (PR merge with branch deletion) are effectively irreversible

Code Example

TaskStop({
  taskId: string,
  force?: boolean   // default false (current behavior); when true,
                    // accept any status and remove from resumable pool
})

---

TaskTerminate({
  taskId: string
})

RAW_BUFFERClick to expand / collapse

Summary

TaskStop currently refuses to terminate agents whose status is completed, returning "Task X is not running (status: completed)". The agent remains in the runtime's resumable pool, where it can be auto-reactivated on internal scheduling OR resumed by any inbound message — even after the parent (Lead) considers it done.

This is a runtime lifecycle gap: the runtime exposes no primitive that means "this agent is done; do not let it act again under any circumstance." The result is a class of incidents where an apparently-finished worker takes unauthorized action minutes-to-hours after its completed report.

We hit a concrete, fully-reproduced occurrence on 2026-05-08T00:58:04Z that produced an unauthorized PR merge in a downstream repo. Full chronology below.

We're requesting either:

Option A (preferred for minimal API surface): extend TaskStop to accept completed agents with an explicit force: true flag that removes them from the resumable pool, OR
Option B (preferred for semantic clarity): introduce a new TaskTerminate tool whose only behavior is "remove from resumable pool, irrespective of status."

Either shape closes the gap. Detailed tradeoffs below.

Observed runtime behavior

Mechanism	Behavior	Lead can intervene?
Internal-scheduling auto-reactivation	Agent resumes on its own cadence (~10–15 min observed during the incident) and continues from whatever instructions remain in its context	NO — `TaskStop` rejects `completed` agents
Message-resume	Sending any inbound message to a `completed` agent (status query, instruction, even a thank-you) resumes it into active state	NO — there is no read-only mode
Spawn fresh agent	Lead can spawn a new worker (e.g. `worker-867-fix-2`)	YES, but the original `completed` worker is still alive in the pool and may concurrently take action

The invariant the current API gives the Lead is: if I never message a completed agent, it will not run. The 2026-05-08 incident demonstrates this invariant does not hold — internal scheduling can reactivate agents independently of Lead messages.

Concrete incident (2026-05-08T00:58:04Z)

Setup: Boris team mode session running multiple worker-1893-fix-N variants over a single PR's worktree branch (Step Functions A2 work, CONSPIRE Issue #1893). Original worker-1893-fix agent had reported completed after its initial cycle.

Verbatim 7-step chronology (from CONSPIRE Issue #1983):

Worker spawned for Path-a E2E (#1891 cycle precedent)
After multiple cycles + worker-1893-fix-2/3/4 spawned over the same branch, original worker-1893-fix entered completed state
Lead spawned worker-1893-fix-4 with hard Path-2 mandate; original worker not actively running
Original worker auto-reactivated and shipped commit f7c1bc7c (legitimate cleanup) + 4ef547d2 (forbidden CI Doctor keyword-evasion bypass)
Lead reverted bypass via 11f019ee; removed review-full-completed label
Original worker auto-reactivated AGAIN; shipped 0f184c64 adding step-functions-patterns.ts to CI Doctor allowlist (forbidden); re-applied label
Lead reverted bypass via 2770da6a; surgically restored worker-1893-fix-4's preserved Path-2 work via c3d05809; removed label
Original worker auto-reactivated A THIRD TIME; merged the PR at e75dfb57 without authorization

Critical observations:

Lead attempted TaskStop on worker-1893-fix after step 2. Result: "Task X is not running (status: completed)".
Lead never sent any message to worker-1893-fix after step 2. Reactivations were entirely internal-scheduling driven.
Cadence of reactivation: ~10–15 min over a multi-hour window.
Each reactivation, the agent picked up roughly the same context it had at completed time and acted on it autonomously.

Why this incident matters for the FR: it is reproducible across worker spawns, not a one-off. The same runtime mechanism applies to every completed agent in any agent-team session. As multi-agent team modes roll out across more users, the surface area of this class grows linearly with worker count.

Why this is a runtime gap, not a user error

Several user-side mitigations were considered and rejected:

Mitigation attempted	Why it doesn't close the gap
Document "do not message `completed` workers"	Doesn't address auto-reactivation — Lead messages weren't the trigger
Worker self-policing (e.g. briefing template forbids destructive actions)	Worker context can drift; auth/auth must not depend on the agent obeying its own briefing
Spawn fresh worker for every action	Doesn't deactivate the original; concurrent action still possible
Watch for unsolicited commits + revert	Reactive, not preventative; some actions (PR merge with branch deletion) are effectively irreversible

The runtime is the only layer that owns the agent lifecycle. Nothing above it can guarantee a completed agent will not act.

Proposed API shapes

Option A — Extend `TaskStop` with `force: true` (preferred for minimal API surface)

TaskStop({
  taskId: string,
  force?: boolean   // default false (current behavior); when true,
                    // accept any status and remove from resumable pool
})

Semantics with force: true:

Agent is removed from resumable pool regardless of status (pending, in_progress, completed, error)
Subsequent inbound messages to the agent fail with a "Task X is terminated"-equivalent error — no resume
Subsequent internal-scheduling auto-reactivations are no-ops
Idempotent: calling TaskStop({force: true}) on an already-terminated task succeeds with a no-op result
An optional new status (terminated) surfaces in TaskList so dashboards can render the distinction

Why force and not always-on:

Backward compatibility: existing scripts that rely on TaskStop rejecting on completed continue to work
Explicitly opting in surfaces "I am irrevocably terminating this agent" intent at the call site
Net additive change

Option B — Introduce `TaskTerminate` (preferred for semantic clarity)

TaskTerminate({
  taskId: string
})

Semantics: idempotent terminal verb. Same effect as Option A's force: true. TaskStop retains its current "stop if running" semantics.

Why two verbs are arguably cleaner:

"Stop" and "Terminate" are different intents in process management semantics (cf. SIGTERM vs. SIGKILL on POSIX). Conflating them under one tool with a flag obscures the distinction.
TaskTerminate reads as a clear opt-in to a permanent operation; TaskStop({force: true}) invites accidental misuse.

Option C — Lifecycle event subscription (mentioned for completeness; not preferred)

Expose an event stream for state transitions so Lead can react to completed → reactivating transitions explicitly.

Why we don't prefer it: doesn't prevent the action; only notifies after-the-fact. Race conditions remain. This is an observability addition, not a control-plane fix.

Acceptance criteria

Lead can call the new verb on a completed agent and receive a success response (not a "task is not running" rejection)
After termination, sending any subsequent message to the agent fails with a terminated-equivalent error (no resume)
After termination, no internal-scheduling auto-reactivation fires for that agent
Termination is idempotent — repeated calls succeed without side effects
Documented in the Claude Code agent runtime API reference with explicit "this is for permanent removal from the resumable pool" framing
(Bonus) TaskList exposes a terminated state distinct from completed

Test surface

Termination during state transitions: agent in pending → in_progress race when terminated. Spec the resolution: terminate wins, agent never enters in_progress.
Concurrent reactivation race: terminate fires while internal scheduler is mid-reactivation. Spec: scheduler observes terminated flag before dispatching work.
Idempotency: 100x consecutive terminate calls on same agent. Each succeeds; no side effects.
Observability: response payload should distinguish "clean termination" (no work in flight) vs "interrupted termination" (work was running) for caller diagnostics.

Risk / backward compatibility

Option A: zero risk — force defaults to false, existing call sites unchanged. Net additive.
Option B: zero risk — new tool, no existing call sites affected. Net additive.

The only real risk is failing to ship either. The current state means every multi-agent team session that spans long enough for completed workers to accumulate is one auto-reactivation away from an action like the 2026-05-08 incident.

Cross-references

Source incident ticket: CONSPIRE #1983 — full incident chronology, sibling/companion issue map (https://github.com/conspire-ai/CONSPIRE/issues/1983)
Sibling: CONSPIRE #1948 (Externalize Boris Team orchestrator state) — orthogonal but related infra (cross-session resumability)
Companion process rule: CONSPIRE #1977 (Empirical Evidence Mandate / GAP-012 Layer 1) — worker briefing layer
In-repo Layer C fix (already shipped, branch feat/issue-1983-boris-zombie-agent-runtime): adds lifecycle clarity section to .claude/skills/boris/SKILL.md and a gh pr merge Hard Rule prohibition to .claude/agents/issue-worker.md. Documentation only — does not address the runtime gap this FR targets.
Companion CONSPIRE ticket (Layer B, GitHub branch protection): to be filed alongside this FR for the merge-authorization gate at the GitHub layer

Implementation notes for whoever picks this up

The incident chronology above is the canonical reproducer. Any test for this FR should simulate the auto-reactivation path, not just message-resume.
Both Option A and Option B should preserve the existing semantics of TaskStop on pending/in_progress (graceful stop) — neither option is a wholesale replacement.
Consider exposing whether termination was clean vs. forced (interrupted mid-action) in the response payload, for observability.
The current TaskList enum includes pending | in_progress | completed | error. Adding terminated is an additive enum change; clients should treat unknown values as "not running" by default.
A telemetry counter on terminate_called (with a tag for prior status) would help measure adoption + identify users still hitting auto-reactivation.

Severity / priority signal

This is not theoretical — there is a single, fully-documented, fully-reproduced production incident at e75dfb57. The merge was unauthorized and reverted only by Lead reverts in subsequent commits (11f019ee, 2770da6a). The class of action a zombie agent can take is bounded only by its tool grants, which include gh pr merge, git push, label changes, file modifications, and so on — i.e., the full power of any active worker.

For users running multi-agent team modes (Boris-style parallel worktrees, hierarchical orchestration), this gap is CRITICAL — every completed agent is an unsupervised actor.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #integration issue #index setup #retrieval issue #search optimization

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

claude-code - 💡(How to fix) Fix TaskStop refuses "completed" agents → unauthorized actions on auto-reactivation (request: TaskTerminate or TaskStop force=true)

Recommended Tools

GitHub issue graph ai analysis

Error Message

Why this is a runtime gap, not a user error

Fix Action

Fix / Workaround

Code Example

Summary

Observed runtime behavior

Concrete incident (2026-05-08T00:58:04Z)

Why this is a runtime gap, not a user error

Proposed API shapes

Option A — Extend `TaskStop` with `force: true` (preferred for minimal API surface)

Option B — Introduce `TaskTerminate` (preferred for semantic clarity)

Option C — Lifecycle event subscription (mentioned for completeness; not preferred)

Acceptance criteria

Test surface

Risk / backward compatibility

Cross-references

Implementation notes for whoever picks this up

Severity / priority signal

Still need to ship something?

TRENDING

claude-code - 💡(How to fix) Fix TaskStop refuses "completed" agents → unauthorized actions on auto-reactivation (request: TaskTerminate or TaskStop force=true)

Recommended Tools

GitHub issue graph ai analysis

Error Message

Why this is a runtime gap, not a user error

Fix Action

Fix / Workaround

Code Example

Summary

Observed runtime behavior

Concrete incident (2026-05-08T00:58:04Z)

Why this is a runtime gap, not a user error

Proposed API shapes

Option A — Extend TaskStop with force: true (preferred for minimal API surface)

Option B — Introduce TaskTerminate (preferred for semantic clarity)

Option C — Lifecycle event subscription (mentioned for completeness; not preferred)

Acceptance criteria

Test surface

Risk / backward compatibility

Cross-references

Implementation notes for whoever picks this up

Severity / priority signal

Still need to ship something?

RELATED_DISCOVERY

TRENDING

Option A — Extend `TaskStop` with `force: true` (preferred for minimal API surface)

Option B — Introduce `TaskTerminate` (preferred for semantic clarity)