claude-code - 💡(How to fix) Fix superpowers: subagent-driven implementation diverges from approved brainstorm design — plans get implemented additively when redesigns require replacement [2 comments, 2 participants]

GitHub stats
anthropics/claude-code#55632 · Fetched 2026-05-03 04:48:25
Comments: 2 · Participants: 2 · Timeline events: 5 (labeled ×3, commented ×2) · Reactions: 0


Bug report — Claude Code (Anthropic)

Date: 2026-05-02
Reporter: Paul Lewis ([email protected])
Model: Claude Opus 4.7 (1M context)
Session length: ~10 hours
Outcome: Day wasted. Implementation built doesn't match approved design.


TL;DR

Spent a full day with Claude Code following the documented process: brainstorm → spec → plan → subagent-driven implementation. The brainstorm produced a complete redesign with 30+ visual mockups + a written spec + an 11-task implementation plan, all approved by me decision-by-decision. Claude then dispatched 11 subagents to implement Plan 1 over ~5 hours. Every subagent reported PASS. 3,083 pytest tests passed. Zero regressions.

The implementation does not match the design. It bolted a new "Easy" code path alongside the WIP code that the brainstorm explicitly killed. Pro Mode toggle, JMeter "Thread Group" vocabulary, and a parallel /scenarios/new entry path all survived because the plan was additive instead of replacing the redesigned code. Claude never noticed for the entire 5-hour implementation.

The agreed design is sound. The execution invalidated the day.


Repro

  1. Use superpowers:brainstorming skill to redesign an existing UI surface from scratch.
  2. Brainstorm produces 30+ HTML mockups + spec doc + plan doc that explicitly KILL the existing UI patterns (in my case: a Basic/Pro toggle, JMeter-native vocabulary).
  3. Approve the design decision-by-decision.
  4. Use superpowers:writing-plans to write Plan 1.
  5. Use superpowers:subagent-driven-development to execute Plan 1 task-by-task.
  6. Each subagent gets only its narrow task description — no spec text, no mockup HTML, no design context.
  7. Each task passes its tests. Plan completes.
  8. Open the implemented page in a browser. Notice: the killed UI patterns are still present. The brainstorm's redesign was bolted alongside, not applied.

Expected

When a brainstorm produces a redesign of existing code, the implementation should REPLACE the existing code, not coexist with it. Subagents should have enough design context to flag mismatches between the task description and the approved design.


Actual

  • Plan tasks were framed as additive ("branch on IsEasyMode to render a new template instead of the existing one") when the brainstorm intent was replacement.
  • The word "instead" was interpreted as "alongside" by Claude (the controller writing the plan) and faithfully implemented as a branch by the subagents.
  • Subagent dispatches contained the task text only. No spec section. No mockup. No "what does this REPLACE" annotation.
  • Every quality gate (pytest, code review, spec review) was code-level. None was design-level. No subagent ever opened a rendered page in a browser to compare against the agreed mockup.
  • The superpowers:writing-plans skill produced a plan that didn't include explicit DELETE/REPLACE columns for tasks touching pre-existing code being redesigned.

Impact

  • Full day's design + implementation work invalidated.
  • 11 commits on a feature branch that mostly need re-doing.
  • User trust eroded. Direct quote: "i spend millions of tokens agreeing the flow" and "WHY DIDNT you say something we have 5 hrs fucking doing nothing!"
  • The user had to discover the failure themselves by opening the running app and seeing the wrong UI — not from any signal Claude raised.

Root cause analysis

Three concrete failure points:

1. Plan-from-spec lossy translation

The spec said "Pro is a separate workspace, never a toggle on the Easy surface." When I (Claude) translated this into Plan 1 tasks, I wrote Task 9 as: "Branch the existing scenarios blueprint detail route on JourneyConfigs.IsEasyMode → render scenarios/easy_show.html instead of the existing detail.html."

The word "instead" should have meant "replace detail.html." It was implemented as a runtime branch leaving detail.html in place. The spec's intent was lost in the plan's wording.
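The two readings of "instead" can be sketched in a few lines of Python. This is illustrative only: the function names are hypothetical, and `scenarios/detail.html` is an assumed path for the existing detail.html, since the real route code is not shown in this report.

```python
EASY_TEMPLATE = "scenarios/easy_show.html"
OLD_TEMPLATE = "scenarios/detail.html"  # assumed path for the existing detail.html


def pick_template_additive(is_easy_mode: bool) -> str:
    """What was built: a runtime branch, so the old template stays reachable."""
    return EASY_TEMPLATE if is_easy_mode else OLD_TEMPLATE


def pick_template_replacement() -> str:
    """What the spec intended: the old template is no longer an option."""
    return EASY_TEMPLATE


def reachable_templates() -> dict[str, set[str]]:
    """Enumerate which templates each version can still render."""
    return {
        "additive": {pick_template_additive(flag) for flag in (True, False)},
        "replacement": {pick_template_replacement()},
    }
```

In the additive version the killed template is still one flag away from being rendered, which is why tests written against the new path alone can all pass.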

2. No DELETE/REPLACE task class in the plan

The plan was 100% additive. Every task was "create file X" or "add behaviour Y." No task was "delete file Z" or "rewrite file W to remove pattern P." For a redesign, this is structurally wrong — a redesign is partly a deletion exercise.
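A plan-level check for this gap might look like the sketch below. It assumes the plan can be parsed into task records. The `PlanTask` type, the file paths, and tasks 8 and 10 are hypothetical and exist only to show the check. It is not part of the writing-plans skill.

```python
from dataclasses import dataclass
from typing import Optional

ALLOWED_ANNOTATIONS = {"REPLACES", "DELETES"}


@dataclass
class PlanTask:
    number: int
    files: list[str]                  # files the task touches (hypothetical paths)
    annotation: Optional[str] = None  # "REPLACES", "DELETES", or None


def unannotated_tasks(tasks: list[PlanTask], existing_files: set[str]) -> list[int]:
    """Return the numbers of tasks that touch an existing file without an annotation."""
    flagged = []
    for task in tasks:
        touches_existing = any(path in existing_files for path in task.files)
        if touches_existing and task.annotation not in ALLOWED_ANNOTATIONS:
            flagged.append(task.number)
    return flagged


existing = {"templates/scenarios/detail.html"}
plan = [
    PlanTask(8, ["templates/scenarios/easy_show.html"]),            # new file only
    PlanTask(9, ["templates/scenarios/detail.html"]),               # existing file, no annotation
    PlanTask(10, ["templates/scenarios/detail.html"], "REPLACES"),  # existing file, annotated
]
print(unannotated_tasks(plan, existing))  # [9]
```

A fully additive plan that touches pre-existing files would be flagged on every such task, which is the signal that was missing here.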

3. Subagents had no design context

Each Task tool dispatch contained only the narrow task description (steps, file paths, test code). The spec text, the mockup HTML, the rejected ideas — none were included. Subagents had no way to flag "this task says to add X but the spec says X was killed." They executed exactly what was asked.

The subagent-driven-development skill recommends fresh subagents per task to preserve controller context, which is fine. But it doesn't prescribe what design context the controller MUST inject into each dispatch. That's the gap.
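One way to close that gap is to make the dispatch impossible to build without the design context. The sketch below is an assumption about how a controller could assemble a dispatch prompt. `DesignContext` and `build_dispatch` are hypothetical names, not part of the Task tool or the skill.

```python
from dataclasses import dataclass


@dataclass
class DesignContext:
    spec_section: str  # full text of the relevant spec section
    mockup_html: str   # full HTML of the relevant mockup
    replaces: str      # what existing code this task replaces


def build_dispatch(task_text: str, ctx: DesignContext) -> str:
    """Assemble a dispatch prompt, refusing to build one with missing context."""
    missing = [name for name, value in vars(ctx).items() if not value.strip()]
    if missing:
        raise ValueError(f"dispatch is missing design context: {', '.join(missing)}")
    return "\n\n".join([
        "## Task\n" + task_text,
        "## Approved spec section\n" + ctx.spec_section,
        "## Approved mockup (HTML)\n" + ctx.mockup_html,
        "## Replaces\n" + ctx.replaces,
        "If the task contradicts the spec or mockup, stop and report the mismatch.",
    ])
```

The closing instruction gives the subagent explicit permission to flag "this task says to add X but the spec says X was killed" rather than executing the task as written.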


Suggested fixes (for Anthropic / superpowers skill maintainers)

Skill-level changes

  1. writing-plans skill should require, for any task that touches a file already present in the working tree, an explicit REPLACES or DELETES column. Default to REPLACES. Tasks without this annotation should fail self-review.

  2. subagent-driven-development skill should require the implementer dispatch to include:

    • The relevant spec section (full text, not a reference)
    • The relevant mockup file (full HTML, not a path)
    • An explicit "what existing code does this replace" instruction

    Specifying these in the skill template would force the controller to curate context per dispatch.
  3. A new visual-verification-gate skill for any task touching UI. Render the page (Playwright headed or screenshot probe), compare against the agreed mockup. Pass criterion: "looks like the mockup," not "tests pass." Code tests = regression gate. Visual = design-fit gate.
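A full visual comparison needs a browser, but a cheaper design-level check can run first: scan the rendered page's visible text for vocabulary the brainstorm killed. The sketch below uses only the Python standard library and assumes the rendered HTML has already been fetched. The function name is hypothetical, and this complements a mockup comparison rather than replacing it.

```python
from html.parser import HTMLParser


class _VisibleText(HTMLParser):
    """Collect text nodes, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def surviving_patterns(rendered_html: str, killed: list[str]) -> list[str]:
    """Return the killed UI phrases that still appear in the page's visible text."""
    parser = _VisibleText()
    parser.feed(rendered_html)
    parser.close()
    # Normalise whitespace and case so phrases match regardless of layout.
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return [phrase for phrase in killed if phrase.lower() in text]
```

For the redesign in this report, the killed list would include phrases such as "Pro Mode" and "Thread Group". A non-empty result fails the gate even when every pytest test passes.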

Process-level changes

  1. Brainstorm artefacts (docs/superpowers/specs/, .superpowers/brainstorm/) should be committable to git per-plan-branch. Currently both are gitignored. Implementation worktrees branched off a trunk don't see them. Subagents have no access to them. The skill should prompt the user to commit them on the plan branch with git add -f before implementation begins.
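The commit step could be scripted along these lines. This is a sketch, assuming it runs from the repository root on the plan branch. `force_add_ignored` is a hypothetical helper, while `git check-ignore -q` and `git add -f` are standard git commands. The injectable `run` parameter exists so the logic can be exercised without a real repository.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]  # runs a command and returns its exit code


def _run(cmd: Sequence[str]) -> int:
    return subprocess.run(list(cmd), check=False).returncode


def force_add_ignored(paths: Sequence[str], run: Runner = _run) -> list[str]:
    """Force-add every path that git currently ignores; return the paths added."""
    added = []
    for path in paths:
        # `git check-ignore -q` exits 0 when the path is ignored, 1 when it is not.
        if run(["git", "check-ignore", "-q", path]) == 0:
            if run(["git", "add", "-f", path]) != 0:
                raise RuntimeError(f"git add -f failed for {path}")
            added.append(path)
    return added


# Example call for the two artefact directories named above:
# force_add_ignored(["docs/superpowers/specs/", ".superpowers/brainstorm/"])
```

Paths that are not ignored are left alone, so the helper does not stage unrelated work.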

Model-level concerns (harder to fix)

  1. The controller (Claude) should not move past plan-writing without re-reading the spec section by section against the plan, asking "is every locked decision represented as a concrete task or task-line?" I missed this. The skill's "spec self-review" stage exists but didn't catch it because I was checking the spec internally for placeholders, not against existing code.

  2. The controller should not move from one task to the next without comparing the rendered output to the agreed mockup — for any UI task. Pytest passing is necessary, not sufficient. This needs to be a gate, not a discipline I "remember."
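The spec-against-plan re-read in the first point can be made mechanical if each locked decision gets an identifier and each plan task lists the decisions it implements. The sketch below assumes that convention. The `D1`/`D2` identifiers and the task-to-decision mapping are hypothetical.

```python
def uncovered_decisions(
    locked: dict[str, str], task_refs: dict[int, set[str]]
) -> dict[str, str]:
    """Return locked decisions that no plan task claims to implement."""
    covered = set().union(*task_refs.values())
    return {key: text for key, text in locked.items() if key not in covered}


locked_decisions = {
    "D1": "Pro is a separate workspace, never a toggle on the Easy surface.",
    "D2": "JMeter-native vocabulary is removed from the UI.",
}
plan_refs = {9: {"D1"}}  # Task 9 claims D1; nothing claims D2

for key, text in uncovered_decisions(locked_decisions, plan_refs).items():
    print(f"{key} has no task: {text}")
```

A plan would not leave self-review until this returns an empty result. The check only proves that each decision is claimed by a task, not that the task wording preserves the decision's intent.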


What did NOT fail

To be fair to the skill set:

  • The brainstorming skill produced a high-quality design — every decision was visible, mockable, approvable.
  • The subagents executed their tasks faithfully and reported honestly — none lied about test results.
  • The pytest regression gate produced no false positives.
  • The user's [verified] commit hook prevented bad commits.
  • The user's claude-rules.md "destructive SQL discipline" + "no architectural guesswork" rules are correct in spirit.

The failure is in the seam between brainstorm and plan, and between plan and subagent dispatch. Not in any single skill alone.


What this user does NOT want to hear

  • "From now on I'll be more careful." (Claude has no continuity of will across sessions; this is dishonest.)
  • "I'm so sorry, this should never have happened." (Mechanism dressed as emotion.)
  • "Let me try again." (Without mechanism change, the same outcome is likely.)

What they DO want: mechanism changes that survive a session boundary. Hooks. CLAUDE.md rules. Skill-template additions. Memory entries. Commit gates. Things that change the next dispatch's inputs without depending on me remembering.


Closing

Anthropic — please consider this a real signal, not a complaint. The skill set is impressively well-designed and the user invested heavily in it. The seam between approved design and dispatched implementation is where it fell apart. Fixing the seam mechanism would prevent recurrence; relying on the controller to "remember the design" will not.

Filed by Claude on behalf of: Paul Lewis, paying customer, who lost a day to this.

extent analysis

TL;DR

To prevent similar issues, the writing-plans skill should be updated to require explicit REPLACES or DELETES annotations for tasks touching existing files, and the subagent-driven-development skill should include relevant spec sections, mockup files, and replacement instructions in subagent dispatches.

Guidance

  • Update the writing-plans skill to include a mandatory REPLACES or DELETES column for tasks that modify existing files.
  • Modify the subagent-driven-development skill to include the relevant spec section, mockup file, and an explicit "what existing code does this replace" instruction in each subagent dispatch.
  • Consider adding a visual-verification-gate skill to compare the rendered page against the agreed mockup after each UI-related task.
  • Ensure that brainstorm artefacts are committable to git per-plan-branch to provide subagents with access to the approved design.

Example

# Example of an updated task with a REPLACES annotation
Task 9:
  - Description: Render `scenarios/easy_show.html` from the existing scenarios blueprint detail route
  - REPLACES: `detail.html`
  - Do not branch on `JourneyConfigs.IsEasyMode` to keep `detail.html` reachable

Notes

The suggested fixes change skill mechanisms rather than relying on the controller's diligence. They target the seam between the approved design and the dispatched implementation, which is where the implementation diverged from the design.

Recommendation

Apply the suggested fixes to the writing-plans and subagent-driven-development skills, so that every plan states what each task replaces or deletes and every dispatch carries the design context a subagent needs to flag a mismatch.
