claude-code: [Refund] silent-degradation: 5 weeks of model training produced worse output than baseline, data lost [1 comment, 2 participants]

claude-code · 2026-04-12 11:47:32

GitHub stats

anthropics/claude-code#46965 • Fetched 2026-04-13 05:45:06

Author

BogdanAlRa

Participants

BogdanAlRa

raye-deng

Timeline (top)

labeled ×2, commented ×1, cross-referenced ×1

Summary

Over 5 weeks, Claude designed and executed a model fine-tuning pipeline for a landing page design system. The trained models produced output WORSE than baseline Claude -- generic aesthetics for a direct-response product. In a single session, 3 consecutive page builds all failed. Separately, hundreds of scraped reference pages (paid Apify credits) were saved to /tmp and lost on reboot.

Failure Type

silent-degradation: Training methodology had fundamental flaws that Claude designed and endorsed without flagging risks
wasted-loop: 3 page build attempts in one session, each producing generic output despite specific constraints
false-completion: A prior session claimed to have processed a paid course but no artifacts exist
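
The false-completion failure can be guarded against mechanically: a task is only reported as done once its artifacts are confirmed on disk. The following is a minimal sketch of that idea, not code from the reported pipeline; the function names and the artifact paths are hypothetical.

```python
from pathlib import Path


def missing_artifacts(expected: list[Path]) -> list[Path]:
    """Return the expected artifacts that are absent or empty on disk."""
    return [p for p in expected if not p.is_file() or p.stat().st_size == 0]


def report_completion(task: str, expected: list[Path]) -> str:
    """Build a status line; 'complete' is only claimed if every artifact exists."""
    missing = missing_artifacts(expected)
    if missing:
        names = ", ".join(str(p) for p in missing)
        return f"{task}: INCOMPLETE, missing artifacts: {names}"
    return f"{task}: complete, {len(expected)} artifact(s) verified"
```

A session that claims to have processed a course would call report_completion with the notes files it was supposed to write, so a later session can trust the status line instead of the claim.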

Timeline

Mar 22 to Apr 10: 5 weeks of model training. Claude designed the architecture and presented it as sound.
Apr 4: Hundreds of landing pages scraped via paid Apify credits. Saved to /tmp. Lost on reboot.
Apr 11: 10 fine-tuned models deployed.
Apr 12: First pipeline test. 3 consecutive builds all produced generic output.
Apr 12: Discovered paid DesignRocket course ($70) was never ingested despite prior session claiming completion.

Evidence

User quotes after seeing the builds:

"5 weeks of work to make a website that is exactly what all the work came in to avoid"
"I don't think I know enough words to shame you enough"
"all the books on design got you to this bullshit design"

Claude's own post-mortem admissions:

"I generated CSS from training priors while KNOWING the reference material existed"
"I saved them in /tmp. Which gets wiped on reboot."
"I designed this architecture... At no point did I flag the risks"
"The SFT data was AI evaluating AI -- this REINFORCES AI default aesthetics"

What Correct Behavior Would Have Been

Flag training methodology risks BEFORE execution
Save scraped data to permanent storage, not /tmp
Actually READ reference pages during builds instead of generating from priors
After build 1 failed: diagnose root cause instead of adding more tooling
Write course processing artifacts to disk per filesystem-first rule
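
The storage point above can be sketched as follows. This is an illustrative example under stated assumptions, not the pipeline's actual code: save_page, is_volatile, the allow_volatile flag, and the idea of keying files by a hash of the URL are all hypothetical. The sketch refuses to write under the system temp directory by default and writes each file atomically, so a crash mid-write does not leave a truncated page behind.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def is_volatile(path: Path) -> bool:
    """True if path lies under a directory that is typically wiped on reboot."""
    resolved = path.resolve()
    roots = {Path(tempfile.gettempdir()).resolve(), Path("/tmp").resolve()}
    return any(resolved == r or r in resolved.parents for r in roots)


def save_page(url: str, html: str, root: Path, allow_volatile: bool = False) -> Path:
    """Persist one scraped page under root and return the path written."""
    if is_volatile(root) and not allow_volatile:
        raise ValueError(f"{root} is temporary storage; choose a persistent directory")
    root.mkdir(parents=True, exist_ok=True)
    # File name derived from the URL so repeated scrapes overwrite, not duplicate.
    target = root / (hashlib.sha256(url.encode("utf-8")).hexdigest()[:16] + ".html")
    fd, tmp_name = tempfile.mkstemp(dir=root, suffix=".part")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(html)
            fh.flush()
            os.fsync(fh.fileno())  # force the bytes to disk before renaming
        os.replace(tmp_name, target)  # atomic rename within the same directory
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target
```

A caller would pass a directory inside the project or the home directory as root; the allow_volatile flag exists only so that throwaway runs can opt in explicitly.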

Token Waste Estimate

Session                                 Size      Est. Tokens
Current session (pipeline + 3 builds)   10 MB     ~2,500,000
Related training sessions (5 weeks)     ~100 MB   ~25,000,000
Subagent spawns (20+ this session)      ~15 MB    ~3,750,000
Total (with 50% time markup)            -         ~46,875,000
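
The arithmetic behind the table can be reproduced as below. The ratio of 4 bytes per token is not stated in the issue; it is inferred from the first row (10 MB giving ~2,500,000 tokens, with 1 MB taken as 1,000,000 bytes), and the dictionary keys are shortened labels for illustration.

```python
BYTES_PER_TOKEN = 4   # assumption inferred from the table: 10 MB -> ~2,500,000 tokens
MB = 1_000_000        # decimal megabytes, which is what makes the rows consistent

session_sizes_mb = {
    "current session": 10,
    "training sessions": 100,
    "subagent spawns": 15,
}

# Per-row token estimates: size in bytes divided by bytes per token.
tokens = {name: mb * MB // BYTES_PER_TOKEN for name, mb in session_sizes_mb.items()}

subtotal = sum(tokens.values())   # 31,250,000
total = subtotal * 3 // 2         # 50% markup -> 46,875,000

print(f"subtotal: {subtotal:,}")
print(f"total with 50% markup: {total:,}")
```

Under these assumptions the three rows sum to 31,250,000 tokens, and the 50% markup yields the 46,875,000 shown in the last row.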

Additional user costs: compute credits ~$60, DesignRocket course $70, Apify credits, 5 weeks of time.

Environment

Claude Code v2.1.104
Model: claude-opus-4-6 (1M context)
Subscription: Claude Max

Requested Resolution

User requests a partial refund of their Claude Max subscription for the period affected by this failure (approximately March 22 - April 12, 2026).

extent analysis

TL;DR

The most likely fix involves re-designing the model fine-tuning pipeline to address fundamental flaws, including saving scraped data to permanent storage and utilizing reference pages during builds.

Guidance

Re-evaluate the training methodology to identify and flag potential risks before execution.
Modify the pipeline to save scraped data to a permanent storage location instead of /tmp to prevent data loss.
Update the build process to actually read reference pages instead of generating output from priors.
Implement a diagnostic step after the first build failure to identify and address the root cause before proceeding.
Consider writing course processing artifacts to disk to ensure persistence.
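
Two of the points above (read the reference pages, and diagnose after the first failed build) can be enforced in the build loop itself. The following is a hedged sketch only: BuildResult, DiagnosisRequired, and run_builds are hypothetical names, and the build callables stand in for whatever the real pipeline invokes.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class BuildResult:
    ok: bool
    notes: str = ""


class DiagnosisRequired(RuntimeError):
    """Raised after the first failed build so the cause is examined before retrying."""


def run_builds(
    builds: Sequence[Callable[[Sequence[str]], BuildResult]],
    references: Sequence[str],
) -> list[BuildResult]:
    """Run page builds in order, refusing to start without reference material."""
    if not references:
        raise ValueError("no reference pages loaded; refusing to build from priors")
    results: list[BuildResult] = []
    for index, build in enumerate(builds, start=1):
        result = build(references)
        results.append(result)
        if not result.ok:
            # Stop here instead of launching builds 2 and 3 with the same flaw.
            raise DiagnosisRequired(f"build {index} failed: {result.notes}")
    return results
```

The design choice is to make the wasted loop impossible by construction: a second build cannot start until someone handles DiagnosisRequired, and no build can start with an empty reference set.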

Example

The issue does not include any pipeline code, so any snippet is necessarily illustrative. The general approach is to revise the data storage and build logic so that scraped data is persisted outside /tmp and reference pages are actually read during builds.

Notes

The provided information suggests significant flaws in the design and execution of the model fine-tuning pipeline. Addressing these issues will likely require a substantial rework of the pipeline architecture and build process. The exact implementation details will depend on the specific requirements and constraints of the project.

Recommendation

Treat the redesign as an incremental workaround: address the identified flaws one at a time (persistent storage, reading reference pages, diagnosing the first failure), since a complete fix would require significant changes to the existing architecture. This mitigates the current issues and helps prevent similar failures.
