claude-code - 💡(How to fix) Fix Logic bug in Claude-generated code caused ~$30 in wasted API costs + data loss [1 comments, 2 participants]

The user requests a refund or credit of ~$30 for the wasted Anthropic API costs. The costs were incurred due to Claude Code's own logic bug and inaccurate cost estimation, not user error. The user explicitly asked for safety verification before running, was told it was safe, and proceeded in good faith. The data loss and need for re-running are entirely attributable to Claude's code.

Root Cause

During a Claude Code session, Claude built a Python pipeline that calls the Anthropic API (Haiku 4.5) to enrich Google Drive file metadata. A logic bug in Claude's own code caused ~$30 worth of already-retrieved Haiku API responses to be permanently overwritten and lost. The user explicitly asked Claude to verify the pipeline's safety before running. Claude confirmed it was safe but failed to catch the bug.

Summary

What happened

The user asked Claude Code to build a Google Drive knowledge-base indexing system.
The pipeline calls Haiku 4.5 to extract structured metadata (tickers, type, topics, keywords, summary) from ~650 PDFs via PDF document-block enrichment.
Before running, the user explicitly asked Claude to "confirm the entire pipeline won't fail and won't require re-running, or add fail-safe mechanisms." Claude responded with a detailed checklist claiming all failure modes were covered.
After enrichment completed (~$30 in actual API costs, versus Claude's estimate of $5-10), the code automatically triggered the reclassify_row function, which overwrote Haiku-extracted fields (tickers, type) with empty regex-based values.
The bug: reclassify_row was designed to re-apply topic labels after new topic seeds were added, but it unconditionally re-ran full regex classification and overwrote ALL fields, including those set by the expensive Haiku enrichment.
No local cache of Haiku responses existed: Claude did not implement response caching despite building a multi-step pipeline with expensive, irreversible API calls.
Google Sheets revision history merged the good writes and bad overwrites into a single revision, making full recovery impossible.
~450 rows permanently lost their Haiku-extracted tickers field; type was partially recovered from an earlier Sheet revision.

Two distinct failures

1. Logic bug in `reclassify_row`

The function was supposed to only update topic labels but instead re-ran full regex classification, overwriting Haiku-set source/tickers/type/topics with empty regex results. This is a straightforward code bug that would have been caught by testing the end-to-end pipeline on even 3 sample rows before running on the full dataset.
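One way to guard against this class of bug is to merge regex output into a row through a helper that refuses to replace populated, model-extracted fields. The sketch below is illustrative only: `apply_reclassification`, `ENRICHED_FIELDS`, and the sample row are hypothetical names and values, not code from the original pipeline.

```python
from typing import Any, Dict, Iterable

# Hypothetical field names taken from the report; adjust to the real sheet schema.
ENRICHED_FIELDS = ("source", "tickers", "type")


def apply_reclassification(
    row: Dict[str, Any],
    regex_result: Dict[str, Any],
    protected: Iterable[str] = ENRICHED_FIELDS,
) -> Dict[str, Any]:
    """Merge regex output into a row without clobbering enriched values."""
    protected = set(protected)
    merged = dict(row)  # copy, so the caller's row is never mutated
    for field, new_value in regex_result.items():
        already_set = bool(row.get(field))
        if field in protected and already_set:
            continue  # keep the expensive, model-extracted value
        if not new_value and already_set:
            continue  # never replace a populated field with an empty one
        merged[field] = new_value
    return merged


# Made-up sample data: the regex pass found no tickers or type, only a new topic.
row = {"tickers": ["AAPL"], "type": "earnings", "topics": ["old"]}
regex_result = {"tickers": [], "type": "", "topics": ["old", "new"]}
print(apply_reclassification(row, regex_result))
```

With this guard, the empty regex values for tickers and type are skipped while the topics update still goes through.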

2. Cost estimation error

Claude estimated the enrichment would cost $5-10. Actual cost was $30+. Claude underestimated the per-page token cost of PDF document blocks (each page is rendered as a high-resolution image, ~3000-5000 tokens/page, not the ~1000 Claude assumed). This caused the user to approve a run that was 3-6x more expensive than quoted.
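The effect of the per-page assumption can be checked with simple arithmetic: input cost scales linearly with tokens per page, so assuming ~1000 tokens when the real figure is ~3000-5000 understates input cost by the same factor. The sketch below uses placeholder values for the page count and price, since the report states neither; only the tokens-per-page figures come from the text above.

```python
def estimate_input_cost(total_pages: int, tokens_per_page: int,
                        usd_per_million_tokens: float) -> float:
    """Input-token cost in USD for a batch of PDF pages."""
    return total_pages * tokens_per_page * usd_per_million_tokens / 1_000_000


# Placeholder inputs (assumptions): the report gives neither the total page
# count nor the price used in the original estimate.
TOTAL_PAGES = 10_000
USD_PER_MTOK = 1.0

assumed = estimate_input_cost(TOTAL_PAGES, 1_000, USD_PER_MTOK)
low = estimate_input_cost(TOTAL_PAGES, 3_000, USD_PER_MTOK)
high = estimate_input_cost(TOTAL_PAGES, 5_000, USD_PER_MTOK)

print(f"estimate at 1000 tokens/page: ${assumed:.2f}")
print(f"cost at 3000-5000 tokens/page: ${low:.2f} to ${high:.2f}")
print(f"underestimate factor: {low / assumed:.1f}x to {high / assumed:.1f}x")
```

The underestimate factor depends only on the ratio of tokens per page, not on the placeholder page count or price.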

Impact

~$30 in API costs, of which a significant portion was wasted (the extracted data was destroyed by the bug)
~$2.50 in additional cost needed to re-extract the lost data
Multiple hours of user time debugging, attempting recovery, and re-running partial fixes
Partial data recovered from Sheet revision history, but ~258 enriched rows permanently lost their tickers field

What should have been done differently

Cache API responses locally before any post-processing. The Haiku JSON responses should have been written to data/enrich_cache/{file_id}.json before writing to Sheet. This is a basic engineering practice for expensive, irreversible operations.
End-to-end test before full run. Claude's "fail-safe review" only checked for crash recovery (retry, batched writes, idempotent re-runs). It did not test the actual code path that triggers after enrichment (topic growth → reclassify). Running 3 files through the complete pipeline including the reclassify step would have caught the bug.
Accurate cost estimation. Claude should have run a small sample (e.g., 3 PDFs) and checked actual API token usage before extrapolating to 650 files, rather than estimating from assumptions about PDF token counts.
reclassify_row should be scoped narrowly. A function meant to update topic labels should not be able to overwrite unrelated fields (source, tickers, type). This is a separation-of-concerns issue.
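The end-to-end check described above could look roughly like the following: run a few sample rows through enrichment and then through whatever post-processing fires afterwards, and report any enriched field whose value changed. All names here (`smoke_test_pipeline`, `fake_enrich`, `buggy_reclassify`) are hypothetical stand-ins, not the pipeline's real functions.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def smoke_test_pipeline(
    sample_rows: List[Row],
    enrich: Callable[[Row], Row],
    post_process: Callable[[Row], Row],
    enriched_fields=("tickers", "type"),
) -> List[str]:
    """Run enrich -> post_process on a few rows; report enriched fields that changed."""
    problems = []
    for row in sample_rows:
        enriched = enrich(dict(row))
        final = post_process(dict(enriched))
        for field in enriched_fields:
            if final.get(field) != enriched.get(field):
                problems.append(
                    f"{row.get('file_id')}: {field} changed from "
                    f"{enriched.get(field)!r} to {final.get(field)!r}"
                )
    return problems


# Stand-ins for the real steps: a fake enrichment and a reclassify step that
# reproduces the overwrite described in this report.
def fake_enrich(row: Row) -> Row:
    row.update(tickers=["XYZ"], type="report")
    return row


def buggy_reclassify(row: Row) -> Row:
    row.update(tickers=[], type="", topics=["new-topic"])
    return row


for problem in smoke_test_pipeline([{"file_id": "f1"}], fake_enrich, buggy_reclassify):
    print(problem)
```

An empty result from a run like this on a handful of real files would be a reasonable precondition for starting the full, billable run.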

Requested resolution

Environment

Claude Code CLI
Model: claude-opus-4-6 (1M context)
Date: 2026-04-15 to 2026-04-16
User: [email protected]

extent analysis

TL;DR

The most likely fix is to implement a local cache of Haiku API responses to prevent data loss in case of errors, and to revise the reclassify_row function to narrowly scope its updates and avoid overwriting unrelated fields.

Guidance

Implement a local cache of Haiku API responses in data/enrich_cache/{file_id}.json before writing to Google Sheets to prevent data loss.
Revise the reclassify_row function to only update topic labels and avoid overwriting other fields such as tickers and type.
Perform end-to-end testing of the pipeline on a small sample of files before running on the full dataset to catch potential bugs.
Improve cost estimation by running a small sample of files and checking actual API token usage before extrapolating to the full dataset.
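For the cost-estimation step, a minimal sketch of sample-based extrapolation follows. It assumes each sample entry is an (input_tokens, output_tokens) pair read from the `usage` field of a real API response; the function name and the prices passed in are placeholders to be replaced with current values from the pricing page.

```python
from typing import Iterable, Tuple


def extrapolate_cost(
    sample_usage: Iterable[Tuple[int, int]],
    total_files: int,
    usd_per_million_input: float,
    usd_per_million_output: float,
) -> float:
    """Scale the mean per-file cost of a measured sample up to the full run."""
    costs = [
        (inp * usd_per_million_input + out * usd_per_million_output) / 1_000_000
        for inp, out in sample_usage
    ]
    if not costs:
        raise ValueError("need at least one measured sample")
    return sum(costs) / len(costs) * total_files


# Made-up token counts for three sample PDFs; prices are placeholders.
sample = [(42_000, 350), (18_500, 300), (61_000, 410)]
estimate = extrapolate_cost(sample, total_files=650,
                            usd_per_million_input=1.0,
                            usd_per_million_output=5.0)
print(f"projected cost for 650 files: ${estimate:.2f}")
```

A three-file sample is small, so the projection is best treated as an order-of-magnitude check rather than a quote.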

Example

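# Example of caching Haiku API responses
import json
from pathlib import Path

CACHE_DIR = Path('data/enrich_cache')

def cache_haiku_response(file_id, response):
    # Create the cache directory on first use so the write cannot fail on a missing path
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    with open(CACHE_DIR / f'{file_id}.json', 'w', encoding='utf-8') as f:
        json.dump(response, f, ensure_ascii=False)

# Example of revising reclassify_row function to narrowly scope updates
def reclassify_row(row):
    # Only update topic labels, do not overwrite other fields.
    # update_topic_labels is assumed to be defined elsewhere in the pipeline.
    row['topics'] = update_topic_labels(row['topics'])
    return row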
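A cache is only useful if it is consulted before the API is called again. The following sketch shows one possible cache-first wrapper; `enrich_with_cache` and `call_model` are hypothetical names, with `call_model` standing in for whatever function actually performs the Haiku request.

```python
import json
from pathlib import Path
from typing import Any, Callable, Dict


def enrich_with_cache(
    file_id: str,
    call_model: Callable[[str], Dict[str, Any]],
    cache_dir: Path = Path("data/enrich_cache"),
) -> Dict[str, Any]:
    """Return the cached response for file_id, calling the model only on a miss."""
    cache_path = cache_dir / f"{file_id}.json"
    if cache_path.exists():
        return json.loads(cache_path.read_text(encoding="utf-8"))

    response = call_model(file_id)  # the expensive, billable step
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file, then rename, so a crash mid-write
    # cannot leave a truncated cache entry behind.
    tmp_path = cache_path.with_name(cache_path.name + ".tmp")
    tmp_path.write_text(json.dumps(response, ensure_ascii=False), encoding="utf-8")
    tmp_path.replace(cache_path)
    return response
```

With a wrapper like this, re-running the pipeline after a downstream bug re-reads the saved responses instead of paying for the same calls again.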

Notes

The provided examples are minimal and may require additional modifications to fit the specific use case. It is essential to thoroughly test the revised code to ensure it works as expected.

Recommendation

Apply the workaround by implementing a local cache of Haiku API responses and revising the reclassify_row function to prevent data loss and incorrect updates. This approach addresses the root causes of the issue and helps prevent similar problems in the future.
