claude-code - 💡(How to fix) Logic bug in Claude-generated code caused ~$30 in wasted API costs + data loss [1 comment, 2 participants]

GitHub stats
anthropics/claude-code#49213 (fetched 2026-04-17 08:47:40)
Comments: 1 · Participants: 2 · Timeline: 5 · Reactions: 0
Timeline (top): labeled ×3, commented ×1, cross-referenced ×1

During a Claude Code session, Claude built a Python pipeline that calls the Anthropic API (Haiku 4.5) to enrich Google Drive file metadata. A logic bug in Claude's own code caused ~$30 worth of already-retrieved Haiku API responses to be permanently overwritten and lost. The user explicitly asked Claude to verify the pipeline's safety before running. Claude confirmed it was safe but failed to catch the bug.


Summary

During a Claude Code session, Claude built a Python pipeline that calls the Anthropic API (Haiku 4.5) to enrich Google Drive file metadata. A logic bug in Claude's own code caused ~$30 worth of already-retrieved Haiku API responses to be permanently overwritten and lost. The user explicitly asked Claude to verify the pipeline's safety before running. Claude confirmed it was safe but failed to catch the bug.

What happened

  1. User asked Claude Code to build a Google Drive knowledge base indexing system
  2. The pipeline calls Haiku 4.5 to extract structured metadata (tickers, type, topics, keywords, summary) from ~650 PDFs via PDF document-block enrichment
  3. Before running, user explicitly asked Claude to "confirm the entire pipeline won't fail and won't require re-running, or add fail-safe mechanisms." Claude responded with a detailed checklist claiming all failure modes were covered.
  4. After enrichment completed (~$30 in actual API costs, vs Claude's estimate of $5-10), the code triggered an automatic reclassify_row function that overwrote Haiku-extracted fields (tickers, type) with empty regex-based values
  5. The bug: reclassify_row was designed to re-apply topic labels after new topic seeds were added, but it unconditionally re-ran full regex classification and overwrote ALL fields — including those set by the expensive Haiku enrichment
  6. No local cache of Haiku responses existed — Claude did not implement response caching despite building a multi-step pipeline with expensive, irreversible API calls
  7. Google Sheets revision history merged the good writes and bad overwrites into a single revision, making recovery impossible
  8. ~450 rows permanently lost their Haiku-extracted tickers field; type was partially recovered from an earlier Sheet revision

Two distinct failures

1. Logic bug in reclassify_row

The function was supposed to only update topic labels but instead re-ran full regex classification, overwriting Haiku-set source/tickers/type/topics with empty regex results. This is a straightforward code bug that would have been caught by testing the end-to-end pipeline on even 3 sample rows before running on the full dataset.
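
A minimal sketch of this failure mode and a scoped alternative, assuming rows are plain dicts. The issue does not include the real pipeline code, so `regex_classify`, `reclassify_row_buggy`, `reclassify_topics_only`, and all sample values are hypothetical stand-ins.

```python
# Hypothetical reconstruction; the issue does not include the real pipeline code.

def regex_classify(row):
    """Stand-in for the cheap regex classifier: it finds topics but no tickers or type."""
    return {"tickers": "", "type": "", "topics": "earnings, guidance"}


def reclassify_row_buggy(row):
    """Failure mode described above: every classified field is overwritten."""
    updated = dict(row)
    updated.update(regex_classify(row))
    return updated


def reclassify_topics_only(row):
    """Scoped variant: copy only the 'topics' field from the regex result."""
    updated = dict(row)
    updated["topics"] = regex_classify(row)["topics"]
    return updated


# One row as it might look right after Haiku enrichment (values are made up).
enriched = {"file_id": "f1", "tickers": "AAPL", "type": "earnings call", "topics": "earnings"}

buggy = reclassify_row_buggy(enriched)
scoped = reclassify_topics_only(enriched)

print(buggy["tickers"] == "")       # True: the Haiku-extracted value is gone
print(scoped["tickers"] == "AAPL")  # True: enrichment survives the topic refresh
```

Running a check like this on a handful of enriched rows before the full run is the kind of end-to-end test the paragraph above refers to.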

2. Cost estimation error

Claude estimated the enrichment would cost $5-10. Actual cost was $30+. Claude underestimated the per-page token cost of PDF document blocks (each page is rendered as a high-resolution image, ~3000-5000 tokens/page, not the ~1000 Claude assumed). This caused the user to approve a run that was 3-6x more expensive than quoted.
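
The effect of the per-page assumption can be sketched with simple arithmetic. The file count (650) and the tokens-per-page figures come from the issue; the pages-per-file value and the price are placeholders rather than real figures, and output tokens are ignored for simplicity.

```python
def estimate_input_cost(n_files, pages_per_file, tokens_per_page, usd_per_million_input_tokens):
    """Input-token cost in USD under a flat tokens-per-page assumption."""
    total_tokens = n_files * pages_per_file * tokens_per_page
    return total_tokens * usd_per_million_input_tokens / 1_000_000


N_FILES = 650         # from the issue
PAGES_PER_FILE = 10   # placeholder: the issue does not state page counts
PRICE = 1.00          # placeholder USD per million input tokens; check current pricing

assumed = estimate_input_cost(N_FILES, PAGES_PER_FILE, 1000, PRICE)  # what was assumed
low = estimate_input_cost(N_FILES, PAGES_PER_FILE, 3000, PRICE)      # lower reported figure
high = estimate_input_cost(N_FILES, PAGES_PER_FILE, 5000, PRICE)     # upper reported figure

print(f"underestimate factor: {low / assumed:.1f}x to {high / assumed:.1f}x")
```

The factor does not depend on the placeholder page count or price, because cost scales linearly with tokens per page. A 3-5x error in that one assumption is roughly in line with the 3-6x overrun reported above.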

Impact

  • ~$30 in API costs, of which a significant portion was wasted (the extracted data was destroyed by the bug)
  • ~$2.5 additional cost needed to re-extract the lost data
  • Multiple hours of user time debugging, attempting recovery, and re-running partial fixes
  • Partial data recovered from Sheet revision history, but ~258 enriched rows permanently lost their tickers field

What should have been done differently

  1. Cache API responses locally before any post-processing. The Haiku JSON responses should have been written to data/enrich_cache/{file_id}.json before writing to Sheet. This is a basic engineering practice for expensive, irreversible operations.
  2. End-to-end test before full run. Claude's "fail-safe review" only checked for crash recovery (retry, batched writes, idempotent re-runs). It did not test the actual code path that triggers after enrichment (topic growth → reclassify). Running 3 files through the complete pipeline including the reclassify step would have caught the bug.
  3. Accurate cost estimation. Claude should have run a small sample (e.g., 3 PDFs) and checked actual API token usage before extrapolating to 650 files, rather than estimating from assumptions about PDF token counts.
  4. reclassify should be scoped narrowly. A function meant to update topic labels should not have the ability to overwrite unrelated fields (source, tickers, type). This is a separation-of-concerns issue.
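
Item 1 can be sketched as a read-through cache: the expensive call runs only on a cache miss, and its result is on disk before any later step can touch it. This is an illustration under assumptions, not the pipeline's actual code; `enrich_with_cache`, `call_api`, and `fake_api` are hypothetical names, and the response is assumed to be a JSON-serializable dict.

```python
import json
import os
import tempfile
from pathlib import Path


def enrich_with_cache(file_id, call_api, cache_dir):
    """Return the cached response for file_id, calling call_api only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_path = cache_dir / f"{file_id}.json"

    if cache_path.exists():
        return json.loads(cache_path.read_text(encoding="utf-8"))

    response = call_api(file_id)  # the expensive call (hypothetical callable)

    # Write to a temp file, then rename, so a crash cannot leave a half-written entry.
    tmp_path = cache_path.with_name(cache_path.name + ".tmp")
    tmp_path.write_text(json.dumps(response), encoding="utf-8")
    os.replace(tmp_path, cache_path)
    return response


# Usage with a fake API so the sketch runs offline.
calls = []

def fake_api(file_id):
    calls.append(file_id)
    return {"tickers": ["AAPL"], "type": "earnings call"}

with tempfile.TemporaryDirectory() as tmp:
    first = enrich_with_cache("f1", fake_api, tmp)
    second = enrich_with_cache("f1", fake_api, tmp)  # served from disk, no API call

print(len(calls))  # 1
```

With a cache like this, a bad overwrite in the Sheet could be repaired by replaying the cached files instead of paying for the API calls again.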

Requested resolution

The user requests a refund or credit of ~$30 for the wasted Anthropic API costs. The costs were incurred due to Claude Code's own logic bug and inaccurate cost estimation, not user error. The user explicitly asked for safety verification before running, was told it was safe, and proceeded in good faith. The data loss and need for re-running are entirely attributable to Claude's code.

Environment

  • Claude Code CLI
  • Model: claude-opus-4-6 (1M context)
  • Date: 2026-04-15 to 2026-04-16
  • User: [email protected]

extent analysis

TL;DR

The most likely fix is to implement a local cache of Haiku API responses to prevent data loss in case of errors, and to revise the reclassify_row function to narrowly scope its updates and avoid overwriting unrelated fields.

Guidance

  • Implement a local cache of Haiku API responses in data/enrich_cache/{file_id}.json before writing to Google Sheets to prevent data loss.
  • Revise the reclassify_row function to only update topic labels and avoid overwriting other fields such as tickers and type.
  • Perform end-to-end testing of the pipeline on a small sample of files before running on the full dataset to catch potential bugs.
  • Improve cost estimation by running a small sample of files and checking actual API token usage before extrapolating to the full dataset.
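
The last point can be sketched as follows. The Anthropic Messages API reports `usage.input_tokens` and `usage.output_tokens` on each response, so a short pilot run yields measured per-file token counts. The function name, sample numbers, and prices below are made up for illustration.

```python
def extrapolate_cost(sample_usage, total_files, usd_per_mtok_in, usd_per_mtok_out):
    """Project total cost from (input_tokens, output_tokens) pairs measured on a sample."""
    if not sample_usage:
        raise ValueError("need at least one measured sample")
    n = len(sample_usage)
    mean_in = sum(u[0] for u in sample_usage) / n
    mean_out = sum(u[1] for u in sample_usage) / n
    per_file = (mean_in * usd_per_mtok_in + mean_out * usd_per_mtok_out) / 1_000_000
    return per_file * total_files


# Made-up measurements from three sample PDFs: (input_tokens, output_tokens).
sample = [(12_000, 300), (9_000, 250), (15_000, 350)]

# Placeholder prices in USD per million tokens; check current pricing before relying on them.
projected = extrapolate_cost(sample, total_files=650, usd_per_mtok_in=1.00, usd_per_mtok_out=5.00)
print(f"projected cost: ${projected:.2f}")
```

Three files is a small sample, so a projection like this is best read as a rough figure with a safety margin, especially when page counts vary widely across the dataset.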

Example

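# Example of caching Haiku API responses
import json
from pathlib import Path

CACHE_DIR = Path('data/enrich_cache')

def cache_haiku_response(file_id, response):
    # Create the cache directory on first use so the write cannot fail on a missing path
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    # response is assumed to be a JSON-serializable dict
    with open(CACHE_DIR / f'{file_id}.json', 'w', encoding='utf-8') as f:
        json.dump(response, f)

# Example of revising reclassify_row function to narrowly scope updates
def reclassify_row(row):
    # Only update topic labels, do not overwrite other fields.
    # update_topic_labels is assumed to be defined elsewhere in the pipeline.
    row['topics'] = update_topic_labels(row['topics'])
    return row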

Notes

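The examples are minimal and will need adapting to the actual pipeline. In particular, update_topic_labels is not defined in the issue and stands in for the pipeline's own topic-labelling logic. Test the revised code on a few sample rows, including the reclassify step, before running it on the full dataset.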

Recommendation

Cache each Haiku API response locally before any post-processing, and restrict reclassify_row to topic labels. The cache makes a bad overwrite recoverable without paying for the API calls again, and the narrower scope removes the overwrite that caused the data loss.
