claude-code: Claude wrote naive Yelp scraper from scratch ignoring existing battle-tested code (116 rapid hits, IP blocked) [1 comment, 2 participants]

GitHub stats
anthropics/claude-code#48321 (fetched 2026-04-16 07:03:04)
Comments: 1 · Participants: 2 · Timeline events: 8 · Reactions: 0
Timeline (top): labeled ×4, cross-referenced ×3, commented ×1

Severity: HIGH — IP burned, data loss, explicit instruction violated

Date: 2026-03-14

What happened

The VLIS project already had a battle-tested 3-tier Yelp scraper with DDG bypass, human-like delays, and anti-detection code at vegas-intelligence/core/checkers/yelp_scrapling_checker.py.

Claude ignored this existing code and wrote a naive Yelp scraper from scratch. The new scraper hit yelp.com/search 116 times in rapid succession with no delays. Yelp CAPTCHA-blocked and IP-blocked the server immediately.

Two explicit rules were violated:

  1. Global CLAUDE.md: "NEVER write a new scraper without first checking existing code in VLIS"
  2. Global CLAUDE.md: "NEVER hit any site with rapid sequential requests — minimum 10-15s between ANY requests"

Impact

  • Server IP blocked by Yelp
  • All future Yelp data collection from that IP impossible
  • Hours lost debugging and implementing what was already built
  • User forced to write an explicit memory rule about this — which was then violated again in future sessions

Requested resolution

  • Refund of session credits
  • Fix: Before writing any scraper, Claude must search existing project directories for prior implementations and use/adapt them instead of starting from scratch

extent analysis

TL;DR

To prevent similar incidents, Claude should be modified to search existing project directories for prior implementations of scrapers before writing new ones.

Guidance

  • Have Claude search the project directories for existing scraper implementations before it writes a new one.
  • Enforce a minimum delay of 10-15 seconds between requests in any scraper, so that requests are never sent in rapid succession.
  • Review and update the Global CLAUDE.md rules so that they are clear and enforceable.
  • Consider a warning or alert when Claude starts writing a new scraper without first checking existing code.
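
The delay rule in the second bullet could be sketched roughly as follows. This is a minimal illustration, not the code of the existing VLIS scraper: the class name `RequestThrottle` and its parameters are hypothetical. The `clock` and `sleep` arguments exist only so the behavior can be checked without actually waiting.

```python
import random
import time


class RequestThrottle:
    """Enforce a randomized minimum gap between consecutive requests.

    Hypothetical helper for illustration; the 10-15 s defaults mirror
    the rule quoted from the Global CLAUDE.md.
    """

    def __init__(self, min_gap=10.0, max_gap=15.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_gap = min_gap
        self.max_gap = max_gap
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request; None before the first

    def wait(self):
        """Block until the required gap has elapsed; return seconds slept."""
        slept = 0.0
        if self._last is not None:
            # Pick a fresh gap each time so the spacing is not perfectly regular.
            gap = random.uniform(self.min_gap, self.max_gap)
            remaining = gap - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
                slept = remaining
        self._last = self._clock()
        return slept
```

A scraper would call `throttle.wait()` immediately before every outgoing request, sharing one `RequestThrottle` instance across all requests to the same site.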

Example

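import os

def check_existing_scraper(project_dir, scraper_name):
    # Walk the whole project tree and report whether any directory contains
    # a file named exactly scraper_name (e.g. "yelp_scrapling_checker.py").
    for _root, _dirs, files in os.walk(project_dir):
        if scraper_name in files:
            return True
    return False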

Notes

This check assumes the existing scraper's file name is known exactly and that implementations are consistently named and organized within the project directories. It is also only a partial solution: it does nothing about request delays, and fully addressing the issue may require additional changes to Claude itself.
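
Because the exact file name is often not known in advance, a looser search may be more practical. The sketch below is one possible approach under stated assumptions: the function name `find_existing_scrapers`, the default keywords, and the list of skipped directories are all hypothetical choices, not part of the VLIS project. It returns the paths of Python files whose name or contents mention any keyword.

```python
import os

# Directories that rarely hold project source (assumed list, adjust as needed).
SKIP_DIRS = {".git", "node_modules", "__pycache__", ".venv"}


def find_existing_scrapers(project_dir, keywords=("scraper", "checker", "yelp")):
    """Return sorted paths of .py files whose name or contents mention a keyword."""
    lowered = [k.lower() for k in keywords]
    matches = []
    for root, dirs, files in os.walk(project_dir):
        # Prune in place so os.walk does not descend into skipped directories.
        dirs[:] = [d for d in dirs if d not in SKIP_DIRS]
        for name in files:
            if not name.endswith(".py"):
                continue
            path = os.path.join(root, name)
            haystack = name.lower()
            try:
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    haystack += "\n" + fh.read().lower()
            except OSError:
                pass  # unreadable file: fall back to matching on the name only
            if any(k in haystack for k in lowered):
                matches.append(path)
    return sorted(matches)
```

Run against the project root, a search like this would be expected to surface a file such as yelp_scrapling_checker.py through the "yelp" or "checker" keyword, at which point the existing code can be read and adapted instead of rewritten.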

Recommendation

Apply workaround: have Claude search the project directories for existing scraper implementations before it creates a new one, so that existing code is reused and adapted rather than rewritten from scratch.
