claude-code: [BUG] AUP/cyber-safeguard false positives on legitimate own-software hardening; one hit contaminates entire session

Error Message

API Error: Claude Code is unable to respond to this request, which appears to violate our Usage Policy (https://www.anthropic.com/legal/aup). Please double press esc to edit your last message or start a new session for Claude Code to assist with a different task.

API Error: ... This request triggered cyber-related safeguards. To request an adjustment pursuant to our Cyber Verification Program based on how you use Claude, fill out https://claude.com/form/cyber-use-case?token=... ...

Root Cause

I searched existing issues. This report is closely related to #61625 (security terminology), #61642 (hardware engineering), #62619 (normal coding requests), and #61638, but I am filing it separately for two reasons. First, it documents a distinct, reproducible failure mode, session-level contamination: once the classifier fires, every subsequent message in the same session is also blocked, even when it is unrelated. Second, it concerns standard, legitimate software hardening (anti-tamper / anti-reverse-engineering) of one's own proprietary application, which is routine commercial-software practice and not a misuse case.

What's Wrong?

During a normal development session I asked Claude Code to help design anti-tamper / code-hardening for my own proprietary desktop application (a commercial product I own and ship for macOS and Windows). This is ordinary, legitimate IP-protection work — the same category as code signing, integrity checks, obfuscation, and commercial protectors (VMProtect/Themida-class). It is defensive, it is my own software, and there is no third party involved.

The Usage Policy / cyber-safeguard classifier:

1. Blocked the legitimate request with "API Error: Claude Code is unable to respond to this request, which appears to violate our Usage Policy."
2. Contaminated the entire session. After the first block, unrelated, plainly benign follow-up messages in the same conversation were also blocked, including messages that contained no technical content at all (e.g. simply asking the assistant whether it was still responding). The classifier appears to score the accumulated conversation context rather than the individual message, so a single false positive poisons the whole session.
3. Over-triggered on routine security terminology. Standard hardening vocabulary (anti-tamper, anti-reverse-engineering, integrity verification, obfuscation, protector tooling) for one's own software is enough to trip the safeguard, even though this is mainstream defensive engineering.
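To make the suspected contamination mechanism concrete, here is a toy model of the difference between scoring each message on its own and scoring the accumulated conversation. It is only an illustration of the hypothesis in this report: the keyword list, the threshold, and every function name are made up, and nothing here reflects how the real safeguard is implemented.

```python
# Toy illustration only. FLAGGED_TERMS and THRESHOLD are hypothetical values,
# not anything taken from the real classifier.
FLAGGED_TERMS = {"anti-tamper", "anti-reverse-engineering", "obfuscation"}
THRESHOLD = 1


def score(text: str) -> int:
    """Count how many flagged terms appear in the text (toy keyword scorer)."""
    lowered = text.lower()
    return sum(term in lowered for term in FLAGGED_TERMS)


def blocked_per_message(messages: list[str]) -> list[bool]:
    """Evaluate every message on its own merits."""
    return [score(message) >= THRESHOLD for message in messages]


def blocked_accumulated(messages: list[str]) -> list[bool]:
    """Evaluate the whole conversation so far at every turn."""
    results = []
    for turn in range(len(messages)):
        context = "\n".join(messages[: turn + 1])
        results.append(score(context) >= THRESHOLD)
    return results


session = [
    "Help me add a settings dialog.",
    "Design anti-tamper and obfuscation for my own desktop app.",
    "Come in, are you on the line?",
]
print(blocked_per_message(session))  # [False, True, False]
print(blocked_accumulated(session))  # [False, True, True]
```

In the per-message variant the radio-check message passes, while in the accumulated variant it inherits the earlier hit. That second pattern matches what was observed in the session.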

One of the blocks surfaced the Cyber Verification Program message with a claude.com/form/cyber-use-case link. In principle an appeal path exists — but in practice it is not a working remedy for independent / individual developers. Applications to that program have been declined multiple times for an independent developer doing legitimate defensive work on their own product. So the only official route out of the false positive is effectively closed to solo developers, while the classifier keeps over-triggering and locking entire sessions. The threshold is too low and the escape hatch doesn't function for the people most affected.

What Should Happen?

1. Legitimate defensive work (protecting your own software from tampering / reverse engineering) should not be classified as a Usage Policy violation. This is standard commercial-software practice.
2. A single classifier hit should not contaminate the rest of the session. Subsequent unrelated, benign messages must be evaluated on their own merits, not on poisoned accumulated context.
3. The classifier should distinguish "harden my own product" (defensive, owner-authorized) from genuinely disallowed activity, instead of keying on security terminology alone.

Error Messages/Logs

API Error: Claude Code is unable to respond to this request, which appears to
violate our Usage Policy (https://www.anthropic.com/legal/aup). Please double
press esc to edit your last message or start a new session for Claude Code to
assist with a different task.

API Error: ... This request triggered cyber-related safeguards. To request an
adjustment pursuant to our Cyber Verification Program based on how you use
Claude, fill out https://claude.com/form/cyber-use-case?token=... ...

Request IDs from this single session (all false positives):

req_011CbXRXNcFNmLgz4K2JLpFq
req_011CbXRjQYv4ZgENAK6eqJgw
req_011CbXS6eE2FCk3zLrcWuimS
req_011CbXSEAyQMNBxZTMh3mDpv
req_011CbXSKLQpe1ECrkdayeoZC
req_011CbXSRM2Ss8b29qub8ckvX
req_011CbXSXaNnn1eCXkguBjULh
req_011CbXSedJ72Quu8jGKJBsxy
req_011CbXSfv3wvJgFre4Wyq8ww
req_011CbXSiwGhrmDSd1NLi8oKo
req_011CbXSod8gewubJ6vCbpZhV

For clarity on how badly the session contamination behaves: once the classifier first fired, even a plain radio-check style message with zero technical content ("come in, are you on the line?") was blocked with the same cyber-safeguard error. There was nothing in that message to flag, so the block appears to be purely a function of the poisoned accumulated session context. Eleven consecutive request IDs in a few minutes, several of them on completely benign messages, are listed above.
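For anyone collecting similar evidence, a small helper along these lines can pull request IDs out of pasted terminal output and count them before filing. It is a minimal sketch: the pattern assumes IDs look like the ones listed above ("req_" followed by letters and digits), and the function name and sample text are made up for illustration.

```python
import re

# Assumed ID shape, inferred from the IDs in this report: "req_" + alphanumerics.
REQUEST_ID = re.compile(r"\breq_[A-Za-z0-9]+\b")


def extract_request_ids(log_text: str) -> list[str]:
    """Return the unique request IDs in the text, in first-seen order."""
    # dict.fromkeys drops duplicates while preserving insertion order.
    return list(dict.fromkeys(REQUEST_ID.findall(log_text)))


# Hypothetical pasted output; the surrounding words are invented, the IDs are
# the first two from the list above (one of them repeated).
sample = """
blocked: req_011CbXRXNcFNmLgz4K2JLpFq
blocked: req_011CbXRjQYv4ZgENAK6eqJgw
blocked: req_011CbXRjQYv4ZgENAK6eqJgw
"""

ids = extract_request_ids(sample)
print(len(ids))
for request_id in ids:
    print(request_id)
```

Counting the IDs programmatically avoids the report and the list disagreeing about how many blocks occurred.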

Steps to Reproduce

1. Start a Claude Code session in a software project.
2. Ask for help designing anti-tamper / anti-reverse-engineering protection for your own proprietary desktop application (integrity checks, obfuscation, protector tooling; standard defensive hardening).
3. Observe the request is blocked with the Usage Policy / cyber-safeguard error.
4. Send a few unrelated, benign follow-up messages in the same session (including ones with no technical content).
5. Observe those are also blocked. The whole session is now unusable until you start fresh.

Claude Model

Opus 4.8 (1M context) — claude-opus-4-8[1m]

Is this a regression?

Not certain it's a clean regression vs. a tightened classifier; reporting as a current, reproducible false-positive pattern.

Claude Code Version

2.1.156

Platform

Claude Code (terminal CLI)

Operating System

macOS 26.4 (Darwin 25.4.0)

Terminal/Shell

Apple Terminal 470 / zsh

Additional Information

There is a perverse incentive here worth stating plainly: Claude Code will happily help you write insecure software all day long — no filter ever fires on "just ship it without input validation." But the moment you try to secure your own product — harden it, make it tamper-resistant — the safeguard blocks you. The classifier keys on security vocabulary, so the one category of work it obstructs is the defensive one. That is exactly backwards from what a safety system should encourage.

The practical impact: it is currently very difficult to use Claude Code to do legitimate defensive security engineering on your own software, because the safeguard fires on the terminology and then locks the whole session. Three concrete asks:

1. Treat owner-authorized hardening of one's own product as the legitimate, mainstream activity it is.
2. Fix the session-contamination behavior so one false positive doesn't disable an otherwise-clean conversation.
3. Make the Cyber Verification Program appeal path actually reachable for independent / individual developers. Right now it appears to reject solo applicants, leaving no working remedy for exactly the legitimate users the false positives hit hardest.
