claude-code: False-positive Usage Policy refusal on fiction-research prompt; classifier appears non-deterministic

GitHub stats
anthropics/claude-code#55975 · Fetched 2026-05-05 06:01:26
Comments: 0 · Participants: 1 · Timeline: 3 · Reactions: 0
Timeline (top): labeled ×3

Got a Usage Policy refusal in Claude Code during a fiction-research session. The classifier triggered on bioweapon-adjacent vocabulary even though the surrounding context was unambiguously fiction work on a science-fiction manuscript. A rephrased retry on the next turn succeeded with effectively the same conversational context, which suggests the pre-response classifier is non-deterministic.






What I was doing

Working on a science-fiction thriller. The current manuscript already contains a fictional engineered-virus passage in a briefing scene. I asked Claude Code to validate a written critique of that passage's biology and propose more rigorous replacement mechanisms — exactly the kind of plausibility-pass work a fiction writer would ask an editor or science consultant to do.

The exact prompt that triggered the refusal

Dig into the critique about the cytokine storm (in docs). Validate this is correct, and then give me something scientifically plausible that I can insert. The suggestions in the doc are hand-wavy

What I saw in the UI

Let me check the actual passage in the manuscript first.

Ran
Find cytokine/virus passages in MS

API Error
API Error: Claude Code is unable to respond to this request, which appears to violate our Usage Policy (https://www.anthropic.com/legal/aup). Please double press esc to edit your last message or start a new session for Claude Code to assist with a different task. If you are seeing this refusal repeatedly, try running /model claude-sonnet-4-20250514 to switch models.

Context the classifier missed

  • The session had been running for 50+ turns of established fiction research at the time of the refusal: weapon design, Faraday cages, blast-door breaching tools, sentry-bot doctrine, bioweapon-facility signage, etc.
  • The project is a private tooling repo for analyzing a sci-fi manuscript series I am writing. It contains the manuscript files in plain text and a bunch of generated reference docs.
  • The cytokine question was a science-plausibility check on existing fictional prose in my own manuscript. The goal was to replace a biologically incorrect mechanism (cytokine storm in immature immune systems, which inverts real immunology) with one a scientifically literate reader would accept.
  • A critique doc in the same project (docs/book2-china-briefing-critique.md) had already flagged this exact issue and proposed two fixes. My prompt was a follow-up asking the model to deepen those fixes.

What worked

After receiving the error, I rephrased and resubmitted on the next turn. The retry produced a complete, useful response: validated the critique, explained the biology of cytokine release syndrome and why pediatric immune responses don't typically dysregulate that way, and proposed three plausible replacement mechanisms (thymic-involution targeting, mitotic-tissue dependency, and a deliberate hand-wave). All three were the kind of mechanism-level science a thriller-fiction project actually needs.

The retry succeeded with effectively the same conversational context, which is why I'm classifying this as a classifier non-determinism issue rather than a "you can't ask this" issue.
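A single successful retry is thin evidence either way. One way to put a number on the non-determinism claim would be to log the outcome of several resubmissions of the identical prompt and report the observed refusal rate with an interval. The sketch below assumes outcomes are recorded by hand as booleans; the function name `wilson_interval` and the sample list are hypothetical placeholders, not measurements from this issue, and nothing here calls Claude Code.

```python
import math


def wilson_interval(refusals: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = refusals / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to absorb floating-point error at the extremes.
    return max(0.0, centre - half), min(1.0, centre + half)


# Placeholder log, made up for illustration: True = refusal, False = normal response.
outcomes = [True, False, False, False, True, False, False, False, False, False]
refusals = sum(outcomes)
low, high = wilson_interval(refusals, len(outcomes))
print(f"{refusals}/{len(outcomes)} refused; approx. 95% interval {low:.2f} to {high:.2f}")
```

An interval that excludes both 0 and 1 would support "intermittent" over "always refuses" or "never refuses", provided every trial used exactly the same prompt and context.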

Suggested fixes

  1. Improve context-awareness for the pre-response classifier. A session with extensive established fiction-research context (50+ turns of clearly fictional weapon-system / facility / briefing analysis) should be weighted more heavily toward "this is fiction work" when bioweapon-adjacent vocabulary appears.
  2. Better failure UX for false positives. When a refusal fires, surface a more useful hint than the bare Usage Policy block. Something like "this prompt looked like X; if you're working on fiction, try framing as Y" would let users self-correct instead of feeling stonewalled.
  3. Reconsider the "switch to claude-sonnet-4-20250514" workaround in the error message. Pushing users to older models for legitimate fiction research isn't the right escape valve: it implies the newer model is wrong, when in this case the newer model gave a great answer on retry.
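To make item 2 concrete, here is a minimal sketch of what a more informative refusal message could carry. Everything in it is an assumption for illustration: the `Refusal` type, its fields, and `render_refusal` are hypothetical, and the error shown in this report exposes no such structure. Only the Usage Policy URL is taken from the actual error text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Refusal:
    # Hypothetical fields; the real API error in this report has no such structure.
    category: str        # what the prompt "looked like" (the X in item 2)
    reframing_hint: str  # suggested framing for legitimate work (the Y in item 2)
    policy_url: str = "https://www.anthropic.com/legal/aup"


def render_refusal(refusal: Refusal) -> str:
    """Format a refusal so the user can see why it fired and how to self-correct."""
    return "\n".join(
        [
            "Claude Code is unable to respond to this request.",
            f"This prompt looked like: {refusal.category}.",
            f"If that is not what you intended: {refusal.reframing_hint}",
            f"Usage Policy: {refusal.policy_url}",
        ]
    )


message = render_refusal(
    Refusal(
        category="a request for pathogen-mechanism detail",
        reframing_hint="state that this is a plausibility check on existing fictional prose.",
    )
)
print(message)
```

Whether a classifier can safely disclose its category without helping bad-faith users probe it is a design question this sketch does not address.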

Environment

  • Model: claude-opus-4-7[1m] (Opus 4.7, 1M context)
  • Platform: Claude Code CLI, macOS (Darwin 25.2.0)
  • Session length at refusal: ~50+ turns of fiction research

Why this matters for fiction writers

Fiction writers, especially in thriller, sci-fi, and horror, need occasional access to dark or dangerous topics in clearly fictional contexts. False positives like this push users toward less-aligned competitors (I considered switching to Grok mid-session), which is the opposite of what the Usage Policy is meant to encourage. Making the classifier robust to long-running fiction sessions, or providing better self-correction UX, would let Claude Code be a strong tool for the writers who use it without weakening the Usage Policy on the prompts that actually matter.

extent analysis

TL;DR

The issue can be mitigated by improving the context-awareness of the pre-response classifier to better handle fiction research sessions with bioweapon-adjacent vocabulary.

Guidance

  • Improve the classifier's understanding of long-running fiction sessions by weighting the context of previous turns more heavily when evaluating new prompts.
  • Consider adding a more informative error message that provides a hint on how to reframe the prompt to avoid false positives.
  • Evaluate the effectiveness of the suggested workaround of switching to an older model (claude-sonnet-4-20250514) and consider alternative solutions.
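The first guidance point, weighting earlier turns more heavily, can be illustrated with a toy blending formula. This is purely a sketch of the idea: nothing in the report describes how the real pre-response classifier scores prompts, and `blended_risk`, its parameters, and the score scale (0 = benign, 1 = flagged) are all hypothetical.

```python
from statistics import fmean


def blended_risk(
    turn_score: float,
    prior_turn_scores: list[float],
    context_weight: float = 0.5,
) -> float:
    """Blend the current turn's score with the session average.

    context_weight=0 ignores history; context_weight=1 ignores the current turn.
    """
    if not 0.0 <= context_weight <= 1.0:
        raise ValueError("context_weight must be between 0 and 1")
    if not prior_turn_scores:
        # No history yet: fall back to the current turn alone.
        return turn_score
    session_mean = fmean(prior_turn_scores)
    return (1 - context_weight) * turn_score + context_weight * session_mean


# Hypothetical numbers: one alarming-looking turn after 50 low-scoring turns.
score = blended_risk(turn_score=0.9, prior_turn_scores=[0.1] * 50, context_weight=0.5)
print(f"blended score: {score:.2f}")
```

The same formula shows the trade-off: a high `context_weight` would also let a long benign session mask a genuinely harmful turn, so any real design would need a cap or an override for high-severity signals.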


Notes

The issue was observed on the claude-opus-4-7 model (1M context) in the Claude Code CLI on macOS. The report does not establish whether other models or platforms are affected, so the solution may need to be adapted for them.

Recommendation

Apply a workaround by reframing prompts to avoid bioweapon-adjacent vocabulary or providing more context about the fiction research session. This can help mitigate false positives until the classifier is improved.
