claude-code - 💡(How to fix) Fix Claude Code AUP filter false positive on legitimate self-audit documentation [1 comments, 2 participants]

GitHub stats
anthropics/claude-code#55638 (fetched 2026-05-03 04:48:15)
Comments: 1
Participants: 2
Timeline: 5
Reactions: 0
Timeline (top): labeled ×4, commented ×1

Preflight Checklist

  • I have searched existing requests and this feature hasn't been requested yet
  • This is a single feature request (not multiple features)

Problem Statement

I'm using Claude Code to help me build self-audit tooling for my own OpenClaw deployment on my own Mac. The work involves writing documentation describing how the audit detector catches LLM rule violations.

The AUP filter blocked me when I asked Claude to verify a Google Drive upload by downloading the file content for SHA256 comparison. The downloaded file was a markdown doc I had just uploaded — my own writing about my own system.
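The verification step described here comes down to comparing two digests. A minimal sketch, assuming the original file and the downloaded copy are both available as local paths (the function names are illustrative and not part of Claude Code or the Google Drive API):

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def round_trip_intact(original: Path, downloaded: Path) -> bool:
    """True when the downloaded copy is byte-identical to the original."""
    return sha256_of_file(original) == sha256_of_file(downloaded)
```

Because only the digests are compared, a check like this never needs to interpret the document text itself.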

The trigger keywords appear to be combinations like "binance + strategy", "violation detection", "monitoring + Telegram alerts" appearing densely in legitimate technical documentation.

Suggestion: The AUP classifier should consider context. Files being processed within the user's own filesystem or own Drive folder shouldn't be classified the same way as user-generated requests targeting external systems. Right now this creates a chilling effect where users have to self-censor technical documentation about their own software.

Reference: AUP block occurred at approximately 2026-05-03 01:50 UTC+8.

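Chinese addendum (if you want to add it):

In addition, Claude Code's permission popups appear too frequently during desktop work and seriously degrade the experience of long tasks. I would like a finer-grained "session-level trust" mechanism.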

Proposed Solution

Suggested solution:

  1. Context-aware AUP classification: When the content being processed (uploaded, downloaded, edited) originates from the user's own local filesystem or their own cloud storage, weight the trust signal higher. A file the user just uploaded 30 seconds ago shouldn't be classified the same as a fresh user prompt targeting external infrastructure.

  2. Round-trip exemption: When Claude Code downloads a file it (or the user) just uploaded within the same session for verification purposes (e.g., SHA256 hash check), bypass deep content classification, because the content is already known to be user-authored.

  3. Better error messages: The current "Try rephrasing or attempting a different approach" gives zero signal. Tell the user which keyword combination tripped the filter so they can self-correct, or explicitly say "this looks like a false positive; file an issue".

  4. Session-level trust escalation: After N successful tool calls in a session that didn't trigger AUP, raise the threshold for borderline cases. Right now every tool call gets classified independently, which creates absurd inconsistency where call #50 in a session about my own audit log gets blocked while call #1 about the same topic was fine.

  5. Permission popup fatigue: Separately from AUP, Claude Code's per-tool-call approval prompts for read-only operations on the user's own machine are a major UX drag for long sessions. Need session-level "I trust this directory" persistence beyond the current 1-tool-1-approval model.

The current behavior makes Claude Code unsuitable for any work involving security audit tooling, anti-fraud detection, content moderation systems, or any technical documentation that legitimately discusses violations / monitoring / detection, which is a huge category of real software engineering work.
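The round-trip exemption could be sketched roughly as follows. This is only an illustration of the idea under stated assumptions: `SessionUploads`, `should_block_download`, and the `classify` callback are hypothetical names, and nothing here reflects how Claude Code's filter is actually implemented.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SessionUploads:
    """Hypothetical per-session record of content the session has uploaded."""

    digests: set[str] = field(default_factory=set)

    def record_upload(self, content: bytes) -> None:
        # Remember the SHA-256 of everything uploaded during this session.
        self.digests.add(hashlib.sha256(content).hexdigest())

    def is_round_trip(self, content: bytes) -> bool:
        # A download is a round trip if its digest matches an earlier upload.
        return hashlib.sha256(content).hexdigest() in self.digests


def should_block_download(
    session: SessionUploads,
    content: bytes,
    classify: Callable[[bytes], bool],
) -> bool:
    """Skip the (assumed) content classifier for byte-identical round trips."""
    if session.is_round_trip(content):
        return False
    return classify(content)
```

Matching on the digest rather than the filename means the exemption applies only to byte-identical content, so a modified file would still go through normal classification.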

Alternative Solutions

No response

Priority

Critical - Blocking my work

Feature Category

API and model interactions

Use Case Example

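Concrete scenario (a real incident in the early hours of 2026-05-03):

  1. I maintain a local AI agent system (OpenClaw) on my own Mac. Its main agent occasionally violates internal rules that I wrote.

  2. I used Claude Code to help me write a self-audit script that detects the agent's rule violations and writes records to a local log. All of this runs on my own machine and is purely me auditing my own system.

  3. When that was done, I asked Claude Code to upload the work log to my own Google Drive as a backup. The upload succeeded.

  4. To verify that the upload had not been truncated, I asked Claude Code to download the same file and compare its SHA256.

  5. At that point the AUP filter blocked the request and responded "Claude Code is unable to respond... appears to violate our Usage Policy".

    The blocked content was a markdown document that I wrote myself and had uploaded 30 seconds earlier, being downloaded back to my own machine for a hash comparison.

  6. Claude Code itself had written and uploaded the same document a few minutes earlier without being blocked, yet downloading it back was blocked. This is completely inconsistent.

The behavior I would like:

  • Claude Code should recognize that this is a round trip (upload followed by download in the same session) and exempt it
  • Or at least give an actionable error message stating which keyword combination triggered the block and whether it can be temporarily overridden
  • Rather than leaving the user stuck mid-task with half-finished work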
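An actionable error message of the kind requested might look like the sketch below. It assumes the filter can report which terms contributed to a block, which the issue does not confirm; `FilterBlock`, `render_block_message`, and the `issue_url` parameter are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterBlock:
    """Hypothetical description of a blocked operation."""

    matched_terms: tuple[str, ...]  # terms the filter reports, if any
    source: str                     # what was being processed when blocked


def render_block_message(block: FilterBlock, issue_url: str) -> str:
    """Build a message that tells the user what tripped and what to do next."""
    terms = " + ".join(block.matched_terms) if block.matched_terms else "unknown"
    return (
        f"Blocked while processing {block.source}.\n"
        f"Terms that contributed: {terms}.\n"
        f"If this looks like a false positive, file an issue: {issue_url}"
    )
```

For example, `render_block_message(FilterBlock(("binance", "strategy"), "worklog.md"), "https://example.com/report")` would name both the file and the term pair instead of only suggesting a rephrase.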

Additional Context

<img width="623" height="697" alt="Image" src="https://github.com/user-attachments/assets/9a744369-d0e0-491b-b0a9-4c37845dd3cc" />

extent analysis

TL;DR

The AUP filter in Claude Code can be improved to consider context and exempt round-trip file operations, such as downloading a file that was just uploaded for verification purposes.

Guidance

  • The current AUP filter is overly restrictive and blocks legitimate technical documentation, suggesting a need for a more nuanced approach that considers the context of the file operation.
  • Implementing a round-trip exemption for files downloaded within the same session for verification purposes could help mitigate false positives.
  • Providing more informative error messages that specify the triggering keyword combination could help users self-correct and avoid similar issues in the future.
  • Introducing session-level trust escalation could also help reduce the likelihood of false positives for users with a history of legitimate activity.
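Session-level trust escalation, as suggested above, could be modeled as a blocking threshold that rises with each clean tool call. The sketch below is hypothetical: the `SessionTrust` class, the numeric risk score, and all default values are assumptions chosen for illustration, not known properties of the AUP filter.

```python
from dataclasses import dataclass


@dataclass
class SessionTrust:
    """Hypothetical threshold that loosens as a session stays clean."""

    base_threshold: float = 0.5   # assumed starting point for blocking
    step: float = 0.02            # assumed increase per clean call
    max_threshold: float = 0.8    # cap so high-risk content is always blocked
    clean_calls: int = 0

    @property
    def threshold(self) -> float:
        raised = self.base_threshold + self.step * self.clean_calls
        return min(raised, self.max_threshold)

    def should_block(self, risk_score: float) -> bool:
        """Block when the score meets the current threshold; count clean calls."""
        blocked = risk_score >= self.threshold
        if not blocked:
            self.clean_calls += 1
        return blocked
```

The cap reflects the trade-off raised in the issue: borderline cases get more leeway in a long clean session, while clearly high-risk content is still blocked regardless of history.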

Example

The issue itself provides no code snippet, because it concerns the design and implementation of the AUP filter rather than a specific code bug.

Notes

The proposed solution requires careful consideration of the trade-offs between security and usability, as well as potential edge cases that may arise from implementing context-aware AUP classification and round-trip exemptions.

Recommendation

Apply a workaround by revising the technical documentation to avoid triggering keyword combinations, while awaiting a more permanent solution that addresses the underlying issues with the AUP filter.
