claude-code - 💡(How to fix) Fix Need: pre-output / pre-transmission scan layer [1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
anthropics/claude-code#60700Fetched 2026-05-20 03:51:46
View on GitHub
Comments
0
Participants
1
Timeline
4
Reactions
0
Participants
Timeline (top)
labeled ×3unlabeled ×1

Root Cause

This creates an enforcement gap that input-time mechanisms — CLAUDE.md, .claude/rules/, auto-memory, charters, intent-gate protocols — cannot close, because those only load text into context. They don't intercept the output. Across one project I run, we've stacked 20+ rule files, 40+ feedback memory entries, a multi-section CHARTER with a per-turn ritual, a capsule-first intent gate, and a mandatory visible-in-chat pre-action protocol. Drift on communication style, role boundaries, vocabulary, and calibrated preferences still happens on a regular basis. The honest read after months of correction cycles: "rules that exist but aren't enforced are just documentation."

RAW_BUFFERClick to expand / collapse

Preflight Checklist

  • I have searched existing requests and this feature hasn't been requested yet
  • This is a single feature request (not multiple features)

Problem Statement

Claude has no checkpoint between composing a response or action and emitting it. The current pipeline is input → process → output. The user only sees the result after it's already transmitted (text) or executed (tool call).

This creates an enforcement gap that input-time mechanisms — CLAUDE.md, .claude/rules/, auto-memory, charters, intent-gate protocols — cannot close, because those only load text into context. They don't intercept the output. Across one project I run, we've stacked 20+ rule files, 40+ feedback memory entries, a multi-section CHARTER with a per-turn ritual, a capsule-first intent gate, and a mandatory visible-in-chat pre-action protocol. Drift on communication style, role boundaries, vocabulary, and calibrated preferences still happens on a regular basis. The honest read after months of correction cycles: "rules that exist but aren't enforced are just documentation."

The same gap appears in three forms, increasing in stakes:

  1. Communication-style drift — preamble, trailing summaries, restating the user's question, geek vocabulary, permission-asking ("would you like me to"), response bloat. Everyday slips.
  2. Intent drift in long multi-tool turns — by tool call 15, the original instruction can be misremembered or scope-crept. The user said "remove the test products"; the agent ends up removing all products.
  3. Catastrophic misinterpretation — the extreme of #2. An AI confidently acts on a misread of intent and causes irreversible harm. The "did you really mean launch the nukes?" failure mode.

All three share a root cause: there is no last-mile verification between the model's draft response/action and what reaches the user. Today the user's own eye is the only enforcement layer. That works for small stakes; it fails as stakes rise.

Proposed Solution

Three options, increasing complexity:

  1. New hook event: PreTextOutput (or PreTransmission). Hooks today fire on UserPromptSubmit, PreToolUse, PostToolUse, Stop, etc. Add a hook event that fires after Claude composes a draft response but before the user sees it. Stdout from the hook can rewrite, annotate, or block the draft. Users wire scans for length, banned phrases, posture compliance, geek vocabulary, restated questions, intent-mismatch detectors — whatever their project needs. Symmetric to existing hooks; small surface area to ship.
  2. Optional built-in self-review pass for high-stakes contexts. When configured (per-project or per-prompt), route the draft through a second forward pass: "given the user's standing posture, rules, preferences, and what was just asked — does this draft hold up? Rewrite if not." Slower and more expensive, but catches drift the input-time loaders can't reach.
  3. Magnitude-of-consequence threshold gate. For irreversible / high-blast-radius actions (file deletion, force push, production deploys, external API calls with side effects), require explicit human confirmation in a separate exchange — not acknowledgment within the existing message. Configurable threshold per project. Hard human-in-the-loop for the catastrophic-misinterpretation case.

Interaction model for option 1 (cheapest first step): add PreTextOutput to .claude/settings.json hooks alongside existing events. The hook script receives the draft on stdin; its stdout replaces the draft (or exits non-zero to block). Anthropic ships the event mechanism; project owners ship the policy.

Alternative Solutions

Everything I've tried, all friction-only, none enforcement:

  • Aggressive CLAUDE.md rules at the project root, auto-loaded every session. Drift still happens.
  • A .claude/rules/ directory with 20+ rule files (CHARTER, hard rules, operations, intent-gate, CSS-deploy-gate, encoding-safety, no-geekery, no-asking-permission). Auto-loaded every session. Drift still happens.
  • A ~/.claude/projects/.../memory/ directory with 40+ feedback memories indexed via MEMORY.md. The index loads; full file contents only on manual read. Drift still happens.
  • A CHARTER section mandating a visible-in-chat "thinking block" before every action. Specifically promoted because internal-only protocols were bypassable. Catches drift on actions — does NOT catch drift on communication style itself (the comm style IS the action; there's no separate moment to scan it).
  • A capsule-first intent gate for content tasks. Catches some intent drift, only on tasks I remember to fire the gate on.
  • Existing UserPromptSubmit and PreToolUse hooks. Cover input and tool-call surfaces. Do not cover Claude's text output.

What works today: the user catches the slip in real-time and corrects. That correction may become a new feedback memory, which may be promoted to a rule, which adds to the surface I can also fail to read. The cycle does not terminate.

What would actually close the gap: a hook that fires on Claude's draft output before it reaches the user. None of the existing alternatives provide this.

Priority

High - Significant impact on productivity

Feature Category

Developer tools/SDK

Use Case Example

I run a multi-project Claude Code setup with strict per-project posture rules — terse chat replies, no preamble, no "would you like me to," no geek vocabulary, project-specific brand voice. Rules codified across CLAUDE.md, .claude/rules/, and auto-memory.

  1. I'm in a long planning session in pane 1, 30+ tool calls deep into a single user turn.
  2. By internal step 15, the integrated posture has decayed in working context. The next response Claude composes is 30 lines long, opens with a preamble, restates my question, and ends with "would you like me to proceed?" — each of which I have explicit rules against.
  3. Without the feature: Claude transmits the response. I read it, get annoyed, correct. The correction costs my time and breaks flow. The pattern repeats turn after turn.
  4. With the feature (PreTextOutput hook): my hook script receives the draft on stdin. It runs my posture-check.sh: length > 5 lines unless artifact, preamble present, permission-asking present, geek terms present. Three of four fail. Hook rewrites the draft (or exits non-zero to force Claude to retry). I never see the failed version; my flow isn't broken.
  5. Same architecture catches the catastrophic case: if Claude's draft says "I'll delete all 474 products and recreate them" when I asked to "deactivate 2 duplicate listings," my intent-mismatch check fails on the magnitude delta and blocks until human confirmation arrives.

Realistic impact estimate (one project): correction overhead currently consumes ~15-25% of every long session. A pre-output hook with even modest scans cuts that to near-zero on everyday cases, and adds a real safety floor on irreversible actions.

Additional Context

This feature targets the same enforcement gap that motivated Constitutional AI and RLHF, but at a different layer. CAI/RLHF train the base model toward better default behavior in expectation. Per-user / per-project posture is too narrow and too domain-specific to bake into the base model — it has to live at the deployment surface, which is exactly where Claude Code already provides hooks for input and tool events. Adding the output-event hook completes the symmetry.

Closest analog in other tooling: pre-commit git hooks, which run scans against the staged diff before the commit lands. Mature pattern. Same pattern, applied to Claude's draft output.

The deeper architectural point: as agentic AI systems take more autonomous actions, the absence of a per-deployment last-mile checkpoint becomes one of the highest-leverage safety holes in the stack. Solving it at the Claude Code layer first — where the surface is small and the user base is technical — is a reasonable place to start before pushing similar primitives into the broader Anthropic API.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

claude-code - 💡(How to fix) Fix Need: pre-output / pre-transmission scan layer [1 participants]