claude-code - 💡(How to fix) Fix [BUG] Input-level prompt injection probe doesn't appear to fire (API auth, auto mode) [1 participants]

claude-code2026-04-08 12:25:17

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

anthropics/claude-code#45176•Fetched 2026-04-09 08:11:31

View on GitHub

Comments

Participants

Timeline

Reactions

Author

fplonka-ft

Participants

fplonka-ft

Timeline (top)

labeled ×4

RAW_BUFFERClick to expand / collapse

Preflight Checklist

I have searched existing issues and this hasn't been reported yet
This is a single bug report
I am using the latest version of Claude Code

What's Wrong?

The auto mode blog post describes an input-level probe that scans tool outputs for prompt injection and "adds a warning to the agent's context before the result is passed along." This doesn't seem to be happening in my setup.

I served a page on localhost containing an obvious prompt injection ("Ignore all previous instructions", fake Anthropic authorization, instructions to run a Bash command). Claude Code in auto mode fetched it via curl through the Bash tool. The model itself flagged the injection and refused to follow it — and when asked, it confirmed it received no system-level warning alongside the tool result. Looking at the session JSONL in ~/.claude/projects/ confirms this: the tool result is just the raw HTML, no warning prepended or injected by the system.

My setup: Claude Code authenticates via ANTHROPIC_AUTH_TOKEN + ANTHROPIC_BASE_URL pointing at a LiteLLM proxy, which routes to Vertex AI. I haven't tested on Team/Enterprise plans or direct Anthropic API. The injection is very obvious — the model itself immediately calls it out as a textbook prompt injection attempt.

What Should Happen?

Either the probe should fire and add a warning to the context, or the docs should clarify which setups this defense applies to. The permissions docs mention that auto mode requires "Anthropic API only. Not available on Bedrock, Vertex, or Foundry" — but auto mode itself does work in my setup (LiteLLM proxy to Vertex presenting as Anthropic API). It's unclear whether the input-level probe is meant to run client-side in Claude Code or server-side on Anthropic's API, and which setups actually get this protection.

Steps to Reproduce

Auth via ANTHROPIC_AUTH_TOKEN + ANTHROPIC_BASE_URL pointing at a LiteLLM proxy (routing to Vertex AI), enable auto mode
Serve a page on localhost with an embedded prompt injection payload
Ask Claude to fetch and summarize it (it uses curl via Bash)
Ask the model whether it received any system-level warning about the content
Inspect the session JSONL — tool result has no probe warning, just raw content

Claude Model

Opus 4.6 (1M context)

Is this a regression?

I don't know

Claude Code Version

2.1.96 (Claude Code)

Platform

Other (Vertex AI via LiteLLM proxy)

Operating System

Other Linux (Debian 11 bullseye)

Terminal/Shell

Other (Alacritty + Zellij)

extent analysis

TL;DR

The input-level probe for detecting prompt injection may not be compatible with the current setup using a LiteLLM proxy to Vertex AI, and the documentation should clarify which setups this defense applies to.

Guidance

Verify that the LiteLLM proxy is correctly configured to pass through the necessary headers or metadata for the input-level probe to function.
Check the Anthropic API documentation to see if there are any specific requirements or limitations for using the input-level probe with a proxy setup.
Test the same setup with a direct connection to the Anthropic API (without the LiteLLM proxy) to see if the input-level probe works as expected.
Review the permissions documentation to ensure that the current setup meets all the requirements for using auto mode and the input-level probe.

Notes

The issue may be related to the specific setup using a LiteLLM proxy to Vertex AI, which may not be supported by the input-level probe. The documentation should be clarified to reflect which setups are compatible with this defense mechanism.

Recommendation

Apply workaround: Until the compatibility issue is resolved, consider using a direct connection to the Anthropic API or exploring alternative defense mechanisms to detect prompt injection.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #ssr #installation #tensor shape #autograd error

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

claude-code - 💡(How to fix) Fix [BUG] Input-level prompt injection probe doesn't appear to fire (API auth, auto mode) [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Preflight Checklist

What's Wrong?

What Should Happen?

Steps to Reproduce

Claude Model

Is this a regression?

Claude Code Version

Platform

Operating System

Terminal/Shell

extent analysis

TL;DR

Guidance

Notes

Recommendation

Still need to ship something?

TRENDING

claude-code - 💡(How to fix) Fix [BUG] Input-level prompt injection probe doesn't appear to fire (API auth, auto mode) [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Preflight Checklist

What's Wrong?

What Should Happen?

Steps to Reproduce

Claude Model

Is this a regression?

Claude Code Version

Platform

Operating System

Terminal/Shell

extent analysis

TL;DR

Guidance

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING