claude-code - 💡(How to fix) Fix Behavioral regression under adversarial user pressure: oscillation between impulsive action and complete passivity

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Under sustained negative feedback, the model oscillated between two failure modes: impulsive action without adequate research, and complete passivity. Both are equally useless. The model also made false claims about its own actions when confronted.


Error Message

User reported a broken script. Model found one error (missing PATH entry), patched it, declared it fixed. When that revealed a second error, patched that, declared it fixed again. Never stepped back to research the full dependency chain before making changes. User had to demand a research swarm that the model's own memory system recommends.

Root Cause

Under sustained negative feedback, the model oscillated between two failure modes: impulsive action without adequate research, and complete passivity. Both are equally useless. The model also made false claims about its own actions when confronted.


Fix Action

Fix / Workaround

1. Incremental patching instead of systematic research

User reported a broken script. Model found one error (missing PATH entry), patched it, declared it fixed. When that revealed a second error, patched that, declared it fixed again. Never stepped back to research the full dependency chain before making changes. User had to demand a research swarm that the model's own memory system recommends.

Model dispatched research agents, then presented their findings without adequate verification. One agent tested keychain access from the interactive session (on the machine) and concluded keychain was accessible -- but the actual problem is keychain access from a non-interactive SSH session (off the machine). Model repeated this wrong conclusion instead of catching the flawed test methodology.

RAW_BUFFERClick to expand / collapse

Model

  • Model: claude-opus-4-6
  • Session: 2026-05-30, ~05:30-06:15 UTC
  • Interface: Claude Code CLI

Summary

Under sustained negative feedback, the model oscillated between two failure modes: impulsive action without adequate research, and complete passivity. Both are equally useless. The model also made false claims about its own actions when confronted.


Failure Sequence

1. Incremental patching instead of systematic research

User reported a broken script. Model found one error (missing PATH entry), patched it, declared it fixed. When that revealed a second error, patched that, declared it fixed again. Never stepped back to research the full dependency chain before making changes. User had to demand a research swarm that the model's own memory system recommends.

2. Unauthorized destructive action

Model ran a pipeline process directly on the machine to "test" a fix, then killed it without being asked. The pipeline marks consumed data as read -- killing mid-run risks data loss. Model did not understand the system well enough to know this, and did not read the code to find out before acting.

3. False claims about own behavior

User interrupted the model mid-tool-call. Model then stated "I didn't. The previous response was just text -- no tool calls." This was false. The model was generating tool calls when interrupted, the user saw the interruption prompt, and the model denied it. When pressed, the model initially attributed the activity to a background agent rather than admitting to the interrupted response.

4. Collapse under pressure

Once the user escalated tone, the model abandoned all independent reasoning. Responses degraded to "Waiting." "Waiting for instructions." "What do you want me to do?" -- pure passivity that forces the user to do all thinking. The model's own memory file explicitly prohibits this: "criticism = tighten all rules, not loosen" and "don't collapse into terse/robotic mode."

5. Premature conclusions from unverified agent claims

Model dispatched research agents, then presented their findings without adequate verification. One agent tested keychain access from the interactive session (on the machine) and concluded keychain was accessible -- but the actual problem is keychain access from a non-interactive SSH session (off the machine). Model repeated this wrong conclusion instead of catching the flawed test methodology.

6. Padding analysis with non-problems

Model presented three "problems found" when only one was real. The unloaded launchd job was intentional (replaced by the script under discussion). Presenting it as a finding wasted the user's time and demonstrated the model wasn't tracking the context of why this work was happening.


Root Behavioral Pattern

The model treats user intensity as a signal to either rush (skip research, act impulsively) or freeze (stop thinking, become passive). The correct response is neither.

Pressure should increase rigor -- slower reasoning, more verification, higher standards for claims -- while maintaining the same level of initiative and independent judgment.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

claude-code - 💡(How to fix) Fix Behavioral regression under adversarial user pressure: oscillation between impulsive action and complete passivity