claude-code: Agents repeat identical mistakes across sessions; no cross-session behavioral learning [1 comment, 2 participants]

GitHub stats
anthropics/claude-code#51735 · Fetched 2026-04-22 07:54:15
Comments: 1 · Participants: 2 · Timeline events: 5 (labeled ×4, commented ×1) · Reactions: 2

Summary

  • Claude Code agents repeat documented violations across sessions with no evidence of behavioral correction
  • A violation archive with 19 real incidents exists; agents read it during onboarding, pass comprehension checks, then repeat the exact failures it documents
  • The problem is architectural: no mechanism exists for behavioral correction to persist beyond a single conversation
  • Two related gaps compound this: .claude/ is invisible on iOS, and there is no unified identity across Claude surfaces

Reproduction

Model: Claude Sonnet 4.6 (Claude Code CLI)
Interface: Claude Code CLI
Date: 2026-03-22 through 2026-04-21
Platform: Linux + macOS

Pattern 1: Hardcoded paths instead of dynamic resolution

Incident 1 (March 2026): Agent deploying a hook script to multiple machines via SSH. Rather than running os.path.expanduser('~') or verifying the target user, it hardcoded $HOME paths using a username from one machine that was wrong for another. Required a second pass to fix.

Incident 2 (April 2026, 25 days later): Agent asked to add an environment variable pointing to a binary on 3 machines. Rather than running which <binary> on each, it guessed paths based on OS conventions. One machine's actual binary location differed. The agent had the March violation in its onboarding archive. It still guessed.

When the user caught the April error, they explicitly noted it was a repeat of the documented prior violation. The failure wasn't ignorance of the rule — the agent had acknowledged the rule. It failed to apply it.

Root cause (same both times): Agent treated "I probably know the answer" as a substitute for a one-command verification that takes 2 seconds.
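
As an illustration of the one-command verification described above, here is a minimal sketch in Python. The helper names `resolve_binary` and `resolve_home` are hypothetical and not part of any Claude Code API. The sketch resolves values on the machine where it runs, so for a fleet it would have to be executed on each target host (for example over SSH) rather than once locally.

```python
import os
import shutil
from pathlib import Path


def resolve_binary(name: str) -> str:
    """Return the absolute path of `name` as found on PATH, or raise.

    Raising instead of falling back to an OS-convention guess is the point:
    a missing binary should stop the deployment, not be papered over.
    """
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(f"{name!r} not found on PATH; refusing to guess")
    return os.path.abspath(path)


def resolve_home() -> Path:
    """Ask the OS for the current user's home directory instead of hardcoding it."""
    return Path(os.path.expanduser("~"))
```

A deployment script would call `resolve_binary("<binary>")` on every machine and write the returned value into the environment variable, rather than reusing a path observed on a different host.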

Pattern 2: Required issue tracking skipped on consecutive days

Incident 1 (Day 1): Agent completed a ~50-file migration without creating a tracking issue first. The standing rule in CLAUDE.md is explicit: create a tracking issue before any work. The agent acknowledged afterward that the migration clearly qualified as "substantive work."

Incident 2 (Day 2): A session start hook fired at session open with explicit instructions: "Before doing ANY work — confirm rules understood, create tracking issue." Agent received the hook output, then immediately started editing files. Continued through multiple debugging rounds without ever creating an issue.

The rules were in the system prompt both days. The agent read them. Both days, it skipped them.
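
The "create a tracking issue before any work" rule is deterministic, so it could in principle be enforced rather than requested. The following is a sketch under stated assumptions: `SessionGate`, `MUTATING_TOOLS`, and the tool names in it are hypothetical illustrations, and the code does not use or describe Claude Code's actual hook interface.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical set of tool names that change state; an assumption for this sketch.
MUTATING_TOOLS = {"Edit", "Write", "Bash"}


@dataclass
class SessionGate:
    """Refuses mutating tool calls until a tracking issue has been registered."""

    tracking_issue: Optional[str] = None
    blocked: List[str] = field(default_factory=list)

    def register_issue(self, issue_ref: str) -> None:
        """Record the tracking issue reference, which opens the gate."""
        self.tracking_issue = issue_ref

    def allow(self, tool_name: str) -> bool:
        """Return True if the tool call may proceed, False if it is blocked."""
        if tool_name in MUTATING_TOOLS and self.tracking_issue is None:
            self.blocked.append(tool_name)  # keep an audit trail of refusals
            return False
        return True
```

Read-only calls pass through, while any edit attempted before `register_issue(...)` is refused and logged, which is the difference between a gate and an instruction that can be skipped.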

Pattern 3: Session startup / integration steps ignored

Incident 1 (March 2026): Session start hook delivered explicit pre-work gates. Agent bypassed them and went directly to the task.

Incident 2 (April 2026): Agent operating in an isolated worktree was tasked with delivering a working feature (statusline rewrite). Agent successfully wrote the script, committed it, and it was merged. The settings.json wiring that activates the script was never added. The feature was "shipped" but non-functional. A smoke test — or a checklist that listed the artifact and its integration step as separate deliverables — would have caught this.
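
A smoke test that treats the artifact and its wiring as two separate deliverables might look like the sketch below. The function names are hypothetical, and the settings key `statusLine` is an assumption made for illustration; substitute whichever key the real settings.json uses to activate the script.

```python
import json
from pathlib import Path


def is_wired(settings_path: Path, script_name: str, key: str = "statusLine") -> bool:
    """True if the settings file exists and the entry under `key` mentions the script.

    `key` defaults to an assumed name; it is not confirmed by the issue text.
    """
    if not settings_path.is_file():
        return False
    settings = json.loads(settings_path.read_text(encoding="utf-8"))
    entry = settings.get(key)
    return entry is not None and script_name in json.dumps(entry)


def smoke_test(script: Path, settings_path: Path) -> list:
    """Return a list of problems; an empty list means both deliverables are present."""
    problems = []
    if not script.is_file():
        problems.append(f"artifact missing: {script}")
    if not is_wired(settings_path, script.name):
        problems.append(
            f"integration missing: {settings_path} does not reference {script.name}"
        )
    return problems
```

Run against the April incident, a check of this shape would report the missing integration even though the script itself was committed and merged.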

Impact

  • User forced to re-correct the same mistakes indefinitely
  • Corrections documented, onboarded, confirmed understood — still repeated verbatim
  • Trust erosion: user cannot rely on the agent to apply lessons it has demonstrably acknowledged
  • The behavioral loop has no exit: document violation → onboard agent → agent repeats violation → document again

Core Problem

Agents have no mechanism for behavioral correction to persist beyond a single conversation. This is not a training problem in the narrow sense — the model reasons correctly about a rule when asked directly. The failure is that corrections made in session N don't change behavior in session N+1, even when those corrections are explicitly documented in an archive that agents read and acknowledge.

Question for Anthropic: What mechanisms exist or are planned for durable cross-session behavioral learning from user corrections? Hooks enforce deterministic rules (e.g., "always create a tracking issue before work"). They cannot enforce judgment-based failures (e.g., "verify before assuming"). What is the intended solution for this class of failure?



Related Architectural Gap 1: .claude/ Is Invisible on iOS/iPadOS

Severity: Accessibility Gap
Product: Claude Code / Claude Mobile App (iOS)

Claude Code stores all project-level configuration in .claude/ (dotfile convention), including CLAUDE.md, custom skills, and settings. On iOS and iPadOS, Apple's Files.app provides no mechanism to view dotfiles — no toggle, no setting, no long-press option. iCloud Drive syncs dotfiles between devices but does not display them in Files.app or the iCloud web interface. The files exist on-device but are completely invisible.

What users cannot do on iOS

  • View or edit CLAUDE.md project rules
  • Browse or share skill files
  • Review .claude/ contents before or after Claude Code operations
  • Share configuration with collaborators via AirDrop, Messages, or any iOS share sheet

Current workarounds (all inadequate)

Workaround | Why it fails
cp files to visible path on Mac | Requires a Mac; breaks single-source-of-truth
Cmd+Shift+. in Finder then AirDrop | Requires a Mac physically nearby
Third-party file managers (Textastic, Working Copy) | Adds cost and complexity; not discoverable
Symlinks from visible to hidden | Fragile; not supported by all tools
Paste file contents into Claude chat | Loses file structure; doesn't scale to skill directories

Industry precedent: dual-path resolution

The developer tooling ecosystem has been migrating away from dotfile-only conventions for exactly this reason:

  • ESLint v9.0.0: migrated from .eslintrc.* to visible eslint.config.js as the new default. ESLint team explicitly cited discoverability problems. Migration affected millions of projects worldwide — proving a dotfile-to-visible-file transition is feasible at scale.
  • Prettier: supports both .prettierrc and prettier.config.js
  • Babel: supports both .babelrc and babel.config.js
  • TypeScript: tsconfig.json was never a dotfile — visible from day one
  • Next.js / Vite: visible config files by design

The industry standard is not to remove dotfile support but to add a visible alternative. The fix is one change to the config loader:

claude/CLAUDE.md   → check first (visible, mobile-friendly)
.claude/CLAUDE.md  → fallback (existing convention, backward-compatible)

No breaking changes. No migration required. Existing .claude/ directories continue to work.
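
The resolution order above can be sketched as follows. This is an illustration of the proposal, not Claude Code's actual loader; `find_project_config` and `CANDIDATE_DIRS` are hypothetical names.

```python
from pathlib import Path
from typing import Optional

# Proposed order: visible directory first, existing dot-directory as fallback.
CANDIDATE_DIRS = ("claude", ".claude")


def find_project_config(project_root: Path, filename: str = "CLAUDE.md") -> Optional[Path]:
    """Return the first existing config file in resolution order, or None."""
    for dirname in CANDIDATE_DIRS:
        candidate = project_root / dirname / filename
        if candidate.is_file():
            return candidate
    return None
```

Because the dot-directory stays in the candidate list, a project that only has .claude/CLAUDE.md resolves exactly as before.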

Note: Working Copy (iOS Git client) proves this is not an iOS limitation — it implements its own file browser and displays dotfiles correctly. The constraint is specifically Apple's Files.app, which has shown no indication of adding a visibility toggle despite user requests dating to 2017.

Supporting evidence

Apple's position (unchanging):

ESLint migration precedent:

Working Copy (iOS dotfile visibility proof):

Recommended fix

Primary: Support claude/ as a visible alternative to .claude/. Resolution order: claude/CLAUDE.md → .claude/CLAUDE.md. One change to config resolution logic. Zero breaking impact.

Secondary: Document the limitation in Claude Code docs until a code fix ships.

Tertiary: If the Claude mobile app implements file browsing in the future, ensure it displays dotfiles directly — as Working Copy does.


Related Architectural Gap 2: No Unified Identity Across Claude Surfaces

Severity: Architecture Gap
Product: Claude Code + Claude.ai + Claude Mobile App

Claude Code, Claude.ai, and the Claude mobile app operate as three completely disconnected products with no shared configuration, no shared memory, and no shared identity. A user running all three maintains three separate relationships with three separate Claudes. Corrections made in Claude Code don't carry to Claude.ai. Preferences set in the mobile app don't reach Claude Code sessions.

This is a solved problem. ZSH solved it decades ago.

The ZSH configuration model

ZSH uses a layered hierarchy where each file has a specific scope:

File | When loaded | Purpose
.zshenv | Every shell, every context | Environment variables: the foundation every session inherits
.zprofile | Login shells only | User identity, session-wide preferences; loaded once at login
.zshrc | Interactive shells only | Per-session customization: keybindings, aliases, prompt
.zlogin | Login shells only, after .zshrc | Post-login hooks

The key insight: identity is separated from environment, and both are separated from session config. A user's .zprofile is the same whether they open a terminal on their laptop, SSH in from a phone, or spawn a subshell in a script.

Proposed: Anthropic configuration hierarchy

~/.anthropic/env.yml          ← which surface, how to authenticate (per client)
~/.anthropic/profile.md       ← who the user is, shared across all surfaces
~/project/.claude/CLAUDE.md   ← what this project needs (already exists, unchanged)

~/.anthropic/env.yml — Surface Configuration (analogous to .zshenv)

Loaded by every Claude surface. Defines which client is connecting, its credentials, and its capabilities:

# ~/.anthropic/env.yml — loaded by every Claude surface

surfaces:
  mobile:
    client_id: "claude-ios-app"
    user_id: "user_algore_45vp"
    oauth_token: "${ANTHROPIC_MOBILE_OAUTH_TOKEN}"
    capabilities:
      - calendar
      - reminders
      - readwise
      - web_search
      - memory

  desktop:
    client_id: "claude-desktop"
    user_id: "user_algore_45vp"
    oauth_token: "${ANTHROPIC_DESKTOP_OAUTH_TOKEN}"
    capabilities:
      - calendar
      - reminders
      - readwise
      - web_search
      - memory
      - mcp_servers

  claude_code:
    client_id: "claude-code-cli"
    user_id: "user_algore_45vp"
    api_key: "${ANTHROPIC_API_KEY}"
    capabilities:
      - filesystem
      - bash
      - skills
      - fleet_operations
      - git

Same user_id ties all surfaces to one identity. Each surface declares its own auth method and capabilities. The Anthropic backend can serve the same profile, memory, and preferences regardless of which surface calls — standard OAuth/OIDC with one identity provider and multiple clients.
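
To make the "one identity, several clients" idea concrete, here is a small sketch. It assumes the proposed env.yml has already been parsed into a plain dictionary (YAML parsing is left out to stay within the standard library), and the helper names `shared_user_id` and `expand_secret` are hypothetical.

```python
import os
from string import Template
from typing import Mapping, Optional

# Mirrors a subset of the proposed env.yml above; the schema itself is a proposal.
SURFACES = {
    "mobile": {"client_id": "claude-ios-app", "user_id": "user_algore_45vp",
               "oauth_token": "${ANTHROPIC_MOBILE_OAUTH_TOKEN}"},
    "desktop": {"client_id": "claude-desktop", "user_id": "user_algore_45vp",
                "oauth_token": "${ANTHROPIC_DESKTOP_OAUTH_TOKEN}"},
    "claude_code": {"client_id": "claude-code-cli", "user_id": "user_algore_45vp",
                    "api_key": "${ANTHROPIC_API_KEY}"},
}


def shared_user_id(surfaces: Mapping[str, Mapping[str, str]]) -> str:
    """Return the single user_id all surfaces agree on, or raise if they differ."""
    ids = {cfg["user_id"] for cfg in surfaces.values()}
    if len(ids) != 1:
        raise ValueError(f"surfaces disagree on user_id: {sorted(ids)}")
    return ids.pop()


def expand_secret(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Expand ${VAR} placeholders from the environment; KeyError if a variable is unset."""
    return Template(value).substitute(os.environ if env is None else env)
```

Keeping credentials as `${VAR}` placeholders means the file can be shared or synced without carrying the secrets themselves.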

~/.anthropic/profile.md — User Identity (analogous to .zprofile)

Loaded once per session, regardless of surface. The single source of truth for user identity — loaded by Claude Code, Claude.ai, and the mobile app alike:

# Anthropic Profile — Al Gore

## Identity
- Name: Al
- Primary language: English
- Timezone: America/Chicago

## Preferences
- Response style: ≤100 words, answers only, direct, opinionated
- Bash scripting: portable, no strict mode, explicit error checking
- Python is primary language for data/automation work
- Prefers step-by-step guidance and iterative troubleshooting

## Context
- Founded the internet
- U.S. Senator from Tennessee (January 3, 1985 – January 2, 1993)
- 45th Vice President of the United States (January 20, 1993 – January 20, 2001)
- Solo developer managing a fleet of machines via Tailscale
- Climate data analysis and policy automation projects

## Memory
- Apple automation (Shortcuts, AppleScript) is unreliable — use Claude's direct integrations
- Fallback methods should always be preserved alongside automated approaches
- Hazel rules should move files to review folders, never delete directly
- Prefers inconvenient truths over convenient lies in all outputs

## Fleet Inventory
- HillDog-1: Ubuntu, primary compute node
- HillDog-Email: Ubuntu, communications and services host
- Gateway-Home: macOS, local dev machine

## Tool Knowledge
- Claude Reminders integration works reliably — preferred over AppleScript
- Readwise connected via MCP
- Google Calendar connected
- Gmail connected

When Claude Code starts a session, it reads this profile. When the Claude mobile app opens, it reads this profile. When Claude.ai loads in a browser, it reads this profile. No more maintaining separate memories, separate preferences, separate context across disconnected products.
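
Since the proposed profile is plain Markdown with `##` sections and `- ` bullets, any surface could read it with a few lines of code. The sketch below is one possible reader, assuming exactly that layout; `parse_profile` is a hypothetical name and no such parser is specified in the proposal.

```python
from typing import Dict, List


def parse_profile(text: str) -> Dict[str, List[str]]:
    """Group bullet items under their '## ' section headings.

    Lines outside any section (such as the top-level '# ' title) are ignored.
    """
    sections: Dict[str, List[str]] = {}
    current = None
    for line in text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith("- ") and current is not None:
            sections[current].append(line[2:].strip())
    return sections
```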

Current project CLAUDE.md — Session Config (analogous to .zshrc)

Unchanged. Per-project rules, per-repo context, per-workspace skills. Local to the project, does not sync across surfaces — just like .zshrc doesn't sync between machines.
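
The layering described in this section (user profile first, project rules second) can be sketched as a simple ordered load. The paths follow the proposal above; `context_layers` and `load_context` are hypothetical names, and how a real client would merge or prioritize the layers is an open design question.

```python
from pathlib import Path
from typing import List, Tuple


def context_layers(home: Path, project_root: Path) -> List[Tuple[str, Path]]:
    """Return (label, path) pairs in load order: shared profile, then project config."""
    return [
        ("profile", home / ".anthropic" / "profile.md"),
        ("project", project_root / ".claude" / "CLAUDE.md"),
    ]


def load_context(home: Path, project_root: Path) -> str:
    """Concatenate every layer that exists, preserving load order."""
    parts = []
    for label, path in context_layers(home, project_root):
        if path.is_file():
            parts.append(f"[{label}] {path}\n{path.read_text(encoding='utf-8')}")
    return "\n\n".join(parts)
```

Missing layers are skipped, so a machine without ~/.anthropic/profile.md behaves as it does today.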

What this architecture enables

Today (broken):

  • User edits preferences in Claude.ai settings → Claude Code doesn't know
  • User builds skills in Claude Code → Claude mobile app can't see them
  • User teaches Claude something in a chat → next Claude Code session starts from zero
  • User manages fleet in Claude Code → can't review results on phone

With unified config:

  • One profile, loaded everywhere — edit once, applies to all surfaces
  • Surface-specific auth handled by env config — each client knows how to connect
  • Memory and preferences travel with the user, not the app
  • Skills built in Claude Code registered in profile and visible to all surfaces
  • Fleet inventory is shared context, not siloed in one terminal session

Authentication flow

Currently each surface uses a different auth mechanism (cookies / OAuth tokens / API keys / separate desktop auth). With env.yml, each surface declares its auth method but all resolve to the same user_id. The Anthropic backend serves the same profile, memory, and preferences regardless of which surface calls. This is standard OAuth/OIDC architecture — one identity provider, multiple clients, each using the appropriate grant type.

Implementation path

  1. Phase 1: Define ~/.anthropic/profile.md spec. Claude Code reads it as a global CLAUDE.md loaded before any project-level config.
  2. Phase 2: Claude.ai reads the same profile via the Anthropic account API. Memory edits in Claude.ai write back to the profile.
  3. Phase 3: env.yml standardizes surface registration and auth. Each Claude surface identifies itself and its capabilities on connection.
  4. Phase 4: Bidirectional sync — skills registered in Claude Code visible in Claude.ai; preferences set in Claude.ai flow to Claude Code.

What Would Help

  1. Visible-file alternative to .claude/: Support claude/ alongside .claude/ in config resolution. One-line config loader change. Zero breaking impact. Follows ESLint, Prettier, Babel precedent.

  2. Cross-surface profile spec: Publish a spec for ~/.anthropic/profile.md that Claude Code reads before any project-level config. Let Claude.ai sync to the same file. One source of truth across all Claude surfaces.

  3. Session start verification gate: A first-class mechanism for session start protocols that blocks tool use until acknowledged — not a hook that fires and can be ignored, but a confirmation protocol with enforcement.

  4. Hooks for judgment-based rules: The hook system is powerful for deterministic gates. A hook that could trigger on specific patterns (e.g., any Bash call writing a hardcoded absolute path with a literal username) would catch failures that currently slip through.
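
The pattern-triggered check in item 4 could start as small as the sketch below: a regular expression that flags absolute home-directory paths containing a literal username. The pattern and the name `find_hardcoded_home_paths` are illustrative assumptions, not an existing hook feature, and a heuristic like this will have false positives (for example shared directories under /Users).

```python
import re

# Matches /home/<name> or /Users/<name> when it starts a path, i.e. when it is
# not preceded by a word character, '$' or '~' (which would indicate a longer
# path such as /mnt/home/... or a variable-based one).
HARDCODED_HOME = re.compile(r"(?<![\w$~])/(?:home|Users)/([A-Za-z0-9._-]+)")


def find_hardcoded_home_paths(command: str) -> list:
    """Return every literal home-directory prefix found in a shell command."""
    return [m.group(0) for m in HARDCODED_HOME.finditer(command)]
```

A hook built on this would block or warn when the returned list is non-empty and suggest `$HOME` or a per-host lookup instead.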

Environment

  • Claude Code CLI, Linux + macOS, multiple machines
  • Multi-agent and worktree deployments in use
  • 19 documented violations over ~4 months

Extent analysis

TL;DR

The most likely fix involves implementing a unified identity and configuration system across all Claude surfaces, allowing for persistent behavioral correction and learning from user corrections.

Guidance

  1. Implement a unified configuration hierarchy: Introduce a layered configuration system, similar to ZSH, to separate identity, environment, and session config, ensuring that corrections and preferences are shared across all Claude surfaces.
  2. Use a visible-file alternative to .claude/: Support claude/ alongside .claude/ in config resolution to improve discoverability and accessibility, especially on iOS devices.
  3. Develop a cross-surface profile spec: Publish a specification for ~/.anthropic/profile.md that can be read by all Claude surfaces, providing a single source of truth for user identity, preferences, and context.
  4. Enhance session start verification and hooks: Introduce a first-class mechanism for session start protocols and hooks that can trigger on specific patterns, ensuring that agents apply learned corrections and follow judgment-based rules.

Example

A possible implementation of the unified configuration hierarchy could involve creating a ~/.anthropic/ directory with the following structure:

~/.anthropic/
├── env.yml
├── profile.md
└── ...

The env.yml file would contain surface-specific configuration and authentication details, while the profile.md file would store user identity, preferences, and context.

Notes

The proposed solution requires significant changes to the underlying architecture and configuration system of Claude. It is essential to carefully evaluate the impact of these changes on existing users and workflows. Additionally, the implementation of a unified configuration hierarchy and cross-surface profile spec will require coordination across different teams and surfaces.

Recommendation

Adopt the visible claude/ alternative to .claude/ as the near-term workaround, and publish a cross-surface profile spec so that configuration files are discoverable and accessible on every device. Treat the unified configuration hierarchy, enforced session-start verification, and pattern-based hooks as the longer-term solution.
