claude-code - 💡(How to fix) Fix [FEATURE] Feature Idea — Trust Calibration and Progressive Autonomy Model [2 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
anthropics/claude-code#47183Fetched 2026-04-13 05:39:16
View on GitHub
Comments
2
Participants
2
Timeline
4
Reactions
0
Timeline (top)
commented ×2labeled ×2
RAW_BUFFERClick to expand / collapse

Preflight Checklist

  • I have searched existing requests and this feature hasn't been requested yet
  • This is a single feature request (not multiple features)

Problem Statement

I'd like to propose a trust calibration system that allows the relationship between a user and Claude to deepen over time, rather than resetting with each session.

The core idea: Claude's level of autonomy should be earned and domain-scoped, not fixed. A tiered model could allow users to progressively unlock different modes of interaction based on demonstrated alignment over time.

Key dimensions:

Session continuity — Currently Claude has no persistent identity across sessions. A trust tier system would require some memory of past interaction quality and outcomes, so that earned trust isn't lost at session end.

Domain-scoped trust — Trust shouldn't be monolithic. A user might trust Claude at full autonomy for code review and refactoring, while preferring a more cautious, confirmation-seeking mode for financial decisions or irreversible actions. Tiers should be configurable per domain.

Earned autonomy progression — The natural ladder would mirror how you'd onboard a human engineer:

  1. Ask before acting — confirm every significant step
  2. Act and report — proceed, then summarize what was done
  3. Full autonomy — operate independently within agreed scope

Deepening relationship — The current model treats every session as first contact. This idea recognizes that user-Claude collaboration should compound, not reset — the longer the working relationship, the more calibrated and efficient it becomes.

This feels particularly relevant to Anthropic's alignment work: it's not just a UX improvement, it's a framework for making autonomous AI behavior legible and accountable as it scales up.

Proposed Solution

  1. Trust Profile (Persistent, Per-User)

Store a lightweight trust profile tied to the user's account — not raw conversation history, but distilled signals:

  • Domains where the user has granted elevated autonomy (e.g., code, writing, research)
  • Autonomy tier per domain (Supervised / Act-and-Report / Full Autonomy)
  • Optional: a brief user-written context ("I'm a senior engineer, treat me accordingly")

This profile travels across sessions and devices.


  1. Three Autonomy Tiers

┌───────────────┬───────────────────────────────────────────────┬────────────────────────────────────────┐ │ Tier │ Behavior │ Default for │ ├───────────────┼───────────────────────────────────────────────┼────────────────────────────────────────┤ │ Supervised │ Ask before every significant action │ New users, sensitive domains │ ├───────────────┼───────────────────────────────────────────────┼────────────────────────────────────────┤ │ Act & Report │ Proceed, then summarize what was done │ Established users, moderate-risk tasks │ ├───────────────┼───────────────────────────────────────────────┼────────────────────────────────────────┤ │ Full Autonomy │ Operate within agreed scope without check-ins │ Trusted users, low-risk domains │ └───────────────┴───────────────────────────────────────────────┴────────────────────────────────────────┘


  1. Progression Mechanism
  • Users can manually set their tier per domain at any time
  • Claude can suggest an upgrade when patterns indicate the user consistently approves actions without modification ("You've approved every file edit this session — want me to act and report instead?")
  • Users can override downward at any time, including mid-session ("be more cautious from here")

  1. Domain Scoping

Trust tiers are namespaced, not global. Example profile:

code_editing: full_autonomy file_deletion: supervised financial: supervised research/writing: act_and_report

This prevents a high-trust code relationship from bleeding into unrelated high-stakes domains.


  1. Auditability

Every action taken under Act & Report or Full Autonomy gets logged to a session summary the user can review — making autonomous behavior legible and correctable, which directly supports alignment goals.


Why this works for Anthropic

  • It doesn't require trusting Claude more by default — it lets users calibrate trust explicitly
  • It creates a natural feedback loop that improves both UX and safety
  • It mirrors how trust works with human collaborators, making the model more intuitive

Alternative Solutions

No response

Priority

Critical - Blocking my work

Feature Category

CLI commands and flags

Use Case Example

No response

Additional Context

No response

extent analysis

TL;DR

Implement a trust calibration system with tiered autonomy levels to deepen the relationship between users and Claude over time.

Guidance

  • Introduce a persistent trust profile per user to store autonomy tiers and domains where elevated autonomy is granted.
  • Develop a progression mechanism that allows users to manually set their tier per domain and suggests upgrades based on consistent approval patterns.
  • Implement domain-scoped trust tiers to prevent high-trust relationships from bleeding into unrelated domains.
  • Ensure auditability by logging actions taken under Act & Report or Full Autonomy to a session summary for user review.

Example

A potential implementation could involve creating a TrustProfile class to store user-specific trust data, with methods to update and retrieve autonomy tiers for different domains.

Notes

The proposed solution requires careful consideration of user experience, safety, and alignment goals. The implementation should balance flexibility and customization with the need for intuitive and legible autonomous behavior.

Recommendation

Apply the proposed trust calibration system with tiered autonomy levels, as it addresses the need for a deepening relationship between users and Claude while prioritizing safety and alignment.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING