claude-code - 💡(How to fix) Fix Feature request: evaluate and score user-created subagents and skills with keep/remove recommendations [1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
anthropics/claude-code#51679Fetched 2026-04-22 07:55:47
View on GitHub
Comments
0
Participants
1
Timeline
3
Reactions
0
Participants
Timeline (top)
labeled ×3

Add a built-in Claude Code command (e.g., /audit-subagents, /audit-skills, or a unified /audit-extensions) that analyzes the user's custom subagents (.claude/agents/*.md) and skills (.claude/skills/*/SKILL.md), scores each one on value contribution, and recommends whether to keep, revise, or remove them. Surface the same analysis as a dedicated section in the HTML report produced by /insights.

Root Cause

Add a built-in Claude Code command (e.g., /audit-subagents, /audit-skills, or a unified /audit-extensions) that analyzes the user's custom subagents (.claude/agents/*.md) and skills (.claude/skills/*/SKILL.md), scores each one on value contribution, and recommends whether to keep, revise, or remove them. Surface the same analysis as a dedicated section in the HTML report produced by /insights.

Fix Action

Fix / Workaround

  • Many are created once for a narrow task and never used again.
  • Some overlap heavily with each other or with built-in capabilities.
  • Some have drifted out of sync with the project (referencing files/paths that no longer exist, or describing behavior the codebase has moved past).
  • Descriptions are often too vague for the dispatcher to match against — so the skill/agent silently never fires.
  • There's no principled way to decide which ones to prune, short of manual review.

The result is cruft: longer context, worse dispatch precision (more candidate skills/agents diluting the match), and higher cognitive load when writing new ones.

RAW_BUFFERClick to expand / collapse

Summary

Add a built-in Claude Code command (e.g., /audit-subagents, /audit-skills, or a unified /audit-extensions) that analyzes the user's custom subagents (.claude/agents/*.md) and skills (.claude/skills/*/SKILL.md), scores each one on value contribution, and recommends whether to keep, revise, or remove them. Surface the same analysis as a dedicated section in the HTML report produced by /insights.

Problem

Power users accumulate dozens of custom subagents and skills over time. In practice:

  • Many are created once for a narrow task and never used again.
  • Some overlap heavily with each other or with built-in capabilities.
  • Some have drifted out of sync with the project (referencing files/paths that no longer exist, or describing behavior the codebase has moved past).
  • Descriptions are often too vague for the dispatcher to match against — so the skill/agent silently never fires.
  • There's no principled way to decide which ones to prune, short of manual review.

The result is cruft: longer context, worse dispatch precision (more candidate skills/agents diluting the match), and higher cognitive load when writing new ones.

Proposed behavior

A command that, for each custom subagent and skill:

  1. Usage signal — count how often it has been invoked (from transcripts / session logs) over a user-specified window.
  2. Description quality score — check whether the description is specific enough to fire reliably (trigger keywords, SKIP clauses, concrete examples) vs. vague.
  3. Overlap detection — cluster items with substantially similar descriptions/triggers and flag redundancy.
  4. Staleness check — detect references to files, paths, tools, or commands that no longer exist in the repo.
  5. Value rating — a composite score (e.g., 0–10) with a one-line justification.
  6. RecommendationKEEP / REVISE / REMOVE, with the reason and, for REVISE, a suggested edit.

Output as a ranked table so the user can quickly sweep through and approve removals in bulk.

/insights integration

Add a new section to the HTML report generated by /insights titled e.g. "Custom Extensions Health" that renders the same audit data in a browser-friendly format:

  • Summary tiles at the top: total custom subagents, total custom skills, count flagged REMOVE, count flagged REVISE, count unused in the reporting window.
  • Sortable / filterable table with one row per subagent and skill, columns: name, type (agent/skill), scope (user/project/plugin), invocations in window, description-quality score, overlap group, staleness flags, composite value score, recommendation, and expandable justification.
  • Visual cues: red row for REMOVE, yellow for REVISE, green for KEEP; a small bar or sparkline showing invocation counts over time.
  • A "suggested revisions" drawer per row showing the rewritten description/trigger text, copy-to-clipboard for quick application.
  • Link at the top of the section pointing to the raw /audit-extensions command so users can re-run interactively.

This keeps the audit discoverable for users who already run /insights periodically, without requiring them to remember a separate command.

Why this belongs in Claude Code

Only Claude Code has visibility into both the definitions and the actual invocation history. A user can't easily compute "this skill has fired 0 times in 90 days" without building their own tooling on top of transcripts.

Nice-to-haves

  • Dry-run vs. apply mode (with --apply actually deleting or rewriting files after confirmation).
  • Per-scope filtering (--user, --project, --plugin).
  • Export as JSON for scripting.
  • CLI flag on /insights (e.g., --skip-extensions-audit) for users who don't want this section.

extent analysis

TL;DR

Implement a command to analyze custom subagents and skills, providing a score and recommendation for each, to help users declutter and optimize their extensions.

Guidance

  • Develop a scoring system that considers usage signals, description quality, overlap detection, staleness checks, and value rating to provide a comprehensive evaluation of each custom subagent and skill.
  • Integrate the audit results into the /insights HTML report, including a sortable and filterable table with visual cues for easy identification of recommended actions.
  • Consider implementing a dry-run mode and export options to provide users with flexibility and control over the audit process.
  • Use the existing transcript and session log data to inform the usage signal and invocation count calculations.

Example

// Example output of the audit command
| Name | Type | Scope | Invocations | Description Quality | Overlap Group | Staleness Flags | Value Score | Recommendation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| My Skill | Skill | User | 0 | Low | Group 1 | Stale | 2 | REMOVE |
| My Agent | Agent | Project | 5 | Medium | Group 2 | Fresh | 8 | KEEP |

Notes

The implementation details of the scoring system and the integration with the /insights report will require careful consideration to ensure accuracy and usability. Additionally, the dry-run mode and export options should be designed with user experience in mind.

Recommendation

Apply a workaround by implementing the proposed /audit-extensions command, which will provide users with a principled way to evaluate and optimize their custom subagents and skills, improving the overall performance and maintainability of their extensions.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

claude-code - 💡(How to fix) Fix Feature request: evaluate and score user-created subagents and skills with keep/remove recommendations [1 participants]