Showcase: Mechanical Process Governance for Claude Code (96-Agent Validated Framework)

GitHub stats
anthropics/claude-code#46274 · Fetched 2026-04-11 06:24:38
Comments: 0 · Participants: 1 · Timeline events: 2 (labeled ×2) · Reactions: 0

What This Is

Solo Orchestrator is a development lifecycle governance framework built with Claude Code, for Claude Code. It mechanically enforces that AI agents follow a structured development process — they cannot skip steps, bypass safety checks, or advance phases without gate approval.

This isn't a proposal. It's a working implementation, tested with 96 Claude Opus 4.6 agents across two environments, with 1,032 tests and 17 bugs found and fixed.


Why This Matters for Claude Code

Claude Code's PreToolUse hooks enable something nobody else has built: deterministic process enforcement for autonomous coding agents. Not runtime guardrails (OpenAI AgentKit), not model-level safety (Anthropic's Constitutional AI), but lifecycle governance — ensuring agents follow a structured path from discovery through release.

Every competitor focuses on what agents can't do (runtime blocks). This focuses on what agents must do (process compliance).


How It Works

Three enforcement scripts, all using Claude Code hooks:

1. process-checklist.sh: Sequential State Machine

Tracks 6 build loop steps per feature: tests_written → tests_verified_failing → implemented → security_audit → documentation_updated → feature_recorded. Steps cannot be skipped; each requires its predecessor. Also enforces Phase 2 init (7 steps), UAT sessions (9 steps), Phase 3 validation (9 steps), and Phase 4 release (6 steps).
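The sequential rule can be illustrated with a small POSIX shell sketch. This is not the project's process-checklist.sh. The function names next_step and can_complete are hypothetical, and the completed steps are passed as arguments rather than read from process-state.json.

```shell
# Ordered build-loop steps, as listed in the description above.
STEPS="tests_written tests_verified_failing implemented security_audit documentation_updated feature_recorded"

# next_step COMPLETED...: print the step that must come next, or "done".
# Returns non-zero if the completed steps are not an exact prefix of STEPS.
next_step() {
  for expected in $STEPS; do
    if [ "$#" -eq 0 ]; then
      echo "$expected"
      return 0
    fi
    [ "$1" = "$expected" ] || return 1
    shift
  done
  [ "$#" -eq 0 ] && echo "done"
}

# can_complete STEP COMPLETED...: succeed only if STEP is next in sequence.
can_complete() {
  step=$1
  shift
  [ "$(next_step "$@")" = "$step" ]
}
```

For example, `can_complete implemented tests_written` fails because tests_verified_failing has not been recorded yet, while `can_complete implemented tests_written tests_verified_failing` succeeds.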

2. pre-commit-gate.sh: PreToolUse Hook

Registered as a PreToolUse hook on Bash tool calls. Reads the process state machine and emits {"permissionDecision": "deny"} when:

  • Build loop is incomplete (blocks git commit)
  • Agent attempts git commit --no-verify (blocks safety bypass)
  • Agent attempts git push --force (blocks history rewrite)
  • Agent attempts to reset process state (requires interactive terminal — blocks agent self-override)
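A rough sketch of the decision logic behind these rules, not the actual pre-commit-gate.sh. The function names deny_reason, emit_deny, and is_interactive are hypothetical, and the substring matching is deliberately naive (it does not catch short flags such as -n or -f). In a real hook the command string would come from the JSON that Claude Code passes on stdin, for example via jq -r '.tool_input.command'. As far as I know, the documented output format nests the decision under hookSpecificOutput, which is what emit_deny prints.

```shell
# deny_reason COMMAND BUILD_LOOP_COMPLETE(yes|no)
# Prints a reason and returns 0 if the command should be blocked;
# returns 1 (and prints nothing) if it may proceed.
deny_reason() {
  cmd=$1
  complete=$2
  case "$cmd" in
    *"git commit"*"--no-verify"*) echo "commit bypasses verification"; return 0 ;;
    *"git push"*"--force"*)       echo "force push rewrites history"; return 0 ;;
    *"git commit"*)
      if [ "$complete" != "yes" ]; then
        echo "build loop incomplete"
        return 0
      fi ;;
  esac
  return 1
}

# emit_deny REASON: print a PreToolUse deny decision as JSON.
# REASON must not contain double quotes or backslashes (no escaping is done here).
emit_deny() {
  printf '{"hookSpecificOutput":{"hookEventName":"PreToolUse","permissionDecision":"deny","permissionDecisionReason":"%s"}}\n' "$1"
}

# is_interactive: succeed only when stdin is a terminal. One possible way to
# keep a state reset in human hands, since an agent's tool call has no TTY.
is_interactive() {
  [ -t 0 ]
}
```

A hook body built on this could run `reason=$(deny_reason "$cmd" "$complete") && emit_deny "$reason"` and otherwise exit silently so the tool call proceeds.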

3. check-phase-gate.sh: Phase Gate Enforcement

Validates 5 development phases (Discovery → Architecture → Construction → Validation → Release) with:

  • Artifact existence checks at each gate transition
  • APPROVAL_LOG.md dated entry verification
  • POC mode blocking for Phase 4 production release
  • Cross-reference with process-state.json for completion verification
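The first two checks could look roughly like the sketch below. The issue does not show the APPROVAL_LOG.md format, so the assumed line shape (an ISO date, a space, then text naming the gate) is purely illustrative, as are the function names gate_approved and artifacts_present.

```shell
# gate_approved LOG_FILE GATE_LABEL
# Succeeds if LOG_FILE contains a line that starts with an ISO date
# (YYYY-MM-DD) and mentions GATE_LABEL. The line format is an assumption.
gate_approved() {
  [ -f "$1" ] || return 1
  grep -E -e '^[0-9]{4}-[0-9]{2}-[0-9]{2} ' -- "$1" | grep -qF -e "$2"
}

# artifacts_present FILE...
# Succeeds only if every listed artifact exists and is non-empty.
artifacts_present() {
  for f in "$@"; do
    [ -s "$f" ] || return 1
  done
}
```

Both grep calls pass the pattern with -e, so a gate label or pattern that begins with a dash is never parsed as an option.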

Additional governance:

  • Multi-track (Light/Standard/Full) adapting enforcement to project risk level
  • POC-to-production upgrade paths preserving all technical work while adding governance
  • Enterprise audit trail — every gate approval, process step, and override is logged
  • 87 documentation gaps categorized and addressed (gate requirements, platform coverage, track differentiation, POC lifecycle)

Validation: 96 Opus Agents, Dual-Environment Testing

Remote Run (Linux / Claude Code Remote)

  • 48 Opus 4.6 agents, each testing one phase/platform/scenario combination
  • 585 tests (383 positive, 202 negative)
  • 203 gate enforcement tests — 100% pass rate
  • 14 framework bugs found and fixed in-session
  • 87 documentation gaps identified and categorized

Local Run (macOS)

  • 48 Opus 4.6 agents, same test matrix
  • 447 tests, 372 passed, 75 failed (tracing to 3 root-cause bugs)
  • Found 1 critical macOS-only bug the Linux run couldn't catch:
    • grep -v "-->$" — macOS BSD grep interprets --> as flags, silently disabling ALL gate validation beyond Phase 0→1
    • Fixed with grep -v -e '-->$'
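The fix works because -e marks the next argument as the pattern, so its leading dashes are never treated as command-line options. A minimal, portable sketch follows. The function name strip_comment_lines is hypothetical, and the sample input in the usage note is made up for illustration.

```shell
# strip_comment_lines: copy stdin to stdout, dropping lines that end with "-->".
# Passing the pattern via -e keeps grep from parsing it as options.
strip_comment_lines() {
  grep -v -e '-->$'
}
```

For example, `printf '%s\n' 'keep me' '<!-- note -->' | strip_comment_lines` prints only `keep me`. Writing `grep -v -- '-->$'` is an equivalent alternative, since `--` ends option parsing.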

The meta-loop

The entire framework was built using Claude Code, then 96 Claude agents tested it, found bugs in it, and the bugs were fixed. The AI built its own governance framework, validated it, and improved it.


Enterprise Audit Results

Before UAT testing, the framework underwent an enterprise process audit (ISO 9001 / SOC 2 Type II benchmark):

  • 121 findings across 6 audit categories
  • 3 remediation rounds reduced to 0 Critical, 0 Major remaining
  • 18 audit reports preserved for traceability
  • Estimated 120-175 hours of manual work; completed in ~3 hours with Claude

Integration with Claude Code

SOIF uses only existing Claude Code features:

  • PreToolUse hooks for commit blocking (pre-commit-gate.sh)
  • Agent tool for parallel sub-agent dispatch (UAT testing)
  • Bash/Read/Write/Edit/Glob/Grep for all enforcement
  • Scheduled triggers for overnight autonomous testing

No custom model behavior, no external APIs, no dependencies beyond bash/jq/git.
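For readers unfamiliar with hook registration: Claude Code reads hooks from its settings files (for example .claude/settings.json in the project). The helper below prints a minimal PreToolUse registration for a gate script. It is a sketch of the general settings shape, not a copy of the repository's configuration, and print_hook_settings is a hypothetical name.

```shell
# print_hook_settings HOOK_COMMAND
# Prints a settings fragment that runs HOOK_COMMAND before every Bash tool call.
# HOOK_COMMAND must not contain double quotes or backslashes (no escaping is done).
print_hook_settings() {
  cat <<EOF
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": "$1" }
        ]
      }
    ]
  }
}
EOF
}
```

Running `print_hook_settings scripts/pre-commit-gate.sh` shows the fragment. It is best merged into an existing settings file by hand rather than redirected over it.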


What This Demonstrates

  1. Claude Code's hook system is powerful enough for enterprise governance — not just linting or formatting, but full lifecycle process enforcement
  2. Autonomous agents can validate their own governance — 96 agents testing enforcement mechanisms they're subject to
  3. Dual-environment testing catches platform-specific bugs — the macOS grep bug would have gone undetected in any single-environment test
  4. Solo developers can achieve enterprise-grade process discipline — the framework was built by one person with Claude

Links

  • Repository: https://github.com/kraulerson/solo-orchestrator
  • Key scripts: scripts/pre-commit-gate.sh, scripts/process-checklist.sh, scripts/check-phase-gate.sh
  • Related issues: #34535 (Peer Model Audit Gate), #45661 (Behavioral Governance)
  • Docs: docs/builders-guide.md, docs/user-guide.md, docs/governance-framework.md

extent analysis

TL;DR

To address potential issues with the Solo Orchestrator framework, review and adjust the pre-commit-gate.sh, process-checklist.sh, and check-phase-gate.sh scripts to ensure they are correctly enforcing the development lifecycle governance process.

Guidance

  • Verify that the PreToolUse hooks are correctly registered and functioning as expected to block inappropriate commits or actions by autonomous agents.
  • Check the process-checklist.sh script to ensure it accurately tracks and enforces the required steps in the development process, including build loop steps and phase transitions.
  • Test the check-phase-gate.sh script to confirm it correctly validates artifact existence, approval logs, and process state at each gate transition.
  • Review the enterprise audit results and address any remaining findings or recommendations to improve the framework's compliance and effectiveness.

Example

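# Example of how to verify PreToolUse hook registration
# Claude Code reads hooks from its settings files (for example .claude/settings.json), not from git config
# Check whether pre-commit-gate.sh appears under the PreToolUse hooks
jq '.hooks.PreToolUse' .claude/settings.json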

Notes

The provided information does not indicate a specific issue to be solved but rather presents a framework and its testing. Therefore, the guidance focuses on verification and potential adjustment of the scripts to ensure they work as intended.

Recommendation

Apply workaround: Regularly review and test the scripts (pre-commit-gate.sh, process-checklist.sh, check-phase-gate.sh) to ensure they are functioning correctly and enforcing the desired governance process, as the framework's effectiveness depends on their proper operation.
