claude-code - 💡(How to fix) Fix /ultrareview silently runs in dry_run mode on large diffs, reports completed with empty findings [2 comments, 2 participants]

anthropics/claude-code#54115 (fetched 2026-04-28 06:38:51)


/ultrareview silently runs in dry_run mode on large diffs

Summary

/ultrareview <branch> reports a successful completion with empty findings ([]) when the underlying Bug Hunter orchestrator actually fails. The tracking page shows "Review failed" but the task-notification reports status: completed with an empty findings array.

The root cause from the hook output appears to be that the orchestrator falls into dry_run: true mode when given a very large diff, completes in ~16 seconds with 0 tokens consumed and 0 agents producing output, yet still returns "completed" through the task channel.

Reproduction

  1. Push a branch that diffs ~830 files / ~268K insertions against main.
  2. Run /ultrareview <branch-name>.
  3. Receive a task-notification with status: completed and an empty findings array.
  4. Open the tracking link — it shows "Review failed".

This happened on two consecutive runs, burning 2 of 3 free reviews with no output.

Hook output (anonymised paths)

=== run_hunt.sh: Starting Review ===
Timestamp: 2026-04-28T00:04:35Z
PR: unknown
Repo: unknown
Mode: branch (base=53fa1de6d00779c090be112d2c28faaa31474ee9)
Python: Python 3.11.15
Working directory: /home/claude/repo
=== run_hunt.sh: Starting Bug Hunter orchestrator ===
=== run_hunt.sh: Orchestrator completed (exit=1) ===
=== Bug Hunter: Exit code: 1 ===

stats.json (key signals)

{
  "pr_number": null,
  "repository": "local-bundle",
  "commit_sha": null,
  "duration_seconds": 16.2,
  "total_cost_usd": 0,
  "total_input_tokens": 0,
  "total_output_tokens": 0,
  "fleet_size": 5,
  "dry_run": true,                  ← unexpected for a real review
  "agents_spawned": 5,
  "bugs_submitted": 0,
  "bugs_confirmed": 0,
  "bugs_refuted": 0,
  "reviews_submitted": 0,
  "agents": [],
  "reports": []
}

Why this is a bug, not user error

  • dry_run: true is set by the orchestrator itself; there's no user-facing flag for it
  • total_input_tokens: 0 confirms agents never received their work
  • agents_spawned: 5 says they were started, but agents: [] and an empty reports/ directory say they never produced anything
  • Orchestrator completed (exit=1) — non-zero exit
  • The tracking page correctly shows "Review failed", but the task-notification in Claude Code reports status: completed
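
The signals listed above can be checked mechanically. The following is a minimal sketch, assuming the stats have been loaded into a dictionary with the field names shown in stats.json; the function name failure_signals is hypothetical and is not part of Claude Code or the Bug Hunter orchestrator.

```python
from typing import Any


def failure_signals(stats: dict[str, Any], exit_code: int) -> list[str]:
    """Return the reasons a run looks like it did no real work (hypothetical helper)."""
    reasons = []
    if exit_code != 0:
        reasons.append(f"orchestrator exit code {exit_code}")
    if stats.get("dry_run"):
        reasons.append("dry_run is true")
    if stats.get("total_input_tokens", 0) == 0:
        reasons.append("no input tokens consumed")
    if stats.get("agents_spawned", 0) > 0 and not stats.get("agents"):
        reasons.append("agents spawned but none reported")
    return reasons


# Values copied from the stats.json excerpt in this report.
stats = {
    "duration_seconds": 16.2,
    "total_input_tokens": 0,
    "total_output_tokens": 0,
    "dry_run": True,
    "agents_spawned": 5,
    "bugs_submitted": 0,
    "agents": [],
    "reports": [],
}

for reason in failure_signals(stats, exit_code=1):
    print("signal:", reason)
```

With the values from this report, all four checks fire, which is why the run reads as a failure rather than a clean review with no findings.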

Impact

  • Two of three free reviews silently consumed for one branch
  • No way for the user to see something went wrong without manually clicking the tracking link
  • Even after detecting the failure, no clear retry path or refund mechanism

Suggested fixes

  1. Surface failure clearly in task-notification. When dry_run: true or bugs_submitted: 0 after a non-zero orchestrator exit, the notification should report status: failed with an error reason (e.g. "diff too large to fit in agent context").
  2. Don't count failed runs against the free quota. A run that produced zero token usage shouldn't decrement Free ultrareview N of 3.
  3. Document a max-diff guideline. Either gracefully reject diffs above N lines/files at launch with a clear message, or document the recommended split.
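
As an illustration of fixes 1 and 2, the sketch below maps an exit code and the stats fields to a notification status and a quota decision. It is one possible reading of the suggestions, not the actual notification logic: resolve_outcome and its return keys are hypothetical names, and the rule treats a non-zero exit or dry_run: true as a failure while leaving bugs_submitted: 0 on a clean exit as a legitimate empty review.

```python
from typing import Any, Optional


def resolve_outcome(stats: dict[str, Any], exit_code: int) -> dict[str, Any]:
    """Derive a notification status and quota decision (hypothetical rule)."""
    tokens_used = stats.get("total_input_tokens", 0) + stats.get("total_output_tokens", 0)
    failed = exit_code != 0 or bool(stats.get("dry_run"))

    reason: Optional[str] = None
    if exit_code != 0:
        reason = f"orchestrator exited with code {exit_code}"
    elif failed:
        reason = "run was a dry run"

    return {
        "status": "failed" if failed else "completed",
        "reason": reason,
        # Only successful runs that consumed tokens use up a free review.
        "counts_against_quota": (not failed) and tokens_used > 0,
    }


# The run from this report: exit=1, dry_run true, zero tokens.
print(resolve_outcome({"dry_run": True, "total_input_tokens": 0, "total_output_tokens": 0}, 1))
```

Under this rule the run in this report would be reported as failed and would not decrement the free-review counter.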

Workaround we used

Re-pointed the diff base to a more recent commit so the diff is 99 files / 4,317 insertions (R7 fix-phase only) instead of 834 files / 268K insertions (post-history-scrub initial state). This is what we'll re-run on the third free review.
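
To compare candidate diff bases before spending a review, the diff size can be measured locally with git diff --shortstat. The sketch below is an assumption-laden helper rather than anything /ultrareview provides: parse_shortstat and diff_size are hypothetical names, and because no size limit is documented, no threshold is hard-coded.

```python
import re
import subprocess


def parse_shortstat(line: str) -> dict[str, int]:
    """Parse the summary line printed by `git diff --shortstat`."""
    counts = {"files": 0, "insertions": 0, "deletions": 0}
    patterns = {
        "files": r"(\d+) files? changed",
        "insertions": r"(\d+) insertions?\(\+\)",
        "deletions": r"(\d+) deletions?\(-\)",
    }
    for key, pattern in patterns.items():
        match = re.search(pattern, line)
        if match:
            counts[key] = int(match.group(1))
    return counts


def diff_size(base: str, head: str = "HEAD") -> dict[str, int]:
    """Measure the diff between two revisions in the current git repository."""
    result = subprocess.run(
        ["git", "diff", "--shortstat", base, head],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_shortstat(result.stdout)


# Illustrative input built from the post-workaround figures in this report;
# the deletion count is not stated there, so it is left out.
print(parse_shortstat(" 99 files changed, 4317 insertions(+)"))
```

Inside a repository, calling diff_size("53fa1de") and then diff_size with a more recent base would show how much each choice shrinks the diff.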

Environment

  • Claude Code CLI
  • Diff base before workaround: 53fa1de (post-git filter-repo initial commit)
  • Branch under review: ultrareview/full-codebase
  • Repo: github.com:Apostrophe-Corp/DDAO-Lottery (private)

extent analysis

TL;DR

Reducing the diff size appears to mitigate the issue: according to the report, a very large diff (about 834 files / 268K insertions) seems to push the Bug Hunter orchestrator into dry_run mode, where it exits non-zero without doing any work while the task-notification still reports status: completed.

Guidance

  • Review the diff size before running /ultrareview to ensure it's within a recommended range, potentially splitting large diffs into smaller, more manageable pieces.
  • Monitor the stats.json file for signals such as dry_run: true, total_input_tokens: 0, and agents: [] to detect potential issues.
  • Consider implementing a check at the start of the review process to reject or warn about diffs that exceed a certain size threshold.
  • Adjust the notification system to clearly report failures, such as when dry_run: true or bugs_submitted: 0 after a non-zero orchestrator exit.

Example

The only concrete remedy in the report is the workaround itself: re-pointing the diff base to a more recent commit reduced the diff from 834 files / 268K insertions to 99 files / 4,317 insertions. The report does not say whether the re-run on the smaller diff succeeded.

Notes

The report does not state the maximum diff size the Bug Hunter orchestrator can handle, so finding a workable threshold may take experimentation. The issue also points to the need for clearer failure reporting and a retry or refund path for failed runs.

Recommendation

Apply a workaround by ensuring the diff size is manageable for the Bug Hunter orchestrator, as directly addressing the root cause may require modifications to the orchestrator itself or the notification system, which are not detailed in the provided information.
