gemini-cli - 💡(How to fix) Fix [WSL2][CRITICAL] Comprehensive reliability failure report: 7 incidents, fork table exhaustion, model comparison vs Claude Sonnet 4.6 [1 comments, 1 participants]

gemini-cli 2026-04-28 14:54:45


GitHub stats

google-gemini/gemini-cli#26117 • Fetched 2026-04-29 06:35:42


Author

jyongchul

Participants

jyongchul

Timeline (top)

cross-referenced ×7, commented ×1, labeled ×1

Error Message

Symptom (Incident 2): Error in: hooks.AfterAgent[0] Expected object, received string on every startup.
Root cause: Confirmed by reading bundle source chunk-WFCK2Z32.js. The Zod schema changed with no migration, no docs update, and no helpful error message, and the old valid format is silently discarded.
Symptom (Incident 3): Error: write EPIPE at writeToStdout. The CLI crashes after every agent turn.
Breaking changes need migration: The hook schema change in v0.39.1 broke existing configs silently. Zod schema changes to user-facing config formats require a migration path and a clear, actionable error message.

Root Cause

Incident 1 — OAuth Session Silent Invalidation

Symptom: "Failed to sign in: Cloud Code Private API not enabled" drops to the interactive auth menu, blocking all automation.
Root cause: ~/.gemini/google_accounts.json shows "active": null after a silent token drop. Additionally, ~/.gemini/projects.json accumulated a corrupt "/mnt/c": "c" entry from a session launched from a Windows-mounted path, binding the workspace to the wrong GCP project.
Recovery: Manual gemini auth reset, deletion of projects.json, re-auth.
Time lost: ~45 minutes.
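
A pre-flight check along the following lines could let an automation wrapper fail fast instead of dropping into the interactive auth menu. This is a minimal Python sketch that assumes only what the report states: the accounts file is JSON with a top-level "active" key that becomes null after the token drop. The helper name has_active_account is hypothetical, and the real file may contain other fields.

```python
import json
from pathlib import Path


def has_active_account(accounts_file: Path) -> bool:
    """Return True only if the file parses and "active" holds a non-empty value.

    Assumption: a top-level "active" key, as quoted in the report.
    """
    try:
        data = json.loads(accounts_file.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        # Missing or unreadable file: treat as "not signed in".
        return False
    return isinstance(data, dict) and bool(data.get("active"))
```

A wrapper could call has_active_account(Path.home() / ".gemini" / "google_accounts.json") before launching a --yolo session and alert a human when it returns False.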



[CRITICAL] WSL2 Reliability Report: Weeks of Cascading Failures with Gemini CLI — Comprehensive Timeline & Model Comparison

This report documents weeks of production use of Gemini CLI in a WSL2 environment for autonomous agentic (--yolo) workflows. The issues documented here are not theoretical: each represents a real incident that consumed hours of debugging time and interrupted production work.

Environment

OS: Windows 11 (host) + WSL2 Ubuntu 24.04
Node.js: v22.17.0
Gemini CLI: v0.39.1 (npm global install)
Auth mode: oauth-personal (Google One AI Ultra subscription)
Shell: bash (WSL2)
Use case: Autonomous agentic workflows via --yolo mode, long-running sessions (4-8h), multiple MCP servers

Why This Report Matters

We switched from Claude Code CLI to Gemini CLI specifically to use Google's models within an autonomous pipeline. After weeks of use, we are documenting the results factually. Every incident below required manual intervention that Claude Code CLI did not require in the same environment.

Incident Timeline (Chronological)

Incident 1 — OAuth Session Silent Invalidation

Incident 2 — Hook Schema Silent Breaking Change (v0.39.1)

Old format (previously valid, now silently discarded):

"hooks": { "SessionStart": ["echo 'hello'"] }

New required format (undocumented):

"hooks": { "SessionStart": [{ "hooks": [{ "type": "command", "command": "echo 'hello'" }] }] }

Time lost: ~2 hours reverse-engineering the bundle.
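
The migration path the report asks for can be sketched in a few lines. The Python below is illustrative only and is not part of Gemini CLI. It assumes the two shapes shown above are the only ones in play: a bare command string (old) and an object containing a nested "hooks" list (new). The function name migrate_hooks is hypothetical.

```python
from typing import Any, Dict, List


def migrate_hooks(hooks: Dict[str, List[Any]]) -> Dict[str, List[Any]]:
    """Wrap bare command strings in the nested object form; leave objects alone."""
    migrated: Dict[str, List[Any]] = {}
    for event, entries in hooks.items():
        new_entries = []
        for entry in entries:
            if isinstance(entry, str):
                # Old format: a bare shell command string.
                new_entries.append(
                    {"hooks": [{"type": "command", "command": entry}]}
                )
            else:
                # Already an object: assumed to be in the new format.
                new_entries.append(entry)
        migrated[event] = new_entries
    return migrated
```

Running the function on an already-migrated config returns it unchanged, so it is safe to apply on every startup.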

Incident 3 — EPIPE Crash on AfterAgent Hook

Symptom: Error: write EPIPE at writeToStdout. The CLI crashes after every agent turn.
Root cause: The hook subprocess inherits the parent's stdout during teardown. Even nohup .../dev/null 2>&1 & does not prevent it.
Time lost: ~3 hours.
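
For reference, this is what launching a child with no inherited stdio and its own session looks like. The sketch uses Python's standard subprocess module purely to illustrate the technique. Gemini CLI is a Node.js program, so this is not its implementation, and whether an equivalent change would resolve the EPIPE crash is untested. The helper name launch_detached is hypothetical.

```python
import subprocess
from typing import List


def launch_detached(argv: List[str]) -> subprocess.Popen:
    """Start argv in a new session with no stdio inherited from the caller."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,   # child never writes to the parent's stdout
        stderr=subprocess.DEVNULL,
        start_new_session=True,      # child calls setsid() before exec (POSIX only)
        close_fds=True,              # no other parent descriptors leak into the child
    )
```

Because the child holds none of the parent's pipes, closing those pipes during teardown cannot affect it.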

Incident 4 — --yolo Mode Blocked by Untrusted Workspace Prompt

Symptom: An interactive trust prompt appears even with the --yolo flag. Setting security.folderTrust.enabled: false does not suppress it.
Impact: The entire automation pipeline is blocked; --yolo mode becomes unusable without a human present.
Time lost: Ongoing regression.

Incident 5 — /mnt/c Path Corrupts projects.json

Symptom: Launching the CLI from a Windows-mounted path creates a stale "/mnt/c": "c" entry that causes Cloud Code API errors in all future sessions.
Time lost: ~1 hour diagnosing why auth worked interactively but failed in automation.
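
Until the CLI prunes such entries itself, a small cleanup step could remove them. The Python sketch below assumes the file holds a flat JSON mapping from workspace path to a project value, which is what the quoted "/mnt/c": "c" entry suggests. The real layout may be nested, so back the file up first. The name prune_wsl_mounts is hypothetical.

```python
import re
from typing import Dict

# Matches WSL drive mounts such as /mnt/c or /mnt/c/Users/..., but not /mnt/cache.
_WSL_MOUNT = re.compile(r"^/mnt/[a-z](/|$)")


def prune_wsl_mounts(projects: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the mapping without entries keyed by a WSL drive mount."""
    return {
        path: value
        for path, value in projects.items()
        if not _WSL_MOUNT.match(path)
    }
```

The function only transforms a dictionary; loading and rewriting projects.json is left to the caller so the pruning rule can be reviewed separately.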

Incident 6 — Process Deadlock / Terminal Bouncing (Ongoing)

Symptom: After agent turns with heavy tool-call activity, the CLI enters a deadlock, neither completing nor erroring. The terminal appears frozen. Only recovery: kill -9.
Frequency: Approximately 1 in 4 sessions.
Side effect: Dead sessions leave zombie node processes in WSL and ghost wslhost.exe bridge processes on Windows that are never reaped.
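
Since the only recovery is a manual kill -9, an automation wrapper can apply the same remedy on a timer. The following Python sketch is a workaround pattern, not a fix. It runs a command in its own process group with stdin closed and kills the whole group on timeout. The name run_with_watchdog and the choice of timeout are assumptions, and it is untested whether the CLI's trust prompt exits cleanly when stdin returns EOF.

```python
import os
import signal
import subprocess
from typing import List, Optional


def run_with_watchdog(argv: List[str], timeout_s: float) -> Optional[int]:
    """Run argv; return its exit code, or None if it was killed on timeout."""
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,  # a prompt reading stdin sees EOF instead of waiting
        start_new_session=True,    # child leads its own process group
    )
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        try:
            # Kill the whole group so helper processes do not outlive the session.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group exited between the timeout and the kill
        proc.wait()  # reap the child so it does not linger as a zombie
        return None
```

This cannot help with a process stuck in uninterruptible D state, which ignores SIGKILL until the kernel call it is blocked in returns.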

Incident 7 — WSL Fork Table Exhaustion (Live incident: 2026-04-28)

This happened today, directly caused by accumulated ghost processes from Incident 6.

After running gemini --yolo in a standard session, the next launch attempt produced:

-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
/home/user/.npm-global/bin/gemini: fork: retry: Resource temporarily unavailable

Root cause discovered via Windows Task Manager:

5× wsl.exe ghost processes (orphaned from previous Gemini CLI deadlocks)
14× wslhost.exe ghost bridge processes
systemd-journald in uninterruptible D state inside WSL, blocking both wsl --shutdown and wsl --terminate Ubuntu
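
The Linux side of this diagnosis can also be scripted from inside WSL while the distribution is still responsive. The Python sketch below scans /proc for processes in Z (zombie) or D (uninterruptible sleep) state. It cannot see wsl.exe or wslhost.exe, which live on the Windows side. The function names are hypothetical.

```python
from pathlib import Path
from typing import List, Tuple


def parse_state(stat_line: str) -> Tuple[str, str]:
    """Return (command name, state letter) from one /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the last ")".
    head, _, tail = stat_line.rpartition(")")
    name = head.split("(", 1)[1]
    return name, tail.split()[0]


def find_stuck(proc_root: Path = Path("/proc")) -> List[Tuple[int, str, str]]:
    """List (pid, name, state) for processes in Z or D state."""
    stuck = []
    for entry in proc_root.iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as /proc/self
        try:
            text = (entry / "stat").read_text(errors="replace")
            name, state = parse_state(text)
        except (OSError, IndexError):
            continue  # process exited mid-scan or the line was malformed
        if state in ("Z", "D"):
            stuck.append((int(entry.name), name, state))
    return stuck
```

Running find_stuck() periodically during long sessions would show whether zombies accumulate before the fork table is exhausted.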

Recovery steps required:

1. Stop-Process -Name "wsl" -Force (Windows PowerShell, elevated)
2. Stop-Process -Name "wslhost" -Force
3. wsl --shutdown (still hung because vmcompute was stuck)
4. Restart-Service vmcompute -Force (requires elevated PowerShell)
5. Wait for WSL to fully restart

Total downtime: ~25 minutes of unrecoverable WSL requiring Windows-level Hyper-V service restart.

Direct Model Comparison: Gemini 2.5 Pro (via Gemini CLI) vs. Claude Sonnet 4.6 (via Claude Code CLI)

Both CLIs were run in the identical WSL2 environment on the same machine with identical shell configuration, hooks, and MCP servers.

Metric	Gemini CLI + Gemini 2.5 Pro	Claude Code CLI + Claude Sonnet 4.6
Session crashes / deadlocks	~25% of sessions	0 observed
EPIPE crashes after agent turn	Yes (Incident 3)	Never observed
OAuth session drops	Yes, on WSL suspend/resume	Never observed
--yolo mode blocked by interactive prompts	Yes (Incident 4)	Never observed
MCP config corruption on server failure	Yes, requires manual file recovery	Graceful degradation
Recovery from WSL network interruption	Hangs indefinitely	Exits cleanly
Ghost processes on Windows after session	Yes: 5 wsl.exe + 14 wslhost.exe	0 orphaned processes
Fork table exhaustion requiring Hyper-V restart	Yes (Incident 7, today)	Never occurred
Task completion rate in --yolo mode	Significantly degraded	Consistently high

Observation: When we switched from the Gemini 2.5 Pro model to Claude Sonnet 4.6 within Antigravity IDE (same interface, same WSL environment, same hooks), task delivery quality and completion rates improved measurably and immediately. The environment did not change. Only the model changed. This suggests the reliability gap is not purely a CLI infrastructure issue — the model's ability to handle WSL-specific tool errors, recover from ambiguous states, and complete tasks without requiring human intervention also differs significantly.

What Google Needs to Fix

1. Process reaping: Gemini CLI MUST install SIGTERM/SIGINT/SIGCHLD handlers that explicitly kill all spawned child process groups before exit. This is a Node.js process management responsibility.
2. --yolo mode must be interactive-prompt-free: Any interactive prompt in --yolo mode is a regression. Trust prompts, auth prompts, and confirmation dialogs must all be suppressible via config.
3. EPIPE crash: AfterAgent hooks must run in a fully detached subprocess (setsid / new process group) to prevent inheriting the CLI's stdout during teardown.
4. OAuth proactive refresh: Token validity should be checked at session start, not on first API call. Silent failures block automation.
5. projects.json sanitization: Entries with /mnt/c prefix should be rejected or auto-pruned, since they are never valid GCP project references.
6. Breaking changes need migration: The hook schema change in v0.39.1 broke existing configs silently. Zod schema changes to user-facing config formats require a migration path and a clear, actionable error message.
7. WSL2 must be a first-class supported platform: Given that WSL2 is the primary Linux development environment for Windows users, it should receive the same QA attention as native Linux. The current state, where basic operations like running a session and exiting cleanly are unreliable, is not acceptable for a tool marketed for autonomous agentic use.

Conclusion

After weeks of real-world production use, the conclusion is factual: Gemini CLI in WSL2 is not production-ready for autonomous agentic workflows. The cascade of bugs is not a single issue — it is a systemic pattern of insufficient process management, missing WSL-aware safeguards, and inadequate testing on the Windows/WSL2 platform.

We continue to use Claude Code CLI as the baseline for all critical work until these issues are resolved.

Related issues: #26111, #26114

extent analysis

TL;DR

To improve the reliability of Gemini CLI in WSL2 for autonomous agentic workflows, Google needs to address several critical issues, including process reaping, interactive prompts in --yolo mode, EPIPE crashes, OAuth proactive refresh, projects.json sanitization, and WSL2 support.

Guidance

Implement SIGTERM/SIGINT/SIGCHLD handlers in Gemini CLI to explicitly kill spawned child process groups before exit.
Modify --yolo mode to be interactive-prompt-free by making trust prompts, auth prompts, and confirmation dialogs suppressible via config.
Run AfterAgent hooks in a fully detached subprocess to prevent inheriting the CLI's stdout during teardown.
Check token validity at session start, not on first API call, to prevent silent OAuth failures.
Sanitize projects.json by rejecting or auto-pruning entries with /mnt/c prefix.

Example

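To run an AfterAgent hook in its own session, one option is to wrap the hook command with setsid inside the hooks configuration, using the new schema format shown in Incident 2. The report configures hooks in settings, not through a command-line flag.

"hooks": { "AfterAgent": [{ "hooks": [{ "type": "command", "command": "setsid /path/to/hook.sh </dev/null >/dev/null 2>&1 &" }] }] }

Note: This is an untested sketch. The report states that nohup with output redirected to /dev/null did not prevent the EPIPE crash, so wrapping the hook this way may not be sufficient without a fix in the CLI itself.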

Notes

The provided guidance is based on the issues reported and may not be an exhaustive list of all necessary fixes. Additional testing and debugging may be required to ensure the reliability of Gemini CLI in WSL2.

Recommendation

Apply the suggested workarounds and fixes to improve the reliability of Gemini CLI in WSL2, as the current state is not production-ready for autonomous agentic workflows.

