gemini-cli - 💡(How to fix) Fix Explicit Model Selection Ignored After Flash Quota Exhaustion

Error Message

The CLI requests gemini-3-flash-preview even though gemini-3.1-pro-preview was explicitly selected. Because the Flash quota is exhausted, the request fails with a quota error, which leaves the CLI unusable for any task that needs an LLM call, even though Pro quota is available.

What happened?

The Gemini CLI (version 0.41.2) fails to respect explicit model selection (e.g., gemini-3.1-pro-preview) once the quota for gemini-3-flash-preview (or the default Flash model) is exhausted. Despite having Pro model quota available and explicitly configuring the CLI to use it, all subsequent requests are blocked by attempts to call the exhausted Flash model.

This behavior suggests a flaw in the ModelAvailabilityService or the model routing logic, where a "silent fallback" or a hardcoded preference for Flash persists even when the user has overridden the model selection.
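
To make the expected precedence concrete, here is a minimal sketch of routing in which an explicit selection always wins over the default and quota exhaustion only blocks the model actually chosen. It is written in Python for brevity and is not the CLI's code: `ModelRouter`, `mark_exhausted`, `resolve`, and `QuotaExhaustedError` are hypothetical names, and the real ModelAvailabilityService may be structured quite differently.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Default model named in the report; everything else here is illustrative.
DEFAULT_MODEL = "gemini-3-flash-preview"


class QuotaExhaustedError(Exception):
    """Raised when the model chosen for a request has no quota left."""


@dataclass
class ModelRouter:
    """Hypothetical router, not the CLI's actual ModelAvailabilityService."""

    exhausted: Set[str] = field(default_factory=set)

    def mark_exhausted(self, model: str) -> None:
        # Record that a quota error was seen for this specific model.
        self.exhausted.add(model)

    def resolve(self, explicit_model: Optional[str] = None) -> str:
        # An explicit selection takes precedence over the default.
        model = explicit_model or DEFAULT_MODEL
        # Only the chosen model is checked; an exhausted Flash quota
        # must not block a request that targets Pro.
        if model in self.exhausted:
            raise QuotaExhaustedError(f"quota exhausted for {model}")
        return model


router = ModelRouter()
router.mark_exhausted("gemini-3-flash-preview")
print(router.resolve("gemini-3.1-pro-preview"))
```

Under this sketch, the reported behavior would correspond to `resolve` returning (or checking) the Flash model even when `explicit_model` is set, which is the precedence bug the report suspects.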

What did you expect to happen?

The CLI should honor the explicit model selection (gemini-3.1-pro-preview) and successfully execute the request using the Pro model's quota.

Steps to Reproduce

  1. Exhaust the quota for gemini-3-flash-preview (the default/Flash category model).
  2. Explicitly set the CLI model to gemini-3.1-pro-preview.
  3. Execute any command that requires an LLM call (e.g., a simple prompt or a research task).
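
Steps 2 and 3 can be scripted so the check is repeatable. The sketch below assumes the CLI is installed as `gemini` and that `-m` selects the model and `-p` runs a single non-interactive prompt. It also assumes the quota error text names the model that was requested, which the report does not confirm. The helper names are hypothetical.

```python
import subprocess
from typing import List

FLASH_MODEL = "gemini-3-flash-preview"
PRO_MODEL = "gemini-3.1-pro-preview"


def build_repro_command(model: str, prompt: str) -> List[str]:
    # Assumed flags: -m selects the model, -p sends one prompt and exits.
    return ["gemini", "-m", model, "-p", prompt]


def output_mentions_flash(output: str) -> bool:
    # Heuristic only: assumes the error message includes the model name.
    return FLASH_MODEL in output


def run_repro(prompt: str = "Say hello") -> bool:
    """Run the CLI with Pro selected; return True if Flash shows up in the output."""
    cmd = build_repro_command(PRO_MODEL, prompt)
    result = subprocess.run(cmd, capture_output=True, text=True)
    return output_mentions_flash(result.stdout + result.stderr)
```

Calling `run_repro()` after step 1 and getting `True` would be consistent with the report: the request was made with Pro selected, yet the failure refers to the Flash model.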

Client information

  • CLI Version: 0.41.2
  • Git Commit: b0c7a1722
  • Operating System: darwin v25
  • Sandbox Environment: no sandbox
  • Model Version: auto-gemini-3
  • Auth Type: oauth-personal

Login information

No response

Anything else we need to know?

No response
