gemini-cli - 💡(How to fix) Fix Explicit Model Selection Ignored After Flash Quota Exhaustion

gemini-cli · 2026-05-09 23:17:17

Error Message

The CLI sends the request to gemini-3-flash-preview. Because the Flash quota is exhausted, the request fails with a quota error, which makes the CLI unusable for any task that needs an LLM call, even though the Pro model was explicitly requested and still has quota.

What happened?

The Gemini CLI (version 0.41.2) fails to respect an explicit model selection (e.g., gemini-3.1-pro-preview) once the quota for gemini-3-flash-preview (or the default Flash model) is exhausted. Although Pro quota is available and the CLI is explicitly configured to use it, every subsequent request is blocked by an attempt to call the exhausted Flash model.

This behavior suggests a flaw in the ModelAvailabilityService or the model routing logic, where a "silent fallback" or a hardcoded preference for Flash persists even when the user has overridden the model selection.
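
To make the expected precedence concrete, here is a minimal sketch of a resolution order in which an explicit selection always wins and fallback applies only when the user chose nothing. It is written in Python for brevity and is not the CLI's actual code: `resolve_model`, its parameters, and the fallback tuple are hypothetical names introduced for illustration, and the real ModelAvailabilityService may be structured quite differently.

```python
from typing import Optional, Set, Tuple

# Default model as described in the report; everything else here is hypothetical.
DEFAULT_MODEL = "gemini-3-flash-preview"


def resolve_model(
    explicit: Optional[str],
    exhausted: Set[str],
    fallbacks: Tuple[str, ...] = (),
) -> str:
    """Pick the model for one request (illustrative helper, not a real CLI API)."""
    # 1. An explicit user selection is returned as-is. The quota state of
    #    any other model, including the default, must not affect it.
    if explicit is not None:
        return explicit

    # 2. With no explicit selection, try the default and then each fallback,
    #    skipping models whose quota is known to be exhausted.
    for candidate in (DEFAULT_MODEL, *fallbacks):
        if candidate not in exhausted:
            return candidate

    raise RuntimeError("no model with remaining quota")
```

Under this ordering, an exhausted Flash quota can only matter on the second branch, so a request with an explicit Pro selection would never be routed to Flash.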

What did you expect to happen?

The CLI should honor the explicit model selection (gemini-3.1-pro-preview) and successfully execute the request using the Pro model's quota.

Steps to Reproduce

1. Exhaust the quota for gemini-3-flash-preview (the default/Flash category model).
2. Explicitly set the CLI model to gemini-3.1-pro-preview.
3. Execute any command that requires an LLM call (e.g., a simple prompt or a research task).
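
The steps above can be simulated without touching real quota, which may be useful as the basis for a regression test. The sketch below is an assumption-laden stand-in: `FakeBackend`, `QuotaExhaustedError`, `run_prompt`, and `reproduce` are hypothetical names, the quota numbers are arbitrary, and `run_prompt` models the behavior the report expects rather than what the CLI currently does.

```python
from typing import Dict, List, Optional, Tuple

FLASH = "gemini-3-flash-preview"
PRO = "gemini-3.1-pro-preview"


class QuotaExhaustedError(Exception):
    """Raised by the fake backend when a model has no requests left."""


class FakeBackend:
    """Stand-in for the remote API: tracks remaining requests per model."""

    def __init__(self, quota: Dict[str, int]) -> None:
        self.quota = dict(quota)
        self.calls: List[str] = []  # every model that was actually requested

    def send(self, model: str, prompt: str) -> str:
        self.calls.append(model)
        if self.quota.get(model, 0) <= 0:
            raise QuotaExhaustedError(f"quota exhausted for {model}")
        self.quota[model] -= 1
        return f"[{model}] reply to: {prompt}"


def run_prompt(
    backend: FakeBackend, prompt: str, selected_model: Optional[str] = None
) -> str:
    """Expected behavior: the request goes to the selected model when one is set."""
    model = selected_model if selected_model is not None else FLASH
    return backend.send(model, prompt)


def reproduce() -> Tuple[str, FakeBackend]:
    # Step 1: Flash quota is exhausted, Pro quota is still available.
    backend = FakeBackend({FLASH: 0, PRO: 5})
    # Steps 2 and 3: select Pro explicitly, then issue a simple prompt.
    reply = run_prompt(backend, "hello", selected_model=PRO)
    return reply, backend
```

In this model the only entry in `backend.calls` after `reproduce()` is the Pro model. The bug described in the report corresponds to a Flash entry appearing in that list even though Pro was selected.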

Client information

CLI Version: 0.41.2
Git Commit: b0c7a1722
Operating System: darwin v25
Sandbox Environment: no sandbox
Model Version: auto-gemini-3
Auth Type: oauth-personal

Login information

No response

Anything else we need to know?

No response
