gemini-cli - 💡(How to fix) CLI shows 100% quota reached on Google One Pro plan despite low usage

gemini-cli · 2026-05-08 10:07:24



What happened?

Bug Report: Incorrect Quota Tracking and Extreme Thinking Latency

System Context

Tier: Gemini Code Assist in Google One AI Pro
Auth Method: Signed in with Google (email address redacted)
Models Affected: gemini-3.1-pro-preview, gemini-3-flash-preview
CLI Version: 0.41.2

1. Issue: False "Quota Reached" Warning

The CLI status line reports 100% quota reached, yet the agent continues to respond and process queries.

Evidence: Session stats show only 27 total requests (14 Pro, 13 Flash), which is far below the expected daily limit for the Google One AI Pro tier.
Observed Behavior: The local UI/stat-tracker appears to misinterpret backend 429 errors or tool-call volume as a total depletion of the daily allowance.
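
One way to avoid conflating these signals is to count rate-limit responses and actual usage separately, and to derive the status-line percentage from usage alone. The following is a minimal Python sketch of that idea, not Gemini CLI's actual implementation: `QuotaTracker`, `record_response`, and the `daily_limit` value of 1000 are hypothetical, since the report does not state the real limit, and treating every 429 as transient throttling is an assumption.

```python
from dataclasses import dataclass


@dataclass
class QuotaTracker:
    """Hypothetical tracker: usage and rate-limit hits are counted separately."""

    daily_limit: int        # placeholder; the real tier limit is not in the report
    used: int = 0           # requests that counted against the allowance
    rate_limited: int = 0   # HTTP 429 responses, treated as transient here

    def record_response(self, status_code: int) -> None:
        if status_code == 429:
            # Assumption: a 429 signals throttling, not a depleted allowance.
            self.rate_limited += 1
        elif 200 <= status_code < 300:
            self.used += 1

    def percent_used(self) -> float:
        # Only successful requests contribute to the displayed percentage.
        return min(100.0, 100.0 * self.used / self.daily_limit)


# Illustrative data only: 20 successful responses and 7 throttled ones.
tracker = QuotaTracker(daily_limit=1000)
for code in [200] * 20 + [429] * 7:
    tracker.record_response(code)

print(f"{tracker.percent_used():.1f}% used, {tracker.rate_limited} rate-limited")
```

With this separation, a burst of 429 responses raises the rate-limited counter but cannot push the displayed quota to 100% on its own.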

2. Issue: Extreme "Thinking" Latency (10+ Minutes)

Requests frequently hang in a processing state for an unacceptable duration.

Symptom: The model remains in a "Thinking" loop for over 10 minutes without providing a response or triggering a timeout error.
Impact: This renders the CLI unusable during these hangs, requiring a manual process kill (Ctrl+C).
Stats Context: /stats shows an average latency of 5.6s on Pro, but individual requests peak above 600s (10 minutes).
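
The reported average and the reported peaks are hard to reconcile if hung requests are included in the average. A quick check, assuming the 14 Pro requests from the session stats and a single 600 s hang (an illustrative assumption, not a measurement from the report):

```python
# Lower bound on the mean latency if one of n requests took `peak_seconds`
# and every other request took 0 seconds (the most favorable case).
n_requests = 14      # Pro requests reported in the session stats
peak_seconds = 600   # one hang of roughly 10 minutes

min_possible_mean = peak_seconds / n_requests
print(f"{min_possible_mean:.1f}")  # 42.9
```

Since 42.9 s is well above the reported 5.6 s, the /stats average presumably excludes hung or cancelled requests, which would explain why the average looks healthy while the tool is unusable in practice.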

3. High Error Rates

The /stats breakdown shows a high failure rate that correlates with the quota/hang issues:

gemini-3.1-pro-preview: 28.6% Error Rate (4/14 requests)
gemini-3-flash-preview: 27.3% Error Rate (3/11 requests)
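
The percentages follow directly from the request counts, as this short check shows (the helper name `error_rate` is only for illustration):

```python
def error_rate(errors: int, total: int) -> float:
    """Return the error rate as a percentage, rounded to one decimal place."""
    return round(100 * errors / total, 1)


print(error_rate(4, 14))  # 28.6 (gemini-3.1-pro-preview)
print(error_rate(3, 11))  # 27.3 (gemini-3-flash-preview)
```

Note that the Flash rate is computed over 11 requests, whereas the session stats quoted earlier list 13 Flash requests; the report does not explain the difference.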

Expected Behavior

The CLI should accurately sync with the Google One AI Pro backend quota.
The agent should implement a reasonable timeout (e.g., 2 minutes) rather than hanging for 10+ minutes.
Errors should be clearly identified as "Server Latency" or "Connection Timeout" rather than defaulting to "Quota Reached."
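
The timeout and error-labeling expectations can be sketched in a few lines of Python. This is a hedged illustration of the requested behavior, not code from Gemini CLI: `call_with_deadline`, `label_error`, `RequestTimeout`, and `QuotaExhausted` are hypothetical names, and the 120-second default simply mirrors the 2-minute figure suggested above.

```python
import asyncio


class RequestTimeout(Exception):
    """Raised when a request exceeds the client-side deadline."""


class QuotaExhausted(Exception):
    """Hypothetical: raised only when the backend confirms the quota is spent."""


async def call_with_deadline(make_request, timeout_s: float = 120.0):
    """Await make_request(), but give up after timeout_s seconds."""
    try:
        return await asyncio.wait_for(make_request(), timeout=timeout_s)
    except asyncio.TimeoutError:
        raise RequestTimeout(f"no response within {timeout_s:.0f}s") from None


def label_error(exc: Exception) -> str:
    """Map an exception to a user-facing label instead of defaulting to quota."""
    if isinstance(exc, RequestTimeout):
        return "Connection Timeout"
    if isinstance(exc, QuotaExhausted):
        return "Quota Reached"
    return "Server Error"


async def slow_request():
    await asyncio.sleep(10)  # simulates a request stuck in a "Thinking" state
    return "done"


async def main() -> str:
    try:
        return await call_with_deadline(slow_request, timeout_s=0.05)
    except Exception as exc:
        return label_error(exc)


result = asyncio.run(main())
print(result)  # Connection Timeout
```

The short 0.05 s deadline is only there to keep the demonstration fast; the point is that a hang surfaces as a distinct, labeled timeout rather than an indefinite wait or a misleading quota message.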

Reproduction Steps

Authenticate with a Google One AI Pro account.
Run a session involving architectural planning or tool calls.
Observe "Quota Reached" status bar message after ~25 requests.
Observe intermittent hangs where the agent stays in a "Thinking" state for 10+ minutes.

What did you expect to happen?

I expect Gemini CLI to be at least usable on my plan, but right now it is quite difficult to use.

Client information

CLI Version: 0.41.2
Git Commit: b0c7a1722
Session ID: c005b748-d6b6-40b6-9351-25c9cd4f71f7
Operating System: win32 v24.15.0
Sandbox Environment: no sandbox
Model Version: auto-gemini-3
Auth Type: oauth-personal
Memory Usage: 190.7 MB
Terminal Name: Unknown
Terminal Background: #0c0c0c
Kitty Keyboard Protocol: Unsupported

Login information

No response

Anything else we need to know?

No response
