codex: Severe quota drain accompanied by intelligence/quality degradation in VSCode extension (Suspected Context/Cache issue)

StepCodex · 2026-05-26T09:51:34Z



What version of the IDE extension are you using?

26.5519.32039

What subscription do you have?

Recently, I have experienced a dual issue with the VSCode extension: a drastic, anomalous surge in quota consumption combined with a noticeable regression in model intelligence and task completion quality during my daily routines.

Which IDE are you using?

VS Code

What platform is your computer?

Linux 5.15.0-177-generic x86_64 x86_64

What issue are you seeing?

Describe the Bug

1. Anomalous Quota Drain: My weekly and 5-hour fast quotas are being drained at an unprecedented speed (fully exhausted within 2 days of regular work). This strongly indicates that Prompt Caching is failing, forcing the system to re-evaluate the full workspace context or chat history on every single request.
2. Intelligence & Quality Degradation: Concurrently, the quality of task completion has significantly dropped. Previously, Codex could complete high-quality tasks autonomously with minimal guidance. Now, it seems to "lose track" of the project context mid-session, requiring frequent, multi-turn manual prompt adjustments and course-corrections to deliver acceptable results.

These two symptoms are likely highly correlated: the failure in context management/caching not only inflates token costs but also causes the model to lose the deep contextual awareness needed for high-quality generation.
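To illustrate why a failing prompt cache would drain quota so quickly, here is a rough back-of-the-envelope sketch. It is not how the extension actually meters usage. The function name, the per-turn token counts, and the simplification that cached tokens cost nothing are all assumptions made for illustration.

```python
def billed_input_tokens(turn_sizes, cache_works):
    """Estimate input tokens processed at full price over one session.

    turn_sizes:  tokens newly added to the conversation at each turn.
    cache_works: if True, the already-seen prefix is treated as a cache hit
                 and not counted (a simplification: real caches usually
                 discount cached tokens rather than making them free).
    """
    total = 0
    history = 0  # tokens accumulated from earlier turns
    for new_tokens in turn_sizes:
        if cache_works:
            total += new_tokens            # only the new suffix is processed
        else:
            total += history + new_tokens  # whole history is re-processed
        history += new_tokens
    return total


# Hypothetical session: 10 turns, each adding 2000 tokens of context.
turns = [2000] * 10
with_cache = billed_input_tokens(turns, cache_works=True)
without_cache = billed_input_tokens(turns, cache_works=False)
print(with_cache, without_cache, without_cache / with_cache)
```

Under these assumptions the uncached cost grows roughly quadratically with the number of turns, which would be consistent with a quota that used to last a week running out in two days.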

What steps can reproduce the bug?

Steps to Reproduce

1. Open a workspace and initiate a coding task or daily routine.
2. Proceed with multi-turn interactions or continuous code generations within the same session.
3. Observe the swift depletion of the quota.
4. Note that the model frequently fails to retain previous instructions, requiring constant micro-management to get the output right.
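To make step 3 measurable rather than anecdotal, one option is to record per-turn token usage and compute a cache-hit ratio for the session. The sketch below assumes such numbers can be collected somehow. The `TurnUsage` record and its field names are hypothetical and do not correspond to any confirmed log format of the extension.

```python
from dataclasses import dataclass


@dataclass
class TurnUsage:
    # Hypothetical per-turn usage record; field names are assumptions.
    input_tokens: int          # all input tokens sent for this turn
    cached_input_tokens: int   # the part of input_tokens served from cache


def cache_hit_ratio(usages):
    """Fraction of all input tokens in the session that were cache hits."""
    total = sum(u.input_tokens for u in usages)
    cached = sum(u.cached_input_tokens for u in usages)
    return cached / total if total else 0.0


# Made-up numbers for a short session.
session = [
    TurnUsage(input_tokens=1000, cached_input_tokens=0),
    TurnUsage(input_tokens=3000, cached_input_tokens=2000),
]
print(f"cache hit ratio: {cache_hit_ratio(session):.2f}")
```

A ratio that stays near zero across a long multi-turn session would support the caching hypothesis, while a high ratio would point the investigation elsewhere.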

What is the expected behavior?

Expected Behavior

- Prompt Cache should work seamlessly to optimize token usage.
- The model should maintain consistent high-quality reasoning and contextual awareness across multiple turns without needing redundant corrections.
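Prefix-based prompt caches generally hit only when the beginning of a request matches an earlier one exactly. If the raw prompts sent on consecutive turns can be captured, a quick way to check this expectation is to measure how much of each prompt survives as the prefix of the next. This is a minimal sketch: the function name is hypothetical, and whether the extension exposes its prompts at all is an assumption.

```python
import os


def prefix_stability(prompts):
    """For each consecutive pair of prompts, return the fraction of the
    earlier prompt that is reused verbatim as the prefix of the later one.
    A value of 1.0 means the later prompt strictly extends the earlier one."""
    ratios = []
    for prev, curr in zip(prompts, prompts[1:]):
        shared = len(os.path.commonprefix([prev, curr]))
        ratios.append(shared / len(prev) if prev else 1.0)
    return ratios


# Toy example: the third prompt changes early content, so most of the
# second prompt can no longer be matched as a prefix.
print(prefix_stability(["abcd", "abcdef", "abXdefgh"]))
```

Values well below 1.0 early in a session would suggest that something near the start of the prompt (for example injected workspace context) changes between turns, which would defeat prefix caching.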

Additional information

No response
