codex - Unexpected rapid 5h limit depletion during long-running architect workflows


What version of Codex CLI is running?

0.130.0

What subscription do you have?

ChatGPT Plus

Which model were you using?

gpt-5.5

What platform is your computer?

Linux 5.10.16.3-microsoft-standard-WSL2 x86_64 x86_64

What terminal emulator and version are you using (if applicable)?

Windows Terminal + WSL2 Ubuntu 22.04 + tmux

Codex doctor report

not available

What issue are you seeing?

I’m seeing unexpectedly fast 5h limit depletion during long-running architecture/system-design workflows.

Environment:

  • Codex CLI used for repo-based engineering workflows
  • iterative semantic/UX/system design discussions
  • long-running architect-style sessions

Observed behavior:

  • the 5h limit depletes much faster than in previous sessions
  • even short prompts become expensive late in a long session
  • context window usage reached ~230K of 258K tokens
  • token usage becomes difficult to predict
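
A rough way to see why short prompts can get expensive late in a session: if each turn resends the accumulated conversation as input, the tokens processed per turn grow with the context rather than with the prompt. That is an assumption about how usage is metered, not something confirmed in this report, and the sketch ignores prompt caching and compaction. The function name and the per-turn sizes are hypothetical.

```python
def simulate_session(turns: int, prompt_tokens: int, reply_tokens: int,
                     start_context: int = 0) -> list[int]:
    """Return tokens processed on each turn of a growing session.

    Assumption (not confirmed by the report): every turn processes the
    whole accumulated context plus the new prompt and reply.
    """
    context = start_context
    per_turn = []
    for _ in range(turns):
        # Tokens handled on this turn: everything so far plus the new exchange.
        per_turn.append(context + prompt_tokens + reply_tokens)
        # The new exchange becomes part of the context for the next turn.
        context += prompt_tokens + reply_tokens
    return per_turn


if __name__ == "__main__":
    # Early in a session: three small turns starting from an empty context.
    print(simulate_session(turns=3, prompt_tokens=100, reply_tokens=400))
    # Late in a session: one short prompt on top of ~230K tokens of context.
    print(simulate_session(turns=1, prompt_tokens=50, reply_tokens=200,
                           start_context=230_000))
```

Under this model the same 50-token prompt costs a few hundred tokens at the start of a session and over 230K tokens near the end, which would match the reported pattern if the limit tracks processed tokens.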

Main pain points:

  • lack of visibility into actual context/token contributors
  • unclear auto-attached repo context behavior
  • long-session context inflation
  • reasoning replay costs are opaque

Suggestions:

  • context/token usage breakdown
  • repo attach visibility
  • manual context budgeting
  • lightweight architect mode
  • easier session reset/compression tools
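
As an illustration of the first suggestion, a context/token usage breakdown could be as simple as a per-contributor table. This is a minimal sketch of the requested output, not an existing Codex CLI feature; the function name, the category names, and the numbers in the example are hypothetical and are not measurements from the reported thread.

```python
def format_breakdown(contributors: dict[str, int], window: int) -> list[str]:
    """Render a per-contributor token table, largest contributor first."""
    total = sum(contributors.values())
    lines = []
    ranked = sorted(contributors.items(), key=lambda kv: kv[1], reverse=True)
    for name, tokens in ranked:
        # Share of the currently used context, not of the full window.
        share = 100 * tokens / total if total else 0.0
        lines.append(f"{name:<24}{tokens:>9,}  {share:5.1f}%")
    lines.append(f"{'total':<24}{total:>9,}  of {window:,} window")
    return lines


if __name__ == "__main__":
    # Hypothetical categories and made-up values, for illustration only.
    example = {
        "conversation history": 150_000,
        "auto-attached repo": 50_000,
        "reasoning replay": 25_000,
        "system instructions": 5_000,
    }
    print("\n".join(format_breakdown(example, window=258_000)))
```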

This especially impacts power-user workflows involving:

  • semantic architecture discussions
  • iterative UX refinement
  • repo-wide reasoning sessions
  • long-running engineering conversations

Attached thread ID: 019e3287-e5ea-7cc0-b5af-8c136101fd84

What steps can reproduce the bug?

  1. Start a long-running Codex CLI architecture/engineering session on a repo-based workflow.

  2. Perform iterative semantic and UX discussions across many prompts:

    • architecture refinement
    • repo semantics
    • workflow design
    • command UX discussions
    • implementation iterations
  3. Continue the same session for a long period instead of starting fresh sessions.

  4. Observe context growth approaching the maximum context window (~230K/258K in my case).
     (the ~230K/258K figures are token counts: context used versus the maximum window)

  5. After long sessions, even relatively short prompts begin consuming the 5h limit much faster than expected.

Observed result:

  • rapid 5h limit depletion
  • token usage becomes difficult to predict
  • short prompts appear disproportionately expensive late in the session
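
Until better tooling exists, a manual budget check is one way to decide when to start a fresh session instead of continuing. The sketch below computes how full the context window is from the numbers in this report; the function names and the 75% threshold are arbitrary assumptions, not Codex CLI behavior.

```python
def context_pressure(used_tokens: int, window_tokens: int) -> float:
    """Fraction of the context window currently in use."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return used_tokens / window_tokens


def should_reset(used_tokens: int, window_tokens: int,
                 threshold: float = 0.75) -> bool:
    """Suggest a reset once usage crosses the (assumed) threshold."""
    return context_pressure(used_tokens, window_tokens) >= threshold


if __name__ == "__main__":
    # Figures from the report: ~230K tokens used out of a 258K window.
    used, window = 230_000, 258_000
    print(f"context pressure: {context_pressure(used, window):.1%}")
    print("reset suggested:", should_reset(used, window))
```

With the reported figures the window is roughly 89% full, well past the assumed threshold, so this check would have suggested a reset or summary hand-off earlier in the session.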

Uploaded thread: 019e3287-e5ea-7cc0-b5af-8c136101fd84

What is the expected behavior?

Expected behavior:

  • more predictable token consumption during long-running sessions
  • visibility into context/token contributors
  • better transparency around auto-attached repo context
  • easier session reset/compression workflows

Additional information

I understand rate limits are usually automated and may not be manually adjustable.

However, if this session behavior appears abnormal from the uploaded thread data, I’d greatly appreciate any temporary assistance, clarification, or investigation support regarding the unusually fast 5h limit depletion.

Thank you again for reviewing the report.
