codex: TUI /resume picker can block on global rollout scan despite cwd filter

openai/codex#22037Fetched 2026-05-11 03:20:25
Root Cause

The TUI /resume picker can be slow in profiles with many local rollout files because the cwd-filtered resume list still goes through a filesystem-first scan of the global rollout tree.


What version of Codex CLI is running?

codex-cli 0.130.0

Local source checkout used for code-path inspection: openai/codex at cac5354455.

What subscription do you have?

N/A for this report. This is local CLI/TUI resume picker behavior before a model request is made.

Which model were you using?

N/A. The observed latency is in local session listing / resume picker loading.

What platform is your computer?

Darwin 25.2.0 arm64 arm

What terminal emulator and version are you using (if applicable)?

VS Code integrated terminal, vscode 1.109.5.

What issue are you seeing?

The TUI /resume picker can be slow in profiles with many local rollout files because the cwd-filtered resume list still goes through a filesystem-first scan of the global rollout tree.

This is a narrower CLI/TUI variant of the larger local-history performance class discussed in #18693, and it also overlaps with resume listing correctness issues such as #20165 and #21619. The specific problem here is the first-screen /resume list path: even when the picker is scoped to the current working directory, the storage layout and listing path can force global rollout scanning before the user sees the picker results.

The current flow appears to be:

  1. /resume opens the TUI resume picker.
  2. The picker builds a thread/list request with a cwd filter and source filter.
  3. The request still sets use_state_db_only: false.
  4. The backend uses filesystem-first listing so it can repair / validate SQLite metadata.
  5. For updated_at ordering, the rollout list path must scan files because updated time is not encoded in filenames.
  6. After selecting a session, cold resume can additionally load the full rollout JSONL into memory.
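Steps 4 and 5 can be illustrated with a minimal sketch, assuming a plain recursive walk over the rollout tree. This is not the Codex implementation; `scan_rollouts`, `is_rollout_file`, and `list_newest_first` are hypothetical names, and only the `rollout-*.jsonl` filename pattern comes from this report. It shows why a newest-first page costs one metadata call per rollout file in the whole tree, no matter how few entries end up on the first page:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// True for files named like `rollout-*.jsonl` (pattern taken from the report).
fn is_rollout_file(path: &Path) -> bool {
    path.file_name()
        .and_then(|name| name.to_str())
        .map(|name| name.starts_with("rollout-") && name.ends_with(".jsonl"))
        .unwrap_or(false)
}

/// Hypothetical helper: walk `root` recursively and record every rollout
/// file with its modification time. One metadata lookup per entry.
fn scan_rollouts(root: &Path, out: &mut Vec<(SystemTime, PathBuf)>) -> io::Result<()> {
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        let path = entry.path();
        let meta = entry.metadata()?;
        if meta.is_dir() {
            scan_rollouts(&path, out)?;
        } else if is_rollout_file(&path) {
            out.push((meta.modified()?, path));
        }
    }
    Ok(())
}

/// Newest-first page of at most `limit` files. The sort needs every file's
/// mtime first, so the work grows with the whole tree, not with `limit`.
fn list_newest_first(root: &Path, limit: usize) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    scan_rollouts(root, &mut files)?;
    files.sort_by(|a, b| b.0.cmp(&a.0));
    Ok(files.into_iter().take(limit).map(|(_, path)| path).collect())
}
```

Calling `list_newest_first(root, 25)` still stats every rollout file under `root`; the limit only trims the result after the full scan and sort.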

Relevant code paths from current main:

  • codex-rs/tui/src/resume_picker.rs: thread_list_params includes cwd/source filters but sets use_state_db_only: false.
  • codex-rs/rollout/src/recorder.rs: list_threads_with_db_fallback performs filesystem-first listing and read-repair before returning filtered listings.
  • codex-rs/rollout/src/list.rs: the UpdatedAt sort path documents that it must scan files up to the scan cap because updated_at is not encoded in filenames.
  • codex-rs/thread-store/src/local/read_thread.rs and codex-rs/rollout/src/recorder.rs: cold resume history loading ultimately calls RolloutRecorder::load_rollout_items, which uses tokio::fs::read_to_string(path) for the rollout.
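For the last code path, a bounded read is one possible alternative to loading the whole rollout. The sketch below is an illustration only: it uses blocking `std` I/O rather than the `tokio` call named above, does not parse JSON, and `read_first_lines` is a hypothetical helper, not a function in the Codex codebase. Whether a partial history is enough for resume is not established by this report.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};
use std::path::Path;

/// Hypothetical helper: return at most `max_lines` non-empty lines from a
/// JSONL file. The file is read through a buffer, so memory use is bounded
/// by the returned lines rather than by the file size.
fn read_first_lines(path: &Path, max_lines: usize) -> io::Result<Vec<String>> {
    let mut lines = Vec::new();
    if max_lines == 0 {
        return Ok(lines);
    }
    let reader = BufReader::new(File::open(path)?);
    for line in reader.lines() {
        let line = line?;
        if line.trim().is_empty() {
            continue; // skip blank separator lines
        }
        lines.push(line);
        if lines.len() >= max_lines {
            break; // stop before touching the rest of the file
        }
    }
    Ok(lines)
}
```

With a 141 MB rollout like the largest one measured in this report, `read_first_lines(path, 100)` would stop after the first 100 records instead of buffering the full file.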

Local measurement from my profile:

~/.codex/sessions rollout files: 4829
~/.codex/sessions size: 2.1G
largest rollout file: 141,283,019 bytes
state_5.sqlite threads: 4541 total, 330 cli/vscode, 1 for the current cwd
raw stat/sort of ~/.codex/sessions rollout files: 10.70s

For comparison, Claude Code stores transcripts under per-project directories:

~/.claude/projects/<sanitized-cwd>/<session-id>.jsonl

On the same machine, scanning one heavy Claude project directory and sorting by mtime took about 0.74s. This is not because the Claude profile is smaller overall: the full ~/.claude/projects tree has 6317 jsonl files. The faster common path comes from physical per-project partitioning, so "continue current directory" only has to inspect the current project's directory, not every project.
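The partitioning idea can be sketched as follows. The sanitization rule here (every non-alphanumeric ASCII character becomes `-`) is an assumption made for illustration, not a documented rule of either tool, and `sanitize_cwd`, `project_dir`, and `list_project_sessions` are hypothetical names:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Assumed sanitizer: map every character that is not ASCII alphanumeric
/// to '-', so a cwd becomes a single directory name.
fn sanitize_cwd(cwd: &Path) -> String {
    cwd.to_string_lossy()
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() { c } else { '-' })
        .collect()
}

/// Directory that would hold the sessions for one working directory.
fn project_dir(projects_root: &Path, cwd: &Path) -> PathBuf {
    projects_root.join(sanitize_cwd(cwd))
}

/// List only the current project's `.jsonl` sessions. The cost depends on
/// this one directory, not on how many other projects exist.
fn list_project_sessions(projects_root: &Path, cwd: &Path) -> io::Result<Vec<PathBuf>> {
    let mut sessions = Vec::new();
    for entry in fs::read_dir(project_dir(projects_root, cwd))? {
        let path = entry?.path();
        if path.extension().and_then(|ext| ext.to_str()) == Some("jsonl") {
            sessions.push(path);
        }
    }
    Ok(sessions)
}
```

Under this layout the cwd filter is applied by choosing a directory, before any file metadata is read.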

What steps can reproduce the bug?

  1. Accumulate many Codex local rollout files under ~/.codex/sessions, especially across multiple projects.
  2. Make sure only a small subset belongs to the current working directory.
  3. Open Codex TUI in one project directory.
  4. Run /resume.
  5. Observe that the resume picker can take noticeable time to show or refresh results.
  6. Inspect the listing path: the picker passes a cwd filter, but because use_state_db_only is false, the backend can still scan the global rollout tree and repair/validate metadata before returning the page.

A local way to estimate the worst-case scan pressure is:

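# Count the rollout files.
find ~/.codex/sessions -type f -name 'rollout-*.jsonl' | wc -l
# Stat every file (mtime, size, name) and sort newest first.
# `stat -f` is the macOS/BSD syntax; with GNU stat use: stat -c '%Y %s %n'
find ~/.codex/sessions -type f -name 'rollout-*.jsonl' \
  | while IFS= read -r f; do stat -f '%m %z %N' "$f"; done \
  | sort -nr \
  | head -25 >/dev/null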

What is the expected behavior?

Opening /resume in the TUI should make the common current-directory picker path fast and bounded by current-project metadata, not by the number of rollout files across all projects.

In particular, if the SQLite state DB already has current cwd/source/provider metadata, the first page should be able to render from that index without waiting for a global filesystem scan/repair.

Additional information

Suggested phased fix:

  1. For the default TUI /resume current-cwd list, try use_state_db_only: true first.
  2. If the DB returns no usable results, errors, or the user explicitly selects a global/show-all mode, fall back to the existing filesystem scan path.
  3. Move filesystem repair/reconciliation for normal picker opening to a background task so it does not block the first visible page.
  4. Longer term: add a per-cwd sidecar/index or encode enough metadata to avoid global rollout scans for project-scoped listing.
  5. Separately, consider making cold resume avoid synchronously reading very large rollout files in full before the UI can recover.

This keeps the existing correctness fallback while making the common interactive path closer to a DB-first, project-scoped resume picker.
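Phases 1 and 2 could look roughly like the sketch below. It is a simplified model, not a patch against the Codex code: `ThreadSummary`, `ListSource`, and `list_for_picker` are hypothetical, and the two closures stand in for the state-DB query and the existing filesystem scan. The background repair from phase 3 is not modeled.

```rust
/// Hypothetical row shape for a picker entry.
#[derive(Debug, Clone, PartialEq)]
struct ThreadSummary {
    id: String,
    cwd: String,
}

/// Which path produced the page, so a caller could log or test it.
#[derive(Debug, PartialEq)]
enum ListSource {
    StateDb,
    FilesystemScan,
}

/// DB-first listing for the default current-cwd picker.
/// Falls back to the filesystem scan when the user asked for all sessions,
/// when the DB query fails, or when it returns no rows.
fn list_for_picker<D, F>(
    cwd: &str,
    show_all: bool,
    db_list: D,
    fs_scan: F,
) -> (ListSource, Vec<ThreadSummary>)
where
    D: FnOnce(&str) -> Result<Vec<ThreadSummary>, String>,
    F: FnOnce(Option<&str>) -> Vec<ThreadSummary>,
{
    if !show_all {
        if let Ok(rows) = db_list(cwd) {
            if !rows.is_empty() {
                return (ListSource::StateDb, rows);
            }
        }
    }
    // Existing behavior: scan the rollout tree, filtered by cwd unless
    // the user selected the show-all mode.
    let filter = if show_all { None } else { Some(cwd) };
    (ListSource::FilesystemScan, fs_scan(filter))
}
```

The fallback conditions mirror step 2 of the list above, so a stale or empty DB degrades to today's behavior rather than to an empty picker.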

If this direction matches the maintainers' intent, I am happy to send a small invited PR for the Phase 1 change.
