codex - 💡(How to fix) Fix TUI redraw pattern triggers macOS kernel `kalloc.1024` (vfs.namei) zone leak — 0.128.0 on macOS 26.4.1 [1 comments, 2 participants]

GitHub stats
openai/codex#20843 · Fetched 2026-05-04 05:09:33
Comments: 1 · Participants: 2 · Timeline: 6 · Reactions: 0
Timeline (top): labeled ×3, closed ×1, commented ×1, unlabeled ×1


Summary

On macOS 26.4.1 the data.kalloc.1024 kernel zone (vfs.namei lookup buffers) grows monotonically while a long-running codex session is active. Growth is sustained at roughly 370 MB/hour. The behaviour has been independently observed and compared against the Anthropic Claude Code report at https://github.com/anthropics/claude-code/issues/44824, which identifies the underlying issue as a macOS kernel-side bug in the GPU driver / WindowServer chain triggered by high-frequency TUI redraw patterns.

This issue is filed to (a) make sure the Codex maintainers are aware that codex is in the same trigger-class as Claude Code, (b) share a self-contained reproduction with PID-attributed evidence so the codex side can evaluate throttle/batch/plain-mode mitigations, and (c) link the existing Ghostty fix (https://github.com/ghostty-org/ghostty/issues/10289 → Ghostty 1.3) as a reference for what an emulator-side fix looked like.

Environment

  • macOS 26.4.1 (25E253), ARM64 (Apple Silicon)
  • Terminal: iTerm2 (iTermServer-3.6.10)
  • codex CLI: 0.128.0 (latest on npm as of 2026-05-02)
  • Installed via mise / npm: /Users/<user>/.local/share/mise/installs/node/22.22.0/lib/node_modules/@openai/codex

Symptom (measurable, reproducible by simply leaving codex running)

The kernel zone data.kalloc.1024 grows monotonically. Sample timeline from one machine:

2026-04-29 21:58  data.kalloc.1024 = 17.47 GB   (baseline)
2026-05-02 16:45  data.kalloc.1024 = 17.90 GB   (after ~3 days)
2026-05-02 16:30:29  inuse=18,681,159  17.82GB
2026-05-02 16:35:32  inuse=18,712,875  17.85GB  +31,716 elems / 303s  (~32 MB/5min ≈ 370 MB/h)
2026-05-02 16:40:34  inuse=18,745,587  17.88GB  +32,712 elems / 302s

Here kalloc.1024 denotes the 1024-byte kernel slab allocations used by vfs.namei for path lookup buffers. The allocations are never freed; exhaustion eventually causes a WindowServer watchdog timeout / kernel panic (this matches the anthropics/claude-code#44824 finding).
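The rate quoted in the timeline can be re-derived from the raw element counts. The following is a minimal sketch of that arithmetic only: `growth_rate` is a hypothetical helper name, and the 1024-byte element size is taken from the zone name as described above.

```rust
/// Size of one element in the kalloc.1024 zone, in bytes (per the zone name).
const ELEM_BYTES: f64 = 1024.0;

/// Hypothetical helper: converts two `inuse` element counts and the time
/// between them into (elements per second, MiB per hour).
fn growth_rate(elems_before: u64, elems_after: u64, elapsed_secs: u64) -> (f64, f64) {
    let delta = elems_after.saturating_sub(elems_before) as f64;
    let elems_per_sec = delta / elapsed_secs as f64;
    let mib_per_hour = elems_per_sec * 3600.0 * ELEM_BYTES / (1024.0 * 1024.0);
    (elems_per_sec, mib_per_hour)
}

fn main() {
    // First sampled interval in the timeline: 16:30:29 -> 16:35:32 (303 s).
    let (eps, mib_h) = growth_rate(18_681_159, 18_712_875, 303);
    println!("{eps:.1} elems/s, {mib_h:.0} MiB/h");
}
```

For the first interval this gives about 104.7 elements per second and about 368 MiB/h, consistent with the "≈ 370 MB/h" annotation. If each leaked element corresponds to one path lookup, as the report assumes, that is on the order of a hundred lookups per second during the sampled window.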

Smoking-gun evidence — single codex PID is the only sustained tty writer

Across 8 long-running CLI sessions on the same host (4× claude, 4× codex incl. one Codex.app GUI), measured tty fd 1u write offset deltas over 17.6 seconds:

PID                                                                tty              bytes/sec
51662, 8869, 8579, 11756, 11327, 37246, 78988 (other 7 sessions)   various ttysXXX  0
18124 (codex)                                                      /dev/ttys006     ~8,200

i.e. the only process producing sustained tty output on this host is one codex CLI session. Every other long-lived CLI is idle at the tty level.

sample 18124 3 (Apple sample tool, 3s @ 1ms) shows main-thread activity at 56/2354 ticks. Among those active ticks, the stack ends in:

... (codex internal) → __open  (in libsystem_kernel.dylib)
... (codex internal) → write   (in libsystem_kernel.dylib)
... → cthread_yield → swtch_pri   (busy yield pattern)

Each kernel-side __open() runs through vfs.namei, which allocates a 1024-byte buffer in kalloc.1024. On affected macOS builds those buffers do not get freed.

Suspected codex code path (NOT confirmed as the leak source — see "Apple-side root cause" below)

strings against the shipped Rust binary surfaces:

/Users/runner/work/codex/codex/codex-rs/tui/src/custom_terminal.rs
tui/src/custom_terminal.rs:189    (tracing event)
tui/src/app/resize_reflow.rs
tui/src/chatwidget.rs
... (50+ tui/src/*.rs paths)

In codex-rs/tui/src/custom_terminal.rs (read from main branch on 2026-05-02):

  • L440-448 autoresize() — invokes self.size() on every draw call, delegating to the backend (crossterm)
  • L467-525 try_draw() — main render entry point
  • L780-824 draw() — emits ANSI escape sequences per cell via queue!() macro

The codex source itself does not contain any direct File::open("/dev/tty") — so the per-frame open() syscall observed in the kernel stack is most likely arriving via crossterm or Rust std internals on the terminal::size() path. (Confirming this in crossterm would be useful for the upstream report — happy to follow up if maintainers want.)

Apple-side root cause (per claude-code#44824)

"Claude Code's TUI generates a high-frequency stream of terminal redraws. Each redraw triggers kernel allocations through the GPU compositor pipeline: Claude Code → terminal escape sequences → Terminal.app/iTerm → WindowServer → AGXG13X (Apple GPU driver) → kalloc.1024 allocations The allocations in kalloc.1024 are never freed — this is a kernel-side bug (likely in Apple's GPU driver or WindowServer), but Claude Code is the only known application that triggers it at this rate due to its TUI rendering pattern."

Codex is in the same trigger-class. An Apple Feedback report has been filed in parallel (we will link the Feedback ID here once Apple assigns one).

Why Ghostty's analysis is relevant here (precise framing)

Ghostty 1.3 fixed a different but related memory leak — a Ghostty-process-side RSS leak in PageList scrollback pruning that allowed non-standard mmap pages to be reused without proper munmap (Mitchell Hashimoto, "Finding and Fixing Ghostty's Largest Memory Leak", https://mitchellh.com/writing/ghostty-memory-leak-fix , and ghostty-org/ghostty#10289). The trigger pattern in Ghostty's case was Claude Code's stream of multi-codepoint grapheme outputs causing Ghostty to allocate non-standard pages frequently.

That fix is not the same as the macOS kernel kalloc.1024 leak — they are two distinct bugs. But the trigger class is the same: high-frequency, multi-codepoint TUI output. This matters here because:

  • It demonstrates that emulator/CLI-side adjustments to output cadence and grapheme handling do make material differences in downstream allocation pressure.
  • It is a precedent for the proposition that mitigations on the application side (codex) can blunt the impact of an underlying allocator-side bug, even before the underlying bug is fixed.

Possible mitigations on the codex side (open for discussion)

Borrowed from the claude-code#44824 mitigation list and adapted to codex's ratatui/crossterm stack:

  1. Throttle the redraw loop when the user is not actively typing or when no model token has arrived in the last N ms (e.g., cap the spinner/animation refresh at 5–10 Hz instead of whatever the current cadence is). See also openai/codex#11877 where the underlying "terminal animations use excessive terminal output" pattern is already tracked — this issue quantifies its kernel-level impact.
  2. Batch escape sequences more aggressively before flushing — queue!() already batches per-cell, but a final flush boundary at the end of every animation tick rather than per logical update may help.
  3. Plain output / --no-tui mode for long-running sessions (server-side, headless workflows). Already partially exists for codex exec and similar — extending it to interactive long sessions would let users escape the leak without losing codex itself.
  4. Investigate terminal::size() cadence: autoresize() is called on every try_draw(). crossterm's Unix implementation explicitly calls File::open("/dev/tty") per call; see crossterm/src/terminal/sys/unix.rs line 64 (libc feature) and line 77 (non-libc), plus line 222 (File::options().write(true).open("/dev/tty") in the keyboard enhancement query). Each open("/dev/tty") runs through kernel vfs.namei and allocates one kalloc.1024 slab element. Replacing the per-draw call with a SIGWINCH-driven size cache (cached fd or cached size invalidated only on SIGWINCH) would eliminate the per-frame namei traffic entirely on the codex side, regardless of the underlying Apple kernel bug. This is the highest-leverage codex-side mitigation in the list. (Estimated effect from code analysis; we have not yet built a patched codex fork and measured kalloc.1024 growth before/after. A small SIGWINCH-cache test fork + zprint delta would be the cleanest verification.)
  5. Document the issue in the codex CLI README / troubleshooting guide so users running long sessions on macOS 26.x are aware and can monitor kalloc.1024 (script in claude-code#44824).
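To make mitigation 4 concrete, here is a minimal, std-only sketch of a resize-invalidated size cache. It is not codex's or crossterm's code: `SizeCache` and `simulate_frames` are hypothetical names, the size query is a stub closure standing in for a call such as crossterm's terminal::size(), and the `dirty` flag is assumed to be set by whatever delivers the resize notification (a SIGWINCH handler or the event loop's resize event).

```rust
use std::cell::Cell;
use std::io;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical cache: runs the expensive size query only when the shared
/// `dirty` flag has been set (assumed to happen on SIGWINCH) or when no
/// size has been cached yet.
struct SizeCache<F> {
    dirty: Arc<AtomicBool>,
    query: F,
    cached: Option<(u16, u16)>,
}

impl<F: FnMut() -> io::Result<(u16, u16)>> SizeCache<F> {
    fn new(dirty: Arc<AtomicBool>, query: F) -> Self {
        Self { dirty, query, cached: None }
    }

    fn size(&mut self) -> io::Result<(u16, u16)> {
        // swap(false) clears the flag and reports whether it was set.
        let resized = self.dirty.swap(false, Ordering::AcqRel);
        if let (false, Some(size)) = (resized, self.cached) {
            return Ok(size); // fast path: no syscall, no open("/dev/tty")
        }
        match (self.query)() {
            Ok(size) => {
                self.cached = Some(size);
                Ok(size)
            }
            Err(err) => {
                // Re-arm the flag so the next call retries the query.
                self.dirty.store(true, Ordering::Release);
                Err(err)
            }
        }
    }
}

/// Draws `frames` frames against a stub query and returns how many times
/// the underlying query ran. `resize_at` simulates one resize signal.
fn simulate_frames(frames: u32, resize_at: Option<u32>) -> u32 {
    let dirty = Arc::new(AtomicBool::new(false));
    let calls = Cell::new(0u32);
    let mut cache = SizeCache::new(Arc::clone(&dirty), || {
        calls.set(calls.get() + 1);
        Ok((80, 24)) // stand-in for the real terminal size query
    });
    for frame in 0..frames {
        if resize_at == Some(frame) {
            dirty.store(true, Ordering::Release); // what a SIGWINCH handler would do
        }
        cache.size().expect("stub query never fails");
    }
    calls.get()
}

fn main() {
    println!("queries without resize: {}", simulate_frames(1000, None));
    println!("queries with one resize: {}", simulate_frames(1000, Some(500)));
}
```

In this simulation 1000 frames cost one query with no resize and two with a single resize, instead of one query per frame. Whether that translates into a measurable kalloc.1024 difference is exactly the unverified part noted in item 4.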

Reproduction

  1. macOS 26.4.1 ARM64, iTerm2.
  2. Launch codex (CLI 0.128.0) and start any interactive session that displays the spinner/animation continuously (anything that keeps the spinner glyph cycling).
  3. Watch:
    while true; do sudo zprint | awk '/data\.kalloc\.1024/{print strftime("%T"), int(($2*$7)/1024/1024), "MB inuse"}'; sleep 30; done
  4. Within 5 minutes you should see the inuse value grow by 30+ MB and continue indefinitely. Killing the codex process stops the growth immediately.

Cross-references

  • anthropics/claude-code#44824 — primary technical analysis of the macOS kernel-side kalloc.1024 bug
  • ghostty-org/ghostty#10289 + https://mitchellh.com/writing/ghostty-memory-leak-fix — Ghostty 1.3 fix (separate Ghostty-side mmap-page leak, not the same kernel bug, but same trigger class — multi-codepoint grapheme TUI output)
  • orbstack/orbstack#2368 — independent report of the same kalloc.1024 zone exhaustion via a completely different trigger (ephemeral container creation), strongly suggesting the zone exhaustion is an Apple-kernel-side issue rather than application-side
  • openai/codex#16866 — different kernel zone (os_refcnt overflow) but same category: codex on Apple Silicon macOS exhausting kernel-level resources
  • openai/codex#11877 — terminal animation output as a recognized trigger pattern; this issue quantifies its kernel-level impact
  • openai/codex#13314 — closed without zone-level analysis; this report provides one
  • openai/codex#12414 — Windows-side analog (different zone, plausibly same idle TUI redraw trigger class)
  • openai/codex#9345, #11643, #14666, #17257 — prior memory-related reports against codex (closed without kalloc-zone analysis)
  • crossterm-rs/crossterm src/terminal/sys/unix.rs — exact location of the per-call File::open("/dev/tty") discussed in mitigation #4

What I'm asking for

  • Acknowledgment that codex is in the same trigger-class as Claude Code on affected macOS builds.
  • Discussion of which (if any) of the mitigations above the maintainers consider in scope for codex.
  • (Optional) Confirmation of whether the per-try_draw terminal::size() call is needed at all on macOS, given that SIGWINCH already gives us the resize signal.

I'm happy to provide additional sample/lsof/zprint snapshots, run instrumented codex builds, or test mitigation patches.

extent analysis

TL;DR

The most likely fix for the kalloc.1024 kernel zone growth issue in codex on macOS 26.4.1 is to implement one of the proposed mitigations, such as throttling the redraw loop or batching escape sequences, to reduce the frequency of kernel allocations.

Guidance

  • Investigate the terminal::size() cadence in crossterm and consider replacing the per-draw call with a SIGWINCH-driven size cache to eliminate per-frame vfs.namei traffic.
  • Throttle the redraw loop when the user is not actively typing or when no model token has arrived in the last N ms to reduce the frequency of kernel allocations.
  • Batch escape sequences more aggressively before flushing to reduce the number of kernel allocations.
  • Consider implementing a plain output or --no-tui mode for long-running sessions to escape the leak without losing codex functionality.
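The throttling bullet can be illustrated with a small frame limiter. This is a sketch under stated assumptions, not codex's render loop: `RedrawThrottle` and `count_draws` are hypothetical names, the 10 Hz cap is taken from the 5–10 Hz range suggested in the issue, and the `urgent` flag is an assumed way to let user input or a fresh model token bypass the cap.

```rust
use std::time::{Duration, Instant};

/// Hypothetical limiter: allows at most `max_hz` non-urgent redraws per second.
struct RedrawThrottle {
    min_interval: Duration,
    last_draw: Option<Instant>,
}

impl RedrawThrottle {
    fn with_max_hz(max_hz: u32) -> Self {
        // max(1) avoids a division by zero for a nonsensical 0 Hz cap.
        Self {
            min_interval: Duration::from_secs(1) / max_hz.max(1),
            last_draw: None,
        }
    }

    /// Returns true if a frame should be drawn at `now`. `urgent` (assumed to
    /// mean user input or a new model token) always draws immediately.
    fn should_draw(&mut self, now: Instant, urgent: bool) -> bool {
        let due = match self.last_draw {
            None => true,
            Some(last) => now.duration_since(last) >= self.min_interval,
        };
        if urgent || due {
            self.last_draw = Some(now);
            true
        } else {
            false
        }
    }
}

/// Simulates an idle animation loop ticking every `tick_ms` for `total_ms`
/// and returns how many ticks actually resulted in a draw.
fn count_draws(max_hz: u32, tick_ms: u64, total_ms: u64) -> u32 {
    let start = Instant::now();
    let mut throttle = RedrawThrottle::with_max_hz(max_hz);
    let mut draws = 0;
    let mut t = 0;
    while t < total_ms {
        if throttle.should_draw(start + Duration::from_millis(t), false) {
            draws += 1;
        }
        t += tick_ms;
    }
    draws
}

fn main() {
    // One second of 16 ms animation ticks, capped at 10 Hz.
    println!("draws: {}", count_draws(10, 16, 1000));
}
```

With a 16 ms tick and a 10 Hz cap, one simulated second produces 9 draws rather than 63, while urgent updates still render on the tick they arrive. The real cadence and the right cap for codex are open questions for the maintainers.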

Example

No code snippet is provided as the issue requires a more in-depth analysis of the codex and crossterm codebase.

Notes

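The underlying leak is attributed (per claude-code#44824) to a kernel-side bug, likely in Apple's GPU driver or WindowServer, and the codex code path has not been confirmed as the leak source. Codex can still implement mitigations to reduce the impact. The proposed mitigations may not eliminate the growth, and their effect has not yet been measured, but they should reduce the frequency of kernel allocations.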

Recommendation

Apply a workaround, such as throttling the redraw loop or batching escape sequences, to reduce the frequency of kernel allocations and mitigate the impact of the kernel-side bug. This is a temporary solution until the underlying bug is fixed by Apple.
