codex - CLI: subagents memory leak

What version of Codex CLI is running?

0.130.0

What subscription do you have?

Pro Lite

Which model were you using?

No response

What platform is your computer?

Apple Silicon

What terminal emulator and version are you using (if applicable)?

No response

Codex doctor report

What issue are you seeing?

Spawning many subagents causes the native codex process memory and thread count to grow rapidly. The CLI can become temporarily unusable and may eventually exit/crash.

This appears to happen inside the native codex process, not in the Node wrapper or external child processes.

Observation:

  • codex RSS grew from ~23 MB to 315 MB+
  • vmmap physical footprint reached ~363.7 MB
  • MALLOC_SMALL resident grew from ~29.8 MB to ~229.0 MB
  • malloc allocation count grew from ~79k to ~691k
  • thread count grew from 22 to 68

Once the session is large enough, the leak keeps growing without bound. The process eventually emits "codex(...) MallocStackLogging: can't turn off malloc stack logging because it was not enabled." and the system becomes unusable (see #17139).

What steps can reproduce the bug?

Spawn subagents in a large session

What is the expected behavior?

Subagent fanout should have bounded memory growth.

Completed subagents should not remain fully live in memory indefinitely. Their event queues, session loops, thread resources, and model/tool state should be drained, cold-unloaded, or released after completion, while preserving enough metadata and results for the parent thread.

The CLI should remain responsive and should not exit/crash.
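One way to picture "bounded fanout" is a slot limiter that caps how many subagents are live at once and returns the slot automatically when a child is released. The sketch below is illustrative only: `SlotLimiter`, `SlotGuard`, and `try_reserve` are hypothetical names, not the Codex implementation, and the cap of 2 is arbitrary.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical limiter: caps how many subagents may be live at once.
struct SlotLimiter {
    in_use: AtomicUsize,
    max: usize,
}

/// RAII guard: the slot is handed back when the guard is dropped.
struct SlotGuard {
    limiter: Arc<SlotLimiter>,
}

impl SlotLimiter {
    fn new(max: usize) -> Arc<Self> {
        Arc::new(SlotLimiter { in_use: AtomicUsize::new(0), max })
    }

    /// Reserves a slot, or returns None when the cap is reached.
    fn try_reserve(limiter: &Arc<SlotLimiter>) -> Option<SlotGuard> {
        let mut current = limiter.in_use.load(Ordering::Acquire);
        loop {
            if current >= limiter.max {
                return None;
            }
            // Retry with the freshly observed value if another thread raced us.
            match limiter.in_use.compare_exchange(
                current,
                current + 1,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => return Some(SlotGuard { limiter: Arc::clone(limiter) }),
                Err(actual) => current = actual,
            }
        }
    }

    /// Number of slots currently held.
    fn live(&self) -> usize {
        self.in_use.load(Ordering::Acquire)
    }
}

impl Drop for SlotGuard {
    fn drop(&mut self) {
        self.limiter.in_use.fetch_sub(1, Ordering::AcqRel);
    }
}

fn main() {
    let limiter = SlotLimiter::new(2);
    let first = SlotLimiter::try_reserve(&limiter);
    let second = SlotLimiter::try_reserve(&limiter);
    let third = SlotLimiter::try_reserve(&limiter);
    println!("third reservation granted: {}", third.is_some());
    drop(first); // completing a child frees its slot
    println!("live after one release: {}", limiter.live());
    drop(second);
}
```

Tying the slot to a guard means the count cannot drift if a child exits through an error path, because dropping the guard is the only way to give the slot back.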

Additional information

Cause 1: v1 child event queue may not be drained

codex::spawn creates a per-session unbounded event channel:

  • session/mod.rs:482: async_channel::unbounded()
  • session/mod.rs:747: next_event() consumes from that receiver

When a child thread is spawned, the thread manager consumes the initial SessionConfigured event, but it does not drain later child events (thread_manager.rs:1197, thread_manager.rs:1245).

The completion watcher only subscribes to child status, waits for a terminal status, and injects a completion message into the parent. It does not consume the child event stream (agent/control.rs:320, agent/control.rs:943, agent/control.rs:1012). There is also a TODO about this at agent/control.rs:308.
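The effect described in Cause 1 can be reproduced in miniature. The sketch below uses `std::sync::mpsc::channel` as a stdlib stand-in for `async_channel::unbounded()` (both accept sends without blocking), and `ChildEvent`, `run_child`, `queued_without_drain`, and `drain_while_running` are hypothetical names for illustration, not Codex code. It shows that every event stays queued when nobody reads the receiver, and that a simple drain loop keeps the queue empty.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Hypothetical stand-in for a child session's event type.
enum ChildEvent {
    Progress,
    Finished,
}

/// Simulates a child emitting `n` progress events, then a terminal event.
fn run_child(tx: Sender<ChildEvent>, n: usize) {
    for _ in 0..n {
        // An unbounded send never blocks, so undrained events pile up in memory.
        tx.send(ChildEvent::Progress).expect("receiver alive");
    }
    tx.send(ChildEvent::Finished).expect("receiver alive");
    // `tx` is dropped here, which disconnects the channel.
}

/// Nobody reads while the child runs: returns how many events are still queued.
fn queued_without_drain(n: usize) -> usize {
    let (tx, rx) = channel();
    run_child(tx, n);
    rx.try_iter().count()
}

/// A drain loop runs alongside the child: returns (progress events drained,
/// whether the terminal event was seen).
fn drain_while_running(n: usize) -> (usize, bool) {
    let (tx, rx) = channel();
    let drainer = thread::spawn(move || {
        let mut progress = 0;
        let mut finished = false;
        // recv() returns Err once the sender is dropped and the queue is empty.
        while let Ok(event) = rx.recv() {
            match event {
                ChildEvent::Progress => progress += 1,
                ChildEvent::Finished => finished = true,
            }
        }
        (progress, finished)
    });
    run_child(tx, n);
    drainer.join().expect("drainer panicked")
}

fn main() {
    println!("queued without drain: {}", queued_without_drain(1000));
    println!("drained while running: {:?}", drain_while_running(1000));
}
```

A drain loop like this discards events, so a real fix would have to decide which child events the parent still needs before dropping the rest.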

Cause 2: completed child sessions are not released

The completion watcher notifies the parent, but the child session remains live. Actual release appears to happen through the explicit close path:

  • agent/control.rs:715: shutdown_live_agent
  • agent/control.rs:737: close_agent
  • agent/control.rs:79: spawn slot reservation
  • agent/control.rs:99: release path

This means terminal children can keep their session loop, event channel, metadata, and associated runtime allocations alive until the user or client explicitly closes them.
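A possible shape for releasing terminal children, sketched under stated assumptions: the registry swaps the heavy live state for a small summary as soon as the child reaches a terminal status, so the parent can still read the result. `Registry`, `LiveSession`, `Summary`, and `complete` are hypothetical names invented for this example and do not mirror the Codex types.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Weak};

/// Hypothetical heavy per-child state (session loop, buffers, transcript).
struct LiveSession {
    transcript: Vec<String>,
}

/// Small record kept for the parent after the child is released.
#[derive(Debug, Clone, PartialEq)]
struct Summary {
    final_message: String,
    events_seen: usize,
}

enum Slot {
    Live(Arc<LiveSession>),
    Completed(Summary),
}

#[derive(Default)]
struct Registry {
    slots: HashMap<u32, Slot>,
}

impl Registry {
    /// Registers a live child and returns a Weak probe for observing its release.
    fn spawn(&mut self, id: u32, transcript: Vec<String>) -> Weak<LiveSession> {
        let session = Arc::new(LiveSession { transcript });
        let probe = Arc::downgrade(&session);
        self.slots.insert(id, Slot::Live(session));
        probe
    }

    /// On terminal status: keep only a summary and drop the live session.
    fn complete(&mut self, id: u32) -> Option<Summary> {
        let summary = match self.slots.get(&id)? {
            Slot::Live(session) => Summary {
                final_message: session.transcript.last().cloned().unwrap_or_default(),
                events_seen: session.transcript.len(),
            },
            Slot::Completed(done) => return Some(done.clone()),
        };
        // Overwriting the slot drops the Arc, freeing the heavy state.
        self.slots.insert(id, Slot::Completed(summary.clone()));
        Some(summary)
    }

    fn live_count(&self) -> usize {
        self.slots.values().filter(|s| matches!(s, Slot::Live(_))).count()
    }
}

fn main() {
    let mut registry = Registry::default();
    let probe = registry.spawn(1, vec!["working".to_string(), "done".to_string()]);
    println!("live before completion: {}", registry.live_count());
    let summary = registry.complete(1);
    println!("summary: {:?}", summary);
    println!("session still allocated: {}", probe.upgrade().is_some());
}
```

The `Weak` probe exists only to make the release observable in the example. The design point is that completion itself triggers the release, instead of waiting for an explicit close from the user or client.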

  • Subagents getting the whole session transcript
