codex: Persistent orphaned subagents, missing lifecycle controls, and eventual session freezes [2 comments, 2 participants]

GitHub stats
openai/codex#19197 (fetched 2026-04-24 05:59:00)
Comments: 2
Participants: 2
Timeline events: 7 (labeled ×3, commented ×2, cross-referenced ×1, unlabeled ×1)
Reactions: 0

Codex leaks subagents and does not manage their lifecycle reliably. After enough subagents accumulate, Codex hits its own "too many subagents launched" limit and attempts cleanup, but usually terminates only 1–3 subagents. The remaining stuck or orphaned subagents are no longer visible to it, yet they continue to count against the limit and block further launches. Asking the main thread to clean up inactive subagents gives the same result: only partial cleanup, with the remaining subagents still stuck and still counted. Over time, the session can hard-freeze mid-task and must be killed manually.

This does not look like simple memory exhaustion: free RAM is still available and swap is large when the freeze happens.
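One way to back up the memory observation with numbers is to snapshot `/proc/meminfo` at the moment of the freeze. The sketch below is a minimal helper for that, assuming a Linux host; the function names are illustrative and not part of Codex.

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: value in kB}."""
    values: dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_headroom(text: str) -> dict[str, float]:
    """Return the fraction of RAM and swap that is still available."""
    info = parse_meminfo(text)
    swap_total = info.get("SwapTotal", 0)
    return {
        "available_ratio": info["MemAvailable"] / info["MemTotal"],
        "swap_free_ratio": info["SwapFree"] / swap_total if swap_total else 0.0,
    }

# Usage on Linux (not executed here):
#   with open("/proc/meminfo") as f:
#       print(memory_headroom(f.read()))
```

Ratios well above zero at freeze time would support the claim that the hang is not plain memory exhaustion.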

This is not an MCP/apps issue. I had already disabled apps explicitly in codex-config.toml because MCP/apps initialization was previously getting stuck indefinitely:

[features]
multi_agent = true
apps = false

What version of Codex CLI is running?

0.123.0

What subscription do you have?

Pro+ ($200)

Which model were you using?

gpt-5.4 high

What platform is your computer?

Ubuntu 24.04.4 LTS (GNU/Linux 6.8.0-90-generic x86_64)

What terminal emulator and version are you using (if applicable)?

tmux

What issue are you seeing?

Actual behavior

  • Subagents can remain stuck after their work is finished.
  • Codex eventually reaches its own subagent limit.
  • When this happens, Codex may try to clean up automatically, but usually terminates only 1–3 subagents.
  • The remaining stuck/orphaned subagents are not properly visible to Codex anymore.
  • Those invisible subagents still count against the launch limit.
  • Asking the main thread to clean up inactive subagents produces the same partial cleanup.
  • Entering a subagent through /agents does not provide a way to terminate it.
  • There are no usable user-facing lifecycle controls for subagents.
  • In long-running degraded sessions, Codex can hard-freeze and must be killed manually.
  • This can happen even with plenty of free RAM and large swap available.

What steps can reproduce the bug?

  1. Use a session heavily with subagents.
  2. Let Codex spawn multiple subagents across tasks.
  3. After some time, some subagents remain alive after their useful work is already finished.
  4. Codex reaches the subagent limit and starts failing with "too many subagents launched".
  5. Codex attempts cleanup on its own, but usually removes only 1–3 subagents.
  6. The remaining stuck/orphaned subagents are not cleaned up and appear to be no longer visible to Codex.
  7. Asking the main thread to clean up inactive subagents leads to the same partial result.
  8. New subagents still cannot be launched reliably.
  9. If the session keeps running in this degraded state, Codex may eventually freeze and require a manual kill.
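
The steps above can be illustrated with a toy model. It is not Codex's implementation: the slot limit, the "every Nth agent leaks" rule, and all names are assumptions chosen only to show how a cleanup pass that can reap only visible agents eventually leaves every slot occupied by orphans.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    agent_id: int
    finished: bool = False
    visible: bool = True  # False models an orphan the parent has lost track of


def rounds_until_blocked(limit: int, leak_every: int, max_rounds: int = 1000) -> int:
    """Return the first round in which a launch fails, or -1 if never blocked."""
    slots: list[Agent] = []
    for round_no in range(1, max_rounds + 1):
        if len(slots) >= limit:
            # Cleanup can only reap finished agents that are still visible.
            slots = [a for a in slots if not (a.finished and a.visible)]
        if len(slots) >= limit:
            return round_no  # every slot is held by an invisible orphan
        agent = Agent(agent_id=round_no)
        slots.append(agent)
        agent.finished = True  # the agent's useful work is done
        if round_no % leak_every == 0:
            agent.visible = False  # hypothetical leak: the slot stays counted
    return -1


print(rounds_until_blocked(limit=4, leak_every=2))
```

With a limit of 4 and every second agent leaking, the model blocks in round 9; with no leaks inside the simulated window it never blocks.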

What is the expected behavior?

  • Codex should track all active subagents accurately.
  • Finished, inactive, or orphaned subagents should remain visible until fully cleaned up.
  • Automatic cleanup should remove all inactive/orphaned subagents, not just 1–3.
  • Manual cleanup from the main thread should also remove all inactive/orphaned subagents reliably.
  • Stuck subagents should not continue counting against the limit after cleanup is requested.
  • /agents should allow explicit termination of the current subagent.
  • Users should have direct lifecycle controls for listing, terminating, force-killing, and resyncing subagents.
  • Leaked subagents should not be able to degrade the session into a blocked or frozen state.
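
As a sketch of what the expected behavior could look like, the hypothetical registry below keeps every subagent visible until it is removed, reaps all inactive entries in one pass, and offers list, terminate, and resync operations. The class, its methods, and the three states are assumptions for illustration, not Codex internals.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    RUNNING = "running"
    FINISHED = "finished"
    ORPHANED = "orphaned"


@dataclass
class SubagentRegistry:
    limit: int
    _agents: dict[str, State] = field(default_factory=dict, init=False)

    def launch(self, agent_id: str) -> None:
        """Register a new agent, reaping inactive ones first if at the limit."""
        if len(self._agents) >= self.limit:
            self.cleanup_inactive()
        if len(self._agents) >= self.limit:
            raise RuntimeError("too many subagents launched")
        self._agents[agent_id] = State.RUNNING

    def mark(self, agent_id: str, state: State) -> None:
        """Record a state change for a known agent."""
        if agent_id not in self._agents:
            raise KeyError(agent_id)
        self._agents[agent_id] = state

    def list_agents(self) -> dict[str, State]:
        """Return a copy of every tracked agent, including inactive ones."""
        return dict(self._agents)

    def terminate(self, agent_id: str) -> bool:
        """Remove one agent and free its slot; False if it was unknown."""
        return self._agents.pop(agent_id, None) is not None

    def cleanup_inactive(self) -> list[str]:
        """Remove all finished or orphaned agents, not just a few."""
        removed = [a for a, s in self._agents.items() if s is not State.RUNNING]
        for agent_id in removed:
            del self._agents[agent_id]
        return removed

    def resync(self, live_ids: set[str]) -> list[str]:
        """Mark running entries that are no longer reported live as orphaned."""
        orphaned = [a for a, s in self._agents.items()
                    if s is State.RUNNING and a not in live_ids]
        for agent_id in orphaned:
            self._agents[agent_id] = State.ORPHANED
        return orphaned
```

The key design choice is that orphaned entries stay in the registry, so they remain listable and are released by the next cleanup instead of silently holding a slot.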

Impact

This is a serious blocker for subagent-heavy workflows. Instead of enabling parallel decomposition, the current behavior causes degraded sessions, blocked launches, repeated manual recovery attempts, and eventual session loss.

Related issue

There is also a separate delegation quality problem: even when the main session runs on GPT-5.4, Codex often delegates unsuitable tasks to Codex 5.2, including research or spec writing, and often adds unnecessary restrictive instructions. This is separate, but it makes the subagent failure mode worse.

extent analysis

TL;DR

The fix most likely lies in Codex's subagent bookkeeping: it needs to track every subagent accurately and clean up all inactive or orphaned ones, not just a few.

Guidance

  • Investigate the Codex configuration and code to identify why subagents are not cleaned up, focusing on how the "too many subagents launched" limit is counted and on the cleanup mechanism.
  • Verify that the issue is unrelated to the disabled MCP/apps feature by retesting with apps re-enabled, bearing in mind that MCP/apps initialization previously got stuck indefinitely.
  • Consider implementing or requesting user-facing lifecycle controls for subagents, such as listing, terminating, force-killing, and resyncing, to better manage subagent resources.
  • Review the /agents functionality to determine why it does not provide a way to terminate subagents and consider enhancements to this feature.

Example

No specific code example can be provided without further details on the Codex CLI implementation, but a potential area of investigation could involve the cleanup logic in the Codex codebase.

Notes

The provided information suggests a complex issue with Codex's subagent management, potentially involving both configuration and code-level problems. Without access to the Codex code or more detailed logs, it's challenging to provide a precise fix.

Recommendation

Until a permanent fix is available, work around the problem by monitoring subagent usage closely and intervening manually as the session approaches the "too many subagents launched" limit, so that it does not degrade or freeze.
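
When manual intervention means stopping a session that has already frozen, a graceful-then-forced shutdown is the usual pattern. The sketch below assumes the frozen session is an ordinary OS process that you launched and hold a `subprocess.Popen` handle for; `stop_process` is a hypothetical helper, not a Codex command.

```python
import subprocess


def stop_process(proc: subprocess.Popen, grace_seconds: float = 5.0) -> str:
    """Ask a process to exit, then force-kill it if it ignores the request."""
    if proc.poll() is not None:
        return "already-exited"
    proc.terminate()  # polite request (SIGTERM on POSIX)
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # forced stop (SIGKILL on POSIX)
        proc.wait()  # reap the process so it does not linger as a zombie
        return "killed"
```

Trying a polite termination first gives the process a chance to flush state; the forced kill is only the fallback for a truly hung process.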
