codex - 💡(How to fix) Fix Persistent orphaned subagents, missing lifecycle controls, and eventual session freezes [2 comments, 2 participants]

Codex leaks subagents and does not manage their lifecycle reliably. Once enough subagents accumulate, Codex hits its own "too many subagents launched" limit and attempts cleanup, but usually terminates only 1–3 of them. The remaining stuck or orphaned subagents are no longer visible to Codex, yet they continue to count against the limit and block further launches. Asking the main thread to clean up inactive subagents gives the same result: partial cleanup, with the remaining subagents still stuck and still counted. Over time, the session can hard-freeze mid-task and must be killed manually.

This does not look like simple memory exhaustion: free RAM is still available and swap is large when the freeze happens.
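
To back the memory claim with numbers, a snapshot of available RAM and free swap could be recorded at the moment of the freeze. The following is a minimal sketch, assuming a Linux host that exposes /proc/meminfo (as the reported Ubuntu system does). The function names parse_meminfo and memory_snapshot are illustrative and are not part of Codex.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field: value}.

    Most fields are reported in kB; a few (for example HugePages_Total)
    are plain counts with no unit.
    """
    values: dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_snapshot(path: str = "/proc/meminfo") -> dict[str, int]:
    """Return the two fields most relevant to ruling out memory exhaustion."""
    info = parse_meminfo(Path(path).read_text())
    return {key: info[key] for key in ("MemAvailable", "SwapFree") if key in info}
```

Calling memory_snapshot() from a second terminal while the session is frozen gives values that can be attached to the report.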

This is not an MCP/apps issue. I had already disabled apps explicitly in codex-config.toml because MCP/apps initialization was previously getting stuck indefinitely:

[features]
multi_agent = true
apps = false

What version of Codex CLI is running?

0.123.0

What subscription do you have?

Pro+ ($200)

Which model were you using?

gpt-5.4 high

What platform is your computer?

Ubuntu 24.04.4 LTS (GNU/Linux 6.8.0-90-generic x86_64)

What terminal emulator and version are you using (if applicable)?

tmux

What issue are you seeing?

Actual behavior

Subagents can remain stuck after their work is finished.
Codex eventually reaches its own subagent limit.
When this happens, Codex may try to clean up automatically, but usually terminates only 1–3 subagents.
The remaining stuck/orphaned subagents are not properly visible to Codex anymore.
Those invisible subagents still count against the launch limit.
Asking the main thread to clean up inactive subagents produces the same partial cleanup.
Entering a subagent through /agents does not provide a way to terminate it.
There are no usable user-facing lifecycle controls for subagents.
In long-running degraded sessions, Codex can hard-freeze and must be killed manually.
This can happen even with plenty of free RAM and large swap available.

What steps can reproduce the bug?

Use a session heavily with subagents.
Let Codex spawn multiple subagents across tasks.
After some time, some subagents remain alive after their useful work is already finished.
Codex reaches the subagent limit and starts failing with "too many subagents launched".
Codex attempts cleanup on its own, but usually removes only 1–3 subagents.
The remaining stuck/orphaned subagents are not cleaned up and appear to be no longer visible to Codex.
Asking the main thread to clean up inactive subagents leads to the same partial result.
New subagents still cannot be launched reliably.
If the session keeps running in this degraded state, Codex may eventually freeze and require a manual kill.
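
If the freeze in the last step occurs, capturing the process state before killing it may help distinguish a deadlock from a process blocked on I/O. The sketch below assumes Linux and reads /proc/<pid>/status. The pid has to be found by the user (for example with pgrep), and the names parse_status and freeze_snapshot are illustrative only.

```python
from pathlib import Path


def parse_status(text: str) -> dict[str, str]:
    """Parse /proc/<pid>/status-style text into {field: raw string value}."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


def freeze_snapshot(pid: int) -> dict[str, str]:
    """Collect the fields that describe what a seemingly frozen process is doing."""
    fields = parse_status(Path(f"/proc/{pid}/status").read_text())
    wanted = ("Name", "State", "Threads", "VmRSS")
    return {key: fields[key] for key in wanted if key in fields}
```

A State of "S (sleeping)" with a modest VmRSS would be consistent with the report that the freeze is not memory exhaustion, while "D (disk sleep)" would point toward blocked I/O instead.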

What is the expected behavior?

Codex should track all active subagents accurately.
Finished, inactive, or orphaned subagents should remain visible until fully cleaned up.
Automatic cleanup should remove all inactive/orphaned subagents, not just 1–3.
Manual cleanup from the main thread should also remove all inactive/orphaned subagents reliably.
Stuck subagents should not continue counting against the limit after cleanup is requested.
/agents should allow explicit termination of the current subagent.
Users should have direct lifecycle controls for listing, terminating, force-killing, and resyncing subagents.
Leaked subagents should not be able to degrade the session into a blocked or frozen state.
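
The expectations above can be illustrated with a small model. This is a hypothetical sketch, not the Codex implementation: the SubagentRegistry class, its State values, and the limit are all assumptions made for illustration. The point is that finished and orphaned subagents stay listed until cleanup, and that one cleanup pass frees every inactive slot.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    RUNNING = "running"
    FINISHED = "finished"
    ORPHANED = "orphaned"  # parent lost track of it, but its slot is still held


@dataclass
class SubagentRegistry:
    """Hypothetical registry used only to illustrate the expected behavior."""

    limit: int
    agents: dict[str, State] = field(default_factory=dict)

    def launch(self, agent_id: str) -> None:
        # Every tracked agent counts against the limit, whatever its state.
        if len(self.agents) >= self.limit:
            raise RuntimeError("too many subagents launched")
        self.agents[agent_id] = State.RUNNING

    def mark(self, agent_id: str, state: State) -> None:
        # State changes never remove the entry, so the agent stays visible.
        self.agents[agent_id] = state

    def list_agents(self) -> dict[str, State]:
        return dict(self.agents)

    def cleanup_inactive(self) -> list[str]:
        # Remove every non-running agent in one pass, not just the first few.
        removed = [a for a, s in self.agents.items() if s is not State.RUNNING]
        for agent_id in removed:
            del self.agents[agent_id]
        return removed
```

In this model, a registry at its limit with two inactive entries accepts a new launch immediately after cleanup_inactive(), which is the behavior the report asks for.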

Impact

This is a serious blocker for subagent-heavy workflows. Instead of enabling parallel decomposition, the current behavior causes degraded sessions, blocked launches, repeated manual recovery attempts, and eventual session loss.

Related issue

There is also a separate delegation quality problem: even when the main session runs on GPT-5.4, Codex often delegates unsuitable tasks to Codex 5.2, including research or spec writing, and often adds unnecessary restrictive instructions. This is separate, but it makes the subagent failure mode worse.

Extent analysis

TL;DR

The most likely fix involves improving Codex's subagent management to accurately track and clean up inactive or orphaned subagents.

Guidance

Investigate the Codex configuration and code to identify why subagents are not being properly cleaned up, focusing on the "too many subagents launched" limit and the cleanup mechanism.
Verify that the issue is not related to the disabled MCP/apps feature by testing with it re-enabled, despite the previous issues with MCP/apps initialization getting stuck.
Consider implementing or requesting user-facing lifecycle controls for subagents, such as listing, terminating, force-killing, and resyncing, to better manage subagent resources.
Review the /agents functionality to determine why it does not provide a way to terminate subagents and consider enhancements to this feature.
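
The difference between terminating and force-killing mentioned above can be sketched as a two-stage stop. This example assumes each subagent is a separate OS process, which the report does not confirm; if Codex runs subagents inside one process, the same escalation idea would apply to tasks rather than processes. The helper name stop_process and the grace period are illustrative.

```python
import subprocess


def stop_process(proc: subprocess.Popen, grace_seconds: float = 5.0) -> str:
    """Ask a process to exit, then force-kill it if it ignores the request."""
    if proc.poll() is not None:
        return "already-exited"
    proc.terminate()  # polite request (SIGTERM on POSIX)
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # forced stop (SIGKILL on POSIX)
        proc.wait()  # reap the process so it does not linger as a zombie
        return "killed"
```

Returning the outcome lets a caller report which subagents needed a forced stop, which is useful evidence when cleanup only partly succeeds.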

Example

No code example tied to the actual Codex CLI implementation can be given without further details on it. The cleanup logic in the Codex codebase is the most likely area to investigate.

Notes

The provided information suggests a complex issue with Codex's subagent management, potentially involving both configuration and code-level problems. Without access to the Codex code or more detailed logs, it's challenging to provide a precise fix.

Recommendation

Until a permanent fix is available, work around the problem by monitoring subagent usage closely and intervening manually before the "too many subagents launched" limit is reached, to prevent session degradation and freezes.
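
One way to make that monitoring concrete is a simple tally kept alongside the session. This is only a sketch: the report does not state the real limit, so the limit value is an assumption, and the LaunchBudget class and its method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LaunchBudget:
    """Hypothetical manual tally of open subagents against an assumed limit."""

    limit: int                  # assumed value; the real limit is not documented here
    warn_fraction: float = 0.75
    open_count: int = 0

    def launched(self) -> None:
        self.open_count += 1

    def closed(self) -> None:
        self.open_count = max(0, self.open_count - 1)

    def should_intervene(self) -> bool:
        # True once the tally reaches the warning share of the assumed limit.
        return self.open_count >= self.limit * self.warn_fraction
```

Because leaked subagents may not be visible to Codex, the tally should only be decremented when a subagent is confirmed closed, not when Codex merely reports that cleanup ran.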
