codex: Forked subagents lose prompt-cache lineage for inherited parent context

What version of Codex CLI is running?

codex-cli 0.133.0

Which model were you using?

gpt-5.5

What platform is your computer?

macOS

What issue are you seeing?

Forked subagents appear to lose the parent thread's prompt-cache lineage even though they inherit a large prefix of the parent conversation.

When a subagent is spawned with inherited context, the child thread usually starts from a long parent prefix that should be highly cacheable. However, the child currently gets its own thread-derived prompt_cache_key, so the backend appears to treat the fork as an unrelated request instead of a continuation/fork of the parent cache lineage.

The practical effect is that fork_context=true subagents can pay mostly uncached input-token cost for a prefix that was already present and cached in the parent thread.

This is especially expensive for workflows that intentionally load substantial project context in the main thread and then delegate small follow-up tasks to forked subagents.

What steps can reproduce the bug?

1. Start a normal Codex session using gpt-5.5.
2. Add enough stable context to the parent thread, for example by discussing or loading a large repo/file/document so the parent has a substantial cached prefix.
3. Spawn a subagent with inherited/forked context for a small task.
4. Inspect the usage events in the rollout/session JSONL, especially input_tokens and cached_input_tokens, for the forked child turn.
5. Compare that with either:
   - a normal continuation in the parent thread, or
   - a patched build where the forked child preserves the parent prompt-cache lineage.
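
For the inspection step, a small script can help compare cached versus total input tokens per turn. This is a minimal sketch, not part of Codex: the field names input_tokens and cached_input_tokens come from this report, but the exact nesting of usage data inside each JSONL event is an assumption, so the helper searches every level of each event for a dict that holds both counters. The function names are hypothetical.

```python
import json
from typing import Any, Iterator


def find_usage(node: Any) -> Iterator[dict]:
    """Yield every dict that carries both token counters, at any nesting depth."""
    if isinstance(node, dict):
        if "input_tokens" in node and "cached_input_tokens" in node:
            yield node
        for value in node.values():
            yield from find_usage(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_usage(item)


def cache_hit_ratios(jsonl_text: str) -> list[float]:
    """Return cached_input_tokens / input_tokens for each usage record found."""
    ratios = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or non-JSON lines
        for usage in find_usage(event):
            total = usage["input_tokens"]
            cached = usage["cached_input_tokens"]
            # Ignore records with missing, zero, or non-numeric counters.
            if isinstance(total, int) and isinstance(cached, int) and total > 0:
                ratios.append(cached / total)
    return ratios
```

Running cache_hit_ratios over the text of the parent rollout file and then over the forked child's file gives two lists to compare. A child whose first turn shows a ratio near zero while the parent's later turns sit much higher would be consistent with the behavior described here.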

Expected behavior

A forked subagent that inherits parent context should preserve the relevant parent prompt-cache lineage for the inherited prefix.

The child should still have its own thread/session identity, but the prompt-cache key used for the reusable inherited prefix should not be replaced solely because the fork has a new child thread id.

In other words, forked subagents should be isolated as conversation threads, but not forced into unrelated prompt-cache namespaces for the parent prefix they inherit.

Actual behavior

The forked child receives a new thread-derived prompt_cache_key.

Because the cache key changes at fork startup, the inherited parent prefix is not reused as effectively as expected. The observed cached_input_tokens for forked subagents is much lower than it should be for a mostly identical inherited prefix.

Why this matters

fork_context=true is most useful when the parent has already accumulated important project/task context and a subagent needs to perform a bounded task using that context.

Without cache-lineage preservation, that pattern becomes unexpectedly expensive:

- each fork can rebill a large inherited prefix as uncached input;
- multi-subagent workflows multiply the cost;
- users may see rapid quota/cost drain even when the actual per-subagent task prompt is small.
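
The multiplication can be made concrete with a rough model. This is an illustrative sketch only: the function name and the token counts in the usage note are hypothetical, not measurements from this report, and it counts only the first turn of each fork.

```python
def uncached_input_tokens(
    prefix_tokens: int,
    cached_prefix_tokens: int,
    task_tokens: int,
    forks: int,
) -> int:
    """Uncached input tokens across all forks, counting one turn per fork.

    prefix_tokens: size of the inherited parent prefix.
    cached_prefix_tokens: how much of that prefix is served from cache.
    task_tokens: the small per-subagent task prompt (always uncached).
    """
    uncached_prefix = prefix_tokens - cached_prefix_tokens
    return forks * (uncached_prefix + task_tokens)
```

With a hypothetical 100,000-token prefix, a 500-token task, and 8 forks, uncached_input_tokens(100_000, 0, 500, 8) gives 804,000 uncached tokens when nothing is reused, versus 4,000 when the whole prefix is served from cache.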

Related issues / PRs

This is related to, but more specific than:

- #21796: stable prompt-cache key support for identical startup context across app-server sessions.
- #20301: broader low cache hit rate reports with GPT-5.5.
- #18683: feature request for batch-spawning forked subagents from shared preloaded context.
- #20909: a closed PR that attempted to preserve fork prompt-cache state.

Unlike #20301, this report is not about a general model-level cache anomaly. It is specifically about forked subagents losing cache lineage because the fork gets a new thread-derived prompt-cache key.

Suggested fix

When starting a forked subagent, preserve the parent's prompt-cache lineage for the inherited context instead of deriving a fresh key only from the child thread id.

A narrow fix is to carry an inherited prompt-cache key through fork startup and use it when constructing the child session/client, while keeping the child thread id and conversation state separate.
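
The shape of that narrow fix can be sketched as follows. This is a simplified illustration, not the Codex codebase or the referenced implementation: the Session type, new_session, fork_session, and the inherited_prompt_cache_key parameter are hypothetical names, and using the thread id itself as the thread-derived key is an assumption made for brevity.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Session:
    thread_id: str          # identity of the conversation thread
    prompt_cache_key: str   # key sent to the backend for prompt caching


def new_session(inherited_prompt_cache_key: Optional[str] = None) -> Session:
    """Create a session with a fresh thread id.

    Assumption: without an inherited key, the cache key is derived from
    the thread id (here, it simply equals it).
    """
    thread_id = str(uuid.uuid4())
    key = inherited_prompt_cache_key or thread_id
    return Session(thread_id=thread_id, prompt_cache_key=key)


def fork_session(parent: Session, preserve_cache_lineage: bool) -> Session:
    """Fork a child session from the parent.

    With preserve_cache_lineage=False the child gets a new thread-derived
    key (the behavior reported here). With True, the parent's key is
    carried through fork startup while the thread id stays separate.
    """
    inherited = parent.prompt_cache_key if preserve_cache_lineage else None
    return new_session(inherited_prompt_cache_key=inherited)
```

In this sketch, fork_session(parent, True) returns a child whose thread_id differs from the parent's but whose prompt_cache_key matches it, which is the separation between thread identity and cache namespace that the expected behavior describes.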

Reference implementation: john-lu-ptc/codex#1
