codex - 💡(How to fix) Fix Inefficient context compression [2 comments, 2 participants]

PMCSummer · 2026-04-17T09:30:43Z



GitHub stats

openai/codex#18318 • Fetched 2026-04-18 05:56:06

Author: PMCSummer
Participants: github-actions[bot], PMCSummer
Timeline (top): labeled ×3, commented ×2, closed ×1, cross-referenced ×1

Root Cause

For coding tasks, the most valuable context is often not the broad summary of the task, but the accumulated local engineering knowledge, for example:

“this is the exact file to patch”
“this existing module is the template to mirror”
“do not widen scope beyond this boundary”
“these tests are sufficient; the others are unnecessary”
“this path was already checked and rejected”

Losing that information is costly because the model must pay for it again.

So the current behavior can make compression feel less like optimization and more like a tax on amnesia: the agent remembers what it wanted to do, but forgets what it had already learned.

Fix Action

Fix / Workaround

Frontier state instead of only narrative state

Compression should preserve something like:

confirmed facts
touched files
current patch point
blockers
next atomic action

Bias compression toward implementation continuity

Once the agent has entered implementation mode, compression should favor:

patch continuity
exact symbol/path continuity
test continuity


What variant of Codex are you using?

App

What feature would you like to see?

Context compression often preserves intent, but drops execution-critical state

First of all, context compression is genuinely valuable. In long-running tasks it can help the agent stay on track and continue working without immediately hitting context limits.

However, in practice I keep running into one recurring failure mode:

after compression, the agent often re-collects the same context it had already gathered before compression, which consumes most of the tokens that compression was supposed to save.

This is especially noticeable in large repositories with complex local dependencies, where implementation depends not just on task intent, but on many small, fragile, already-verified details.

The core issue

From my experience, the problem is not that compression exists, but that it often compresses the wrong layer of context.

It tends to preserve:

the general task intent
the high-level plan
the narrative of what the agent was trying to do

But it often drops what is actually needed to finish implementation:

exact integration points
already-checked files and why they were checked
rejected hypotheses / dead ends
narrow scope boundaries
specific symbols, paths, checkpoints, test targets
“what is already known to be sufficient”

As a result, the agent still “remembers” the goal, but loses the working set required to execute it efficiently.

What this looks like in practice

A typical pattern looks like this:

1. The agent starts narrow and gathers the relevant implementation context.
2. It identifies the likely integration surface and neighboring owner-surfaces.
3. It begins implementation.
4. Context compression triggers.
5. After compression, the agent still remembers the intent, but loses the execution frontier.
6. It starts re-reading the same seams, maps, ADRs, tests, topology files, telemetry files, etc.
7. Token usage spikes again, and the effective gain from compression becomes very small.

So the result is something like:

compression saves some space temporarily
but the lost execution context has to be rebuilt
which burns a large part of the saved budget

In effect, compression sometimes behaves like:

~20% useful retention for implementation
while forcing the agent to spend ~80% of the saved budget rebuilding context

Those numbers are approximate, not benchmarked, but they match the practical experience very closely.
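To make the rough arithmetic concrete, here is a minimal sketch of the net gain once re-discovery is paid for. The token counts are hypothetical and chosen only for illustration; the ~80% rebuild share is the approximate figure from above, not a measurement.

```python
def net_compression_gain(tokens_before: int, tokens_after: int, rebuild_tokens: int) -> int:
    """Tokens actually gained after paying to rebuild the lost context."""
    saved = tokens_before - tokens_after
    return saved - rebuild_tokens


# Hypothetical figures, for illustration only.
tokens_before = 100_000              # context size when compression triggers
tokens_after = 20_000                # context size right after compression
saved = tokens_before - tokens_after # tokens freed by compression
rebuild_tokens = saved * 80 // 100   # assume ~80% is spent re-reading files

gain = net_compression_gain(tokens_before, tokens_after, rebuild_tokens)
print(gain)  # 16000
```

Under these assumptions, compression frees 80,000 tokens but only 16,000 of them remain available for new work.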

Concrete example from a real run

In one repo task, the agent explicitly said it would proceed narrowly:

read only the seam docs
inspect the existing neighboring pattern
add a minimal package
wire a narrow integration path
run only targeted tests

But after compression, instead of continuing from the implementation frontier, it started re-reading a large portion of the same context again:

seam docs
contour maps
ADRs
neighboring module files
topology policy
runtime trace
subject tick models / policy / telemetry
multiple owner tests and testkits

So although the agent had already done the expensive discovery work, compression removed enough of the execution-relevant memory that it had to repeat discovery almost from scratch.

That is exactly the opposite of what compression should optimize for.
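One way to quantify this pattern is to compare the files read before and after a compression event. The sketch below assumes the read paths are available as plain lists; the paths shown are hypothetical placeholders, not taken from the actual run.

```python
def reread_ratio(reads_before: list[str], reads_after: list[str]) -> float:
    """Fraction of post-compression reads that repeat pre-compression reads."""
    if not reads_after:
        return 0.0
    seen = set(reads_before)
    repeated = sum(1 for path in reads_after if path in seen)
    return repeated / len(reads_after)


# Hypothetical read logs around one compression event.
before = ["docs/seam.md", "docs/adr-001.md", "pkg/neighbor/module.py"]
after = ["docs/seam.md", "docs/adr-001.md", "pkg/neighbor/module.py", "pkg/new/module.py"]

print(reread_ratio(before, after))  # 0.75
```

A ratio close to 1.0 would suggest that discovery is being repeated rather than continued.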

Why this matters

For coding tasks, the most valuable context is often not the broad summary of the task, but the accumulated local engineering knowledge, for example:

“this is the exact file to patch”
“this existing module is the template to mirror”
“do not widen scope beyond this boundary”
“these tests are sufficient; the others are unnecessary”
“this path was already checked and rejected”

Losing that information is costly because the model must pay for it again.

So the current behavior can make compression feel less like optimization and more like a tax on amnesia: the agent remembers what it wanted to do, but forgets what it had already learned.

Expected behavior

Ideally, context compression should prioritize preserving:

1. Execution-critical state

exact files already read
exact integration points
concrete next step
minimal required test set
rejected alternatives
scope constraints
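A simple guard for this category is to check that a compressed summary still mentions every execution-critical item verbatim. This is a sketch under the assumption that the summary is available as plain text; the function name and the example paths are hypothetical.

```python
def missing_critical_items(summary: str, critical_items: list[str]) -> list[str]:
    """Return the critical items that no longer appear verbatim in the summary."""
    return [item for item in critical_items if item not in summary]


# Hypothetical summary and critical items.
summary = "Goal: add a minimal package. Patch point: pkg/new/wiring.py."
critical = ["pkg/new/wiring.py", "tests/test_wiring.py"]

print(missing_critical_items(summary, critical))  # ['tests/test_wiring.py']
```

Any item reported as missing could be appended to the summary before the original context is discarded.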

2. Negative knowledge

A lot of token savings come from remembering what does not need to be revisited:

files already ruled out
hypotheses already disproven
docs already confirmed as unnecessary to reread
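Negative knowledge could be kept as a small ledger that maps each ruled-out item to the reason it was ruled out, and then used to filter planned reads. This is a minimal sketch; the class, its methods, and the paths are hypothetical and not part of Codex.

```python
from dataclasses import dataclass, field


@dataclass
class NegativeKnowledge:
    # Maps a ruled-out path to the reason it does not need revisiting.
    ruled_out: dict[str, str] = field(default_factory=dict)

    def rule_out(self, path: str, reason: str) -> None:
        self.ruled_out[path] = reason

    def filter_reads(self, candidates: list[str]) -> list[str]:
        """Drop candidate reads that were already ruled out."""
        return [path for path in candidates if path not in self.ruled_out]


ledger = NegativeKnowledge()
ledger.rule_out("docs/topology.md", "confirmed unrelated to the patch point")

planned = ["docs/topology.md", "pkg/new/wiring.py"]
print(ledger.filter_reads(planned))  # ['pkg/new/wiring.py']
```

Storing the reason alongside the path lets the agent justify skipping a file without re-opening it.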

3. Frontier state instead of only narrative state

Compression should preserve something like:

confirmed facts
touched files
current patch point
blockers
next atomic action

rather than mostly preserving a high-level “story” of the task.

Suggested improvements

A few possible directions that might help:

Preserve a structured “working frontier”

Instead of only summarizing intent, preserve a compact execution state such as:

confirmed facts
rejected hypotheses
touched files
next atomic step
remaining minimal tests
forbidden scope expansions
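The fields listed above map naturally onto a small record that can be serialized into the compressed summary and restored afterwards. The sketch below assumes JSON as the carrier format, which is an illustrative choice; the class and method names are hypothetical.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class WorkingFrontier:
    confirmed_facts: list[str] = field(default_factory=list)
    rejected_hypotheses: list[str] = field(default_factory=list)
    touched_files: list[str] = field(default_factory=list)
    next_atomic_step: str = ""
    remaining_minimal_tests: list[str] = field(default_factory=list)
    forbidden_scope_expansions: list[str] = field(default_factory=list)

    def to_summary_block(self) -> str:
        """Serialize the frontier so it can be embedded in a compressed summary."""
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def from_summary_block(cls, block: str) -> "WorkingFrontier":
        """Restore the frontier from a previously serialized block."""
        return cls(**json.loads(block))


# Hypothetical values for illustration.
frontier = WorkingFrontier(
    confirmed_facts=["neighboring module is the template to mirror"],
    touched_files=["pkg/new/wiring.py"],
    next_atomic_step="wire the narrow integration path",
    remaining_minimal_tests=["tests/test_wiring.py"],
)
restored = WorkingFrontier.from_summary_block(frontier.to_summary_block())
print(restored == frontier)  # True
```

Keeping the frontier as structured data rather than prose makes it harder for a summarizer to paraphrase exact paths and symbols away.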

Prefer trimming over flattening

Rather than compressing the entire context aggressively, it may be better to:

trim repetitive narrative
trim redundant tool chatter
preserve implementation-relevant local facts
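Trimming can be sketched as a filter over tagged messages: keep every local fact, drop duplicate tool output, and keep only the most recent narrative. The message schema here (`kind` and `text` keys with the values `fact`, `tool`, and `narrative`) is an assumption made for illustration, not a real Codex format.

```python
def trim_context(messages: list[dict], keep_last_narrative: int = 1) -> list[dict]:
    """Keep all facts, de-duplicate tool output, keep only recent narrative."""
    narrative_idx = [i for i, m in enumerate(messages) if m["kind"] == "narrative"]
    # Guard against [-0:], which would select the whole list.
    keep_narrative = set(narrative_idx[-keep_last_narrative:]) if keep_last_narrative > 0 else set()

    seen_tool_output: set[str] = set()
    kept = []
    for i, message in enumerate(messages):
        if message["kind"] == "fact":
            kept.append(message)
        elif message["kind"] == "tool":
            if message["text"] not in seen_tool_output:
                seen_tool_output.add(message["text"])
                kept.append(message)
        elif i in keep_narrative:
            kept.append(message)
    return kept


# Hypothetical conversation log.
log = [
    {"kind": "narrative", "text": "I will start by reading the seam docs."},
    {"kind": "tool", "text": "read docs/seam.md"},
    {"kind": "fact", "text": "patch point is pkg/new/wiring.py"},
    {"kind": "tool", "text": "read docs/seam.md"},
    {"kind": "narrative", "text": "Now wiring the integration path."},
]
trimmed = trim_context(log)
print(len(trimmed))  # 3
```

The facts survive untouched, while the repeated tool call and the older narrative are dropped.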

Preserve negative knowledge explicitly

Remembering what the agent already ruled out can save as many tokens as remembering what it found.

Bias compression toward implementation continuity

Once the agent has entered implementation mode, compression should favor:

patch continuity
exact symbol/path continuity
test continuity

over broad semantic summaries.
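One possible way to express this bias is a retention score that rewards exact paths and symbols once implementation has started. The regular expression below is a rough heuristic and the scoring scheme is hypothetical; it is meant only to illustrate the idea.

```python
import re

# Rough heuristic: file-like tokens ("a/b.py") or call-like symbols ("name()").
PATH_OR_SYMBOL = re.compile(r"[\w./-]+\.\w+|\b\w+\(\)")


def retention_score(text: str, implementation_mode: bool) -> float:
    """Score a context item; exact paths and symbols count only in implementation mode."""
    score = 1.0
    if implementation_mode:
        score += len(PATH_OR_SYMBOL.findall(text))
    return score


# Hypothetical context items.
items = [
    "The agent wants to add a feature",
    "patch src/app/main.py in parse_config()",
]
ranked = sorted(items, key=lambda t: retention_score(t, True), reverse=True)
print(ranked[0])  # patch src/app/main.py in parse_config()
```

Before implementation starts, both items score the same, so the bias only takes effect once exact locations matter.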

In short

The current compression is very good at preserving intent, but coding tasks often require preserving execution memory.

When compression removes execution-critical state, the agent often re-generates the context it had already paid for, which greatly reduces the practical benefit of compression in large, interconnected codebases.

So the issue is not:

“compression exists”

The issue is:

compression frequently preserves the wrong information for implementation-heavy tasks

If this part improves, context compression could become much more effective in real repository work.

Additional information

No response

extent analysis

TL;DR

Preserve execution-critical state and negative knowledge during context compression to improve its effectiveness in implementation-heavy tasks.

Guidance

Identify and prioritize the preservation of execution-critical state, such as exact files already read, integration points, and concrete next steps, during context compression.
Consider preserving negative knowledge, like files already ruled out and hypotheses already disproven, to reduce token usage.
Implement a structured "working frontier" to preserve a compact execution state, including confirmed facts, touched files, and next atomic steps.
Trim repetitive narrative and redundant tool chatter instead of aggressively compressing the entire context.
Bias compression toward implementation continuity, favoring patch continuity, exact symbol/path continuity, and test continuity.

Example

# Example of preserving execution-critical state.
# All values are illustrative placeholders, not taken from a real run.
preserved_context = {
    "confirmedFacts": ["<fact already verified>", "<another verified fact>"],
    "touchedFiles": ["file3", "file4"],
    "nextAtomicStep": "implementFeatureX",
    "remainingMinimalTests": ["test1", "test2"],
}

Notes

The provided solution focuses on preserving execution-critical state and negative knowledge during context compression. However, the actual implementation may vary depending on the specific requirements and constraints of the system.

Recommendation

Apply a workaround by preserving execution-critical state and negative knowledge during context compression, as this approach is likely to improve the effectiveness of compression in implementation-heavy tasks.
