openclaw: [Bug]: Context-engine overflow retry can bind a fresh Codex thread without projected context




Bug type

Behavior bug / lossless context-engine continuity risk.

Summary

During the PR #88262 audit, I found a separate active-context-engine risk: when a resumed Codex thread with a thread_bootstrap context-engine binding overflows during turn/start, OpenClaw retries on a fresh Codex thread but sends only the bare current prompt. The old context-engine projection binding is then preserved on the fresh thread even though the fresh thread never received the projected context in its turn/start input.

This issue is not primarily caused by #88262. It is a pre-existing or adjacent behavior that matters for lossless-claw-style integrations because it can silently turn a context-engine-owned run into a contextless fresh Codex thread.

Why this is high confidence

  • The current test (relevant test file: extensions/codex/src/app-server/run-attempt.context-engine.test.ts) explicitly asserts the behavior.
  • In that test:
    • the session contains "pre-compaction context",
    • the binding says engineId: "lossless-claw" and projection.mode: "thread_bootstrap",
    • thread/resume succeeds for thread-old,
    • turn/start on thread-old throws "Codex ran out of room in the model's context window",
    • OpenClaw starts thread-fresh,
    • turn/start on thread-fresh succeeds,
    • compact is not called,
    • assemble is called only once,
    • the retry input text is exactly "hello",
    • the saved binding points at thread-fresh while retaining the old epoch-before projection.
  • The retry path logs that it is starting a fresh thread after context-engine overflow.
  • The retry then calls restartContextEngineCodexThread() and starts the turn again.
  • restartContextEngineCodexThread is the same lifecycle start helper captured at startup.

Diagram

sequenceDiagram
    participant Engine as Context engine / lossless-claw
    participant OC as OpenClaw Codex bridge
    participant Old as Codex thread-old
    participant Fresh as Codex thread-fresh

    Engine->>OC: assemble prior context, projection epoch-before
    OC->>Old: thread/resume existing bootstrapped thread
    OC->>Old: turn/start current prompt
    Old-->>OC: context window overflow
    OC->>OC: clear old binding and restart thread
    OC->>Fresh: thread/start with context-engine binding metadata
    OC->>Fresh: turn/start "hello"
    Note over Fresh: No "OpenClaw assembled context" in retry input
    Fresh-->>OC: answer
    OC->>OC: save binding for thread-fresh with projection epoch-before

Expected behavior

If the active context engine owns the session continuity and OpenClaw has to abandon a previously bootstrapped Codex thread, the fresh thread must not be marked as having the old projection unless it actually receives equivalent context.

The retry should do one of these:

  • reproject a smaller/bounded context-engine assembly into the fresh thread,
  • ask the context engine for an overflow-safe retry projection,
  • compact, or request Codex-native compaction, only in a way that preserves the engine's continuity contract,
  • clear the projection binding on the fresh thread and fail loudly if context cannot be carried,
  • or store metadata that says the fresh thread started without the old context so the next run reprojects.
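
The options above can be read as a single decision made at retry time. The following is a minimal sketch of that decision, not OpenClaw's implementation: planOverflowRetry, RetryProjection, RetryPlan, and forceReprojectNextRun are hypothetical names introduced only for illustration, and the bounded context is assumed to arrive as a plain string from the context engine.

```typescript
// Hypothetical types for illustration; none of these names come from OpenClaw.
type RetryProjection = { mode: "thread_bootstrap"; epoch: string };

type RetryPlan =
  // The fresh thread receives equivalent context, so the projection may be saved.
  | { kind: "reproject"; input: string; savedProjection: RetryProjection }
  // The fresh thread starts bare, so nothing may claim it was bootstrapped.
  | { kind: "fresh_unprojected"; input: string; forceReprojectNextRun: true };

function planOverflowRetry(
  prompt: string,
  projection: RetryProjection,
  boundedContext: string | undefined,
): RetryPlan {
  if (boundedContext !== undefined && boundedContext.length > 0) {
    // Carry the overflow-safe context in the retry input itself.
    return {
      kind: "reproject",
      input: `${boundedContext}\n\n${prompt}`,
      savedProjection: projection,
    };
  }
  // No context could be carried: send the bare prompt and record that fact.
  return { kind: "fresh_unprojected", input: prompt, forceReprojectNextRun: true };
}
```

The discriminated union makes it impossible to save a projection without also having built an input that contains the context, which is the property the expected behavior asks for.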

Actual behavior

The retry turn/start receives exactly the current prompt ("hello" in the test), while the resulting binding still records the old thread_bootstrap projection epoch.

That creates a false binding invariant: future runs can treat thread-fresh as if it already contains the context-engine bootstrap, but the retry turn that created it did not include the context projection in the user input.

Minimal reproduction shape

The current test is already a minimal reproduction:

  1. Use an active context engine with engineId: "lossless-claw" and contextProjection: { mode: "thread_bootstrap", epoch: "epoch-before" }.
  2. Start from an existing binding for thread-old with the same projection epoch.
  3. Make thread/resume succeed.
  4. Make turn/start on thread-old throw a context-window overflow.
  5. Make thread/start produce thread-fresh.
  6. Make turn/start on thread-fresh succeed.
  7. Observe that the retry input is the bare current prompt and that the saved binding still has projection.epoch === "epoch-before".
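
The steps above can be modeled without the real bridge. This is a simplified sketch under stated assumptions: FakeCodex, ThreadBinding, and runWithOverflowRetry are hypothetical stand-ins written for this illustration, and they only imitate the behavior described in this issue rather than reproduce OpenClaw's code or the actual test.

```typescript
// Hypothetical model of the reported behavior; not OpenClaw code.
type ThreadBinding = {
  threadId: string;
  engineId: string;
  projection?: { mode: "thread_bootstrap"; epoch: string };
};

class FakeCodex {
  readonly turnInputs: Array<{ threadId: string; input: string }> = [];

  resume(threadId: string): string {
    return threadId; // step 3: thread/resume succeeds
  }

  startThread(): string {
    return "thread-fresh"; // step 5: thread/start produces thread-fresh
  }

  startTurn(threadId: string, input: string): string {
    this.turnInputs.push({ threadId, input });
    if (threadId === "thread-old") {
      // step 4: turn/start on thread-old overflows
      throw new Error("Codex ran out of room in the model's context window");
    }
    return "answer"; // step 6: turn/start on thread-fresh succeeds
  }
}

function runWithOverflowRetry(codex: FakeCodex, binding: ThreadBinding, prompt: string): ThreadBinding {
  const threadId = codex.resume(binding.threadId);
  try {
    codex.startTurn(threadId, prompt);
    return binding;
  } catch (err) {
    if (!(err instanceof Error) || err.message.indexOf("context window") === -1) throw err;
    const fresh = codex.startThread();
    codex.startTurn(fresh, prompt); // bare prompt only, no projected context
    return { ...binding, threadId: fresh }; // old projection carried over unchanged
  }
}
```

Running this model with a binding for thread-old and the prompt "hello" yields a saved binding that points at thread-fresh and still carries epoch-before, while the recorded retry input is the bare prompt.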

Impact

This is a direct lossless-claw / active-context-engine continuity risk.

The failure mode is particularly subtle:

  • the user gets a successful answer,
  • the binding is updated to the fresh thread,
  • the binding still says the context-engine projection exists,
  • but the fresh thread never saw the old projected context.

Future runs may then skip reprojecting because the binding appears to match the context-engine epoch. That can make the loss of context persistent across turns.
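
To make the persistence concrete, here is a small sketch that assumes a later run decides whether to reproject purely by comparing the saved epoch with the engine's current epoch. That decision rule is an assumption inferred from the paragraph above, and needsReprojection and SavedProjection are hypothetical names, not OpenClaw identifiers.

```typescript
// Hypothetical epoch check; assumed for illustration, not taken from OpenClaw.
type SavedProjection = { mode: "thread_bootstrap"; epoch: string };

function needsReprojection(saved: SavedProjection | undefined, currentEpoch: string): boolean {
  // A missing projection or a different epoch triggers a fresh bootstrap.
  return saved === undefined || saved.epoch !== currentEpoch;
}

// Binding left behind by the overflow retry: it names thread-fresh but keeps the old epoch.
const afterRetry: SavedProjection = { mode: "thread_bootstrap", epoch: "epoch-before" };

// The epoch matches, so the check reports that no reprojection is needed,
// even though thread-fresh never received the projected context.
const skipsReprojection = !needsReprojection(afterRetry, "epoch-before");
```

Under this assumption, clearing the projection on the fresh thread is enough to make the next run reproject, because an undefined projection always fails the check.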

Why this is separate from PR #88262

PR #88262 changed the mirrored-history injection behavior. This issue is about a context-engine overflow recovery path that already has an active context engine.

It is still worth filing from the same audit because it is the one active lossless-claw path that looked wrong. The normal active context-engine path appears to pass historyMessages into assembly and project context correctly, but this overflow retry path can create a fresh native thread without carrying context while retaining a projection binding.

Suggested fix direction

Do not save or retain a thread_bootstrap projection binding for a fresh retry unless the fresh retry actually bootstrapped that context.

One possible shape:

flowchart TD
    A["turn/start overflow on resumed context-engine thread"] --> B["Need fresh Codex thread"]
    B --> C{"Can project overflow-safe context?"}
    C -->|Yes| D["Reproject bounded context into fresh turn/start"]
    D --> E["Save binding with projection epoch"]
    C -->|No| F["Start fresh without projection"]
    F --> G["Do not mark binding as projected"]
    G --> H["Fail loudly or force reproject on next run"]

At minimum, add a regression assertion that savedBinding.contextEngine.projection is absent or marked stale when the retry input does not contain the context-engine projection.
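
One way to phrase that regression check is as a small predicate over the saved binding and the retry input. This is a sketch only: projectionClaimIsHonest, RegressionBinding, and the stale flag are hypothetical, and the marker string is borrowed from the note in the sequence diagram rather than from a confirmed constant in the codebase.

```typescript
// Hypothetical shapes for illustration; the real binding type may differ.
type RegressionBinding = {
  threadId: string;
  contextEngine?: {
    engineId: string;
    projection?: { mode: string; epoch: string; stale?: boolean };
  };
};

// Assumed marker for projected context, taken from the diagram's note.
const CONTEXT_MARKER = "OpenClaw assembled context";

function projectionClaimIsHonest(saved: RegressionBinding, retryInput: string): boolean {
  const projection = saved.contextEngine?.projection;
  // An absent or stale projection makes no claim about the fresh thread.
  if (projection === undefined || projection.stale === true) return true;
  // Otherwise the retry input must actually have carried the projected context.
  return retryInput.indexOf(CONTEXT_MARKER) !== -1;
}
```

In a test, the predicate would be evaluated against the saved binding and the captured retry input, and it should return false for the behavior reported here (bare "hello" input with a retained epoch-before projection).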

Evidence checked

  • Merged main commit inspected: 530351e394a19b1dd2943cb08259657a13f90572
  • Local PR-head audit checkout: 04a4427d0c0b8f2e7bb666bbbcda5c557d033972
  • Relevant test: extensions/codex/src/app-server/run-attempt.context-engine.test.ts
