openclaw - 💡(How to fix) Fix Codex long-running sessions should use semantic thread/bootstrap cache ownership [3 pull requests]

Fix Action

Fixed


Codex long-running sessions should use semantic thread/bootstrap cache ownership instead of hard native-token rotation

Problem

Long-running Discord/Codex sessions can still become slow after #85978 when they do not enter or retain the context-engine thread_bootstrap path. The local Codex app-server startup guard can still log:

codex app-server native transcript exceeded active token limit; starting a fresh thread
nativeTokens=116268, max 70000

That means OpenClaw clears the saved native Codex thread and starts cold, even though the selected model may have much more context headroom. The 70k number is an OpenClaw native-thread reuse guard, not the model context window.
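The distinction between the two limits can be made concrete with a minimal sketch. Everything below is hypothetical: the function, the type names, and the 200000-token model window are illustrative assumptions, not OpenClaw's actual code or any real model's limit. Only the 70000 guard value and the 116268 token count come from the log line above.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not OpenClaw's API.
const NATIVE_THREAD_REUSE_GUARD = 70000; // the "max 70000" value from the log

interface StartupInput {
  nativeTokens: number;       // tokens in the saved native Codex transcript
  modelContextWindow: number; // context window of the selected model (assumed known)
}

type StartupOutcome =
  | "reuse"                      // resume the saved native thread
  | "cold_start_guard"           // OpenClaw's own reuse guard fired
  | "cold_start_model_overflow"; // the transcript really exceeds the model window

function classifyStartup(input: StartupInput): StartupOutcome {
  if (input.nativeTokens > input.modelContextWindow) {
    return "cold_start_model_overflow";
  }
  if (input.nativeTokens > NATIVE_THREAD_REUSE_GUARD) {
    return "cold_start_guard";
  }
  return "reuse";
}

// The logged case: 116268 native tokens against an assumed 200000-token window.
const loggedCase = classifyStartup({ nativeTokens: 116268, modelContextWindow: 200000 });
```

Under these assumptions the logged case is classified as a guard-driven cold start rather than a model overflow, which is the situation this issue describes.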

Current Understanding

  • The Pi embedded runner normally reloads and injects bootstrap files on every turn, then relies on stable prompt-prefix and provider-cache behavior to amortize the repeated bytes.
  • Codex app-server has a different efficient path: persistent native thread reuse. Context engines can return contextProjection.mode = "thread_bootstrap" so OpenClaw injects assembled history once for a stable epoch and then resumes the same native thread.
  • Current lossless-claw main appears designed for this path: it returns thread_bootstrap with an epoch derived from summary context state rather than ordinary fresh-tail growth.
  • #85978 fixes one bug in this path by preventing the startup size guard from deleting a still-valid bootstrapped binding before compatibility is checked. It now also keeps stale/no-active-engine bindings safe.
  • The broader architecture gap remains for legacy/non-bootstrap sessions, old lossless-claw builds, session-file rollover, blunt compaction invalidation, workspace bootstrap surfaces outside the projection contract, and insufficient rotation diagnostics.
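The thread_bootstrap behavior described above (inject assembled history once per stable epoch, then resume the same native thread) might be sketched as follows. The `mode` value and the notion of an epoch come from the issue text; the interfaces, the `planTurn` function, and the example ids are hypothetical and only illustrate the decision, not the real context-engine contract.

```typescript
// Hypothetical shapes: only mode = "thread_bootstrap" and "epoch" come from the issue.
interface ContextProjection {
  mode: "thread_bootstrap";
  epoch: string; // stable while summary context state is unchanged
}

interface SavedThreadBinding {
  threadId: string; // native Codex thread id
  epoch: string;    // epoch the thread was bootstrapped with
}

type TurnPlan =
  | { kind: "resume"; threadId: string }      // reuse the native thread as-is
  | { kind: "bootstrap"; injectHistory: true }; // new thread, inject assembled history once

function planTurn(
  saved: SavedThreadBinding | null,
  projection: ContextProjection,
): TurnPlan {
  // Same epoch: the native thread already holds the projected history.
  if (saved !== null && saved.epoch === projection.epoch) {
    return { kind: "resume", threadId: saved.threadId };
  }
  // No binding, or the epoch moved: bootstrap a fresh thread.
  return { kind: "bootstrap", injectHistory: true };
}

const firstTurn = planTurn(null, { mode: "thread_bootstrap", epoch: "epoch-1" });
const laterTurn = planTurn(
  { threadId: "thread-a", epoch: "epoch-1" },
  { mode: "thread_bootstrap", epoch: "epoch-1" },
);
```

In this sketch ordinary fresh-tail growth never changes the plan; only an epoch change (or a missing binding) causes a new bootstrap.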

Desired Architecture

For Codex app-server, native-thread rotation should be primarily semantic:

  • rotate on /new or /reset;
  • rotate on model/provider/auth/tool/MCP/app/environment incompatibility;
  • rotate on context-engine policy or projection epoch/fingerprint change;
  • rotate when a saved context-engine binding has no current active context engine;
  • rotate when the app-server reports the native thread is gone or actually overflows;
  • do not rotate solely because a compatible thread_bootstrap native rollout is above a hard-coded 70k guard.
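As a rough illustration, the rules above could be expressed as a single decision function that returns a reason or `null`. All names here are hypothetical and the order of the checks is an assumption. The point of the sketch is only that the token count is carried for diagnostics and never consulted as a rotation trigger.

```typescript
// Hypothetical names throughout; this is not OpenClaw's actual rotation code.
type RotationReason =
  | "user_reset"           // /new or /reset
  | "runtime_incompatible" // model/provider/auth/tool/MCP/app/environment mismatch
  | "no_active_engine"     // saved engine binding, but no active context engine
  | "projection_changed"   // policy or projection epoch/fingerprint changed
  | "thread_gone"          // app-server reports the native thread is gone
  | "native_overflow";     // app-server reports an actual overflow

interface RotationInput {
  userRequestedReset: boolean;
  runtimeCompatible: boolean;
  savedEngineId: string | null;
  activeEngineId: string | null;
  savedProjectionFingerprint: string | null;
  currentProjectionFingerprint: string | null;
  appServerThreadGone: boolean;
  appServerOverflow: boolean;
  nativeTokens: number; // carried for diagnostics only
}

function decideRotation(i: RotationInput): RotationReason | null {
  if (i.userRequestedReset) return "user_reset";
  if (!i.runtimeCompatible) return "runtime_incompatible";
  if (i.savedEngineId !== null && i.activeEngineId === null) return "no_active_engine";
  if (i.savedProjectionFingerprint !== i.currentProjectionFingerprint) return "projection_changed";
  if (i.appServerThreadGone) return "thread_gone";
  if (i.appServerOverflow) return "native_overflow";
  return null; // deliberately no size-based rule
}

// A compatible bootstrapped thread well above the old 70k guard.
const compatibleLargeThread: RotationInput = {
  userRequestedReset: false,
  runtimeCompatible: true,
  savedEngineId: "lossless-claw",
  activeEngineId: "lossless-claw",
  savedProjectionFingerprint: "epoch-1",
  currentProjectionFingerprint: "epoch-1",
  appServerThreadGone: false,
  appServerOverflow: false,
  nativeTokens: 116268,
};
const largeThreadDecision = decideRotation(compatibleLargeThread);
```

With these inputs the decision is `null`, so the thread is kept even though it holds 116268 native tokens.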

The native thread should be treated as a projection cache keyed by stable session/channel identity plus context-engine conversation/projection identity, not only by sessionFile + ".codex-app-server.json".
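To make the keying difference concrete, here is a small sketch that contrasts the file-based binding name with a semantic key. The `".codex-app-server.json"` suffix is taken from the text above. The key fields, function names, and session-file paths are hypothetical and chosen only for illustration.

```typescript
// Hypothetical key shape; field names are assumptions, not OpenClaw's schema.
interface ProjectionCacheKey {
  sessionKey: string;      // stable session identity
  channelId: string;       // stable channel identity
  engineId: string;        // context engine that owns the projection
  conversationId: string;  // context-engine conversation identity
  projectionEpoch: string; // projection epoch/fingerprint
}

// File-based binding name as described in the issue text.
function legacyBindingPath(sessionFile: string): string {
  return sessionFile + ".codex-app-server.json";
}

// Semantic key: independent of which session file currently backs the session.
function semanticBindingKey(key: ProjectionCacheKey): string {
  return JSON.stringify([
    key.sessionKey,
    key.channelId,
    key.engineId,
    key.conversationId,
    key.projectionEpoch,
  ]);
}

const identity: ProjectionCacheKey = {
  sessionKey: "discord:agent-main",
  channelId: "channel-42",
  engineId: "lossless-claw",
  conversationId: "conv-7",
  projectionEpoch: "epoch-1",
};

// Illustrative session-file rollover: the file name changes, the identity does not.
const pathBeforeRollover = legacyBindingPath("sessions/abc-0001.jsonl");
const pathAfterRollover = legacyBindingPath("sessions/abc-0002.jsonl");
const keyBeforeRollover = semanticBindingKey(identity);
const keyAfterRollover = semanticBindingKey({ ...identity });
```

In this sketch the legacy path changes on rollover and would lose the binding, while the semantic key stays the same until the projection epoch or one of the identity fields changes.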

Proposed Follow-Ups

  1. Add a Codex native-thread rotation reason enum and diagnostics block. Log current/saved engine id, policy fingerprint, epoch/fingerprint, dynamic tools, MCP/app/environment/auth/model fingerprints, token source, native/session tokens, and whether mirrored history was projected.

  2. Make the native reuse guard model/config/context-owner aware. Keep strict clearing for legacy or ownerless sessions, but treat compatible context-engine thread_bootstrap sessions as semantically owned by the context engine unless the app-server actually rejects the turn.

  3. Preserve or migrate Codex bindings across LCM/session-file rollover when conversation identity and projection epoch remain compatible.

  4. Add explicit workspace bootstrap fingerprints to Codex thread binding/diagnostics. Track stable inherited developer instructions, turn-scoped collaboration instructions, prompt context contributors, and native project-doc loading separately.

  5. Revisit compaction invalidation. Successful context-engine-owned compaction currently clears Codex bindings. If compaction does not change projection epoch/fingerprint, native reuse may be preservable.
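The diagnostics block from follow-up 1 might look something like the sketch below, which compares saved and current fingerprints per dimension and renders one log line. The dimension list mirrors the fields named in that item. The record shape, the `tokenSource` values, the log format, and the example numbers other than 116268 are hypothetical.

```typescript
// Hypothetical diagnostics shape; dimension names mirror follow-up 1.
const DIMENSIONS = [
  "engineId", "policy", "projectionEpoch", "dynamicTools",
  "mcp", "app", "environment", "auth", "model",
] as const;
type Dimension = (typeof DIMENSIONS)[number];
type BindingFingerprints = Record<Dimension, string>;

interface RotationDiagnostics {
  reason: string;                   // e.g. a rotation reason enum value
  changed: Dimension[];             // dimensions whose fingerprints differ
  tokenSource: "native" | "session"; // assumed labels for where the count came from
  nativeTokens: number;
  sessionTokens: number;
  mirroredHistoryProjected: boolean;
}

// List every dimension whose saved and current fingerprints disagree.
function changedDimensions(
  saved: BindingFingerprints,
  current: BindingFingerprints,
): Dimension[] {
  return DIMENSIONS.filter((d) => saved[d] !== current[d]);
}

// Render a single key=value log line (format is illustrative).
function formatDiagnostics(d: RotationDiagnostics): string {
  return [
    `reason=${d.reason}`,
    `changed=${d.changed.length > 0 ? d.changed.join(",") : "none"}`,
    `tokenSource=${d.tokenSource}`,
    `nativeTokens=${d.nativeTokens}`,
    `sessionTokens=${d.sessionTokens}`,
    `mirroredHistoryProjected=${d.mirroredHistoryProjected}`,
  ].join(" ");
}

const savedPrints: BindingFingerprints = {
  engineId: "lossless-claw", policy: "p1", projectionEpoch: "epoch-1",
  dynamicTools: "t1", mcp: "m1", app: "a1", environment: "e1",
  auth: "u1", model: "model-a",
};
const currentPrints: BindingFingerprints = { ...savedPrints, model: "model-b" };

const diagnosticsLine = formatDiagnostics({
  reason: "runtime_incompatible",
  changed: changedDimensions(savedPrints, currentPrints),
  tokenSource: "native",
  nativeTokens: 116268,
  sessionTokens: 90000,
  mirroredHistoryProjected: false,
});
```

A line in this style would let a reader see at a glance that the model fingerprint, not the token count, forced the cold start.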

Acceptance Criteria

  • A long-running single-agent Codex/Discord session with stable lossless-claw thread_bootstrap epoch can exceed 70k native rollout tokens without cold-starting every turn.
  • When a turn cold-starts, the logs state exactly which semantic or runtime compatibility dimension forced it.
  • If LCM compacts, rotates, or rewrites the transcript, OpenClaw either preserves the compatible Codex binding or logs the exact epoch/policy/session identity reason it could not.
  • /doctor or equivalent status output distinguishes model/provider context overflow from OpenClaw native-thread reuse guard rotation.

Related

  • #85975
  • #85978
  • Architecture note: /Volumes/LEXAR/Codex/openclaw-codex-long-session-architecture-20260524.md
