Root Cause

rotate on /new or /reset;
rotate on model/provider/auth/tool/MCP/app/environment incompatibility;
rotate on context-engine policy or projection epoch/fingerprint change;
rotate when a saved context-engine binding has no current active context engine;
rotate when the app-server reports the native thread is gone or actually overflows;
do not rotate solely because a compatible thread_bootstrap native rollout is above a hard-coded 70k guard.

Fix Action

Fixed

Fixed by PR: Fix Codex native thread reuse for context-engine bootstraps (https://github.com/openclaw/openclaw/pull/85978)
Fixed by PR: fix(codex): make native thread token guard configurable (https://github.com/openclaw/openclaw/pull/86069)
Fixed by PR: fix(codex): preserve semantic native threads across compaction (https://github.com/openclaw/openclaw/pull/86160)

Codex long-running sessions should use semantic thread/bootstrap cache ownership instead of hard native-token rotation

StepCodex · 2026-05-24T11:10:24Z


Problem

Long-running Discord/Codex sessions can still become slow after #85978 when they do not enter or retain the context-engine thread_bootstrap path. The local Codex app-server startup guard can still log:

codex app-server native transcript exceeded active token limit; starting a fresh thread
nativeTokens=116268, max 70000

That means OpenClaw clears the saved native Codex thread and starts cold, even though the selected model may have much more context headroom. The 70k number is an OpenClaw native-thread reuse guard, not the model context window.
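To tell guard-driven cold starts apart from real model overflow when reading logs, a small parser for the message above can help. This is a minimal sketch: `parseGuardRotation` and `GuardRotation` are hypothetical names, and only the log wording and the two numbers come from the report.

```typescript
// Hypothetical helper: only the log wording and numbers come from the issue text.
interface GuardRotation {
  nativeTokens: number; // size of the saved native transcript
  maxTokens: number;    // OpenClaw reuse guard, not the model context window
  overBy: number;       // how far the transcript exceeded the guard
}

const GUARD_MESSAGE = "native transcript exceeded active token limit";
const GUARD_NUMBERS = /nativeTokens=(\d+),\s*max\s+(\d+)/;

// Returns the parsed numbers when the text is a reuse-guard rotation, otherwise null.
function parseGuardRotation(logText: string): GuardRotation | null {
  if (logText.indexOf(GUARD_MESSAGE) === -1) return null;
  const match = GUARD_NUMBERS.exec(logText);
  if (match === null) return null;
  const nativeTokens = Number(match[1]);
  const maxTokens = Number(match[2]);
  return { nativeTokens, maxTokens, overBy: nativeTokens - maxTokens };
}
```

For the logged values, the transcript is 46268 tokens over the guard. That figure says nothing about how close the session is to the selected model's context limit.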

Current Understanding

The Pi embedded runner normally reloads and injects bootstrap files every turn, then relies on stable prompt-prefix/provider-cache behavior to amortize the repeated bytes.
Codex app-server has a different efficient path: persistent native thread reuse. Context engines can return contextProjection.mode = "thread_bootstrap" so OpenClaw injects assembled history once for a stable epoch and then resumes the same native thread.
Current lossless-claw main appears designed for this path: it returns thread_bootstrap with an epoch derived from summary context state rather than ordinary fresh-tail growth.
#85978 fixes one bug in this path by preventing the startup size guard from deleting a still-valid bootstrapped binding before compatibility is checked. It now also keeps stale/no-active-engine bindings safe.
The broader architecture gap remains for legacy/non-bootstrap sessions, old lossless-claw builds, session-file rollover, blunt compaction invalidation, workspace bootstrap surfaces outside the projection contract, and insufficient rotation diagnostics.
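The epoch behavior described above can be sketched as follows. This is an illustration under assumptions, not lossless-claw's actual code: `ContextState`, `projectContext`, and `needsBootstrap` are hypothetical, and only the mode string `thread_bootstrap` comes from the issue. The point is that the epoch depends on summary state alone, so ordinary fresh-tail growth does not force a new bootstrap.

```typescript
// Hypothetical shapes: only the mode string "thread_bootstrap" appears in the issue.
interface ContextState {
  summaryIds: string[]; // compacted summaries that define the projected history
  freshTail: string[];  // recent raw messages that grow on every turn
}

interface ContextProjection {
  mode: "thread_bootstrap";
  epoch: string;
}

function projectContext(state: ContextState): ContextProjection {
  // The fresh tail is deliberately left out, so ordinary turns keep the epoch stable.
  const epoch = "summaries:" + state.summaryIds.join(",");
  return { mode: "thread_bootstrap", epoch };
}

// True when the assembled history must be injected again instead of resuming the thread.
function needsBootstrap(savedEpoch: string | undefined, projection: ContextProjection): boolean {
  return savedEpoch !== projection.epoch;
}
```

With this shape, a session bootstraps once, resumes the same native thread while only the tail grows, and bootstraps again only after the summary state changes.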

Desired Architecture

For Codex app-server, native-thread rotation should be primarily semantic:

rotate on /new or /reset;
rotate on model/provider/auth/tool/MCP/app/environment incompatibility;
rotate on context-engine policy or projection epoch/fingerprint change;
rotate when a saved context-engine binding has no current active context engine;
rotate when the app-server reports the native thread is gone or actually overflows;
do not rotate solely because a compatible thread_bootstrap native rollout is above a hard-coded 70k guard.

The native thread should be treated as a projection cache keyed by stable session/channel identity plus context-engine conversation/projection identity, not only by sessionFile + ".codex-app-server.json".
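The rotation rules above can be read as a single decision function. The sketch below is one possible shape, assuming the compatibility dimensions are already reduced to fingerprints; every type and function name is hypothetical, since the issue lists the rules but does not define an API.

```typescript
// All names below are hypothetical: the issue lists the rules but defines no API.
type RotationReason =
  | "user_reset"
  | "runtime_incompatible"
  | "projection_changed"
  | "no_active_engine"
  | "thread_gone_or_overflow";

interface EngineIdentity {
  id: string;
  policyFingerprint: string;
  projectionEpoch: string;
}

interface SavedBinding {
  runtimeFingerprint: string; // model/provider/auth/tool/MCP/app/environment
  engine?: EngineIdentity;    // present when a context engine owns the thread
}

interface TurnState {
  resetRequested: boolean;    // /new or /reset
  runtimeFingerprint: string;
  activeEngine?: EngineIdentity;
  appServerRejected: boolean; // thread gone or real overflow reported by the app-server
  nativeTokens: number;       // carried for diagnostics only
}

// Returns the reason to rotate, or null when the saved native thread can be reused.
function decideRotation(saved: SavedBinding, turn: TurnState): RotationReason | null {
  if (turn.resetRequested) return "user_reset";
  if (saved.runtimeFingerprint !== turn.runtimeFingerprint) return "runtime_incompatible";
  if (saved.engine !== undefined) {
    const active = turn.activeEngine;
    if (active === undefined) return "no_active_engine";
    if (
      active.id !== saved.engine.id ||
      active.policyFingerprint !== saved.engine.policyFingerprint ||
      active.projectionEpoch !== saved.engine.projectionEpoch
    ) {
      return "projection_changed";
    }
  }
  if (turn.appServerRejected) return "thread_gone_or_overflow";
  return null; // nativeTokens is intentionally never compared against a fixed limit
}

// Cache key built from session identity plus conversation/projection identity.
function projectionCacheKey(sessionIdentity: string, conversationId: string, engine: EngineIdentity): string {
  return JSON.stringify([sessionIdentity, conversationId, engine.id, engine.projectionEpoch]);
}
```

The sketch does not consult `nativeTokens` at all, which matches the last rule for compatible `thread_bootstrap` sessions. It leaves out the strict size guard that legacy or ownerless sessions would still need.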

Proposed Follow-Ups

Add a Codex native-thread rotation reason enum and diagnostics block. Log current/saved engine id, policy fingerprint, epoch/fingerprint, dynamic tools, MCP/app/environment/auth/model fingerprints, token source, native/session tokens, and whether mirrored history was projected.
Make the native reuse guard model/config/context-owner aware. Keep strict clearing for legacy or ownerless sessions, but treat compatible context-engine thread_bootstrap sessions as semantically owned by the context engine unless the app-server actually rejects the turn.
Preserve or migrate Codex bindings across LCM/session-file rollover when conversation identity and projection epoch remain compatible.
Add explicit workspace bootstrap fingerprints to Codex thread binding/diagnostics. Track stable inherited developer instructions, turn-scoped collaboration instructions, prompt context contributors, and native project-doc loading separately.
Revisit compaction invalidation. Successful context-engine-owned compaction currently clears Codex bindings. If compaction does not change projection epoch/fingerprint, native reuse may be preservable.
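The first two follow-ups could fit together roughly as below. This is a hedged sketch: the reason values, field names, and log format are all hypothetical, and a string-literal union stands in for the proposed enum. It keeps the strict size guard for ownerless sessions and skips it for sessions owned by a compatible context engine.

```typescript
// Hypothetical reason values and fields: the issue proposes the idea, not a schema.
type RotationReason = "none" | "legacy_size_guard" | "app_server_rejected";

interface GuardInput {
  ownedByContextEngine: boolean; // compatible thread_bootstrap binding
  nativeTokens: number;
  maxNativeTokens: number;       // configurable reuse guard
  appServerRejected: boolean;    // the app-server actually rejected the turn
}

interface RotationDiagnostics {
  reason: RotationReason;
  nativeTokens: number;
  maxNativeTokens: number;
  ownedByContextEngine: boolean;
}

// Applies the size guard only to legacy or ownerless sessions.
function evaluateReuseGuard(input: GuardInput): RotationDiagnostics {
  let reason: RotationReason = "none";
  if (input.appServerRejected) {
    reason = "app_server_rejected";
  } else if (!input.ownedByContextEngine && input.nativeTokens > input.maxNativeTokens) {
    reason = "legacy_size_guard";
  }
  return {
    reason,
    nativeTokens: input.nativeTokens,
    maxNativeTokens: input.maxNativeTokens,
    ownedByContextEngine: input.ownedByContextEngine,
  };
}

// Illustrative single-line log format, not an existing OpenClaw log line.
function formatDiagnostics(d: RotationDiagnostics): string {
  return (
    "rotation reason=" + d.reason +
    " nativeTokens=" + d.nativeTokens +
    " max=" + d.maxNativeTokens +
    " contextOwned=" + d.ownedByContextEngine
  );
}
```

A real diagnostics block would also carry the engine id, policy fingerprint, epoch, and the other fingerprints listed in the first follow-up. They are omitted here to keep the example short.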

Acceptance Criteria

A long-running single-agent Codex/Discord session with stable lossless-claw thread_bootstrap epoch can exceed 70k native rollout tokens without cold-starting every turn.
When a turn cold-starts, the logs state exactly which semantic or runtime compatibility dimension forced it.
If LCM compacts, rotates, or rewrites the transcript, OpenClaw either preserves the compatible Codex binding or logs the exact epoch/policy/session identity reason it could not.
/doctor or equivalent status output distinguishes model/provider context overflow from OpenClaw native-thread reuse guard rotation.

Related

#85975
#85978
Architecture note: /Volumes/LEXAR/Codex/openclaw-codex-long-session-architecture-20260524.md
