codex: thread/rollback numTurns uses inconsistent units between core history and app-server replay

What version of Codex CLI is running?

v0.131.0-alpha.16

What subscription do you have?

Pro

Which model were you using?

gpt-5.5-xhigh

What platform is your computer?

Linux 6.8.0-1050-aws x86_64 x86_64

What terminal emulator and version are you using (if applicable)?

No response

Codex doctor report

No response

What issue are you seeing?

thread/rollback can make the TUI-local transcript and the app-server replayed transcript disagree about which history survived a successful rewind.

The root cause appears to be that the same numTurns value is interpreted in different units:

- core/protocol treat numTurns as a count of user/instruction-turn rollback boundaries;
- app-server ThreadHistory replay treats numTurns as a count of raw app-server Turn objects;
- TUI backtrack computes numTurns from visible user messages.

This only works when every raw app-server Turn contains exactly one rollback boundary. Valid histories can violate that invariant.
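To make the three units concrete, here is a minimal sketch that counts each of them over the same history. The `Item` enum, the `Turn` alias, and all function names are hypothetical simplifications for illustration, not the real codex-rs types.

```rust
/// Hypothetical, simplified item kinds; the real codex-rs types differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Item {
    /// Ordinary, non-contextual user message (visible in the TUI).
    UserMessage,
    /// Structured assistant inter-agent instruction (a boundary for core only).
    InterAgentInstruction,
    /// Anything else: assistant output, tool calls, contextual user messages.
    Other,
}

/// One raw app-server turn, modeled as a flat list of items (assumption).
type Turn = Vec<Item>;

/// Unit used by app-server replay: raw turns.
fn raw_turn_count(history: &[Turn]) -> usize {
    history.len()
}

/// Unit used by TUI backtrack: visible user messages.
fn visible_user_message_count(history: &[Turn]) -> usize {
    history
        .iter()
        .flatten()
        .filter(|i| **i == Item::UserMessage)
        .count()
}

/// Unit used by core: user messages plus inter-agent instructions.
fn core_boundary_count(history: &[Turn]) -> usize {
    history
        .iter()
        .flatten()
        .filter(|i| matches!(**i, Item::UserMessage | Item::InterAgentInstruction))
        .count()
}

/// A history with mid-turn steering: the third raw turn holds three user messages.
fn steering_history() -> Vec<Turn> {
    use Item::*;
    vec![
        vec![UserMessage, Other],
        vec![UserMessage, Other],
        vec![UserMessage, UserMessage, UserMessage, Other],
        vec![UserMessage, Other],
    ]
}
```

For `steering_history()` the raw-turn count is 4 while both boundary counts are 6, so a single numTurns value cannot mean the same thing to all three consumers.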

- codex-rs/protocol/src/protocol.rs documents Op::ThreadRollback as “drop the last N user turns from in-memory context”.
- ThreadRolledBackEvent.num_turns is documented as “Number of user turns that were removed from context”.
- codex-rs/core/src/context_manager/history.rs::drop_last_n_user_turns drops the last N instruction turns.
- is_user_turn_boundary treats ordinary non-contextual user messages and structured assistant inter-agent instructions as rollback boundaries.
- codex-rs/app-server/src/request_processors/thread_processor.rs forwards num_turns unchanged to Op::ThreadRollback { num_turns }.
- codex-rs/app-server-protocol/src/protocol/thread_history.rs::ThreadHistoryBuilder::handle_thread_rollback clears/truncates self.turns by raw Turn count.

Problematic cases include:

- a raw app-server turn with zero visible UserMessage items, such as a goal/autonomous continuation turn or other agent-only/tool-only turn;
- a raw app-server turn with multiple UserMessage items, such as mid-turn steering;
- a structured assistant inter-agent instruction, which core counts as an instruction-turn boundary but TUI-visible user-message counting does not.
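The three cases above can be expressed as a per-turn check. This is only a sketch: the `Item` and `TurnShape` enums and the two predicates are hypothetical names that model the described semantics, not the actual `is_user_turn_boundary` implementation.

```rust
/// Hypothetical item kinds inside one raw app-server turn.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Item {
    UserMessage,
    InterAgentInstruction,
    ContextualUserMessage,
    AgentOutput,
}

/// How a raw turn relates to the "one turn, one boundary" assumption.
#[derive(Debug, PartialEq, Eq)]
enum TurnShape {
    /// Exactly one boundary, seen by both core and the TUI: numTurns is unambiguous.
    OneToOne,
    /// No boundary at all, e.g. a goal / agent-only turn.
    NoBoundary,
    /// Several boundaries in one raw turn, e.g. mid-turn steering.
    MultipleBoundaries(usize),
    /// Core sees a boundary that visible-user-message counting misses.
    CoreOnlyBoundary,
}

/// Models core's rule as described in this report (assumption).
fn is_core_boundary(item: Item) -> bool {
    matches!(item, Item::UserMessage | Item::InterAgentInstruction)
}

/// Models the TUI's visible-user-message counting (assumption).
fn is_tui_visible_boundary(item: Item) -> bool {
    item == Item::UserMessage
}

fn classify(turn: &[Item]) -> TurnShape {
    let core = turn.iter().filter(|i| is_core_boundary(**i)).count();
    let tui = turn.iter().filter(|i| is_tui_visible_boundary(**i)).count();
    if core != tui {
        TurnShape::CoreOnlyBoundary
    } else if core == 0 {
        TurnShape::NoBoundary
    } else if core == 1 {
        TurnShape::OneToOne
    } else {
        TurnShape::MultipleBoundaries(core)
    }
}
```

Any history containing a turn that is not `OneToOne` is a candidate for the mismatch described in this report.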

This can make core model history, app-server thread/read includeTurns, persisted rollout reconstruction, resume/fork reconstruction, and TUI backtrack disagree about which conversation prefix survived rollback.

What steps can reproduce the bug?

This can be reproduced with histories where raw app-server Turn objects do not map 1:1 to visible user-message rollback boundaries.

Below are two simplified scenarios. T1, T2, etc. are raw app-server turns. U1, U2, etc. are visible user messages. A1, A2, etc. are assistant messages.

The same class of mismatch may also affect any feature that creates raw turns with zero visible user messages or core-only rollback boundaries, such as standalone !cmd, compaction/review/hook/abort-only turns, and subagent/multi-agent structured inter-agent messages. The two scenarios below are the clearest examples.

Scenario 1: multiple steer messages inside one raw turn

Suppose the persisted history has this shape:

T1: U1 -> A1
T2: U2 -> A2
T3: U3
    U4  // steer
    U5  // steer
    A3
T4: U6 -> A6

Now use the TUI backtrack/rewind UI and rewind to U3.

From the TUI's point of view, the visible user messages are:

U1, U2, U3, U4, U5, U6

So rewinding to U3 asks to remove the last four visible user-message boundaries:

num_turns = 4   // U3, U4, U5, U6

The TUI-local transcript is then trimmed to the prefix before U3:

T1: U1 -> A1
T2: U2 -> A2

That is also what core user/instruction-boundary semantics would imply, assuming U3, U4, U5, and U6 are ordinary non-contextual user messages.

However, app-server ThreadHistory replay currently treats the same num_turns = 4 as four raw app-server turns. Since the raw turns are:

T1, T2, T3, T4

replaying the rollback as raw-turn truncation removes all four turns:

app-server replayed transcript: empty

So after TUI rewind:

TUI shows:                  T1, T2
app-server/session replay:  empty

This is a visible data mismatch: the TUI appears to preserve U1/A1 and U2/A2, but thread/read includeTurns, resume, fork, or another client rebuilding from app-server ThreadHistory can see a different transcript.
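Scenario 1 can be modeled in a few lines. The sketch below is not the real replay code: turns are plain label lists, labels starting with `U` stand for visible user messages, and both function names are hypothetical.

```rust
/// Hypothetical model: a raw turn is a list of labels.
/// "U*" = visible user message, "A*" = assistant message.
type Turn = Vec<&'static str>;

fn scenario_one() -> Vec<Turn> {
    vec![
        vec!["U1", "A1"],
        vec!["U2", "A2"],
        vec!["U3", "U4", "U5", "A3"], // U4 and U5 are steer messages
        vec!["U6", "A6"],
    ]
}

/// Models the raw-turn replay described above: drop the last `num_turns` raw turns.
fn replay_by_raw_turns(mut turns: Vec<Turn>, num_turns: usize) -> Vec<Turn> {
    let keep = turns.len().saturating_sub(num_turns);
    turns.truncate(keep);
    turns
}

/// Models TUI-style trimming: keep everything before the `num_turns`-th
/// visible user message, counted from the end of the transcript.
fn trim_by_visible_user_messages(turns: Vec<Turn>, num_turns: usize) -> Vec<&'static str> {
    let flat: Vec<&'static str> = turns.into_iter().flatten().collect();
    let user_positions: Vec<usize> = flat
        .iter()
        .enumerate()
        .filter(|(_, label)| label.starts_with('U'))
        .map(|(i, _)| i)
        .collect();
    if num_turns == 0 || user_positions.is_empty() {
        return flat;
    }
    // If num_turns exceeds the number of user messages, cut at the first one.
    let first_removed = user_positions.len().saturating_sub(num_turns);
    flat[..user_positions[first_removed]].to_vec()
}
```

With `num_turns = 4`, `replay_by_raw_turns` leaves nothing, while `trim_by_visible_user_messages` keeps `U1, A1, U2, A2`, which is the disagreement shown above.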

Scenario 2: goal / agent-only raw turn with zero visible user messages

A similar mismatch can happen in the opposite direction when the rollback span contains a raw turn with no visible user message, for example a goal-driven / autonomous / agent-only continuation turn.

Suppose the persisted history has this shape:

T1: U1 -> A1
T2: goal / agent-only output
    // no visible UserMessage in this raw turn
T3: U2 -> A2

Now use the TUI backtrack/rewind UI and rewind to the first visible user message, U1.

From the TUI's point of view, the visible user messages are:

U1, U2

So rewinding to U1 asks to remove the last two visible user-message boundaries:

num_turns = 2   // U1, U2

The TUI-local transcript is then trimmed to the prefix before U1:

TUI-local transcript: empty

Core user/instruction-boundary semantics would also remove both visible user turns, leaving no prior user conversation.

However, app-server ThreadHistory replay sees three raw turns:

T1, T2, T3

and treats num_turns = 2 as “remove the last two raw turns”. It removes T2 and T3, but leaves T1:

app-server replayed transcript:

T1: U1 -> A1

So after TUI rewind:

TUI shows:                  empty
app-server/session replay:  T1: U1 -> A1

In both scenarios, rollback can look successful in the TUI because the TUI trims its local transcript by visible user-message position. The persisted/app-server replay can still disagree because it applies the same num_turns value to a different unit: raw app-server Turn objects. The same class of unit mismatch can also become a model-context problem when future requests are built from a mismatched reconstructed history, or when core has rollback boundaries that are not visible user messages.
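The way num_turns is derived on the TUI side, and how raw-turn replay then misapplies it in Scenario 2, can be sketched as follows. The label convention (`U*` visible user message, `G*` goal/agent-only output, `A*` assistant message) and both function names are assumptions made for this example only.

```rust
/// Hypothetical model: a raw turn is a list of labels.
type Turn = Vec<&'static str>;

/// Computes num_turns as this report describes TUI backtrack doing it:
/// the number of visible user messages at or after the rewind target.
fn num_turns_for_rewind(turns: &[Turn], target: &str) -> Option<usize> {
    let visible: Vec<&str> = turns
        .iter()
        .flatten()
        .copied()
        .filter(|label| label.starts_with('U'))
        .collect();
    let pos = visible.iter().position(|label| *label == target)?;
    Some(visible.len() - pos)
}

/// Models the raw-turn replay: drop the last `num_turns` raw turns.
fn replay_by_raw_turns(turns: &[Turn], num_turns: usize) -> Vec<Turn> {
    let keep = turns.len().saturating_sub(num_turns);
    turns[..keep].to_vec()
}

/// Scenario 2: the middle raw turn has no visible user message.
fn scenario_two() -> Vec<Turn> {
    vec![vec!["U1", "A1"], vec!["G1"], vec!["U2", "A2"]]
}
```

Rewinding `scenario_two()` to `U1` yields `num_turns = 2`, and replaying that value by raw turns keeps `T1: U1 -> A1` even though the TUI transcript is empty.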

What is the expected behavior?

thread/rollback should use one consistent rollback boundary across:

- core in-memory history,
- persisted rollout reconstruction,
- app-server ThreadHistoryBuilder / thread/read includeTurns,
- resume/fork reconstruction,
- TUI backtrack/local transcript trimming.

Since the protocol and core already document/use numTurns as user/instruction-turn count, the least surprising fix is to keep that semantic and make app-server transcript replay follow it.

Suggested fix:

- Define numTurns explicitly as “number of rollback boundaries / instruction turns”, not raw app-server Turn objects.
- Keep app-server replay aligned with core’s rollback-boundary semantics:
  - ordinary non-contextual user messages are rollback boundaries;
  - structured assistant inter-agent instructions are rollback boundaries;
  - contextual/hidden user messages such as goal context are not ordinary visible user-message boundaries.
- Update ThreadHistoryBuilder::handle_thread_rollback to truncate by rollback boundary position rather than by raw Turn count.
- If the rollback boundary falls inside a raw app-server Turn, truncate only that turn’s item suffix instead of deleting the whole raw turn.

One maintainable implementation approach would be to record rollback-boundary positions while replaying rollout items into ThreadHistory. A boundary position could point to a raw turn index plus an item index inside that turn. Then ThreadRolledBack { num_turns } can look up the Nth boundary from the end and truncate at that exact position.
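A rough sketch of that approach follows. `HistoryBuilder`, `Item`, and the method names are hypothetical stand-ins rather than the real `ThreadHistoryBuilder` API, and dropping a turn that becomes empty after truncation is an assumption of this sketch.

```rust
/// Hypothetical stand-ins for the real app-server item types.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Item {
    UserMessage(String),
    InterAgentInstruction(String),
    Other(String),
}

#[derive(Debug, Default)]
struct HistoryBuilder {
    turns: Vec<Vec<Item>>,
    /// (raw turn index, item index) of every rollback boundary seen so far.
    boundaries: Vec<(usize, usize)>,
}

impl HistoryBuilder {
    fn start_turn(&mut self) {
        self.turns.push(Vec::new());
    }

    fn push_item(&mut self, item: Item) {
        if self.turns.is_empty() {
            self.start_turn();
        }
        let turn_idx = self.turns.len() - 1;
        let item_idx = self.turns[turn_idx].len();
        // Record the position when the item is a rollback boundary.
        if matches!(item, Item::UserMessage(_) | Item::InterAgentInstruction(_)) {
            self.boundaries.push((turn_idx, item_idx));
        }
        self.turns[turn_idx].push(item);
    }

    /// Truncates at the `num_turns`-th boundary from the end.
    fn handle_rollback(&mut self, num_turns: usize) {
        if num_turns == 0 || self.boundaries.is_empty() {
            return;
        }
        // If num_turns exceeds the boundary count, cut at the first boundary.
        let keep = self.boundaries.len().saturating_sub(num_turns);
        let (turn_idx, item_idx) = self.boundaries[keep];
        self.boundaries.truncate(keep);
        self.turns.truncate(turn_idx + 1);
        // Remove only the item suffix of the turn that contains the boundary.
        self.turns[turn_idx].truncate(item_idx);
        if self.turns[turn_idx].is_empty() {
            self.turns.pop();
        }
    }
}

fn user(label: &str) -> Item {
    Item::UserMessage(label.to_string())
}

fn other(label: &str) -> Item {
    Item::Other(label.to_string())
}
```

In this sketch, replaying Scenario 1 with `num_turns = 4` leaves `T1` and `T2`, and replaying Scenario 2 with `num_turns = 2` leaves an empty transcript, matching the TUI in both cases. A rollback of two boundaries in Scenario 1 cuts inside `T3` and keeps `U3, U4`.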

Additional information

No response
