codex: Goal sessions can pin cached_input_tokens to a small fixed prefix (e.g. 2432)

codex · 2026-05-30 18:37:32


In long-running goal-driven threads, last_token_usage.cached_input_tokens can stay pinned to a small constant (commonly 2432) for many consecutive turns.

The same thread later recovers to much higher cache hits after a context_compacted event, which strongly suggests this is a prompt-stability/composition issue (prefix invalidation), not a cache service outage.



What version of Codex is running?

codex-cli 0.135.0-alpha.1
Also observed with gpt-5.3-codex and gpt-5.5

Platform

Linux x86_64

Summary

In long-running goal-driven threads, last_token_usage.cached_input_tokens can stay pinned to a small constant (commonly 2432) for many consecutive turns.

Why this matters

When this happens, prompt-cache efficiency drops sharply for long threads, increasing token usage and latency despite substantial repeated context.

Reproduction (generic)

1. Start a thread and set an active goal (/goal) with a non-trivial objective.
2. Continue for many turns with normal user messages and tool activity.
3. Observe token_count events in the local rollout JSONL.
4. Track payload.info.last_token_usage.cached_input_tokens per turn.
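
The tracking step above can be sketched in a few lines of Python. This is a minimal sketch, not the project's own tooling. Only the path payload.info.last_token_usage.cached_input_tokens comes from this report. The top-level `timestamp` key, the `payload.type == "token_count"` discriminator, and the `input_tokens` key are assumptions about the rollout record shape and may need adjusting.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def cached_tokens_per_turn(
    lines: Iterable[str],
) -> Iterator[Tuple[Optional[str], int, int]]:
    """Yield (timestamp, input_tokens, cached_input_tokens) per token_count event."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip truncated or partially written lines
        if not isinstance(record, dict):
            continue
        payload = record.get("payload") or {}
        # ASSUMPTION: token_count events are tagged via payload["type"].
        if not isinstance(payload, dict) or payload.get("type") != "token_count":
            continue
        usage = (payload.get("info") or {}).get("last_token_usage") or {}
        if "cached_input_tokens" not in usage:
            continue
        # ASSUMPTION: "timestamp" and "input_tokens" are the key names used.
        yield (
            record.get("timestamp"),
            usage.get("input_tokens", 0),
            usage["cached_input_tokens"],
        )
```

To use it, open the rollout file and iterate: `for ts, inp, cached in cached_tokens_per_turn(open(path)): print(ts, inp, cached)`, where `path` points at the session's rollout JSONL.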

Observed behavior

A) Long plateau at `2432`

For a long span, every turn reports the same low cached prefix:

2026-05-31T01:36:00Z  input=206240  cached=2432  output=635
2026-05-31T01:37:07Z  input=207059  cached=2432  output=199
2026-05-31T01:37:54Z  input=207829  cached=2432  output=355
2026-05-31T01:39:13Z  input=208362  cached=2432  output=37
2026-05-31T01:40:35Z  input=208455  cached=2432  output=37
2026-05-31T01:41:19Z  input=208540  cached=2432  output=224
... (many similar turns)
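
To put the plateau in proportion, the cached share of each turn's input can be computed directly from the numbers above. The helper name `cache_hit_ratio` is illustrative only; the values are copied from the log excerpt. On these turns the cached share works out to roughly 1.2% of the input.

```python
# (input_tokens, cached_input_tokens) copied from the plateau turns above.
turns = [
    (206240, 2432),
    (207059, 2432),
    (207829, 2432),
    (208362, 2432),
    (208455, 2432),
    (208540, 2432),
]


def cache_hit_ratio(input_tokens: int, cached_tokens: int) -> float:
    """Fraction of the input that was served from the prompt cache."""
    return cached_tokens / input_tokens if input_tokens else 0.0


for input_tokens, cached in turns:
    ratio = cache_hit_ratio(input_tokens, cached)
    print(f"input={input_tokens} cached={cached} hit_ratio={ratio:.2%}")
```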

B) Immediate recovery after compaction boundary

Same thread, around compaction:

2026-05-31T02:02:40Z  token_count: cached=0
2026-05-31T02:02:40Z  event: compacted
2026-05-31T02:02:40Z  event: context_compacted

Following turns then ramp cache quickly:

2026-05-31T02:03:06Z  cached=4480
2026-05-31T02:03:15Z  cached=15232
2026-05-31T02:03:31Z  cached=21376
2026-05-31T02:04:01Z  cached=26496
2026-05-31T02:04:48Z  cached=39296
2026-05-31T02:05:54Z  cached=49536
2026-05-31T02:06:40Z  cached=52608

This “plateau -> compaction -> ramp-up” pattern is repeatable in our logs.
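
One way to find such plateaus automatically is to run-length encode the per-turn cached values and flag long runs of the same number. The sketch below is illustrative: the function name `plateaus` and the `min_run` threshold are assumptions, and the series is the excerpt from sections A and B, not the full log.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def plateaus(cached: Sequence[int], min_run: int = 3) -> List[Tuple[int, int]]:
    """Return (value, run_length) for each run of identical consecutive values
    that lasts at least min_run turns."""
    runs = []
    for value, group in groupby(cached):
        length = sum(1 for _ in group)
        if length >= min_run:
            runs.append((value, length))
    return runs


# Plateau turns, the cached=0 reading at compaction, then the ramp-up.
series = [2432] * 6 + [0, 4480, 15232, 21376, 26496, 39296, 49536, 52608]
print(plateaus(series))  # [(2432, 6)]
```

After compaction every turn reports a different, growing value, so no run reaches the threshold and only the 2432 plateau is reported.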

Evidence that goal metadata is highly dynamic

During the plateau period, thread/goal/updated fires frequently and includes changing fields every few seconds:

- tokensUsed
- timeUsedSeconds
- updatedAt

Redacted example (objective omitted):

{
  "threadId": "<redacted>",
  "objective": "<redacted>",
  "status": "active",
  "tokensUsed": 10039897,
  "timeUsedSeconds": 3214,
  "createdAt": 1780188796,
  "updatedAt": 1780193170
}

Next update shortly after:

{
  "threadId": "<redacted>",
  "objective": "<redacted>",
  "status": "active",
  "tokensUsed": 10041221,
  "timeUsedSeconds": 3227,
  "createdAt": 1780188796,
  "updatedAt": 1780193183
}
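
Diffing the two payloads makes the volatile fields explicit. The helper `changed_fields` is a hypothetical name used for illustration; the payload values are the redacted examples above.

```python
from typing import Any, Dict, Tuple


def changed_fields(
    before: Dict[str, Any], after: Dict[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Map each key whose value differs to its (before, after) pair."""
    keys = before.keys() | after.keys()
    return {
        key: (before.get(key), after.get(key))
        for key in sorted(keys)
        if before.get(key) != after.get(key)
    }


first = {
    "threadId": "<redacted>", "objective": "<redacted>", "status": "active",
    "tokensUsed": 10039897, "timeUsedSeconds": 3214,
    "createdAt": 1780188796, "updatedAt": 1780193170,
}
second = {
    "threadId": "<redacted>", "objective": "<redacted>", "status": "active",
    "tokensUsed": 10041221, "timeUsedSeconds": 3227,
    "createdAt": 1780188796, "updatedAt": 1780193183,
}

for key, (old, new) in changed_fields(first, second).items():
    print(f"{key}: {old} -> {new} (delta {new - old})")
```

Only tokensUsed, timeUsedSeconds, and updatedAt differ between the two updates (13 seconds apart); threadId, objective, status, and createdAt are unchanged.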

Source-level context (same version family)

Current code/templates appear consistent with this behavior:

codex-rs/core/templates/goals/continuation.md
- includes dynamic budget lines (Tokens used, Token budget)
codex-rs/core/templates/goals/objective_updated.md
- includes dynamic budget lines
codex-rs/app-server/src/request_processors/thread_goal_processor.rs
- thread/goal/updated contains time_used_seconds and updated_at
codex-rs/app-server/README.md
- documents thread/goal/updated as including full current goal

This issue does not claim these fields are always wrong; the concern is whether volatile data is entering model-visible context too early and fragmenting cache prefix reuse.

Expected behavior

Adjacent turns with stable instructions/history should typically reuse a larger cached prefix and not stay pinned to a tiny constant for long stretches.

Actual behavior

In long runs, the cache repeatedly hits only a tiny fixed prefix (e.g. 2432) until compaction rewrites the context.

Possible fix directions

- Keep model-visible goal context stable; avoid placing rapidly changing counters/timestamps in cache-critical prefix segments.
- Separate UI/runtime telemetry (tokensUsed, timeUsedSeconds, updatedAt) from prompt-injection content.
- Add diagnostics for the first prompt-diff boundary between adjacent turns to make the root cause of cache invalidation obvious.
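
A first prompt-diff boundary diagnostic could be as simple as a longest-common-prefix check between the prompts of two adjacent turns. The sketch below is an illustration under stated assumptions, not Codex code. The prompt layouts (`SYSTEM:`, `GOAL:`, `HISTORY:` lines) are invented for the demo, and it compares characters, whereas real prompt caching matches on tokens, so the index is only a proxy for where reuse stops.

```python
from typing import Sequence


def first_diff_boundary(prev: Sequence, curr: Sequence) -> int:
    """Length of the longest common prefix of two prompts (strings or token lists)."""
    limit = min(len(prev), len(curr))
    for i in range(limit):
        if prev[i] != curr[i]:
            return i
    return limit


# Hypothetical prompt layouts, invented for this demo.
STABLE = "SYSTEM: stable instructions\n"


def history(turns: int) -> str:
    return "".join(f"HISTORY: turn {i}\n" for i in range(1, turns + 1))


def volatile_first(tokens_used: int, turns: int) -> str:
    # Changing counter placed before the conversation history.
    return f"{STABLE}GOAL: Tokens used: {tokens_used}\n{history(turns)}"


def volatile_last(tokens_used: int, turns: int) -> str:
    # Same content, but the changing counter is placed after the history.
    return f"{STABLE}{history(turns)}GOAL: Tokens used: {tokens_used}\n"


early = first_diff_boundary(volatile_first(10039897, 50), volatile_first(10041221, 51))
late = first_diff_boundary(volatile_last(10039897, 50), volatile_last(10041221, 51))
print(f"volatile counter first: shared prefix ends at char {early}")
print(f"volatile counter last:  shared prefix ends at char {late}")
```

With the counter placed early, the shared prefix ends inside the `Tokens used` value, so none of the unchanged history after it can be reused. With the counter at the tail, the shared prefix covers the stable instructions and all previously seen history.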
