codex: Significant model quality regression in codex-5-5 (xhigh): instructions not followed, repeated regressions, affects both CLI and IDE

codex, 2026-05-26 06:22:04

What issue are you seeing?

Since a recent update, the model behavior behind the xhigh effort level (which we assume maps to codex-5-5 or equivalent) has degraded drastically compared to what it was a few weeks ago. The quality drop is so significant that it no longer feels like the same model — it resembles a heavily quantized or distilled variant rather than the full-capability model we were previously using.

Observed symptoms:

The model frequently ignores explicit instructions in AGENTS.md and in prompt context, even when they are clear, concise, and well-structured.
It produces repeated regressions: changes that were already reviewed and corrected are reintroduced in subsequent turns.
It fails to maintain task coherence across multi-step workflows — it loses track of context and reverts to generic or incorrect behavior mid-task.
It appears to hallucinate tool usage or skip tool calls it should be making.
Overall reasoning quality feels noticeably inferior to previous versions.
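
The "repeated regressions" symptom can be made measurable rather than anecdotal. The following is a minimal sketch, not part of the original report: it assumes you have saved three snapshots of the same file (before a reviewed fix, after the fix, and after a later model turn) and flags lines that the fix removed but the later turn brought back. The function name `reintroduced_lines` and the sample snippets are hypothetical.

```python
def reintroduced_lines(before_fix: str, after_fix: str, later_turn: str) -> list[str]:
    """Return lines that a reviewed fix removed but a later turn contains again.

    Comparison is by exact line content after stripping surrounding whitespace,
    so this only catches verbatim reintroductions, not paraphrased ones.
    """
    before = {line.strip() for line in before_fix.splitlines()}
    after = {line.strip() for line in after_fix.splitlines()}

    # Lines the fix deliberately removed (ignore blank lines).
    removed_by_fix = before - after
    removed_by_fix.discard("")

    # Keep document order of the later snapshot so the report is readable.
    return [
        line.strip()
        for line in later_turn.splitlines()
        if line.strip() in removed_by_fix
    ]


# Hypothetical snapshots of one file across three points in a session.
before = "x = eval(user_input)\nprint(x)"
after = "x = int(user_input)\nprint(x)"
later = "x = eval(user_input)\nprint(x)\nlog(x)"

print(reintroduced_lines(before, after, later))
```

Attaching the output of a check like this to the report, together with the turn at which the line came back, would give maintainers a concrete case to inspect.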

This is not an isolated or edge-case scenario. It makes the tool practically unusable for serious development workflows.

Questions for the team

Was the underlying model recently updated, swapped, or re-quantized? The behavioral delta is large enough that it does not feel like the same checkpoint.
When can we expect a quality improvement or rollback? Given that this affects both CLI and IDE users, it would be helpful to have a public timeline or at least an acknowledgment that the regression is known.

Environment

Codex CLI: latest
IDE: Codex (latest)
OS: Linux / Windows
Effort level: xhigh

What steps can reproduce the bug?

Set effort level to xhigh.
Provide a detailed AGENTS.md with explicit coding conventions and task instructions.
Ask the model to implement a non-trivial feature (e.g., add a validated form field with specific error handling logic).
Observe that the model either ignores the conventions, introduces regressions from prior turns, or produces code that contradicts the stated requirements.
Repeat across multiple sessions — the behavior is consistent, not random.
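
To back up the claim that the behavior is consistent across sessions, the steps above can be paired with a small tally script. This is a sketch under stated assumptions: the two rules in `RULES` are hypothetical stand-ins for whatever conventions your own AGENTS.md defines, and `sessions` is assumed to be a list of code strings you collected manually from separate runs. It does not call Codex itself.

```python
import re
from collections import Counter

# Hypothetical conventions expressed as regexes; replace these with checks
# that mirror the rules written in your own AGENTS.md.
RULES = {
    "no-print-debugging": re.compile(r"^\s*print\("),
    "no-bare-except": re.compile(r"^\s*except\s*:"),
}


def violations(code: str) -> list[tuple[str, int]]:
    """Return (rule name, 1-based line number) for every rule hit in `code`."""
    found = []
    for lineno, line in enumerate(code.splitlines(), start=1):
        for name, pattern in RULES.items():
            if pattern.search(line):
                found.append((name, lineno))
    return found


def tally(sessions: list[str]) -> Counter:
    """Count, per rule, how many sessions violated it at least once."""
    counts = Counter()
    for code in sessions:
        # A set ensures each rule is counted once per session.
        counts.update({name for name, _ in violations(code)})
    return counts


# Hypothetical outputs saved from three separate sessions.
sessions = [
    "try:\n    f()\nexcept:\n    pass\n",
    "print('x')\n",
    "try:\n    f()\nexcept ValueError:\n    pass\n",
]

print(tally(sessions))
```

A per-rule count over a fixed number of sessions ("rule X violated in 7 of 10 runs") is easier for maintainers to compare against an earlier model version than a qualitative description.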

Reproducible in:

Codex CLI (latest version)
IDE integration (VS Code extension)

This is not a CLI-only issue. The same degraded model behavior is present in the IDE, which points to a cause shared by both clients (the model or its serving configuration) rather than a regression in one client.

What is the expected behavior?

No response

Additional information

No response
