codex: Make reasoning summaries opt-in via a flag (with per-model override) for self-hosted reasoning models

openai/codex#22061 (fetched 2026-05-11 03:19:53)

Summary

For users running self-hosted reasoning models (DeepSeek R1, Qwen QwQ, Qwen-3 thinking, etc.) on Ollama / LM Studio / llama.cpp / vLLM, codex's bundled catalog has no entry for the slug, so supports_reasoning_summaries defaults to false and the reasoning stream never reaches the UI. The only existing escape hatch is the global model_supports_reasoning_summaries knob in ~/.codex/config.toml, which is all-or-nothing for users who run more than one local model.

The result is that running a real reasoning model locally feels strictly worse than running a non-reasoning model locally — the reasoning UI just doesn't show up, even though the model is emitting reasoning content.

Proposed solution

Two new provider-scoped fields on model_providers.<name>:

  1. local_provider = true — marks a provider as serving locally-hosted models. When set, codex applies a conservative reasoning-name heuristic that auto-enables supports_reasoning_summaries for unknown models whose slug tokens match a known reasoning-model designator (r1, r2, r3, qwq, thinking, or reasoning, matched on alphanumeric segment boundaries — not substrings, so warp1 doesn't trip r1).

    Built-in ollama and lmstudio providers default to local_provider = true. Every other provider (including OpenAI, Azure, Bedrock) defaults to false. User-defined providers opt in explicitly.

  2. model_overrides.<slug>.supports_reasoning_summaries — explicit per-model escape hatch for cases the heuristic doesn't catch (e.g. qwen3-30b-a3b-thinking-2507 is fine, but myorg/some-reasoning-thing isn't), or to force-disable the heuristic for a specific model on a local provider. Bidirectional — both true and false are honored.
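The segment-boundary matching described in item 1 can be sketched roughly as follows. This is a minimal illustration in Rust, not the fork's actual code: the name looks_like_reasoning_model is taken from the reference implementation notes, but the body, the REASONING_DESIGNATORS constant, and the case-insensitive comparison are assumptions.

```rust
/// Designators copied from the allow-list in this proposal.
/// The constant name is hypothetical.
const REASONING_DESIGNATORS: [&str; 6] = ["r1", "r2", "r3", "qwq", "thinking", "reasoning"];

/// Returns true when any alphanumeric segment of `slug` equals a known
/// designator. Segments are split on every non-alphanumeric character,
/// so a designator only matches a whole segment, never a substring.
/// Case-insensitive comparison is an assumption of this sketch.
pub fn looks_like_reasoning_model(slug: &str) -> bool {
    slug.split(|c: char| !c.is_ascii_alphanumeric())
        .filter(|segment| !segment.is_empty())
        .any(|segment| {
            REASONING_DESIGNATORS
                .iter()
                .any(|&designator| segment.eq_ignore_ascii_case(designator))
        })
}
```

Under this sketch, deepseek-r1 and QwQ-32B match, while warp1 and llama-3.1-8b do not, because a whole segment must equal a designator.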

Example config.toml:

[model_providers.my-vllm]
name = "my vllm cluster"
base_url = "http://localhost:8000/v1"
wire_api = "responses"
local_provider = true                       # heuristic enabled

[model_providers.my-vllm.model_overrides."my-custom-reasoning-model"]
supports_reasoning_summaries = true         # explicit override

Precedence (lowest to highest)

  1. Model catalog default (existing behavior preserved).
  2. Heuristic auto-enable on local_provider = true for matching slugs.
  3. Global model_supports_reasoning_summaries = true (only enables; Some(false) still ignored to preserve existing semantics).
  4. Per-model model_overrides.<slug>.supports_reasoning_summaries (always wins).
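
The four layers above could be resolved along these lines. This is a hedged sketch only: the struct, its field names, and the function name are hypothetical, and gating the heuristic on the model being absent from the catalog is an assumption drawn from the "unknown models" wording in the proposal.

```rust
/// Inputs to the resolution. All names here are illustrative and do not
/// come from the codex codebase.
#[derive(Debug, Clone, Copy, Default)]
pub struct ReasoningSummaryInputs {
    /// Layer 1: value from the bundled model catalog (false if unknown).
    pub catalog_default: bool,
    /// Whether the slug has a catalog entry at all (assumed gate for layer 2).
    pub in_catalog: bool,
    /// Layer 2 inputs: provider flag and heuristic result.
    pub provider_is_local: bool,
    pub slug_matches_heuristic: bool,
    /// Layer 3: global model_supports_reasoning_summaries setting.
    pub global_override: Option<bool>,
    /// Layer 4: per-model model_overrides entry, if any.
    pub per_model_override: Option<bool>,
}

/// Applies the layers from lowest to highest precedence.
pub fn resolve_supports_reasoning_summaries(inputs: &ReasoningSummaryInputs) -> bool {
    // Layer 1: start from the catalog default.
    let mut supported = inputs.catalog_default;

    // Layer 2: the heuristic can only enable, and only for unknown models
    // served by a provider marked local_provider = true.
    if !inputs.in_catalog && inputs.provider_is_local && inputs.slug_matches_heuristic {
        supported = true;
    }

    // Layer 3: the global knob only enables; Some(false) is ignored.
    if inputs.global_override == Some(true) {
        supported = true;
    }

    // Layer 4: an explicit per-model override always wins, in both directions.
    if let Some(explicit) = inputs.per_model_override {
        supported = explicit;
    }

    supported
}
```

With these rules, a per-model override of false disables summaries even when the heuristic or the global knob would have enabled them, which is the bidirectional behavior the proposal describes.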

Why a heuristic at all (and why this specific one)

A blanket auto-enable on local_provider = true would overreach: strict OpenAI-compatible servers fronting non-reasoning models could reject the unknown reasoning.summary request parameter with a 4xx error. The heuristic limits the auto-enable to slugs that are unambiguously reasoning models. The allow-list is small and self-explanatory:

  • r1 / r2 / r3 — DeepSeek R-series family
  • qwq — Qwen QwQ family
  • thinking — Qwen-3 reasoning variants
  • reasoning — generic catch-all

Models that don't match still work; they just don't get reasoning summaries auto-enabled. The model_overrides escape hatch covers them.

What I considered and rejected

  • Auto-enable for every model on a local provider: too aggressive, would break strict servers running non-reasoning models.
  • Per-model overrides only (no heuristic): solves correctness but doesn't actually fix the UX — users still have to write one override block per model.
  • A new global model_supports_reasoning_summaries_per_provider = true knob: doesn't scale to the multi-model-per-provider case.

The combined heuristic + override design lets the 90% case work with one local_provider = true line, while keeping the 10% case correct via an explicit override.

deny_unknown_fields

ModelProviderOverride uses #[serde(deny_unknown_fields)] so typos like supports_reasoning_summary (missing trailing ies) are caught at config-load time instead of silently no-oping at runtime.

Reference implementation

A working implementation with tests lives on the team-wcv fork:

Touches:

  • codex-rs/model-provider-info/src/lib.rs — new local_provider: bool and model_overrides: Option<HashMap<…, ModelProviderOverride>> fields. Built-in ollama/lmstudio default to local_provider = true; openai/bedrock to false.
  • codex-rs/models-manager/src/config.rs — adds active_provider_is_local: bool and model_provider_overrides: HashMap<String, ActiveModelOverride> to ModelsManagerConfig.
  • codex-rs/models-manager/src/model_info.rs — new looks_like_reasoning_model(slug) helper plus the four-layer precedence logic in with_config_overrides.
  • codex-rs/core/src/config/mod.rs: to_models_manager_config resolves the active provider's overrides into the manager config.
  • codex-rs/core/config.schema.json — regenerated via just write-config-schema.
  • Tests: 9 new in model_info_tests.rs covering precedence, gating, exact-slug keying, and bidirectional override; 6 new in model_provider_info_tests.rs covering built-in defaults, TOML round-tripping, and deny_unknown_fields.

cargo test -p codex-models-manager -p codex-model-provider-info -p codex-config -p codex-core --lib passes ~1900 tests; cargo clippy --tests -- -D warnings and cargo fmt are both clean.

Open to alternatives

This is the shape that felt cleanest after weighing user UX against scope, but I'm open to alternatives — e.g. heuristic-only with no per-model override, override-only with no heuristic, or rejecting both and treating local reasoning models as something users should keep configuring manually. Happy to rework the patch shape if there's a direction you'd prefer.
