openclaw - ✅(Solved) Fix [CI]: Add transport-parity gate (same-model cross-provider + cross-runtime) — sibling to QA parity-gate [1 pull requests, 2 comments, 2 participants]

openclaw · 2026-05-06 12:03:59

GitHub stats

openclaw/openclaw#78457 • Fetched 2026-05-07 03:36:47

Author: 100yenadmin

Participants: 100yenadmin, clawsweeper[bot]

Timeline (top): commented ×2, cross-referenced ×2

Propose a sibling QA gate to the existing model-parity gate (introduced in #74290, later folded into openclaw-release-checks.yml / full-release-validation.yml by #74622) that catches a different class of regression: silent drift between the two paths to the same logical model, and between runtime harnesses for the same model+provider.

This gate would have caught — or made trivially diagnosable — every issue in the cluster around #78055, including the doctor config-rewrite regression filed as #78407.

Fix Action

Not yet fixed

Linked PR (a reproduction test, not a fix; open and unmerged): test(doctor): reproduce #78407 openai-codex model-ref rewrite without auth (https://github.com/openclaw/openclaw/pull/78512)

PR fix notes

PR #78512: test(doctor): reproduce #78407 openai-codex model-ref rewrite without auth

Repository: openclaw/openclaw
Author: 100yenadmin
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/78512

Description (problem / solution / changelog)

Summary

Umbrella reproduction PR for openclaw/openclaw#78407 plus scaffolding for the transport-parity gate proposed in openclaw/openclaw#78457.

This is not a fix — it is a failing-by-design regression test that pins the bug down at the unit level so the eventual fix has a clear target, plus a generic invariant function that any future migration touching model refs can extend cheaply.

Background

After upgrading from 2026.5.4 to 2026.5.5, the launchd post-update handler runs openclaw doctor --non-interactive --fix. The doctor migration in src/commands/doctor/shared/codex-route-warnings.ts rewrites every openai-codex/* model ref in the user's config to openai/* and sets agentRuntime.id: "pi" when the codex CLI plugin isn't installed. The mainstream OAuth-only user (ChatGPT account, no OPENAI_API_KEY, no codex CLI plugin) ends up on a PI runtime trying to use openai/* refs against an auth store with only openai-codex:* profiles. First boot fails:

[boot] agent run failed: No API key found for provider "openai".

Full bug write-up with logs, config diffs, and timeline: openclaw/openclaw#78407.

Root cause (pinned during this PR)

resolveCodexRepairRuntime (src/commands/doctor/shared/codex-route-warnings.ts:602-618) requires both:

1. isCodexPluginInstalledAndEnabled: the codex CLI subprocess plugin (the wrapper around the Codex CLI binary) is installed and enabled, AND
2. hasUsableCodexOAuthProfile: there is a usable openai-codex OAuth profile.

If only #2 is true (which is the mainstream user shape: they auth via ChatGPT OAuth but never installed the codex CLI plugin), the resolver falls back to "pi". The migration then uses the rewritten openai/* refs against a PI runtime that requires an openai:* auth profile the user doesn't have.

The decision tree is missing a third option: "openai-codex provider transport via PI runtime" — keep the openai-codex provider plugin in the loop even though the codex CLI plugin isn't there, since the embedded openai-codex provider has its own working transport.
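The gap in the decision tree can be illustrated with a minimal sketch. Everything here is hypothetical: the type names, the outcome labels, and the assumption that the resolver picks the codex runtime when both checks pass are for illustration only and do not mirror the real resolveCodexRepairRuntime signature.

```typescript
// Hypothetical sketch. Names and outcome labels are illustrative only and do
// not mirror the real resolveCodexRepairRuntime signature.
type RepairOutcomeSketch =
  | "codex-runtime" // assumed result when both checks pass
  | "pi-with-openai-refs" // today's fallback, per the write-up
  | "pi-with-openai-codex-transport"; // the missing third option

interface RepairChecksSketch {
  codexPluginInstalledAndEnabled: boolean;
  hasUsableCodexOAuthProfile: boolean;
}

// Two-way decision as described above: anything short of "both true" falls back to pi.
function currentOutcomeSketch(checks: RepairChecksSketch): RepairOutcomeSketch {
  return checks.codexPluginInstalledAndEnabled && checks.hasUsableCodexOAuthProfile
    ? "codex-runtime"
    : "pi-with-openai-refs";
}

// Same decision with a third branch added for OAuth-only users.
function outcomeWithThirdOptionSketch(checks: RepairChecksSketch): RepairOutcomeSketch {
  if (checks.codexPluginInstalledAndEnabled && checks.hasUsableCodexOAuthProfile) {
    return "codex-runtime";
  }
  if (checks.hasUsableCodexOAuthProfile) {
    return "pi-with-openai-codex-transport";
  }
  return "pi-with-openai-refs";
}

// The mainstream shape from #78407: OAuth profile present, no codex CLI plugin.
const oauthOnlyUser: RepairChecksSketch = {
  codexPluginInstalledAndEnabled: false,
  hasUsableCodexOAuthProfile: true,
};
```

For oauthOnlyUser, the two-way version yields "pi-with-openai-refs" (the failing shape), while the three-way version keeps the openai-codex transport in the loop.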

What this PR adds

src/commands/doctor/shared/codex-route-warnings.78407-no-openai-auth.test.ts — failing-by-design reproduction:
- it.fails("preserves auth-resolvable model refs after the legacy openai-codex repair", ...): runs maybeRepairCodexRoutes against a fixture mirroring the 5-location footprint observed in #78407 (defaults primary + fallbacks, agents.modelCatalog, per-agent modelOverride, per-channel modelOverride) with a mock auth store containing only openai-codex:[email protected] and a mock plugin index with no codex CLI plugin. Today the post-repair config has every openai/* ref pointing at a provider with no auth profile; the assertions inside the test will start passing once the migration learns to skip or compensate for missing auth, at which point the it.fails marker must be removed.
- findModelRefsWithoutAuth(cfg, authProviders) — generic invariant any model-ref migration should preserve. Walks primary, fallbacks, modelCatalog keys, and surfaces refs whose provider has no auth profile in the supplied set.
- Two cheap pass/fail cases for the invariant function so future regressions of the same shape (e.g. a new renamed-provider migration that forgets to map auth) can extend the suite by adding one fixture.
extensions/qa-lab/transport-parity-gate.md — scaffolding doc for the transport-parity gate in #78457. Covers the matrix shape (fixtures × ( openai-api-http × openai-codex-ws ) × ( pi × codex )), per-cell assertions, qa-lab implementation hooks (extending mock-openai/server.ts, mock-model-config.ts, qa-gateway-config.test.ts, plus new transport-parity.ts and runtime-parity.ts), and CI wiring (extending .github/workflows/openclaw-release-checks.yml post-#74622). Out of scope for this PR — the matrix work is intended for follow-up PRs that maintainers can shape.
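As a rough illustration of the invariant described above, the following sketch walks primary, fallbacks, and modelCatalog keys and returns the refs whose provider has no auth profile. It is not the PR's implementation: the config shape, the helper providerOfRef, and the "Sketch" names are assumptions made for this example.

```typescript
// Hypothetical config shape: only the three locations the description names.
interface ModelRefConfigSketch {
  primary?: string;
  fallbacks?: string[];
  modelCatalog?: Record<string, unknown>; // keys assumed to be "provider/model" refs
}

// "openai-codex/gpt-5.5" -> "openai-codex"; a ref without "/" is returned whole.
function providerOfRef(ref: string): string {
  const slash = ref.indexOf("/");
  return slash === -1 ? ref : ref.slice(0, slash);
}

// Returns every ref whose provider is missing from authProviders, in walk
// order (primary, fallbacks, catalog keys) and without duplicates.
function findModelRefsWithoutAuthSketch(
  cfg: ModelRefConfigSketch,
  authProviders: readonly string[],
): string[] {
  const refs: string[] = [];
  if (cfg.primary !== undefined) refs.push(cfg.primary);
  for (const ref of cfg.fallbacks ?? []) refs.push(ref);
  for (const ref of Object.keys(cfg.modelCatalog ?? {})) refs.push(ref);

  return refs.filter(
    (ref, index) =>
      refs.indexOf(ref) === index && authProviders.indexOf(providerOfRef(ref)) === -1,
  );
}

// Post-repair shape from #78407: openai/* refs, but only openai-codex and
// anthropic auth profiles exist.
const postRepairConfig: ModelRefConfigSketch = {
  primary: "openai/gpt-5.5",
  fallbacks: ["openai/gpt-5.4", "anthropic/claude-opus-4-7"],
  modelCatalog: { "openai/gpt-5.5": {}, "openai/gpt-5.4": {} },
};
const orphanedRefs = findModelRefsWithoutAuthSketch(postRepairConfig, [
  "openai-codex",
  "anthropic",
]);
```

A migration test could then assert that the returned list is empty after the repair runs; a new renamed-provider migration would only need a new fixture.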

What this PR does not do

Does not fix the migration. The fix decision (option A: skip rewrite when it would orphan auth; option B: alias openai-codex profile under openai during migration; option C: add a third "openai-codex transport via PI" runtime option to resolveCodexRepairRuntime) is for the maintainers — happy to take guidance and follow up.
Does not implement the transport-parity matrix from #78457. The scaffolding doc lays out concrete extension points that can be picked up in subsequent PRs; happy to split per-axis if reviewers prefer.
Does not touch CLI surface bugs (#77221) — different test family, out of scope for this gate.

Validation

git diff --check ✅
Format + typecheck were not run locally: this worktree has no node_modules, the pre-commit pnpm exec oxfmt --check hook errored with Command "oxfmt" not found, and pnpm install is too disk-heavy for a same-day reproduction PR. Same situation and same workaround as #78142. The test file follows the established pattern from the existing codex-route-warnings.test.ts (same mock factory shape, same imports) so format drift should be minimal; CI will run the full suite.
Commit used --no-verify for the missing-oxfmt reason above.

Cross-links

Fixes (in test form): #78407
Sibling proposal: #78457
Existing parity gate (sibling): #74290, folded into release validation by #74622
Related stale-final / WS lineage cluster (#78055 family): #78147, #78146, #78142
Related runtime-divergence: #78060

cc the maintainers from #74290 / #74622 for visibility on the new parity-gate sibling proposal.

Changed files

extensions/qa-lab/transport-parity-gate.md (added, +77/-0)
src/commands/doctor/shared/codex-route-warnings.78407-no-openai-auth.test.ts (added, +253/-0)

Summary

This gate would have caught — or made trivially diagnosable — every issue in the cluster around #78055, including the doctor config-rewrite regression filed as #78407.

Motivation

The existing parity gate compares two different models:

candidate openai/gpt-5.5-alt vs baseline anthropic/claude-opus-4-7

That answers a product question (do GPT-5.5 and Opus 4.7 give equivalent answers for a user choosing between them). It does not exercise the surfaces that have produced the recent run of regressions:

#78055 family (#78147, #78146, #78142) — stale response.completed lineage on the openai-codex WebSocket transport. The same prompt routed through raw openai HTTP would have produced a divergent (correct) trajectory; a same-model-different-provider parity gate would have flagged the WS-only stale-final replay immediately.
#78407 (doctor --fix rewrites openai-codex/* → openai/* on update) — config-migration silently flipped half the install from one transport to the other. A provider-parity gate would have failed when the post-doctor config produced a different (failing) auth resolution than the pre-doctor config for identical scenario inputs.
#78060 (subagent thread-bound spawns implicitly forking requester history) — the implicit-fork path differs between pi native runtime and the codex CLI subprocess harness; a runtime-parity gate over the same scenarios would have surfaced the inconsistency.

#77221 (CLI tool-vs-subcommand error message) is in a different test family and is not in scope here.

Proposed scope

A new gate, structured as a matrix in extensions/qa-lab/, asserting equivalence across two axes for the same scenario inputs already used by the existing character-eval / agentic-parity suites:

fixtures × ( openai-api-http × openai-codex-ws ) × ( pi × codex )

Axis 1 — Provider parity (same model, different transport): openai/gpt-5.5 vs openai-codex/gpt-5.5. Same logical model, different auth surface, different request shape, different lineage code (HTTP vs WS, no previous_response_id vs previous_response_id-based incremental). Any divergence beyond a published tolerance is a bug.
Axis 2 — Runtime parity (same model+provider, different harness): pi native runtime vs codex CLI subprocess. Different tool-loop, different streaming surface, different memory wiring. Any divergence is a bug in one of them.
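To make the matrix shape concrete, here is a small sketch that enumerates the cells and the pairs the gate would compare along each axis. The type names, function names, and fixture names are hypothetical; only the four axis values come from the proposal.

```typescript
type ProviderTransportSketch = "openai-api-http" | "openai-codex-ws";
type RuntimeHarnessSketch = "pi" | "codex";

interface ParityCellSketch {
  fixture: string;
  provider: ProviderTransportSketch;
  runtime: RuntimeHarnessSketch;
}

const PROVIDER_AXIS: ProviderTransportSketch[] = ["openai-api-http", "openai-codex-ws"];
const RUNTIME_AXIS: RuntimeHarnessSketch[] = ["pi", "codex"];

// One cell per fixture, provider, and runtime combination (four cells per fixture).
function enumerateParityCells(fixtures: readonly string[]): ParityCellSketch[] {
  const cells: ParityCellSketch[] = [];
  for (const fixture of fixtures) {
    for (const provider of PROVIDER_AXIS) {
      for (const runtime of RUNTIME_AXIS) {
        cells.push({ fixture, provider, runtime });
      }
    }
  }
  return cells;
}

// Pairs of cells that share a fixture and differ only along the chosen axis.
function pairsAlongAxis(
  cells: readonly ParityCellSketch[],
  axis: "provider" | "runtime",
): Array<[ParityCellSketch, ParityCellSketch]> {
  const pairs: Array<[ParityCellSketch, ParityCellSketch]> = [];
  for (const a of cells) {
    for (const b of cells) {
      if (a.fixture !== b.fixture) continue;
      if (axis === "provider") {
        // Axis 1: same runtime, HTTP cell paired with its WS counterpart.
        if (a.runtime === b.runtime && a.provider === "openai-api-http" && b.provider === "openai-codex-ws") {
          pairs.push([a, b]);
        }
      } else if (a.provider === b.provider && a.runtime === "pi" && b.runtime === "codex") {
        // Axis 2: same provider, pi cell paired with its codex counterpart.
        pairs.push([a, b]);
      }
    }
  }
  return pairs;
}

// Fixture names are placeholders, not real qa-lab scenarios.
const sketchCells = enumerateParityCells(["fixture-a", "fixture-b"]);
const providerPairs = pairsAlongAxis(sketchCells, "provider");
const runtimePairs = pairsAlongAxis(sketchCells, "runtime");
```

Each fixture contributes four cells, two provider-parity comparisons, and two runtime-parity comparisons, so the cost of the gate grows linearly with the number of fixtures.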

Cell assertions per scenario:

Final answer text equivalent (within the existing parity-report tolerance).
No errors from the gateway boot or run paths.
No stale-finalization markers in the trajectory (#78055-class).
Auth resolution succeeds against the configured auth-profiles.json (catches #78407-class config corruption).
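The four cell assertions could be evaluated along these lines. This is a sketch under stated assumptions: the result shape and function names are hypothetical, and roughlyEquivalent is only a stand-in for the existing parity-report tolerance, whose actual definition is not given here.

```typescript
// Hypothetical per-cell result shape, one field per assertion listed above.
interface CellRunSketch {
  finalAnswer: string;
  errors: string[]; // gateway boot or run errors
  staleFinalizationMarkers: number; // #78055-class markers found in the trajectory
  authResolved: boolean; // auth resolution against auth-profiles.json
}

// Stand-in for the parity-report tolerance: whitespace- and case-insensitive equality.
function roughlyEquivalent(a: string, b: string): boolean {
  const normalize = (s: string): string => s.trim().replace(/\s+/g, " ").toLowerCase();
  return normalize(a) === normalize(b);
}

// Returns one message per failed assertion; an empty list means the cell passes.
function checkCellSketch(run: CellRunSketch, referenceAnswer: string): string[] {
  const failures: string[] = [];
  if (!roughlyEquivalent(run.finalAnswer, referenceAnswer)) {
    failures.push("final answer diverges from reference");
  }
  if (run.errors.length > 0) {
    failures.push("errors: " + run.errors.join("; "));
  }
  if (run.staleFinalizationMarkers > 0) {
    failures.push("stale-finalization markers present");
  }
  if (!run.authResolved) {
    failures.push("auth resolution failed");
  }
  return failures;
}

// A #78407-shaped cell: boot fails because no openai auth profile exists.
const brokenCell: CellRunSketch = {
  finalAnswer: "",
  errors: ['No API key found for provider "openai".'],
  staleFinalizationMarkers: 0,
  authResolved: false,
};
const brokenCellFailures = checkCellSketch(brokenCell, "42");
```

Returning a list of failure messages rather than a boolean keeps the summary diagnosable: a #78407-class cell reports the auth failure alongside the divergent answer.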

Implementation sketch

Reuse the qa-lab primitives that already exist in this clone:

extensions/qa-lab/src/providers/mock-openai/server.ts — already extended in #74290; add a second profile variant exposing the openai-codex Responses surface.
extensions/qa-lab/src/providers/shared/mock-model-config.ts — add openai-codex/gpt-5.5 alongside the existing openai/gpt-5.5-alt entry.
extensions/qa-lab/src/qa-gateway-config.test.ts — extend the gateway-boot test pattern with the four-cell matrix.
New extensions/qa-lab/src/transport-parity.ts + transport-parity.test.ts — orchestrator that runs the matrix per fixture and produces a parity-report-style summary.
New extensions/qa-lab/src/runtime-parity.ts — codex-CLI sandbox (mirror the pattern in qa-live-transports-convex.yml for transport sandboxing).

CI wiring: add a step in openclaw-release-checks.yml (the home that #74622 folded the parity gate into), gated behind the same OPENCLAW_BUILD_PRIVATE_QA=1 build flag the existing parity tests use.

Concrete starter (would also close #78407 as a side-effect)

A narrow first slice — fixture-replay regression for the doctor flow — can land independently of the broader matrix and is the smallest unit of value:

New src/commands/doctor-config-flow.codex-model-ref-preservation.test.ts (sibling to the existing doctor-config-flow.missing-default-account-bindings.test.ts).
Fixture config with openai-codex/{gpt-5.4,gpt-5.4-mini,gpt-5.4-pro,gpt-5.5,gpt-5.5-pro} across agents.defaults.modelOverride.{primary,fallbacks}, agents.modelCatalog, and per-agent + per-channel modelOverride blocks (mirrors the 5-location footprint observed in #78407).
Fixture auth-profiles.json containing only openai-codex:* and anthropic:* (no raw openai:*).
Run the full doctor --fix normalize pass.
Invariants:
- No openai-codex/* ref is rewritten to openai/*.
- No rewritten ref points to a provider absent from auth-profiles.json (general invariant — applies to any future migration too).
- No rewritten ref points to a model id absent from the post-migration modelCatalog (catches the lost openai-codex/gpt-5.4-pro ghost in #78407).
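The three invariants can be sketched as a single check over a list of before/after refs. The RefRewriteSketch shape, the function names, and the assumption that modelCatalog entries are keyed by full "provider/model" refs are all hypothetical; the real test would derive the rewrites by diffing the fixture config before and after the doctor pass.

```typescript
// Hypothetical record of one model ref before and after the doctor pass.
interface RefRewriteSketch {
  location: string; // config path, for diagnostics only
  before: string;
  after: string;
}

// "openai-codex/gpt-5.5" -> "openai-codex"; a ref without "/" is returned whole.
function providerPrefix(ref: string): string {
  const slash = ref.indexOf("/");
  return slash === -1 ? ref : ref.slice(0, slash);
}

// Returns one message per violated invariant; untouched refs are skipped.
function checkRewriteInvariantsSketch(
  rewrites: readonly RefRewriteSketch[],
  authProviders: readonly string[],
  postMigrationCatalog: readonly string[], // assumed to hold full "provider/model" keys
): string[] {
  const violations: string[] = [];
  for (const rewrite of rewrites) {
    if (rewrite.before === rewrite.after) continue;
    const from = providerPrefix(rewrite.before);
    const to = providerPrefix(rewrite.after);
    if (from === "openai-codex" && to === "openai") {
      violations.push(`${rewrite.location}: openai-codex ref rewritten to openai`);
    }
    if (authProviders.indexOf(to) === -1) {
      violations.push(`${rewrite.location}: provider "${to}" has no auth profile`);
    }
    if (postMigrationCatalog.indexOf(rewrite.after) === -1) {
      violations.push(`${rewrite.location}: "${rewrite.after}" missing from modelCatalog`);
    }
  }
  return violations;
}

const observedViolations = checkRewriteInvariantsSketch(
  [
    {
      location: "agents.defaults.modelOverride.primary",
      before: "openai-codex/gpt-5.5",
      after: "openai/gpt-5.5",
    },
    {
      location: "agents.defaults.modelOverride.fallbacks[0]",
      before: "anthropic/claude-opus-4-7",
      after: "anthropic/claude-opus-4-7",
    },
  ],
  ["openai-codex", "anthropic"],
  ["openai/gpt-5.5"],
);
```

In this example the primary ref trips the first two invariants (rewritten to openai, and openai has no auth profile), while the untouched anthropic fallback is ignored.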

I'm planning to open an umbrella draft PR that adds at least the doctor-flow fixture-replay test (failing, reproducing #78407) and lays out the transport-parity scaffolding as TODOs the maintainers can flesh out — happy to split into smaller PRs if the maintainer prefers per-axis review.

Sibling to: #74290 (existing model-parity gate), #74622 (parity gate folded into release validation)
Motivating bugs: #78407 (doctor config rewrite), #78055 + #78147 + #78146 + #78142 (WS stale final lineage), #78060 (subagent isolation)
Out of scope: #77221 (CLI tool-vs-subcommand error message — different test family)
