openclaw - ✅(Solved) Fix qa-lab: run the parity gate end-to-end in CI and publish artifacts [1 pull requests, 1 participants]

openclaw · 2026-04-11 15:38:05


GitHub stats

openclaw/openclaw#64878•Fetched 2026-04-12 13:26:25


Author

100yenadmin

Participants

100yenadmin

Timeline (top)

cross-referenced ×3

Fix Action

Fix

Add a GitHub Actions workflow at .github/workflows/parity-gate.yml that runs on PRs touching extensions/qa-lab/**, qa/scenarios/**, or src/agents/**. The workflow should:

1. Start the qa-lab mock server.
2. Run `openclaw qa suite --parity-pack agentic --model openai/gpt-5.4 --output-dir .artifacts/qa-e2e/gpt54`.
3. Run `openclaw qa suite --parity-pack agentic --model anthropic/claude-opus-4-6 --output-dir .artifacts/qa-e2e/opus46`.
4. Run `openclaw qa parity-report --candidate-summary .artifacts/qa-e2e/gpt54/qa-suite-summary.json --baseline-summary .artifacts/qa-e2e/opus46/qa-suite-summary.json --output-dir .artifacts/qa-e2e/parity`.
5. Upload both `qa-suite-summary.json` files and `qa-agentic-parity-{report.md,summary.json}` as build artifacts.
6. Fail the build when the parity report returns `pass: false`, as signalled by the parity-report exit code.

This turns criterion 5 from "unverified" to "continuously verified against the mock baseline." Note: a mock-only workflow is necessary but not sufficient — see the companion issue on making the mock differentiate providers. This one is about having any proof at all.
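
The three `openclaw` commands in the fix could be wrapped in one script so the same sequence runs locally and in CI. This is a minimal sketch, not the project's actual tooling: it assumes the `openclaw` CLI is on `PATH` and exits non-zero when the report has `pass: false` (as the last step of the fix implies). The `OPENCLAW_BIN` variable and the `run_parity_gate` function name are hypothetical, and starting the mock server is left out because the issue gives no command for it.

```shell
#!/usr/bin/env bash

# Hypothetical override so the CLI can be replaced by a stub in tests.
OPENCLAW_BIN="${OPENCLAW_BIN:-openclaw}"

# Runs both suite passes and then the parity report.
# The function's exit status is the parity-report exit status.
run_parity_gate() {
  local out="${1:-.artifacts/qa-e2e}"

  # Candidate run; stop early if the suite itself cannot run.
  "$OPENCLAW_BIN" qa suite --parity-pack agentic \
    --model openai/gpt-5.4 --output-dir "$out/gpt54" || return 1

  # Baseline run.
  "$OPENCLAW_BIN" qa suite --parity-pack agentic \
    --model anthropic/claude-opus-4-6 --output-dir "$out/opus46" || return 1

  # Compare the two summaries.
  "$OPENCLAW_BIN" qa parity-report \
    --candidate-summary "$out/gpt54/qa-suite-summary.json" \
    --baseline-summary "$out/opus46/qa-suite-summary.json" \
    --output-dir "$out/parity"
}
```

Stopping after a failed suite run is a design choice of this sketch; a workflow that wants a report even for partial runs would drop the `|| return 1` guards.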

PR fix notes

PR #64909: qa-lab: (GPT 5.4 Parity vs. Opus Agentic) stage mock auth profiles so the parity gate runs without real credentials

Repository: openclaw/openclaw
Author: 100yenadmin
State: closed | merged: False
Link: https://github.com/openclaw/openclaw/pull/64909

Description (problem / solution / changelog)

Summary

Makes the mock structural parity gate actually runnable without real provider credentials.

After the current rebase, this PR is intentionally narrow. The legacy runtime seam work that was in earlier drafts has been absorbed elsewhere; what this branch still owns is the auth-staging path that the mock parity gate needs in order to reach the mock providers at all.

Current scope:

stage placeholder auth profiles per provider / agent dir in mock-openai mode
apply the staged auth profile config once per provider after the profile files exist
default the gateway-child provider mode through the shared helper so omitted providerMode still stages mock auth correctly

Part of #64227.

Why this still matters

Without this branch, the mock parity lane can still fail before the mock server ever sees a request because the auth resolver refuses to route through the mock provider base URL until a matching auth profile exists.

This branch makes the structural gate self-contained:

openai and anthropic placeholder API-key profiles are staged for the agents the suite uses
the child config is patched to reference those profiles
callers that omit provider mode still get the mock-openai auth staging path through the default-provider-mode helper

The mock server does not validate the key contents. The placeholder is just enough to satisfy the auth resolver so the parity harness can actually hit the local mock server.
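
The ordering described above (stage the profile files first, then apply the config once per provider) can be illustrated with a small shell sketch. Everything about the layout here is an assumption made for illustration: the file names, the JSON fields, the `auth-config.txt` file, and the `stage_mock_auth_profiles` function are hypothetical. The real logic lives in `extensions/qa-lab/src/gateway-child.ts`, which these notes do not show.

```shell
# Hypothetical layout and field names, for illustration only.
stage_mock_auth_profiles() {
  local state_dir="$1"; shift   # remaining arguments: agent names
  local provider agent

  for provider in openai anthropic; do
    # 1. Stage a placeholder profile per provider and agent directory.
    for agent in "$@"; do
      mkdir -p "$state_dir/agents/$agent"
      printf '{"provider":"%s","apiKey":"mock-placeholder"}\n' "$provider" \
        > "$state_dir/agents/$agent/auth-$provider.json"
    done

    # 2. Only after the files exist, record the profile in the config,
    #    once per provider rather than once per agent.
    printf '%s=%s:mock\n' "$provider" "$provider" >> "$state_dir/auth-config.txt"
  done
}
```

The placeholder key carries no meaning, matching the note that the mock server does not validate key contents.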

What changed

`extensions/qa-lab/src/gateway-child.ts`

stages placeholder mock auth profiles for openai and anthropic
applies the auth-profile config once per provider after staging
resolves the effective provider mode through the shared helper before deciding whether mock auth staging is needed

`extensions/qa-lab/src/gateway-child.test.ts`

covers the staged mock-auth profile path
covers the provider / agent override path
covers the default-provider-mode path so omitted providerMode still stages mock auth correctly

Validation

Current head validation after rebase:

CI=1 pnpm exec vitest run extensions/qa-lab/src/gateway-child.test.ts

Result: 26/26 passing.

Program-level verification update:

this branch is part of what made the offline structural parity rerun possible without real keys
on the integrated patched stack, with this branch plus the qa-lab mock/provider follow-ups, the full offline 10-scenario parity rerun passed end-to-end

Non-goals

no runtime strict-agentic behavior changes
no scenario YAML changes
no parity-report scoring changes
no live-frontier credential path changes

Changed files

extensions/qa-lab/src/gateway-child.test.ts (modified, +87/-0)
extensions/qa-lab/src/gateway-child.ts (modified, +87/-2)



Problem

The parity gate has never been run end-to-end. There are no .artifacts/qa-e2e/qa-agentic-parity-* files anywhere in the repo, and no CI workflow triggers openclaw qa suite --parity-pack agentic. This means the entire "parity gate shows GPT-5.4 matches or beats Opus 4.6" claim (completion criterion 5 in #64227) rests on code that has never produced a real measurement.

Verification that this is a real gap:

find . -path '*/.artifacts/qa-e2e/qa-agentic-parity*'   # returns nothing
rg -l 'parity-pack|qa-agentic-parity' .github/workflows/  # returns nothing
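
The two checks above could be combined into one function that fails while the gap exists, which makes the verification repeatable. This is a sketch under a few assumptions: `parity_gate_wired` is a hypothetical name, `grep -rlE` stands in for `rg -l` so the function runs without ripgrep, and the `find` pattern is widened slightly so that it would also match reports written below the `--output-dir` used by the proposed fix (`.artifacts/qa-e2e/parity`).

```shell
# Succeeds only when a parity artifact and a workflow that mentions the
# parity gate both exist under the given repository root.
parity_gate_wired() {
  local root="${1:-.}"
  local artifacts workflows

  artifacts="$(find "$root" -path '*/.artifacts/qa-e2e/*qa-agentic-parity*' 2>/dev/null || true)"
  workflows="$(grep -rlE 'parity-pack|qa-agentic-parity' "$root/.github/workflows" 2>/dev/null || true)"

  # Report each missing piece before returning the combined result.
  [ -n "$artifacts" ] || echo "no parity artifacts under $root"
  [ -n "$workflows" ] || echo "no workflow references the parity gate"
  [ -n "$artifacts" ] && [ -n "$workflows" ]
}
```

Called as `parity_gate_wired .` from the repository root, it prints one line per missing piece and returns non-zero until both exist.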


Acceptance

Workflow file exists and runs on PR check events for the relevant paths.
At least one successful CI run produces artifacts that can be downloaded from the Actions tab.
Workflow fails when the gate fails (verified with a deliberately broken scenario on a throwaway branch).
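
The third criterion needs a negative test: the check must succeed only when the gate fails. A minimal sketch of two helpers for that follows. It assumes the parity summary JSON carries a top-level boolean `pass` field (inferred from the `pass: false` wording in the fix, not confirmed by the source); `summary_passed` and `expect_failure` are hypothetical names.

```shell
# Returns 0 only if the summary file contains "pass": true.
summary_passed() {
  grep -Eq '"pass"[[:space:]]*:[[:space:]]*true' "$1"
}

# Inverts a command's result: succeeds only when the command fails.
expect_failure() {
  if "$@"; then
    echo "expected failure, but command succeeded: $*" >&2
    return 1
  fi
  return 0
}
```

On the throwaway branch with the deliberately broken scenario, a step such as `expect_failure summary_passed .artifacts/qa-e2e/parity/qa-agentic-parity-summary.json` would then pass only if the gate reported a failure.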

Part of

#64227 (wave 3 — parity proof). Companion to the mock-dispatch issue.

extent analysis

TL;DR

Create a GitHub Actions workflow to run the parity gate end-to-end, ensuring continuous verification of the "parity gate shows GPT-5.4 matches or beats Opus 4.6" claim.

Guidance

Add a new GitHub Actions workflow at .github/workflows/parity-gate.yml to run on PRs touching specific directories (extensions/qa-lab/**, qa/scenarios/**, or src/agents/**).
Implement the 6 steps outlined in the proposed fix, including running the qa suite with different models and generating a parity report.
Verify the workflow by checking for the existence of the workflow file, successful CI runs producing artifacts, and the workflow failing when the gate fails.
Ensure the workflow uploads required artifacts, such as qa-suite-summary.json and qa-agentic-parity-{report.md,summary.json}, for further analysis.

Example

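name: Parity Gate
on:
  pull_request:
    paths:
      - 'extensions/qa-lab/**'
      - 'qa/scenarios/**'
      - 'src/agents/**'
jobs:
  parity-gate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Install dependencies, build the openclaw CLI, and start the qa-lab mock
      # server here; the issue does not specify these commands.
      - name: Run the qa suite for the candidate and baseline models
        run: |
          openclaw qa suite --parity-pack agentic --model openai/gpt-5.4 --output-dir .artifacts/qa-e2e/gpt54
          openclaw qa suite --parity-pack agentic --model anthropic/claude-opus-4-6 --output-dir .artifacts/qa-e2e/opus46
      - name: Generate the parity report (a non-zero exit fails the job)
        run: openclaw qa parity-report --candidate-summary .artifacts/qa-e2e/gpt54/qa-suite-summary.json --baseline-summary .artifacts/qa-e2e/opus46/qa-suite-summary.json --output-dir .artifacts/qa-e2e/parity
      - name: Upload artifacts
        if: always()  # keep the reports even when the gate fails
        uses: actions/upload-artifact@v4
        with:
          name: qa-parity-artifacts  # artifact name is an assumption
          path: .artifacts/qa-e2e/
          include-hidden-files: true  # .artifacts is a dot-directory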

Notes

This solution creates a workflow that runs the parity gate end-to-end and so verifies the claim continuously. A mock-only workflow is necessary but not sufficient, however; a companion issue covers making the mock differentiate providers.

Recommendation

Apply the proposed fix by creating the GitHub Actions workflow. It is a concrete way to turn the parity claim into something CI verifies on every relevant PR.
