openclaw - ✅(Solved) Fix RFC: RuntimePlan finalization and embedded runner structural cleanup [14 pull requests, 1 comment, 1 participant]

openclaw/openclaw#72072, fetched 2026-04-27 05:35:21
This RFC tracks the post-#71722 finalization roadmap for the contract-first Pi/Codex runtime work.

The original RFC #71004 is implemented and closed. The core RuntimePlan boundary is now in main: OpenClaw-owned runtime policy is represented in AgentRuntimePlan, built before attempts, consumed by Pi/Codex paths, covered by parity contracts, and routed through the additive Harness V2 lifecycle.

This RFC is therefore not another GPT-5.4 bug-fix push. It is the cleanup ladder needed to make the merged architecture maintainable: native Harness V2 ownership, runner file splits by contract boundary, neutral embedded-runner naming, docs, and final smoke verification.

Fix Action

Fixed

PR fix notes

PR #72098: [refactor] Document RuntimePlan finalization baseline (RFC #72072 PR 1/7)

Description (problem / solution / changelog)

Refs #72072 (RFC) — PR 1 of 7.

Summary

Documentation only. This is PR 1 of the 7-PR roadmap for RFC 72072 (RuntimePlan finalization and embedded runner structural cleanup). It captures a clean baseline before the structural PRs land, audits the drift between the RFC's recon snapshot and current origin/main, and adjusts the per-PR scope where main has already moved.

The new doc lives at docs/refactor/runtime-plan-finalization-baseline.md.

What is being fixed

The RFC's recon was a read-only sweep against an earlier origin/main snapshot. Between that snapshot and current origin/main at HEAD 64af2feda0 there are 442 commits, several of which touched the load-bearing files for this RFC. Without a baseline doc, every later PR would either restate the same drift findings or quietly drift again. PR 1 anchors all later PRs.

Affected surface: docs/refactor/ only. No production or test code is touched.

Architecture diff

flowchart TD
  RFC[RFC 72072 recon snapshot] -->|stale by 442 commits| Drift{Audit drift}
  Drift -->|relocate seam line numbers| LaterPRs[PR 2 through PR 7]
  Drift -->|reduce PR 6 scope where main already shipped canonicalization| LaterPRs
  CurrentMain[origin main at 64af2feda0] -->|baseline check capture| LaterPRs

File map

Action | Path | Purpose
add | docs/refactor/runtime-plan-finalization-baseline.md | Baseline doc with check capture, drift audit, and per-PR scope adjustments

Validation

All commands ran on a clean clone of origin/main at HEAD 64af2feda0, with pnpm install clean and pnpm-lock.yaml restored to remove an unrelated workspace lockfile drift.

pnpm check:architecture
# green: 0 runtime value cycles, 0 madge cycles

pnpm check:test-types
# green: tsgo:core:test + tsgo:extensions:test, 0 type errors

node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts \
  src/agents/runtime-plan/build.test.ts \
  src/agents/runtime-plan/types.test.ts \
  src/agents/runtime-plan/types.compat.test.ts \
  src/agents/runtime-plan/tools.test.ts \
  src/agents/runtime-plan/tools.diagnostics.test.ts \
  src/agents/harness/v2.test.ts \
  src/agents/harness/selection.test.ts
# green: 45 tests passed across the listed files (running types.test.ts alone confirms its 2 tests run)

node scripts/run-vitest.mjs run --config test/vitest/vitest.extensions.config.ts \
  extensions/codex/src/app-server/run-attempt.test.ts \
  extensions/codex/src/app-server/event-projector.test.ts
# green: 53 tests passed across 2 files in 129s

No baseline drift blocks any later PR. The RuntimePlan + Harness V2 + Codex app-server contracts are green.

Key drift findings (full table in the doc)

  • pi-embedded-runner/run/attempt.ts is now 3,212 LOC (RFC said ~2,850). PR 3 + PR 4 must re-locate seams.
  • pi-embedded-runner/run.ts is now 2,347 LOC (RFC said ~2,100). PR 5 must re-locate seams.
  • pi-embedded-runner.ts is now a 49-line alias barrel with as-aliased neutral names. RFC said ~504 LOC.
  • embedded-runner.ts (canonical flat barrel) already exists as a 17-line neutral re-export. RFC said it did not exist.
  • aliases.test.ts already includes the bidirectional neutral-vs-Pi identity asserts the RFC asked PR 6 to add.
  • embedded-runner/index.ts and pi-embedded-runner/index.ts (directory barrels) still do not exist; PR 6 keeps that scope.
  • Native V2 factory pattern still does not exist in harness/v2.ts. PR 2 unchanged.

PR 6's scope reduces accordingly. PR 2 through PR 5 stay shape-correct but their seam line numbers in the RFC are now starting points to verify, not ranges to extract from blindly.
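The LOC drift quoted in the findings above is plain arithmetic, and a small sketch makes the size of the gap easy to recompute. The DriftEntry shape and locDrift helper are hypothetical names introduced only for illustration. The RFC figures are the approximate values quoted in the list, so the percentages are rough.

```typescript
// Hypothetical record of one drift finding: RFC estimate vs. current origin/main.
interface DriftEntry {
  path: string;
  rfcLoc: number; // approximate figure quoted by the RFC
  currentLoc: number; // figure measured at the baseline HEAD
}

// Values copied from the drift findings listed above.
const driftEntries: DriftEntry[] = [
  { path: "pi-embedded-runner/run/attempt.ts", rfcLoc: 2850, currentLoc: 3212 },
  { path: "pi-embedded-runner/run.ts", rfcLoc: 2100, currentLoc: 2347 },
  { path: "pi-embedded-runner.ts", rfcLoc: 504, currentLoc: 49 },
];

// Absolute and relative drift; percent is rounded to one decimal place.
function locDrift(entry: DriftEntry): { path: string; delta: number; percent: number } {
  const delta = entry.currentLoc - entry.rfcLoc;
  const percent = Math.round((delta / entry.rfcLoc) * 1000) / 10;
  return { path: entry.path, delta, percent };
}

const driftReport = driftEntries.map(locDrift);
```

A positive delta means the file grew since the RFC snapshot, which is why the seam line numbers for PR 3 through PR 5 need re-verification; the negative delta for pi-embedded-runner.ts reflects work that main already shipped.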

What did NOT change

  • No production code.
  • No test code.
  • No public plugin surface (registerAgentHarness(...), AgentHarness, harness/index.ts all unchanged).
  • No imports.
  • No file moves or renames.
  • No pi-embedded-runner/* paths (still work as before; PR 6 will keep them as deprecated aliases).
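The alias relationship in the last item, together with the identity asserts noted in the drift findings, can be sketched in a single file. This is only an illustration under assumptions: runEmbeddedAgent is a hypothetical neutral name, the direction of aliasing may differ in the repository, and the real barrels re-export across files rather than through local constants.

```typescript
// Canonical implementation under a neutral name (hypothetical; not taken from the repo).
function runEmbeddedAgent(prompt: string): string {
  return `ran:${prompt}`;
}

/** @deprecated Pi-prefixed alias kept so existing imports keep working. */
const runEmbeddedPiAgent = runEmbeddedAgent;

// Stand-ins for the two barrels; real barrels would use `export { ... } from`.
const neutralBarrel = { runEmbeddedAgent };
const piBarrel = { runEmbeddedPiAgent };

// Identity, not just equal behavior: both names must point at the same function object.
function isSameBinding(a: unknown, b: unknown): boolean {
  return a === b;
}

// Checked from both sides, mirroring a bidirectional assert.
const aliasesMatch =
  isSameBinding(neutralBarrel.runEmbeddedAgent, piBarrel.runEmbeddedPiAgent) &&
  isSameBinding(piBarrel.runEmbeddedPiAgent, neutralBarrel.runEmbeddedAgent);
```

Asserting identity rather than equal output catches the case where an alias is quietly replaced by a wrapper function, which would behave the same today but could drift later.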

Why this is the right fix

The alternative was for PR 2 to absorb the baseline-and-drift audit silently and march into structural changes. That would mix three different kinds of work (audit, scope adjustment, architectural change) into one PR, making it harder to review and harder to back out. Splitting the audit into a docs-only PR keeps each later structural PR small, focused, and individually revertible. CLAUDE.md guidance "no PR mixes behavior changes with file moves or naming changes" extends naturally to "no PR mixes audit with structural change."

Notes for reviewers

  • Linked RFC: openclaw/openclaw#72072.
  • Predecessor context: PR 71722 (closed-merged at commit 2c35a6e), RFC 71004 (closed-implemented).
  • The drift audit is the load-bearing part of this PR. If any of the reduced-scope items in PR 6 should be reintroduced as stricter behavior, please flag here so PR 6 can carry the change.
  • Subsequent PRs will open as drafts on fresh origin/main branches; this PR does not stack.

Changed files

  • docs/refactor/runtime-plan-finalization-baseline.md (added, +148/-0)

PR #72105: [refactor] Add native AgentHarnessV2 factory registry (RFC #72072 PR 2/7)

Description (problem / solution / changelog)

Refs #72072 (RFC) — PR 2 of 7.

Summary

Promotes Harness V2 from "generic V1 adapter" to an internal lifecycle boundary by adding a native AgentHarnessV2 factory registry. The built-in PI harness now ships a native V2 factory; selection prefers it, with a guaranteed fall-back to adaptAgentHarnessToV2 for any harness without a native factory. A parity test locks the contract that native and adapted paths produce the same final AgentHarnessAttemptResult so PR 4 cannot regress cleanup ordering or classification routing without surfacing the drift.

This PR is scaffolding for PR 4, not a behavior change. PR 4 (split Pi stream and lifecycle) is what fills the native PI cleanup hook with split-lifecycle teardown. PR 2 keeps native and adapter observationally identical.
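The parity contract described above can be sketched roughly as follows. This is a minimal illustration, not the repository's test: AttemptResult is a simplified stand-in for AgentHarnessAttemptResult, and runAttempt, adaptedSend, nativeSend, and checkParity are hypothetical names.

```typescript
// Hypothetical result shape; the real AgentHarnessAttemptResult has more fields.
interface AttemptResult {
  status: "ok" | "error";
  output: string;
}

type RunAttempt = (prompt: string) => Promise<AttemptResult>;

// Stand-in for the V1 harness call that both paths delegate to.
const runAttempt: RunAttempt = async (prompt) => ({ status: "ok", output: prompt.trim() });

// Adapted path: generic wrapper around the V1 call.
const adaptedSend: RunAttempt = (prompt) => runAttempt(prompt);

// Native path: same delegation for now, leaving room for native hooks later.
const nativeSend: RunAttempt = async (prompt) => {
  const result = await runAttempt(prompt);
  return result;
};

// Parity check: both paths must yield structurally equal results for the same input.
async function checkParity(prompt: string): Promise<boolean> {
  const [adapted, native] = await Promise.all([adaptedSend(prompt), nativeSend(prompt)]);
  return JSON.stringify(adapted) === JSON.stringify(native);
}
```

If a later change makes the native path alter the result (for example by reclassifying an outcome during cleanup), a check of this shape fails, which is the kind of drift the parity test is meant to surface.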

What is being fixed

Today every selected harness goes through adaptAgentHarnessToV2(harness) at src/agents/harness/selection.ts:193. That adapter only knows the V1 shape, so the V2 lifecycle (prepare/start/send/resolveOutcome/cleanup) has nowhere meaningful to plumb future split-out modules. Until PR 2 lands the seam, PR 4's cleanup work would have to either edit attempt.ts directly or widen the public V1 AgentHarness shape — both bad outcomes.

The SDK widening that would let plugins supply a native V2 factory (so Codex can register one too) is explicitly optional future work per the RFC. PR 2 adds only the internal registry surface; bundled Codex stays V1 in this PR.

Architecture diff

flowchart TD
  subgraph Before
    selA[selection.ts] -->|adaptAgentHarnessToV2| adapter[V1 adapter]
    adapter --> v2run[runAgentHarnessV2LifecycleAttempt]
  end
  subgraph After
    selB[selection.ts] -->|resolveAgentHarnessV2| split{native factory registered?}
    split -- yes --> native[native AgentHarnessV2]
    split -- no --> adapter2[adaptAgentHarnessToV2]
    native --> v2run2[runAgentHarnessV2LifecycleAttempt]
    adapter2 --> v2run2
    pi[builtin-pi.ts module load] -->|registerNativeAgentHarnessV2Factory pi| registry[(internal registry)]
    registry -.- split
  end
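The "After" half of the diagram can be sketched as a small registry with a fallback. The function names follow the ones mentioned in this PR, but the signatures, the simplified AgentHarness and AgentHarnessV2 shapes, and the origin field are assumptions made for illustration only.

```typescript
// Hypothetical, simplified shapes; the real types live under src/agents/harness.
interface AgentHarness {
  id: string;
  runAttempt(params: unknown): Promise<string>;
}

interface AgentHarnessV2 {
  id: string;
  origin: "native" | "adapted"; // illustrative field, not part of the real type
  send(params: unknown): Promise<string>;
  cleanup(): Promise<void>;
}

type NativeAgentHarnessV2Factory = (harness: AgentHarness) => AgentHarnessV2;

// Internal registry keyed by harness id; not exported through any public surface.
const nativeFactories = new Map<string, NativeAgentHarnessV2Factory>();

function registerNativeAgentHarnessV2Factory(id: string, factory: NativeAgentHarnessV2Factory): void {
  nativeFactories.set(id, factory);
}

// Generic V1 adapter used when no native factory is registered.
function adaptAgentHarnessToV2(harness: AgentHarness): AgentHarnessV2 {
  return {
    id: harness.id,
    origin: "adapted",
    send: (params) => harness.runAttempt(params),
    cleanup: async () => {},
  };
}

// Prefer a registered native factory, otherwise fall back to the adapter.
function resolveAgentHarnessV2(harness: AgentHarness): AgentHarnessV2 {
  const factory = nativeFactories.get(harness.id);
  return factory ? factory(harness) : adaptAgentHarnessToV2(harness);
}

// Registration at module load, as the built-in PI harness is described as doing.
registerNativeAgentHarnessV2Factory("pi", (harness) => ({
  id: harness.id,
  origin: "native",
  send: (params) => harness.runAttempt(params),
  cleanup: async () => {}, // intentionally empty at this stage
}));
```

Because the lookup is by harness id, the selection code never needs to know which harness it is dealing with, and a harness without a factory still gets a working V2 object through the adapter.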

File map

Action | Path | Purpose
modify | src/agents/harness/v2.ts | Add NativeAgentHarnessV2Factory type, internal registry, register/clear/getNativeAgentHarnessV2Factory helpers, and resolveAgentHarnessV2(harness) resolution function. None of these are exported via harness/index.ts or the plugin SDK.
modify | src/agents/harness/selection.ts | Replace one call (line 193): adaptAgentHarnessToV2(harness) becomes resolveAgentHarnessV2(harness).
modify | src/agents/harness/builtin-pi.ts | Add createPiAgentHarnessV2(harness) and register it as the "pi" factory at module load. send calls harness.runAttempt(session.params) so PR 4 can plumb the split-lifecycle without breaking parity. Export PI_AGENT_HARNESS_ID and PI_AGENT_HARNESS_LABEL constants.
modify | src/agents/harness/v2.test.ts | Add tests for resolveAgentHarnessV2 (registry hit, fallback, parity contract). Existing 14 adapter tests untouched.
add | src/agents/harness/builtin-pi.test.ts | Native PI factory tests: registration at module load, returned V2 routes send through harness.runAttempt, cleanup is intentionally empty at PR 2.

Do NOT modify (and did not): src/agents/harness/index.ts (public surface), src/agents/harness/types.ts, src/plugin-sdk/agent-harness-runtime.ts, extensions/codex/harness.ts. Plugin shape stays exactly the same.

Validation

pnpm check:architecture
# green: 0 runtime value cycles, 0 madge cycles

pnpm check:test-types
# green: tsgo:core:test + tsgo:extensions:test, 0 type errors

node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts \
  src/agents/runtime-plan/build.test.ts \
  src/agents/runtime-plan/types.test.ts \
  src/agents/runtime-plan/types.compat.test.ts \
  src/agents/runtime-plan/tools.test.ts \
  src/agents/runtime-plan/tools.diagnostics.test.ts \
  src/agents/harness/v2.test.ts \
  src/agents/harness/selection.test.ts \
  src/agents/harness/builtin-pi.test.ts
# green: 7 files / 52 tests passed (was 45 baseline; +7 new tests in v2 + builtin-pi)

node scripts/run-vitest.mjs run --config test/vitest/vitest.extensions.config.ts \
  extensions/codex/src/app-server/run-attempt.test.ts \
  extensions/codex/src/app-server/event-projector.test.ts \
  extensions/codex/index.test.ts
# green: 3 files / 56 tests passed in 40.6s

pnpm check:changed
# green: lint, import cycles, webhook/pairing guards, agent test lane

What did NOT change

  • Public harness/index.ts exports: identical (no V2 types or factory functions exposed there).
  • Public V1 AgentHarness type: identical.
  • registerAgentHarness(...), getAgentHarness(...), listRegisteredAgentHarnesses(...), clearAgentHarnesses(...), disposeRegisteredAgentHarnesses(...): identical signatures.
  • Plugin SDK at src/plugin-sdk/agent-harness-runtime.ts: identical (no V2 surface widening).
  • extensions/codex/harness.ts: untouched. Codex stays V1, gets V2-adapted as before.
  • src/agents/harness/v2.ts lifecycle invariants: runAgentHarnessV2LifecycleAttempt ordering (prepare → start → send → resolveOutcome → cleanup), error-cleanup pairing, and diagnostic emission all unchanged.
  • Result classification choke-point: applyAgentHarnessResultClassification remains the single source.
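The lifecycle ordering and error-cleanup pairing named in the list above can be sketched as a small runner. This is a sketch under assumptions, not the body of runAgentHarnessV2LifecycleAttempt: LifecycleHarness and runLifecycleAttempt are hypothetical names, and whether cleanup also runs when prepare itself fails is not stated in the source, so the sketch leaves that case out.

```typescript
// Hypothetical lifecycle shape with the five phases named in the source.
interface LifecycleHarness<Session, Raw, Result> {
  prepare(): Promise<Session>;
  start(session: Session): Promise<void>;
  send(session: Session): Promise<Raw>;
  resolveOutcome(session: Session, raw: Raw): Promise<Result>;
  cleanup(session: Session, error?: unknown): Promise<void>;
}

// Runs prepare, start, send, resolveOutcome in order; cleanup always follows,
// and receives the error when one of the earlier phases threw.
async function runLifecycleAttempt<Session, Raw, Result>(
  harness: LifecycleHarness<Session, Raw, Result>,
): Promise<Result> {
  const session = await harness.prepare();
  let failure: unknown;
  try {
    await harness.start(session);
    const raw = await harness.send(session);
    return await harness.resolveOutcome(session, raw);
  } catch (error) {
    failure = error;
    throw error; // the caller still sees the original error
  } finally {
    await harness.cleanup(session, failure);
  }
}
```

Keeping cleanup in a finally block is what pairs it with every started session, so a later change to the native cleanup hook cannot skip teardown on the error path without changing this structure.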

Why this is the right fix

Three alternatives were considered:

  1. Inline the native lifecycle in adaptAgentHarnessToV2: this would entangle adapter and native code paths, making PR 4's split-lifecycle work harder to land cleanly.
  2. Add an optional createV2() field on the public V1 AgentHarness: this widens the plugin SDK contract for a feature the RFC marks as explicitly optional future work, and reverting it later would be a breaking change.
  3. Special-case Pi inside selection.ts: this embeds harness identity into the orchestration layer, which is exactly what the V2 boundary is supposed to avoid.

The internal registry pattern in this PR keeps the V2 lifecycle boundary generic and extensible, leaves the public V1 surface untouched, and makes Codex's eventual native V2 factory a single registration call once the optional SDK widening is approved.

Notes for reviewers

  • Linked RFC: openclaw/openclaw#72072. Predecessor: PR #72098 (PR 1 of 7, baseline doc).
  • selection.test.ts mocks ./builtin-pi.js; the mock replaces the entire module so the module-level Pi registration does not run inside that test file. All 25 selection tests still observe the V1-adapter path, which is correct (the mock harness has no real backing).
  • Vitest module isolation is what keeps the registry registrations scoped per test file. Tests that need to assert registry behavior register under unique ids ("no-native-factory-test", "native-factory-preference-test", "parity-adapter", "parity-native") so they do not collide with the module-level "pi" registration.
  • The native PI cleanup hook is intentionally empty in this PR. PR 4 will replace it with split-lifecycle teardown plumbed through attempt.subscription-cleanup.ts. The parity test will surface any drift in cleanup-call signature when that change lands.

Changed files

  • src/agents/harness/builtin-pi.test.ts (added, +106/-0)
  • src/agents/harness/builtin-pi.ts (modified, +50/-2)
  • src/agents/harness/selection.ts (modified, +2/-2)
  • src/agents/harness/v2.test.ts (modified, +121/-1)
  • src/agents/harness/v2.ts (modified, +38/-0)

PR #72110: [refactor] Extract attempt tool-policy helpers (RFC #72072 PR 3/7, reduced scope)

Description (problem / solution / changelog)

Refs #72072 (RFC) — PR 3 of 7 (reduced scope).

Summary

Extracts the cleanly-bounded tool-policy helpers (resolveUnknownToolGuardThreshold, applyEmbeddedAttemptToolsAllow, shouldCreateBundleMcpRuntimeForAttempt, collectAttemptExplicitToolAllowlistSources) out of pi-embedded-runner/run/attempt.ts (3,212 LOC) into a new domain module attempt-tools.ts (151 LOC). attempt.ts re-exports the public symbols so all existing import paths keep working.

This is a reduced PR 3 scope compared to the RFC. The RFC envisioned three new modules in this PR (attempt-tools.ts, attempt-prompt.ts, attempt-transport.ts). After deeper recon against current origin/main, two of the three planned extractions (attempt-prompt.ts, attempt-transport.ts) sit on per-turn seams inside the main stream loop rather than on per-attempt setup seams; pulling them cleanly requires the lifecycle work that PR 4 plans. They will land as separate follow-ups, not in PR 3.

What is being fixed

Pure tool-policy logic was inline in the embedded attempt orchestrator, mixed with cross-cutting orchestration state. Reviewing the unknown-tool guard threshold or the explicit-allowlist source aggregation meant scrolling past a 2.5k LOC orchestration body. The four functions extracted here are independently testable, have no closure dependencies on runEmbeddedAttempt's state, and were already covered by attempt.test.ts.

attempt.test.ts continues to import them from ./attempt.js (via the re-export), so test-side blast radius is zero.

Affected surface: src/agents/pi-embedded-runner/run/attempt.ts, new file src/agents/pi-embedded-runner/run/attempt-tools.ts. No public API change. No test changes.

Architecture diff

flowchart TD
  subgraph Before
    runEmbeddedAttempt --> inlineUnknownGuard[inline resolveUnknownToolGuardThreshold]
    runEmbeddedAttempt --> inlineToolsAllow[inline applyEmbeddedAttemptToolsAllow]
    runEmbeddedAttempt --> inlineBundleMcpDecision[inline shouldCreateBundleMcpRuntimeForAttempt]
    runEmbeddedAttempt --> inlineAllowlistAggregation[inline collectAttemptExplicitToolAllowlistSources]
    attemptTest[attempt.test.ts] --> attemptModule[attempt.ts]
  end
  subgraph After
    runEmbeddedAttempt2[runEmbeddedAttempt] -- imports --> toolsModule[attempt-tools.ts]
    attemptModule2[attempt.ts] -- re-exports --> toolsModule
    attemptTest2[attempt.test.ts] --> attemptModule2
  end

File map

Action | Path | Purpose
add | src/agents/pi-embedded-runner/run/attempt-tools.ts | New domain module owning four pure-or-near-pure tool helpers, all previously in attempt.ts.
modify | src/agents/pi-embedded-runner/run/attempt.ts | Remove the four moved functions (~120 LOC), re-export the previously public ones from the new module so external callers keep working, and drop now-unused imports (TOOL_NAME_SEPARATOR, pi-tools.policy.js policy resolvers, subagent-capabilities.js helpers, collectExplicitToolAllowlistSources, UNKNOWN_TOOL_THRESHOLD). The attempt.ts body still calls the four helpers via the local import.

LOC change: attempt.ts 3,212 → 3,091 (−121 LOC). New attempt-tools.ts is 151 LOC.

Validation

pnpm check:architecture
# green: 0 runtime value cycles, 0 madge cycles

pnpm check:test-types
# green: tsgo:core:test + tsgo:extensions:test, 0 type errors

node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts \
  src/agents/pi-embedded-runner/run/attempt.test.ts
# green: 1 file / 122 tests passed in 3.26s

node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts \
  src/agents/pi-embedded-runner/run/attempt.test.ts \
  src/agents/runtime-plan/build.test.ts \
  src/agents/runtime-plan/tools.test.ts \
  src/agents/harness/v2.test.ts \
  src/agents/harness/selection.test.ts
# green: 5 files / 164 tests passed

pnpm check:changed
# All static checks (lint:core, runtime/madge import cycles, webhook/pairing guards) green.
# Targeted lane: agents.config.ts → 38 files / 663 tests passed in 13.33s.
# Unrelated E2E flake: vitest.e2e.config.ts → 1 failure in
#   `src/agents/pi-embedded-runner.run-embedded-pi-agent.auth-profile-rotation.e2e.test.ts`
#   ("preserves user-pinned auth profiles across provider aliases", 127s).
# This e2e file does not import any of the symbols this PR moved
# (`grep` for `attempt-tools` / `applyEmbeddedAttemptToolsAllow` / etc. returns nothing).
# The failure is a pre-existing flake in the auth-profile-rotation surface, not caused by this PR.

Why the reduced scope

The RFC's PR 3 design assumed the prompt-cache prep was a one-shot setup that could be pulled out into a single prepareAttemptPromptCache(...): PromptCachePrep function before the per-turn stream loop. Recon against current origin/main shows the cache observation is born inside the loop on every turn (beginPromptCacheObservation at attempt.ts:2276 within the per-turn body). Extracting it cleanly without disturbing the rest of the loop body is PR 4's work, not PR 3's.

Similarly, the transport configuration (extra-param resolution, stream-fn wrapping, Google prompt-cache stream wrapper) is currently a sequence of decorations applied inline against activeSession.agent.streamFn mid-loop, not a one-shot setup. A safe extraction needs the lifecycle split that PR 4 introduces.

PR 3 still delivers the slice of the RFC that is independently safe and reviewable: pure tool-policy helpers off the orchestration body. PR 3.1 / PR 4 follow-ups will continue the per-turn-seam work.

What did NOT change

  • Public API: zero surface change. All previously-exported symbols still export from ./attempt.js.
  • attempt.test.ts: untouched. All 122 tests still pass importing from ./attempt.js.
  • Behavior: zero. This is a textual move with no logic change.
  • Imports for callers: zero. No consumer of attempt.ts had to change.
  • runEmbeddedAttempt body: only the call sites that use the moved helpers; no orchestration logic touched.
  • Plugin SDK at src/plugin-sdk/agent-harness-runtime.ts: untouched.
  • The other helper files in this directory (attempt.transcript-policy.ts, attempt.subscription-cleanup.ts, attempt.sessions-yield.ts, attempt.stop-reason-recovery.ts, attempt.tool-call-argument-repair.ts, attempt.tool-call-normalization.ts, etc.): untouched.

Notes for reviewers

  • Linked RFC: openclaw/openclaw#72072. Predecessors: PR #72098 (PR 1, baseline doc), PR #72105 (PR 2, native AgentHarnessV2 factory).
  • The reduced scope decision is documented in the baseline doc at PR 1 and surfaced again here. If you want the full three-module extraction in this PR, please flag and I will reopen for the prompt + transport seams together with PR 4's lifecycle work.
  • collectAttemptExplicitToolAllowlistSources is intentionally exported from attempt-tools.ts (it was previously a private symbol in attempt.ts). It is not re-exported from attempt.ts since no external caller needs it; only attempt.ts imports it locally to drive the explicit-allowlist guard.
  • The double import + export {} from pattern in attempt.ts for the three publicly-re-exported helpers is intentional: import brings the symbols into scope so the body can call them; the separate export { ... } from line keeps the re-export available to external callers.

Changed files

  • src/agents/pi-embedded-runner/run/attempt-tools.ts (added, +151/-0)
  • src/agents/pi-embedded-runner/run/attempt.ts (modified, +13/-132)

PR #72113: [refactor] Extract attempt message summary helpers (RFC #72072 PR 4/7, reduced scope)

Description (problem / solution / changelog)

Refs #72072 (RFC) — PR 4 of 7 (reduced scope).

Summary

Extracts the two pure message-summarization helpers (summarizeMessagePayload, summarizeSessionContext) from pi-embedded-runner/run/attempt.ts into a new domain module attempt-message-summary.ts. These were the first cleanly-bounded slice of the lifecycle/diagnostics phase and have no closure dependencies on runEmbeddedAttempt's state. attempt.ts imports summarizeSessionContext for one diagnostic call site; the helpers are not part of the public attempt surface so they are not re-exported.

This is a reduced PR 4 scope, mirroring the conservative cut taken in #72110 (PR 3). The RFC's PR 4 design called for attempt-stream-loop.ts (~1,550-1,850 LOC pulled out) and attempt-lifecycle.ts (~1,900-2,050 LOC pulled out). After deeper recon against current origin/main, both seams are inside runEmbeddedAttempt's closure with deep state dependencies (the cleanup finally block at lines ~3036-3085 alone references session, sessionManager, releaseWsSession, bundleMcpRuntime, bundleLspRuntime, sessionLock, removeToolResultContextGuard, flushPendingToolResultsAfterIdle, aborted, timedOut, idleTimedOut, timedOutDuringCompaction, promptError, params.sessionId, emitDiagnosticRunCompleted, trajectoryRecorder, trajectoryEndRecorded). Moving them safely needs a dedicated focused pass with full e2e coverage.

What is being fixed

Two pure message-introspection helpers were inline in the embedded attempt orchestrator. They are diagnostic-only: summarizeSessionContext is called from one debug log site, and summarizeMessagePayload is used only inside summarizeSessionContext. Reviewing the diagnostic shape meant scrolling past unrelated orchestration. Moving them clears ~62 LOC out of attempt.ts while keeping the public surface unchanged.

Affected surface: src/agents/pi-embedded-runner/run/attempt.ts, new file src/agents/pi-embedded-runner/run/attempt-message-summary.ts.

Architecture diff

flowchart TD
  subgraph Before
    runEmbeddedAttempt --> inlineSummarizePayload[inline summarizeMessagePayload]
    runEmbeddedAttempt --> inlineSummarizeSessionContext[inline summarizeSessionContext]
  end
  subgraph After
    runEmbeddedAttempt2[runEmbeddedAttempt] -- imports summarizeSessionContext --> summaryModule[attempt-message-summary.ts]
    summaryModule -- private internal --> summarizePayload[summarizeMessagePayload]
  end

File map

Action | Path | Purpose
add | src/agents/pi-embedded-runner/run/attempt-message-summary.ts | Pure message-payload and session-context summarization helpers extracted from attempt.ts. Both exported so the seam is visible; PR 4 only consumes summarizeSessionContext from attempt.ts.
modify | src/agents/pi-embedded-runner/run/attempt.ts | Remove the two inline helper definitions, import summarizeSessionContext from the new module.

LOC change: attempt.ts 3,212 → 3,151 (−61 LOC against the PR 4 baseline; PR 3, measured separately against the same 3,212 LOC baseline, brings it to 3,091). New attempt-message-summary.ts is 90 LOC including its documentation header.

Validation

pnpm check:architecture
# green: 0 runtime value cycles, 0 madge cycles

node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts \
  src/agents/pi-embedded-runner/run/attempt.test.ts
# green: 1 file / 122 tests passed in 2.66s

# Note on `pnpm check:test-types`: 2 type errors surfaced in
#   `src/wizard/setup.test.ts(976,55)` and `src/wizard/setup.test.ts(983,48)`.
# These come from the upstream commits on `origin/main` since the PR 1 baseline
# (`edcb2326a1 test: cover setup provider auth selection` and adjacent), not from
# this PR. PR 4 does not touch `src/wizard/`. Per CLAUDE.md guidance, this PR
# gates on targeted suites and documents the unrelated baseline drift here.

Why the reduced scope

runEmbeddedAttempt is 2,400+ LOC of orchestration with the stream loop and the lifecycle finally-block sharing a single closure. Extracting them into separate files needs either:

  1. Bundling 30+ pieces of state into a context struct passed to each domain function (high mechanical risk of subtle ordering/timing bugs in the lifecycle).
  2. Inlining the seams into helpers but keeping the orchestration shape, gaining nothing.
  3. Splitting the closure properly across multiple modules with explicit data flow contracts (the RFC's intent, but a multi-day pass).

Approach 3 is the right end-state but does not fit a single review-able PR without full e2e regression coverage on cleanup ordering, abort handling, compaction, and prompt-cache observation parity. It will land as a dedicated follow-up with its own validation strategy.
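To make "explicit data flow contracts" concrete, here is a rough sketch of what a narrow contract for one phase could look like. Everything in it is hypothetical: AttemptCleanupInput, AttemptCleanupReport, and runAttemptCleanup are illustrative names, and the policy of continuing after a failed step is an assumption, not the behavior of the existing finally block.

```typescript
// Hypothetical contract: the cleanup phase receives only what it needs,
// instead of reading dozens of variables from the enclosing closure.
interface AttemptCleanupInput {
  steps: ReadonlyArray<{ name: string; run: () => Promise<void> }>;
}

interface AttemptCleanupReport {
  completed: string[];
  failed: Array<{ name: string; error: unknown }>;
}

// Runs every step in the declared order. A failing step is recorded and does
// not prevent later steps from running (an assumption made for this sketch).
async function runAttemptCleanup(input: AttemptCleanupInput): Promise<AttemptCleanupReport> {
  const report: AttemptCleanupReport = { completed: [], failed: [] };
  for (const step of input.steps) {
    try {
      await step.run();
      report.completed.push(step.name);
    } catch (error) {
      report.failed.push({ name: step.name, error });
    }
  }
  return report;
}
```

With a contract of this kind, cleanup ordering becomes something a unit test can assert directly from the returned report, which is part of the regression coverage the deferred split is said to need.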

PR 4 still delivers a real ownership-boundary improvement: pure summarization helpers no longer live in the orchestration module.

What did NOT change

  • Public API: zero. summarizeMessagePayload and summarizeSessionContext were private symbols in attempt.ts (not exported); they are now exported from attempt-message-summary.ts for visibility but attempt.ts does not re-export them.
  • attempt.test.ts: untouched. All 122 tests still pass.
  • Behavior: zero. Pure textual move.
  • Any of the leaf helpers in pi-embedded-runner/run/: untouched.
  • Plugin SDK: untouched.

Notes for reviewers

  • Linked RFC: openclaw/openclaw#72072. Predecessors: #72098, #72105, #72110.
  • This PR is intentionally a small slice. If you would prefer the larger stream-loop / lifecycle split here, please flag and I will reopen with a wider scope; otherwise the deferred work is tracked in PR 7's handoff doc.
  • The wizard type-check drift is unrelated upstream noise (src/wizard/setup.test.ts); PR 4 does not touch that surface.

Changed files

  • src/agents/pi-embedded-runner/run/attempt-message-summary.ts (added, +90/-0)
  • src/agents/pi-embedded-runner/run/attempt.ts (modified, +1/-62)

PR #72116: [refactor] Extract run orchestration helpers (RFC #72072 PR 5/7, reduced scope)

Description (problem / solution / changelog)

Refs #72072 (RFC) — PR 5 of 7 (reduced scope).

Summary

Extracts four pure orchestration helpers (createEmptyAuthProfileStore, buildTraceToolSummary, backfillSessionKey, buildHandledReplyPayloads) from pi-embedded-runner/run.ts into a new domain module run/run-orchestration-helpers.ts. run.ts imports them from the new module, and the now-unused top-level imports (ReplyPayload, normalizeOptionalString, AuthProfileStore, resolveSessionKeyForRequest, resolveStoredSessionKeyForSessionId, ToolSummaryTrace) are removed. None of the four helpers were exported, so the public surface is unchanged.

This is a reduced PR 5 scope, mirroring the conservative cuts taken in #72110 (PR 3) and #72113 (PR 4). The RFC's PR 5 design called for four substantial new modules (model-auth-plan.ts ~150-270 LOC, runtime-plan-factory.ts ~575-600 LOC, lane-workspace.ts ~80-120 LOC, terminal-result.ts ~1,380-1,450 LOC). After deeper recon against current origin/main, the larger seams sit inside runEmbeddedPiAgent's closure with deep state dependencies, particularly terminal-result.ts which would need to thread 30+ pieces of state through a returned struct. Moving them safely needs a dedicated focused pass with full e2e regression coverage on auth profile rotation, lane workspace setup, and terminal result shape.

What is being fixed

Four pure helpers were inline in the embedded run orchestrator:

  • createEmptyAuthProfileStore (6 LOC): a one-liner factory.
  • buildTraceToolSummary (23 LOC): aggregates a tool-call slice into the trace summary observability shape.
  • backfillSessionKey (39 LOC): the read-only sessionId→sessionKey lookup that runs at the top of runEmbeddedPiAgent. Logs a warning on failure but has no other side effects.
  • buildHandledReplyPayloads (14 LOC): normalises an optional ReplyPayload into the array shape downstream delivery expects, defaulting to a silent reply token when the caller does not provide one.
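As an illustration of why helpers like these are safe to move, here is a sketch of the reply-payload normalization described in the last item. It is not the repository's implementation: the ReplyPayload shape, the SILENT_REPLY_TOKEN value, and the choice to treat whitespace-only text as missing are all assumptions, and the function carries a Sketch suffix to keep it distinct from the real helper.

```typescript
// Hypothetical payload shape; the real ReplyPayload is defined in the repo.
interface ReplyPayload {
  text: string;
}

// Placeholder value, not the real silent reply token.
const SILENT_REPLY_TOKEN = "__SILENT__";

// Pure: the output depends only on the argument, no closure or module state is read.
function buildHandledReplyPayloadsSketch(reply?: ReplyPayload): ReplyPayload[] {
  const text = reply?.text.trim();
  if (!text) {
    // No usable reply supplied: fall back to a single silent payload.
    return [{ text: SILENT_REPLY_TOKEN }];
  }
  // Always return the array shape, even for a single payload.
  return [{ ...reply, text }];
}
```

Because the function takes its only input as a parameter and touches nothing else, moving it to a sibling module changes its import path and nothing about its behavior, which is the property all four extracted helpers are described as sharing.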

Affected surface: src/agents/pi-embedded-runner/run.ts, new file src/agents/pi-embedded-runner/run/run-orchestration-helpers.ts.

Architecture diff

flowchart TD
  subgraph Before
    runEmbeddedPiAgent --> inlineEmptyAuthStore[inline createEmptyAuthProfileStore]
    runEmbeddedPiAgent --> inlineTraceToolSummary[inline buildTraceToolSummary]
    runEmbeddedPiAgent --> inlineBackfillSessionKey[inline backfillSessionKey]
    runEmbeddedPiAgent --> inlineHandledReply[inline buildHandledReplyPayloads]
  end
  subgraph After
    runEmbeddedPiAgent2[runEmbeddedPiAgent] -- imports --> orchHelpers[run/run-orchestration-helpers.ts]
  end

File map

Action | Path | Purpose
add | src/agents/pi-embedded-runner/run/run-orchestration-helpers.ts | Pure orchestration helpers extracted from run.ts. All four are exported so the seam is testable; no external caller imports them today, so run.ts does not re-export.
modify | src/agents/pi-embedded-runner/run.ts | Remove the four inline helper definitions. Import from the new module. Drop now-unused top-level imports (ReplyPayload, normalizeOptionalString, AuthProfileStore, resolveSessionKeyForRequest, resolveStoredSessionKeyForSessionId, ToolSummaryTrace).

LOC change: run.ts 2,347 → 2,259 (−88 LOC). New run/run-orchestration-helpers.ts is 123 LOC including documentation header.

Validation

pnpm check:architecture
# green: 0 runtime value cycles, 0 madge cycles

pnpm check:test-types
# green: tsgo:core:test + tsgo:extensions:test, 0 type errors

node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts \
  src/agents/pi-embedded-runner/run.incomplete-turn.test.ts \
  src/agents/pi-embedded-runner/run.empty-error-retry.test.ts \
  src/agents/pi-embedded-runner/run.cross-provider-fallback-error-context.test.ts \
  src/agents/pi-embedded-runner/run.before-agent-reply-cron.test.ts
# green: 4 files / 82 tests passed in 1.42s

Why the reduced scope

runEmbeddedPiAgent is 2,100+ LOC with auth-plan resolution, runtime-plan construction, lane/workspace setup, the per-attempt loop, and terminal-result assembly all sharing one closure. The RFC's four-module split is the right end state, but it threads a lot of closure state through return shapes. The terminal-result phase in particular references usage accumulator state, attempt history, fallback metadata, hook-runner output, replay-state observations, and live-model-switch state. A single reviewable PR can't credibly land that without broader e2e coverage on auth profile rotation and post-compaction reply policy, so the work will land as a dedicated follow-up.

PR 5 still delivers a real ownership-boundary improvement: the four pure helpers no longer live in the orchestration module, and the now-unused top-level imports listed in the file map no longer pollute run.ts's import block.

What did NOT change

  • Public API: zero. All four helpers were private symbols in run.ts; they are now exported from the new module but run.ts does not re-export.
  • runEmbeddedPiAgent body: only the four call sites that use the moved helpers; no orchestration logic touched.
  • Behavior: zero. Pure textual move plus removal of unused imports.
  • All run.* test files (run.incomplete-turn.test.ts, run.empty-error-retry.test.ts, run.cross-provider-fallback-error-context.test.ts, run.before-agent-reply-cron.test.ts, etc.): untouched.
  • Other helpers in pi-embedded-runner/run/: untouched.

Notes for reviewers

  • Linked RFC: openclaw/openclaw#72072. Predecessors: #72098, #72105, #72110, #72113.
  • This PR is intentionally a small slice. PR 7 will track the deferred structural splits (model-auth-plan.ts, runtime-plan-factory.ts, lane-workspace.ts, terminal-result.ts) as named follow-up work for a focused dedicated pass.
  • The new module sits at run/run-orchestration-helpers.ts rather than reusing the existing run/helpers.ts because the two are different concerns: run/helpers.ts already owns retry/scrub/agent-meta utilities, while the four extracted helpers are session-key, auth-store, trace-summary, and reply-payload concerns. Keeping them in a sibling module reads better than overloading a single grab-bag file.

Changed files

  • src/agents/pi-embedded-runner/run.ts (modified, +6/-94)
  • src/agents/pi-embedded-runner/run/run-orchestration-helpers.ts (added, +123/-0)

PR #72118: [refactor] Add embedded runner directory barrels (RFC #72072 PR 6/7, reduced scope)

Description (problem / solution / changelog)

Refs #72072 (RFC) — PR 6 of 7 (reduced scope).

Summary

Adds the two missing directory barrels for the embedded runner public surface:

  • src/agents/embedded-runner/index.ts — canonical directory-barrel form. Re-exports the same neutral-named symbols (runEmbeddedAgent, compactEmbeddedAgentSession, etc.) already published by the existing flat barrel src/agents/embedded-runner.ts.
  • src/agents/pi-embedded-runner/index.ts — deprecated directory-barrel form. Re-exports through the canonical directory barrel, so old pi-embedded-runner/index.js imports keep working.

pi-embedded-runner/aliases.test.ts is extended with a second describe block that asserts identity through both new barrels in addition to the existing flat-barrel asserts. The bidirectional identity contract runEmbeddedAgentFromPiDirBarrel === runEmbeddedPiAgent etc. is now locked at the directory-barrel level too.
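
The identity contract can be pictured with a single-file sketch. Each object below stands in for one module's export surface; the function body, the object names, and the exact set of names each barrel exports are assumptions for illustration, not the repository's actual files.

```typescript
// Stand-in for the real implementation; the signature is illustrative only.
function runEmbeddedPiAgent(prompt: string): string {
  return `ran: ${prompt}`;
}

// Each object models one module's exports. A re-export passes along the same
// function reference; it never creates a copy of the function.
const flatNeutralBarrel = { runEmbeddedAgent: runEmbeddedPiAgent }; // models embedded-runner.ts
const dirNeutralBarrel = { ...flatNeutralBarrel }; // models embedded-runner/index.ts
const dirPiBarrel = { ...dirNeutralBarrel }; // models pi-embedded-runner/index.ts

// The alias test's idea: every path must reach the identical function.
const runEmbeddedAgentFromPiDirBarrel = dirPiBarrel.runEmbeddedAgent;
console.log(runEmbeddedAgentFromPiDirBarrel === runEmbeddedPiAgent);
console.log(dirNeutralBarrel.runEmbeddedAgent === flatNeutralBarrel.runEmbeddedAgent);
```

A wrapper such as `(p) => runEmbeddedPiAgent(p)` in any barrel would behave the same at call time but fail the `===` check, which is why the test asserts identity rather than behaviour.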

This is a reduced PR 6 scope because most of what the RFC originally scoped for PR 6 already shipped on origin/main between the RFC's recon snapshot and PR 1's baseline:

  • embedded-runner.ts (canonical flat barrel) — already exists.
  • pi-embedded-runner.ts already aliases Pi names to neutral names with as-aliasing.
  • aliases.test.ts already includes neutral-barrel identity asserts.
  • embedded-runner/index.ts and pi-embedded-runner/index.ts directory barrels — closed by this PR.
  • ⏸️ Deferred: @deprecated JSDoc on Pi-named exports, ESLint no-restricted-imports warn rule, doc additions for Pi-vs-Codex ownership in docs/pi.md / docs/concepts/agent-loop.md / docs/concepts/agent-runtimes.md / docs/plugins/sdk-runtime.md. These three pieces are tracked in PR 7's handoff doc.

What is being fixed

Third-party consumers and internal tooling sometimes reach for the directory shape (embedded-runner/) when scanning module ownership boundaries; the flat embedded-runner.ts alone left that shape unbacked. Plugin authors who follow the directory-barrel convention had no canonical entry for the embedded runner.

The deprecated pi-embedded-runner/index.ts chain is the symmetric backward-compat: anyone who was importing pi-embedded-runner/index.js (e.g. via tooling that auto-resolves directory barrels) keeps working.

Affected surface: only inside src/agents/{embedded-runner,pi-embedded-runner}/. No public plugin SDK change. No behavior change.

Architecture diff

flowchart TD
  flatNeutral[embedded-runner.ts<br/>canonical flat barrel] -.unchanged.- existingFlat[exports neutral names]
  newDirNeutral[embedded-runner/index.ts<br/>NEW canonical directory barrel] -- re-exports --> flatNeutral
  newDirPi[pi-embedded-runner/index.ts<br/>NEW deprecated directory barrel] -- re-exports --> newDirNeutral
  flatPi[pi-embedded-runner.ts<br/>existing flat alias barrel] -.unchanged.- existingPi[exports both PI and neutral names]
  aliasesTest[aliases.test.ts] -- identity asserts --> flatNeutral
  aliasesTest -- new identity asserts --> newDirNeutral
  aliasesTest -- new identity asserts --> newDirPi

File map

| Action | Path | Purpose |
| --- | --- | --- |
| add | src/agents/embedded-runner/index.ts | Canonical directory-barrel form. Re-exports neutral symbols and types from ../embedded-runner.js. |
| add | src/agents/pi-embedded-runner/index.ts | Deprecated directory barrel chaining through ../embedded-runner/index.js. Carries @deprecated JSDoc on the file header pointing callers at the canonical directory barrel. |
| modify | src/agents/pi-embedded-runner/aliases.test.ts | Add describe block asserting .toBe() identity from both new directory barrels through to pi-embedded-runner.ts. Pre-existing flat-barrel asserts kept as-is. |

LOC: 3 files, 81 lines added.

Validation

pnpm check:architecture
# green: 0 runtime value cycles, 0 madge cycles

pnpm check:test-types
# green: tsgo:core:test + tsgo:extensions:test, 0 type errors

pnpm test src/agents/pi-embedded-runner/aliases.test.ts
# green: vitest.unit-fast.config.ts → 1 file / 2 tests passed in 6.57s
# (the existing test plus the new directory-barrel identity assertions)

What did NOT change

  • Public API: zero. All symbols already existed via the flat barrels; the new files are additive.
  • Behavior: zero. Every directory barrel just re-exports through the canonical neutral barrel.
  • pi-embedded-runner.ts (flat, 49 LOC alias barrel): untouched.
  • embedded-runner.ts (flat, 17 LOC canonical barrel): untouched.
  • All Pi-named symbols (runEmbeddedPiAgent, compactEmbeddedPiSession, etc.): identical, still exported from both flat and directory barrels.
  • RunEmbeddedAgentFn / RunEmbeddedPiAgentFn type aliases in src/plugins/runtime/types-core.ts: untouched (deferred to follow-up).
  • Plugin SDK at src/plugin-sdk/agent-harness-runtime.ts: untouched.
  • Docs (docs/pi.md, docs/concepts/agent-loop.md, docs/concepts/agent-runtimes.md, docs/plugins/sdk-runtime.md): untouched (deferred to follow-up after a fresh content audit).

Why the reduced scope

Three RFC-listed PR 6 deliverables were skipped in this PR and tracked in PR 7's handoff doc:

  1. @deprecated JSDoc on Pi-named exports. Adding JSDoc tags on every Pi-named symbol can produce noisy CI lint warnings if any expected internal callers still use those names. A targeted opt-in pattern needs design discussion before landing.
  2. ESLint no-restricted-imports warn rule against **/pi-embedded-runner outside compat barrels and tests. Wiring this through the existing lint pipeline needs careful exemption listing for compat barrels (pi-embedded-runner.ts, the new pi-embedded-runner/index.ts) and for the many pi-embedded-runner/run/* test files. Worth landing as its own focused PR once the directory barrels exist.
  3. Doc additions for Pi-vs-Codex ownership in four docs files. PR 1's recon noted these docs were already accurate at baseline time; a fresh content audit before edits avoids redundant churn. PR 7's handoff doc captures the audit task.
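
For item 2, a minimal sketch of what such a rule might look like in ESLint flat-config shape is shown below. The file globs, the message text, and the local `LintConfigEntry` type are assumptions for this example; the repository's actual lint pipeline and exemption list may differ.

```typescript
// Local stand-in type so the sketch is self-contained (not ESLint's own type).
type LintConfigEntry = {
  files?: string[];
  rules: Record<string, unknown>;
};

const restrictPiRunnerImports: LintConfigEntry[] = [
  {
    // Warn on any import that reaches into the Pi-named runner paths.
    files: ["src/**/*.ts"],
    rules: {
      "no-restricted-imports": [
        "warn",
        {
          patterns: [
            {
              group: ["**/pi-embedded-runner", "**/pi-embedded-runner/**"],
              message: "Import from the neutral embedded-runner barrel instead.",
            },
          ],
        },
      ],
    },
  },
  {
    // Exemptions (assumed list): compat barrels and test files.
    // In flat config, a later entry overrides earlier ones for matching files.
    files: [
      "src/agents/pi-embedded-runner.ts",
      "src/agents/pi-embedded-runner/index.ts",
      "src/**/*.test.ts",
    ],
    rules: { "no-restricted-imports": "off" },
  },
];

console.log(JSON.stringify(restrictPiRunnerImports, null, 2));
```

Keeping the exemptions in a separate trailing entry makes the allow-list easy to audit and shrink as callers migrate to the neutral names.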

Notes for reviewers

  • Linked RFC: openclaw/openclaw#72072. Predecessors: #72098, #72105, #72110, #72113, #72116.
  • The directory barrel coexists with the flat barrel because import paths are distinct (./embedded-runner.js vs ./embedded-runner/index.js). Node ESM and TypeScript resolve them as separate modules; the identity asserts in aliases.test.ts confirm both paths reach the same canonical functions.
  • If maintainers prefer collapsing embedded-runner.ts into embedded-runner/index.ts (renaming the flat file under the directory), please flag — that is a breaking import-path change for anyone using import "./embedded-runner.js", so I left it alone here.

Changed files

  • src/agents/embedded-runner/index.ts (added, +30/-0)
  • src/agents/pi-embedded-runner/aliases.test.ts (modified, +21/-0)
  • src/agents/pi-embedded-runner/index.ts (added, +30/-0)

PR #72119: [refactor] Document RuntimePlan finalization handoff (RFC #72072 PR 7/7)

Description (problem / solution / changelog)

Refs #72072 (RFC) — PR 7 of 7. Closes RFC #72072 once all six predecessor PRs land.

Summary

Documentation only. Adds docs/refactor/runtime-plan-finalization-complete.md as the maintainer handoff doc for the seven-PR roadmap. Captures:

  • Series at a glance with each PR's effect.
  • Why several PRs landed at reduced scope (442 commits of origin/main drift since the RFC's recon snapshot, plus per-turn vs per-attempt seam analysis).
  • Deferred follow-up work, grouped by structural splits / plugin-side / naming-canonicalization follow-ups.
  • End-to-end verification command set the maintainer should run after merge.
  • GPT-5.4 smoke matrix with the four route checklists from the RFC.
  • Acceptance-criteria assessment against the RFC's verbatim list.

What is being fixed

Without a handoff doc, the deliberate reduced-scope decisions in PRs 3, 5, and 6 risk being lost or re-litigated when follow-up work lands. PR 7 anchors them, lists the deferred items concretely, and gives the next maintainer a single starting point.

Affected surface: docs/refactor/ only. No production or test code.

Architecture diff

flowchart LR
  pr1[#72098 baseline doc] --> pr7[#72119 handoff doc]
  pr2[#72105 native V2 factory] --> pr7
  pr3[#72110 attempt-tools.ts] --> pr7
  pr4[#72113 attempt-message-summary.ts] --> pr7
  pr5[#72116 run-orchestration-helpers.ts] --> pr7
  pr6[#72118 directory barrels] --> pr7
  pr7 -.-> deferred[deferred follow-ups]

File map

| Action | Path | Purpose |
| --- | --- | --- |
| add | docs/refactor/runtime-plan-finalization-complete.md | Maintainer handoff doc summarizing the series, deferrals, verification, and smoke matrix. Companion to the baseline doc at docs/refactor/runtime-plan-finalization-baseline.md. |

Validation

pnpm check:architecture
# green: 0 runtime value cycles, 0 madge cycles (no code changed)

# Doc-only PR: per CLAUDE.md "Docs/changelog-only and CI/workflow metadata-only
# changes are not changed-gate work by default. Use `git diff --check` plus the
# relevant formatter/docs/workflow sanity check."
git diff --check
# green

The maintainer-side verification commands (full vitest series, GPT-5.4 smoke matrix) are the subject of this handoff doc, not its own validation. They run after the structural PRs (#72098, #72105, #72110, #72113, #72116, #72118) merge.

What did NOT change

  • Production code: zero.
  • Test code: zero.
  • Public API: zero.
  • Any other docs: zero. Only the new handoff file is added.

Why this is the right fix

The alternative was to inline the "deferred follow-ups" lists at the bottom of each PR body and call it documented enough. That works for small follow-up sets but here the deferred items span all three categories (structural, plugin-side, naming) and connect across PRs. Centralising in one doc makes it discoverable from the RFC issue and from the baseline doc, and gives whoever picks up the structural splits a single index of what is left.

Notes for reviewers

Changed files

  • docs/refactor/runtime-plan-finalization-complete.md (added, +110/-0)

PR #72134: [codex] Consolidate RuntimePlan finalization cleanup package

Description (problem / solution / changelog)

Summary

@steipete @pashpashpash this consolidates the seven draft RuntimePlan finalization cleanup PRs into one maintainer-facing package so review effort lands on the PR we intend to merge.

This PR supersedes and carries forward the original draft PRs with traceable commits:

  • #72098 — baseline health document
  • #72105 — native AgentHarnessV2 factory registry for built-in Pi
  • #72110 — attempt tool-policy helper extraction
  • #72113 — attempt message-summary helper extraction
  • #72116 — run orchestration helper extraction
  • #72118 — canonical/deprecated embedded-runner directory barrels
  • #72119 — RuntimePlan finalization handoff document

It is tied to RFC #72072 and intentionally does not claim to finish the remaining large structural split. This is the cleanup base: it makes the safe pieces mergeable, documents what remains, and preserves compatibility while avoiding seven tiny draft PRs competing for maintainer attention.

The next PR series will be larger and structural, and will take longer to review and test because it contains the real file splits.

Architecture Package

flowchart TD
  RFC["RFC #72072\nRuntimePlan finalization"] --> Baseline["#72098\nBaseline health doc"]
  Baseline --> Harness["#72105\nNative Pi Harness V2 factory registry"]
  Harness --> AttemptTools["#72110\nAttempt tool-policy helpers"]
  AttemptTools --> AttemptSummary["#72113\nAttempt message-summary helpers"]
  AttemptSummary --> RunHelpers["#72116\nRun orchestration helpers"]
  RunHelpers --> Barrels["#72118\nEmbedded-runner barrels + alias tests"]
  Barrels --> Handoff["#72119\nFinalization handoff doc"]
  Handoff --> Next["Next PRs\nCodex V2 + real structural splits"]

Commit Stack Map

| Commit | Origin | Purpose |
| --- | --- | --- |
| docs(refactor): capture RuntimePlan finalization baseline | #72098 | Records current baseline commands, high-risk files, and pre-split state. |
| refactor(agents): add native AgentHarnessV2 factory registry | #72105 | Lets internal harness selection prefer a native V2 lifecycle for built-in Pi while keeping public V1 plugin compatibility. |
| refactor(agents): extract attempt tool-policy helpers | #72110 | Moves tool-policy helper logic out of attempt.ts without changing behavior. |
| refactor(agents): extract attempt message summary helpers | #72113 | Moves message-summary diagnostics out of attempt.ts without changing behavior. |
| refactor(agents): extract run orchestration helpers | #72116 | Moves small run orchestration helper functions out of run.ts without changing behavior. |
| refactor(agents): add canonical and deprecated embedded runner directory barrels | #72118 | Adds neutral directory imports while keeping Pi compatibility paths alive. |
| docs(refactor): add RuntimePlan finalization handoff | #72119 | Documents what is done, partial, deferred, and how to verify final smoke. |
| fix: stabilize runtime finalization cleanup package | this PR | Fixes alias coverage, import hygiene, registry restore semantics, and stale doc wording found during consolidation audit. |
| chore: retrigger qa parity gate | this PR | Empty commit used only because GitHub rejected direct failed-job rerun without admin rights. |
| fix: address runtime cleanup review comments | this PR | Addresses Copilot/Greptile feedback on registry test cleanup and consolidated-doc wording. |

What Changed

  • Adds an internal native Harness V2 factory registry and a built-in Pi native V2 implementation that still bottoms out in the existing runEmbeddedAttempt path.
  • Extracts safe leaf helpers from attempt.ts / run.ts as preparatory cleanup only.
  • Adds canonical embedded-runner/index.js and deprecated pi-embedded-runner/index.js directory barrels.
  • Strengthens alias tests so Pi-named compatibility exports are protected, not just neutral names.
  • Adds baseline and handoff docs for maintainers to continue the structural cleanup without rediscovering the same map.
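
To make the first bullet concrete, here is a minimal sketch of a factory registry that prefers a native V2 implementation and otherwise falls back to a V1 adapter. Every type and function name below (`HarnessV1`, `HarnessV2`, `registerNativeV2Factory`, `adaptV1ToV2`, `resolveHarnessV2`) is hypothetical; the real harness interfaces are not shown in this PR body.

```typescript
// Hypothetical stand-ins for the real harness interfaces.
type HarnessV1 = { id: string; run: (input: string) => string };
type HarnessV2 = {
  id: string;
  prepare: (input: string) => string;
  execute: (prepared: string) => string;
};
type NativeV2Factory = () => HarnessV2;

const nativeV2Factories = new Map<string, NativeV2Factory>();

// Register a native factory. The returned function restores whatever was
// registered before, which keeps test cleanup from leaking state.
function registerNativeV2Factory(id: string, factory: NativeV2Factory): () => void {
  const previous = nativeV2Factories.get(id);
  nativeV2Factories.set(id, factory);
  return () => {
    if (previous) nativeV2Factories.set(id, previous);
    else nativeV2Factories.delete(id);
  };
}

// Drive a V1 harness through the V2 lifecycle shape.
function adaptV1ToV2(harness: HarnessV1): HarnessV2 {
  return {
    id: harness.id,
    prepare: (input) => input,
    execute: (prepared) => harness.run(prepared),
  };
}

// Prefer a native V2 implementation; otherwise fall back to the adapter.
function resolveHarnessV2(harness: HarnessV1): HarnessV2 {
  const factory = nativeV2Factories.get(harness.id);
  return factory ? factory() : adaptV1ToV2(harness);
}

const pi: HarnessV1 = { id: "pi", run: (input) => `v1-pi:${input}` };
const codex: HarnessV1 = { id: "codex", run: (input) => `v1-codex:${input}` };
const restore = registerNativeV2Factory("pi", () => ({
  id: "pi",
  prepare: (input) => input.trim(),
  execute: (prepared) => `native-pi:${prepared}`,
}));

console.log(resolveHarnessV2(pi).execute("hello"));
console.log(resolveHarnessV2(codex).execute("hello"));
restore();
```

In this sketch the harness without a registered factory still works through the adapter, mirroring the note that Codex reaches V2 through the existing V1 adapter path.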

What Did Not Change

  • No public plugin API removal.
  • No Codex native V2 factory yet; Codex still reaches V2 through the existing V1 adapter path.
  • No stream-loop/lifecycle split yet.
  • No full run.ts table-of-contents split yet.
  • No pi-embedded-runner package rename beyond additive directory barrels.
  • No work on #70743 or #70772.

Deferred Follow-Ups

The next PR package should finish the actual structural split work, in this order:

  1. Native Codex Harness V2 factory and tests.
  2. Split attempt-prompt.ts and attempt-transport.ts.
  3. Split attempt-stream-loop.ts and attempt-lifecycle.ts.
  4. Split run.ts into model/auth plan, RuntimePlan factory, lane/workspace, and terminal-result modules.
  5. Add canonical embedded-runner docs/import guard once alias barrels are stable.
  6. Final smoke/handoff doc after structural splits land.

Validation

Passed locally:

  • node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts src/agents/harness/v2.test.ts src/agents/harness/builtin-pi.test.ts src/agents/harness/selection.test.ts src/agents/pi-embedded-runner/aliases.test.ts src/agents/pi-embedded-runner/run/attempt.test.ts
  • ./node_modules/.bin/oxlint --tsconfig tsconfig.oxlint.core.json src/agents/harness/builtin-pi.ts src/agents/harness/selection.test.ts src/agents/harness/v2.ts src/agents/pi-embedded-runner/aliases.test.ts src/agents/pi-embedded-runner/index.ts src/agents/pi-embedded-runner/run/attempt.ts
  • git diff --check
  • pnpm check:architecture

Known local-only caveats observed:

  • Local pnpm check:test-types fails in files outside this PR's diff, mainly existing model compat/provider SDK typings and missing @vincentkoc/qrcode-tui types. The failing files include src/agents/openai-transport-stream.test.ts, src/config/types.models.ts, src/media/qr-runtime.ts, and src/plugin-sdk/provider-catalog-shared.ts; none are touched by this package. GitHub check-test-types is passing on the PR.
  • The qa-lab parity gate currently fails in thread-memory-isolation on this PR and unrelated PRs #71582 / #71862 from the same window (all 11/12 pass with the same scenario timing out). Direct rerun was unavailable without admin rights, so an empty commit retriggered CI once; the repeated failure is documented as a systemic qa-lab/mock timeout rather than a code regression in this package.

Review Notes

This is a replacement for the seven drafts above. After this PR is accepted as the review target, I will comment on and close the old drafts as superseded, preserving their references here for auditability.

Changed files

  • docs/refactor/runtime-plan-finalization-baseline.md (added, +148/-0)
  • docs/refactor/runtime-plan-finalization-complete.md (added, +110/-0)
  • src/agents/embedded-runner/index.ts (added, +31/-0)
  • src/agents/harness/builtin-pi.test.ts (added, +106/-0)
  • src/agents/harness/builtin-pi.ts (modified, +50/-2)
  • src/agents/harness/selection.test.ts (modified, +32/-0)
  • src/agents/harness/selection.ts (modified, +2/-2)
  • src/agents/harness/v2.test.ts (modified, +127/-1)
  • src/agents/harness/v2.ts (modified, +42/-0)
  • src/agents/pi-embedded-runner/aliases.test.ts (modified, +27/-0)
  • src/agents/pi-embedded-runner/index.ts (added, +42/-0)
  • src/agents/pi-embedded-runner/run.ts (modified, +6/-94)
  • src/agents/pi-embedded-runner/run/attempt-message-summary.ts (added, +90/-0)
  • src/agents/pi-embedded-runner/run/attempt-tools.ts (added, +151/-0)
  • src/agents/pi-embedded-runner/run/attempt.ts (modified, +13/-195)
  • src/agents/pi-embedded-runner/run/run-orchestration-helpers.ts (added, +123/-0)

PR #72259: [codex] Split embedded attempt prompt and transport preparation

Description (problem / solution / changelog)

Summary

@steipete @pashpashpash this is the first real structural split after the consolidated RuntimePlan cleanup package.

This PR is stacked after #72134 and implements the next slice from RFC #72072: move prompt/bootstrap preparation and transport/session-stream setup out of run/attempt.ts without changing runtime behavior.

Review note: until #72134 lands, this PR includes that cleanup package in the base diff. The unique diff after #72134 is one commit: 1cdfd2c5be refactor: split attempt prompt and transport preparation.

Diagram

flowchart TD
  A["runEmbeddedAttempt"] --> B["attempt-prompt.ts"]
  A --> C["attempt-transport.ts"]
  A --> D["remaining stream loop + lifecycle"]

  B --> B1["bootstrap routing"]
  B --> B2["context-file remap"]
  B --> B3["bootstrap budget + warning"]
  B --> B4["LLM-boundary message normalization"]

  C --> C1["provider stream selection"]
  C --> C2["OpenAI websocket decision"]
  C --> C3["provider text transforms"]
  C --> C4["RuntimePlan transport extra params"]
  C --> C5["effective cache retention + transport"]

  D --> E["future PR: stream-loop/lifecycle split"]

What Changed

| File | Change |
| --- | --- |
| src/agents/pi-embedded-runner/run/attempt-prompt.ts | New prompt/bootstrap helper module for primary-bootstrap detection, injected-context remapping, LLM-boundary message normalization, bootstrap routing, context loading, budget analysis, and warning preparation. |
| src/agents/pi-embedded-runner/run/attempt-transport.ts | New transport helper module for provider stream selection, OpenAI WS selection, provider text transforms, RuntimePlan transport extra params, cache-retention resolution, and effective transport reporting. |
| src/agents/pi-embedded-runner/run/attempt.ts | Delegates the two extracted policy clusters while preserving existing exports and behavior-compatible fallback paths. |

What Did Not Change

  • No Harness V2 public API change.
  • No Codex adapter behavior change.
  • No pi-embedded-runner rename.
  • No stream-loop/lifecycle extraction yet.
  • No terminal-result/failover extraction yet.
  • No WS pooling behavior change.

Why This Slice

The previous reduced drafts intentionally avoided the closure-heavy portions of attempt.ts. This PR starts the actual split with two seams that are high-value but still reviewable:

  • Prompt/bootstrap preparation is a coherent ownership boundary and already has focused bootstrap/context tests.
  • Transport setup is a coherent ownership boundary and keeps wrapper ordering intact by mutating activeSession.agent.streamFn in the same sequence before the remaining stream wrappers run.

Validation

Passed locally:

node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts \
  src/agents/pi-embedded-runner/run/attempt.test.ts \
  src/agents/pi-embedded-runner/run/attempt.spawn-workspace.bootstrap-routing.test.ts \
  src/agents/pi-embedded-runner/run/attempt.spawn-workspace.context-injection.test.ts

node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts \
  src/agents/pi-embedded-runner/stream-resolution.test.ts \
  src/agents/pi-embedded-runner/run/attempt.tool-call-normalization.test.ts \
  src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.test.ts \
  src/agents/pi-embedded-runner/run/llm-idle-timeout.test.ts \
  src/agents/pi-embedded-runner/run/attempt.stop-reason-recovery.test.ts \
  src/agents/pi-embedded-runner/prompt-cache-retention.test.ts \
  src/agents/pi-embedded-runner/prompt-cache-observability.test.ts

./node_modules/.bin/oxlint --tsconfig tsconfig.oxlint.core.json \
  src/agents/pi-embedded-runner/run/attempt.ts \
  src/agents/pi-embedded-runner/run/attempt-prompt.ts \
  src/agents/pi-embedded-runner/run/attempt-transport.ts

git diff --check
pnpm check:architecture

Known baseline/stacked caveat:

pnpm check:test-types
pnpm check:changed

Both still fail on existing repo-baseline ModelCompatConfig / @vincentkoc/qrcode-tui drift in files outside this PR's unique diff, matching the current cleanup-package caveat. This PR no longer has attempt-specific type errors after the transport return narrowing.

Related Work

  • RFC: #72072
  • Consolidated cleanup package: #72134
  • Superseded draft sources carried by #72134: #72098, #72105, #72110, #72113, #72116, #72118, #72119

Follow-Up Structural PRs

  • Split the remaining stream loop and lifecycle cleanup state from attempt.ts.
  • Split run.ts by runtime-plan factory, lane/workspace, model/auth, and terminal-result helpers.
  • Add canonical embedded-runner import guard/docs after structural seams are stable.

Changed files

  • docs/refactor/runtime-plan-finalization-baseline.md (added, +148/-0)
  • docs/refactor/runtime-plan-finalization-complete.md (added, +110/-0)
  • src/agents/embedded-runner/index.ts (added, +31/-0)
  • src/agents/harness/builtin-pi.test.ts (added, +106/-0)
  • src/agents/harness/builtin-pi.ts (modified, +50/-2)
  • src/agents/harness/selection.test.ts (modified, +32/-0)
  • src/agents/harness/selection.ts (modified, +2/-2)
  • src/agents/harness/v2.test.ts (modified, +127/-1)
  • src/agents/harness/v2.ts (modified, +42/-0)
  • src/agents/pi-embedded-runner/aliases.test.ts (modified, +27/-0)
  • src/agents/pi-embedded-runner/index.ts (added, +42/-0)
  • src/agents/pi-embedded-runner/run.ts (modified, +6/-94)
  • src/agents/pi-embedded-runner/run/attempt-message-summary.ts (added, +90/-0)
  • src/agents/pi-embedded-runner/run/attempt-prompt.ts (added, +165/-0)
  • src/agents/pi-embedded-runner/run/attempt-tools.ts (added, +151/-0)
  • src/agents/pi-embedded-runner/run/attempt-transport.ts (added, +175/-0)
  • src/agents/pi-embedded-runner/run/attempt.ts (modified, +62/-439)
  • src/agents/pi-embedded-runner/run/run-orchestration-helpers.ts (added, +123/-0)

PR #72261: [codex] Extract embedded run attempt factory

Description (problem / solution / changelog)

@steipete @pashpashpash

Summary

This is the next structural cleanup PR under RFC #72072. It is stacked after #72134 and #72259, and its unique review surface is one commit: cc8c5f289f refactor: extract embedded run attempt factory.

It moves the embedded-run attempt-input construction out of src/agents/pi-embedded-runner/run.ts into a focused factory module. The goal is to make run.ts read more like orchestration without changing runtime behavior, harness selection, auth forwarding, RuntimePlan construction, retry prompt assembly, or attempt payload shape.

flowchart TD
  RFC["RFC #72072"] --> Cleanup["#72134 cleanup package"]
  Cleanup --> PromptTransport["#72259 prompt + transport prep split"]
  PromptTransport --> ThisPR["This PR: run attempt factory split"]

  RunTS["run.ts orchestration"] --> Factory["runtime-plan-factory.ts"]
  Factory --> Prompt["attempt prompt + retry instructions"]
  Factory --> Plan["AgentRuntimePlan build"]
  Factory --> Auth["model auth/header forwarding"]
  Factory --> AttemptInput["EmbeddedRunAttemptParams"]
  AttemptInput --> Backend["runEmbeddedAttemptWithBackend"]

File Map

| File | Change |
| --- | --- |
| src/agents/pi-embedded-runner/run.ts | Delegates attempt-input construction to the new factory. Keeps orchestration flow and retry loop in place. |
| src/agents/pi-embedded-runner/run/runtime-plan-factory.ts | New helper for prompt assembly, RuntimePlan construction, resolved API key handling, auth header application, and EmbeddedRunAttemptParams shaping. |
| src/agents/pi-embedded-runner/run/runtime-plan-factory.test.ts | Unit-fast coverage for prompt instruction assembly and runtime-auth API key suppression. |

What Changed

  • Extracted retry-instruction prompt assembly from the middle of run.ts.
  • Extracted buildAgentRuntimePlan(...) wiring into a named attempt factory helper.
  • Extracted the auth/header behavior that prevents leaking the pre-exchange API key after runtime auth takes over.
  • Kept the selected harness id threaded into the attempt so plugin-owned transports do not drift back to Pi during dispatch.
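
The API-key suppression described in the third bullet can be sketched as below. This is only an illustration under assumed names: `ResolvedAuth`, `AttemptAuthInput`, `buildAttemptAuthInput`, and the `authorization` header handling are hypothetical and are not taken from runtime-plan-factory.ts.

```typescript
// Hypothetical shapes; field names are assumptions for this example.
type ResolvedAuth = {
  apiKey?: string; // key resolved before any runtime auth exchange
  runtimeAuthActive: boolean; // true once runtime auth owns the credentials
};

type AttemptAuthInput = {
  resolvedApiKey?: string;
  headers: Record<string, string>;
};

function buildAttemptAuthInput(
  auth: ResolvedAuth,
  baseHeaders: Record<string, string>,
): AttemptAuthInput {
  // Copy so the caller's header object is never mutated.
  const headers: Record<string, string> = { ...baseHeaders };
  if (auth.runtimeAuthActive) {
    // Runtime auth has taken over: do not forward the pre-exchange key.
    return { resolvedApiKey: undefined, headers };
  }
  if (auth.apiKey) {
    headers["authorization"] = `Bearer ${auth.apiKey}`;
  }
  return { resolvedApiKey: auth.apiKey, headers };
}

console.log(buildAttemptAuthInput({ apiKey: "sk-old", runtimeAuthActive: true }, {}));
console.log(buildAttemptAuthInput({ apiKey: "sk-live", runtimeAuthActive: false }, {}));
```

Isolating this decision in a small function is what makes a unit-fast test for key suppression possible without running a full attempt.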

What Did Not Change

  • No Harness V2 API changes.
  • No public plugin API changes.
  • No package rename or import guard changes.
  • No stream loop extraction.
  • No lifecycle/cleanup extraction.
  • No attempt/result payload shape changes.
  • No work on the old prototype PRs #70743 or #70772.

Stack / Traceability

| Layer | PR | Role |
| --- | --- | --- |
| RFC | #72072 | RuntimePlan finalization and embedded runner structural cleanup roadmap. |
| Cleanup package | #72134 | Consolidated baseline docs, Harness V2 factory registry, reduced helper extracts, barrels, handoff docs. |
| Prior structural split | #72259 | Extracts attempt prompt/transport preparation helpers. |
| This PR | TBD | Extracts the run.ts attempt-input factory seam. |

Validation

Passed:

node scripts/run-vitest.mjs run --config test/vitest/vitest.unit-fast.config.ts \
  src/agents/pi-embedded-runner/run/runtime-plan-factory.test.ts --reporter verbose

node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts \
  src/agents/pi-embedded-runner/run.overflow-compaction.test.ts \
  src/agents/pi-embedded-runner/run.incomplete-turn.test.ts \
  src/agents/pi-embedded-runner/run/setup.test.ts \
  src/agents/pi-embedded-runner/run/auth-controller.test.ts --reporter verbose

./node_modules/.bin/oxlint --tsconfig tsconfig.oxlint.core.json \
  src/agents/pi-embedded-runner/run.ts \
  src/agents/pi-embedded-runner/run/runtime-plan-factory.ts \
  src/agents/pi-embedded-runner/run/runtime-plan-factory.test.ts

git diff --check
pnpm check:architecture

Known baseline issue, not introduced by this PR:

pnpm check:test-types

still fails on existing main/stack drift around ModelCompatConfig / missing @vincentkoc/qrcode-tui types. The failing files are outside this PR's unique diff.

Review Guidance

Please review the unique diff after #72259. The important invariant is that the object passed to runEmbeddedAttemptWithBackend(...) is behavior-equivalent to the previous inline object while moving the policy-shaped construction into a named seam for the next orchestration split.

Changed files

  • docs/refactor/runtime-plan-finalization-baseline.md (added, +148/-0)
  • docs/refactor/runtime-plan-finalization-complete.md (added, +110/-0)
  • src/agents/embedded-runner/index.ts (added, +31/-0)
  • src/agents/harness/builtin-pi.test.ts (added, +106/-0)
  • src/agents/harness/builtin-pi.ts (modified, +50/-2)
  • src/agents/harness/selection.test.ts (modified, +32/-0)
  • src/agents/harness/selection.ts (modified, +2/-2)
  • src/agents/harness/v2.test.ts (modified, +127/-1)
  • src/agents/harness/v2.ts (modified, +42/-0)
  • src/agents/pi-embedded-runner/aliases.test.ts (modified, +27/-0)
  • src/agents/pi-embedded-runner/index.ts (added, +42/-0)
  • src/agents/pi-embedded-runner/run.ts (modified, +40/-241)
  • src/agents/pi-embedded-runner/run/attempt-message-summary.ts (added, +90/-0)
  • src/agents/pi-embedded-runner/run/attempt-prompt.ts (added, +165/-0)
  • src/agents/pi-embedded-runner/run/attempt-tools.ts (added, +151/-0)
  • src/agents/pi-embedded-runner/run/attempt-transport.ts (added, +175/-0)
  • src/agents/pi-embedded-runner/run/attempt.ts (modified, +62/-439)
  • src/agents/pi-embedded-runner/run/run-orchestration-helpers.ts (added, +123/-0)
  • src/agents/pi-embedded-runner/run/runtime-plan-factory.test.ts (added, +47/-0)
  • src/agents/pi-embedded-runner/run/runtime-plan-factory.ts (added, +242/-0)

PR #72269: [codex] Extract attempt stream wrappers and diagnostic lifecycle

Description (problem / solution / changelog)

@steipete @pashpashpash

Summary

This is the next structural cleanup PR under RFC #72072. It is stacked after #72134, #72259, and #72261. The unique review surface after #72261 is the attempt stream-wrapper and diagnostic-lifecycle extraction.

It moves the linear stream wrapper stack out of attempt.ts and extracts run diagnostic start/completion emission into a small lifecycle helper. This deliberately does not move the abort/subscription/cleanup block yet; that block still owns too much mutable loop state and should be split only after this wrapper seam is stable.

flowchart TD
  RFC["RFC #72072"] --> Cleanup["#72134 cleanup package"]
  Cleanup --> PromptTransport["#72259 prompt + transport prep split"]
  PromptTransport --> RunFactory["#72261 embedded run attempt factory"]
  RunFactory --> ThisPR["This PR: attempt stream + diagnostic lifecycle split"]

  Attempt["attempt.ts"] --> StreamWrappers["attempt-stream-wrappers.ts"]
  Attempt --> Lifecycle["attempt-lifecycle.ts"]

  StreamWrappers --> Cache["cache trace wrapper"]
  StreamWrappers --> Transcript["transcript/tool-call replay guards"]
  StreamWrappers --> Yield["sessions_yield abort response"]
  StreamWrappers --> Repair["tool-call normalize/repair wrappers"]
  StreamWrappers --> Provider["payload logger + stop-reason recovery"]

  Lifecycle --> RunStarted["run.started event"]
  Lifecycle --> RunCompleted["run.completed event"]

File Map

| File | Change |
| --- | --- |
| src/agents/pi-embedded-runner/run/attempt.ts | Delegates stream wrapper application and run diagnostic lifecycle setup. Keeps stream loop, abort handling, subscription cleanup, and result shaping in place. |
| src/agents/pi-embedded-runner/run/attempt-stream-wrappers.ts | New helper that applies the existing wrapper sequence in the same order: cache trace, transcript guards, sessions_yield response, malformed tool-call normalization/repair, payload logging, and stop-reason recovery. |
| src/agents/pi-embedded-runner/run/attempt-lifecycle.ts | New helper for run.started / once-only run.completed diagnostic events and trace contexts. |
| src/agents/pi-embedded-runner/run/attempt-lifecycle.test.ts | Tests started/completed event emission and once-only completion behavior. |

What Changed

  • Extracted the linear stream wrapper stack from the middle of runEmbeddedAttempt(...).
  • Preserved wrapper order exactly, including cache trace before transcript/tool-call guards and stop-reason recovery last in this extracted segment.
  • Extracted run-level diagnostic event setup into a lifecycle helper.
  • Left idle-timeout, model-call diagnostics, abort, subscription, cleanup, and final result shaping in attempt.ts because they are still tightly coupled to mutable attempt state.
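The once-only completion rule for the lifecycle helper can be sketched roughly as follows. This is a minimal illustration, not the contents of attempt-lifecycle.ts: the names `createRunLifecycle`, `emitStarted`, `emitCompleted`, and the `RunOutcome` values are assumptions, and only the run.started / run.completed event names and the once-only behavior come from the description above.

```typescript
// Hypothetical event and helper names; only the run.started / run.completed
// event names and the once-only rule come from the PR description.
type RunOutcome = "completed" | "aborted" | "error";

type RunDiagnosticEvent =
  | { type: "run.started"; runId: string }
  | { type: "run.completed"; runId: string; outcome: RunOutcome };

function createRunLifecycle(
  runId: string,
  emit: (event: RunDiagnosticEvent) => void,
) {
  let completionEmitted = false;
  return {
    emitStarted(): void {
      emit({ type: "run.started", runId });
    },
    emitCompleted(outcome: RunOutcome): void {
      // Later calls are ignored so cleanup paths can call this safely.
      if (completionEmitted) return;
      completionEmitted = true;
      emit({ type: "run.completed", runId, outcome });
    },
  };
}

const lifecycleEvents: RunDiagnosticEvent[] = [];
const lifecycle = createRunLifecycle("run-1", (event) => {
  lifecycleEvents.push(event);
});
lifecycle.emitStarted();
lifecycle.emitCompleted("aborted");
lifecycle.emitCompleted("completed"); // ignored: completion already emitted
```

Keeping the guard inside the helper means the abort and cleanup paths that remain in attempt.ts could each call the completion emitter without coordinating with one another.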

What Did Not Change

  • No behavior changes to tool execution, transcript repair, retry policy, delivery, or cleanup.
  • No Harness V2 API changes.
  • No public plugin API changes.
  • No package rename or import guard changes.
  • No full stream-loop extraction yet.
  • No work on old prototype PRs #70743 or #70772.

Stack / Traceability

| Layer | PR | Role |
| --- | --- | --- |
| RFC | #72072 | RuntimePlan finalization and embedded runner structural cleanup roadmap. |
| Cleanup package | #72134 | Consolidated baseline docs, Harness V2 factory registry, reduced helper extracts, barrels, handoff docs. |
| Prior structural split | #72259 | Extracts attempt prompt/transport preparation helpers. |
| Prior structural split | #72261 | Extracts the run.ts attempt-input factory seam. |
| This PR | TBD | Extracts attempt stream wrappers and run diagnostic lifecycle helpers. |

Validation

Passed:

node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts \
  src/agents/pi-embedded-runner/run/attempt-lifecycle.test.ts \
  src/agents/pi-embedded-runner/run/attempt.test.ts \
  src/agents/pi-embedded-runner/run/attempt.tool-call-normalization.test.ts \
  src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.test.ts \
  src/agents/pi-embedded-runner/run/attempt.stop-reason-recovery.test.ts \
  src/agents/pi-embedded-runner/run/llm-idle-timeout.test.ts --reporter verbose

./node_modules/.bin/oxlint --tsconfig tsconfig.oxlint.core.json \
  src/agents/pi-embedded-runner/run/attempt.ts \
  src/agents/pi-embedded-runner/run/attempt-lifecycle.ts \
  src/agents/pi-embedded-runner/run/attempt-lifecycle.test.ts \
  src/agents/pi-embedded-runner/run/attempt-stream-wrappers.ts

git diff --check
pnpm check:architecture

Known baseline issue, not introduced by this PR:

pnpm check:test-types

still fails on existing main/stack drift around ModelCompatConfig / missing @vincentkoc/qrcode-tui types. After fixing PR-caused type errors locally, the remaining failures are outside this PR's unique diff.

Review Guidance

Please review the unique diff after #72261. The key invariant is wrapper order: cache trace, transcript replay guards, OpenAI Responses downgrade, sessions_yield abort response, malformed tool-call normalization/repair, XAI HTML-entity decode, Anthropic payload logging, then sensitive stop-reason recovery.
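One way to pin an ordering invariant like this is to apply the wrappers from a single ordered list and record the sequence. The sketch below is generic: `StreamFn`, `StreamWrapper`, `applyStreamWrappers`, and the slug-style wrapper labels are hypothetical stand-ins, and it makes no claim about whether the first wrapper ends up innermost or outermost in the real helper.

```typescript
// Generic sketch: StreamFn and the wrapper shape are hypothetical stand-ins,
// not the real attempt-stream-wrappers.ts types.
type StreamFn = (input: string) => string;
type StreamWrapper = { name: string; wrap: (inner: StreamFn) => StreamFn };

// Labels follow the order listed in the review guidance.
const EXPECTED_WRAPPER_ORDER: string[] = [
  "cache-trace",
  "transcript-replay-guards",
  "openai-responses-downgrade",
  "sessions-yield-abort-response",
  "tool-call-normalize-repair",
  "xai-html-entity-decode",
  "anthropic-payload-logging",
  "stop-reason-recovery",
];

// Applies wrappers first-to-last and records the order of application.
function applyStreamWrappers(
  base: StreamFn,
  wrappers: StreamWrapper[],
  applied: string[],
): StreamFn {
  return wrappers.reduce<StreamFn>((current, wrapper) => {
    applied.push(wrapper.name);
    return wrapper.wrap(current);
  }, base);
}

// A wrapper that changes nothing; real wrappers would transform the stream.
const passthrough = (name: string): StreamWrapper => ({
  name,
  wrap: (inner) => (input) => inner(input),
});

const appliedOrder: string[] = [];
const wrapped = applyStreamWrappers(
  (input) => input,
  EXPECTED_WRAPPER_ORDER.map(passthrough),
  appliedOrder,
);
const orderPreserved =
  appliedOrder.join(" > ") === EXPECTED_WRAPPER_ORDER.join(" > ");
```

A test built on this pattern would fail as soon as a wrapper is reordered, added, or dropped, which is the regression the review guidance asks reviewers to watch for.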

Changed files

  • docs/refactor/runtime-plan-finalization-baseline.md (added, +148/-0)
  • docs/refactor/runtime-plan-finalization-complete.md (added, +110/-0)
  • src/agents/embedded-runner/index.ts (added, +31/-0)
  • src/agents/harness/builtin-pi.test.ts (added, +106/-0)
  • src/agents/harness/builtin-pi.ts (modified, +50/-2)
  • src/agents/harness/selection.test.ts (modified, +32/-0)
  • src/agents/harness/selection.ts (modified, +2/-2)
  • src/agents/harness/v2.test.ts (modified, +127/-1)
  • src/agents/harness/v2.ts (modified, +42/-0)
  • src/agents/pi-embedded-runner/aliases.test.ts (modified, +27/-0)
  • src/agents/pi-embedded-runner/index.ts (added, +42/-0)
  • src/agents/pi-embedded-runner/run.ts (modified, +40/-241)
  • src/agents/pi-embedded-runner/run/attempt-lifecycle.test.ts (added, +67/-0)
  • src/agents/pi-embedded-runner/run/attempt-lifecycle.ts (added, +76/-0)
  • src/agents/pi-embedded-runner/run/attempt-message-summary.ts (added, +90/-0)
  • src/agents/pi-embedded-runner/run/attempt-prompt.ts (added, +165/-0)
  • src/agents/pi-embedded-runner/run/attempt-stream-wrappers.ts (added, +212/-0)
  • src/agents/pi-embedded-runner/run/attempt-tools.ts (added, +151/-0)
  • src/agents/pi-embedded-runner/run/attempt-transport.ts (added, +175/-0)
  • src/agents/pi-embedded-runner/run/attempt.ts (modified, +86/-639)
  • src/agents/pi-embedded-runner/run/run-orchestration-helpers.ts (added, +123/-0)
  • src/agents/pi-embedded-runner/run/runtime-plan-factory.test.ts (added, +47/-0)
  • src/agents/pi-embedded-runner/run/runtime-plan-factory.ts (added, +242/-0)

PR #72272: [codex] Extract embedded run lane and workspace helpers

Description (problem / solution / changelog)

@steipete @pashpashpash

Summary

This is the next structural cleanup slice in the RuntimePlan finalization package from RFC #72072. It keeps the behavior of runEmbeddedPiAgent intact while moving the lane/workspace/bootstrap-adjacent orchestration at the top of src/agents/pi-embedded-runner/run.ts into a tested helper module.

This PR is intentionally stacked on the existing finalization branches, so the GitHub diff against main is cumulative. The unique new tip commit is:

| Commit | Scope |
| --- | --- |
| 7b50588a14 | Extract queue lane planning, tool-result format selection, probe-session detection, abort normalization, and workspace fallback logging into run/lane-workspace.ts with focused tests. |

Package Links

| Item | Role |
| --- | --- |
| #72072 | RFC: RuntimePlan finalization and embedded runner structural cleanup |
| #72134 | Consolidated cleanup package that superseded the seven small drafts |
| #72259 | Prior structural split: attempt prompt and transport preparation |
| #72261 | Prior structural split: embedded run attempt factory |
| #72269 | Prior structural split: attempt stream wrappers and diagnostic lifecycle |
| This PR | Next run-orchestration slice: lane, workspace, channel-format, and abort helpers |

Architecture

flowchart TD
  Entry["runEmbeddedPiAgent"] --> SessionKey["session-key backfill"]
  SessionKey --> QueuePlan["lane-workspace.ts: queue plan"]
  QueuePlan --> SessionLane["session lane enqueue"]
  SessionLane --> GlobalLane["global lane enqueue"]
  GlobalLane --> Workspace["lane-workspace.ts: workspace context"]
  Workspace --> Plugins["runtime plugin load"]
  Plugins --> ExistingRun["existing model/auth/runtime-plan flow"]

  Entry --> Format["lane-workspace.ts: tool-result format"]
  Entry --> Abort["lane-workspace.ts: abort normalization"]
  Workspace --> FallbackLog["workspace fallback warning"]

File Map

| File | Change |
| --- | --- |
| src/agents/pi-embedded-runner/run/lane-workspace.ts | New helper module for queue lane planning, tool-result format resolution, probe detection, abort normalization, workspace context resolution, and fallback logging. |
| src/agents/pi-embedded-runner/run/lane-workspace.test.ts | Focused tests for the extracted orchestration behavior. |
| src/agents/pi-embedded-runner/run.ts | Replaces inline lane/workspace/abort/channel-format logic with calls to the helper. |

What Changed

  • The session/global queue decision is now an explicit EmbeddedRunQueuePlan rather than inline closure setup.
  • The cron deadlock guard (cron lane becomes nested global lane) remains unchanged and is now tested at the helper boundary.
  • Tool-result format selection still prefers caller input, then markdown-capable channel metadata, then markdown as the no-channel default.
  • Abort normalization is unchanged: Error reasons rethrow directly; non-Error reasons become AbortError with cause.
  • Workspace fallback resolution still uses resolveRunWorkspaceDir, canonical workspace comparison, and the same redacted fallback warning.
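The format precedence and abort normalization rules above can be sketched as two small pure functions. This is an illustration under assumptions, not the code in lane-workspace.ts: the function names, the parameter shapes, the `"plain"` fallback for a channel that is not markdown-capable, and the error message are all hypothetical.

```typescript
// "plain" and the parameter names are assumptions for illustration; only the
// precedence order comes from the PR description.
type ToolResultFormat = "markdown" | "plain";

function resolveToolResultFormat(params: {
  requested?: ToolResultFormat;
  channel?: { markdownCapable: boolean };
}): ToolResultFormat {
  if (params.requested !== undefined) return params.requested; // caller input wins
  if (params.channel !== undefined) {
    return params.channel.markdownCapable ? "markdown" : "plain";
  }
  return "markdown"; // no-channel default
}

// Error reasons pass through unchanged; anything else is wrapped in an
// AbortError-named Error that keeps the original reason as `cause`.
function normalizeAbortReason(reason: unknown): Error {
  if (reason instanceof Error) return reason;
  const error = new Error("Run aborted") as Error & { cause?: unknown };
  error.name = "AbortError";
  error.cause = reason;
  return error;
}

const originalError = new Error("user cancelled");
const sameError = normalizeAbortReason(originalError) === originalError;
const wrappedReason = normalizeAbortReason("timeout") as Error & { cause?: unknown };
```

Both functions take plain values and return plain values, which is what makes this kind of logic straightforward to test at a helper boundary instead of through the full run loop.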

What Did Not Change

  • No RuntimePlan behavior changes.
  • No Harness V2 behavior changes.
  • No plugin API changes.
  • No pi-embedded-runner rename.
  • No file move or stream/lifecycle extraction beyond this helper seam.
  • No work on prototype PRs #70743 or #70772.

Validation

Passed:

node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts \
  src/agents/pi-embedded-runner/run/lane-workspace.test.ts \
  src/agents/pi-embedded-runner/run/setup.test.ts \
  src/agents/pi-embedded-runner/run/auth-controller.test.ts \
  src/agents/pi-embedded-runner/run.incomplete-turn.test.ts --reporter verbose

./node_modules/.bin/oxlint --tsconfig tsconfig.oxlint.core.json \
  src/agents/pi-embedded-runner/run.ts \
  src/agents/pi-embedded-runner/run/lane-workspace.ts \
  src/agents/pi-embedded-runner/run/lane-workspace.test.ts

git diff --check
pnpm check:architecture

Known current-main / stack baseline:

pnpm check:test-types

Still fails on the existing ModelCompatConfig.supportsLongCacheRetention / @vincentkoc/qrcode-tui baseline drift. After the latest fix, there are no lane-workspace.* or this-PR errors in that output.

Deferred Work

  • Continue splitting run.ts into model/auth plan, RuntimePlan factory, and terminal-result modules.
  • Keep old Pi paths compatible until the neutral embedded-runner naming PR lands.
  • Leave WS pooling and public Harness V2 plugin API out of this package unless maintainers explicitly request them.

Changed files

  • docs/refactor/runtime-plan-finalization-baseline.md (added, +148/-0)
  • docs/refactor/runtime-plan-finalization-complete.md (added, +110/-0)
  • src/agents/embedded-runner/index.ts (added, +31/-0)
  • src/agents/harness/builtin-pi.test.ts (added, +106/-0)
  • src/agents/harness/builtin-pi.ts (modified, +50/-2)
  • src/agents/harness/selection.test.ts (modified, +32/-0)
  • src/agents/harness/selection.ts (modified, +2/-2)
  • src/agents/harness/v2.test.ts (modified, +127/-1)
  • src/agents/harness/v2.ts (modified, +42/-0)
  • src/agents/pi-embedded-runner/aliases.test.ts (modified, +27/-0)
  • src/agents/pi-embedded-runner/index.ts (added, +42/-0)
  • src/agents/pi-embedded-runner/run.ts (modified, +73/-294)
  • src/agents/pi-embedded-runner/run/attempt-lifecycle.test.ts (added, +67/-0)
  • src/agents/pi-embedded-runner/run/attempt-lifecycle.ts (added, +76/-0)
  • src/agents/pi-embedded-runner/run/attempt-message-summary.ts (added, +90/-0)
  • src/agents/pi-embedded-runner/run/attempt-prompt.ts (added, +165/-0)
  • src/agents/pi-embedded-runner/run/attempt-stream-wrappers.ts (added, +212/-0)
  • src/agents/pi-embedded-runner/run/attempt-tools.ts (added, +151/-0)
  • src/agents/pi-embedded-runner/run/attempt-transport.ts (added, +175/-0)
  • src/agents/pi-embedded-runner/run/attempt.ts (modified, +86/-639)
  • src/agents/pi-embedded-runner/run/lane-workspace.test.ts (added, +149/-0)
  • src/agents/pi-embedded-runner/run/lane-workspace.ts (added, +132/-0)
  • src/agents/pi-embedded-runner/run/run-orchestration-helpers.ts (added, +123/-0)
  • src/agents/pi-embedded-runner/run/runtime-plan-factory.test.ts (added, +47/-0)
  • src/agents/pi-embedded-runner/run/runtime-plan-factory.ts (added, +242/-0)

PR #72274: [codex] Extract embedded run terminal result shaping

Description (problem / solution / changelog)

@steipete @pashpashpash

Summary

This is the next structural cleanup slice for RFC #72072. It extracts the success-terminal result assembly at the bottom of runEmbeddedPiAgent into a focused helper so run.ts no longer owns the full metadata/result object shape inline.

This PR is stacked on the existing RuntimePlan finalization branches, so the GitHub diff against main is cumulative. The unique new tip commit is:

| Commit | Scope |
| --- | --- |
| 556cad03df | Extract success-terminal result shaping into run/terminal-result.ts with unit tests for stop reason priority, execution trace shape, pending hosted-tool calls, request shaping, completion trace, context-management trace, and silent-empty payloads. |

Package Links

| Item | Role |
| --- | --- |
| #72072 | RFC: RuntimePlan finalization and embedded runner structural cleanup |
| #72134 | Consolidated cleanup package |
| #72259 | Attempt prompt and transport preparation split |
| #72261 | Embedded run attempt factory split |
| #72269 | Attempt stream wrappers and diagnostic lifecycle split |
| #72272 | Embedded run lane/workspace helper split |
| This PR | Success terminal-result shaping split |

Architecture

flowchart TD
  AttemptLoop["runEmbeddedPiAgent loop"] --> Payloads["buildEmbeddedRunPayloads"]
  Payloads --> Liveness["replay/liveness resolution"]
  Liveness --> StopReason["terminal-result.ts: stop reason"]
  StopReason --> Terminal["terminal-result.ts: EmbeddedPiRunResult"]

  Terminal --> PayloadShape["payloads / silent-empty"]
  Terminal --> Meta["agent meta + lifecycle meta"]
  Terminal --> Trace["execution trace + request shaping"]
  Terminal --> Completion["completion + pending tool calls"]
  Terminal --> DeliveryState["messaging side-effect fields"]

File Map

| File | Change |
| --- | --- |
| src/agents/pi-embedded-runner/run/terminal-result.ts | New helper for stop reason resolution, execution trace construction, and final success EmbeddedPiRunResult assembly. |
| src/agents/pi-embedded-runner/run/terminal-result.test.ts | Unit tests for the extracted metadata/result behavior. |
| src/agents/pi-embedded-runner/run.ts | Replaces the inline success-terminal object with buildEmbeddedRunTerminalResult(...). Retry/error branches remain in place. |

What Changed

  • Hosted client tool calls still win stop reason priority as tool_calls.
  • sessions_yield still maps to end_turn before falling back to the provider stop reason.
  • Silent empty completions still emit the canonical NO_REPLY token and terminalReplyKind: silent-empty.
  • Pending hosted tool calls still receive a generated ID and JSON-serialized arguments.
  • Execution trace still preserves the legacy shape: a success attempt row is emitted only when prior attempts exist or the assistant surfaced an explicit winning provider/model.
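The stop-reason priority, pending tool-call shaping, and silent-empty rules listed above might look roughly like this when written as pure helpers. Everything here is a sketch under assumptions: the function and field names are hypothetical, the `"NO_REPLY"` literal is assumed from the token's name, treating blank text as a silent-empty completion is a simplification, and the ID generator is injected because the real ID scheme is not described.

```typescript
// Field names and the NO_REPLY literal are assumptions; the priority order and
// the "silent-empty" reply kind come from the PR description.
const NO_REPLY_TOKEN = "NO_REPLY";

type StopReasonInput = {
  hostedToolCallCount: number;
  yieldedViaSessionsYield: boolean;
  providerStopReason?: string;
};

function resolveStopReason(input: StopReasonInput): string | undefined {
  if (input.hostedToolCallCount > 0) return "tool_calls"; // hosted tool calls win
  if (input.yieldedViaSessionsYield) return "end_turn"; // sessions_yield maps to end_turn
  return input.providerStopReason; // otherwise fall back to the provider
}

type PendingToolCall = { id: string; name: string; arguments: string };

// Pending hosted tool calls get a generated ID and JSON-serialized arguments.
function toPendingToolCall(
  call: { name: string; args: unknown },
  generateId: () => string,
): PendingToolCall {
  return { id: generateId(), name: call.name, arguments: JSON.stringify(call.args) };
}

// Blank text stands in for a "silent empty completion" in this sketch.
function shapeReply(text: string): { text: string; terminalReplyKind?: "silent-empty" } {
  if (text.trim() === "") {
    return { text: NO_REPLY_TOKEN, terminalReplyKind: "silent-empty" };
  }
  return { text };
}
```

Expressing the priority as an ordered chain of early returns keeps the precedence visible in one place, which is the property the extracted unit tests are described as covering.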

What Did Not Change

  • No retry/error branch extraction in this PR.
  • No RuntimePlan policy changes.
  • No Harness V2 behavior changes.
  • No public plugin API changes.
  • No file moves, package rename, or WS pooling.
  • No work on prototype PRs #70743 or #70772.

Validation

Passed:

./node_modules/.bin/vitest run --config test/vitest/vitest.unit-fast.config.ts \
  src/agents/pi-embedded-runner/run/terminal-result.test.ts --reporter verbose

node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts \
  src/agents/pi-embedded-runner/run.incomplete-turn.test.ts --reporter verbose

./node_modules/.bin/oxlint --tsconfig tsconfig.oxlint.core.json \
  src/agents/pi-embedded-runner/run.ts \
  src/agents/pi-embedded-runner/run/terminal-result.ts \
  src/agents/pi-embedded-runner/run/terminal-result.test.ts

git diff --check
pnpm check:architecture

Known current-main / stack baseline:

pnpm check:test-types

Still fails on the existing ModelCompatConfig.supportsLongCacheRetention / missing @vincentkoc/qrcode-tui baseline drift. No terminal-result.* or this-PR files appear in that failure list.

Deferred Work

  • Continue splitting the remaining run.ts model/auth and error-terminal seams cautiously.
  • Keep retry/error recovery inline until the mutable loop state has an explicit boundary.
  • Keep old Pi path compatibility until neutral embedded-runner naming lands.

Changed files

  • docs/refactor/runtime-plan-finalization-baseline.md (added, +148/-0)
  • docs/refactor/runtime-plan-finalization-complete.md (added, +110/-0)
  • src/agents/embedded-runner/index.ts (added, +31/-0)
  • src/agents/harness/builtin-pi.test.ts (added, +106/-0)
  • src/agents/harness/builtin-pi.ts (modified, +50/-2)
  • src/agents/harness/selection.test.ts (modified, +32/-0)
  • src/agents/harness/selection.ts (modified, +2/-2)
  • src/agents/harness/v2.test.ts (modified, +127/-1)
  • src/agents/harness/v2.ts (modified, +42/-0)
  • src/agents/pi-embedded-runner/aliases.test.ts (modified, +27/-0)
  • src/agents/pi-embedded-runner/index.ts (added, +42/-0)
  • src/agents/pi-embedded-runner/run.ts (modified, +108/-379)
  • src/agents/pi-embedded-runner/run/attempt-lifecycle.test.ts (added, +67/-0)
  • src/agents/pi-embedded-runner/run/attempt-lifecycle.ts (added, +76/-0)
  • src/agents/pi-embedded-runner/run/attempt-message-summary.ts (added, +90/-0)
  • src/agents/pi-embedded-runner/run/attempt-prompt.ts (added, +165/-0)
  • src/agents/pi-embedded-runner/run/attempt-stream-wrappers.ts (added, +212/-0)
  • src/agents/pi-embedded-runner/run/attempt-tools.ts (added, +151/-0)
  • src/agents/pi-embedded-runner/run/attempt-transport.ts (added, +175/-0)
  • src/agents/pi-embedded-runner/run/attempt.ts (modified, +86/-639)
  • src/agents/pi-embedded-runner/run/lane-workspace.test.ts (added, +149/-0)
  • src/agents/pi-embedded-runner/run/lane-workspace.ts (added, +132/-0)
  • src/agents/pi-embedded-runner/run/run-orchestration-helpers.ts (added, +123/-0)
  • src/agents/pi-embedded-runner/run/runtime-plan-factory.test.ts (added, +47/-0)
  • src/agents/pi-embedded-runner/run/runtime-plan-factory.ts (added, +242/-0)
  • src/agents/pi-embedded-runner/run/terminal-result.test.ts (added, +176/-0)
  • src/agents/pi-embedded-runner/run/terminal-result.ts (added, +158/-0)

PR #72276: [codex] Consolidate embedded runner structural splits

Description (problem / solution / changelog)

@steipete @pashpashpash

Summary

This PR is the single maintainer-facing RuntimePlan finalization and embedded-runner structural cleanup package for RFC #72072. It consolidates the earlier cleanup package plus the follow-up structural split PRs, then applies the actionable bot-review fixes from those PRs.

GitHub shows a cumulative diff against main; the tables below preserve the original PR/commit traceability so maintainers can review by logical slice instead of by archeology.

Superseded PRs

| Superseded PR | Role |
| --- | --- |
| #72134 | Consolidated RuntimePlan finalization cleanup package |
| #72259 | Split attempt prompt and transport preparation |
| #72261 | Extract embedded run attempt factory |
| #72269 | Extract attempt stream wrappers and diagnostic lifecycle |
| #72272 | Extract embedded run lane/workspace helpers, auto-closed by active PR limit |
| #72274 | Extract embedded run terminal result shaping, auto-closed by active PR limit |

Commit Stack Map

| Commit | Source | Scope |
| --- | --- | --- |
| 39b0afb525 | #72134 | Capture finalization baseline doc. |
| e754112b25 | #72134 | Add native AgentHarnessV2 factory registry. |
| fe4f272510 | #72134 | Extract attempt tool-policy helpers. |
| ecf2cd44fc | #72134 | Extract attempt message summary helpers. |
| 0689e41814 | #72134 | Extract run orchestration helpers. |
| 755616970b | #72134 | Add canonical/deprecated embedded-runner barrels. |
| 9c6a61dc07 | #72134 | Add RuntimePlan finalization handoff doc. |
| 83a1f03b52, 7dca2ad3e8, ae5a5cb7bb | #72134 | Stabilization/review fixes and QA retrigger. |
| 1cdfd2c5be | #72259 | Split prompt/bootstrap-context and transport setup out of attempt.ts. |
| cc8c5f289f | #72261 | Extract embedded run attempt input / RuntimePlan factory from run.ts. |
| 96e08f50a5 | #72269 | Extract run diagnostic lifecycle emitter. |
| d1bf5e6e1a | #72269 | Extract ordered stream wrapper stack from attempt.ts. |
| 5b54353fc1 | #72269 | Relax diagnostic lifecycle params to match narrow production/test callers. |
| 7b50588a14 | #72272 | Extract lane/workspace/channel-format/abort helpers from run.ts. |
| 556cad03df | #72274 | Extract success terminal-result shaping from run.ts. |
| c961ad9c25 | Review loop | Fix actionable bot comments from the superseded PRs. |
| 42d927d7d7 | CI fix | Stringify the prior transport before logging transport overrides. |
| 58ff685083 | Review loop | Remove redundant attempt wrapper imports flagged on the consolidated PR. |

Architecture

flowchart TD
  RFC["RFC #72072"] --> Contracts["RuntimePlan cleanup package (#72134)"]
  Contracts --> Harness["AgentHarnessV2 native factory registry"]
  Contracts --> Barrels["embedded-runner compatibility barrels"]
  Contracts --> Docs["baseline + handoff docs"]

  Harness --> AttemptPrep["attempt prompt + transport helpers"]
  AttemptPrep --> AttemptRuntime["attempt lifecycle + stream wrappers"]
  AttemptRuntime --> RunFactory["run attempt factory"]
  RunFactory --> LaneWorkspace["run lane/workspace helpers"]
  LaneWorkspace --> TerminalResult["run terminal result helper"]

  AttemptPrep --> AttemptTs["attempt.ts thinner orchestration"]
  AttemptRuntime --> AttemptTs
  RunFactory --> RunTs["run.ts thinner orchestration"]
  LaneWorkspace --> RunTs
  TerminalResult --> RunTs

File Map

| File / Area | Change |
| --- | --- |
| docs/refactor/runtime-plan-finalization-baseline.md | Documents baseline health and known drift before structural cleanup. |
| docs/refactor/runtime-plan-finalization-complete.md | Maintainer handoff for completed/deferred RuntimePlan finalization work. |
| src/agents/harness/v2.ts and harness tests | Adds internal native V2 factory registry while preserving public V1 compatibility. Review fix: test comments now rely on explicit cleanup, not Vitest isolation. |
| src/agents/embedded-runner/index.ts, src/agents/pi-embedded-runner/index.ts, alias tests | Adds neutral embedded-runner barrels while keeping Pi-named compatibility aliases. |
| src/agents/pi-embedded-runner/run/attempt-tools.ts | Tool allow-list/policy helper extraction. |
| src/agents/pi-embedded-runner/run/attempt-message-summary.ts | Diagnostic message/transcript summary helper extraction. |
| src/agents/pi-embedded-runner/run/attempt-prompt.ts | Bootstrap routing/context injection and prompt-boundary preparation helper. |
| src/agents/pi-embedded-runner/run/attempt-transport.ts | Per-turn streamFn, transport override, text-transform, extra-param, and prompt-cache-retention helper. Review fix: streamFn is required and activeSession.sessionId is the single source of truth. |
| src/agents/pi-embedded-runner/run/attempt-lifecycle.ts | Diagnostic run.started / once-only run.completed lifecycle emitter. Review fix: hoisted mock and aborted outcome coverage. |
| src/agents/pi-embedded-runner/run/attempt-stream-wrappers.ts | Ordered stream wrapper stack for cache tracing, transcript sanitation, yield abort, malformed tool-call cleanup/repair, payload logging, and stop-reason recovery. |
| src/agents/pi-embedded-runner/run/runtime-plan-factory.ts | Attempt input builder and RuntimePlan wiring from run.ts. |
| src/agents/pi-embedded-runner/run/lane-workspace.ts | Queue lane planning, tool-result format selection, probe-session detection, abort normalization, workspace context resolution, and fallback logging. Review fix: documents that caller-supplied enqueue opts out of lane routing. |
| src/agents/pi-embedded-runner/run/terminal-result.ts | Success terminal EmbeddedPiRunResult shaping, stop-reason priority, execution trace, request shaping, completion trace, pending hosted tool calls, and silent-empty payloads. |
| src/agents/pi-embedded-runner/run.ts, run/attempt.ts | Delegates extracted seams while keeping core run/recovery behavior in place. |

Review Comments Addressed

| Source | Fix |
| --- | --- |
| #72134 Copilot comment | Reworded V2 registry test comment to avoid claiming Vitest test-file isolation; explicit cleanup is the invariant. |
| #72259/#72261 Copilot comments | Made AttemptTransportSession.agent.streamFn required and removed duplicated sessionId parameter. |
| #72269 Greptile/Copilot comments | Added aborted lifecycle diagnostic test and switched to vi.hoisted mock setup. |
| #72272 Greptile comment | Added comment documenting that caller-provided enqueue opts out of lane routing and the cron deadlock guard. |
| #72276 Copilot comment | Removed redundant direct imports from attempt.ts; the existing re-export statements now carry those helpers alone. |
| #72276 check-lint | Fixed the restrict-template-expressions failure in attempt-transport.ts by normalizing the previous transport value before interpolation. |
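The restrict-template-expressions fix follows a common pattern: convert a value of uncertain type to a string before it reaches a template literal. The sketch below shows the general shape only; `describeTransport` and `formatTransportOverride` are hypothetical names, and the actual change in attempt-transport.ts is not reproduced here.

```typescript
// Hypothetical helpers; the real fix in attempt-transport.ts is not shown here.
function describeTransport(value: unknown): string {
  if (typeof value === "string") return value;
  if (value === undefined || value === null) return "none";
  // JSON.stringify can return undefined at runtime (for example, for functions).
  const json: string | undefined = JSON.stringify(value);
  return json ?? String(value);
}

function formatTransportOverride(previous: unknown, next: string): string {
  // Only strings are interpolated, which is what the lint rule asks for.
  return `transport override: ${describeTransport(previous)} -> ${next}`;
}
```

Normalizing in a helper also avoids `[object Object]` appearing in logs when the previous transport is an object instead of a string.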

What Changed

  • The RuntimePlan finalization cleanup docs/barrels/Harness V2 registry and the structural helper extractions are in one package.
  • attempt.ts and run.ts are thinner by ownership boundary, not cosmetic convenience.
  • Extracted helpers have focused tests for behavior that used to be buried in long functions.
  • The most closure-heavy retry/error recovery paths remain inline to avoid behavioral drift.

What Did Not Change

  • No public plugin API removal.
  • No Harness V2 public plugin API expansion.
  • No pi-embedded-runner package rename beyond compatibility barrels.
  • No WS pooling.
  • No retry/error recovery extraction from the stream loop.
  • No work on prototype PRs #70743 or #70772.

Validation

Passed locally on this branch:

node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts \
  src/agents/harness/v2.test.ts --reporter verbose

./node_modules/.bin/vitest run --config test/vitest/vitest.unit-fast.config.ts \
  src/agents/pi-embedded-runner/run/terminal-result.test.ts --reporter verbose

node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts \
  src/agents/pi-embedded-runner/run/attempt-lifecycle.test.ts \
  src/agents/pi-embedded-runner/run/attempt.test.ts \
  src/agents/pi-embedded-runner/run/attempt.tool-call-normalization.test.ts \
  src/agents/pi-embedded-runner/run/llm-idle-timeout.test.ts \
  src/agents/pi-embedded-runner/run/lane-workspace.test.ts --reporter verbose

./node_modules/.bin/oxlint --tsconfig tsconfig.oxlint.core.json \
  src/agents/harness/v2.test.ts \
  src/agents/pi-embedded-runner/run/attempt.ts \
  src/agents/pi-embedded-runner/run/attempt-transport.ts \
  src/agents/pi-embedded-runner/run/attempt-lifecycle.test.ts \
  src/agents/pi-embedded-runner/run/lane-workspace.ts \
  src/agents/pi-embedded-runner/run.ts \
  src/agents/pi-embedded-runner/run/terminal-result.ts \
  src/agents/pi-embedded-runner/run/terminal-result.test.ts

git diff --check
pnpm check:architecture

OPENCLAW_OXLINT_SKIP_LOCK=1 OPENCLAW_OXLINT_SKIP_PREPARE=1 \
  node scripts/run-oxlint.mjs --tsconfig tsconfig.oxlint.core.json src ui packages --threads=8

Known current-main / stack baseline:

pnpm check:test-types

Still fails on existing ModelCompatConfig.supportsLongCacheRetention / missing @vincentkoc/qrcode-tui drift. No files introduced by this PR appear in the remaining failure list.

Deferred Work

  • Model/auth plan extraction from run.ts if maintainers want another structural slice.
  • Full neutral embedded-runner import guard after compatibility aliases settle.
  • Final GPT-5.4 smoke/handoff doc after this package lands.

Changed files

  • docs/refactor/runtime-plan-finalization-baseline.md (added, +148/-0)
  • docs/refactor/runtime-plan-finalization-complete.md (added, +123/-0)
  • extensions/codex/src/app-server/computer-use.ts (modified, +1/-1)
  • extensions/codex/src/command-formatters.ts (modified, +1/-1)
  • src/agents/embedded-runner/index.ts (added, +31/-0)
  • src/agents/harness/builtin-pi.test.ts (added, +106/-0)
  • src/agents/harness/builtin-pi.ts (modified, +51/-2)
  • src/agents/harness/selection.test.ts (modified, +32/-0)
  • src/agents/harness/selection.ts (modified, +2/-2)
  • src/agents/harness/v2.test.ts (modified, +128/-1)
  • src/agents/harness/v2.ts (modified, +42/-0)
  • src/agents/pi-embedded-runner/aliases.test.ts (modified, +27/-0)
  • src/agents/pi-embedded-runner/index.ts (added, +42/-0)
  • src/agents/pi-embedded-runner/run.ts (modified, +108/-379)
  • src/agents/pi-embedded-runner/run/attempt-lifecycle.test.ts (added, +89/-0)
  • src/agents/pi-embedded-runner/run/attempt-lifecycle.ts (added, +78/-0)
  • src/agents/pi-embedded-runner/run/attempt-message-summary.ts (added, +90/-0)
  • src/agents/pi-embedded-runner/run/attempt-prompt.ts (added, +165/-0)
  • src/agents/pi-embedded-runner/run/attempt-stream-wrappers.ts (added, +212/-0)
  • src/agents/pi-embedded-runner/run/attempt-tools.ts (added, +153/-0)
  • src/agents/pi-embedded-runner/run/attempt-transport.ts (added, +177/-0)
  • src/agents/pi-embedded-runner/run/attempt.ts (modified, +85/-648)
  • src/agents/pi-embedded-runner/run/lane-workspace.test.ts (added, +149/-0)
  • src/agents/pi-embedded-runner/run/lane-workspace.ts (added, +135/-0)
  • src/agents/pi-embedded-runner/run/run-orchestration-helpers.ts (added, +121/-0)
  • src/agents/pi-embedded-runner/run/runtime-plan-factory.test.ts (added, +47/-0)
  • src/agents/pi-embedded-runner/run/runtime-plan-factory.ts (added, +242/-0)
  • src/agents/pi-embedded-runner/run/terminal-result.test.ts (added, +176/-0)
  • src/agents/pi-embedded-runner/run/terminal-result.ts (added, +158/-0)

Code Example

flowchart TD
  RuntimePlan["AgentRuntimePlan\nOpenClaw-owned policy"] --> Pi["Pi embedded runner"]
  RuntimePlan --> Codex["Codex app-server adapter"]
  RuntimePlan --> Tools["Tools + schema + diagnostics"]
  RuntimePlan --> Transcript["Transcript policy"]
  RuntimePlan --> Delivery["Delivery + NO_REPLY"]
  RuntimePlan --> Outcome["Outcome + fallback"]
  RuntimePlan --> Transport["Transport params"]
  RuntimePlan --> Observability["resolvedRef events"]

  HarnessV2["Harness V2 lifecycle"] --> Pi
  HarnessV2 --> Codex
  Selection["Harness selection"] --> HarnessV2

  Pi --> SplitAttempt["attempt modules by domain"]
  Pi --> SplitRun["run orchestration modules"]
  SplitAttempt --> NeutralName["embedded-runner canonical naming"]
  SplitRun --> NeutralName


Summary

This RFC tracks the post-#71722 finalization roadmap for the contract-first Pi/Codex runtime work.

The original RFC #71004 is implemented and closed. The core RuntimePlan boundary is now in main: OpenClaw-owned runtime policy is represented in AgentRuntimePlan, built before attempts, consumed by Pi/Codex paths, covered by parity contracts, and routed through the additive Harness V2 lifecycle.

This RFC is therefore not another GPT-5.4 bug-fix push. It is the cleanup ladder needed to make the merged architecture maintainable: native Harness V2 ownership, runner file splits by contract boundary, neutral embedded-runner naming, docs, and final smoke verification.

Original Evidence And Merged Work

| Item | Role |
| --- | --- |
| #71004 | Original contract-first RFC, now closed as implemented |
| #70743 | GPT-5.4 runtime hardening point fixes |
| #70772 | Pi/Codex harness extension seams |
| #70760 | Harness selection decision observability |
| #70965 | Codex dynamic tools preserve OpenClaw hook contracts |
| #71096 | Full RuntimePlan contract suite and first implementation |
| #71722 | Consolidated RuntimePlan/Harness V2 follow-up package |
| #71196/#71197/#71201/#71220/#71222/#71223/#71224/#71238/#71239 | Superseded by #71722 and preserved there via cherry-pick -x |

Current high-risk structural files:

  • src/agents/pi-embedded-runner/run/attempt.ts - about 3.1k LOC
  • src/agents/pi-embedded-runner/run.ts - about 2.3k LOC
  • extensions/codex/src/app-server/run-attempt.ts - about 1.0k LOC
  • src/agents/harness/v2.ts
  • src/agents/harness/selection.ts
  • src/agents/runtime-plan/{types.ts,build.ts,tools.ts,auth.ts}

Architecture Target

flowchart TD
  RuntimePlan["AgentRuntimePlan\nOpenClaw-owned policy"] --> Pi["Pi embedded runner"]
  RuntimePlan --> Codex["Codex app-server adapter"]
  RuntimePlan --> Tools["Tools + schema + diagnostics"]
  RuntimePlan --> Transcript["Transcript policy"]
  RuntimePlan --> Delivery["Delivery + NO_REPLY"]
  RuntimePlan --> Outcome["Outcome + fallback"]
  RuntimePlan --> Transport["Transport params"]
  RuntimePlan --> Observability["resolvedRef events"]

  HarnessV2["Harness V2 lifecycle"] --> Pi
  HarnessV2 --> Codex
  Selection["Harness selection"] --> HarnessV2

  Pi --> SplitAttempt["attempt modules by domain"]
  Pi --> SplitRun["run orchestration modules"]
  SplitAttempt --> NeutralName["embedded-runner canonical naming"]
  SplitRun --> NeutralName

Desired steady state:

  • OpenClaw owns runtime policy once through RuntimePlan.
  • Pi and Codex consume shared policy instead of reassembling it locally.
  • Harness V2 is the internal lifecycle boundary for selected harness execution.
  • Public plugin AgentHarness remains compatible.
  • pi-embedded-runner stays available as a compatibility path, but internal code and docs prefer neutral embedded-runner naming.

Roadmap

PR 1: Baseline Health Check

Purpose: confirm that current main after #71722 is a clean base, or document unrelated baseline drift, before structural work begins.

Create:

git fetch origin
git switch -c contract-finalization/baseline-health origin/main

Change/add:

  • Add docs/refactor/runtime-plan-finalization-baseline.md
  • No production code changes unless a current-main check failure directly blocks the next PRs.

Detailed steps:

  1. Run pnpm check:architecture.
  2. Run pnpm check:test-types.
  3. Run RuntimePlan contract tests, Harness V2 tests, Codex app-server tests, and embedded-runner focused tests.
  4. Document exact failures and classify each as either current-main baseline drift or finalization-blocking.
  5. If baseline drift blocks every later PR, fix that drift in this PR only and keep it separate from runtime refactors.
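
The classification in step 4 could be recorded with a small helper like the one below. This is a minimal sketch, not code from the repo: the type names, the function name, and the path-prefix heuristic are all assumptions made for illustration. The directory list is taken from the files this roadmap says it will touch.

```typescript
// Hypothetical sketch: none of these names exist in the repo.
type FailureClass = "baseline-drift" | "finalization-blocking";

type BaselineFailure = {
  check: string; // the command that failed, e.g. "pnpm check:test-types"
  file: string; // the path reported by the failing check
};

// Directories the later PRs in this roadmap plan to edit.
const FINALIZATION_PATHS: readonly string[] = [
  "src/agents/runtime-plan/",
  "src/agents/harness/",
  "src/agents/pi-embedded-runner/",
  "extensions/codex/",
];

// Assumed heuristic: a failure blocks finalization only when it sits in a
// directory the later PRs must edit; anything else is baseline drift.
function classifyFailure(failure: BaselineFailure): FailureClass {
  const touchesPlannedCode = FINALIZATION_PATHS.some((prefix) =>
    failure.file.startsWith(prefix),
  );
  return touchesPlannedCode ? "finalization-blocking" : "baseline-drift";
}
```

A path prefix is only a first pass; a failure outside these directories can still block a later PR, so the baseline document should record the reasoning for each entry.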

Suggested validation commands:

pnpm check:architecture
pnpm check:test-types
node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts \
  src/agents/runtime-plan/build.test.ts \
  src/agents/runtime-plan/types.test.ts \
  src/agents/runtime-plan/types.compat.test.ts \
  src/agents/runtime-plan/tools.test.ts \
  src/agents/runtime-plan/tools.diagnostics.test.ts \
  src/agents/harness/v2.test.ts \
  src/agents/harness/selection.test.ts
node scripts/run-vitest.mjs run --config test/vitest/vitest.extensions.config.ts \
  extensions/codex/src/app-server/run-attempt.test.ts \
  extensions/codex/src/app-server/event-projector.test.ts

Open PR:

git push -u 100yenadmin contract-finalization/baseline-health
gh pr create --repo openclaw/openclaw --base main \
  --head 100yenadmin:contract-finalization/baseline-health \
  --draft \
  --title "[codex] Document RuntimePlan finalization baseline" \
  --body-file /tmp/runtime-plan-finalization-baseline-pr.md

PR 2: Native Internal Harness V2 Adoption

Purpose: stop treating Harness V2 as only a generic V1 adapter while keeping public plugin AgentHarness compatible.

Create:

git fetch origin
git switch -c contract-finalization/native-harness-v2 origin/main

Change/add:

  • src/agents/harness/v2.ts
  • src/agents/harness/selection.ts
  • src/agents/harness/builtin-pi.ts
  • extensions/codex/harness.ts
  • src/agents/harness/v2.test.ts
  • src/agents/harness/selection.test.ts
  • extensions/codex/index.test.ts

Detailed steps:

  1. Add internal native V2 factories for built-in Pi and bundled Codex.
  2. Keep registerAgentHarness(...) and public AgentHarness unchanged.
  3. Update harness selection so selected harness execution uses native V2 when available, otherwise adapts V1.
  4. Preserve support priority, fallback behavior, compaction, reset, dispose, cleanup, and classification semantics.
  5. Add tests proving V1 harnesses still work and native V2 harnesses receive the same attempt params/result shape.
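
Step 3 ("native V2 when available, otherwise adapts V1") can be pictured with the sketch below. Every type and function name here is hypothetical; the real AgentHarness and Harness V2 shapes in src/agents/harness/ are not reproduced, and the attempt is shown as synchronous only to keep the example short.

```typescript
// Hypothetical shapes for illustration only.
type SketchAttemptParams = { prompt: string };
type SketchAttemptResult = { text: string; via: "native-v2" | "v1-adapter" };

type SketchHarnessV1 = {
  id: string;
  run(params: SketchAttemptParams): string;
};

type SketchHarnessV2 = {
  id: string;
  runAttempt(params: SketchAttemptParams): SketchAttemptResult;
  dispose(): void;
};

// Wrap a V1 harness so callers only ever see the V2 lifecycle.
function adaptV1(harness: SketchHarnessV1): SketchHarnessV2 {
  return {
    id: harness.id,
    runAttempt: (params) => ({ text: harness.run(params), via: "v1-adapter" }),
    dispose: () => {}, // V1 has no dispose hook in this sketch
  };
}

// Prefer a registered native V2 factory; otherwise adapt the V1 harness.
function selectHarnessV2(
  selected: SketchHarnessV1,
  nativeFactories: Map<string, () => SketchHarnessV2>,
): SketchHarnessV2 {
  const factory = nativeFactories.get(selected.id);
  return factory ? factory() : adaptV1(selected);
}
```

Because both branches return the same V2 shape, a test in the spirit of step 5 can run the same params through a native harness and an adapted one and compare the results.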

Open PR:

git push -u 100yenadmin contract-finalization/native-harness-v2
gh pr create --repo openclaw/openclaw --base main \
  --head 100yenadmin:contract-finalization/native-harness-v2 \
  --draft \
  --title "[codex] Promote Harness V2 as internal lifecycle boundary" \
  --body-file /tmp/native-harness-v2-pr.md

PR 3: Split Pi Attempt Preparation Domains

Purpose: reduce run/attempt.ts without changing behavior.

Create:

git fetch origin
git switch -c contract-finalization/attempt-prep-split origin/main

Change/add:

  • Add src/agents/pi-embedded-runner/run/attempt-tools.ts
  • Add src/agents/pi-embedded-runner/run/attempt-prompt.ts
  • Add src/agents/pi-embedded-runner/run/attempt-transport.ts
  • Keep/use src/agents/pi-embedded-runner/run/attempt.transcript-policy.ts
  • Update src/agents/pi-embedded-runner/run/attempt.ts
  • Add or adjust focused tests only when extraction needs importable helpers.

Detailed steps:

  1. Move tool normalization/diagnostics into attempt-tools.ts.
  2. Move RuntimePlan prompt contribution and cache-boundary preparation into attempt-prompt.ts.
  3. Move transport extra-param/session stream setup helpers into attempt-transport.ts.
  4. Leave the live stream loop in attempt.ts for this PR.
  5. Keep RuntimePlan as the first source of policy, and keep the legacy fallback in place.
  6. No package rename, no public API rename, and no behavior changes.
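
The "RuntimePlan first, legacy fallback second" rule in step 5 is the invariant the extraction must not disturb. A minimal sketch of that lookup order follows; the field names and both functions are assumptions, not the real AgentRuntimePlan shape.

```typescript
// Hypothetical sketch: names are illustrative only.
type SketchToolPolicy = { allowedTools: string[] };
type SketchRuntimePlan = { tools?: SketchToolPolicy };

// Stand-in for whatever local assembly attempt.ts did before RuntimePlan.
function legacyToolPolicy(): SketchToolPolicy {
  return { allowedTools: ["read", "write"] };
}

// The plan is consulted first; the legacy path runs only when the plan, or
// the relevant section of it, is absent.
function resolveToolPolicy(plan: SketchRuntimePlan | undefined): SketchToolPolicy {
  return plan?.tools ?? legacyToolPolicy();
}
```

A helper of this shape can move into attempt-tools.ts unchanged, which is what makes the split a pure file move rather than a behavior change.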

Open PR:

git push -u 100yenadmin contract-finalization/attempt-prep-split
gh pr create --repo openclaw/openclaw --base main \
  --head 100yenadmin:contract-finalization/attempt-prep-split \
  --draft \
  --title "[codex] Split embedded attempt preparation domains" \
  --body-file /tmp/attempt-prep-split-pr.md

PR 4: Split Pi Stream And Lifecycle Domains

Purpose: isolate the actual model loop and cleanup from transcript/session setup.

Create:

git fetch origin
git switch -c contract-finalization/attempt-stream-lifecycle-split origin/main

Change/add:

  • Add src/agents/pi-embedded-runner/run/attempt-stream-loop.ts
  • Add src/agents/pi-embedded-runner/run/attempt-lifecycle.ts
  • Use existing src/agents/pi-embedded-runner/run/attempt.subscription-cleanup.ts
  • Use existing src/agents/pi-embedded-runner/run/attempt.sessions-yield.ts
  • Use existing src/agents/pi-embedded-runner/run/attempt.stop-reason-recovery.ts
  • Update src/agents/pi-embedded-runner/run/attempt.ts

Detailed steps:

  1. Move send/yield/tool execution loop into attempt-stream-loop.ts.
  2. Keep recovery helpers near stream-loop state.
  3. Move terminal meta, liveness, and cleanup handoff into attempt-lifecycle.ts.
  4. Preserve all emitted events, payload/result shapes, retry behavior, and cleanup ordering.
  5. Do not rename files or paths in this PR.
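
One way to keep cleanup ordering intact (step 4) is to let the lifecycle module own a try/finally around the loop. The sketch below assumes that structure; the function names and the turn/cleanup representation are hypothetical and do not mirror the real attempt code.

```typescript
// Hypothetical sketch; names are illustrative only.
type SketchCleanupStep = () => void;

// Stand-in for attempt-stream-loop.ts: runs turns until one reports done.
function runStreamLoop(turns: Array<() => boolean>): number {
  let completed = 0;
  for (const turn of turns) {
    completed += 1;
    if (turn()) break; // a turn returns true when the model is finished
  }
  return completed;
}

// Stand-in for attempt-lifecycle.ts: cleanup always runs, in a fixed order,
// even when the loop throws.
function runAttemptLifecycle(
  turns: Array<() => boolean>,
  cleanupInOrder: SketchCleanupStep[],
): number {
  try {
    return runStreamLoop(turns);
  } finally {
    for (const step of cleanupInOrder) step();
  }
}
```

Recording cleanup calls into a log, as a test might, gives a direct assertion that ordering is identical before and after the split.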

Open PR:

git push -u 100yenadmin contract-finalization/attempt-stream-lifecycle-split
gh pr create --repo openclaw/openclaw --base main \
  --head 100yenadmin:contract-finalization/attempt-stream-lifecycle-split \
  --draft \
  --title "[codex] Split embedded attempt stream lifecycle" \
  --body-file /tmp/attempt-stream-lifecycle-split-pr.md

PR 5: Split Embedded Run Orchestration

Purpose: make run.ts a readable table-of-contents entrypoint.

Create:

git fetch origin
git switch -c contract-finalization/run-orchestration-split origin/main

Change/add:

  • Add src/agents/pi-embedded-runner/run/model-auth-plan.ts
  • Add src/agents/pi-embedded-runner/run/runtime-plan-factory.ts
  • Add src/agents/pi-embedded-runner/run/lane-workspace.ts
  • Add src/agents/pi-embedded-runner/run/terminal-result.ts
  • Update src/agents/pi-embedded-runner/run.ts

Detailed steps:

  1. Extract model/auth resolution and auth profile forwarding into model-auth-plan.ts.
  2. Extract RuntimePlan construction into runtime-plan-factory.ts.
  3. Extract lane/workspace/session setup into lane-workspace.ts.
  4. Extract terminal result shaping and fallback metadata into terminal-result.ts.
  5. Keep run.ts as the orchestration table of contents.
  6. No behavior changes and no API rename in this PR.
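
As a rough picture of the "table of contents" goal, the sketch below shows an entrypoint that only sequences four extracted steps. The function names, parameters, and return shapes are all assumptions; each comment names the planned module the stand-in corresponds to.

```typescript
// Hypothetical sketch: every name and shape here is illustrative.
type SketchAuth = { model: string; authProfile: string };
type SketchPlan = { model: string; authProfile: string; transport: string };
type SketchWorkspace = { lane: string; sessionId: string };
type SketchResult = { ok: boolean; summary: string };

// model-auth-plan.ts
function resolveModelAuth(modelRef: string): SketchAuth {
  return { model: modelRef, authProfile: "default" };
}

// runtime-plan-factory.ts
function buildPlan(auth: SketchAuth): SketchPlan {
  return { ...auth, transport: "http" };
}

// lane-workspace.ts
function prepareWorkspace(sessionId: string): SketchWorkspace {
  return { lane: `lane:${sessionId}`, sessionId };
}

// terminal-result.ts
function shapeResult(plan: SketchPlan, workspace: SketchWorkspace): SketchResult {
  return { ok: true, summary: `${plan.model} on ${workspace.lane}` };
}

// run.ts: reads top to bottom as a table of contents.
function runEmbedded(modelRef: string, sessionId: string): SketchResult {
  const auth = resolveModelAuth(modelRef);
  const plan = buildPlan(auth);
  const workspace = prepareWorkspace(sessionId);
  return shapeResult(plan, workspace);
}
```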

Open PR:

git push -u 100yenadmin contract-finalization/run-orchestration-split
gh pr create --repo openclaw/openclaw --base main \
  --head 100yenadmin:contract-finalization/run-orchestration-split \
  --draft \
  --title "[codex] Split embedded run orchestration" \
  --body-file /tmp/run-orchestration-split-pr.md

PR 6: Canonical Neutral Embedded Runner Naming

Purpose: make naming match reality: Pi and Codex both flow through the embedded runner orchestration.

Create:

git fetch origin
git switch -c contract-finalization/embedded-runner-canonical-names origin/main

Change/add:

  • src/agents/embedded-runner.ts
  • src/agents/embedded-runner/index.ts
  • src/agents/pi-embedded-runner.ts
  • src/agents/pi-embedded.ts
  • src/plugins/runtime/runtime-agent.ts
  • src/plugins/runtime/types-core.ts
  • src/agents/pi-embedded-runner/aliases.test.ts
  • docs/pi.md
  • docs/concepts/agent-loop.md
  • docs/concepts/agent-runtimes.md
  • docs/plugins/sdk-runtime.md

Detailed steps:

  1. Make neutral exports canonical internally.
  2. Keep Pi-named exports as deprecated compatibility aliases.
  3. Add or preserve alias identity tests.
  4. Add a warning-only import guard for new core imports from old Pi paths, excluding compatibility barrels and tests.
  5. Update docs to explain Pi vs Codex ownership accurately.
  6. Do not remove old import paths.
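
Steps 2 to 4 can be illustrated together: a deprecated alias that is the same function object as the canonical export, and a guard that only reports legacy imports. This is a sketch under assumptions; the function names are hypothetical, and only the two compatibility barrel paths come from the file list above.

```typescript
// Hypothetical sketch: function names are illustrative only.

// Canonical, neutrally named implementation.
function runEmbeddedSketch(prompt: string): string {
  return `ran: ${prompt}`;
}

/** @deprecated Use runEmbeddedSketch; kept so old Pi import paths keep working. */
const runPiEmbeddedSketch = runEmbeddedSketch;

// Compatibility barrels that are allowed to reference the old Pi paths.
const COMPAT_BARRELS: readonly string[] = [
  "src/agents/pi-embedded-runner.ts",
  "src/agents/pi-embedded.ts",
];

// Warning-only guard: returns messages and never throws, so it cannot fail
// a build on its own.
function legacyImportWarnings(file: string, specifiers: string[]): string[] {
  const exempt = file.endsWith(".test.ts") || COMPAT_BARRELS.includes(file);
  if (exempt) return [];
  return specifiers
    .filter((specifier) => specifier.includes("pi-embedded"))
    .map((specifier) => `warning: ${file} imports legacy path ${specifier}`);
}
```

An alias identity test then reduces to a strict-equality check between the two names, which catches the case where someone replaces the alias with a wrapper that can drift.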

Open PR:

git push -u 100yenadmin contract-finalization/embedded-runner-canonical-names
gh pr create --repo openclaw/openclaw --base main \
  --head 100yenadmin:contract-finalization/embedded-runner-canonical-names \
  --draft \
  --title "[codex] Canonicalize embedded runner naming" \
  --body-file /tmp/embedded-runner-canonical-names-pr.md

PR 7: Final Verification And Maintainer Handoff

Purpose: prove final architecture is stable and document what remains optional.

Create:

git fetch origin
git switch -c contract-finalization/final-smoke-docs origin/main

Change/add:

  • Add docs/refactor/runtime-plan-finalization-complete.md
  • Update docs only; no runtime behavior changes.

Detailed steps:

  1. Smoke-test the openai/*, openai-codex/*, codex/*, and codex-cli/* GPT-5.4 paths.
  2. Verify tools, auth profile routing, prompt overlays, transcript repair, delivery, fallback classification, schema normalization, transport params, and resolvedRef observability.
  3. Document optional deferred work: the WS pooling default-on decision, and a public Harness V2 plugin API only if maintainers request one.
  4. Add final maintainer handoff summary with links to every finalization PR.
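
Steps 1 and 2 together describe a route-by-check matrix. The sketch below builds that matrix as a checklist so the handoff document can show which cells were exercised. The route and check lists are copied from the steps above; the types and function names are hypothetical.

```typescript
// Routes and checks as listed in the detailed steps above.
const SMOKE_ROUTES = ["openai/*", "openai-codex/*", "codex/*", "codex-cli/*"] as const;
const SMOKE_CHECKS = [
  "tools",
  "auth profile routing",
  "prompt overlays",
  "transcript repair",
  "delivery",
  "fallback classification",
  "schema normalization",
  "transport params",
  "resolvedRef observability",
] as const;

// Hypothetical record shape: null means the cell has not been run yet.
type SmokeCase = { route: string; check: string; passed: boolean | null };

// Build one entry per (route, check) pair.
function buildSmokeMatrix(): SmokeCase[] {
  const cases: SmokeCase[] = [];
  for (const route of SMOKE_ROUTES) {
    for (const check of SMOKE_CHECKS) {
      cases.push({ route, check, passed: null });
    }
  }
  return cases;
}

// Anything not explicitly passed still needs attention before handoff.
function pendingCases(cases: SmokeCase[]): SmokeCase[] {
  return cases.filter((entry) => entry.passed !== true);
}
```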

Open PR:

git push -u 100yenadmin contract-finalization/final-smoke-docs
gh pr create --repo openclaw/openclaw --base main \
  --head 100yenadmin:contract-finalization/final-smoke-docs \
  --draft \
  --title "[codex] Document RuntimePlan finalization handoff" \
  --body-file /tmp/final-smoke-docs-pr.md

Clean Mapping: Plan Items To Code

| Plan Item | Main Code Files | Tests/Docs |
| --- | --- | --- |
| Baseline health | no production changes by default | docs/refactor/runtime-plan-finalization-baseline.md |
| Native Harness V2 | src/agents/harness/v2.ts, src/agents/harness/selection.ts, src/agents/harness/builtin-pi.ts, extensions/codex/harness.ts | src/agents/harness/v2.test.ts, src/agents/harness/selection.test.ts, extensions/codex/index.test.ts |
| Attempt prep split | src/agents/pi-embedded-runner/run/attempt.ts, new attempt-tools.ts, attempt-prompt.ts, attempt-transport.ts | focused attempt/runtime-plan tests |
| Attempt stream/lifecycle split | src/agents/pi-embedded-runner/run/attempt.ts, new attempt-stream-loop.ts, attempt-lifecycle.ts | focused attempt/incomplete-turn/cleanup tests |
| Run orchestration split | src/agents/pi-embedded-runner/run.ts, new model-auth-plan.ts, runtime-plan-factory.ts, lane-workspace.ts, terminal-result.ts | overflow/fallback/auth/runtime-plan tests |
| Neutral naming | src/agents/embedded-runner.ts, src/agents/embedded-runner/index.ts, src/agents/pi-embedded-runner.ts, src/agents/pi-embedded.ts, plugin runtime types | alias tests and docs |
| Final handoff | no production changes | docs/refactor/runtime-plan-finalization-complete.md |

Review Rules

  • No PR combines behavior changes with file moves or naming changes.
  • No PR resumes #70743/#70772 prototype churn.
  • No public plugin API removal.
  • Keep old Pi import paths working.
  • Each PR must include a Mermaid diagram, file map, validation commands, and “what did not change.”
  • If pnpm check:test-types fails on unrelated repo baseline drift, document exact failures and keep targeted suites as the gate.
  • Start every PR from fresh origin/main; do not stack unless maintainers explicitly ask.
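
The per-PR body requirements could be checked mechanically before opening a draft. The following is only a sketch: the matching patterns are assumptions about how a PR body might label each section, and the function name is hypothetical.

```typescript
// Assumed markers for the four required parts of every PR body.
const REQUIRED_PR_SECTIONS: Array<{ label: string; pattern: RegExp }> = [
  { label: "Mermaid diagram", pattern: /mermaid/i },
  { label: "file map", pattern: /file map/i },
  { label: "validation commands", pattern: /validation/i },
  { label: "what did not change", pattern: /what did not change/i },
];

// Return the labels of any required sections the body does not mention.
function missingPrSections(body: string): string[] {
  return REQUIRED_PR_SECTIONS
    .filter((section) => !section.pattern.test(body))
    .map((section) => section.label);
}
```

An empty result means all four parts are at least mentioned; it does not judge their quality, which remains a reviewer's call.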

Acceptance Criteria

Final solution is complete when:

  • RuntimePlan contracts remain green.
  • Harness V2 is the internal lifecycle boundary for selected harness execution.
  • attempt.ts and run.ts are split by ownership boundary, not cosmetic convenience.
  • Neutral embedded-runner naming is canonical while Pi names remain compatibility aliases.
  • Docs accurately explain Pi vs Codex ownership.
  • GPT-5.4 smoke passes across OpenAI, OpenAI-Codex, Codex, and Codex CLI routes.

Follow-Up Actions After Filing

  1. Link this RFC from any new finalization PR body.
  2. Keep #71004 closed and reference it only as completed predecessor context.
  3. Create PR 1 first to establish the finalization baseline.
  4. Proceed through PRs in order unless maintainers request consolidation or reordering.
  5. Treat WS pooling default-on and public Harness V2 plugin API as optional future decisions, not part of this RFC’s required closure.

extent analysis

TL;DR

Create a series of targeted PRs to finalize the contract-first Pi/Codex runtime work, starting with a baseline health check and proceeding through native Harness V2 adoption, code splits, and neutral naming changes.

Guidance

  1. Start with a baseline health check: Create a PR to confirm the current state of the main branch after #71722, documenting any unrelated baseline drift before proceeding with structural work.
  2. Adopt native Harness V2: Update src/agents/harness/v2.ts and related files to promote Harness V2 as the internal lifecycle boundary, keeping public plugin AgentHarness compatible.
  3. Split code by ownership boundary: Split attempt.ts and run.ts into smaller, more manageable files, using ownership boundaries rather than cosmetic convenience as the guiding principle.
  4. Implement neutral embedded-runner naming: Make neutral exports canonical internally, keeping Pi-named exports as deprecated compatibility aliases, and update docs to reflect the change.

Example

No specific code example is provided, as the issue focuses on high-level architecture and process changes rather than specific code implementations.

Notes

The issue is a detailed roadmap for finalizing the contract-first Pi/Codex runtime work, with one PR per goal. Proceed through the PRs in the listed order, starting each from fresh origin/main rather than stacking it on the previous branch, and confirm that the final result meets the stated acceptance criteria.

Recommendation

Open the proposed PRs in order, starting with the baseline health check, to reach a clean and maintainable architecture for the Pi/Codex runtime work.
