openclaw - ✅(Solved) Fix sessions_spawn has no warm-reuse path: every delegation pays cold-start tax [1 pull requests, 1 participants]

GitHub stats
openclaw/openclaw#64347 · Fetched 2026-04-11 06:15:18
Comments: 0 · Participants: 1 · Timeline: 1 · Reactions: 0
Timeline (top): cross-referenced ×1

Fix Action

Fixed

PR fix notes

PR #64468: fix(agents): persist bootstrap marker after clean sessions_yield

Description (problem / solution / changelog)

Summary

  • Problem: clean sessions_yield turns skipped persisting openclaw:bootstrap-context:full, so continuation-skip could miss bootstrap completion and trigger full reinjection on follow-up relay/user turns.
  • Why it matters: unnecessary prompt-cache invalidation and avoidable token/cost overhead on subagent relay flows.
  • What changed: extracted marker persistence gating into shouldPersistCompletedBootstrapTurn(...) and allowed clean yield exits while keeping prompt/abort/compaction safety guards.
  • What did NOT change (scope boundary): no new run kinds, no relay transport/protocol changes, no compaction policy changes.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #64346
  • Related #64347
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: marker persistence was blocked by yieldAborted, even when the abort was an intentional, clean sessions_yield handoff.
  • Missing detection / guardrail: no explicit unit guard around marker persistence semantics for clean yield exits.
  • Contributing context (if known): continuation-skip relies on completed-marker presence in session history.
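
A minimal sketch of the gate decision described above, to make the contract concrete. Only the name shouldPersistCompletedBootstrapTurn and the yieldAborted flag come from the PR notes; the AttemptOutcome shape and its other field names are assumptions for illustration, not the PR's actual signature.

```typescript
// Hypothetical outcome shape; field names other than yieldAborted are assumed.
interface AttemptOutcome {
  promptError: boolean;        // the prompt itself failed
  aborted: boolean;            // the run was aborted for any reason
  yieldAborted: boolean;       // the abort came from a clean sessions_yield handoff
  compactionTimedOut: boolean; // compaction hit its timeout
  compactionOccurred: boolean; // compaction ran during this attempt
}

function shouldPersistCompletedBootstrapTurn(outcome: AttemptOutcome): boolean {
  if (outcome.promptError) return false;
  if (outcome.compactionTimedOut || outcome.compactionOccurred) return false;
  // A hard abort still blocks the marker; a clean yield no longer does.
  if (outcome.aborted && !outcome.yieldAborted) return false;
  return true;
}

const clean: AttemptOutcome = {
  promptError: false,
  aborted: false,
  yieldAborted: false,
  compactionTimedOut: false,
  compactionOccurred: false,
};

// Clean sessions_yield exit: marker persists.
console.log(shouldPersistCompletedBootstrapTurn({ ...clean, aborted: true, yieldAborted: true })); // true
// Hard abort: marker stays blocked.
console.log(shouldPersistCompletedBootstrapTurn({ ...clean, aborted: true })); // false
```

Keeping the decision in one pure function is what makes the unit-level regression test in the plan below practical: each failure mode becomes a single input case.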

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/agents/pi-embedded-runner/run/attempt.spawn-workspace.bootstrap-marker.test.ts
  • Scenario the test should lock in: marker persists for clean sessions_yield exits and remains blocked for prompt/abort/compaction failure modes.
  • Why this is the smallest reliable guardrail: the regression is in the marker gate decision itself.
  • Existing test that already covers this (if any): none.
  • If no new test is added, why not: N/A

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation: N/A

Human Verification (required)

  • Verified scenarios:
    • pnpm test src/agents/pi-embedded-runner/run/attempt.spawn-workspace.bootstrap-marker.test.ts src/agents/pi-embedded-runner/run/attempt.spawn-workspace.context-injection.test.ts
    • New marker-gate assertions pass.
    • Existing context-injection tests still pass.
  • Edge cases checked:
    • prompt error
    • aborted run
    • compaction timeout
    • compaction occurred during attempt
  • What you did not verify:
    • live end-to-end relay flow against a real provider session.

Risks and Mitigations

  • Risk:
    • Marker could persist on a yield exit that is not continuation-safe.
    • Mitigation:
      • Existing blocks are retained for prompt errors, hard aborts, compaction timeout, and compaction-occurred attempts.
      • Added focused unit coverage for the gate contract.

Changed files

  • src/agents/pi-embedded-runner/run/attempt.spawn-workspace.bootstrap-marker.test.ts (added, +78/-0)
  • src/agents/pi-embedded-runner/run/attempt.thread-helpers.ts (modified, +19/-0)
  • src/agents/pi-embedded-runner/run/attempt.ts (modified, +9/-6)


sessions_spawn has no warm-reuse path — every delegation pays cold-start tax

OpenClaw version: 2026.4.8 (9ece252)
Severity: Cost / performance (per-delegation tax, ~$0.05 on Sonnet)
Impact: Every call to sessions_spawn creates a brand-new ephemeral subagent session with a fresh UUID, forcing a full cold-start workspace bootstrap cache write on the subagent on every delegation, even when the same subagent was called seconds earlier with an identical task type.

Symptom

# Delegation 1 to Gilfoyle
# Session key generated: agent:gilfoyle:subagent:761a0cd4-3d94-4ff6-8653-12a0e6fa40cf
# cacheWrite: 15,377 tokens (~$0.058)
# Cost: $0.059

# Delegation 2 to Gilfoyle, 30 seconds later, identical workspace
# Session key generated: agent:gilfoyle:subagent:1bee5884-6595-4be3-aeaa-c6abe96230a5
# cacheWrite: 12,747 tokens (~$0.050)  ← cold again
# Cost: $0.050

Each delegation pays ~$0.05–0.06 of cold-start bootstrap tax on the subagent side, regardless of proximity in time.

Root cause

File: dist/pi-embedded-CNTNdlGw.js
Line: 13901

const childSessionKey = `agent:${targetAgentId}:subagent:${crypto.randomUUID()}`;

crypto.randomUUID() is called unconditionally on every sessions_spawn. There is:

  • No check for an existing warm subagent session for the same target agent
  • No config flag to enable reuse
  • No session pool

Contrast with direct agent invocation via openclaw agent --agent gilfoyle --message ..., which uses the stable key agent:gilfoyle:main and benefits from warm cache reuse across calls.

Config schema (for reference)

File: dist/plugin-sdk/src/config/types.agent-defaults.d.ts

The current subagent config surface is limited to spawn policy, not session reuse:

subagents?: {
    allowAgents?: string[];
    maxConcurrent?: number;
    maxSpawnDepth?: number;
    maxChildrenPerAgent?: number;
    archiveAfterMinutes?: number;   // cleanup after N minutes
    model?: AgentModelConfig;
    thinking?: ...;
    runTimeoutSeconds?: number;
    announceTimeoutMs?: number;
    requireAgentId?: boolean;
};

No reuseExisting, no warmSessionKey, no pooling.

Impact examples

For a fleet that uses a Chief of Staff / CTO / technician orchestration pattern (parent → Gilfoyle → Dinesh), where each layer uses sessions_spawn:

| Workflow | Delegations/call | Cold-start tax |
| --- | --- | --- |
| User asks "review X" (lookup) | 1 × Gilfoyle + 0 × Dinesh | $0.05 |
| User asks "diagnose and fix Y" (multi-step) | 1 × Gilfoyle + 2 × Dinesh recon | $0.15 |
| Comparison-mode model selection (Gilfoyle → 3 executors) | 4 subagents | $0.20 |

Nothing in these delegations requires session isolation — they're all routine delegations that could reuse a warm session pool per target agent.

Proposed fix

Option A (minimal change, opt-in): Add subagents.reuseKey?: string so callers can opt into a stable key:

subagents?: {
    // ...existing fields...
    /** If set, sessions_spawn reuses an existing session with this key for the target agent instead of generating a new UUID. */
    reuseKey?: string;
};

Then pi-embedded-CNTNdlGw.js:13901 becomes:

const childSessionKey = config.subagents?.reuseKey
    ? `agent:${targetAgentId}:subagent:${config.subagents.reuseKey}`
    : `agent:${targetAgentId}:subagent:${crypto.randomUUID()}`;

Callers can set a per-agent reuse key (e.g. "warm-pool") and get cache reuse across delegations.
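
To illustrate Option A, the sketch below wraps the proposed expression in a function and shows that two delegations with the same reuseKey resolve to the same session key, while the default path still produces a fresh UUID each time. The function name resolveChildSessionKey and the trimmed SubagentsConfig interface are illustrative assumptions; reuseKey itself is the field proposed in this issue, not an existing option.

```typescript
import { randomUUID } from "node:crypto";

// Trimmed, hypothetical view of the config: only the proposed field is modeled.
interface SubagentsConfig {
  reuseKey?: string;
}

function resolveChildSessionKey(targetAgentId: string, subagents?: SubagentsConfig): string {
  // Truthiness check, as in the proposed one-liner: unset or empty falls back to a UUID.
  const suffix = subagents?.reuseKey || randomUUID();
  return `agent:${targetAgentId}:subagent:${suffix}`;
}

const warm: SubagentsConfig = { reuseKey: "warm-pool" };

console.log(resolveChildSessionKey("gilfoyle", warm));
// agent:gilfoyle:subagent:warm-pool

// Same key on every delegation, so the session can be reused.
console.log(resolveChildSessionKey("gilfoyle", warm) === resolveChildSessionKey("gilfoyle", warm)); // true

// Default path is unchanged: a new key per spawn.
console.log(resolveChildSessionKey("gilfoyle") === resolveChildSessionKey("gilfoyle")); // false
```

A shared key also means that concurrent delegations to the same agent would land in one session, so callers that spawn in parallel may want distinct keys per lane.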

Option B (automatic pooling, bigger change): Detect that a subagent session for the target already exists in the sessions store, hasn't expired (per archiveAfterMinutes), and isn't currently running; reuse it. Safer semantics but requires locking to handle concurrent spawns.
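
A rough, in-memory sketch of the Option B selection rule: reuse a session for the target agent only if it is idle and not past the archive window, otherwise create a new one. SubagentSessionPool, SessionRecord, acquire, and release are hypothetical names, and the real sessions store is not modeled here. The check-and-mark step is safe in this sketch only because it runs synchronously in one process; a shared store would need the locking mentioned above.

```typescript
import { randomUUID } from "node:crypto";

interface SessionRecord {
  key: string;
  agentId: string;
  lastUsedMs: number;
  running: boolean;
}

class SubagentSessionPool {
  private readonly sessions = new Map<string, SessionRecord>();
  private readonly ttlMs: number;

  constructor(archiveAfterMinutes: number) {
    this.ttlMs = archiveAfterMinutes * 60000;
  }

  // Return an idle, unexpired session for the agent, or create a fresh one.
  acquire(agentId: string, nowMs: number): SessionRecord {
    for (const session of this.sessions.values()) {
      const expired = nowMs - session.lastUsedMs >= this.ttlMs;
      if (session.agentId === agentId && !session.running && !expired) {
        session.running = true;
        session.lastUsedMs = nowMs;
        return session;
      }
    }
    const fresh: SessionRecord = {
      key: `agent:${agentId}:subagent:${randomUUID()}`,
      agentId,
      lastUsedMs: nowMs,
      running: true,
    };
    this.sessions.set(fresh.key, fresh);
    return fresh;
  }

  // Mark the session idle so a later spawn can reuse it.
  release(key: string, nowMs: number): void {
    const session = this.sessions.get(key);
    if (session) {
      session.running = false;
      session.lastUsedMs = nowMs;
    }
  }
}

const pool = new SubagentSessionPool(60);
const first = pool.acquire("gilfoyle", 0);
pool.release(first.key, 1000);

const second = pool.acquire("gilfoyle", 31000); // 30 seconds later, idle and warm
console.log(second.key === first.key); // true

const third = pool.acquire("gilfoyle", 32000); // the warm session is still running
console.log(third.key === first.key); // false
```

The sketch never evicts expired records; a real implementation would presumably tie eviction to the existing archiveAfterMinutes cleanup.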

Option C (treat subagent as main): Provide a sessions_spawn flag reuseMain: true that routes the task into the target agent's main session key (agent:<id>:main) instead of creating a subagent session at all. Caller accepts the shared-history tradeoff.

Estimated saving

On a fleet doing 20 delegations/hour, at $0.05 cold-start per delegation:

  • Current: $0.05 × 20 × 24 × 30 = ~$720/month of pure cold-start tax
  • After fix (any of the three options): ~$36/month ($0.05 × 1 × 24 × 30, assuming only the first delegation per hour pays)

For a light fleet doing 5 delegations/day, the total cold-start tax is ~$7.50/month ($0.05 × 5 × 30), which caps the possible savings. Small in absolute terms, but the tax grows linearly with delegation volume.
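
The estimate above can be reproduced with a few lines of arithmetic. This is a back-of-envelope sketch: the $0.05 figure is the per-delegation cold-start cost reported in this issue, the 24 × 30 month matches the formula above, and the function name is illustrative. The one-cold-start-per-hour case assumes the warm cache survives for the rest of the hour, which is the issue's stated assumption rather than a measured value.

```typescript
const COLD_START_USD = 0.05; // per-delegation cold-start cost reported in the issue
const HOURS_PER_DAY = 24;
const DAYS_PER_MONTH = 30;

// Monthly cold-start tax for a given number of cold starts per hour.
function monthlyColdStartTaxUsd(coldStartsPerHour: number): number {
  return COLD_START_USD * coldStartsPerHour * HOURS_PER_DAY * DAYS_PER_MONTH;
}

const current = monthlyColdStartTaxUsd(20);  // every delegation is cold
const withReuse = monthlyColdStartTaxUsd(1); // only the first delegation each hour is cold

console.log(current.toFixed(2));               // 720.00
console.log(withReuse.toFixed(2));             // 36.00
console.log((current - withReuse).toFixed(2)); // 684.00
```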

Related

  • Relay turns also pay a separate cold cache rewrite on the parent side — see 01-relay-cache-invalidation.md. Together, the two issues make Reeves → Gilfoyle → paragraph summary cost ~$0.25–0.40 per call when the underlying reasoning is worth maybe $0.03–0.05.
  • Documentation-lookup tasks can bypass this entirely by having the parent agent read directly rather than delegate. But reasoning tasks (diagnosis, design, review) still pay the tax.

extent analysis

TL;DR

Implementing a session reuse mechanism, such as adding a reuseKey option to the subagents config, can help reduce the cold-start tax by reusing existing warm subagent sessions.

Guidance

  • Identify the subagents config and consider adding a reuseKey option to enable session reuse for target agents.
  • Evaluate the tradeoffs between the proposed fix options: minimal change with opt-in reuseKey, automatic pooling, or treating subagent as main.
  • Assess the potential cost savings by estimating the number of delegations per hour and the cold-start tax per delegation.
  • Consider the related issue of relay cache invalidation and its impact on overall costs.

Example

subagents?: {
    // ...existing fields...
    /** If set, sessions_spawn reuses an existing session with this key for the target agent instead of generating a new UUID. */
    reuseKey?: string;
};

// Proposed key derivation: use the stable reuseKey when set, otherwise a fresh UUID.
const childSessionKey = config.subagents?.reuseKey
    ? `agent:${targetAgentId}:subagent:${config.subagents.reuseKey}`
    : `agent:${targetAgentId}:subagent:${crypto.randomUUID()}`;

Notes

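The options trade simplicity against safety. Option A is opt-in and changes a single expression, but leaves key management to the caller. Option B reuses sessions automatically but needs locking to handle concurrent spawns. Option C avoids subagent sessions entirely, at the cost of sharing history with the target agent's main session.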

Recommendation

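Of the three options, adding reuseKey to the subagents config (Option A) is the smallest change: callers opt into session reuse explicitly and the default behavior stays the same. Note that reuseKey is a proposal from the issue, not an existing config field; the PR notes for #64468 describe a bootstrap-marker persistence fix for sessions_yield and do not mention a reuse key.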
