openclaw - ✅ (Solved) Fix ACP: shouldReplaceEnsuredSession causes infinite replace loop with acpx 0.3.1 (sessions spawn fails) [1 pull request, 5 comments, 4 participants]

GitHub stats
openclaw/openclaw#56855 · Fetched 2026-04-08 01:46:50
Comments: 5 · Participants: 4 · Timeline: 11 · Reactions: 0
Timeline (top): commented ×5, cross-referenced ×4, mentioned ×1, subscribed ×1

ACP sessions fail to initialize after updating to OpenClaw 2026.3.28. sessions_spawn and /acp spawn both return:

Session binding adapter failed to bind target conversation

The gateway logs show an infinite loop:

[plugins] acpx ensureSession replacing dead named session: session=agent:claude:acp:<uuid> cwd=... status=dead summary=queue owner unavailable
[plugins] acpx ensureSession replacing dead named session: session=agent:claude:acp:<uuid> cwd=... status=dead summary=queue owner unavailable
...

Root Cause

In acpx 0.3.1, sessions new / sessions ensure uses lazy start — the agent process is not spawned until the first prompt is sent. Until then, acpx claude status reports status: dead.

OpenClaw 2026.3.28's shouldReplaceEnsuredSession checks status immediately after sessions ensure completes, gets dead, and triggers a session replacement. The replacement creates a new session which is also immediately dead, causing the infinite loop.
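
The loop described above can be reproduced in a toy model. This is only a sketch, not OpenClaw or acpx code: `FakeAcpx`, `ensure_with_replace`, and the `max_replacements` cap are hypothetical names, and the cap exists purely so the demo terminates. It shows why re-checking status right after a lazy-start ensure can never converge, while a first prompt does bring the session up.

```python
from dataclasses import dataclass, field


@dataclass
class FakeAcpx:
    """Toy stand-in for acpx 0.3.1: a session stays 'dead' until its first prompt."""
    prompted: set = field(default_factory=set)
    created: int = 0

    def ensure(self, name: str) -> None:
        # Lazy start: the session is registered but no agent process is spawned.
        self.created += 1

    def status(self, name: str) -> str:
        return "running" if name in self.prompted else "dead"

    def prompt(self, name: str, text: str) -> None:
        # The first prompt is what actually starts the agent.
        self.prompted.add(name)


def ensure_with_replace(acpx: FakeAcpx, name: str, max_replacements: int) -> bool:
    """Model of the reported behavior: replace whenever status is 'dead'.

    Returns True if a live session was observed, False if the cap was hit.
    """
    acpx.ensure(name)
    for _ in range(max_replacements):
        if acpx.status(name) != "dead":
            return True
        # The replacement is lazily started too, so it is dead as well.
        acpx.ensure(name)
    return False


acpx = FakeAcpx()
session = "agent:claude:acp:demo"
converged = ensure_with_replace(acpx, session, max_replacements=5)
print(converged, acpx.created)

# Sending a prompt, rather than replacing again, brings the session up.
acpx.prompt(session, "say hi")
print(acpx.status(session))
```

Without the cap, the replace branch would repeat indefinitely, which matches the repeated "replacing dead named session" log line.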

PR fix notes

PR #58669: fix(acpx): repair queue owner session recovery

Description (problem / solution / changelog)

Summary

  • Problem: ACP gateway sessions could keep a named session in status=dead when acpx reported queue owner unavailable, then hand that dead handle to the first prompt.
  • Why it matters: sessions_spawn could still fail on the first turn for Claude and intermittently for other ACP agents even though the runtime logged that the dead session was "recoverable".
  • What changed: the acpx runtime now repairs that dead named session by creating a replacement owner, resuming the backend session when a stable session id is available, and falling back to a fresh named session when it is not.
  • What did NOT change (scope boundary): this does not change unrelated ACP status handling, config defaults, or non-queue-owner dead-session recovery.
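
The repair decision summarized above can be sketched as a small pure function. This is an illustration under assumptions, not the code in extensions/acpx/src/runtime.ts: `StatusReport`, its `session_id` field, and `plan_repair` are hypothetical names, and the returned strings are labels rather than real acpx commands.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StatusReport:
    status: str
    summary: str = ""
    # Hypothetical field standing in for the "stable session id" in the PR text.
    session_id: Optional[str] = None


def plan_repair(report: StatusReport) -> str:
    """Return a label for the action this sketch would take."""
    queue_owner_dead = (
        report.status == "dead" and "queue owner unavailable" in report.summary
    )
    if not queue_owner_dead:
        # Scope boundary: other status handling is left as it was.
        return "unchanged"
    if report.session_id:
        # A stable id is available, so resume to preserve continuity.
        return "resume"
    # No resumable id at all: fall back to a fresh named session.
    return "fresh"


print(plan_repair(StatusReport("dead", "queue owner unavailable", "abc123")))
print(plan_repair(StatusReport("dead", "queue owner unavailable")))
print(plan_repair(StatusReport("running")))
```

Keeping the decision separate from the side effects makes each branch easy to assert on in a unit test.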

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #58659
  • Related #56855
  • This PR fixes a bug or regression

Root Cause / Regression History (if applicable)

  • Root cause: extensions/acpx/src/runtime.ts treated status=dead with summary~="queue owner unavailable" as a session to keep, so ensureSession() could return a dead handle that then failed on the first prompt.
  • Missing detection / guardrail: the regression tests only asserted that OpenClaw stopped replacing those dead sessions in a loop; they did not assert that the repaired session was usable for the next turn.
  • Prior context (git blame, prior PR, issue, or refactor if known): the queue-owner path was special-cased in 1c95c41c37 to avoid an infinite replace loop.
  • Why this regressed now: the earlier fix avoided the loop but still left the queue-owner recovery path returning the dead named session instead of repairing it.
  • If unknown, what was ruled out: ruled out a simple TTL-only fix because the reported failure still reproduces after increasing queueOwnerTtlSeconds.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: extensions/acpx/src/runtime.test.ts
  • Scenario the test should lock in: when status reports dead plus queue owner unavailable, the runtime should repair the named session, resume when a stable session id exists, and fall back to a fresh named session when it does not.
  • Why this is the smallest reliable guardrail: the bug lives entirely in the ACPX runtime control-flow around sessions ensure, status, and sessions new.
  • Existing test that already covers this (if any): the prior dead-session tests covered the queue-owner branch but asserted the wrong behavior for this issue.
  • If no new test is added, why not: N/A
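
A Python analogue of the scenario this plan wants locked in might look like the sketch below. The real tests live in extensions/acpx/src/runtime.test.ts and are not shown in this page; `FakeCli`, `ensure_session`, the `id` key, and the call labels are all hypothetical. The point is that the test asserts on the full call sequence, so it fails both if the runtime loops and if it skips the repair.

```python
import unittest


class FakeCli:
    """Records calls and always reports a dead queue-owner session."""

    def __init__(self, session_id=None):
        self.session_id = session_id
        self.calls = []

    def ensure(self, name):
        self.calls.append("ensure")

    def status(self, name):
        self.calls.append("status")
        return {
            "status": "dead",
            "summary": "queue owner unavailable",
            "id": self.session_id,  # hypothetical key for the stable session id
        }

    def resume(self, name, session_id):
        self.calls.append("resume")

    def new(self, name):
        self.calls.append("new")


def ensure_session(cli, name):
    """Hypothetical stand-in for the runtime's ensureSession()."""
    cli.ensure(name)
    report = cli.status(name)
    if report["status"] == "dead" and "queue owner unavailable" in report["summary"]:
        # Repair exactly once; status is not re-checked in a loop.
        if report["id"]:
            cli.resume(name, report["id"])
        else:
            cli.new(name)
    return name


class QueueOwnerRepairTest(unittest.TestCase):
    def test_resumes_when_stable_id_exists(self):
        cli = FakeCli(session_id="abc123")
        ensure_session(cli, "s")
        self.assertEqual(cli.calls, ["ensure", "status", "resume"])

    def test_falls_back_to_fresh_session_without_id(self):
        cli = FakeCli(session_id=None)
        ensure_session(cli, "s")
        self.assertEqual(cli.calls, ["ensure", "status", "new"])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(QueueOwnerRepairTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```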

User-visible / Behavior Changes

  • ACP sessions that hit queue owner unavailable during initialization now repair the dead named session instead of returning a dead handle to the first turn.
  • When ACPX exposes a stable session id for that dead session, OpenClaw resumes it to preserve continuity.

Diagram (if applicable)

Before:
[sessions ensure] -> [status=dead queue owner unavailable] -> [retain dead session] -> [first prompt fails]

After:
[sessions ensure] -> [status=dead queue owner unavailable] -> [repair named session owner] -> [first prompt uses repaired session]

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: macOS 25.3.0
  • Runtime/container: Node 22 / pnpm workspace
  • Model/provider: ACPX mock runtime in tests; issue repro targets Claude and Codex ACP agents
  • Integration/channel (if any): ACP gateway sessions
  • Relevant config (redacted): acpx.permissionMode=approve-all, acpx.nonInteractivePermissions=deny, acpx.queueOwnerTtlSeconds=30

Steps

  1. Create or ensure an ACP named session.
  2. Return status=dead with summary=queue owner unavailable from status --session.
  3. Continue initialization and attempt the first turn.

Expected

  • The runtime repairs the dead named session before returning the handle.
  • If a stable session id is present, the repair resumes that session.
  • If no stable session id is present, the runtime falls back to a fresh named session.

Actual

  • Before this change, the runtime kept the dead named session and the first prompt could still fail with ACP_TURN_FAILED.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: updated the queue-owner recovery tests, added the no-resumable-id fallback case, ran pnpm test -- extensions/acpx/src/runtime.test.ts -t "queue owner unavailable|ensure fallback|no resumable id", ran pnpm test:extension acpx, and ran pnpm build.
  • Edge cases checked: dead queue-owner status after sessions ensure, dead queue-owner status after ensure failure fallback, and missing ids in the status payload.
  • What you did not verify: a live Claude/Codex gateway repro in this environment, and pnpm check still reports unrelated pre-existing type failures in extensions/diffs/src/language-hints.test.ts and src/plugins/contracts/plugin-sdk-subpaths.test.ts.
  • AI assistance / testing note: prepared with AI assistance, manually reviewed before opening, and validated with the focused ACPX test lanes above.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps:

Risks and Mitigations

  • Risk: some ACPX backends may report queue-owner failures without stable ids, which forces a fresh named session instead of a resume.
    • Mitigation: the runtime only takes that fallback when status provides no resumable id at all, and the new regression test locks in that branch.


Changed files

  • CHANGELOG.md (modified, +1/-0)
  • extensions/acpx/src/runtime.test.ts (modified, +35/-6)
  • extensions/acpx/src/runtime.ts (modified, +95/-41)
  • extensions/acpx/src/test-utils/runtime-fixtures.ts (modified, +5/-3)

Bug Report

Environment

  • OpenClaw: 2026.3.28
  • acpx plugin: 2026.3.28
  • acpx binary: 0.3.1
  • OS: macOS (arm64)

Steps to Reproduce

# 1. Confirm acpx lazy-start behavior
ACPX=<path-to-acpx-binary>
SESSION="test-session-$(date +%s)"
$ACPX claude sessions new --name "$SESSION"
$ACPX claude status --session "$SESSION"
# → status: dead  (immediately after sessions new)

# 2. But the session works fine on first send:
$ACPX claude -s "$SESSION" "say hi"
$ACPX claude status --session "$SESSION"
# → status: running

Expected Behavior

shouldReplaceEnsuredSession should not treat a freshly-created session with status: dead as needing replacement. A newly created session that has never received a prompt should be considered valid (lazy-start state), not dead/broken.

Possible fix: only replace when dead AND the session has had at least one prompt attempt, OR check whether the summary indicates a real failure (e.g. queue owner unavailable after a prompt attempt) rather than treating all dead as broken.

Additional Notes

  • acpx CLI direct execution works fine (acpx claude -s <name> "prompt" succeeds)
  • The bug was introduced in 2026.3.28 — ACP was working before today's update
  • Webhook-based display name (⚙ claude) also broken as a downstream effect (binding never completes)

/cc @steipete

extent analysis

Fix Plan

To resolve the issue, we need to modify the shouldReplaceEnsuredSession function in OpenClaw to handle the lazy-start behavior of acpx 0.3.1. We will add a check to ensure that a newly created session is not replaced immediately after creation.

Step-by-Step Solution

  • Modify the shouldReplaceEnsuredSession function to check the session's status and summary.
  • Add a condition to ignore sessions with status: dead if they have never received a prompt.
  • Use the summary field to determine if the session is truly broken (e.g., queue owner unavailable after a prompt attempt).

Example Code

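# Illustrative Python pseudocode; not OpenClaw's actual implementation.
# prompt_attempts is a hypothetical field that the runtime would have to track.
def shouldReplaceEnsuredSession(session):
    # A session that is not dead never needs replacing
    if session.status != 'dead':
        return False

    # Dead but never prompted: acpx 0.3.1 lazy-start state, not a failure
    if session.prompt_attempts == 0:
        return False

    # Dead after a prompt attempt, and the summary indicates a real failure
    if session.summary and 'queue owner unavailable' in session.summary:
        return True

    # Existing replacement logic for other dead sessions
    # ...
    return True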

Verification

To verify the fix, follow these steps:

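  • Update OpenClaw with the modified shouldReplaceEnsuredSession function.
  • Run the steps to reproduce the issue.
  • Check the gateway logs and confirm that the "replacing dead named session" line no longer repeats.
  • Verify that the session is created once, is not replaced while in the lazy-start state, and that the first prompt succeeds.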

Extra Tips

  • Ensure that the prompt_attempts field is accurately updated when a prompt is sent to the session.
  • Consider adding additional logging to track session creation and replacement.
  • Review the acpx plugin and OpenClaw documentation to ensure that the lazy-start behavior is properly documented and handled.
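
As one way to act on the logging tip, the sketch below logs each ensure and replacement and stops after a bounded number of attempts. It is an assumption-laden illustration, not OpenClaw code: `ensure_with_guard`, the `max_replacements` parameter, and the logger name are hypothetical, and the `ensure` and `status` callables stand in for whatever the runtime actually calls.

```python
import logging
from typing import Callable

logger = logging.getLogger("acpx.ensure_session")  # hypothetical logger name


def ensure_with_guard(
    ensure: Callable[[str], None],
    status: Callable[[str], str],
    name: str,
    max_replacements: int = 1,
) -> str:
    """Ensure a session, replacing it at most `max_replacements` times.

    Returns the last observed status instead of looping forever.
    """
    ensure(name)
    logger.info("ensured session=%s", name)
    current = status(name)
    replacements = 0
    while current == "dead" and replacements < max_replacements:
        replacements += 1
        logger.warning(
            "replacing dead session=%s attempt=%d/%d",
            name, replacements, max_replacements,
        )
        ensure(name)
        current = status(name)
    if current == "dead":
        logger.error(
            "session=%s still dead after %d replacement(s); giving up",
            name, replacements,
        )
    return current


logging.basicConfig(level=logging.INFO)
ensure_calls = []
final_status = ensure_with_guard(
    ensure=ensure_calls.append,
    status=lambda _name: "dead",  # simulate a backend that never comes up
    name="demo-session",
    max_replacements=2,
)
print(final_status, len(ensure_calls))
```

A cap like this does not fix the lazy-start misclassification, but it turns a silent infinite loop into a bounded, clearly logged failure.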
