openclaw - ✅(Solved) Fix [Bug]: Discord thread binding not released after ACP task completion — channel messages misrouted to thread [1 pull requests, 1 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#59026Fetched 2026-04-08 02:29:43
View on GitHub
Comments
1
Participants
2
Timeline
1
Reactions
0
Timeline (top)
commented ×1

After an ACP task (Codex) completes in a Discord thread, subsequent messages sent by the user in the parent channel (not the thread) are still routed to the thread session. The channel session only recovers after the second message.

Root Cause

Root Cause Hypothesis

PR fix notes

PR #66407: fix(acp): bypass ACP dispatch for /acp text commands in bound threads

Description (problem / solution / changelog)

Summary

  • Problem: /acp close (and every other /acp text command) sent inside a Discord thread bound to an ACP session never reaches handleAcpCommand. It is handed off to the thread's ACP session and consumed as conversational input by the ACP agent, which replies with a hallucinated natural-language message while the session stays open.
  • Why it matters: Users cannot close, cancel, steer, switch mode, or check status of an ACP session from inside the bound thread — the most natural place to do so. Every attempt produces a confusing "it replied, but nothing changed" UX; the only workaround was hand-editing thread-bindings.json and restarting the gateway.
  • What changed: shouldBypassAcpDispatchForCommand now recognizes /acp ... as a bypass candidate in the same way /new and /reset already are. When the surface is allowed to handle text commands, the message is routed to handleAcpCommand instead of the ACP session.
  • What did NOT change (scope boundary): The /acp command grammar, handleAcpCommand itself, the ACP session lifecycle, thread-bindings.json schema, the !-prefix command path, and every non-/acp / non-/reset slash command still fall through to the ACP session exactly as before.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #66298
  • Related #59026 (inverse direction — parent-channel messages being misrouted to thread after task completion; not this PR)
  • This PR fixes a bug or regression

Root Cause

  • Root cause: shouldBypassAcpDispatchForCommand only whitelisted two slash-command shapes for the ACP-dispatch bypass: /new//reset (via isResetCommandCandidate) and !-prefix bang commands. Every other slash command — including /acp ... — fell through if (!normalized.startsWith("!")) return false; and was then dispatched to the thread's active ACP session as plain user input. The ACP agent (any model) then hallucinated a polite natural-language reply such as done/好的, making the command look like it worked even though handleAcpCommand had never been invoked.
  • Missing detection / guardrail: The existing regression test "returns false for ACP slash commands" in dispatch-acp-command-bypass.test.ts actively locked in the buggy behavior — it asserted that /acp cancel returns false from the bypass check. That test is flipped in this PR to assert the correct post-fix behavior.
  • Contributing context (if known): Unknown — I did not walk the git history to confirm the ordering. /acp is a registered command (commands-registry.shared.ts:342 registers textAlias: "/acp", commands-acp/shared.ts:15 exports COMMAND = "/acp"), so the grammar has been there for a while; only the bypass filter was missing the prefix.

Regression Test Plan

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/auto-reply/reply/dispatch-acp-command-bypass.test.ts
  • Scenario the test should lock in: shouldBypassAcpDispatchForCommand must return true for /acp ... when the surface is allowed to handle text commands, both for CommandSource: "text" (default) and CommandSource: "native" (Discord slash autocomplete).
  • Why this is the smallest reliable guardrail: The bug is purely in this single routing decision. A unit test on the routing function is sufficient — no need for an integration test that spins up a real Discord connection or a real ACP session, both of which already have their own coverage for the per-command handlers.
  • Existing test that already covers this (if any): None; the existing test asserted the opposite.
  • If no new test is added, why not: N/A — three new tests added in this PR.

User-visible / Behavior Changes

  • /acp close, /acp cancel, /acp status, /acp new, and every other /acp ... text/slash command issued from inside a thread bound to an ACP session now runs through handleAcpCommand as expected, instead of being swallowed by the thread's ACP session.
  • No new config, no new defaults, no command-grammar change.

Diagram

Before:
user types "/acp close" in bound thread
  → shouldBypassAcpDispatchForCommand → false  (falls through, not "!" or "/new"/"/reset")
  → ACP session receives "/acp close" as user turn
  → agent replies "done" (hallucination)
  → thread-bindings.json unchanged, session still open

After:
user types "/acp close" in bound thread
  → shouldBypassAcpDispatchForCommand → true   (matches isAcpCommandCandidate)
  → handleAcpCommand → handleAcpCloseAction
  → acpManager.closeSession → sessionBindingService.unbind
  → "✅ Closed ACP session <key>. Removed 1 binding."
  → thread-bindings.json count drops by 1

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No — the set of /acp subcommands is unchanged; only the routing of an already-registered command is corrected.
  • Data access scope changed? No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: Ubuntu 22.04.5 LTS (aarch64, Oracle Cloud Always Free VM)
  • Runtime/container: Node 22, systemd openclaw.service
  • Model/provider: minimax/MiniMax-M2.7 as the bound ACP agent (not model-specific — any ACP agent would hallucinate a reply to /acp close)
  • Integration/channel: Discord thread bound via channels.discord.threadBindings with spawnAcpSessions: true
  • Relevant config (redacted): channels.discord.threadBindings.enabled: true, channels.discord.threadBindings.spawnAcpSessions: true

Steps

  1. Configure the thread bindings as above and restart the gateway.
  2. In a Discord guild text channel, /acp spawn --thread here with agent claude (or any ACP agent). Confirm /root/.openclaw/discord/thread-bindings.json gains the new entry.
  3. Inside the newly-created bound thread, send /acp close (plain text, Discord autocomplete, or slash-command dropdown).
  4. Observe the bot reply, the gateway logs, and /root/.openclaw/discord/thread-bindings.json.

Expected

  • handleAcpCloseAction runs.
  • The ACP session is closed and the binding entry is removed.
  • Bot replies with ✅ Closed ACP session <key>. Removed 1 binding.

Actual (before fix)

  • handleAcpCloseAction never runs: grep -E "Closed ACP|Removed.*binding|unbind" /root/clawd/logs/openclaw.log returns zero matches for the full session window.
  • thread-bindings.json entry is unchanged.
  • Bot replies with a natural-language message such as done from the ACP agent; the session stays open.

Actual (after fix)

  • handleAcpCloseActionacpManager.closeSessiongetSessionBindingService().unbind runs.
  • thread-bindings.json binding count drops from 2 → 1 as expected.
  • Bot replies with the canonical ✅ Closed ACP session <key>. Removed 1 binding. string.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Failing test before, passing after

The existing "returns false for ACP slash commands" test in dispatch-acp-command-bypass.test.ts asserted the buggy behavior (/acp cancel → bypass false). After this patch, that assertion is flipped to the correct post-fix expectation and renamed to tag the regression back to this issue. Two additional tests were added for the native-command-source path and for unrecognized slash commands.

Unit tests (local, this PR):

$ pnpm vitest run src/auto-reply/reply/dispatch-acp-command-bypass.test.ts
 Test Files  1 passed (1)
      Tests  10 passed (10)
   Duration  628ms

Full pipeline gates (local, this PR):

$ pnpm tsgo       # 0 errors, 0 warnings, 11,435 files in 20.9s
$ pnpm check      # all lint lanes clean
$ pnpm build      # see CI

Live-run evidence referenced from #66298

When I originally filed #66298 (same GitHub account, same day) I had already locally patched a similar bypass in the compiled dist/ build and reported the before/after trace in the issue body — grep -E "Closed ACP|Removed.*binding|unbind" /root/clawd/logs/openclaw.log returning no matches before, and handleAcpCloseActionacpManager.closeSessiongetSessionBindingService().unbind firing after, with thread-bindings.json dropping from 2 → 1 bindings. I have not re-run a live end-to-end smoke test with this exact PR branch built from source; the unit-test and type/lint/build gates above are what this PR carries as fresh evidence.

Human Verification (required)

What I personally verified for this PR branch:

  • Unit test: pnpm vitest run src/auto-reply/reply/dispatch-acp-command-bypass.test.ts — 10/10 green.
  • Type check: pnpm tsgo — 0 errors, 0 warnings across 11,435 files.
  • Lint/format: pnpm check — all lanes clean.
  • Build: pnpm build — see CI.
  • Regex boundary check (by inspection of the new isAcpCommandCandidate): ^\/acp(?:\s|$)/i accepts /acp, /acp close, /ACP close; rejects /acpfoo, /acp-close.

What was previously verified on a live deployment, as reported in #66298:

  • /acp close inside a bound Discord thread reaches handleAcpCommandhandleAcpCloseActionacpManager.closeSessionsessionBindingService.unbind.
  • thread-bindings.json binding count drops by 1.
  • That verification was done with a local patch to the compiled dist/ bundle on my own 2026.4.11 install, not with a from-source build of this PR branch.

What I did not verify for this PR:

  • End-to-end smoke test with a from-source build of this branch against a live Discord bound thread.
  • Non-Discord surfaces (Telegram, Slack, Matrix, WhatsApp, Mattermost, …). The fix lives in the surface-agnostic shouldBypassAcpDispatchForCommand path so the behavior should be identical, but I do not have those integrations running and have not re-exercised them. Reviewers with those surfaces should smoke-test /acp close before cutting a release.
  • CommandSource: "native" on a surface that actually exposes /acp through a plugin-registry native command UI — the new test covers the routing decision, but I did not run a real Discord slash-command autocomplete invocation against the from-source build.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: A downstream caller might already be counting on /acp ... reaching the ACP session as raw user input (for example, an agent that treats /acp ... as a prompt template literal).
    • Mitigation: The whole point of the /acp prefix is that it is a command, not content. There is no in-tree consumer that depends on receiving /acp ... as conversational input, and nothing in the ACP session grammar assigns meaning to a leading /acp. Any external agent that does rely on this would already have been broken by the Discord slash-command autocomplete surfacing /acp as a registered command.

AI-Assisted PR

  • AI-assisted — patch prepared with Claude (Opus 4.6) via Claude Code, operated by me (@kindomLee, original issue author).
  • Degree of testing: lightly tested — unit + tsgo + check + build gates are all green against this branch built from source. The original live-deployment repro was done by me against a dist/-level patch on 2026.4.11 and is reported in #66298; I did not re-run that exact end-to-end smoke test against the from-source build of this PR branch.
  • Session context: fixed inside the same repo where the original issue was filed. The one-line fix direction described in the issue body is preserved, but extracted into a named helper (isAcpCommandCandidate) to mirror the existing isResetCommandCandidate style rather than inlining the regex in the bypass flow. The existing test "returns false for ACP slash commands" explicitly locked in the buggy behavior; this PR flips it and tags it to #66298.
  • I understand what the code does and can explain every hunk.

Changed files

  • src/auto-reply/reply/commands-acp.test.ts (modified, +24/-0)
  • src/auto-reply/reply/commands-acp.ts (modified, +1/-5)
  • src/auto-reply/reply/dispatch-acp-command-bypass.test.ts (modified, +95/-8)
  • src/auto-reply/reply/dispatch-acp-command-bypass.ts (modified, +18/-0)
RAW_BUFFERClick to expand / collapse

Description

After an ACP task (Codex) completes in a Discord thread, subsequent messages sent by the user in the parent channel (not the thread) are still routed to the thread session. The channel session only recovers after the second message.

Environment

  • OpenClaw version: 2026.3.31
  • Channel: Discord (guild channels with threadBindings.enabled: true)
  • ACP backend: acpx with Codex
  • Config: session.threadBindings.idleHours: 2 (also reproduced with default 24h)

Steps to Reproduce

  1. In a Discord channel, initiate a task that spawns a Codex ACP session
  2. A thread "🤖 codex" is automatically created, bound to the ACP session
  3. Wait for Codex to complete the task
  4. The channel receives the completion notification
  5. Send a new message in the parent channel (not in the thread)

Observed Behavior

  • The first message sent in the parent channel after task completion is routed to the thread session — the agent replies inside the thread instead of in the channel
  • Only the second message sent in the channel is correctly handled by the channel session

Expected Behavior

  • After ACP task completion, the thread binding should be immediately released (or at least not "refreshed" by the completion event)
  • Messages sent in the parent channel should always be handled by the channel session, regardless of whether a recently-completed thread exists
  • The thread should only capture messages actually sent inside the thread

Root Cause Hypothesis

The task flow completion event is delivered through the parent session, which refreshes the thread binding's "last active" timestamp. This causes the session router to treat the thread as still active and route the next channel message into it. After that first (misrouted) message is processed, the binding state updates and subsequent messages route correctly.

Suggested Fix

  • Do not refresh thread binding timestamps when delivering internal task completion events
  • Or: immediately unbind the thread session when the associated ACP task reaches a terminal state (succeeded/failed/cancelled)
  • Or: add a completion_only flag to thread bindings that prevents user message routing but allows completion delivery

Related

  • #59025 (task completion metadata leakage)
  • #44492 (Telegram topic loses ACP routing after heavy bound turn)
  • #50690 (pre-send policy layer)

Additional Context

This issue is specific to the new task flow routing in 2026.3.31 ("route one-task ACP and subagent updates through a parent task-flow owner context"). The previous webhook-based delivery did not have this problem because completion notifications were sent via Discord webhook directly, bypassing session routing entirely.

extent analysis

TL;DR

Do not refresh thread binding timestamps when delivering internal task completion events to prevent misrouting of user messages in the parent channel.

Guidance

  • Review the task flow completion event handling to ensure it does not update the thread binding's "last active" timestamp, which could cause the session router to treat the thread as still active.
  • Consider immediately unbinding the thread session when the associated ACP task reaches a terminal state (succeeded/failed/cancelled) to prevent incorrect routing.
  • Evaluate adding a completion_only flag to thread bindings to prevent user message routing but allow completion delivery, as a potential workaround.

Example

No specific code snippet is provided, but the suggested fix implies modifying the task completion event handling logic to prevent thread binding timestamp updates.

Notes

This issue appears to be specific to the new task flow routing introduced in OpenClaw version 2026.3.31, and the suggested fixes may need to be adapted to the specific implementation details of the ACP backend and Discord channel integration.

Recommendation

Apply the workaround of not refreshing thread binding timestamps when delivering internal task completion events, as this is a targeted fix that addresses the root cause of the issue without requiring significant changes to the existing codebase.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING