openclaw - ✅(Solved) Fix [Bug]: Regression since 2026.3.22: agent replies “I’ll do it now” but execution stalls; once triggered it persists across new sessions until downgrade [2 pull requests, 3 comments, 2 participants]

GitHub stats

openclaw/openclaw#58797 (fetched 2026-04-08 02:32:32)

  • Comments: 3
  • Participants: 2
  • Timeline events: 10
  • Reactions: 0
  • Timeline (top): commented ×3, cross-referenced ×2, labeled ×2, assigned ×1

After updating to 2026.3.22 (also reproduced on 2026.3.23, 2026.3.24, 2026.3.31), the main agent intermittently enters a broken state:

  • It sends an acknowledgement like “I’ll do it now”
  • Then actual execution does not proceed (or appears stuck/misaligned)
  • Once this state appears, it does not recover by creating a new session
  • Gateway restart may temporarily recover once, but issue returns quickly
  • Only downgrading to an older version reliably recovers behavior

This is a high-impact regression because user-visible intent/response and actual execution diverge.

Root Cause

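Per PR #58860 (linked as related to this issue), approved async execs could finish successfully, but channel-backed sessions preferred direct external followup delivery over resuming the original agent session. The user could see an Exec finished ... message while the task stopped progressing until another user message arrived. The PR fix notes give the full root-cause description.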

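Fix / Workaround

Downgrading to a pre-2026.3.22 version is the only durable workaround reported in the issue. The two linked PRs in the PR fix notes (#58860 and #58904) address the exec approval followup path and the approval prompt guidance.

Steps to reproduce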

  1. Start from a clean/stable state on OpenClaw >= 2026.3.22 (tested on 2026.3.22 / .23 / .24 / .31).
  2. Configure normal production-like usage:
    • Telegram connected
    • main agent active
    • cron jobs enabled (isolated agentTurn jobs, several per day)
    • exec-capable workflow (approval path can occur)
  3. Use the system normally for a while (manual requests + cron-triggered work).
  4. Send a task that requires actual multi-step execution (not just a text response).
  5. Observe that at some point the agent replies with a “starting now / I’ll do it” style message, but execution does not proceed correctly (or result is mismatched/stalled).
  6. After this onset, open a new session and retry similar tasks.
  7. Observe degraded behavior persists across new sessions.
  8. Restart gateway and retry:
    • sometimes one request works
    • issue quickly reappears
  9. Downgrade to pre-2026.3.22 version and retry same workflow.
  10. Observe behavior returns to normal after downgrade.

Repro characteristics

  • Intermittent onset, but persistent once triggered.
  • Persistence survives new sessions.
  • Gateway restart is not a durable fix.
  • Downgrade is a durable workaround.

Additional information

  • Regression window: first observed on 2026.3.22; still reproducible on 2026.3.23, 2026.3.24, 2026.3.31.
  • Recovery behavior:
    • New session: no recovery
    • Gateway restart: temporary at best
    • Downgrade to pre-3.22: reliable recovery
  • Repro pattern:
    • System works normally for some time
    • Then suddenly enters degraded mode
    • Degraded mode persists until downgrade
  • Environment:
    • macOS 15.7.4 (x64), Node v22.22.0
    • Local gateway (loopback, token auth), Telegram-bound multi-agent setup
    • Cron + isolated agentTurn + sessions_send relay in active use
  • Mitigations attempted (not durable):
    • toggling exec notify-on-exit behavior
    • toggling internal command-logger hook
    • gateway restart
  • Suspected subsystem:
    • async execution follow-up / approval-timeout reconciliation / session state recovery after acknowledgement
  • Security/privacy:
    • logs sanitized (tokens and personal secrets redacted)

PR fix notes

PR #58860: fix(exec): resume agent session after approval completion

Description (problem / solution / changelog)

Summary

  • resume the original agent session when an approved async exec finishes and the followup still has a sessionKey
  • keep direct external sendMessage(...) delivery only as the no-session fallback
  • add regression coverage for external-route approval completions and continuation-oriented followup prompts

Change Type

  • Bug fix
  • New feature
  • Breaking change
  • Refactor
  • Docs
  • Test-only

Scope

This PR is intentionally narrow.

It only changes exec approval followup routing after an already-approved async exec completes.

It does not change:

  • exec approval request registration
  • exec allowlist / denylist policy evaluation
  • command execution behavior before approval
  • node pairing / control UI approval transport
  • unrelated skills / provider typing surfaces

Linked Issue/PR

  • Related #39648
  • Related #52850
  • Related #58797
  • Follow-up to merged PR #53702
  • This PR fixes a bug or regression

Root Cause / Regression History

Approved async execs could finish successfully, but channel-backed sessions preferred direct external followup delivery instead of resuming the original agent session.

In sendExecApprovalFollowup(), the presence of an externally deliverable origin route caused the code path to call sendMessage(...) directly, even when the original requester sessionKey still existed.

That produced a delivery-mirror style followup without waking the original session. In practice, the user could see an Exec finished ... message while the task stopped progressing until another user message arrived.

This patch makes the session continuation path authoritative whenever a sessionKey exists, while still preserving best-effort external delivery metadata on that resumed agent call.
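The routing change described above can be sketched as a pure decision function. This is a minimal illustration, not the actual body of sendExecApprovalFollowup(): the type names, the two `chooseFollowupPath*` helpers, the `"drop"` outcome, and the sample session key format are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; the real types in the repo may differ.
interface ExternalRoute {
  channel: string; // e.g. "discord" or "telegram"
  to: string;
}

interface FollowupContext {
  sessionKey?: string; // original requester session, if still known
  externalRoute?: ExternalRoute; // externally deliverable origin route, if any
}

// "drop" is an assumed outcome for the case where nothing can be reached.
type FollowupPath = "resume-session" | "direct-send" | "drop";

// Before the patch (as described in the PR): a deliverable external route won,
// even when the original sessionKey still existed.
function chooseFollowupPathBefore(ctx: FollowupContext): FollowupPath {
  if (ctx.externalRoute) return "direct-send";
  if (ctx.sessionKey) return "resume-session";
  return "drop";
}

// After the patch: session continuation is authoritative whenever a sessionKey
// exists; direct delivery is only the no-session fallback.
function chooseFollowupPathAfter(ctx: FollowupContext): FollowupPath {
  if (ctx.sessionKey) return "resume-session";
  if (ctx.externalRoute) return "direct-send";
  return "drop";
}

// A channel-backed session: both a sessionKey and a deliverable route exist.
const channelBacked: FollowupContext = {
  sessionKey: "example-session-key", // made-up value
  externalRoute: { channel: "discord", to: "123" },
};

console.log(chooseFollowupPathBefore(channelBacked)); // direct-send
console.log(chooseFollowupPathAfter(channelBacked)); // resume-session
```

The only difference between the two helpers is the order of the checks, which matches the PR's framing of the bug as a precedence problem rather than a missing code path.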

Behavior Changes

Before:

  • approved async exec followups with an external route could bypass agent continuation
  • the user could receive an external completion message without the original task resuming
  • direct external delivery was preferred over session continuation whenever the route was deliverable

After:

  • approved async exec followups resume the original agent session whenever sessionKey exists
  • external route metadata is preserved on that continuation path via deliver, bestEffortDeliver, and route fields
  • direct external delivery remains only as the fallback when there is no session to resume
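To make the "After" behavior concrete, here is a hedged sketch of how external route metadata might ride along on the resumed agent call. Only the field names `deliver` and `bestEffortDeliver` come from the PR text; the `ExternalRoute` and `AgentContinuationParams` shapes, the `buildContinuationParams` helper, and the choice to set both flags whenever a route exists are assumptions.

```typescript
// Hypothetical route shape; the PR only says "route fields" are preserved.
interface ExternalRoute {
  channel: string;
  to: string;
}

// Hypothetical parameters for the resumed agent call.
interface AgentContinuationParams {
  sessionKey: string;
  message: string;
  deliver: boolean;
  bestEffortDeliver: boolean;
  channel?: string;
  to?: string;
}

// Build the continuation call for a session that still exists. Delivery is
// requested on a best-effort basis only when there is a route to deliver to.
function buildContinuationParams(
  sessionKey: string,
  prompt: string,
  route?: ExternalRoute,
): AgentContinuationParams {
  const hasRoute = route !== undefined;
  return {
    sessionKey,
    message: prompt,
    deliver: hasRoute,
    bestEffortDeliver: hasRoute,
    ...(route ? { channel: route.channel, to: route.to } : {}),
  };
}

const params = buildContinuationParams(
  "example-session-key", // made-up value
  "The approved command finished. Continue the task before replying.",
  { channel: "discord", to: "123" },
);
console.log(params);
```

Keeping the route on the continuation call means the user still sees the result in the originating channel, while the session itself is what drives the next step.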

Regression Test Plan

Updated:

  • src/agents/bash-tools.exec-approval-followup.test.ts
  • src/agents/bash-tools.exec.approval-id.test.ts

Coverage:

  • followup success prompt tells the resumed agent to continue the task before replying
  • channel-backed followups with a live sessionKey use agent continuation instead of direct sendMessage(...)
  • no-session followups still fall back to direct external delivery
  • gateway exec approval flow with an external route resumes the original session and does not use direct outbound message delivery
  • delayed approval on a Discord session key does not trigger followup work before approval resolves, then auto-continues the same session after approval without requiring a second user turn

Repro + Verification

Observed in local runtime transcripts and gateway logs:

  • exec.approval.waitDecision resolved successfully
  • only an external Exec finished ... followup appeared
  • the original task did not continue until a new user turn arrived

With this patch:

  • focused regression tests pass for the exec-approval followup path
  • the new delayed-approval Discord continuation test verifies the real failure mode more directly:
    • the exec stays pending before approval resolves
    • no followup side effect occurs before approval
    • once approval resolves, the same Discord session is resumed automatically
    • no direct sendMessage(...) fallback is used while the session exists
  • manual validation on the installed runtime reproduced the original Discord approve flow and confirmed the task now continues after approval without needing an extra "Is it done yet?" message
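The delayed-approval scenario in the list above can be simulated without the real gateway. The sketch below is not the repo's vitest test: `deferred`, `Effects`, `runApprovedExec`, and the sample session key are hypothetical stand-ins, and the denied branch is simplified to a no-op because the PR does not describe it.

```typescript
type Decision = "approved" | "denied";

// Minimal deferred promise so the example can resolve the approval on demand.
function deferred<T>() {
  let resolve!: (value: T) => void;
  const promise = new Promise<T>((res) => {
    resolve = res;
  });
  return { promise, resolve };
}

// Records side effects instead of talking to a real agent or channel.
interface Effects {
  resumed: string[]; // session keys handed back to the agent
  directSends: string[]; // texts sent straight to the external channel
}

async function runApprovedExec(
  approval: Promise<Decision>,
  sessionKey: string | undefined,
  effects: Effects,
): Promise<void> {
  const decision = await approval; // the exec stays pending until this resolves
  if (decision !== "approved") return; // denial handling is out of scope here
  if (sessionKey) {
    effects.resumed.push(sessionKey); // resume the same session
  } else {
    effects.directSends.push("Exec finished"); // no-session fallback only
  }
}

async function demo(): Promise<Effects> {
  const effects: Effects = { resumed: [], directSends: [] };
  const gate = deferred<Decision>();
  const run = runApprovedExec(gate.promise, "example-discord-session", effects);

  // Before approval resolves there must be no followup side effect at all.
  console.log(effects.resumed.length, effects.directSends.length); // 0 0

  gate.resolve("approved");
  await run;
  return effects;
}

demo().then((effects) => console.log(effects));
```

After the approval resolves, `resumed` holds the one session key and `directSends` stays empty, which is the combination the new Discord continuation test is described as checking.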

Tests

Passed locally:

  • corepack pnpm exec vitest run src/agents/bash-tools.exec-approval-followup.test.ts src/agents/bash-tools.exec.approval-id.test.ts --reporter=verbose

Additional focused coverage added in this PR:

  • auto-continues the same Discord session after approval resolves without a second user turn

Did not pass end-to-end repo pre-commit:

  • pnpm check is currently blocked by unrelated upstream TypeScript issues in:
    • src/agents/skills.test-helpers.ts
    • src/agents/skills/local-loader.ts

Risks and Mitigations

Risk:

  • channel-backed approval completions now prefer session continuation whenever sessionKey exists, so any caller that relied on direct external-only followups in that state will now go through the resumed agent path

Mitigation:

  • the no-session fallback still uses direct external delivery
  • delivery metadata is preserved on the resumed path
  • regression coverage now locks the intended continuation behavior in place for both session-backed and no-session cases

AI assistance

AI-assisted: drafted and implemented with Codex, then locally reviewed and tested by me.

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/agents/bash-tools.exec-approval-followup.test.ts (modified, +39/-7)
  • src/agents/bash-tools.exec-approval-followup.ts (modified, +37/-45)
  • src/agents/bash-tools.exec.approval-id.test.ts (modified, +147/-0)

PR #58904: fix(agent): treat webchat exec approvals as native UI

Description (problem / solution / changelog)

Summary

  • treat webchat like Discord/Slack/Telegram for exec approval prompt guidance
  • stop telling agents to paste manual /approve commands when the runtime has native approval UI
  • add regression coverage for the webchat path

Why

This is a narrow residual after #58792 and #58860.

Those changes fixed the exec policy/config cluster and the async continuation path. One UI mismatch remained: webchat/control sessions still got the manual /approve system-prompt guidance even though approvals arrive through native UI.
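A rough sketch of the guidance switch this PR describes follows. It is not the code in src/agents/system-prompt.ts: the channel set, the `execApprovalGuidance` helper, and the guidance wording are hypothetical, and only the idea of treating webchat like Discord/Slack/Telegram comes from the PR summary.

```typescript
// Channels assumed to show exec approvals through a native UI.
const NATIVE_APPROVAL_CHANNELS = new Set(["discord", "slack", "telegram", "webchat"]);

// Return the approval guidance to embed in the agent's system prompt.
function execApprovalGuidance(channel: string): string {
  if (NATIVE_APPROVAL_CHANNELS.has(channel.toLowerCase())) {
    return "Approvals arrive through the native UI; do not ask the user to paste manual approval commands.";
  }
  return "If approval is required, tell the user to reply with the /approve command.";
}

console.log(execApprovalGuidance("webchat"));
console.log(execApprovalGuidance("some-other-channel")); // made-up channel name
```

With webchat in the set, a webchat session gets the same guidance as a Discord session, so the agent no longer asks for a manual command the UI already handles.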

Testing

  • pnpm test -- src/agents/system-prompt.test.ts
  • pnpm check

  • Related #52850
  • Related #58797

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/agents/system-prompt.test.ts (modified, +18/-1)
  • src/agents/system-prompt.ts (modified, +7/-2)

Logs, screenshots, and evidence

- **Symptom evidence**
  - Agent sends an “I’ll do it now” style ack, but execution/result stalls or diverges.
  - Once triggered, issue persists across new sessions.
  - Gateway restart gives temporary recovery at best; downgrade restores stable behavior.

- **Log snippets to attach (exact strings observed)**
  - `An async command did not run`
  - `Exec denied (... approval-timeout)`
  - Async follow-up task events indicating completion/denial mismatch vs user-visible flow

- **Task/runtime evidence**
  - `openclaw tasks list --json`
  - `openclaw tasks audit --json`
  - Shows async follow-up/approval-related entries during affected window.

- **Gateway/runtime evidence**
  - `openclaw status`
  - `openclaw logs --plain --limit 200` (or larger window)
  - Include timeframe where first onset happened and persistence after new session/restart.

- **Screenshots to include**
  1) User message → agent says “starting now”
  2) No corresponding action completion (or wrong follow-up)
  3) Same behavior in a fresh session
  4) Temporary recovery after restart, then relapse
  5) Recovery after downgrade

- **Config/effective path evidence**
  - Effective path: `telegram -> openclaw-gateway(local) -> agent-router(binding) -> openai-codex(gpt-5.3-codex)`
  - Include relevant model/binding/cron settings excerpt (sanitized).

- **Sanitization note**
  - Redact bot tokens, auth tokens, wallet data, personal identifiers.
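For the sanitization note above, a small redaction pass can be run over log text before attaching it. This is a minimal sketch: the three patterns (a Telegram-style bot token, an HTTP Bearer token, and an Ethereum-style address as one example of wallet data) are assumptions about what the logs might contain, and they will not catch every secret or personal identifier.

```typescript
interface Redaction {
  label: string;
  pattern: RegExp;
}

const REDACTIONS: Redaction[] = [
  // Telegram-style bot token: numeric bot id, a colon, then 35 URL-safe characters.
  { label: "[REDACTED_BOT_TOKEN]", pattern: /\b\d{6,12}:[A-Za-z0-9_-]{35}/g },
  // HTTP Authorization bearer tokens.
  { label: "Bearer [REDACTED]", pattern: /Bearer\s+[A-Za-z0-9._~+\/-]+=*/g },
  // Ethereum-style addresses, as one example of wallet data.
  { label: "[REDACTED_WALLET]", pattern: /\b0x[a-fA-F0-9]{40}\b/g },
];

// Apply every redaction in order and return the sanitized text.
function sanitize(text: string): string {
  return REDACTIONS.reduce((acc, { label, pattern }) => acc.replace(pattern, label), text);
}

// Made-up sample line; the token is a placeholder, not a real credential.
const sampleLine = `telegram token=123456789:${"A".repeat(35)} auth=Bearer abc.def.ghi`;
console.log(sanitize(sampleLine));
```

Review the output by eye before sharing it, because pattern-based redaction only covers the formats it was written for.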

Bug type

Regression (worked before, now fails)

Beta release blocker

No

Optional signals to capture during repro

  • task/log lines containing:
    • An async command did not run
    • Exec denied (... approval-timeout)
  • mismatch between user-visible “I’m doing it now” and actual completion behavior.
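The two log strings listed above can be searched for in a captured log dump with a few lines of code. The sketch assumes the log text has already been saved to a string; `findSignals`, the signal names, and the sample log lines are made up for illustration, and the second pattern matches loosely because the issue elides part of that message with "...".

```typescript
interface SignalHit {
  line: number; // 1-based line number in the log text
  signal: string;
  text: string;
}

const SIGNALS: Array<{ name: string; pattern: RegExp }> = [
  { name: "async-did-not-run", pattern: /An async command did not run/ },
  // The middle of this message is elided in the issue, so allow anything there.
  { name: "approval-timeout", pattern: /Exec denied \(.*approval-timeout\)/ },
];

// Scan the log line by line and collect every line that matches a signal.
function findSignals(logText: string): SignalHit[] {
  const hits: SignalHit[] = [];
  logText.split(/\r?\n/).forEach((text, index) => {
    for (const { name, pattern } of SIGNALS) {
      if (pattern.test(text)) hits.push({ line: index + 1, signal: name, text });
    }
  });
  return hits;
}

// Invented sample lines; real log formatting will differ.
const sample = [
  "12:00:01 exec started",
  "12:02:01 Exec denied (sample-detail approval-timeout)",
  "12:02:02 An async command did not run",
].join("\n");

console.log(findSignals(sample).map((hit) => `${hit.line}:${hit.signal}`));
```

Comparing the timestamps of these hits with the time of the agent's acknowledgement message is one way to document the mismatch described in the last bullet.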

Expected behavior

When agent responds “starting now” / equivalent:

  1. execution should start immediately (or explicitly request approval)
  2. final result should be consistent with execution outcome
  3. failure/approval timeout should be surfaced deterministically
  4. state should not become permanently degraded across new sessions

Actual behavior

  • Agent sends “starting now” style response, then execution stalls or doesn’t follow through.
  • In degraded mode, this pattern repeats.
  • New session does not fix it.
  • Gateway restart gives at most temporary recovery.
  • Downgrade restores normal behavior.

OpenClaw version

OpenClaw versions tested: 2026.3.22, 2026.3.23, 2026.3.24, 2026.3.31

Operating system

macOS 15.7.4

Install method

npm global

Model

openai-codex/gpt-5.3-codex

Provider / routing chain

telegram -> openclaw-gateway(local) -> agent-router(binding) -> openai-codex(gpt-5.3-codex)

Additional provider/model setup details

  • Provider: openai-codex (OAuth profile)
  • Main model: openai-codex/gpt-5.3-codex
  • Global default: openai-codex/gpt-5.4 (fallbacks enabled)
  • Other configured provider: qwen-portal (qwen-portal/coder-model, qwen-portal/vision-model)
  • Routing: Telegram accountId bindings → agent-specific sessions (main, scout-agent, crypto-agent)
  • Gateway: local loopback (ws://127.0.0.1:18789, token auth), no external proxy
  • Cron: isolated agentTurn jobs, relay via sessions_send where applicable

Impact and severity

  • Trust-breaking UX: agent says it is doing work, but no corresponding reliable action/result
  • Operational risk for automation workflows (cron + async + approval paths)
  • Requires rollback to continue stable usage

extent analysis

TL;DR

Downgrade to a pre-2026.3.22 version, the only durable workaround reported in the issue.
