openclaw - ✅(Solved) Fix Agent silently hangs on tool-write rejection instead of surfacing error [2 pull requests, 1 comment, 2 participants]

GitHub stats
openclaw/openclaw#77821 · Fetched 2026-05-06 06:20:51
Comments: 1
Participants: 2
Timeline: 3 (cross-referenced ×2, commented ×1)
Reactions: 2

When openclaw's embedded agent calls a restricted tool (e.g., write to a path outside the allowed whitelist) during a pre-compaction/memory-flush phase, the agent silently hangs instead of reporting the tool failure back in chat. The TUI continues to show "conjuring…" for many minutes with no visible error. The user eventually has to kill the run manually.

Error Message

Agent silently hangs on tool-write rejection instead of surfacing error

16:45:27 WARN  empty response detected: runId=ea815991… provider=amazon-bedrock/us.anthropic.claude-opus-4-7 — retrying 1/1
16:50:27 WARN  Profile amazon-bedrock:default timed out. Trying next account...
16:50:27 WARN  embedded_run_failover_decision: decision=surface_error failoverReason=timeout timedOut=true aborted=true fallbackConfigured=false
16:56:58 ERROR [tools] write failed: Memory flush writes are restricted to memory/2026-05-05.md; use that path only.

Root Cause

  • Per PR #78152, the memory-flush caller awaited runEmbeddedPiAgent with silentExpected: true, then used the result only for metadata/session bookkeeping, so renderable terminal isError payloads from the memory-flush run never reached reply delivery.
  • Silent-turn payload filtering also dropped text-only isError payloads, so the write tool's rejection was filtered out even after it had been converted into an error payload.
  • The memory-flush write restriction itself behaved as intended; only the error handling downstream of it was at fault.

Fix Action

Workaround

Until fixed: kill the stuck run via launchctl kickstart -k gui/$(id -u)/ai.openclaw.gateway or restart the TUI. Any valuable reasoning from the hung run is lost.

PR fix notes

PR #78108: fix: surface silent turn error payloads

Description (problem / solution / changelog)

Summary

Surfaces error payloads from silent/maintenance turns instead of filtering them away.

The memory-flush write guard was doing the right thing by rejecting writes outside the allowed memory file. The garbage part was downstream: the runner could convert the tool rejection into an isError payload, then silentExpected filtering dropped every non-voice payload. That makes a safety rail look like a hung agent.

Changes

  • Keep isError payloads during silent turns.
  • Preserve existing suppression of ordinary silent text/media payloads.
  • Add regression coverage for a restricted memory-flush write warning surviving silent-turn filtering.
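
The filtering change described above can be pictured with a minimal sketch. This is not the code from the PR: the payload shape, the audioAsVoice marker, and the function name filterSilentTurnPayloads are assumptions made for illustration. It only shows the stated rule that a silent turn drops ordinary text and media while keeping voice payloads and anything marked isError.

```typescript
// Hypothetical payload shape; the real OpenClaw type is not shown in the source.
interface ReplyPayload {
  text?: string;
  mediaUrl?: string;
  audioAsVoice?: boolean; // assumed marker for voice media
  isError?: boolean;
}

// Sketch of silent-turn filtering: ordinary output is suppressed,
// voice media and error payloads survive.
function filterSilentTurnPayloads(payloads: ReplyPayload[]): ReplyPayload[] {
  return payloads.filter((p) => p.isError === true || p.audioAsVoice === true);
}

const filtered = filterSilentTurnPayloads([
  { text: "memory flushed" }, // ordinary maintenance text: dropped
  {
    text: "Memory flush writes are restricted to memory/2026-05-05.md; use that path only.",
    isError: true, // error payload: kept
  },
  { mediaUrl: "file:///tmp/note.ogg", audioAsVoice: true }, // voice media: kept
]);

console.log(filtered.map((p) => p.text ?? p.mediaUrl));
```

Keeping the exception narrow (only payloads already flagged isError) matches the scope stated in the PR: normal silent maintenance output stays hidden.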

Testing

  • git diff --check fork/main...HEAD — passed.
  • PATH="/tmp/openclaw-pnpm-shim:$PATH" pnpm test src/auto-reply/reply/agent-runner-payloads.test.ts src/agents/pi-embedded-runner/run/payloads.errors.test.ts -- --reporter=dot — passed, 2 files, 73 tests.
  • PATH="/tmp/openclaw-pnpm-shim:$PATH" node scripts/check-changed.mjs — passed.

Fixes #77821

Changed files

  • src/auto-reply/reply/agent-runner-payloads.test.ts (modified, +19/-0)
  • src/auto-reply/reply/agent-runner-payloads.ts (modified, +3/-0)

PR #78152: fix(auto-reply): keep silent turn errors visible

Description (problem / solution / changelog)

Summary

  • Problem: silent memory-flush maintenance turns could produce terminal error payloads, but the caller discarded them before chat delivery.
  • Why it matters: restricted memory-flush tool write failures could leave users looking at an apparently stuck run with no visible error.
  • What changed: memory-flush runs now report renderable isError payloads back to runReplyAgent, and runReplyAgent delivers those errors before starting the main reply run; the existing silent-turn payload filter also preserves isError payloads.
  • What did NOT change (scope boundary): normal non-error silent maintenance replies are still suppressed, and the memory-flush write restriction is unchanged.
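
A rough sketch of the handoff described in these bullets follows. It assumes that "renderable" means an isError payload with non-empty text and that delivering the error means returning it instead of starting the main reply run. The function names with a Sketch suffix, the payload type, and the injected callbacks are hypothetical; the actual signatures of runMemoryFlushIfNeeded and runReplyAgent are not shown in the source.

```typescript
// Hypothetical payload shape for illustration.
interface ReplyPayload {
  text?: string;
  isError?: boolean;
}

// Assumption: a payload is "renderable" when it is an error with non-empty text.
function collectRenderableErrors(payloads: ReplyPayload[]): ReplyPayload[] {
  return payloads.filter(
    (p) => p.isError === true && typeof p.text === "string" && p.text.trim().length > 0,
  );
}

// Sketch of the memory-flush step: run the flush, report only renderable errors.
async function runMemoryFlushIfNeededSketch(
  runFlush: () => Promise<ReplyPayload[]>,
): Promise<ReplyPayload[]> {
  const payloads = await runFlush();
  return collectRenderableErrors(payloads);
}

// Sketch of the reply runner: flush errors are returned before the main run starts.
async function runReplyAgentSketch(
  runFlush: () => Promise<ReplyPayload[]>,
  runMainReply: () => Promise<ReplyPayload[]>,
): Promise<ReplyPayload[]> {
  const flushErrors = await runMemoryFlushIfNeededSketch(runFlush);
  if (flushErrors.length > 0) {
    return flushErrors; // visible error handoff; main reply run is not started
  }
  return runMainReply();
}
```

Before the fix, by the PR's own description, the flush result fed only metadata and session bookkeeping, which corresponds to ignoring the return value of the flush step in this sketch.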

Linked Issue/PR

  • Closes #77821
  • This PR fixes a bug or regression

Real behavior proof (required for external PRs)

  • Behavior or issue addressed: #77821, where a silent memory-flush run could drop a restricted-write error and leave no visible chat error.
  • Real environment tested: local OpenClaw source checkout on Linux, Node 24.14.1, pnpm 10.33.2.
  • Exact steps or command run after this patch:
    • pnpm test src/auto-reply/reply/agent-runner-memory.test.ts src/auto-reply/reply/agent-runner-payloads.test.ts src/auto-reply/reply/agent-runner-direct-runtime-config.test.ts -- --reporter=verbose
    • pnpm check:changed --base refs/tmp/openclaw-pr-main
  • Evidence after fix (screenshot, recording, terminal capture, console output, redacted runtime log, linked artifact, or copied live output): copied terminal output showed [test] passed 1 Vitest shard in 44.15s and check:changed exited 0 after core/core-test typecheck, core lint, and runtime guards.
  • Observed result after fix: runMemoryFlushIfNeeded reports only renderable memory-flush isError payloads, runReplyAgent returns those errors before the main reply run, and the silent-turn payload filter keeps isError payloads while still dropping normal silent maintenance text.
  • What was not tested: the reporter's macOS launchd, Bedrock, and WebChat setup was not available locally.
  • Before evidence (optional but encouraged): previous source path used the memory-flush result only for metadata/session bookkeeping, so returned error payloads did not reach reply delivery.

Root Cause (if applicable)

  • Root cause: the memory-flush caller awaited runEmbeddedPiAgent with silentExpected: true, then used the result only for metadata/session bookkeeping. Renderable terminal isError payloads from the memory-flush run were not routed into reply delivery.
  • Missing detection / guardrail: there was no regression test for memory-flush error payload delivery through runMemoryFlushIfNeeded and the pre-run runReplyAgent path.
  • Contributing context (if known): silent-turn payload filtering also dropped text-only isError payloads, so even a routed error payload needed an explicit silent-turn exception.

Regression Test Plan (if applicable)

  • Target test or file: src/auto-reply/reply/agent-runner-memory.test.ts, src/auto-reply/reply/agent-runner-direct-runtime-config.test.ts, and src/auto-reply/reply/agent-runner-payloads.test.ts
  • Scenario the test should lock in: memory-flush error payloads are reported for visible delivery, the reply runner returns them before the main reply run, and silent-turn filtering keeps isError payloads while suppressing ordinary maintenance text.
  • Why this is the smallest reliable guardrail: the bug is in the memory-flush result handoff and final reply payload filtering, independent of provider runtime behavior.
  • Existing test that already covers this (if any): existing silent-turn tests covered dropping ordinary payloads and keeping voice media, but not error payloads or the memory-flush handoff.
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

Restricted memory-flush write failures that produce terminal error payloads can now be delivered visibly instead of being discarded by the silent maintenance path.

Diagram (if applicable)

Before:
memory flush run -> error payload -> metadata-only handling -> no reply

After:
memory flush run -> error payload -> visible error handoff -> chat reply

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: Linux
  • Runtime/container: Node 24.14.1, pnpm 10.33.2
  • Model/provider: N/A for the focused memory-flush handoff seam
  • Integration/channel (if any): N/A
  • Relevant config (redacted): N/A

Steps

  1. Run a memory-flush turn that returns one normal maintenance payload and one isError payload.
  2. Verify only the isError payload is reported for visible delivery.
  3. Verify runReplyAgent returns that error before starting the main reply run.
  4. Run the focused payload tests and changed gate.

Expected

  • Silent memory-flush maintenance mode keeps terminal error payloads visible.

Actual

  • The new regressions pass, and the changed gate passes for the exact PR diff.

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: memory-flush isError payload handoff, pre-run reply-agent return, silent-turn error filtering, and existing silent-turn voice-media behavior.
  • Edge cases checked: ordinary non-error silent maintenance text is still suppressed.
  • What you did not verify: the reporter's full macOS launchd, Bedrock, and WebChat runtime.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: silent maintenance turns may now surface terminal error text that was previously hidden.
    • Mitigation: the change is limited to payloads already marked isError; ordinary silent maintenance output remains suppressed.

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/auto-reply/reply/agent-runner-direct-runtime-config.test.ts (modified, +47/-9)
  • src/auto-reply/reply/agent-runner-memory.test.ts (modified, +44/-0)
  • src/auto-reply/reply/agent-runner-memory.ts (modified, +28/-1)
  • src/auto-reply/reply/agent-runner-payloads.test.ts (modified, +20/-0)
  • src/auto-reply/reply/agent-runner-payloads.ts (modified, +3/-0)
  • src/auto-reply/reply/agent-runner.ts (modified, +39/-0)

Agent silently hangs on tool-write rejection instead of surfacing error

Environment

  • openclaw version: 2026.5.2
  • @mariozechner/pi-ai: 0.71.1
  • macOS (launchd)
  • Gateway: ws://127.0.0.1:18789
  • Agent: main / session main via webchat
  • Provider: amazon-bedrock/us.anthropic.claude-opus-4-7
  • Context at hang: ~165k tokens (16% of 1M)

Reproduction

  1. In a long-running embedded agent session (high context, post-auto-compact-threshold), prompt the agent to create files outside the memory-flush whitelist — e.g., ask it to write ~/.openclaw/bin/some-wrapper.sh.
  2. Agent plans, generates content, invokes write tool.
  3. The write tool handler rejects the path with:
    [tools] write failed: Memory flush writes are restricted to memory/2026-05-05.md; use that path only.
  4. Expected: the agent receives the tool-failure result, surfaces an error message in chat ("I tried to write X but it's restricted because …"), offers an alternative, or gracefully aborts.
  5. Actual: agent goes silent. TUI timer keeps incrementing. No chat output. No error surfaced.

Evidence (from /tmp/openclaw/openclaw-2026-05-05.log)

16:45:27 WARN  empty response detected: runId=ea815991… provider=amazon-bedrock/us.anthropic.claude-opus-4-7 — retrying 1/1
16:50:27 WARN  Profile amazon-bedrock:default timed out. Trying next account...
16:50:27 WARN  embedded_run_failover_decision: decision=surface_error failoverReason=timeout timedOut=true aborted=true fallbackConfigured=false
16:56:58 ERROR [tools] write failed: Memory flush writes are restricted to memory/2026-05-05.md; use that path only.
              raw_params={"content":"#!/bin/zsh\n# gateway-launchd-wrapper.sh — invoked by ai.openclaw.gateway.plist\n# Runs patch-bootstrap THEN bedrock_guard BEFORE exec'ing the gateway..."}

After 16:56:58, no further agent/embedded events for the affected runId. The run went silent. TUI continued showing "conjuring 1m 46s" even though no inference was in flight.

Impact

  • User cannot tell whether the agent is thinking, stuck, or dead. Visible-state divergence between agent loop and TUI.
  • Wasted wall-clock waiting + context tokens spent on a run that can never complete.
  • Data-loss risk: whatever useful reasoning the agent did before the tool call is effectively orphaned.
  • Pairs badly with long-running debug sessions where memory-flush restrictions kick in late — exactly when you need the agent most.

Suggested fix

  1. Tool-rejection path must emit a tool-result turn back to the model, not silently absorb it. Typical LLM-agent harnesses return {role: "tool", content: "error: …"} so the model can react.
  2. Surface agent/embedded tool-failure events to the TUI as a visible error bubble (e.g. ⚠ tool 'write' rejected: …), not just as a backend log line.
  3. If the agent's tool result is blocked and no replacement response is generated within N seconds, terminate the run with decision=surface_error (same pattern the Bedrock-failover code already uses at 16:50:27 in the log above).
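
The first suggestion can be illustrated with a small sketch of a harness that converts a tool rejection into a tool-result turn instead of swallowing it. Everything here is hypothetical: the ToolResultMessage type, the makeRestrictedWrite guard, and callTool are illustrative names, and the guard is modeled only on the log line quoted in this report, not on OpenClaw's actual write handler.

```typescript
// Hypothetical message and tool types for illustration only.
interface ToolResultMessage {
  role: "tool";
  content: string;
}

interface WriteArgs {
  path: string;
  content: string;
}

type ToolHandler = (args: WriteArgs) => string;

// Assumed guard modeled on the log line in the issue: only one path is writable.
function makeRestrictedWrite(allowedPath: string): ToolHandler {
  return ({ path }) => {
    if (path !== allowedPath) {
      throw new Error(
        `Memory flush writes are restricted to ${allowedPath}; use that path only.`,
      );
    }
    return `wrote ${path}`;
  };
}

// Turn both success and rejection into a tool-result turn,
// so the model always receives something it can react to.
function callTool(handler: ToolHandler, args: WriteArgs): ToolResultMessage {
  try {
    return { role: "tool", content: handler(args) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { role: "tool", content: `error: ${message}` };
  }
}

const rejected = callTool(makeRestrictedWrite("memory/2026-05-05.md"), {
  path: "~/.openclaw/bin/some-wrapper.sh",
  content: "#!/bin/zsh",
});
console.log(rejected.content);
```

With this shape the rejection text reaches the model as ordinary conversation input, which is what lets it explain the restriction or propose the allowed path.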

Related observations (for triage)

  • The same run also hit an empty-response retry at 16:45:27 and a 5-minute Bedrock profile timeout at 16:50:27 — the failover code correctly surfaced that error. Only the later write-tool rejection at 16:56:58 went silent. So the silent-hang is specific to tool-level rejection, not to Bedrock-level failures.
  • The memory-flush restriction itself is fine — it's a legitimate safety rail. The bug is purely the silent-hang response to the rejection.

extent analysis

TL;DR

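When a write tool call is rejected during a memory-flush turn, the agent goes silent instead of surfacing the error. The issue proposes emitting a tool-result turn back to the model and showing tool-failure events in the TUI. The linked PRs (#78108 and #78152) address the same symptom by keeping isError payloads through silent-turn filtering and, in #78152, routing memory-flush error payloads into reply delivery.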

Guidance

  • Modify the tool-rejection path to return a tool-result turn with an error message, such as {role: "tool", content: "error: …"}, to allow the model to react to the rejection.
  • Surface tool-failure events to the TUI as a visible error bubble, such as ⚠ tool 'write' rejected: …, to inform the user of the error.
  • Implement a timeout mechanism to terminate the run with decision=surface_error if the agent's tool result is blocked and no replacement response is generated within a certain time frame (e.g., N seconds).
  • Verify that the fix works by testing the agent with a restricted tool call and checking that an error message is surfaced in the TUI.
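
The timeout guidance above could look roughly like the following sketch, which races a run against a timer and resolves with decision "surface_error" when the timer wins. The RunOutcome type and withWatchdog helper are hypothetical, and only the decision=surface_error label is taken from the log in this report. The sketch does not cancel the underlying run; a real implementation would presumably need to abort it as well.

```typescript
// Hypothetical outcome type; only the "surface_error" label comes from the log.
type RunOutcome =
  | { decision: "completed"; reply: string }
  | { decision: "surface_error"; reason: string };

// Race the run against a timer so a silent run ends with a visible decision.
function withWatchdog(run: Promise<string>, timeoutMs: number): Promise<RunOutcome> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<RunOutcome>((resolve) => {
    timer = setTimeout(
      () => resolve({ decision: "surface_error", reason: "timeout" }),
      timeoutMs,
    );
  });
  const completed = run.then(
    (reply): RunOutcome => ({ decision: "completed", reply }),
  );
  // Clear the timer once either side settles so it does not keep the process alive.
  return Promise.race([completed, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}

// A promise that never settles stands in for the hung agent run.
const hungRun = new Promise<string>(() => {});
withWatchdog(hungRun, 50).then((outcome) => console.log(outcome.decision));
```

The value of N is left open in the issue, so the 50 ms used here is only a demo value.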

Example

Example tool-result turn with an error message:
{
  "role": "tool",
  "content": "error: write failed: Memory flush writes are restricted to memory/2026-05-05.md; use that path only."
}

Notes

The fix requires modifying the agent's tool-rejection handling and TUI error surfacing. The suggested fix is specific to the tool-level rejection and does not affect Bedrock-level failures.

Recommendation

Apply the suggested fix: change the tool-rejection path so the rejection is returned to the model, and surface tool-failure events in the TUI. The PRs listed under "PR fix notes" cover the visible-error part by preserving isError payloads during silent turns.
