openclaw - ✅(Solved) Fix Agent silently hangs on tool-write rejection instead of surfacing error [2 pull requests, 1 comment, 2 participants]

praxstack · 2026-05-05T11:50:33Z

When openclaw's embedded agent calls a restricted tool (e.g., `write` to a path outside the allowed whitelist) during a pre-compaction/memory-flush phase, the agent **silently hangs** instead of reporting the tool failure back in chat. The TUI continues to show "conjuring…" for many minutes with no visible error. The user eventually has to kill the run manually.



GitHub stats

openclaw/openclaw#77821•Fetched 2026-05-06 06:20:51


Author: praxstack

Participants: clawsweeper[bot], praxstack

Timeline (top): cross-referenced ×2, commented ×1

Error Message

Agent silently hangs on tool-write rejection instead of surfacing error

When openclaw's embedded agent calls a restricted tool (e.g., write to a path outside the allowed whitelist) during a pre-compaction/memory-flush phase, the agent silently hangs instead of reporting the tool failure back in chat. The TUI continues to show "conjuring…" for many minutes with no visible error. The user eventually has to kill the run manually.

Expected: the agent receives the tool-failure result, surfaces an error message in chat ("I tried to write X but it's restricted because …"), offers an alternative, or gracefully aborts.
Actual: agent goes silent. TUI timer keeps incrementing. No chat output. No error surfaced.

16:45:27 WARN  empty response detected: runId=ea815991… provider=amazon-bedrock/us.anthropic.claude-opus-4-7 — retrying 1/1
16:50:27 WARN  Profile amazon-bedrock:default timed out. Trying next account...
16:50:27 WARN  embedded_run_failover_decision: decision=surface_error failoverReason=timeout timedOut=true aborted=true fallbackConfigured=false
16:56:58 ERROR [tools] write failed: Memory flush writes are restricted to memory/2026-05-05.md; use that path only.

Tool-rejection path must emit a tool-result turn back to the model, not silently absorb it. Typical LLM-agent harnesses return {role: "tool", content: "error: …"} so the model can react.
Surface agent/embedded tool-failure events to the TUI as a visible error bubble (e.g. ⚠ tool 'write' rejected: …), not just as a backend log line.

The same run also hit an empty-response retry at 16:45:27 and a 5-minute Bedrock profile timeout at 16:50:27 — the failover code correctly surfaced that error. Only the later write-tool rejection at 16:56:58 went silent. So the silent-hang is specific to tool-level rejection, not to Bedrock-level failures.

Root Cause

In a long-running embedded agent session (high context, post-auto-compact-threshold), prompt the agent to create files outside the memory-flush whitelist — e.g., ask it to write ~/.openclaw/bin/some-wrapper.sh.
Agent plans, generates content, invokes write tool.

The write tool handler rejects the path with:

[tools] write failed: Memory flush writes are restricted to memory/2026-05-05.md; use that path only.

Expected: the agent receives the tool-failure result, surfaces an error message in chat ("I tried to write X but it's restricted because …"), offers an alternative, or gracefully aborts.
Actual: agent goes silent. TUI timer keeps incrementing. No chat output. No error surfaced.
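The rejection in the steps above comes from a write guard that allows exactly one file during a memory flush. A minimal sketch of such a guard follows, assuming the allowed path is passed in as a plain relative string. `checkMemoryFlushWrite` and `WriteCheck` are hypothetical names for illustration, not openclaw's actual API; only the error text is taken from the log.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not openclaw's real code.
type WriteCheck = { ok: true } | { ok: false; error: string };

/**
 * Allow a write only when the requested path equals the single allowed
 * memory file. Backslashes and leading "./" are normalised first; anything
 * else (absolute paths, "~", "..") fails the equality check and is rejected.
 */
function checkMemoryFlushWrite(requestedPath: string, allowedPath: string): WriteCheck {
  const normalized = requestedPath.replace(/\\/g, "/").replace(/^(\.\/)+/, "");
  if (normalized === allowedPath) {
    return { ok: true };
  }
  return {
    ok: false,
    // Message text mirrors the log line quoted in this issue.
    error: `Memory flush writes are restricted to ${allowedPath}; use that path only.`,
  };
}
```

The guard returns a value instead of throwing, so the caller has to decide what to do with the rejection. The bug described here sits in that caller, not in the guard.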

Fix Action

Workaround

Until fixed: kill the stuck run via launchctl kickstart -k gui/$(id -u)/ai.openclaw.gateway or restart the TUI. Any valuable reasoning from the hung run is lost.

PR fix notes

PR #78108: fix: surface silent turn error payloads

Repository: openclaw/openclaw
Author: bryce-d-greybeard
State: closed | merged: False
Link: https://github.com/openclaw/openclaw/pull/78108

Description (problem / solution / changelog)

Summary

Surfaces error payloads from silent/maintenance turns instead of filtering them away.

The memory-flush write guard was doing the right thing by rejecting writes outside the allowed memory file. The garbage part was downstream: the runner could convert the tool rejection into an isError payload, then silentExpected filtering dropped every non-voice payload. That makes a safety rail look like a hung agent.

Changes

Keep isError payloads during silent turns.
Preserve existing suppression of ordinary silent text/media payloads.
Add regression coverage for a restricted memory-flush write warning surviving silent-turn filtering.
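The filtering change listed above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's actual diff: the `ReplyPayload` shape, the `audioAsVoice` flag used to mark voice payloads, and the function name `filterSilentTurnPayloads` are all hypothetical. Only `isError` and `silentExpected` are names from the PR text.

```typescript
// Hypothetical payload shape; the real openclaw types may differ.
interface ReplyPayload {
  text?: string;
  mediaUrl?: string;
  audioAsVoice?: boolean; // assumed marker for voice media
  isError?: boolean;
}

/**
 * During a silent (maintenance) turn, drop ordinary text/media payloads
 * but keep voice payloads and, after the fix, payloads marked isError.
 */
function filterSilentTurnPayloads(
  payloads: ReplyPayload[],
  silentExpected: boolean,
): ReplyPayload[] {
  if (!silentExpected) {
    return payloads; // normal turns are not filtered here
  }
  return payloads.filter(
    (payload) => payload.isError === true || payload.audioAsVoice === true,
  );
}
```

Before the fix, the equivalent predicate would have kept only the voice case, which is how a rejected write ended up producing no chat output at all.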

Testing

git diff --check fork/main...HEAD — passed.
PATH="/tmp/openclaw-pnpm-shim:$PATH" pnpm test src/auto-reply/reply/agent-runner-payloads.test.ts src/agents/pi-embedded-runner/run/payloads.errors.test.ts -- --reporter=dot — passed, 2 files, 73 tests.
PATH="/tmp/openclaw-pnpm-shim:$PATH" node scripts/check-changed.mjs — passed.

Fixes #77821

Changed files

src/auto-reply/reply/agent-runner-payloads.test.ts (modified, +19/-0)
src/auto-reply/reply/agent-runner-payloads.ts (modified, +3/-0)

PR #78152: fix(auto-reply): keep silent turn errors visible

Repository: openclaw/openclaw
Author: leonaIee
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/78152

Description (problem / solution / changelog)

Summary

Describe the problem and fix in 2–5 bullets:

Problem: silent memory-flush maintenance turns could produce terminal error payloads, but the caller discarded them before chat delivery.
Why it matters: restricted memory-flush tool write failures could leave users looking at an apparently stuck run with no visible error.
What changed: memory-flush runs now report renderable isError payloads back to runReplyAgent, and runReplyAgent delivers those errors before starting the main reply run; the existing silent-turn payload filter also preserves isError payloads.
What did NOT change (scope boundary): normal non-error silent maintenance replies are still suppressed, and the memory-flush write restriction is unchanged.

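Change Type (select all)

[x] Bug fix
[ ] Feature
[ ] Refactor required for the fix
[ ] Docs
[ ] Security hardening
[ ] Chore/infra

Scope (select all touched areas)

[x] Gateway / orchestration
[x] Skills / tool execution
[ ] Auth / tokens
[x] Memory / storage
[ ] Integrations
[ ] API / contracts
[ ] UI / DX
[ ] CI/CD / infra

Linked Issue/PR

Closes #77821
[ ] Related #
[x] This PR fixes a bug or regression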

Real behavior proof (required for external PRs)

External contributors must show after-fix evidence from a real OpenClaw setup. Unit tests, mocks, lint, typechecks, snapshots, and CI are supplemental only. Screenshots are encouraged even for CLI, console, text, or log changes; terminal screenshots and copied live output count.

Behavior or issue addressed: #77821, where a silent memory-flush run could drop a restricted-write error and leave no visible chat error.
Real environment tested: local OpenClaw source checkout on Linux, Node 24.14.1, pnpm 10.33.2.
Exact steps or command run after this patch:
- pnpm test src/auto-reply/reply/agent-runner-memory.test.ts src/auto-reply/reply/agent-runner-payloads.test.ts src/auto-reply/reply/agent-runner-direct-runtime-config.test.ts -- --reporter=verbose
- pnpm check:changed --base refs/tmp/openclaw-pr-main
Evidence after fix (screenshot, recording, terminal capture, console output, redacted runtime log, linked artifact, or copied live output): copied terminal output showed [test] passed 1 Vitest shard in 44.15s and check:changed exited 0 after core/core-test typecheck, core lint, and runtime guards.
Observed result after fix: runMemoryFlushIfNeeded reports only renderable memory-flush isError payloads, runReplyAgent returns those errors before the main reply run, and the silent-turn payload filter keeps isError payloads while still dropping normal silent maintenance text.
What was not tested: the reporter's macOS launchd, Bedrock, and WebChat setup was not available locally.
Before evidence (optional but encouraged): previous source path used the memory-flush result only for metadata/session bookkeeping, so returned error payloads did not reach reply delivery.

Root Cause (if applicable)

For bug fixes or regressions, explain why this happened, not just what changed. Otherwise write N/A. If the cause is unclear, write Unknown.

Root cause: the memory-flush caller awaited runEmbeddedPiAgent with silentExpected: true, then used the result only for metadata/session bookkeeping. Renderable terminal isError payloads from the memory-flush run were not routed into reply delivery.
Missing detection / guardrail: there was no regression test for memory-flush error payload delivery through runMemoryFlushIfNeeded and the pre-run runReplyAgent path.
Contributing context (if known): silent-turn payload filtering also dropped text-only isError payloads, so even a routed error payload needed an explicit silent-turn exception.
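The handoff described in this root cause can be sketched as below. It is a simplified, synchronous illustration under assumptions: the real `runMemoryFlushIfNeeded` and `runReplyAgent` are awaited in the source, and the `Payload` and `FlushResult` shapes and the helper names `renderableErrors` and `runReplySketch` are hypothetical.

```typescript
// Hypothetical shapes for illustration only.
interface Payload {
  text?: string;
  isError?: boolean;
}
interface FlushResult {
  payloads: Payload[];
}

/** Keep only error payloads that actually have text to show in chat. */
function renderableErrors(result: FlushResult): Payload[] {
  return result.payloads.filter(
    (payload) => payload.isError === true && (payload.text ?? "").trim().length > 0,
  );
}

/**
 * Run the memory flush first. If it produced renderable errors, return
 * them for delivery and skip the main reply run; otherwise continue.
 */
function runReplySketch(
  runFlush: () => FlushResult,
  runMainReply: () => Payload[],
): Payload[] {
  const flushErrors = renderableErrors(runFlush());
  if (flushErrors.length > 0) {
    return flushErrors;
  }
  return runMainReply();
}
```

In the pre-fix flow the flush result was used only for bookkeeping, so the equivalent of `flushErrors` was never computed and the error had no path to the chat.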

Regression Test Plan (if applicable)

For bug fixes or regressions, name the smallest reliable test coverage that should catch this. Otherwise write N/A.

Coverage level that should have caught this:
- Unit test
- Seam / integration test
- End-to-end test
- Existing coverage already sufficient
Target test or file: src/auto-reply/reply/agent-runner-memory.test.ts, src/auto-reply/reply/agent-runner-direct-runtime-config.test.ts, and src/auto-reply/reply/agent-runner-payloads.test.ts
Scenario the test should lock in: memory-flush error payloads are reported for visible delivery, the reply runner returns them before the main reply run, and silent-turn filtering keeps isError payloads while suppressing ordinary maintenance text.
Why this is the smallest reliable guardrail: the bug is in the memory-flush result handoff and final reply payload filtering, independent of provider runtime behavior.
Existing test that already covers this (if any): existing silent-turn tests covered dropping ordinary payloads and keeping voice media, but not error payloads or the memory-flush handoff.
If no new test is added, why not: N/A

User-visible / Behavior Changes

Restricted memory-flush write failures that produce terminal error payloads can now be delivered visibly instead of being discarded by the silent maintenance path.

Diagram (if applicable)

N/A

Before:
memory flush run -> error payload -> metadata-only handling -> no reply

After:
memory flush run -> error payload -> visible error handoff -> chat reply

Security Impact (required)

New permissions/capabilities? (Yes/No) No
Secrets/tokens handling changed? (Yes/No) No
New/changed network calls? (Yes/No) No
Command/tool execution surface changed? (Yes/No) No
Data access scope changed? (Yes/No) No
If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

OS: Linux
Runtime/container: Node 24.14.1, pnpm 10.33.2
Model/provider: N/A for the focused memory-flush handoff seam
Integration/channel (if any): N/A
Relevant config (redacted): N/A

Steps

Run a memory-flush turn that returns one normal maintenance payload and one isError payload.
Verify only the isError payload is reported for visible delivery.
Verify runReplyAgent returns that error before starting the main reply run.
Run the focused payload tests and changed gate.

Expected

Silent memory-flush maintenance mode keeps terminal error payloads visible.

Actual

The new regressions pass, and the changed gate passes for the exact PR diff.

Evidence

Attach at least one:

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

Verified scenarios: memory-flush isError payload handoff, pre-run reply-agent return, silent-turn error filtering, and existing silent-turn voice-media behavior.
Edge cases checked: ordinary non-error silent maintenance text is still suppressed.
What you did not verify: the reporter's full macOS launchd, Bedrock, and WebChat runtime.

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

Backward compatible? (Yes/No) Yes
Config/env changes? (Yes/No) No
Migration needed? (Yes/No) No
If yes, exact upgrade steps: N/A

Risks and Mitigations

Risk: silent maintenance turns may now surface terminal error text that was previously hidden.
- Mitigation: the change is limited to payloads already marked isError; ordinary silent maintenance output remains suppressed.

Changed files

CHANGELOG.md (modified, +1/-0)
src/auto-reply/reply/agent-runner-direct-runtime-config.test.ts (modified, +47/-9)
src/auto-reply/reply/agent-runner-memory.test.ts (modified, +44/-0)
src/auto-reply/reply/agent-runner-memory.ts (modified, +28/-1)
src/auto-reply/reply/agent-runner-payloads.test.ts (modified, +20/-0)
src/auto-reply/reply/agent-runner-payloads.ts (modified, +3/-0)
src/auto-reply/reply/agent-runner.ts (modified, +39/-0)

Code Example

[tools] write failed: Memory flush writes are restricted to memory/2026-05-05.md; use that path only.

---

16:45:27 WARN  empty response detected: runId=ea815991… provider=amazon-bedrock/us.anthropic.claude-opus-4-7 — retrying 1/1
16:50:27 WARN  Profile amazon-bedrock:default timed out. Trying next account...
16:50:27 WARN  embedded_run_failover_decision: decision=surface_error failoverReason=timeout timedOut=true aborted=true fallbackConfigured=false
16:56:58 ERROR [tools] write failed: Memory flush writes are restricted to memory/2026-05-05.md; use that path only.
              raw_params={"content":"#!/bin/zsh\n# gateway-launchd-wrapper.sh — invoked by ai.openclaw.gateway.plist\n# Runs patch-bootstrap THEN bedrock_guard BEFORE exec'ing the gateway..."}


Agent silently hangs on tool-write rejection instead of surfacing error

Summary

Environment

openclaw version: 2026.5.2
@mariozechner/pi-ai: 0.71.1
macOS (launchd)
Gateway: ws://127.0.0.1:18789
Agent: main / session main via webchat
Provider: amazon-bedrock/us.anthropic.claude-opus-4-7
Context at hang: ~165k tokens (16% of 1M)

Reproduction

In a long-running embedded agent session (high context, post-auto-compact-threshold), prompt the agent to create files outside the memory-flush whitelist — e.g., ask it to write ~/.openclaw/bin/some-wrapper.sh.
Agent plans, generates content, invokes write tool.

The write tool handler rejects the path with:

[tools] write failed: Memory flush writes are restricted to memory/2026-05-05.md; use that path only.

Expected: the agent receives the tool-failure result, surfaces an error message in chat ("I tried to write X but it's restricted because …"), offers an alternative, or gracefully aborts.
Actual: agent goes silent. TUI timer keeps incrementing. No chat output. No error surfaced.

Evidence (from `/tmp/openclaw/openclaw-2026-05-05.log`)

16:45:27 WARN  empty response detected: runId=ea815991… provider=amazon-bedrock/us.anthropic.claude-opus-4-7 — retrying 1/1
16:50:27 WARN  Profile amazon-bedrock:default timed out. Trying next account...
16:50:27 WARN  embedded_run_failover_decision: decision=surface_error failoverReason=timeout timedOut=true aborted=true fallbackConfigured=false
16:56:58 ERROR [tools] write failed: Memory flush writes are restricted to memory/2026-05-05.md; use that path only.
              raw_params={"content":"#!/bin/zsh\n# gateway-launchd-wrapper.sh — invoked by ai.openclaw.gateway.plist\n# Runs patch-bootstrap THEN bedrock_guard BEFORE exec'ing the gateway..."}

After 16:56:58, no further agent/embedded events for the affected runId. The run went silent. TUI continued showing "conjuring 1m 46s" even though no inference was in flight.

Impact

User cannot tell whether the agent is thinking, stuck, or dead. Visible-state divergence between agent loop and TUI.
Wasted wall-clock waiting + context tokens spent on a run that can never complete.
Data-loss risk: whatever useful reasoning the agent did before the tool call is effectively orphaned.
Pairs badly with long-running debug sessions where memory-flush restrictions kick in late — exactly when you need the agent most.

Suggested fix

Tool-rejection path must emit a tool-result turn back to the model, not silently absorb it. Typical LLM-agent harnesses return {role: "tool", content: "error: …"} so the model can react.
Surface agent/embedded tool-failure events to the TUI as a visible error bubble (e.g. ⚠ tool 'write' rejected: …), not just as a backend log line.
If the agent's tool result is blocked and no replacement response is generated within N seconds, terminate the run with decision=surface_error (same pattern the Bedrock-failover code already uses at 16:50:27 in the log above).
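The first suggestion, returning the rejection to the model as a tool-result turn, could look roughly like this. It is a sketch under assumptions: `runToolAsTurn` and `ToolResultMessage` are hypothetical names, the tool is modelled as a synchronous function that throws on rejection, and the message format follows the `{role: "tool", content: "error: …"}` shape quoted above.

```typescript
// Hypothetical message shape, following the {role: "tool", ...} form quoted in the issue.
interface ToolResultMessage {
  role: "tool";
  content: string;
}

/**
 * Execute a tool and always produce a tool-result turn. A thrown
 * rejection becomes an "error: ..." result instead of being swallowed,
 * so the model (and the UI) have something to react to.
 */
function runToolAsTurn(toolName: string, execute: () => string): ToolResultMessage {
  try {
    return { role: "tool", content: execute() };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { role: "tool", content: `error: ${toolName} failed: ${reason}` };
  }
}
```

The point of the wrapper is that there is no code path that returns nothing: success and failure both yield a turn.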

Related observations (for triage)

The same run also hit an empty-response retry at 16:45:27 and a 5-minute Bedrock profile timeout at 16:50:27 — the failover code correctly surfaced that error. Only the later write-tool rejection at 16:56:58 went silent. So the silent-hang is specific to tool-level rejection, not to Bedrock-level failures.
The memory-flush restriction itself is fine — it's a legitimate safety rail. The bug is purely the silent-hang response to the rejection.
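The timing in these observations can be checked directly from the log timestamps. The sketch below computes the gaps between consecutive entries; `secondsOfDay` and `gapsInSeconds` are hypothetical helper names, and the input strings are shortened prefixes of the log lines quoted in this issue.

```typescript
/** Parse a leading HH:MM:SS timestamp into seconds since midnight. */
function secondsOfDay(line: string): number {
  const match = /^(\d{2}):(\d{2}):(\d{2})/.exec(line);
  if (!match) {
    throw new Error(`no timestamp in: ${line}`);
  }
  return Number(match[1]) * 3600 + Number(match[2]) * 60 + Number(match[3]);
}

/** Gaps in seconds between each log line and the one before it. */
function gapsInSeconds(lines: string[]): number[] {
  const times = lines.map(secondsOfDay);
  return times.slice(1).map((time, index) => time - (times[index] ?? time));
}

const gaps = gapsInSeconds([
  "16:45:27 WARN empty response detected",
  "16:50:27 WARN Profile amazon-bedrock:default timed out",
  "16:50:27 WARN embedded_run_failover_decision",
  "16:56:58 ERROR [tools] write failed",
]);
console.log(gaps); // [300, 0, 391]
```

The 300-second gap matches the 5-minute Bedrock profile timeout, and the write rejection arrives 391 seconds after the failover decision, which is the last event logged for the run.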

Workaround

Until fixed: kill the stuck run via launchctl kickstart -k gui/$(id -u)/ai.openclaw.gateway or restart the TUI. Any valuable reasoning from the hung run is lost.

extent analysis

TL;DR

The agent silently hangs when a tool-write rejection occurs, instead of surfacing an error, and can be fixed by modifying the tool-rejection path to emit a tool-result turn back to the model and surfacing tool-failure events to the TUI.

Guidance

Modify the tool-rejection path to return a tool-result turn with an error message, such as {role: "tool", content: "error: …"}, to allow the model to react to the rejection.
Surface tool-failure events to the TUI as a visible error bubble, such as ⚠ tool 'write' rejected: …, to inform the user of the error.
Implement a timeout mechanism to terminate the run with decision=surface_error if the agent's tool result is blocked and no replacement response is generated within a certain time frame (e.g., N seconds).
Verify that the fix works by testing the agent with a restricted tool call and checking that an error message is surfaced in the TUI.
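The timeout item in the guidance above can be expressed as a small pure decision function, which keeps it easy to test. This is a sketch under assumptions: `RunActivity`, `watchdogDecision`, and the millisecond fields are hypothetical, and `"surface_error"` only borrows the decision name seen in the failover log line.

```typescript
type WatchdogDecision = "continue" | "surface_error";

// Hypothetical bookkeeping for one run.
interface RunActivity {
  toolRejectedAtMs?: number; // when the last tool rejection happened
  lastResponseAtMs?: number; // when the model last produced output
}

/**
 * If a tool call was rejected and the model has produced nothing since,
 * surface an error once the grace period has elapsed.
 */
function watchdogDecision(
  activity: RunActivity,
  nowMs: number,
  graceMs: number,
): WatchdogDecision {
  const rejectedAt = activity.toolRejectedAtMs;
  if (rejectedAt === undefined) {
    return "continue"; // nothing was rejected
  }
  const respondedSince =
    activity.lastResponseAtMs !== undefined && activity.lastResponseAtMs >= rejectedAt;
  if (respondedSince) {
    return "continue"; // the model reacted to the rejection
  }
  return nowMs - rejectedAt >= graceMs ? "surface_error" : "continue";
}
```

A caller would evaluate this on a timer and end the run when it returns `"surface_error"`. The grace period is the N seconds left unspecified in the issue.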

Example

Example tool-result turn with error message:
{
  "role": "tool",
  "content": "error: write failed: Memory flush writes are restricted to memory/2026-05-05.md; use that path only."
}

Notes

The fix requires modifying the agent's tool-rejection handling and TUI error surfacing. The suggested fix is specific to the tool-level rejection and does not affect Bedrock-level failures.

Recommendation

Apply the suggested fix to modify the tool-rejection path and surface tool-failure events to the TUI, as it addresses the root cause of the silent hang and provides a better user experience.
