openclaw: pi-trajectory-flush timeout aborts entire agent run, should degrade gracefully [1 pull request]

openclaw · 2026-05-31 05:09:21

Fix Action

Fixed by PR: test(agents): cover nonfatal trajectory flush timeout (https://github.com/openclaw/openclaw/pull/88802)

Problem

When the pi-trajectory-flush step exceeds its 10s hard timeout during agent run cleanup, the entire run is aborted. This happened today (2026-05-31, ~12:21–12:55 CST) across multiple sessions simultaneously, which suggests that a transient I/O slowdown on macOS triggered cascading session aborts.

Observed behavior

1. agent cleanup timed out: step=pi-trajectory-flush timeoutMs=10000 is logged even though very little data is pending (633–2534 bytes)
2. embedded abort settle timed out: timeoutMs=2000 follows
3. The session enters a stuck state → automatic stuck session recovery aborts it
4. Multiple sessions are affected in parallel → eventual Gateway stability bundle + shutdown timeout

Expected behavior

Trajectory flush is telemetry/audit data. A slow flush should not abort the primary agent run or cause session-level stuck recovery.
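One way to express this expectation in code is to mark each cleanup step as fatal or nonfatal, so that a timeout in a telemetry step is logged and skipped instead of propagating. The sketch below is illustrative only: CleanupStep, runCleanupStep, settleWithin, and the fatal flag are hypothetical names, not OpenClaw's actual API.

```typescript
// Hypothetical sketch: none of these names come from the OpenClaw codebase.
type CleanupOutcome = "ok" | "timed_out" | "failed";

interface CleanupStep {
  name: string;
  timeoutMs: number;
  fatal: boolean; // false for telemetry/audit steps such as a trajectory flush
  run: () => Promise<void>;
}

// Resolve "settled" if `work` fulfils within `timeoutMs`, "timed_out" otherwise.
// A rejection of `work` before the deadline is passed through to the caller.
// The underlying work is not cancelled on timeout; it is only no longer awaited.
function settleWithin(
  work: Promise<void>,
  timeoutMs: number,
): Promise<"settled" | "timed_out"> {
  return new Promise<"settled" | "timed_out">((resolve, reject) => {
    const timer = setTimeout(() => resolve("timed_out"), timeoutMs);
    work.then(
      () => { clearTimeout(timer); resolve("settled"); },
      (err: unknown) => { clearTimeout(timer); reject(err); },
    );
  });
}

async function runCleanupStep(step: CleanupStep): Promise<CleanupOutcome> {
  let outcome: CleanupOutcome;
  try {
    const result = await settleWithin(step.run(), step.timeoutMs);
    outcome = result === "settled" ? "ok" : "timed_out";
  } catch {
    outcome = "failed";
  }
  if (outcome === "ok") return outcome;
  if (step.fatal) {
    // Critical steps still fail the run.
    throw new Error(`cleanup ${outcome}: step=${step.name} timeoutMs=${step.timeoutMs}`);
  }
  // Nonfatal steps only leave a warning; the run continues.
  console.warn(`nonfatal cleanup step ${step.name} ended with ${outcome}; continuing`);
  return outcome;
}
```

With this shape, a step declared as { name: "pi-trajectory-flush", fatal: false, ... } returns "timed_out" to the caller, which can record the outcome and finish the run normally.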

Suggested improvements

1. Graceful degradation: if trajectory flush times out, drop the pending trajectory batch but let the run complete normally
2. Configurable or adaptive timeout: 10s is tight when macOS has background I/O pressure (Time Machine, Spotlight, etc.)
3. Retry before abort: currently one timeout → immediate failure; a single retry with backoff would prevent cascading aborts during transient I/O spikes
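The three suggestions above could be combined roughly as follows. This is a minimal sketch under stated assumptions, not the change made in the linked PR: FlushOptions, flushWithRetry, succeededWithin, and resolveTimeoutMs are hypothetical names, and the write callback stands in for whatever actually appends the trajectory data.

```typescript
// Hypothetical sketch: names and option shapes are assumptions for illustration.
interface FlushOptions {
  timeoutMs: number; // per-attempt budget
  retries: number;   // additional attempts after the first
  backoffMs: number; // base delay, doubled after each failed attempt
}

interface FlushResult {
  flushed: boolean;
  attempts: number;
  droppedBytes: number;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Use a configured timeout when it parses to a positive number, else the fallback.
function resolveTimeoutMs(raw: string | undefined, fallbackMs: number): number {
  const parsed = Number(raw);
  return raw !== undefined && isFinite(parsed) && parsed > 0 ? parsed : fallbackMs;
}

// Resolve true if `work` fulfils in time, false on timeout or rejection.
function succeededWithin(work: Promise<void>, timeoutMs: number): Promise<boolean> {
  return new Promise<boolean>((resolve) => {
    const timer = setTimeout(() => resolve(false), timeoutMs);
    work.then(
      () => { clearTimeout(timer); resolve(true); },
      () => { clearTimeout(timer); resolve(false); },
    );
  });
}

async function flushWithRetry(
  batch: Uint8Array,
  write: (data: Uint8Array) => Promise<void>,
  opts: FlushOptions,
): Promise<FlushResult> {
  const maxAttempts = opts.retries + 1;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (await succeededWithin(write(batch), opts.timeoutMs)) {
      return { flushed: true, attempts: attempt, droppedBytes: 0 };
    }
    if (attempt < maxAttempts) {
      await sleep(opts.backoffMs * 2 ** (attempt - 1));
    }
  }
  // Give up on the telemetry batch instead of failing the run.
  return { flushed: false, attempts: maxAttempts, droppedBytes: batch.byteLength };
}
```

A caller might pass { timeoutMs: resolveTimeoutMs(configuredValue, 10000), retries: 1, backoffMs: 500 }, where configuredValue is read from a setting or environment variable of the project's choosing. Note that a timed-out write is not cancelled here, so a retry may append the same record twice; a real implementation would need an idempotent or cancellable write.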

Environment

OpenClaw 2026.5.27
macOS 12.7.6 (Intel)
Node v24.13.0
Disk: APFS SSD (standard MacBook Pro)

Logs (excerpt)

agent cleanup timed out: runId=dc4ff2b4 sessionId=699ab5b5 step=pi-trajectory-flush timeoutMs=10000 details=pendingWrites=1 queuedBytes=704 activeOperation=file-append
embedded abort settle timed out: runId=dc4ff2b4 sessionId=699ab5b5 timeoutMs=2000
stuck session recovery outcome: status=aborted action=abort_embedded_run sessionId=699ab5b5
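To check whether your own logs show the same signature, the lines above can be split into a message and key=value fields. The helper below is a rough sketch: parseLogLine and isTrajectoryFlushTimeout are hypothetical names, and the line format is inferred only from this excerpt, not from any documented OpenClaw log schema.

```typescript
// Hypothetical sketch: the "<message>: key=value key=value ..." shape is
// inferred from the three log lines in this issue only.
interface ParsedLogLine {
  message: string;
  fields: Record<string, string>;
}

function parseLogLine(line: string): ParsedLogLine {
  const sep = line.indexOf(": ");
  const message = (sep === -1 ? line : line.slice(0, sep)).trim();
  const rest = sep === -1 ? "" : line.slice(sep + 2);
  const fields: Record<string, string> = {};
  // Match key=value pairs whose value contains no "=" or whitespace. For a
  // token like "details=pendingWrites=1" this yields pendingWrites=1 and
  // drops the "details" wrapper key.
  const pair = /([A-Za-z_][A-Za-z0-9_]*)=([^\s=]+)(?=\s|$)/g;
  let match: RegExpExecArray | null;
  while ((match = pair.exec(rest)) !== null) {
    const [, key, value] = match;
    if (key !== undefined && value !== undefined) {
      fields[key] = value;
    }
  }
  return { message, fields };
}

// True for the cleanup-timeout line that names the trajectory flush step.
function isTrajectoryFlushTimeout(line: string): boolean {
  const { message, fields } = parseLogLine(line);
  return message === "agent cleanup timed out" && fields["step"] === "pi-trajectory-flush";
}
```

Filtering a log file with isTrajectoryFlushTimeout and then reading fields such as queuedBytes and sessionId should make it easier to confirm that the pending data was small and that several sessions were affected in the same window.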
