openclaw: Gateway crashes when background PTY output arrives after run is no longer active

GitHub stats

openclaw/openclaw#62378 (fetched 2026-04-08 03:05:13)
Comments: 0 · Participants: 1 · Timeline events: 1 (cross-referenced ×1) · Reactions: 0

The gateway crashes and is restarted by systemd when a background PTY-backed exec task continues to emit stdout/stderr after the associated agent run is no longer active, especially when the PTY is interrupted via CTRL+C.

Observed error:

Unhandled promise rejection: Error: Agent listener invoked outside active run

Error Message

  • Unhandled promise rejection: Error: Agent listener invoked outside active run
  • Core error in both stack-trace variants: Error: Agent listener invoked outside active run

Impact

  • Gateway process exits unexpectedly
  • Webchat / control UI disconnects temporarily
  • Long-running tasks are interrupted
  • Service is only recovered because systemd restarts it automatically

Workarounds

Temporary mitigations that reduce occurrence:

  • avoid pty=true unless truly necessary
  • avoid long-lived background PTY tasks
  • avoid using CTRL+C on background PTY tasks when possible
  • prefer natural process exit
  • prefer non-PTY execution for non-interactive jobs

Code Example

python3 -u - <<'PY'
import time
print('repro-start', flush=True)
for i in range(300):
    print(f'tick-{i}', flush=True)
    time.sleep(0.2)
PY


Bug: Gateway crashes when background PTY output arrives after run is no longer active

English

Summary

The gateway crashes and is restarted by systemd when a background PTY-backed exec task continues to emit stdout/stderr after the associated agent run is no longer active, especially when the PTY is interrupted via CTRL+C.

Observed error:

Unhandled promise rejection: Error: Agent listener invoked outside active run

Environment

  • OpenClaw version: 2026.4.5
  • OS: Linux arm64
  • Node: 22.22.1
  • Gateway managed by: systemd user service

Impact

  • Gateway process exits unexpectedly
  • Webchat / control UI disconnects temporarily
  • Long-running tasks are interrupted
  • Service is only recovered because systemd restarts it automatically

Observed crash times

  • 2026-04-07 15:06:06 +08:00
  • 2026-04-07 15:12:51 +08:00
  • Reproduced again around 2026-04-07 15:19:47 +08:00 with a minimal PTY test

Relevant stack traces

Observed variants include both stderr and stdout paths:

Variant A

  • Agent.processEvents
  • agent.ts:533
  • agent-loop.ts:539
  • emitUpdate
  • handleStderr
  • stream socket callback

Variant B

  • Agent.processEvents
  • agent.ts:533
  • agent-loop.ts:539
  • emitUpdate
  • handleStdout
  • onSupervisorStdout
  • node-pty terminal stream callback

Core error in both cases:

Error: Agent listener invoked outside active run

Minimal reproduction

Step 1: Start a PTY-backed background command that keeps printing output

python3 -u - <<'PY'
import time
print('repro-start', flush=True)
for i in range(300):
    print(f'tick-{i}', flush=True)
    time.sleep(0.2)
PY

Run through OpenClaw exec with:

  • pty=true
  • background=true

Step 2: Confirm output is still being streamed

Poll the process and observe multiple tick-* lines.

Step 3: Interrupt the PTY via process send-keys

Send:

  • CTRL_C

Step 4: Observe gateway failure

Within a short interval, the gateway may exit with:

  • Unhandled promise rejection: Error: Agent listener invoked outside active run
  • systemd: Main process exited, code=exited, status=1/FAILURE
  • followed by automatic restart

Expected behavior

If a PTY process emits output after the initiating run is no longer active, the gateway should safely discard, detach, or reroute that output instead of crashing the whole process.

Actual behavior

Output is still sent into an agent listener that is no longer associated with an active run, producing an unhandled promise rejection that terminates the gateway.
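The failure mode described here can be illustrated with a minimal sketch. It assumes the forwarding path is an async function whose returned promise is never awaited or caught by the stream callback; `Run`, `emitUpdate`, and `onPtyData` below are hypothetical stand-ins that borrow names from the stack traces, not OpenClaw's actual code.

```typescript
// Hypothetical stand-ins for the gateway's internals; names are illustrative only.
class Run {
  active = true;
  readonly received: string[] = [];

  // Mirrors the observed error: the listener refuses events once the run has ended.
  listener(chunk: string): void {
    if (!this.active) {
      throw new Error("Agent listener invoked outside active run");
    }
    this.received.push(chunk);
  }
}

// An async forwarder: a throw inside it becomes a rejected promise.
async function emitUpdate(run: Run, chunk: string): Promise<void> {
  run.listener(chunk);
}

// Stream callback. In the crashing path the promise would be discarded;
// it is returned here only so the demo can inspect it.
function onPtyData(run: Run, chunk: string): Promise<void> {
  return emitUpdate(run, chunk);
}

const demoRun = new Run();
void onPtyData(demoRun, "tick-1"); // delivered while the run is active
demoRun.active = false;            // the run ends, but the PTY keeps printing

let lateError = "";
// The demo attaches a catch handler to capture the message.
// Without such a handler, the rejection is unhandled.
const settled = onPtyData(demoRun, "tick-2").catch((err: Error) => {
  lateError = err.message;
});
```

Node.js 15 and later terminate the process on an unhandled rejection by default, which would be consistent with the status=1/FAILURE exit reported by systemd. Whether the gateway relies on that default or on its own handler is not stated in the report.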

Notes

This does not appear to be specific to any business script such as MediaCrawler. A minimal PTY loop reproduced the issue, which strongly suggests a lifecycle/race bug in the exec/PTY/event-bridging path.

Suspected area

Potentially in the interaction between:

  • PTY supervisor stdout/stderr forwarding
  • exec-defaults-*
  • agent run lifecycle / listener registration
  • Agent.processEvents / agent-loop

Workarounds

Temporary mitigations that reduce occurrence:

  • avoid pty=true unless truly necessary
  • avoid long-lived background PTY tasks
  • avoid using CTRL+C on background PTY tasks when possible
  • prefer natural process exit
  • prefer non-PTY execution for non-interactive jobs

Suggested fix direction

  • Guard output delivery when target run is no longer active
  • Detach PTY stream listeners on run completion/cancellation
  • Treat late stdout/stderr from old runs as ignorable instead of fatal
  • Prevent unhandled promise rejections from terminating the gateway
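The first three directions above can be combined in one small component. The sketch below is an assumption about how such a bridge might look, not the project's implementation: `FakePtyStream` is a stand-in for the real PTY stream (it is not node-pty's API), and `RunOutputBridge`, `onChunk`, and `end` are hypothetical names.

```typescript
type DataHandler = (chunk: string) => void;

// Minimal stand-in for a PTY output stream (hypothetical; not node-pty's API).
class FakePtyStream {
  private handlers = new Set<DataHandler>();

  // Registers a handler and returns a function that removes it again.
  on(handler: DataHandler): () => void {
    this.handlers.add(handler);
    return () => {
      this.handlers.delete(handler);
    };
  }

  emit(chunk: string): void {
    for (const handler of [...this.handlers]) handler(chunk);
  }

  get listenerCount(): number {
    return this.handlers.size;
  }
}

// Bridges PTY output to a run and stops doing so once the run ends.
class RunOutputBridge {
  readonly delivered: string[] = [];
  droppedLate = 0;
  private active = true;
  private unsubscribe: () => void;

  constructor(stream: FakePtyStream) {
    this.unsubscribe = stream.on((chunk) => this.onChunk(chunk));
  }

  // Guard: output for an ended run is counted and ignored, never thrown.
  onChunk(chunk: string): void {
    if (!this.active) {
      this.droppedLate += 1;
      return;
    }
    this.delivered.push(chunk);
  }

  // Call on run completion or cancellation.
  end(): void {
    this.active = false;
    this.unsubscribe(); // detach so later chunks never reach this run
  }
}

const ptyStream = new FakePtyStream();
const bridge = new RunOutputBridge(ptyStream);
ptyStream.emit("tick-1"); // delivered
bridge.end();             // run finished or was cancelled
ptyStream.emit("tick-2"); // no listener attached any more
bridge.onChunk("tick-3"); // simulates a chunk already in flight: dropped by the guard
```

Detaching alone does not cover a chunk that was already dispatched when the run ended, which is why the sketch keeps the guard inside `onChunk` as well.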


extent analysis

TL;DR

To fix the gateway crash issue, guard output delivery when the target run is no longer active and detach PTY stream listeners on run completion or cancellation.

Guidance

  1. Identify and modify the relevant code paths: Focus on the interaction between PTY supervisor stdout/stderr forwarding, exec-defaults-*, agent run lifecycle, and Agent.processEvents to prevent unhandled promise rejections.
  2. Implement output guarding: Check if the target run is still active before delivering output to prevent crashes.
  3. Detach PTY stream listeners: Ensure that PTY stream listeners are detached when a run is completed or cancelled to prevent further output from being processed.
  4. Treat late output as ignorable: Modify the code to treat late stdout/stderr from old runs as ignorable instead of fatal to prevent gateway termination.
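As a last line of defense alongside the steps above, listener invocations can be wrapped so that neither a synchronous throw nor a rejected promise escapes the stream callback. This is a minimal sketch under assumptions: `guardListener` and its `onError` logging hook are hypothetical names, and the real listener signature may differ.

```typescript
type Listener = (chunk: string) => void | Promise<void>;

// Wraps a listener so that errors are reported through `onError`
// instead of propagating out of the stream callback.
function guardListener(
  listener: Listener,
  onError: (err: unknown) => void,
): (chunk: string) => void {
  return (chunk) => {
    try {
      // Promise.resolve handles both sync and async listeners; the catch
      // handler ensures a rejection is never left unhandled.
      Promise.resolve(listener(chunk)).catch(onError);
    } catch (err) {
      onError(err); // the listener threw synchronously
    }
  };
}

const errors: string[] = [];
const safeListener = guardListener(
  () => {
    throw new Error("Agent listener invoked outside active run");
  },
  (err) => {
    errors.push(String(err));
  },
);

safeListener("tick-late"); // recorded in `errors` instead of crashing the process
```

A wrapper like this only contains the damage. The lifecycle fix (guarding and detaching) is still needed, and `onError` should log so that the underlying race stays visible.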

Example

A potential code modification could involve adding a check in the Agent.processEvents method to ensure the run is still active before processing output:

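// Sketch only: run.isActive() is an assumed method name, not a confirmed OpenClaw API.
if (run.isActive()) {
    // Run is still active: process the output as usual
} else {
    // Run has ended: drop the late output instead of throwing
}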

Notes

The provided guidance is based on the information given in the issue and may require further modifications based on the actual codebase and implementation details.

Recommendation

Apply a workaround by avoiding unnecessary use of pty=true and long-lived background PTY tasks until a permanent fix can be implemented. This will reduce the occurrence of the gateway crash issue.
