openclaw - 💡(How to fix) Fix [Bug]: Gateway hard-crashes with 0xC0000409 (STATUS_STACK_BUFFER_OVERRUN) on Windows during Mattermost streaming reply; auto-respawn frequently wedges [1 comments, 2 participants]

GitHub stats: openclaw/openclaw#71699, fetched 2026-04-26 05:09:38. Comments: 1. Participants: 2. Timeline events: 1 (commented ×1). Reactions: 0.

Bug

The gateway crashes hard on Windows with exit code 3221226505 (0xC0000409, STATUS_STACK_BUFFER_OVERRUN) during normal operation, when incoming Mattermost channel events arrive while the embedded acpx runtime is mid-inference. The user-visible symptom is a Mattermost post that the bot stopped editing mid-stream: the bot creates a post, edits it once or twice with partial content, then dies before sending the rest. Mattermost is left holding the half-finished message.

This is distinct from #64253 (gateway alive but unresponsive) — here the process exits, with a memory-corruption status code. After the crash, the Windows Scheduled Task auto-respawns the gateway, but the respawned instance frequently fails to complete starting channels and sidecars… (CPU-pegged, no Mattermost connect log line, never replies to inbound). A hard kill + clean re-trigger is needed to recover, and stuck-session trajectories pile up under ~/.openclaw/agents/main/sessions/.

Symptoms

  1. Hard crash, exit 0xC0000409. Get-ScheduledTaskInfo -TaskName 'OpenClaw Gateway' reports LastTaskResult: 3221226505 after each crash.
  2. Half-finished Mattermost post. In the final state, update_at is only a few seconds after create_at and never changes again, and the content cuts off mid-sentence. Example: bot reply finalized as "The current year according to the provided" (42 chars, no closing punctuation, update_at - create_at = 2562 ms).
  3. No "post-mortem" log line. The runtime log just stops at the last gateway/ws RPC response or agent/embedded bootstrap warning. No stack trace, no error event in the runtime log.
  4. Post-restart wedging. The auto-respawned gateway often binds the port, logs ready (6 plugins…), then sits at "starting channels and sidecars…" with no mattermost connect line. CPU stays >80% for a single core; node has 4 ESTABLISHED conns to Ollama (127.0.0.1:11434) but zero to Mattermost. After ~3 min it sometimes does connect, but only after multiple slow RPCs (chat.history, models.list) report 30+ second durations on the WS log.
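
The decimal LastTaskResult and the hex status in symptom 1 are the same 32-bit value. A minimal sketch of the conversion follows; the helper names are hypothetical, and only the numeric values come from this report. Some tools print the exit code as a signed integer, so the helpers mask to 32 bits first.

```python
# Hypothetical helper names; only the numeric values come from the report.
STATUS_STACK_BUFFER_OVERRUN = 0xC0000409


def to_ntstatus_hex(exit_code: int) -> str:
    """Format a process exit code as an unsigned 32-bit hex string."""
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


def is_stack_buffer_overrun(exit_code: int) -> bool:
    """True if the exit code (signed or unsigned form) is 0xC0000409."""
    return (exit_code & 0xFFFFFFFF) == STATUS_STACK_BUFFER_OVERRUN


# LastTaskResult as reported by Get-ScheduledTaskInfo in this issue.
print(to_ntstatus_hex(3221226505))
```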

Pattern: 5–15 minutes between restart and next crash under steady-state Mattermost activity.

Environment

  • OpenClaw: 2026.4.23 (a979721) (npm install)
  • Node.js: v24.15.0
  • OS: Windows 11
  • Gateway service: Windows Scheduled Task OpenClaw Gateway running node …\openclaw\dist\index.js gateway --port 18789, bind lan, auth token
  • Channels enabled: Mattermost only (Mattermost Team Edition v11.6.1 over Tailscale, plain HTTP)
  • Agent model: ollama-local/llama3.1:8b (local Ollama on the same host)
  • MEMORY.md: 18,848 chars (truncates to 12,000 every session bootstrap — warning fires for every channel + DM session)
  • Plugins loaded: acpx, browser, device-pair, mattermost, phone-control, talk-voice (6)
  • Cron: 1 enabled job (pcs-redfin-sync-daily, fires at 5 AM ET; not active during the crashes I observed)

Mattermost config (channels.mattermost):

{
  "name": "lab-1",
  "enabled": true,
  "botToken": "<redacted>",
  "baseUrl": "http://<mm-host>.<tailnet>.ts.net:8065",
  "network": { "dangerouslyAllowPrivateNetwork": true },
  "dmPolicy": "open",
  "groupPolicy": "open"
}

The bot is a member of 5 channels.

Reproduction

  1. Configure Mattermost channel as above. Set dmPolicy: open + groupPolicy: open so both DMs and channel messages flow.
  2. Set agent model to ollama-local/llama3.1:8b (or any local Ollama backend that produces multi-second streamed responses).
  3. From a Mattermost user, send @openclaw <prompt that produces multi-line output> to a channel the bot is in. Repeat across 5–10 messages over 5–15 min.
  4. Observe at least one bot reply in Mattermost where the post was created, edited a couple of times, then frozen mid-sentence with no further update_at changes.
  5. Check Get-ScheduledTaskInfo -TaskName 'OpenClaw Gateway': LastTaskResult will be 3221226505.
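
Step 4 can be checked mechanically once a post's fields are in hand. The sketch below is a heuristic only, assuming the post has already been fetched and has been quiet for a while. The `Post` class, the 10-second edit window, and the set of "clean ending" characters are all illustrative assumptions; only the create_at / update_at field names and the example values come from this report.

```python
from dataclasses import dataclass


@dataclass
class Post:
    """Hypothetical container for the three post fields used below."""
    message: str
    create_at: int  # milliseconds since epoch
    update_at: int  # milliseconds since epoch


def looks_truncated(post: Post,
                    max_edit_window_ms: int = 10_000,
                    clean_endings: str = ".!?:)\"'`") -> bool:
    """Heuristic: edited briefly after creation and ends without punctuation."""
    edit_window = post.update_at - post.create_at
    text = post.message.rstrip()
    ends_cleanly = bool(text) and text[-1] in clean_endings
    return 0 < edit_window <= max_edit_window_ms and not ends_cleanly


# Example values from symptom 2 (create_at is arbitrary; only the delta matters).
frozen = Post("The current year according to the provided", 1_000_000, 1_002_562)
print(looks_truncated(frozen))
```

A short reply that legitimately ends without punctuation would also be flagged, so treat a hit as a prompt to check LastTaskResult rather than as proof of a crash.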

Log slice (last entries before death)

Trimmed from ~/.openclaw/Local/Temp/openclaw/openclaw-2026-04-25.log. Note the sequence: ANSI-escape-laden gateway/ws RPC responses with 48-second durations on routine usage.cost / sessions.usage calls, followed by an agent/embedded bootstrap for a Mattermost session, then nothing.

2026-04-25T12:57:09.894-04:00 [INFO] gateway/ws res "channels.status" 1839ms
2026-04-25T12:57:18.380-04:00 [INFO] plugins   mattermost: registered slash command callback at /api/channels/mattermost/command
2026-04-25T12:57:27.941-04:00 [WARN] plugins   1 plugin(s) failed to initialize (validation: device-pair). Run 'openclaw plugins list' for details.
2026-04-25T12:57:55.546-04:00 [WARN] agent/embedded   workspace bootstrap file MEMORY.md is 18848 chars (limit 12000); truncating in injected context
                                       (sessionKey=agent:main:mattermost:channel:<channel-id>)
2026-04-25T12:57:57.264-04:00 [INFO] gateway/ws res "usage.cost" 48239ms
2026-04-25T12:57:57.346-04:00 [INFO] gateway/ws res "sessions.usage" 48328ms
2026-04-25T12:58:11.848-04:00 [INFO] gateway/ws res "node.list" 52ms
2026-04-25T12:58:11.905-04:00 [WARN] agent/embedded   workspace bootstrap file MEMORY.md is 18848 chars (limit 12000); truncating in injected context
                                       (sessionKey=agent:main:mattermost:direct:<user-id>)
<<< process exits 0xC0000409, no further log lines >>>
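
To pull the slow RPCs out of a longer log, a small filter along these lines may help. It is a sketch that assumes the line layout shown in the slice above; the regular expressions and the 30-second default threshold are assumptions, not part of OpenClaw. Because the raw log carries ANSI color escapes, the sketch strips those before matching.

```python
import re

# Assumed patterns, derived only from the log slice in this report.
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*m")
RPC_LINE = re.compile(r'^(\S+) \[\w+\] gateway/ws\s+res "([^"]+)" (\d+)ms$')


def slow_rpcs(lines, threshold_ms=30_000):
    """Return (timestamp, method, duration_ms) for RPC responses at or over the threshold."""
    hits = []
    for line in lines:
        match = RPC_LINE.match(ANSI_ESCAPE.sub("", line).strip())
        if match and int(match.group(3)) >= threshold_ms:
            hits.append((match.group(1), match.group(2), int(match.group(3))))
    return hits


SAMPLE = [
    '2026-04-25T12:57:09.894-04:00 [INFO] gateway/ws res "channels.status" 1839ms',
    '2026-04-25T12:57:57.264-04:00 [INFO] gateway/ws res "usage.cost" 48239ms',
    '2026-04-25T12:57:57.346-04:00 [INFO] gateway/ws res "sessions.usage" 48328ms',
    '2026-04-25T12:58:11.848-04:00 [INFO] gateway/ws res "node.list" 52ms',
]

for timestamp, method, duration_ms in slow_rpcs(SAMPLE):
    print(timestamp, method, duration_ms)
```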

Stuck session trajectories left behind:

8889f05c-….trajectory.jsonl    2,495,462 bytes   last write 12:58:58
704ce0ef-….trajectory.jsonl      341,880 bytes   last write 12:59:57

Suggested investigation

  • Memory corruption / stack overrun likely originates in a native module or a large-buffer copy in the agent/embedded ↔ Ollama path. The repeated MEMORY.md truncation warning (fired on every session bootstrap because the file exceeds the 12,000-character injected limit) is a candidate hot path. Worth checking the truncation code for off-by-one / unsafe writes when the input exceeds the limit by a wide margin (here 18,848 vs. 12,000 chars, roughly 57% over).
  • The 48-second usage.cost / sessions.usage RPCs immediately before death suggest the event loop was stalled (likely on disk I/O or an Ollama HTTP call) while WS frames piled up. A blocked event loop combined with a corrupt Buffer write would line up with 0xC0000409.
  • Investigate device-pair plugin validation error (1 plugin(s) failed to initialize (validation: device-pair)) — appears in every restart even though device-pair is in the loaded list. Probably benign but adds noise.
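
For the first bullet, the property worth testing is that truncated output never exceeds the limit, including at the boundaries (input exactly at the limit, one character over, and far over as in this report). The sketch below states that invariant in Python purely for illustration. It is not OpenClaw's truncation code, and the function name and marker text are hypothetical. In the real Node.js path, string length and encoded byte length can differ for non-ASCII text, which is one place an off-by-one could hide.

```python
def truncate_for_context(text: str, limit: int, marker: str = "\n[truncated]") -> str:
    """Return text cut to at most `limit` characters, ending with a marker if cut."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    if len(text) <= limit:
        return text  # at or under the limit: unchanged
    if len(marker) >= limit:
        return text[:limit]  # no room for the marker: hard cut
    return text[: limit - len(marker)] + marker


# Sizes from the report: an 18,848-char file against a 12,000-char limit.
memory_md = "x" * 18_848
injected = truncate_for_context(memory_md, 12_000)
print(len(injected))
```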

Workaround

External watchdog scheduled task that probes /health every 60 s, kills lingering node …openclaw\dist\index.js gateway processes, and re-triggers the gateway task after 2 consecutive failures with a 5-minute restart cooldown. Recovers from both this crash and the post-crash wedging in #64253.
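
The restart decision in this workaround (2 consecutive failures, 5-minute cooldown) can be sketched as a small policy object. This is an illustrative reconstruction, not the reporter's actual script. The class and function names are hypothetical, and the restart action itself (killing lingering gateway processes and re-triggering the scheduled task) is left to a caller-supplied function.

```python
import http.client
import time
import urllib.error
import urllib.request


class RestartPolicy:
    """Restart after N consecutive failed probes, at most once per cooldown."""

    def __init__(self, failures_needed: int = 2, cooldown_s: float = 300.0):
        self.failures_needed = failures_needed
        self.cooldown_s = cooldown_s
        self.consecutive_failures = 0
        self.last_restart = None  # time of the last restart, if any

    def observe(self, healthy: bool, now: float) -> bool:
        """Record one probe result; return True if a restart should happen now."""
        if healthy:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures < self.failures_needed:
            return False
        if self.last_restart is not None and now - self.last_restart < self.cooldown_s:
            return False  # still cooling down from the previous restart
        self.last_restart = now
        self.consecutive_failures = 0
        return True


def probe(url: str, timeout_s: float = 10.0) -> bool:
    """True if the health URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


def watch(url: str, restart, policy: RestartPolicy, interval_s: float = 60.0) -> None:
    """Probe forever; call restart() whenever the policy says so."""
    while True:
        if policy.observe(probe(url), time.monotonic()):
            restart()
        time.sleep(interval_s)
```

A caller would pass the gateway's health URL (for the setup in this report, port 18789 and the /health path; the host is an assumption) together with a `restart` callable that performs the kill and re-trigger steps. Keeping the policy separate from the probe makes the failure counting and cooldown easy to test without a running gateway.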

Related

  • #64253 — post-crash unresponsive state (this issue's restart-wedge symptom is similar but the trigger here is a hard exit, not a hang)
  • #69693 — bonjour mDNS watchdog crashes (we see repeated bonjour: watchdog detected non-announced service warnings; not yet sure if related)
  • #56215 — WS connection leak (we see "handshake timeout" entries on every restart from control-UI clients)

extent analysis

TL;DR

The most likely fix for the OpenClaw gateway crash on Windows with exit code 0xC0000409 is to investigate and address potential memory corruption issues in the agent/embedded path, particularly in the truncation code for the MEMORY.md file.

Guidance

  1. Investigate memory corruption: Focus on the agent/embedded path, especially the truncation code for MEMORY.md, to identify potential off-by-one or unsafe writes that could cause memory corruption.
  2. Check event loop stalls: Examine the 48-second usage.cost and sessions.usage RPCs to determine if the event loop is being stalled due to disk I/O or Ollama HTTP calls, which could contribute to the crash.
  3. Verify plugin initialization: Although likely benign, investigate the device-pair plugin validation error to ensure it's not contributing to the issue.
  4. Implement the suggested workaround: Set up an external watchdog scheduled task to probe the /health endpoint, kill lingering processes, and re-trigger the gateway task after consecutive failures to recover from crashes and post-crash wedging.

Example

No specific code snippet is provided, as the issue requires investigation into the agent/embedded path and potential memory corruption.

Notes

The provided workaround may help recover from crashes, but a permanent fix will require addressing the underlying memory corruption issue. The device-pair plugin validation error, although likely unrelated, should be investigated to minimize potential noise.

Recommendation

Apply the suggested workaround to recover from crashes and post-crash wedging while investigating the underlying memory corruption issue.
