
openclaw: [Bug]: Codex ACP session archive self-ingestion can bloat ACP history, OOM codex-acp, and crash the host

Jackten · 2026-04-01T01:51:43Z



openclaw/openclaw#58657•Fetched 2026-04-08 01:59:40


Codex ACP sessions can ingest enormous transcript payloads from OpenClaw's own archived session files, persist those payloads into ~/.acpx/sessions/*.stream*.ndjson, and then repeatedly replay/load the bloated session state until codex-acp consumes tens of GB of RSS and gets OOM-killed. Under enough pressure, this can destabilize or reboot the host.

This appears to be a session self-ingestion / replay-amplification failure mode, not just a generic memory leak.

The pattern I found is:

1. A Codex ACP turn runs a broad recursive search over OpenClaw state, especially paths like /root/.openclaw or /root/.openclaw/agents/main/sessions
2. That search matches archived OpenClaw transcript files containing prior tool results and large text blobs
3. The huge search result is emitted as tool_call_update content and persisted into the ACP session stream under ~/.acpx/sessions/...stream*.ndjson
4. Later session/load / replay of the same session forces codex-acp to reload or process an extremely large history
5. codex-acp memory usage grows to tens of GB and is killed by the OOM killer; after repeated pressure, the whole machine may crash/reboot

I found multiple giant ACP session streams caused by this exact pattern, and the host I investigated also has journal evidence of repeated codex-acp OOM kills.


Bug type

Crash / resource-exhaustion bug


Steps to reproduce

I do not have a tiny synthetic repro yet, but this appears reproducible with a persistent Codex ACP session and a broad grep/rg over OpenClaw session history.

Probable repro shape:

1. Run OpenClaw with ACP/Codex enabled
2. Start a persistent Codex ACP session bound to a thread/topic
3. From that session, run a broad recursive search that includes OpenClaw session archives, for example searching paths under:
   - /root/.openclaw
   - /root/.openclaw/agents/main/sessions
4. Let that tool call return a large result set containing historical transcript/tool-result content
5. Continue using the same ACP session or restart the gateway so the session is loaded again
6. Observe ACP session artifacts grow rapidly and codex-acp memory climb

A stronger practical repro is to search for phrases known to appear in archived session history so that the search returns transcript blobs rather than just source hits.

Expected behavior

- ACP sessions should not recursively ingest massive archived session/transcript payloads into their own durable history
- Tool output should be aggressively bounded, summarized, truncated, or rejected when it comes from session archives/log stores
- Re-loading a session should not replay enough historical content to drive codex-acp into runaway memory growth
- A single bad search result should not be able to create a persistent OOM trap for future loads of the same session

Actual behavior

Observed on a live host:

- Repeated journal evidence of codex-acp being OOM-killed inside openclaw-gateway.service
- Severe pre-crash memory pressure (systemd-journald: Under memory pressure, flushing caches.)
- Hard host reboot with no clean shutdown sequence
- Several ACP sessions with abnormally large persistent stream logs
- Very large tool_call_update payloads containing text from OpenClaw archived sessions

Representative on-disk evidence from ~/.acpx/sessions:

- 019ceddb-8f50-7511-b52a-3d430ea6297f → 396261812 bytes of stream data
- 019d4588-d5ea-7fe1-85cb-83af377caf39 → 373457587 bytes of stream data
- 019cf1d0-78e6-7810-bf88-ee56eba800ce → 123347001 bytes of stream data
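
Per-session totals like the ones above could be gathered with a small script. The following is a minimal sketch, assuming stream files are named `<session-id>.stream*` and live in a single directory. The function name `streamBytesBySession` is hypothetical and not part of acpx.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Sum the sizes of all `<session-id>.stream*` files in a directory,
// grouped by session ID.
// Assumption: the session ID is everything before the first ".stream"
// in the file name.
function streamBytesBySession(dir) {
  const totals = new Map();
  for (const name of fs.readdirSync(dir)) {
    const idx = name.indexOf(".stream");
    if (idx <= 0) continue; // not a stream file
    const sessionId = name.slice(0, idx);
    const size = fs.statSync(path.join(dir, name)).size;
    totals.set(sessionId, (totals.get(sessionId) ?? 0) + size);
  }
  // Largest sessions first, as [sessionId, totalBytes] pairs.
  return [...totals.entries()].sort((a, b) => b[1] - a[1]);
}
```

Calling it on the sessions directory (for example `path.join(require("node:os").homedir(), ".acpx", "sessions")`) would list the heaviest sessions first, which is usually enough to spot a poisoned one.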

Representative bad command pattern captured in ACP session history:

rg -n "memory-lancedb|vector ready|provider: \"lancedb\"|plugin registered|auto-captured|injecting .* memories|bundled LanceDB runtime unavailable" /tmp/openclaw /root/.openclaw -g '!**/*.sqlite*' -g '!**/node_modules/**'

That output included hits from archived OpenClaw session files like:

/root/.openclaw/agents/main/sessions/...topic-12088.jsonl...

and those results were persisted back into ACP session history as multi-megabyte tool_call_update chunks.

I also found another giant session caused by searching directly inside:

/root/.openclaw/agents/main/sessions

with similarly huge persisted output.

OpenClaw version

Observed on OpenClaw Gateway (v2026.3.28-beta.1)

Local source checkout used for investigation:

repo: openclaw/openclaw
commit checked out locally: fa339dbd92

Operating system

Ubuntu 25.10 (x86_64)

Kernel:

Linux netcup-clawd 6.17.0-14-generic

Install method

systemd user service, local source checkout / npm-style runtime

Model

Codex ACP (@zed-industries/codex-acp)

Provider / routing chain

Telegram / gateway -> OpenClaw ACP runtime (acpx) -> npx @zed-industries/codex-acp@^0.9.5 -> codex-acp

Representative live process chain:

openclaw-gateway
acpx ... codex prompt --session ...
acpx dist/cli.js __queue-owner
npm exec @zed-industries/codex-acp@^0.9.5
codex-acp

Config file / key location

No special config change seems required beyond using OpenClaw with ACP/Codex sessions.

The main issue seems to be broad searches over OpenClaw runtime/session storage paths combined with durable ACP session persistence.

Additional provider/model setup details

Representative queue-owner payload from a live session:

permissionMode: approve-all
nonInteractivePermissions: fail
ttlMs: 100
maxQueueDepth: 16
agent command: npx @zed-industries/codex-acp@^0.9.5

I do not think ttlMs=100 is the primary bug, but short queue-owner lifetimes may make session reload/replay churn more frequent.

Logs, screenshots, and evidence

Journal evidence of the same failure class on this host:

Mar 31 13:48:13 kernel: Out of memory: Killed process 3956731 (codex-acp) ... anon-rss:29226016kB
Mar 31 13:48:14 systemd[1463]: openclaw-gateway.service: Failed with result 'oom-kill'.

Mar 29 21:49:32 kernel: Out of memory: Killed process 1335063 (codex-acp) ... anon-rss:30094768kB
Mar 29 21:49:36 systemd[1463]: openclaw-gateway.service: Failed with result 'oom-kill'.

Mar 21 11:13:30 kernel: Out of memory: Killed process 357499 (codex-acp) ... anon-rss:30561300kB
Mar 21 11:13:31 systemd[1463]: openclaw-gateway.service: Failed with result 'oom-kill'.
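
To compare these kills across time, the `anon-rss` figure can be pulled out of each kernel line. This is a minimal sketch that assumes the line shape shown above; `parseOomKill` is a hypothetical helper name.

```javascript
// Matches kernel OOM-kill lines such as:
//   "... Out of memory: Killed process 3956731 (codex-acp) ... anon-rss:29226016kB"
const OOM_LINE = /Killed process (\d+) \(([^)]+)\).*?anon-rss:(\d+)kB/;

// Return { pid, comm, anonRssKb, anonRssGiB } for an OOM-kill line,
// or null if the line does not match.
function parseOomKill(line) {
  const m = OOM_LINE.exec(line);
  if (!m) return null;
  const anonRssKb = Number(m[3]);
  return {
    pid: Number(m[1]),
    comm: m[2],
    anonRssKb,
    // The kernel's "kB" here means units of 1024 bytes, so divide by
    // 1024 * 1024 to get GiB.
    anonRssGiB: anonRssKb / (1024 * 1024),
  };
}
```

Applied to the Mar 31 line, this gives an anonymous RSS of about 27.9 GiB for `codex-acp` at the moment it was killed.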

Pre-reboot symptom tonight:

systemd-journald: Under memory pressure, flushing caches.

Then the machine rebooted without a clean shutdown sequence.

Representative giant persisted ACP output chunks were ~3.1 MB to ~3.4 MB each and were stored as session/update -> tool_call_update entries.

Representative huge session stats I measured:

- 019d3af8-e0b2-78e2-a92f-0c70b575886e.stream.ndjson
  - 22,978,623 bytes
  - 25,918 lines
  - 24,315 agent_message_chunk updates
- 019d4588-d5ea-7fe1-85cb-83af377caf39.stream*
  - 373,457,587 bytes total
  - contained repeated tool_call_update entries ~3.4 MB each
- 019ceddb-8f50-7511-b52a-3d430ea6297f.stream*
  - 396,261,812 bytes total
  - contained repeated tool_call_update entries ~3.1 MB each
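
Stats of this kind (line count, bytes, updates per kind) can be reproduced with a short summarizer. The sketch below is illustrative only. It assumes each NDJSON line is a JSON-RPC message whose update kind sits at `params.update.sessionUpdate`, which matches the session/update -> tool_call_update shape described above but has not been checked against the acpx on-disk format. `summarizeStream` is a hypothetical name.

```javascript
// Summarize an ACP stream given its NDJSON text.
// Assumption: session updates look like
//   {"method":"session/update","params":{"update":{"sessionUpdate":"tool_call_update", ...}}}
function summarizeStream(ndjsonText) {
  const stats = { lines: 0, bytes: 0, maxLineBytes: 0, byKind: Object.create(null) };
  for (const line of ndjsonText.split("\n")) {
    if (line.trim() === "") continue;
    const lineBytes = Buffer.byteLength(line, "utf8");
    stats.lines += 1;
    stats.bytes += lineBytes; // newline separators are not counted
    stats.maxLineBytes = Math.max(stats.maxLineBytes, lineBytes);
    let kind = "unparseable";
    try {
      const msg = JSON.parse(line);
      kind = msg?.params?.update?.sessionUpdate ?? msg?.method ?? "other";
    } catch {
      // Keep counting sizes even if a line is not valid JSON.
    }
    stats.byKind[kind] = (stats.byKind[kind] ?? 0) + 1;
  }
  return stats;
}
```

For streams in the hundreds of megabytes, reading the whole file into one string is itself costly, so a real tool would feed lines through `readline` over a file stream instead. A `maxLineBytes` in the multi-megabyte range is the signature of the oversized tool_call_update records described here.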

Impact and severity

High / potentially critical for hosts running persistent ACP sessions.

Impact:

- codex-acp can reach ~29-30 GB RSS and be OOM-killed
- openclaw-gateway.service can fail with oom-kill
- the host can become unstable or reboot under sustained memory pressure
- once a session is poisoned with giant persisted output, future loads may keep re-triggering the problem

This is especially dangerous on long-lived Telegram/ACP thread workflows where users may naturally search logs, sessions, or workspace state.

Additional information

Why I think this is distinct from the already-open orphan/zombie ACP issues:

- Related but different issue: #44790 is about orphaned ACP child processes / swap exhaustion
- Related but different issue: #48573 is about embedded-run zombie session state

This report is specifically about:

- giant ACP durable session artifacts
- self-ingestion of OpenClaw archived session output
- replay/load amplification through session/load
- codex-acp memory blow-up due to persisted transcript/tool output volume

Most likely fix areas:

- Never allow broad log/session archive results to be persisted verbatim into ACP durable session history beyond a strict cap
- Add hard truncation / summarization for tool_call_update payloads before they hit ~/.acpx/sessions
- Exclude OpenClaw session archives/log stores from broad ACP search defaults, or warn/require explicit opt-in
- Add defensive max-bytes / max-events limits on ACP session replay/load
- Detect pathological session artifacts and refuse to load them without compaction / repair
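
The max-bytes / max-events idea for replay/load could look roughly like the sketch below. It is not taken from acpx or codex-acp. The limit values, the `planReplay` name, and the choice to keep the newest events are all assumptions made for illustration.

```javascript
// Hypothetical limits; suitable values would depend on the deployment.
const DEFAULT_LIMITS = {
  maxBytes: 32 * 1024 * 1024, // total bytes replayed per load
  maxEvents: 50000,           // total events replayed per load
  maxEventBytes: 256 * 1024,  // any single event above this is skipped
};

// Decide which persisted events are safe to replay.
// `lines` is an array of NDJSON lines (strings), oldest first.
function planReplay(lines, limits = DEFAULT_LIMITS) {
  const kept = [];
  let totalBytes = 0;
  let skippedOversized = 0;
  // Walk from newest to oldest so the most recent history survives the caps.
  for (let i = lines.length - 1; i >= 0; i--) {
    const size = Buffer.byteLength(lines[i], "utf8");
    if (size > limits.maxEventBytes) {
      skippedOversized += 1;
      continue;
    }
    if (kept.length >= limits.maxEvents || totalBytes + size > limits.maxBytes) break;
    kept.push(lines[i]);
    totalBytes += size;
  }
  kept.reverse(); // restore chronological order
  // `truncated` is true when older events were dropped because a cap was hit.
  const truncated = kept.length + skippedOversized < lines.length;
  return { kept, totalBytes, skippedOversized, truncated };
}
```

With a guard like this, a single multi-megabyte tool_call_update is skipped rather than replayed, and a session that is too large overall loads with only its most recent events. A loader could also refuse outright and ask for compaction when `skippedOversized` or `truncated` indicates a pathological artifact.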

If useful, I can provide a follow-up issue comment with more exact session IDs, offending commands, and example raw snippets from the oversized tool_call_update records.

extent analysis

TL;DR

To fix the crash/resource-exhaustion bug, limit the size of tool_call_update payloads persisted in ACP session history and implement defensive measures against replay/load amplification.

Guidance

- Implement payload size limits: Enforce a strict cap on the size of tool_call_update payloads before they are persisted into ~/.acpx/sessions.
- Truncate or summarize large payloads: Automatically truncate or summarize large tool_call_update payloads to prevent them from causing memory blow-up.
- Exclude archived sessions from searches: Modify ACP search defaults to exclude OpenClaw session archives/log stores, or require explicit opt-in to search these areas.
- Add defensive limits on session replay/load: Introduce max-bytes or max-events limits on ACP session replay/load to prevent pathological session artifacts from causing issues.
- Detect and refuse pathological artifacts: Develop a mechanism to detect oversized session artifacts and refuse to load them without compaction or repair.
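
For the search-exclusion item, one option is to build the `rg` argument list in a single place and add exclusion globs unless the caller opts in. The sketch below only constructs arguments and does not run anything. `buildRgArgs`, the `includeSessionArchives` option, and the glob patterns are assumptions, and the globs would need to be verified against the real directory layout before relying on them.

```javascript
// Paths named in the report that hold archived sessions and ACP streams.
// Treating them as default exclusions is an assumption of this sketch.
const DEFAULT_EXCLUDE_GLOBS = [
  "!**/.openclaw/agents/*/sessions/**",
  "!**/.acpx/sessions/**",
];

// Build an argument array for ripgrep (rg).
// -n prints line numbers, --max-columns omits very long lines from the
// output, -e gives the pattern, and -g adds an include/exclude glob.
function buildRgArgs(pattern, searchPaths, { includeSessionArchives = false } = {}) {
  const args = ["-n", "--max-columns", "500", "-e", pattern];
  if (!includeSessionArchives) {
    for (const glob of DEFAULT_EXCLUDE_GLOBS) args.push("-g", glob);
  }
  return [...args, "--", ...searchPaths];
}
```

The `--max-columns` cap matters as much as the globs here, since archived transcripts tend to store whole tool results on single very long lines.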

Example

A possible implementation could involve modifying the tool_call_update handling code to truncate payloads exceeding a certain size (e.g., 1 MB) before persisting them:

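const MAX_PAYLOAD_CHARS = 1024 * 1024; // roughly 1 MB of text; the cap is illustrative

// Return the text to persist: the payload itself if it fits, otherwise a truncated copy.
// The function name is hypothetical; it is not an existing acpx or codex-acp API.
function capToolCallUpdate(payload, maxChars = MAX_PAYLOAD_CHARS) {
  if (payload.length <= maxChars) return payload;
  const dropped = payload.length - maxChars;
  // Keep the head and record how much was cut so readers know the output is partial.
  return `${payload.slice(0, maxChars)}\n[truncated ${dropped} characters]`;
}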

Notes

The provided solution focuses on limiting the size of persisted payloads and implementing defensive measures. However, a more comprehensive fix might require additional changes, such as optimizing ACP session management or improving memory handling in codex-acp.

Recommendation

Apply a workaround by implementing payload size limits and defensive measures against replay/load amplification. This approach can help mitigate the issue until a more comprehensive fix is available.
