openclaw - 💡(How to fix) Fix [Bug]: Active Memory Telegram preflight retains local embedding model mapping after timeout

openclaw · 2026-05-18 22:28:30

Consolidated replacement for #83773 and original #83752, with the later live profiling evidence folded into the main issue body.

On a live Linux VPS running OpenClaw 2026.5.18 (50a2481), Telegram group-topic turns that trigger Active Memory preflight can sharply increase gateway parent RSS and leave it elevated after the turn completes, even when /readyz is healthy and OpenClaw reports 0 queued · 0 running.

The newest profiling datapoint narrows the retained RSS from a generic gateway memory symptom to a specific retained file-backed local embedding model mapping:

rss_kb=314504 pss_kb=314504 anon_kb=0 priv_clean_kb=314504 priv_dirty_kb=0 path=/home/ubuntu/.node-llama-cpp/models/hf_ggml-org_embeddinggemma-300m-qat-Q8_0.gguf

Important: this does not appear to be a local chat-model fallback. The live config showed chat routing as openai/gpt-5.5 primary with google/gemini-2.5-flash fallback. The local GGUF model is used by Memory Search / memory-core as the local embedding backend. Active Memory runs before the normal reply, performs memory search, and that memory-search path loads or touches the local embedding model in the gateway parent process. After the Active Memory timeout, that mapping remains resident while the gateway is otherwise idle.

Root Cause

Severity: Medium. The gateway remained healthy on this VPS because the host has enough RAM, but RSS crossed OpenClaw's own diagnostic threshold before restart and can grow back quickly after user-visible turns.

Fix / Workaround

No gateway restart, config change, hotfix, or heap snapshot was performed during this capture. A 2s sampler recorded /proc/<gateway-pid>/status, smaps_rollup, child RSS, and cgroup memory from 2026-05-18T22:16:30Z to 2026-05-18T22:22:30Z.
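
The sampler itself is not included in this report. A minimal sketch of what one per-sample read could look like, assuming a Linux /proc filesystem; the names `parse_status_kb` and `sample_status` are hypothetical and only the /proc/<pid>/status part of the capture is shown:

```python
import time
from pathlib import Path

# Fields of /proc/<pid>/status tracked in this report (values are in kB).
FIELDS = ("VmRSS", "RssAnon", "RssFile", "VmData")


def parse_status_kb(text: str) -> dict:
    """Extract the tracked kB fields from the text of /proc/<pid>/status."""
    out = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in FIELDS:
            # Lines look like "VmRSS:\t  600848 kB".
            out[key] = int(rest.split()[0])
    return out


def sample_status(pid: int, interval_s: float = 2.0, count: int = 3):
    """Yield (unix_time, fields) tuples, one every interval_s seconds."""
    for i in range(count):
        text = Path(f"/proc/{pid}/status").read_text()
        yield time.time(), parse_status_kb(text)
        if i < count - 1:
            time.sleep(interval_s)
```

Keeping the parser separate from the file read makes it possible to check the field extraction against captured text without a live gateway process.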


Bug type

Behavior bug (latency and retained process memory after completed or timed-out Active Memory preflight)

Beta release blocker

Steps to reproduce

1. Run OpenClaw 2026.5.18 as a systemd user gateway with Telegram and Active Memory enabled.
2. Use a Telegram group topic/session where Active Memory is allowed for group/channel style sessions.
3. Configure Active Memory with queryMode: "message", timeoutMs: 5000, setupGraceTimeoutMs: 5000, and allowedChatTypes: ["direct", "group", "channel"].
4. Restart the gateway cleanly and wait for /readyz.
5. Record gateway parent RSS, RssAnon, RssFile, PSS, child RSS, cgroup memory, and OpenClaw task pressure while idle.
6. Send /active-memory on in the Telegram topic.
7. Send one short normal Telegram message in that topic.
8. Wait for the reply and then leave the gateway idle.
9. Re-check /readyz, task pressure, parent RSS/RssAnon/RssFile/PSS, child RSS, and top smaps mappings.
10. Compare against the clean post-restart baseline and/or repeat with Active Memory disabled in the same topic.

Expected behavior

Completed Telegram turns should not leave the gateway retaining hundreds of MB of extra RSS after the system is idle.

If Active Memory times out, it should release/clean up transient recall resources it owns and degrade the reply path without leaving a high retained RSS footprint. If the local memory-search embedding model is intentionally cached, that should be explicit and bounded so operators do not see an unexpected ~300 MB retained mapping after a timed-out Telegram preflight.
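
One way to make such caching explicit and bounded is an idle time-to-live: keep the model loaded while recall runs are frequent and unload it after a quiet period. The sketch below is illustrative only. `IdleUnloadCache`, its `load`/`unload` callbacks, and the TTL value are hypothetical and do not reflect OpenClaw or node-llama-cpp APIs.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class IdleUnloadCache(Generic[T]):
    """Hold one expensive resource and release it after idle_ttl_s without use."""

    def __init__(self, load: Callable[[], T], unload: Callable[[T], None],
                 idle_ttl_s: float, clock: Callable[[], float] = time.monotonic):
        self._load = load
        self._unload = unload
        self._idle_ttl_s = idle_ttl_s
        self._clock = clock
        self._resource: Optional[T] = None
        self._last_used = 0.0

    def get(self) -> T:
        """Return the resource, loading it on first use or after an unload."""
        if self._resource is None:
            self._resource = self._load()
        self._last_used = self._clock()
        return self._resource

    def sweep(self) -> bool:
        """Unload if idle for at least idle_ttl_s; return True if it unloaded."""
        if self._resource is None:
            return False
        if self._clock() - self._last_used < self._idle_ttl_s:
            return False
        self._unload(self._resource)
        self._resource = None
        return True
```

A periodic timer would call `sweep()`. Whether unloading actually releases the file-backed mapping depends on the embedding runtime and is not established by this report.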

Actual behavior

The affected VPS repeatedly showed:

- clean post-restart gateway parent RSS around 430-570 MB after settling;
- Active Memory Telegram turns increasing parent RSS to around 1.0-1.08 GB;
- /readyz healthy and task pressure 0 queued · 0 running while RSS stayed elevated;
- a clean restart bringing the gateway back to the lower baseline;
- during the newest controlled test (queryMode: "message", 5s timeout), the largest retained mapping after the timeout was the local Memory Search embedding GGUF file.

OpenClaw version

OpenClaw 2026.5.18 (50a2481)

Operating system

Ubuntu 24.04.3 LTS, Linux 6.17.0-1011-oracle, aarch64

Install method

System-global npm install:

node=v22.22.0
npm=10.9.4
install root=/usr/lib/node_modules/openclaw
service=/home/ubuntu/.config/systemd/user/openclaw-gateway.service
ExecStart=/usr/bin/node /usr/lib/node_modules/openclaw/dist/index.js gateway --port 18789

Model and routing

Normal chat/model routing from the live gateway config:

agents.defaults.model.primary: openai/gpt-5.5
agents.defaults.model.fallbacks[0]: google/gemini-2.5-flash

No local chat/model fallback was found in the checked config.

Memory Search configuration from the same live gateway:

agents.defaults.memorySearch.provider: local
agents.defaults.memorySearch.fallback: none
plugins.slots.memory: memory-core

openclaw memory status --deep confirmed the local embedding model used by Memory Search:

Memory Search (main)
Provider: local (requested: local)
Model: hf:ggml-org/embeddinggemma-300m-qat-q8_0-GGUF/embeddinggemma-300m-qat-Q8_0.gguf
Sources: memory
Indexed: 240/240 files · 2184 chunks
Store: ~/.openclaw/memory/main.sqlite
Embeddings: ready
Vector store: ready
Semantic vectors: ready
Vector dims: 768
Vector path: /usr/lib/node_modules/openclaw/node_modules/sqlite-vec-linux-arm64/vec0.so
Embedding cache: enabled (2884 entries)
Recall store: 9011 entries

For the codex agent, the same local embedding backend was configured, though with no indexed chunks at the time checked:

Memory Search (codex)
Provider: local (requested: local)
Model: hf:ggml-org/embeddinggemma-300m-qat-q8_0-GGUF/embeddinggemma-300m-qat-Q8_0.gguf
Indexed: 0/42 files · 0 chunks
Store: ~/.openclaw/memory/codex.sqlite

Enabled plugins and local state size

plugins list: 92 known, 11 enabled/loaded, 0 errors
enabled: active-memory, anthropic, file-transfer, google, memory-core, memory-wiki, ollama, openai, telegram, tokenjuice, codex
codex plugin source: ~/.openclaw/npm/node_modules/@openclaw/codex
codex plugin version: 2026.5.6

~/.openclaw total: 7.7G
~/.openclaw/agents/main/sessions: 376M, 687 jsonl files
~/.openclaw/agents/codex/sessions: 368K, 10 jsonl files
~/.openclaw/agents/main/agent/codex-home/logs_2.sqlite: 714,862,592 bytes

Active Memory configurations tested

Initial heavy configuration:

{
  "agents": ["main"],
  "allowedChatTypes": ["direct", "group", "channel"],
  "enabled": true,
  "logging": true,
  "maxSummaryChars": 220,
  "persistTranscripts": false,
  "promptStyle": "contextual",
  "queryMode": "full",
  "setupGraceTimeoutMs": 30000,
  "timeoutMs": 30000
}

Reduced configuration, still reproduced:

{
  "queryMode": "recent",
  "promptStyle": "balanced",
  "timeoutMs": 15000,
  "setupGraceTimeoutMs": 15000,
  "allowedChatTypes": ["direct", "group", "channel"]
}

Lowest-latency controlled Telegram-topic configuration, still reproduced:

{
  "queryMode": "message",
  "timeoutMs": 5000,
  "setupGraceTimeoutMs": 5000,
  "allowedChatTypes": ["direct", "group", "channel"],
  "persistTranscripts": false,
  "logging": true
}

Evidence: controlled profiling run with local embedding model mapping

Baseline immediately before the controlled run:

readyz: healthy
task pressure: 0 queued / 0 running
gateway parent PID: 1289215
parent RSS: 600848 kB
parent RssAnon: 540188 kB
parent RssFile: 60660 kB
parent PSS: 546904 kB
child Codex app-server PID: 1341301
child RSS: 46032 kB

Test sequence:

1. Sent /active-memory on in the same Telegram group topic.
2. Sent one short normal Telegram message in that topic.
3. Waited for the reply and then left the gateway idle.

Relevant journal lines, redacted to behavior and timing:

22:16:45 inbound Telegram group/topic command, 17 chars
22:16:46 outbound send ok

22:16:58 inbound Telegram group/topic message, 58 chars
22:17:00 main embedded agent started
22:17:01 active-memory start timeoutMs=5000 queryChars=58 searchQueryChars=58
22:17:01 active-memory embedded run started
22:17:11 before_prompt_build handler from active-memory failed: timed out after 10000ms
22:17:12 active-memory done status=timeout elapsedMs=10236 summaryChars=0
22:17:40 Telegram sendMessage ok

Sampler summary:

samples: 175
first_ts: 2026-05-18T22:16:30Z
last_ts: 2026-05-18T22:22:30Z

rss_kb_min: 578956 at 22:16:59
rss_kb_max: 1029036 at 22:17:12
rss_kb_last: 997392 at 22:22:30

pss_kb_min: 524332 at 22:16:59
pss_kb_max: 976556 at 22:17:12
pss_kb_last: 942823 at 22:22:30

rss_anon_kb_min: 518232 at 22:16:59
rss_anon_kb_max: 648528 at 22:17:12
rss_anon_kb_last: 616420 at 22:22:30

rss_file_kb_min: 60724 at 22:16:30
rss_file_kb_max: 380972 at 22:17:09
rss_file_kb_last: 380972 at 22:22:30

vmdata_kb_min: 610664 at 22:16:59
vmdata_kb_max: 842720 at 22:17:12
vmdata_kb_last: 809992 at 22:22:30

child_rss_kb_min/max/last: 46032
cgroup_current_bytes_min: 604041216 at 22:16:59
cgroup_current_bytes_max: 879431680 at 22:17:26
cgroup_current_bytes_last: 725692416 at 22:22:30
cgroup_peak_bytes_max/last: 944881664

Idle state after the sampler finished:

2026-05-18T22:23:02Z
readyz: healthy
task pressure: 0 queued / 0 running
parent gateway RSS: 997424 kB
parent gateway RssAnon: 616452 kB
parent gateway RssFile: 380972 kB
child Codex app-server RSS: 46032 kB
threads: 12
swap: 0

Top retained mapping after the timeout:

rss_kb=314504 pss_kb=314504 anon_kb=0 priv_clean_kb=314504 priv_dirty_kb=0 shared_clean_kb=0 path=/home/ubuntu/.node-llama-cpp/models/hf_ggml-org_embeddinggemma-300m-qat-Q8_0.gguf

Other large mappings included [heap] around 59948 kB, /usr/bin/node around 59008 kB, and anonymous blocks. The file-backed GGUF mapping (314504 kB, about 307 MiB) was the largest single retained mapping.
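
The per-mapping line above reads like a summary derived from /proc/<pid>/smaps. The exact tool used is not stated; the following is a minimal sketch of how such a ranking could be produced, assuming the standard Linux smaps layout. `top_mappings` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Mapping header: "start-end perms offset dev inode [path]".
HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s+\S+\s+[0-9a-f]+\s+\S+\s+\d+\s*(.*)$")
# Counter line: "Rss:   314504 kB".
FIELD = re.compile(r"^(\w+):\s+(\d+) kB$")


def top_mappings(smaps_text: str, limit: int = 5):
    """Sum smaps counters per path and return the largest entries by Rss."""
    totals = defaultdict(lambda: defaultdict(int))
    path = None
    for line in smaps_text.splitlines():
        header = HEADER.match(line)
        if header:
            # Mappings without a path are grouped under one label.
            path = header.group(1) or "[anon]"
            continue
        field = FIELD.match(line)
        if field and path is not None:
            totals[path][field.group(1)] += int(field.group(2))
    ranked = sorted(totals.items(), key=lambda kv: kv[1]["Rss"], reverse=True)
    return [(p, dict(c)) for p, c in ranked[:limit]]
```

Counters are summed per path because one file can back several mappings. An entry whose Private_Clean equals its Rss, with Private_Dirty at 0, corresponds to the priv_clean_kb/priv_dirty_kb pattern reported for the GGUF file.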

Interpretation from this capture: a timed-out Active Memory preflight appears to load or touch the local node-llama-cpp Memory Search embedding model in the gateway parent process; that model mapping remains resident after the Active Memory timeout, after the Telegram reply, and while /readyz is healthy with task pressure idle. This does not by itself prove the final fix, but it narrows the retained-RSS evidence from generic gateway RSS growth to a specific retained file-backed model mapping plus a smaller anonymous-memory increase.
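
The split between the file-backed mapping and the smaller anonymous increase can be worked out from the numbers already in this section. The snippet below is just that arithmetic, using the baseline and post-run idle values copied from the captures above; the variable names are illustrative.

```python
# Values in kB, copied from the baseline and post-run idle captures in this report.
baseline = {"rss": 600848, "anon": 540188, "file": 60660}
idle = {"rss": 997424, "anon": 616452, "file": 380972}
gguf_mapping_kb = 314504  # Rss of the embedding model mapping

# Retained growth per category between baseline and idle-after-run.
delta = {k: idle[k] - baseline[k] for k in baseline}
gguf_share_of_file = gguf_mapping_kb / delta["file"]
gguf_share_of_total = gguf_mapping_kb / delta["rss"]

print(delta)
print(f"GGUF share of file-backed growth: {gguf_share_of_file:.1%}")
print(f"GGUF share of total retained growth: {gguf_share_of_total:.1%}")
```

The anonymous and file-backed deltas (76264 kB and 320312 kB) sum to the RSS delta of 396576 kB, about 387 MiB. The GGUF mapping accounts for roughly 98% of the file-backed growth and roughly 79% of the total retained growth.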

Evidence: original full-context observation

Before clean restart on 2026.5.18:
RSS: ~1.4-1.6 GB
Memory diagnostic fired: rssBytes=1651253248 heapUsedBytes=498389504 thresholdBytes=1610612736

After clean restart:
~446 MB RSS shortly after ready
~509 MB RSS after ~90s
~570 MB RSS after ~6m45s
~566 MB RSS after ~9m27s

After one Telegram weather ask plus a follow-up log-check turn:
~1,001,404 kB RSS (~978 MiB)
readyz healthy
0 queued / 0 running
gateway process threads: 12
no child processes observed
swap: 0

Full-context timing:

20:25:15.970 inbound Telegram message received
20:25:21.200 embedded agent started (~5.2s after inbound)
20:25:23.381 Active Memory started
20:25:40.285 Active Memory finished: 16.9s, no relevant memory
20:25:44.319 Codex task started
20:26:26.677 wttr.in curl finished in ~80ms
20:26:43.561 final answer generated
20:26:47.728 Telegram sendMessage ok
Total inbound-to-Telegram-send: ~91.8s

Evidence: `recent` mode still reproduced

After switching from full/contextual/30000ms to recent/balanced/15000ms while keeping group/channel allowed:

Clean post-restart baseline:
~447 MB RSS shortly after ready
~495 MB RSS after ~90s
readyz healthy
0 queued / 0 running

Then one Telegram weather ask plus one log-check follow-up:

20:39:18 inbound Telegram weather message
20:39:26 active-memory start timeoutMs=15000 queryChars=58 searchQueryChars=58
20:39:47 active-memory done status=ok elapsedMs=21269 summaryChars=131
20:40:55 Telegram sendMessage ok

20:41:00 inbound Telegram log-check follow-up
20:41:03 active-memory start timeoutMs=15000 queryChars=593 searchQueryChars=288
20:41:21 active-memory done status=ok elapsedMs=18058 summaryChars=181

RSS after those turns:

PID     ELAPSED RSS     VSZ      %MEM %CPU CMD
1249003 04:24   1080144 44832448 4.3  37.2 /usr/bin/node /usr/lib/node_modules/openclaw/dist/index.js gateway --port 18789
VmRSS:  1080144 kB
RssAnon: 699168 kB
RssFile: 380976 kB
readyz healthy
0 queued / 0 running

Evidence: `message` mode with 5s timeout still reproduced

After tuning to queryMode: "message", timeoutMs: 5000, setupGraceTimeoutMs: 5000, group/channel still allowed:

2026-05-18T21:01:19.922Z inbound Telegram group/topic message, 19 chars
2026-05-18T21:01:21.727Z main embedded agent started
2026-05-18T21:01:22.577Z active-memory start timeoutMs=5000 queryChars=19
2026-05-18T21:01:32.577Z hook failed: timed out after 10000ms
2026-05-18T21:01:33.941Z active-memory done status=timeout elapsedMs=10016 summaryChars=0

RSS moved from roughly 590-607 MB before this turn to a peak around 1.0-1.07 GB during/after the Active Memory timeout.

Then /active-memory off was sent in the same Telegram topic:

2026-05-18T21:01:41.064Z inbound Telegram group/topic message, 18 chars
2026-05-18T21:01:42.244Z outbound send ok

The follow-up weather-style request in the same topic did not show Active Memory hook/log lines:

2026-05-18T21:01:58.576Z inbound Telegram group/topic message, 58 chars
2026-05-18T21:02:00.205Z main embedded agent started
no active-memory start/done lines for this request
2026-05-18T21:02:33.492Z Telegram sendMessage ok

Inbound-to-send was about 34.9s with Active Memory disabled for the topic, versus about 77.7s in the prior Active Memory test (queryMode: "message", 5s timeout) and about 91.8s before tuning.
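
The 34.9s figure can be re-derived from the journal timestamps of the Active Memory disabled request above. A small sketch, assuming the timestamps are UTC in the format shown; `elapsed_s` is a hypothetical helper:

```python
from datetime import datetime

# Journal timestamp format used in this report, e.g. 2026-05-18T21:01:58.576Z.
FMT = "%Y-%m-%dT%H:%M:%S.%fZ"


def elapsed_s(start: str, end: str) -> float:
    """Seconds between two UTC journal timestamps."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()


# Active Memory disabled for the topic: inbound message to Telegram sendMessage ok.
disabled = elapsed_s("2026-05-18T21:01:58.576Z", "2026-05-18T21:02:33.492Z")
print(f"{disabled:.1f}s")
```

The same helper applies to any other inbound/send pair once both timestamps are available in this format.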

Evidence: retained high RSS until restart

At 2026-05-18T21:05:39Z after the Active Memory timeout tests:

gateway parent RSS: 1,020,480 kB (~996 MiB)
process tree RSS: 1,179,660 kB (~1.13 GiB)
children:
  gateway parent: 1,020,480 kB
  codex app-server node child: 46,028 kB
  codex native app-server child: 113,152 kB
OpenClaw task pressure: 0 queued · 0 running
readyz healthy

After restarting an idle gateway:

immediately after restart: parent RSS 697,624 kB, service peak 667.1M
~90s after restart: parent RSS 483,656 kB, systemd service memory 428.3M, readyz healthy

So the high value was not the normal clean-start baseline on this host. It was retained runtime state after the Telegram/Active Memory tests, and a clean restart brought it back to the 430-480 MB range.

Current-code notes from previous ClawSweeper review

A previous ClawSweeper review on #83773 noted these source-level facts against current main at the time:

- Active Memory computes the embedded recall run timeout/watchdog as config.timeoutMs + config.setupGraceTimeoutMs, which matches the observed 5000ms + 5000ms configuration surfacing as a 10000ms timeout.
- A hook timeout does not by itself cancel the underlying plugin work: timed-out modifying hooks are logged and skipped, so cleanup must come from Active Memory and embedded-run abort handling.
- The prompt-build hook is fail-open: replies continue, so the problem shows up as latency and RSS rather than failed replies.
- Comparing v2026.5.18 to current main showed no Active Memory behavior change that would obviously resolve retained RSS.
- The adjacent latency PR #73667 was draft/conflicting and did not prove this retained-RSS failure mode.
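
The difference between a timeout that only stops waiting and one that cancels the work, as described in the hook-timeout note above, can be shown with a small analogy. OpenClaw is a Node.js application and its hook code is not shown in this report; the Python asyncio sketch below only demonstrates the general pattern, and every name in it is hypothetical.

```python
import asyncio


async def recall(state: dict) -> None:
    """Stand-in for slow recall work that holds a resource until it finishes."""
    state["resource_held"] = True
    try:
        await asyncio.sleep(0.2)
    finally:
        # Runs on normal completion and on cancellation.
        state["resource_held"] = False


async def skip_on_timeout(state: dict) -> bool:
    """Stop waiting after the timeout but leave the task running."""
    task = asyncio.create_task(recall(state))
    await asyncio.wait({task}, timeout=0.05)
    held = state["resource_held"]  # work is still in flight at this point
    await task  # let it finish so the demo exits cleanly
    return held


async def cancel_on_timeout(state: dict) -> bool:
    """wait_for cancels the work on timeout, so its cleanup runs promptly."""
    try:
        await asyncio.wait_for(recall(state), timeout=0.05)
    except asyncio.TimeoutError:
        pass
    return state["resource_held"]


print(asyncio.run(skip_on_timeout({})))    # True: resource still held after timeout
print(asyncio.run(cancel_on_timeout({})))  # False: cleanup ran on cancellation
```

In the first variant the caller moves on while the work keeps its resource; in the second the timeout propagates as a cancellation and the cleanup path runs. Which of these the Active Memory embedded run follows after a hook timeout is one of the open questions in this report.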

Impact and severity

Affected: live gateways using Telegram group topics plus Active Memory on persistent conversations, especially where Active Memory is allowed for group/channel sessions and Memory Search uses the local embedding backend.

Frequency: Observed repeatedly as high peaks across multiple recent versions on this VPS. The most recent controlled run reproduced the RSS jump and retained local embedding model mapping with a single Telegram message after /active-memory on.

Consequence: higher steady-state memory footprint, possible memory pressure on smaller hosts, and slow Telegram replies because Active Memory is a blocking pre-reply step.

Related / not duplicate notes

Supersedes #83773 and #83752.
Related open memory issue mentioned by ClawSweeper: #69451, but this report has a narrower Telegram + Active Memory + local Memory Search embedding trigger and should not be closed as a duplicate of session-file memory growth without further proof.
Adjacent open PR found during contributor duplicate scan: #73667 (Bound active-memory recall latency and jitter QMD startup). It was draft/conflicting and ClawSweeper flagged a timeout regression/no real behavior proof, so it should not currently be treated as the canonical fix for this report.

What would help validate a fix

A good fix/proof should ideally capture before/after values for:

- RSS, PSS, RssAnon, RssFile, heapUsed, external, arrayBuffers, active handles, child RSS, and task pressure before and after idle;
- Active Memory enabled vs disabled in the same Telegram topic;
- queryMode: message, recent, and ideally full if safe;
- whether configured timeoutMs vs setupGraceTimeoutMs behavior is intentional or accidentally doubling the user-visible timeout;
- whether timed-out Active Memory recall work is actually cancelled or merely skipped by the hook layer;
- whether local Memory Search embedding resources are intentionally cached in the gateway parent and, if so, whether there is a configurable/bounded unload or cache policy;
- whether the retained embeddinggemma-300m-qat-Q8_0.gguf mapping returns near the idle baseline after completed/timed-out Active Memory recall runs.
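
For the timeoutMs vs setupGraceTimeoutMs question, the observed elapsed times in this report can be compared against both the configured timeoutMs and the summed watchdog described in the ClawSweeper note. The sketch below only restates that arithmetic; `watchdog_ms` is a hypothetical helper, not an OpenClaw function.

```python
def watchdog_ms(timeout_ms: int, setup_grace_timeout_ms: int) -> int:
    """Effective recall watchdog as described in the ClawSweeper note (sum of both)."""
    return timeout_ms + setup_grace_timeout_ms


# (timeoutMs, setupGraceTimeoutMs, observed elapsedMs, observed status) from the logs above.
runs = [
    (5000, 5000, 10236, "timeout"),
    (5000, 5000, 10016, "timeout"),
    (15000, 15000, 21269, "ok"),
    (15000, 15000, 18058, "ok"),
]

for timeout_ms, grace_ms, elapsed_ms, status in runs:
    limit = watchdog_ms(timeout_ms, grace_ms)
    print(
        timeout_ms, limit, elapsed_ms, status,
        "over timeoutMs" if elapsed_ms > timeout_ms else "within timeoutMs",
        "over watchdog" if elapsed_ms > limit else "within watchdog",
    )
```

All four runs exceeded the configured timeoutMs. The two status=ok runs stayed within the summed value, and the two status=timeout runs ended just past it, which is consistent with the sum being the effective limit.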
