openclaw - ✅(Solved) Fix Gateway becoming very slow. CPU 100% - Versions 4.24 - 5.2 [2 pull requests, 6 comments, 3 participants]

GitHub stats
openclaw/openclaw#76382 · Fetched 2026-05-04 05:07:44
Comments: 6 · Participants: 3 · Timeline events: 27 · Reactions: 3
Timeline (top): commented ×6, subscribed ×6, cross-referenced ×5, mentioned ×4

On OpenClaw 2026.5.2 (8b2a6e5), the gateway still pins one Node.js CPU thread even after reducing the runtime to a minimal CPU test profile:

  • no external chat channels
  • no memory plugins
  • no browser
  • no Discord / Telegram
  • no extra skills
  • minimal tools profile
  • LM Studio kept as the model provider target

The gateway starts and reports ready, but idle CPU remains around 100% of one core. A simple agent turn fails through the gateway websocket and falls back to embedded mode.

Error Message

gateway connect failed: Error: gateway closed (1000): EMBEDDED FALLBACK: Gateway agent failed; running embedded agent: GatewayTransportError: gateway closed (1000 normal closure): no close reason

Root Cause

The report does not isolate a single root cause. A 15s Node inspector CPU profile taken during the degraded state showed repeated synchronous filesystem/config/plugin discovery work in the hot path (node:fs, discovery-B19Xdk1_.js, boundary-file-read-*.js, parse-json-compat-*.js), with lstat, readFileSync, realpathSync, and parseJsonWithJson5Fallback among the top sampled functions.

The reporter concludes that the problem is not only channel/network related and points to a core gateway/runtime plugin discovery or reload loop that persists even under a 0-plugin gateway startup. The linked fix, PR #76517, keeps the read-only models.list path away from provider registry discovery and fresh metadata scans.

Fix Action

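Fix / Workaround

The linked fix is PR #76517 (see PR fix notes). The report itself lists no workaround: the problem reproduced under the minimal configuration, and the agent turn fell back to embedded mode with this startup trace:

[trace:embedded-run] startup stages:
totalMs=9174
runtime-plugins=8668ms
model-resolution=165ms
auth=333ms
context-engine=1ms
attempt-dispatch=5ms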

PR fix notes

PR #76517: fix(gateway): keep models list read-only fast

Description (problem / solution / changelog)

Summary

  • keep the read-only models.list catalog fallback on persisted/current metadata/configured rows only, without falling into provider registry discovery or a fresh metadata scan
  • let gateway models.list use static auth checks so manifest catalog visibility cannot import provider runtime discovery
  • add regressions for the read-only catalog fallback and static auth visibility path
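
The first bullet above can be illustrated with a small sketch. This is not the code from the PR: the class `ModelCatalog`, its `list_models` method, and the `discovery_calls` counter are hypothetical names, and the model rows are placeholder data. It only shows the general idea of a read-only listing that answers from already-persisted rows and never triggers discovery.

```python
from typing import Callable, Iterable, List


class ModelCatalog:
    """Hypothetical catalog: read-only listing never triggers discovery."""

    def __init__(self, persisted_rows: Iterable[str], discover: Callable[[], Iterable[str]]):
        self._persisted: List[str] = list(persisted_rows)
        self._discover = discover
        self.discovery_calls = 0  # comparable in spirit to a discovery delta counter

    def list_models(self, read_only: bool = True) -> List[str]:
        if read_only:
            # Fast path: answer from persisted rows only, no scan, no provider import.
            return list(self._persisted)
        # Slow path: an explicit refresh is the only place discovery may run.
        self.discovery_calls += 1
        self._persisted = list(self._discover())
        return list(self._persisted)


# Placeholder rows, not real OpenClaw model identifiers.
catalog = ModelCatalog(["local/model-a"], discover=lambda: ["local/model-a", "local/model-b"])
listed = catalog.list_models()  # read-only by default
```

Under this design, repeated read-only calls leave the discovery counter at zero, which is the kind of property the PR's verification reports as provider_discovery_delta=0.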

Fixes https://github.com/openclaw/openclaw/issues/76382
Refs https://github.com/openclaw/openclaw/issues/76360
Refs https://github.com/openclaw/openclaw/issues/75707
Refs https://github.com/openclaw/openclaw/pull/76406

Verification

  • pnpm test:serial src/agents/model-catalog.test.ts src/agents/model-catalog-visibility.test.ts src/agents/model-provider-auth.test.ts src/gateway/server-methods/models.test.ts src/gateway/server-model-catalog.test.ts
  • OPENCLAW_BUILD_CACHE=0 pnpm build
  • temp gateway repro with channels/browser/memory/context disabled and no models.json: models.list server response 174ms, provider_discovery_delta=0, model_catalog_delta=0
  • OPENCLAW_TESTBOX=1 OPENCLAW_TESTBOX_ID=tbx_01kqp9xctm73f5q4zeya131acd pnpm testbox:run -- pnpm check:changed

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/agents/model-auth.ts (modified, +5/-1)
  • src/agents/model-catalog-visibility.test.ts (added, +43/-0)
  • src/agents/model-catalog-visibility.ts (modified, +3/-0)
  • src/agents/model-catalog.test.ts (modified, +91/-34)
  • src/agents/model-catalog.ts (modified, +50/-13)
  • src/agents/model-provider-auth.ts (modified, +15/-3)
  • src/gateway/server-methods/models.ts (modified, +1/-0)


Follow-up to #75598 after testing the current package requested in the maintainer comment.

Original issue: https://github.com/openclaw/openclaw/issues/75598

Summary

On OpenClaw 2026.5.2 (8b2a6e5), the gateway still pins one Node.js CPU thread even after reducing the runtime to a minimal CPU test profile:

  • no external chat channels
  • no memory plugins
  • no browser
  • no Discord / Telegram
  • no extra skills
  • minimal tools profile
  • LM Studio kept as the model provider target

The gateway starts and reports ready, but idle CPU remains around 100% of one core. A simple agent turn fails through the gateway websocket and falls back to embedded mode.

Environment

  • OpenClaw: 2026.5.2 (8b2a6e5)
  • Node: 22.22.2
  • OS: Ubuntu 24.04 VM under Proxmox
  • Hardware: Intel N355 mini PC, low-power VM environment
  • Service: user systemd service, openclaw-gateway.service
  • Model/provider: LM Studio local provider target
  • Related regression window from original issue: 2026.4.23 works; 2026.4.24/4.25+ through 2026.5.2 regress

Minimal config tested

Effective minimal test state after validation:

channels.enabled=
plugins.allow=lmstudio
plugins.slots={"memory":"none","contextEngine":"none"}
browser.enabled=False
tools.profile=minimal
tools.deny=*
tools.exec.security=deny
enabled.skills=

Gateway log after restart:

http server listening (0 plugins, 6.2s)
gateway ready
ActiveState=active
SubState=running
NRestarts=0

CPU measurements

Idle samples from the gateway PID:

105 105 105 104 104 104 104 104 104 104

During one simple chat turn:

104 104 104 104 104 104 104 104 103 103 103 103 103 103 103 103 103 103 103 103 103 103 103 103 103

After chat turn:

102 102 102 102 102 102 102 102 103 103
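
To compare the three phases, the samples above can be summarized with a few lines of Python. This is a minimal sketch: the sample strings are copied from the report, and the helper name `summarize` is an assumption for illustration.

```python
from statistics import mean

# CPU samples copied from the report (percent of one core, gateway PID).
IDLE_BEFORE = "105 105 105 104 104 104 104 104 104 104"
DURING_TURN = ("104 104 104 104 104 104 104 104 103 103 103 103 103 103 "
               "103 103 103 103 103 103 103 103 103 103 103")
IDLE_AFTER = "102 102 102 102 102 102 102 102 103 103"


def summarize(line: str) -> dict:
    """Return count, min, max, and mean for one whitespace-separated sample line."""
    samples = [int(tok) for tok in line.split()]
    return {
        "n": len(samples),
        "min": min(samples),
        "max": max(samples),
        "mean": round(mean(samples), 1),
    }


for label, line in [("idle", IDLE_BEFORE), ("chat turn", DURING_TURN), ("after", IDLE_AFTER)]:
    print(label, summarize(line))
```

Every phase stays at or above 102%, so the load does not track chat activity, which is consistent with a busy loop rather than request-driven work.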

/ready still reports degraded event loop:

{
  "ready": true,
  "failing": [],
  "eventLoop": {
    "degraded": true,
    "reasons": ["event_loop_delay", "event_loop_utilization", "cpu"]
  }
}
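
The response above shows why "ready": true alone is a weak health signal here. A monitoring check might also inspect the eventLoop block, as in this minimal sketch. The JSON body is copied from the report; the function name `is_healthy` and the decision rule are assumptions, not part of OpenClaw.

```python
import json

# /ready body copied from the report.
READY_BODY = """
{
  "ready": true,
  "failing": [],
  "eventLoop": {
    "degraded": true,
    "reasons": ["event_loop_delay", "event_loop_utilization", "cpu"]
  }
}
"""


def is_healthy(body: str) -> bool:
    """Treat the gateway as healthy only if it is ready, nothing is failing,
    and the event loop is not reported as degraded (hypothetical rule)."""
    doc = json.loads(body)
    event_loop = doc.get("eventLoop", {})
    return (
        bool(doc.get("ready"))
        and not doc.get("failing")
        and not event_loop.get("degraded", False)
    )


def degraded_reasons(body: str) -> list:
    """Return the reasons list from the eventLoop block, or an empty list."""
    return json.loads(body).get("eventLoop", {}).get("reasons", [])
```

With the body from the report, `is_healthy` returns False even though the gateway reports ready.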

Simple chat turn result

Command:

openclaw agent --agent main --message 'Reply with exactly: OK' --timeout 120 --json

Result:

gateway connect failed: Error: gateway closed (1000):
EMBEDDED FALLBACK: Gateway agent failed; running embedded agent:
GatewayTransportError: gateway closed (1000 normal closure): no close reason

Fresh trace line from fallback:

[trace:embedded-run] startup stages:
totalMs=9174
runtime-plugins=8668ms
model-resolution=165ms
auth=333ms
context-engine=1ms
attempt-dispatch=5ms
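
To see where the fallback startup time goes, the trace can be parsed into per-stage shares. A minimal sketch, with the trace text copied from the report and the helper name `parse_stages` chosen for illustration:

```python
import re

# Trace copied from the report.
TRACE = """[trace:embedded-run] startup stages:
totalMs=9174
runtime-plugins=8668ms
model-resolution=165ms
auth=333ms
context-engine=1ms
attempt-dispatch=5ms
"""


def parse_stages(trace: str):
    """Return (total_ms, {stage: ms}) from key=value lines in the trace."""
    stages = {}
    for name, value in re.findall(r"^([\w-]+)=(\d+)(?:ms)?$", trace, flags=re.MULTILINE):
        stages[name] = int(value)
    total = stages.pop("totalMs")
    return total, stages


total_ms, stages = parse_stages(TRACE)
shares = {name: round(100 * ms / total_ms, 1) for name, ms in stages.items()}
slowest = max(stages, key=stages.get)
print(slowest, shares[slowest])
```

The runtime-plugins stage accounts for 8668 of 9174 ms, roughly 94.5% of startup, even though the gateway log reports 0 plugins.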

Additional profiling signal

A 15s Node inspector CPU profile taken during the degraded state showed repeated synchronous filesystem/config/plugin discovery work in the hot path, including:

node:fs                         12.1%
discovery-B19Xdk1_.js            8.0%
boundary-file-read-*.js          6.8%
parse-json-compat-*.js           6.3%
manifest-registry-*.js           1.3%
installed-plugin-index-store     0.9%

Top sampled functions included repeated:

lstat
readFileUtf8
readFileSync
realpathSync
parseJsonWithJson5Fallback
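
As a rough gauge of how much of the profile the listed entries cover, their shares can be added up. This is a sketch only: the rows are copied from the report, and it assumes the percentages are non-overlapping shares of the sampled time, which the report does not state.

```python
# Profile rows copied from the report: "<module>  <share>%".
PROFILE = """node:fs                         12.1%
discovery-B19Xdk1_.js            8.0%
boundary-file-read-*.js          6.8%
parse-json-compat-*.js           6.3%
manifest-registry-*.js           1.3%
installed-plugin-index-store     0.9%
"""


def parse_profile(text: str) -> dict:
    """Map each module name to its percentage share."""
    rows = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, share = line.rsplit(None, 1)
        rows[name.strip()] = float(share.rstrip("%"))
    return rows


rows = parse_profile(PROFILE)
listed_total = round(sum(rows.values()), 1)
print(listed_total)
```

Under that assumption, the filesystem, JSON parsing, and discovery entries listed here add up to about 35% of the 15s profile.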

Expected

With 0 loaded plugins, no external channels, browser disabled, memory/context disabled, and minimal tools, the gateway should idle near 0% CPU and should be able to handle a trivial local agent turn through the gateway.

Actual

The gateway remains alive but consumes around 100% of one CPU core at idle. The trivial agent turn does not complete through the gateway websocket and falls back to embedded mode.

Notes

  • openclaw config validate passed for the final minimal config.
  • openclaw doctor --fix was run during the test process.
  • The issue is still reproducible without Discord, Telegram, browser, memory-core, memory-wiki, or lossless-claw enabled.
  • This suggests the remaining issue is not only channel/network related; there appears to be a core gateway/runtime plugin discovery or reload loop that persists even under a 0-plugin gateway startup.

extent analysis

TL;DR

The profile data points to a plugin discovery or reload loop as the likely cause of the gateway's high idle CPU usage and failed agent turns. Reducing repeated discovery work or filesystem access in that path could resolve the issue.

Guidance

  • Review the plugin discovery mechanism and consider disabling it or adjusting its configuration to reduce the frequency of filesystem access.
  • Investigate the discovery-B19Xdk1_.js file and its role in the plugin discovery process to determine if it can be optimized or disabled.
  • Consider implementing a caching mechanism for plugin metadata to reduce the need for repeated filesystem access.
  • Do not rely on openclaw config validate or openclaw doctor --fix to clear the problem: the report ran both (validation passed) and the CPU usage persisted.
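
The caching suggestion in the guidance above could look like the following sketch. It is written in Python for illustration only and is not OpenClaw code: the class `ManifestCache` and its `parses` counter are hypothetical. The idea is to re-read and re-parse a manifest only when its modification time or size changes.

```python
import json
import os


class ManifestCache:
    """Hypothetical cache of parsed JSON files keyed by (mtime_ns, size)."""

    def __init__(self):
        self._entries = {}  # path -> ((mtime_ns, size), parsed document)
        self.parses = 0     # how many times a file was actually read and parsed

    def load(self, path: str):
        st = os.stat(path)  # one cheap stat call per lookup
        key = (st.st_mtime_ns, st.st_size)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == key:
            return cached[1]  # unchanged on disk: skip read and parse
        with open(path, encoding="utf-8") as fh:
            parsed = json.load(fh)
        self.parses += 1
        self._entries[path] = (key, parsed)
        return parsed
```

This still issues one stat call per lookup, so it trims read and parse cost rather than removing filesystem access. A loop that re-enters discovery continuously would also need its trigger fixed.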

Example

The issue report contains no code-level fix. The linked change is in PR #76517 (see PR fix notes), and any local change to discovery code should follow profiling rather than precede it.

Notes

The issue appears to be related to the gateway's plugin discovery mechanism, and resolving it may require adjustments to the configuration or optimization of the plugin discovery process. Further investigation is needed to determine the root cause and develop a comprehensive solution.

Recommendation

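Prefer the fix from PR #76517 where a build containing it is available. Otherwise, reducing plugin discovery work or filesystem access in the hot path may lower CPU usage and let the gateway handle agent turns. The report shows that a minimal configuration alone did not resolve the problem.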
