openclaw - 💡(How to fix) Fix Active Memory with Cerebras gpt-oss-120b times out and can pin gateway CPU [1 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#73801Fetched 2026-04-29 06:15:01
View on GitHub
Comments
1
Participants
2
Timeline
1
Reactions
0
Author
Timeline (top)
commented ×1

Active Memory reliably times out and can leave the local gateway CPU-bound when configured to use cerebras/gpt-oss-120b, even though the same Cerebras key/model works through direct /v1/chat/completions calls, including streaming and tool calls.

This looks related to reasoning-model handling inside the embedded Active Memory sub-agent path, not a Cerebras auth/model-access failure.

Root Cause

Active Memory reliably times out and can leave the local gateway CPU-bound when configured to use cerebras/gpt-oss-120b, even though the same Cerebras key/model works through direct /v1/chat/completions calls, including streaming and tool calls.

This looks related to reasoning-model handling inside the embedded Active Memory sub-agent path, not a Cerebras auth/model-access failure.

Fix Action

Workaround

Disable Active Memory entirely, or avoid cerebras/gpt-oss-120b as the Active Memory model until the embedded Active Memory path handles reasoning-model responses/timeouts cleanly.

Code Example

{
  "plugins": {
    "entries": {
      "active-memory": {
        "enabled": true,
        "config": {
          "enabled": true,
          "agents": ["main", "kong"],
          "allowedChatTypes": ["direct", "group", "channel"],
          "queryMode": "message",
          "promptStyle": "balanced",
          "thinking": "off",
          "maxSummaryChars": 220,
          "timeoutMs": 5000,
          "persistTranscripts": false,
          "logging": true,
          "model": "cerebras/gpt-oss-120b"
        }
      }
    }
  }
}

---

2026-04-29T04:58:04.636+08:00 [plugins] active-memory: agent=main session=agent:main:discord:channel:1468250966536228866 activeProvider=cerebras activeModel=gpt-oss-120b start timeoutMs=5000 queryChars=882
2026-04-29T04:58:29.215+08:00 [plugins] active-memory: agent=main session=agent:main:discord:channel:1468250966536228866 activeProvider=cerebras activeModel=gpt-oss-120b done status=timeout elapsedMs=24580 summaryChars=0

---

openclaw-gateway CPU: ~100%
curl http://127.0.0.1:18789/health: timed out after 5-8s, or returned only after several seconds
openclaw gateway status: websocket probe timeout

---

{
  "plugins": {
    "entries": {
      "active-memory": {
        "enabled": false,
        "config": {
          "enabled": false
        }
      }
    }
  }
}

---

Gateway health: ~2.5ms
openclaw-gateway CPU: ~0.1%

---

gpt-oss-120b plain chat: 200 OK, content="ok", ~639ms
gpt-oss-120b streaming tools: 200 OK, finish_reason="tool_calls", ~432ms
gpt-oss-120b tool-result followup: 200 OK, visible content returned, ~370ms
RAW_BUFFERClick to expand / collapse

Summary

Active Memory reliably times out and can leave the local gateway CPU-bound when configured to use cerebras/gpt-oss-120b, even though the same Cerebras key/model works through direct /v1/chat/completions calls, including streaming and tool calls.

This looks related to reasoning-model handling inside the embedded Active Memory sub-agent path, not a Cerebras auth/model-access failure.

Environment

  • OpenClaw: 2026.4.26 (be8c246)
  • OS: macOS / Darwin 25.4.0 arm64
  • Node: v22.22.0
  • Install method: npm/Homebrew global install
  • Gateway mode: local LaunchAgent, port 18789
  • Channel: Discord channel session
  • Active Memory model: cerebras/gpt-oss-120b
  • Cerebras provider API: openai-completions

Active Memory config used for reproduction

{
  "plugins": {
    "entries": {
      "active-memory": {
        "enabled": true,
        "config": {
          "enabled": true,
          "agents": ["main", "kong"],
          "allowedChatTypes": ["direct", "group", "channel"],
          "queryMode": "message",
          "promptStyle": "balanced",
          "thinking": "off",
          "maxSummaryChars": 220,
          "timeoutMs": 5000,
          "persistTranscripts": false,
          "logging": true,
          "model": "cerebras/gpt-oss-120b"
        }
      }
    }
  }
}

Steps to reproduce

  1. Configure a Cerebras provider with gpt-oss-120b.
  2. Configure Active Memory to use cerebras/gpt-oss-120b.
  3. Enable Active Memory for an interactive persistent Discord channel session.
  4. Send a normal lightweight message in the channel.
  5. Watch gateway logs and process health.

Expected behavior

Active Memory should either:

  • return a compact summary or NONE within the configured timeout, or
  • fail fast and non-disruptively, allowing the main reply path and gateway health/admin endpoints to continue working.

Actual behavior

Active Memory starts, then times out much later than the configured timeoutMs, returns no summary, and the gateway process can become CPU-bound. During the stall, /health and websocket/admin probes may time out.

Observed logs:

2026-04-29T04:58:04.636+08:00 [plugins] active-memory: agent=main session=agent:main:discord:channel:1468250966536228866 activeProvider=cerebras activeModel=gpt-oss-120b start timeoutMs=5000 queryChars=882
2026-04-29T04:58:29.215+08:00 [plugins] active-memory: agent=main session=agent:main:discord:channel:1468250966536228866 activeProvider=cerebras activeModel=gpt-oss-120b done status=timeout elapsedMs=24580 summaryChars=0

Process/health symptoms observed after reproduction:

openclaw-gateway CPU: ~100%
curl http://127.0.0.1:18789/health: timed out after 5-8s, or returned only after several seconds
openclaw gateway status: websocket probe timeout

Disabling only plugins.entries.active-memory.config.enabled=false was not enough to return the gateway to a clean state. Disabling the plugin entry itself and restarting restored stability:

{
  "plugins": {
    "entries": {
      "active-memory": {
        "enabled": false,
        "config": {
          "enabled": false
        }
      }
    }
  }
}

After disabling the plugin entry and restarting:

Gateway health: ~2.5ms
openclaw-gateway CPU: ~0.1%

Direct Cerebras API checks

The same configured Cerebras API key/model succeeds outside the Active Memory embedded path:

  • Plain /v1/chat/completions: HTTP 200, visible content returned.
  • Streaming /v1/chat/completions with tools: HTTP 200, tool calls returned.
  • Tool-result followup: HTTP 200, visible content returned.

Example observed direct checks:

gpt-oss-120b plain chat: 200 OK, content="ok", ~639ms
gpt-oss-120b streaming tools: 200 OK, finish_reason="tool_calls", ~432ms
gpt-oss-120b tool-result followup: 200 OK, visible content returned, ~370ms

This suggests the key has real chat/completions access and the model can handle tool-call shape at the API layer.

Notes / suspected cause

gpt-oss-120b returns reasoning side-channel content even for simple requests. My suspicion is that the Active Memory embedded sub-agent path is treating the reasoning/tool activity as hidden thinking while waiting/retrying for visible summary text, then does not abort cleanly when the Active Memory timeout fires.

This looks related to, but not identical to:

  • #66804 Active Memory timeout with a reasoning model
  • #45681 reasoning models timing out across providers
  • #68004 Active Memory tools handling regression
  • #9956 GPT-OSS tool/function parsing behavior

The main difference here is that the docs recommend cerebras/gpt-oss-120b for Active Memory, and direct Cerebras chat/tool calls work, but the embedded Active Memory path times out and can destabilize the gateway.

Workaround

Disable Active Memory entirely, or avoid cerebras/gpt-oss-120b as the Active Memory model until the embedded Active Memory path handles reasoning-model responses/timeouts cleanly.

extent analysis

TL;DR

Disable Active Memory or avoid using cerebras/gpt-oss-120b as the Active Memory model to prevent timeouts and CPU-bound issues.

Guidance

  • Verify that the issue is specific to the cerebras/gpt-oss-120b model by testing with other models.
  • Check the Active Memory configuration to ensure that the timeoutMs value is set correctly and adjust it if necessary.
  • Consider disabling the Active Memory plugin entry and restarting the gateway to restore stability.
  • Test direct Cerebras API calls to confirm that the issue is specific to the Active Memory embedded path.

Example

No code snippet is provided as the issue is related to configuration and model compatibility.

Notes

The issue seems to be related to the handling of reasoning-model responses and timeouts in the Active Memory embedded sub-agent path. Disabling Active Memory or avoiding the cerebras/gpt-oss-120b model may be a temporary workaround until the issue is resolved.

Recommendation

Apply workaround: Disable Active Memory or avoid using cerebras/gpt-oss-120b as the Active Memory model. This is because the issue is specific to this model and disabling Active Memory or using a different model can prevent timeouts and CPU-bound issues.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

FAQ

Expected behavior

Active Memory should either:

  • return a compact summary or NONE within the configured timeout, or
  • fail fast and non-disruptively, allowing the main reply path and gateway health/admin endpoints to continue working.

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING