Active Memory should either: - return a compact summary or `NONE` within the configured timeout, or - fail fast and non-disruptively, allowing the main reply path and gateway health/admin endpoints to continue working.

openclaw - 💡(How to fix) Fix Active Memory with Cerebras gpt-oss-120b times out and can pin gateway CPU [1 comments, 2 participants]

Root Cause

Active Memory reliably times out and can leave the local gateway CPU-bound when configured to use cerebras/gpt-oss-120b, even though the same Cerebras key/model works through direct /v1/chat/completions calls, including streaming and tool calls.

This looks related to reasoning-model handling inside the embedded Active Memory sub-agent path, not a Cerebras auth/model-access failure.

Code Example

{
  "plugins": {
    "entries": {
      "active-memory": {
        "enabled": true,
        "config": {
          "enabled": true,
          "agents": ["main", "kong"],
          "allowedChatTypes": ["direct", "group", "channel"],
          "queryMode": "message",
          "promptStyle": "balanced",
          "thinking": "off",
          "maxSummaryChars": 220,
          "timeoutMs": 5000,
          "persistTranscripts": false,
          "logging": true,
          "model": "cerebras/gpt-oss-120b"
        }
      }
    }
  }
}

---

2026-04-29T04:58:04.636+08:00 [plugins] active-memory: agent=main session=agent:main:discord:channel:1468250966536228866 activeProvider=cerebras activeModel=gpt-oss-120b start timeoutMs=5000 queryChars=882
2026-04-29T04:58:29.215+08:00 [plugins] active-memory: agent=main session=agent:main:discord:channel:1468250966536228866 activeProvider=cerebras activeModel=gpt-oss-120b done status=timeout elapsedMs=24580 summaryChars=0

---

openclaw-gateway CPU: ~100%
curl http://127.0.0.1:18789/health: timed out after 5-8s, or returned only after several seconds
openclaw gateway status: websocket probe timeout

---

{
  "plugins": {
    "entries": {
      "active-memory": {
        "enabled": false,
        "config": {
          "enabled": false
        }
      }
    }
  }
}

---

Gateway health: ~2.5ms
openclaw-gateway CPU: ~0.1%

---

gpt-oss-120b plain chat: 200 OK, content="ok", ~639ms
gpt-oss-120b streaming tools: 200 OK, finish_reason="tool_calls", ~432ms
gpt-oss-120b tool-result followup: 200 OK, visible content returned, ~370ms

Summary

This looks related to reasoning-model handling inside the embedded Active Memory sub-agent path, not a Cerebras auth/model-access failure.

Environment

OpenClaw: 2026.4.26 (be8c246)
OS: macOS / Darwin 25.4.0 arm64
Node: v22.22.0
Install method: npm/Homebrew global install
Gateway mode: local LaunchAgent, port 18789
Channel: Discord channel session
Active Memory model: cerebras/gpt-oss-120b
Cerebras provider API: openai-completions

Active Memory config used for reproduction

{
  "plugins": {
    "entries": {
      "active-memory": {
        "enabled": true,
        "config": {
          "enabled": true,
          "agents": ["main", "kong"],
          "allowedChatTypes": ["direct", "group", "channel"],
          "queryMode": "message",
          "promptStyle": "balanced",
          "thinking": "off",
          "maxSummaryChars": 220,
          "timeoutMs": 5000,
          "persistTranscripts": false,
          "logging": true,
          "model": "cerebras/gpt-oss-120b"
        }
      }
    }
  }
}

Steps to reproduce

Configure a Cerebras provider with gpt-oss-120b.
Configure Active Memory to use cerebras/gpt-oss-120b.
Enable Active Memory for an interactive persistent Discord channel session.
Send a normal lightweight message in the channel.
Watch gateway logs and process health.

Expected behavior

Active Memory should either:

return a compact summary or NONE within the configured timeout, or
fail fast and non-disruptively, allowing the main reply path and gateway health/admin endpoints to continue working.

Actual behavior

Active Memory starts, then times out much later than the configured timeoutMs, returns no summary, and the gateway process can become CPU-bound. During the stall, /health and websocket/admin probes may time out.

Observed logs:

2026-04-29T04:58:04.636+08:00 [plugins] active-memory: agent=main session=agent:main:discord:channel:1468250966536228866 activeProvider=cerebras activeModel=gpt-oss-120b start timeoutMs=5000 queryChars=882
2026-04-29T04:58:29.215+08:00 [plugins] active-memory: agent=main session=agent:main:discord:channel:1468250966536228866 activeProvider=cerebras activeModel=gpt-oss-120b done status=timeout elapsedMs=24580 summaryChars=0

Process/health symptoms observed after reproduction:

openclaw-gateway CPU: ~100%
curl http://127.0.0.1:18789/health: timed out after 5-8s, or returned only after several seconds
openclaw gateway status: websocket probe timeout

Disabling only plugins.entries.active-memory.config.enabled=false was not enough to return the gateway to a clean state. Disabling the plugin entry itself and restarting restored stability:

{
  "plugins": {
    "entries": {
      "active-memory": {
        "enabled": false,
        "config": {
          "enabled": false
        }
      }
    }
  }
}

After disabling the plugin entry and restarting:

Gateway health: ~2.5ms
openclaw-gateway CPU: ~0.1%

Direct Cerebras API checks

The same configured Cerebras API key/model succeeds outside the Active Memory embedded path:

Plain /v1/chat/completions: HTTP 200, visible content returned.
Streaming /v1/chat/completions with tools: HTTP 200, tool calls returned.
Tool-result followup: HTTP 200, visible content returned.

Example observed direct checks:

gpt-oss-120b plain chat: 200 OK, content="ok", ~639ms
gpt-oss-120b streaming tools: 200 OK, finish_reason="tool_calls", ~432ms
gpt-oss-120b tool-result followup: 200 OK, visible content returned, ~370ms

This suggests the key has real chat/completions access and the model can handle tool-call shape at the API layer.

Notes / suspected cause

gpt-oss-120b returns reasoning side-channel content even for simple requests. My suspicion is that the Active Memory embedded sub-agent path is treating the reasoning/tool activity as hidden thinking while waiting/retrying for visible summary text, then does not abort cleanly when the Active Memory timeout fires.

This looks related to, but not identical to:

#66804 Active Memory timeout with a reasoning model
#45681 reasoning models timing out across providers
#68004 Active Memory tools handling regression
#9956 GPT-OSS tool/function parsing behavior

The main difference here is that the docs recommend cerebras/gpt-oss-120b for Active Memory, and direct Cerebras chat/tool calls work, but the embedded Active Memory path times out and can destabilize the gateway.

Workaround

Disable Active Memory entirely, or avoid cerebras/gpt-oss-120b as the Active Memory model until the embedded Active Memory path handles reasoning-model responses/timeouts cleanly.

extent analysis

TL;DR

Disable Active Memory or avoid using cerebras/gpt-oss-120b as the Active Memory model to prevent timeouts and CPU-bound issues.

Guidance

Verify that the issue is specific to the cerebras/gpt-oss-120b model by testing with other models.
Check the Active Memory configuration to ensure that the timeoutMs value is set correctly and adjust it if necessary.
Consider disabling the Active Memory plugin entry and restarting the gateway to restore stability.
Test direct Cerebras API calls to confirm that the issue is specific to the Active Memory embedded path.

Example

No code snippet is provided as the issue is related to configuration and model compatibility.

Notes

The issue seems to be related to the handling of reasoning-model responses and timeouts in the Active Memory embedded sub-agent path. Disabling Active Memory or avoiding the cerebras/gpt-oss-120b model may be a temporary workaround until the issue is resolved.

Recommendation

Apply workaround: Disable Active Memory or avoid using cerebras/gpt-oss-120b as the Active Memory model. This is because the issue is specific to this model and disabling Active Memory or using a different model can prevent timeouts and CPU-bound issues.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

openclaw - 💡(How to fix) Fix Active Memory with Cerebras gpt-oss-120b times out and can pin gateway CPU [1 comments, 2 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Fix Action

Workaround

Code Example

Summary

Environment

Active Memory config used for reproduction

Steps to reproduce

Expected behavior

Actual behavior

Direct Cerebras API checks

Notes / suspected cause

Workaround

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

FAQ

Expected behavior

Still need to ship something?

TRENDING

openclaw - 💡(How to fix) Fix Active Memory with Cerebras gpt-oss-120b times out and can pin gateway CPU [1 comments, 2 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Fix Action

Workaround

Code Example

Summary

Environment

Active Memory config used for reproduction

Steps to reproduce

Expected behavior

Actual behavior

Direct Cerebras API checks

Notes / suspected cause

Workaround

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

FAQ

Expected behavior

Still need to ship something?

RELATED_DISCOVERY

TRENDING