openclaw - ✅(Solved) Fix Regression after 2026.3.28: sessionStrategy behavior changed, ws-stream 500 fallback, slower Discord interaction handling [1 pull request, 1 comment, 2 participants]

openclaw • 2026-03-29 08:27:00

GitHub stats

openclaw/openclaw#56881 • Fetched 2026-04-08 01:46:34

Author

jlin53882

Participants

danilovmy

jlin53882

Timeline (top)

cross-referenced ×3, commented ×1

After upgrading OpenClaw from a working older build to 2026.3.28, our local setup started behaving incorrectly. Rolling back to 2026.3.24 immediately restores normal behavior.

At the moment this looks like an OpenClaw regression rather than a problem in our local reranker server or in memory-lancedb-pro alone.

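Root Cause

The reporter's checks rule out the local reranker as the root cause. For the ws-stream symptom, linked PR #55535 describes a matching pattern: after #53702, openai-codex requests tried the generic OpenAI websocket endpoint first, failed with HTTP 500, and then fell back to HTTP. No root cause is given on this page for the sessionStrategy or Discord latency symptoms.

Fix / Workaround

For the sessionStrategy symptom, explicitly set sessionStrategy to "memoryReflection" in the memory-lancedb-pro plugin config and restart; the reporter confirms this re-registers the memory-reflection hooks.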

PR fix notes

PR #55535: fix: keep openai-codex on HTTP responses transport

Repository: openclaw/openclaw
Author: Nanako0129
State: closed | merged: True
Link: https://github.com/openclaw/openclaw/pull/55535

Description (problem / solution / changelog)

Summary

- Keep openai-codex on its existing HTTP responses path instead of routing it into the generic OpenAI websocket transport selector.
- Update the websocket transport selector test to explicitly reject the openai-codex / openai-codex-responses pair.

Change Type

Scope

This PR is intentionally narrow.

It only changes websocket transport eligibility for openai-codex on the embedded runner path.

It does not change:

- the openai-codex provider normalization logic
- HTTP request payload behavior for Codex
- any image/media-understanding routing
- any memory provider routing

Linked Issue/PR

- Related #55523
- Related #56826
- Related #56881
- Follow-up to merged PR #53702
- This PR fixes a bug or regression

Root Cause / Regression History

openai-codex models normalize to the ChatGPT backend HTTP path (https://chatgpt.com/backend-api) and use openai-codex-responses.

After #53702, the embedded runner's websocket eligibility selector also treated the openai-codex / openai-codex-responses pair as websocket-eligible. However, the websocket connection manager still targets the generic OpenAI Responses websocket endpoint rather than a Codex-specific transport target.

In isolated upstream-main smoke testing, that caused openai-codex requests to attempt websocket first, fail with HTTP 500, and then fall back to HTTP.

This patch keeps openai-codex on the existing HTTP path until there is a verified provider-specific websocket target and end-to-end support for it.

Behavior Changes

Before:

- openai-codex / openai-codex-responses entered the generic OpenAI websocket selector
- websocket could fail and fall back to HTTP in isolated smoke tests

After:

- only native openai / openai-responses is websocket-eligible
- openai-codex stays on HTTP responses transport
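
The "after" behavior can be pictured as a small allow-list check. This is only a sketch of the idea: `TransportPair`, `WEBSOCKET_ELIGIBLE`, and `isWebsocketEligible` are hypothetical names, not the identifiers used in `attempt.thread-helpers.ts`, and the real selector may take other inputs into account.

```typescript
// Hypothetical names throughout; the real selector in OpenClaw's embedded
// runner may be structured differently.
type TransportPair = { provider: string; api: string };

// Per the PR description, only the native OpenAI pair is websocket-eligible.
const WEBSOCKET_ELIGIBLE: readonly TransportPair[] = [
  { provider: "openai", api: "openai-responses" },
];

// Returns true only when both provider and API match an allow-listed pair,
// so openai-codex and mismatched pairs stay on the HTTP transport.
function isWebsocketEligible(pair: TransportPair): boolean {
  return WEBSOCKET_ELIGIBLE.some(
    (ok) => ok.provider === pair.provider && ok.api === pair.api,
  );
}
```

An explicit allow-list keeps the selector a single small surface, which fits the PR's stated plan to extend it only once Codex websocket support is verified end-to-end.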

Regression Test Plan

Updated:

src/agents/pi-embedded-runner/run/attempt.spawn-workspace.websocket.test.ts

Coverage:

- accepts the native openai / openai-responses websocket pair
- rejects openai-codex / openai-codex-responses
- rejects mismatched provider/API websocket pairs

Repro + Verification

Observed on merged upstream/main during isolated smoke validation of #53702:

- openai-codex/gpt-5.4 requests succeeded, but only after websocket failed and the runner fell back to HTTP
- gateway log showed websocket connect failure with HTTP 500 before fallback

With this patch:

- the websocket selector no longer routes openai-codex into the generic OpenAI websocket path
- focused websocket selector test passes locally

Tests

Passed locally:

- corepack pnpm test -- src/agents/pi-embedded-runner/run/attempt.spawn-workspace.websocket.test.ts --reporter=verbose
- pre-commit pnpm check passed during commit

Risks and Mitigations

Risk:

- if openai-codex later gets a valid provider-specific websocket endpoint, this patch will keep it on HTTP until that support is added explicitly

Mitigation:

- this preserves the existing working HTTP transport and avoids the currently observed websocket->HTTP fallback path
- the selector remains a small, centralized surface to expand later once Codex websocket support is verified end-to-end

AI assistance

AI-assisted: drafted and implemented with Codex, then locally reviewed and tested by me.

Changed files

- src/agents/pi-embedded-runner/run/attempt.spawn-workspace.websocket.test.ts (modified, +6/-9)
- src/agents/pi-embedded-runner/run/attempt.thread-helpers.ts (modified, +4/-4)

Original issue text

Summary

After upgrading OpenClaw from a working older build to 2026.3.28, our local setup started behaving incorrectly. Rolling back to 2026.3.24 immediately restores normal behavior.

At the moment this looks like an OpenClaw regression rather than a problem in our local reranker server or in memory-lancedb-pro alone.

What changed after upgrading to 2026.3.28

1) `memory-lancedb-pro` started resolving `sessionStrategy` as `none`

After the upgrade, logs showed:

session-strategy: using none (plugin memory-reflection hooks disabled)

This disabled memory-reflection hooks unexpectedly.

As a workaround, explicitly setting:

"plugins": {
  "entries": {
    "memory-lancedb-pro": {
      "config": {
        "sessionStrategy": "memoryReflection"
      }
    }
  }
}

and restarting restored:

memory-reflection: integrated hooks registered (command:new, command:reset, after_tool_call, before_prompt_build, session_end)

So something in 2026.3.28 appears to have changed how runtime/plugin config is resolved.
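
One way this symptom could arise, offered as an assumption and not a confirmed diagnosis, is that the plugin's config no longer reaches the plugin at runtime, so strategy resolution falls through to a default. The sketch below uses hypothetical names (`PluginConfig`, `resolveSessionStrategy`) and only the two strategy values that appear in this report; it is not memory-lancedb-pro's actual code.

```typescript
// Hypothetical sketch; not the plugin's real resolution logic.
// Only the two values seen in this report are modeled.
type SessionStrategy = "none" | "memoryReflection";

interface PluginConfig {
  sessionStrategy?: string;
}

// Returns the configured strategy when it is present and recognized;
// otherwise returns the caller-supplied fallback. If the host stops passing
// the plugin's config (config is undefined), the fallback always wins.
function resolveSessionStrategy(
  config: PluginConfig | undefined,
  fallback: SessionStrategy,
): SessionStrategy {
  const value = config?.sessionStrategy;
  if (value === "none" || value === "memoryReflection") {
    return value;
  }
  return fallback;
}
```

Under this assumption, setting the key explicitly would matter only if the explicit value still reaches the plugin, which is consistent with the workaround restoring hook registration.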

2) Discord interaction handling became slower

Logs also showed:

[EventQueue] Slow listener detected: InteractionEventListener took 4254ms for event INTERACTION_CREATE

and later:

[EventQueue] Slow listener detected: InteractionEventListener took 2240ms for event INTERACTION_CREATE
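
For readers unfamiliar with this kind of warning, it typically comes from timing each listener call and comparing the elapsed time with a threshold. The following is a rough, synchronous sketch of that idea; `timeListener`, the injectable clock, and the threshold value are all assumptions for illustration, and the actual EventQueue code is not shown in this report.

```typescript
// Hypothetical helper; not the EventQueue implementation that emits the
// warnings above. Real listeners are usually async; this is synchronous
// to keep the timing logic easy to follow.
function timeListener<T>(
  name: string,
  event: string,
  thresholdMs: number,
  listener: () => T,
  now: () => number = Date.now,
): { result: T; elapsedMs: number; warning: string | null } {
  const start = now();
  const result = listener();
  const elapsedMs = now() - start;
  // Only produce a warning when the listener exceeded the threshold.
  const warning =
    elapsedMs > thresholdMs
      ? `Slow listener detected: ${name} took ${elapsedMs}ms for event ${event}`
      : null;
  return { result, elapsedMs, warning };
}
```

Discord expects an initial response to an interaction within about three seconds, so a listener that takes 4254 ms before responding risks the interaction expiring.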

3) Embedded websocket streaming started failing

Repeated logs:

[ws-stream] WebSocket connect failed ... Unexpected server response: 500
falling back to HTTP

This started appearing around the same time and may be part of the same regression.
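
The log pattern above is what a "try websocket first, fall back to HTTP" wrapper would produce. A minimal sketch of that pattern follows; `streamWithFallback` and its callback parameters are hypothetical and stand in for whatever the ws-stream code actually does.

```typescript
// Hypothetical sketch of a websocket-first transport with HTTP fallback.
// The connect/request callbacks are placeholders supplied by the caller.
type Transport = "websocket" | "http";

interface StreamResult {
  transport: Transport;
  body: string;
}

async function streamWithFallback(
  connectWs: () => Promise<string>,
  requestHttp: () => Promise<string>,
  log: (line: string) => void,
): Promise<StreamResult> {
  try {
    return { transport: "websocket", body: await connectWs() };
  } catch (err) {
    // The request still succeeds over HTTP, but every call pays for the
    // failed websocket attempt first.
    const reason = err instanceof Error ? err.message : String(err);
    log(`WebSocket connect failed: ${reason}; falling back to HTTP`);
    return { transport: "http", body: await requestHttp() };
  }
}
```

This is why the symptom shows up as repeated log noise and extra latency while requests still complete.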

What is NOT the problem

Reranker server is healthy

Local reranker endpoint is up and responds correctly:

- endpoint: http://127.0.0.1:18799/v1/rerank
- server process is running normally
- manual POST test returns 200 OK

So the reranker itself is not the root cause.
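
The manual POST check could be scripted roughly as follows. The endpoint, model name, and API key come from the config excerpts in this report; the request body fields (`query`, `documents`), the `Authorization: Bearer` header scheme, and the names `PostFn` and `rerankerIsHealthy` are assumptions, since the report does not show the actual request.

```typescript
// Hypothetical health probe. The body shape and auth header are assumptions;
// adjust them to whatever the local reranker server expects.
interface HttpResponseLike {
  status: number;
}

type PostFn = (
  url: string,
  init: { method: "POST"; headers: Record<string, string>; body: string },
) => Promise<HttpResponseLike>;

async function rerankerIsHealthy(
  post: PostFn,
  endpoint: string = "http://127.0.0.1:18799/v1/rerank",
): Promise<boolean> {
  const res = await post(endpoint, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: "Bearer local",
    },
    body: JSON.stringify({
      model: "BAAI/bge-reranker-base",
      query: "ping",
      documents: ["first candidate", "second candidate"],
    }),
  });
  // The report treats a 200 response as "healthy".
  return res.status === 200;
}
```

The HTTP client is passed in so the probe can be exercised without a running server; on Node 18 or later, the built-in `fetch` should fit the `post` parameter.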

Rolling back OpenClaw fixes it

Most importantly:

- OpenClaw 2026.3.28 → broken / degraded behavior
- OpenClaw 2026.3.24 → works normally again

That strongly suggests an OpenClaw-side regression introduced after 2026.3.24.

Environment

- OS: Windows 10.0.26200 x64
- Node: 24.13.0
- OpenClaw problematic version: 2026.3.28
- OpenClaw known-good rollback: 2026.3.24
- Plugin: memory-lancedb-pro (version not preserved in this copy)

Relevant config excerpts

Agent defaults

"agents": {
  "defaults": {
    "memorySearch": {
      "enabled": false
    }
  }
}

memory-lancedb-pro

"plugins": {
  "entries": {
    "memory-lancedb-pro": {
      "enabled": true,
      "config": {
        "autoCapture": true,
        "autoRecall": true,
        "embedding": {
          "baseURL": "http://localhost:11434/v1",
          "model": "jina-v5-retrieval-test",
          "dimensions": 1024
        },
        "retrieval": {
          "mode": "hybrid",
          "rerank": "cross-encoder",
          "rerankProvider": "siliconflow",
          "rerankModel": "BAAI/bge-reranker-base",
          "rerankEndpoint": "http://127.0.0.1:18799/v1/rerank",
          "rerankApiKey": "local"
        },
        "autoRecallTimeoutMs": 120000
      }
    }
  }
}

Why I think this should be investigated in OpenClaw

Because the problem disappears completely on rollback to 2026.3.24, I suspect one or more regressions in 2026.3.28 related to:

- plugin config resolution / default propagation into plugin runtime
- session strategy behavior affecting plugin hook registration
- embedded websocket streaming (ws-stream) behavior
- Discord interaction handling latency in the event queue

Request

Could you help identify which change after 2026.3.24 caused:

- memory-lancedb-pro to behave as if sessionStrategy=none
- websocket streaming to start failing with HTTP 500 fallback
- Discord interaction listener latency to increase noticeably

If needed, I can provide more logs and a more minimal reproduction.

Extended analysis

Fix Plan

To address the issues introduced in OpenClaw version 2026.3.28, follow these steps:

Explicitly set sessionStrategy for memory-lancedb-pro: in the plugin's config, set sessionStrategy to "memoryReflection", then restart:

"plugins": {
  "entries": {
    "memory-lancedb-pro": {
      "config": {
        "sessionStrategy": "memoryReflection"
      }
    }
  }
}

Adjust Discord interaction handling:
- Profile InteractionEventListener to find what it spends its time on; the logs show it taking 2240 ms to 4254 ms per INTERACTION_CREATE event.
- Consider a timeout or a queue limit so that a single slow listener cannot stall the event queue.

Fix embedded websocket streaming:
- Identify which endpoint returns HTTP 500 on the websocket connect. PR #55535 (merged, and linked to this issue) reports the same 500-then-fallback pattern for openai-codex requests routed to the generic OpenAI websocket endpoint.
- If you use openai-codex, a build that includes PR #55535 keeps those requests on HTTP, which may remove the fallback log lines.

Verify plugin config resolution:
- Check how plugin configuration is resolved and propagated in OpenClaw 2026.3.28; the way defaults are applied or overridden may have changed.
- Confirm that the settings memory-lancedb-pro depends on are set explicitly and are not left to defaults.

Verification

To verify that these fixes work:

- Restart OpenClaw with the updated configuration.
- Check the logs for "memory-reflection: integrated hooks registered" and confirm that "session-strategy: using none" no longer appears.
- Exercise Discord interactions and watch for further "Slow listener detected" warnings.
- Confirm that "[ws-stream] WebSocket connect failed" no longer appears, so streaming is not falling back to HTTP after a 500.
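
These checks can be partly automated by scanning a captured log for the messages quoted in this report. The sketch below is one possible way to do that; `LogCheck` and `checkGatewayLog` are hypothetical names, and the matching relies only on the log fragments shown above, so other wording in a real log would be missed.

```typescript
// Hypothetical log scanner based only on the log lines quoted in this report.
interface LogCheck {
  hooksRegistered: boolean; // memory-reflection hooks were registered
  strategyNone: boolean;    // plugin resolved sessionStrategy as none
  wsFallback: boolean;      // websocket connect failed at least once
  slowListenerMs: number[]; // durations of slow InteractionEventListener runs
}

function checkGatewayLog(logText: string): LogCheck {
  const slowListenerMs: number[] = [];
  const slow = /Slow listener detected: InteractionEventListener took (\d+)ms/g;
  let m: RegExpExecArray | null;
  // Collect every reported duration, in order of appearance.
  while ((m = slow.exec(logText)) !== null) {
    slowListenerMs.push(Number(m[1]));
  }
  return {
    hooksRegistered: logText.includes("memory-reflection: integrated hooks registered"),
    strategyNone: logText.includes("session-strategy: using none"),
    wsFallback: logText.includes("[ws-stream] WebSocket connect failed"),
    slowListenerMs,
  };
}
```

A healthy run after the workaround would be expected to show `hooksRegistered` as true and `strategyNone` as false.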

Extra Tips

- Regularly review OpenClaw's changelog and documentation for any changes that might affect plugin configurations or behavior.
- Consider setting up automated tests to catch regressions early.
- If issues persist, providing more detailed logs or a minimal reproduction environment can help in identifying the root cause more accurately.
