openclaw - 💡(How to fix) Fix [Bug]: Matrix thread session key case-normalizes event IDs, causing duplicate stuck sessions and thread delivery failures [1 comments, 2 participants]

GitHub stats
openclaw/openclaw#75670 (fetched 2026-05-02 05:31:58)
Comments: 1 · Participants: 2 · Timeline: 2 · Reactions: 2
Timeline (top): commented ×1, subscribed ×1

OpenClaw lowercases Matrix event IDs when constructing thread session keys (sessionKey = ...thread:$<lowercased_event_id>), but also creates a second session with the original mixed-case event ID. This causes two compounding failures:

  1. Duplicate stuck sessions: Every Matrix thread spawns two sessions — one with the original event ID and one with the lowercased version. They deadlock each other (one reports active_embedded_run, the other no_active_work), and neither recovers.

  2. Thread reply delivery failures: When constructing m.relates_to relations for thread replies, the lowercased event ID is used. Synapse rejects these with [400] Can't send relation to unknown event because Matrix event IDs are case-sensitive per the spec.

Gateway restarts temporarily clear the stuck sessions, but any new thread reply immediately fails again because the case mismatch persists.

Error Message

2026-05-01T08:23:30.711 [delivery-recovery] Retry failed: MatrixError: [400] Can't send relation to unknown event
2026-05-01T08:38:07.270 [restart-sentinel] outbound delivery failed: MatrixError: [400] Can't send relation to unknown event

Root Cause

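OpenClaw lowercases the Matrix event ID when it builds the thread session key, while a second session is created with the original mixed-case event ID. Matrix event IDs are case-sensitive per the spec, so the two keys identify different sessions for the same thread. The lowercased event ID is also used when constructing m.relates_to relations for thread replies, and Synapse rejects these with [400] Can't send relation to unknown event because no event with that ID exists.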

Fix Action

Workaround

Setting threadReplies: "off" in the Matrix channel config stops both the duplicate sessions and delivery failures. All messages route to the room session instead.


Bug type

Crash (process/app exits or hangs)

Beta release blocker

No

Steps to reproduce

  1. Configure OpenClaw with a Matrix channel and threadReplies: "always"
  2. Have a user send a message in a Matrix room thread
  3. Observe two session keys created for the same thread (e.g., thread:$lSTsAlY... and thread:$lstsaly...)
  4. Thread delivery fails with MatrixError: [400] Can't send relation to unknown event
  5. Both sessions enter state=processing and never recover

Expected behavior

  • Matrix event IDs should be treated as case-sensitive throughout the pipeline (per the Matrix spec)
  • Only one session should be created per thread, using the original event ID
  • Thread replies should use the original event ID in m.relates_to relations
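
The third expectation can be illustrated with a minimal sketch of a thread relation builder. The function name `buildThreadRelation` and its parameters are hypothetical and are not taken from the OpenClaw codebase; the `m.relates_to` field names follow the Matrix threading format. The point is only that the event ID passes through unchanged.

```typescript
// Shape of the m.relates_to content for a threaded reply in Matrix.
interface ThreadRelation {
  rel_type: "m.thread";
  event_id: string; // thread root event ID, case-sensitive
  is_falling_back: boolean;
  "m.in_reply_to": { event_id: string };
}

// Hypothetical helper: builds the relation without altering the case
// of either event ID. Not OpenClaw's actual implementation.
function buildThreadRelation(
  threadRootEventId: string,
  latestEventIdInThread: string = threadRootEventId,
): ThreadRelation {
  // Matrix event IDs carry a "$" sigil; reject anything else early.
  if (!threadRootEventId.startsWith("$")) {
    throw new Error(`not a Matrix event ID: ${threadRootEventId}`);
  }
  return {
    rel_type: "m.thread",
    event_id: threadRootEventId, // no toLowerCase() here
    is_falling_back: true,
    "m.in_reply_to": { event_id: latestEventIdInThread },
  };
}
```

Passing the ID from the logs in this report, `$lSTsAlYrc_KOmteNbX6zqQxY5ZKMlYa79A7EArC4Jrg`, yields a relation whose `event_id` is byte-for-byte the same string, which is what the homeserver needs in order to find the thread root.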

Actual behavior

  • Two sessions created per thread: one with original case, one lowercased
  • Both sessions deadlock (diagnostic logs show alternating active_embedded_run / no_active_work)
  • Thread replies fail: MatrixError: [400] Can't send relation to unknown event
  • 443 delivery failures logged across 3 rooms over 10 days
  • 490 case-sensitive unique thread event IDs collapse to 249 case-insensitive — nearly every thread is affected

Example stuck session pair from logs:

[diagnostic] stuck session: sessionId=unknown sessionKey=...thread:$lSTsAlYrc_KOmteNbX6zqQxY5ZKMlYa79A7EArC4Jrg state=processing age=144s
[diagnostic] stuck session: sessionId=main sessionKey=...thread:$lstsalyrc_komtenbx6zqqxy5zkmlya79a7earc4jrg state=processing age=134s
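
To check a gateway log for the same pattern, the following sketch groups `sessionKey=` values that differ only by case. It assumes log lines shaped like the two above; the function name `findCaseCollisions` is made up for this example and is not an OpenClaw tool.

```typescript
// Captures the value of sessionKey=... up to the next whitespace.
const SESSION_KEY_RE = /sessionKey=(\S+)/;

// Returns groups of session keys that are distinct as written
// but identical after lowercasing.
function findCaseCollisions(logLines: string[]): string[][] {
  const groups = new Map<string, Set<string>>();
  for (const line of logLines) {
    const match = SESSION_KEY_RE.exec(line);
    if (!match) continue;
    const key = match[1];
    const folded = key.toLowerCase();
    const variants = groups.get(folded) ?? new Set<string>();
    variants.add(key);
    groups.set(folded, variants);
  }
  // Keep only groups with more than one spelling of the same key.
  return Array.from(groups.values())
    .filter((variants) => variants.size > 1)
    .map((variants) => Array.from(variants));
}
```

Run against the two diagnostic lines above, this returns a single group holding both spellings of the thread key. A log with no such groups would suggest the duplicate sessions are no longer being created.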

OpenClaw version

2026.4.29 (a448042)

Operating system

macOS 26.2 (Darwin 25.2.0, arm64)

Install method

npm global (/opt/homebrew/lib/node_modules/openclaw), Node v25.8.1, launched via launchd (ai.openclaw.gateway)

Model

anthropic/claude-opus-4-6

Provider / routing chain

openclaw -> anthropic (direct)

Additional provider/model setup details

  • Matrix homeserver: Synapse (self-hosted, private network)
  • Active plugins: lossless-claw
  • The issue affects all Matrix rooms with threads, not specific to any room or thread

Logs, screenshots, and evidence

Stuck session diagnostic pairs (case collision visible):

2026-05-01T07:28:49.423 stuck session: sessionKey=...thread:$lSTsAlYrc_KOmteNbX6zqQxY5ZKMlYa79A7EArC4Jrg state=processing age=144s queueDepth=1
2026-05-01T07:28:49.424 stuck session: sessionKey=...thread:$lstsalyrc_komtenbx6zqqxy5zkmlya79a7earc4jrg state=processing age=134s queueDepth=0

Thread delivery failures:

2026-05-01T08:23:30.711 [delivery-recovery] Retry failed: MatrixError: [400] Can't send relation to unknown event
2026-05-01T08:38:07.270 [restart-sentinel] outbound delivery failed: MatrixError: [400] Can't send relation to unknown event

Scale: 443 "unknown event" failures logged from 2026-04-21 to 2026-05-01 across rooms !CtQaaSFRhaLfsgIJFh (196), !ZatHbfixtvTOjbQoYr (196), !dYPXGBxGPiWXDPdUnz (50).

Related issues

Partially related to #71127 (stuck sessions not auto-aborted), but this is a distinct root cause — the case normalization creates the stuck condition in the first place, and thread delivery fails independently of session state.

extent analysis

TL;DR

The most likely fix is to ensure that OpenClaw treats Matrix event IDs as case-sensitive throughout the pipeline, preventing the creation of duplicate sessions and thread reply delivery failures.

Guidance

  • Identify and modify the code responsible for constructing sessionKey to use the original case of the event ID, rather than lowercasing it.
  • Verify that the m.relates_to relations for thread replies use the original event ID, ensuring case sensitivity.
  • Temporarily, setting threadReplies: "off" in the Matrix channel config can mitigate the issue, but this may not be a desirable long-term solution.
  • Review the diagnostic logs to confirm that the stuck session pairs are no longer created and that thread delivery failures are resolved.

Example

No code snippet is provided as the issue does not specify the exact code location or syntax.
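
Although the real code location is unknown, the general shape of the problem can still be sketched. Everything below is hypothetical: the `ThreadRef` type, both function names, and the `matrix:` key prefix are assumptions made for illustration (the issue elides the real prefix as `...`). The sketch only contrasts a key builder that lowercases the event ID with one that preserves it.

```typescript
// Hypothetical types and names; OpenClaw's actual session-key code
// is not shown in the issue.
type ThreadRef = { roomId: string; threadRootEventId: string };

// Variant matching the reported behavior: the whole key, event ID
// included, is lowercased.
function buildThreadSessionKeyLowercased(ref: ThreadRef): string {
  return `matrix:${ref.roomId}:thread:${ref.threadRootEventId}`.toLowerCase();
}

// Case-preserving variant: the event ID is embedded exactly as received.
function buildThreadSessionKey(ref: ThreadRef): string {
  return `matrix:${ref.roomId}:thread:${ref.threadRootEventId}`;
}

// Recovers the thread root event ID from a key. This is only safe
// when the key was built by the case-preserving variant.
function threadEventIdFromKey(sessionKey: string): string | undefined {
  const marker = ":thread:";
  const idx = sessionKey.lastIndexOf(marker);
  return idx === -1 ? undefined : sessionKey.slice(idx + marker.length);
}
```

If any code path derives the reply target from the session key, as `threadEventIdFromKey` does here, a lowercased key produces an event ID that the homeserver has never seen. Using one case-preserving builder everywhere would also remove the second, differently spelled key for the same thread.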

Notes

The report comes from OpenClaw 2026.4.29 with the lossless-claw plugin active. Nothing in the issue establishes whether the bug is limited to that version or whether the plugin contributes, so both points need further investigation.

Recommendation

Apply a workaround by setting threadReplies: "off" until a permanent fix can be implemented, as this stops both the duplicate sessions and delivery failures, although it may not be the desired functionality.
