openclaw - [Bug]: Session lookup fails after gateway uptime, creating new session instead of resuming existing one (100% reproducible next-day)


Bug Description

Session routing fails after the gateway has been running for an extended period, and the diagnostic logs show sessionId=unknown. The next message creates a brand-new session instead of resuming the existing one, so all prior context is lost. 100% reproducible: it happens on the next day of every active conversation.

Environment

  • OpenClaw: 2026.5.4 (source build)
  • Channel: Feishu (Lark) direct message
  • Model: MiniMax-M2.7-highspeed (200K context)
  • Platform: macOS
  • Bootstrap files: MEMORY.md ~30K chars, AGENTS.md ~15K chars, SOUL.md ~1K chars

Steps to Reproduce

  1. Have an active Feishu direct-message session with an ongoing conversation (dozens of tool calls)
  2. Leave the session idle for several hours (overnight, or long enough to span an extended gateway uptime period)
  3. Send a new message to the same Feishu conversation
  4. Expected: The session resumes with its existing context
  5. Actual: A brand-new session is created, and the agent has no context from the prior conversation

Diagnostic Evidence

When the bug triggers, logs show:

[diagnostic] session state: sessionId=unknown sessionKey=agent:main:feishu:direct:{FEISHU_OPEN_ID} prev=idle new=processing reason="message_start" queueDepth=1

Immediately followed by:

embedded run start: runId={NEW_RUN_ID} sessionId={NEW_SESSION_ID} provider=minimax-portal model=MiniMax-M2.7-highspeed thinking=off messageChannel=webchat

Note: The new run is started with messageChannel=webchat even though the user is continuing a Feishu conversation. The gateway appears unable to route the incoming message to the existing session and falls back to creating a new one.
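The two log lines above can be checked mechanically. The following is a minimal Python sketch, not part of OpenClaw: `parse_fields`, `channel_from_key`, and `check_pair` are hypothetical helper names, and the session key layout `agent:<agent>:<channel>:<chatType>:<peer>` is an assumption inferred from the single key shown in this report.

```python
import re
from typing import Dict, List, Optional

# Matches key=value pairs; a value is either a double-quoted string or a bare token.
_FIELD = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_fields(line: str) -> Dict[str, str]:
    """Extract key=value fields from one log line into a dict."""
    return {key: value.strip('"') for key, value in _FIELD.findall(line)}


def channel_from_key(session_key: str) -> Optional[str]:
    """Return the channel segment of a session key, or None if the shape differs.

    Assumption: keys look like agent:<agent>:<channel>:<chatType>:<peer>.
    """
    parts = session_key.split(":")
    if len(parts) >= 5 and parts[0] == "agent":
        return parts[2]
    return None


def check_pair(state_line: str, run_line: str) -> List[str]:
    """Compare a 'session state' line with the 'embedded run start' line after it."""
    state = parse_fields(state_line)
    run = parse_fields(run_line)
    findings = []
    if state.get("sessionId") == "unknown":
        findings.append("session key did not resolve to a session ID")
    key_channel = channel_from_key(state.get("sessionKey", ""))
    run_channel = run.get("messageChannel")
    if key_channel and run_channel and key_channel != run_channel:
        findings.append(
            f"channel mismatch: key says {key_channel}, run says {run_channel}"
        )
    return findings


# Sample lines shaped like the ones in this report; IDs are placeholders.
STATE_LINE = (
    "[diagnostic] session state: sessionId=unknown "
    "sessionKey=agent:main:feishu:direct:ou_example prev=idle new=processing "
    'reason="message_start" queueDepth=1'
)
RUN_LINE = (
    "embedded run start: runId=run_example sessionId=sess_example "
    "provider=minimax-portal model=MiniMax-M2.7-highspeed thinking=off "
    "messageChannel=webchat"
)

for finding in check_pair(STATE_LINE, RUN_LINE):
    print(finding)
```

Running this over consecutive pairs of gateway log lines would show whether every sessionId=unknown event coincides with a webchat-tagged run, or only some of them.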

The session file on disk for the old session still exists and contains all prior conversation history. The sessions.json store entry for the channel was not updated: the old session ID still resolves by direct session ID lookup.

What We Know

  • Session file is NOT lost: The old session .jsonl file still exists on disk with full history
  • sessions.json entry is NOT cleared: Direct lookup by session ID still resolves
  • Session key routing fails: The gateway cannot map the incoming channel message to the existing session by session key
  • New session is webchat-originated: The new session is created with messageChannel=webchat, not feishu — suggesting a routing-layer fallback rather than a proper session resume
  • Compaction happened before the failure: Session compaction ran successfully (compactionCount=2), ruling out compaction corruption

Session Store Entry (anonymized)

```json
{
  "sessionId": "{ANONYMIZED_OLD_SESSION_ID}",
  "sessionStartedAt": "{TIMESTAMP_BEFORE_FAILURE}",
  "lastInteractionAt": "{TIMESTAMP_BEFORE_FAILURE}",
  "chatType": "direct",
  "deliveryContext": {
    "channel": "feishu",
    "accountId": "main"
  },
  "lastChannel": "feishu",
  "lastAccountId": "main",
  "origin": {
    "provider": "webchat",
    "surface": "webchat",
    "chatType": "direct",
    "from": "feishu:{FEISHU_OPEN_ID}",
    "to": "{FEISHU_OPEN_ID}",
    "accountId": "main"
  },
  "compactionCount": 2
}
```

Note the mismatch: deliveryContext.channel is "feishu" but origin.provider is "webchat". This hybrid origin object may be related to the routing failure.
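To find other entries with the same kind of mismatch, the channel-bearing fields of an entry can be compared against each other. This is a minimal Python sketch using only the field names visible in the entry above; `channel_views` and `is_hybrid` are hypothetical helpers, and treating any disagreement as suspicious is a heuristic, not confirmed OpenClaw behavior.

```python
from typing import Dict, Optional


def channel_views(entry: dict) -> Dict[str, Optional[str]]:
    """Collect every field of a session entry that names a channel."""
    delivery = entry.get("deliveryContext") or {}
    origin = entry.get("origin") or {}
    sender = origin.get("from") or ""
    # "feishu:{id}" style senders carry the channel as a prefix (assumption).
    sender_prefix = sender.split(":", 1)[0] if ":" in sender else None
    return {
        "deliveryContext.channel": delivery.get("channel"),
        "lastChannel": entry.get("lastChannel"),
        "origin.provider": origin.get("provider"),
        "origin.surface": origin.get("surface"),
        "origin.from prefix": sender_prefix,
    }


def is_hybrid(entry: dict) -> bool:
    """True when the entry's channel-bearing fields disagree with each other."""
    distinct = {value for value in channel_views(entry).values() if value is not None}
    return len(distinct) > 1


# Entry shaped like the anonymized one in this report; IDs are placeholders.
reported_entry = {
    "sessionId": "sess_old",
    "deliveryContext": {"channel": "feishu", "accountId": "main"},
    "lastChannel": "feishu",
    "origin": {
        "provider": "webchat",
        "surface": "webchat",
        "from": "feishu:ou_example",
        "to": "ou_example",
    },
}

if is_hybrid(reported_entry):
    for field, value in channel_views(reported_entry).items():
        print(f"{field}: {value}")
```

If only the failing sessions turn out to be hybrid, that would support the mismatch as a cause; if healthy sessions are hybrid too, the mismatch is more likely incidental.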

Timing Pattern

  • Session works fine when actively used
  • Bug triggers after gateway has been running for extended period (hours)
  • Next message after idle period creates new session instead of resuming
  • The idle period appears to be the key trigger — not a gateway restart

Related Issues

  • #18194 (closed): Session lost after compaction timeout — similar context loss pattern but different trigger (compaction timeout vs. idle period)
  • #21104 (closed): Session history orphaned by agent routing change — similar session-lookup-failure pattern
  • #78059 (open): Session reset on idle loses triggering message — Slack thread about session reset on idle

Suggested Investigation

  1. Check whether the session key → session ID mapping in the session store degrades over gateway uptime
  2. Investigate why the new session is created with messageChannel=webchat when the source is a feishu message
  3. Check whether there is a session store entry expiration or cleanup mechanism that could explain the next-day pattern
  4. The hybrid origin object with mismatched provider: "webchat" vs channel: "feishu" in the session entry may be the root cause of session key resolution failure
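Items 1 and 3 can be approached by snapshotting the session store while the session is healthy and again after the failure, then diffing the two. The sketch below is Python and rests on a loud assumption: that sessions.json is a single JSON object mapping each session key to an entry with a "sessionId" field. The report does not show the file's top-level layout, and `load_store` and `diff_snapshots` are hypothetical helpers.

```python
import json
from typing import Dict, List


def load_store(path: str) -> dict:
    """Load a session store snapshot (assumed: JSON object keyed by session key)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def diff_snapshots(before: dict, after: dict) -> Dict[str, List[str]]:
    """Report session keys that vanished, appeared, or now map to a different ID."""
    removed = sorted(key for key in before if key not in after)
    added = sorted(key for key in after if key not in before)
    remapped = sorted(
        key
        for key in before
        if key in after
        and before[key].get("sessionId") != after[key].get("sessionId")
    )
    return {"removed": removed, "added": added, "remapped": remapped}


# In-memory stand-ins for two snapshots; keys and IDs are placeholders.
healthy = {
    "agent:main:feishu:direct:ou_example": {"sessionId": "sess_old"},
}
after_failure = {
    "agent:main:feishu:direct:ou_example": {"sessionId": "sess_new"},
}

print(diff_snapshots(healthy, after_failure))
```

A key listed under "removed" would point toward expiration or cleanup, while a key under "remapped" would point toward the lookup falling through and the entry being overwritten by the new session.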
