openclaw: [Bug]: EmbeddedAttemptSessionTakeoverError still recurs on 2026.5.22 (post-#84250) on fresh sessions during long/tool-heavy turns (follow-up to #85306)

On 2026.5.22 (a374c3a), which includes #84250, EmbeddedAttemptSessionTakeoverError is reduced but NOT eliminated. Short turns succeed; long/tool-heavy turns still fail with the takeover — on FRESH sessions (the previously-implicated corrupted file was removed and the error simply relocated). Follow-up to #85306 (closed + locked "as resolved") and its fix #84250.

Error Message

Window A: gateway timeout (42s) → embedded fallback → takeover on fresh fallback session

20:54:16 [context-engine] Context engine "lossless-claw" is not registered; falling back to default engine "legacy".
20:54:26 warn diagnostic: lane wait exceeded: lane=session:agent:main:telegram:direct:<redacted> waitedMs=9602 queueAhead=0 activeAhead=1
20:54:30 EMBEDDED FALLBACK: Gateway agent timed out; running embedded agent with fresh session gateway-fallback-647d4fa4-…: GatewayTransportError: gateway timeout after 42000ms (target wss://127.0.0.1:18789)
20:54:48 warn agent/embedded: embedded run timeout: runId=gateway-fallback-647d4fa4-… sessionId=gateway-fallback-647d4fa4-… timeoutMs=12000
20:54:48 error diagnostic: lane task error: lane=main durationMs=16917 error="EmbeddedAttemptSessionTakeoverError: session file changed while embedded prompt lock was released: /Users/<user>/.openclaw/agents/main/sessions/gateway-fallback-647d4fa4-…jsonl"
20:54:48 error diagnostic: lane task error: lane=session:agent:main:explicit:gateway-fallback-647d4fa4-… durationMs=16920 error="EmbeddedAttemptSessionTakeoverError: …gateway-fallback-647d4fa4-…jsonl"

Window B: long turn (974-char inbound) → takeover on fresh session 4f181fe9 (durationMs 57281)

20:58:29 inbound: telegram:<redacted> -> @<bot> (direct, 974 chars)
20:58:30 [context-engine] Context engine "lossless-claw" is not registered; falling back to default engine "legacy".
20:59:28 error diagnostic: lane task error: lane=main durationMs=57281 error="EmbeddedAttemptSessionTakeoverError: session file changed while embedded prompt lock was released: /Users/<user>/.openclaw/agents/main/sessions/4f181fe9-9743-4e48-979c-9175400e3229.jsonl"
20:59:28 error diagnostic: lane task error: lane=session:agent:main:telegram:direct:<redacted> durationMs=57304 error="EmbeddedAttemptSessionTakeoverError: …/sessions/4f181fe9-…jsonl"
20:59:28 error Embedded agent failed before reply: …/sessions/4f181fe9-…jsonl

Context: the now-removed corrupted file showed this just before its failures (different session, for reference)

19:39:03 warn agent/embedded: session file repair skipped: invalid session header (ea3fedf2-…jsonl)

Bug type

Regression (worked before, now fails)

Beta release blocker

No

Steps to reproduce

  1. OpenClaw 2026.5.22 (a374c3a), single agent (main), Telegram direct.
  2. Stripped to bare core: cron disabled, heartbeats "0m", memory plugins off (browser/memory-wiki/telegram only); lossless-claw disabled → legacy context engine.
  3. Use a fresh session (we removed an earlier corrupted session file; this reproduces on brand-new sessions).
  4. Send a long or tool-heavy message (one that needs extended processing or tool calls).
  5. Result: the turn fails before reply with EmbeddedAttemptSessionTakeoverError, and the user sees "Something went wrong while processing your request." Short turns (a few seconds) succeed; the failure correlates with turn length.

Expected behavior

Long/tool-heavy turns complete and reply, same as short turns. #84250 should tolerate the run's own in-process transcript writes for the whole turn, including long ones.

Actual behavior

On long turns the takeover fence aborts the run: "session file changed while embedded prompt lock was released". Two lanes (lane=main + lane=session:…) fail on the same .jsonl. Also reproduces via the gateway-timeout → embedded-fallback path (gateway agent times out at 42s; the fallback session then trips the same takeover).
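One way to confirm the dual-lane pattern in a larger log is to group the "lane task error" lines by session file. The sketch below is a minimal, hedged example: the regular expressions are inferred only from the log lines quoted in this report (not from any documented OpenClaw log schema), and the names `LANE_ERROR`, `SESSION_ID`, `group_takeovers`, and `dual_lane_sessions` are hypothetical helpers introduced for illustration.

```python
import re
from collections import defaultdict

# Assumption: pattern inferred from the sample lines in this report only.
LANE_ERROR = re.compile(
    r'lane task error: lane=(?P<lane>\S+) durationMs=(?P<ms>\d+) '
    r'error="EmbeddedAttemptSessionTakeoverError: (?P<detail>[^"]*)"'
)
# Assumption: a session is identified by the first 8 hex characters of its id,
# optionally prefixed with "gateway-fallback-", because paths are truncated
# differently from line to line.
SESSION_ID = re.compile(r'(?:gateway-fallback-)?\b[0-9a-f]{8}(?=-)')


def group_takeovers(lines):
    """Map session key -> list of (lane, durationMs) for takeover errors."""
    by_session = defaultdict(list)
    for line in lines:
        m = LANE_ERROR.search(line)
        if not m:
            continue  # not a takeover lane error
        key = SESSION_ID.search(m["detail"])
        if key:
            by_session[key.group(0)].append((m["lane"], int(m["ms"])))
    return dict(by_session)


def dual_lane_sessions(grouped):
    """Return session keys on which two or more distinct lanes failed."""
    return sorted(k for k, v in grouped.items() if len({lane for lane, _ in v}) >= 2)


# Sample lines copied from the log windows in this report.
SAMPLE = [
    '20:54:48 error diagnostic: lane task error: lane=main durationMs=16917 error="EmbeddedAttemptSessionTakeoverError: session file changed while embedded prompt lock was released: /Users/<user>/.openclaw/agents/main/sessions/gateway-fallback-647d4fa4-…jsonl"',
    '20:54:48 error diagnostic: lane task error: lane=session:agent:main:explicit:gateway-fallback-647d4fa4-… durationMs=16920 error="EmbeddedAttemptSessionTakeoverError: …gateway-fallback-647d4fa4-…jsonl"',
    '20:58:29 inbound: telegram:<redacted> -> @<bot> (direct, 974 chars)',
    '20:59:28 error diagnostic: lane task error: lane=main durationMs=57281 error="EmbeddedAttemptSessionTakeoverError: session file changed while embedded prompt lock was released: /Users/<user>/.openclaw/agents/main/sessions/4f181fe9-9743-4e48-979c-9175400e3229.jsonl"',
    '20:59:28 error diagnostic: lane task error: lane=session:agent:main:telegram:direct:<redacted> durationMs=57304 error="EmbeddedAttemptSessionTakeoverError: …/sessions/4f181fe9-…jsonl"',
]

if __name__ == "__main__":
    grouped = group_takeovers(SAMPLE)
    for key in dual_lane_sessions(grouped):
        print(key, grouped[key])
```

Matching on the 8-character id prefix is a deliberate simplification; with many sessions, prefixes could collide, so a full id should be used where the log provides one.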

OpenClaw version

2026.5.22 (a374c3a). Original report #85306 was on 2026.5.20 (e510042) / first seen 2026.5.19 (a185ca2).

Operating system

macOS 26.5 (arm64), Apple M4 Mac mini

Install method

pnpm; gateway run as a user LaunchAgent

Model

anthropic/claude-sonnet-4-6

Provider / routing chain

Anthropic direct; auth = api-key (anthropic:default). Fallback chain: claude-sonnet-4-6 → claude-haiku-4-5 → claude-opus-4-7.

Additional provider/model setup details

Stripped to bare core during diagnosis — lossless-claw, active-memory, memory-core all DISABLED (legacy context engine), cron disabled, heartbeats "0m". So none of those are involved. No API keys/tokens included.

Impact and severity

  • Affected: agent main, every long/tool-heavy turn, on a clean 2026.5.22 install (Telegram; also seen in terminal chat mode in #85306).
  • Severity: blocks all substantial work — the agent cannot complete a long turn.
  • Frequency: deterministic on long/tool-heavy turns; short turns pass.
  • Consequence: agent unusable for any real task on 2026.5.22.

Additional information

Follow-up to #85306 (closed + locked "as resolved" by openclaw-barnacle, so I couldn't reopen/comment) and its fix #84250.

Regression timeline: worked before 2026.5.19; #82767 (in 2026.5.20) cut error volume from ~167 per morning to ~3; #84250 (in 2026.5.22) fixed the short-turn case; the long-turn dual-lane race remains.

We have a LIVE session (4f181fe9) reproducing this deterministically on long turns. The earlier corrupted file (ea3fedf2, invalid header) was removed — the error then relocated to fresh sessions (4f181fe9, gateway-fallback-647d4fa4), confirming it's not file-corruption-specific.

Apparent mechanism: the takeover fence compares the session file across the prompt-lock-release window. On a long turn the file keeps growing during that window (transcript appends, tool output), so the fence reads the growth as an external change and aborts, even though the writes are the run's own in-process writes. This is likely the broader "core same-session lane ownership" case beyond the #85208 Telegram dedupe. Happy to share full unsanitized logs privately if useful.
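The apparent mechanism can be illustrated with a toy model. This is not OpenClaw's implementation, which is not shown in this report; it is a sketch under the assumption that the fence fingerprints the session file (here, simply its byte size) when the prompt lock is released and compares again on reacquire. `NaiveFence`, `OwnWriteAwareFence`, and `demo` are hypothetical names. The second class shows one possible way a fence could tolerate the run's own appends while still catching a foreign writer.

```python
import os
import tempfile


class NaiveFence:
    """Toy fence: flags ANY size change across the lock-release window."""

    def __init__(self, path):
        self.path = path
        self.size_at_release = None

    def release(self):
        # Snapshot taken when the prompt lock is released (assumed behavior).
        self.size_at_release = os.path.getsize(self.path)

    def changed_on_reacquire(self):
        return os.path.getsize(self.path) != self.size_at_release


class OwnWriteAwareFence(NaiveFence):
    """Toy fence that accounts for bytes appended by this run itself."""

    def __init__(self, path):
        super().__init__(path)
        self.own_bytes = 0

    def release(self):
        super().release()
        self.own_bytes = 0

    def append_own(self, data):
        # The run's own in-process transcript append, recorded as owned.
        with open(self.path, "ab") as f:
            f.write(data)
        self.own_bytes += len(data)

    def changed_on_reacquire(self):
        expected = self.size_at_release + self.own_bytes
        return os.path.getsize(self.path) != expected


def demo():
    """Return (naive, aware) verdicts after own writes, then after a foreign write."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "session.jsonl")
        with open(path, "wb") as f:
            f.write(b'{"type":"header"}\n')

        naive, aware = NaiveFence(path), OwnWriteAwareFence(path)
        naive.release()
        aware.release()

        # Long turn: the run appends tool output while the lock is released.
        aware.append_own(b'{"type":"tool_result"}\n')
        own_only = (naive.changed_on_reacquire(), aware.changed_on_reacquire())

        # A genuinely external writer touches the same file.
        with open(path, "ab") as f:
            f.write(b'{"type":"foreign"}\n')
        with_external = (naive.changed_on_reacquire(), aware.changed_on_reacquire())
        return own_only, with_external


if __name__ == "__main__":
    print(demo())
```

In this model the naive fence reports a takeover as soon as the run appends to its own transcript, which matches the symptom described for long turns, while the ownership-aware variant only reports a change once a foreign write appears. Comparing byte sizes is a simplification chosen for brevity; a real fence might compare something else entirely.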
