openclaw - 💡(How to fix) Fix EmbeddedAttemptSessionTakeoverError: completed LLM call silently discarded under concurrent same-session writes

Error Message

gateway.err.log contains 84 occurrences. Representative sample:

2026-05-18T20:56:16.827-07:00 [diagnostic] lane task error: lane=main durationMs=246652 error="EmbeddedAttemptSessionTakeoverError: session file changed while embedded prompt lock was released: /Users/xiaoou/lobster-team/agents/main/.openclaw-runtime/sessions/ 43febae9-82cf-4e7a-9a49-120018b31401.jsonl"

Some discarded calls had run for as long as 830,691ms (~14 min) before being thrown away.

Code Example

gateway.err.log contains 84 occurrences. Representative sample:

2026-05-18T20:56:16.827-07:00 [diagnostic] lane task error: lane=main
durationMs=246652 error="EmbeddedAttemptSessionTakeoverError: session file
changed while embedded prompt lock was released:
/Users/xiaoou/lobster-team/agents/main/.openclaw-runtime/sessions/
43febae9-82cf-4e7a-9a49-120018b31401.jsonl"

Some discarded calls had run for as long as 830,691ms (~14 min) before 
being thrown away.

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

Summary

When a second message arrives in the same agent session before the first LLM call completes, the first call is discarded with EmbeddedAttemptSessionTakeoverError — the lock is released during LLM inference, allowing the second message to overwrite the session file and invalidate the first call's fingerprint.

Steps to reproduce

Start OpenClaw gateway (2026.4.26) on macOS, single agent "main" with multiple entry points (Feishu DM + dashboard + CLI)
Send a message that triggers an LLM call lasting >30 seconds (e.g., a multi-step autonomous task with tool calls)
Within 30 seconds, send a second message in the same session
Observe: the first message never returns a response; the user sees "..." indefinitely

This reproduces consistently whenever two messages arrive within the same LLM inference window. It is not a race condition requiring precise timing — any second message during the first call's inference phase triggers it.

Expected behavior

The first LLM call should complete and return its response to the user. The second message should either wait (pessimistic lock) or be written independently without invalidating the first (sharded sessions). At minimum, a completed LLM call that cost tokens should not be silently discarded.

Actual behavior

The first LLM call's response is silently discarded. The gateway error log records:

EmbeddedAttemptSessionTakeoverError: session file changed while embedded prompt lock was released: <session-path>.jsonl

84 occurrences of this error in gateway.err.log between 2026-05-18 and 2026-05-19. The discarded call had already completed its full LLM inference (some for up to 830 seconds / ~14 minutes), burning tokens with zero user-visible output.

OpenClaw version

2026.5.18

Operating system

macOS 26.4.1 (Build 25E253)

Install method

npm global install, located at ~/.openclaw/tools/node-v22.22.0/lib/node_modules/openclaw/

Model

mimo/mimo-v2.5-pro (default). Also reproducible with apiset-anthropic/claude-sonnet-4-6 and apiset/qwen3.5-flash. The issue is model-agnostic — it occurs at the session/lock layer, not the LLM layer.

Provider / routing chain

Local OpenClaw gateway at 127.0.0.1:18789, routing to multiple providers (apiset, apiset-anthropic, mimo). Multi-entry: Feishu DM, dashboard web UI, and CLI all targeting the same "main" agent.

Additional provider/model setup details

Single agent ("main") with a model allowlist of 17 models across 3 providers. The agent has 3 Feishu accounts bound to it plus dashboard access. No custom lock configuration — using default OpenClaw session locking. Session files are .jsonl format stored at: agents/main/.openclaw-runtime/sessions/<uuid>.jsonl

Logs, screenshots, and evidence

gateway.err.log contains 84 occurrences. Representative sample:

2026-05-18T20:56:16.827-07:00 [diagnostic] lane task error: lane=main
durationMs=246652 error="EmbeddedAttemptSessionTakeoverError: session file
changed while embedded prompt lock was released:
/Users/xiaoou/lobster-team/agents/main/.openclaw-runtime/sessions/
43febae9-82cf-4e7a-9a49-120018b31401.jsonl"

Some discarded calls had run for as long as 830,691ms (~14 min) before 
being thrown away.

Impact and severity

Severity: Workflow-blocking

Affected: Any user with multiple entry points to the same agent (Feishu + dashboard + CLI), or any user who sends a follow-up message before the first response arrives.

Frequency: Reproducible every time two messages arrive in the same session within one LLM inference window.

Consequences:

Tokens burned on discarded LLM calls with zero user-visible output
Agent appears frozen/broken (user sees "..." indefinitely)
Multi-step autonomous tasks (where the agent chains tool calls) are particularly vulnerable — a user checking progress triggers the bug and kills the running task
Makes concurrent use of the same agent (dashboard monitoring + Feishu chat) effectively unsafe

Additional information

Root cause analysis:

The session lock is released before LLM inference begins, not after:

T0: Message A → acquire lock → write session → release lock → LLM inference T1: Message B → acquire lock → overwrite session → release lock → LLM inference T2: A's LLM returns → fingerprint mismatch → takeover error → A discarded

The lock guards only the session file write, not the full request lifecycle.

Suggested fix (pessimistic lock):

acquire_lock → write session → run LLM → write response → release_lock

For single-user and small-team deployments, serializing same-session access is far preferable to silently discarding completed work with burned tokens.

Alternative (sharded sessions): write each turn as an independent file (turn_001_user.json, turn_001_assistant.json, etc.) to eliminate fingerprint conflicts and allow true concurrent writes without a lock.

FAQ

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

openclaw - 💡(How to fix) Fix EmbeddedAttemptSessionTakeoverError: completed LLM call silently discarded under concurrent same-session writes

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

Code Example

Bug type

Beta release blocker

Summary

Steps to reproduce

Expected behavior

Actual behavior

OpenClaw version

Operating system

Install method

Model

Provider / routing chain

Additional provider/model setup details

Logs, screenshots, and evidence

Impact and severity

Additional information

FAQ

Expected behavior

Still need to ship something?

TRENDING