openclaw - Fix WhatsApp session stalls on long model_call: incomplete turn with payloads=0, reply never delivered [1 pull request]

openclaw, 2026-05-20 12:22:33

Error Message

11:58:06 [agent/embedded] incomplete turn detected: stopReason=stop payloads=0 — surfacing error to user

When an incomplete turn is detected with payloads=0, the embedded runner returns an error payload, but that payload is never surfaced to the WhatsApp channel: no send log appears, because the delivery layer silently drops error payloads from incomplete turns.

Expected: incomplete-turn error payloads should be delivered to the user via the originating channel, not silently dropped.

Root Cause

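Two separate but related issues: new inbound messages queue behind a long-running model call until the session stalls, and the error payload from the resulting incomplete turn is never delivered to WhatsApp.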

Fix Action

Fixed

Fixed by PR: fix(whatsapp): deliver final error payloads so incomplete-turn errors reach users (https://github.com/openclaw/openclaw/pull/84578)

Problem

When a WhatsApp direct session is processing a long model call, subsequent inbound messages queue up behind it. If the model call takes too long (roughly 120-240 s), the session transitions to stalled_agent_run and eventually terminates with an incomplete turn (stopReason=stop, payloads=0). The reply is never delivered to WhatsApp, so the user sees nothing.

Version

OpenClaw 2026.5.18 (50a2481)

Reproduction (observed in production)

1. User sends a WhatsApp message → agent starts processing (long model call)
2. While the model call is still running, the user sends another message (~60s later)
3. Gateway logs: long-running session → stalled_agent_run → incomplete turn detected: payloads=0
4. Reply never appears in WhatsApp

Gateway Logs (evidence)

11:54:00  [whatsapp] Inbound message <REDACTED> (107 chars)
11:56:25  [diagnostic] long-running session: age=142s queueDepth=1 reason=queued_behind_active_work activeWorkKind=model_call
11:57:55  [diagnostic] stalled session: age=232s reason=active_work_without_progress classification=stalled_agent_run
11:58:06  [agent/embedded] incomplete turn detected: stopReason=stop payloads=0 — surfacing error to user
11:58:24  [whatsapp] Web connection closed (status 428)

After 11:58:06, zero outbound WhatsApp send logs appear. The reply payload (isError: true) from the incomplete turn handler is never delivered to the WhatsApp channel.
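
To check a longer log for the same pattern, one option is to list everything the gateway logged after the incomplete-turn line. The sketch below is only an illustration: `parseLine` and `entriesAfterIncompleteTurn` are hypothetical helpers, not OpenClaw functions, and the line format is inferred from the excerpt above.

```typescript
// Hypothetical helpers, not part of OpenClaw. They assume gateway log lines
// of the form "HH:MM:SS  [tag] message", as in the excerpt above.
interface LogEntry {
  time: string;
  tag: string;
  message: string;
}

function parseLine(line: string): LogEntry | null {
  const match = /^(\d{2}:\d{2}:\d{2})\s+\[([^\]]+)\]\s+(.*)$/.exec(line.trim());
  if (!match) return null;
  const [, time = "", tag = "", message = ""] = match;
  return { time, tag, message };
}

// Returns every parsed entry logged after the first incomplete-turn line.
function entriesAfterIncompleteTurn(log: string): LogEntry[] {
  const entries = log
    .split("\n")
    .map(parseLine)
    .filter((e): e is LogEntry => e !== null);
  const idx = entries.findIndex((e) =>
    e.message.startsWith("incomplete turn detected"),
  );
  return idx === -1 ? [] : entries.slice(idx + 1);
}

const sample = [
  "11:57:55  [diagnostic] stalled session: age=232s reason=active_work_without_progress classification=stalled_agent_run",
  "11:58:06  [agent/embedded] incomplete turn detected: stopReason=stop payloads=0 — surfacing error to user",
  "11:58:24  [whatsapp] Web connection closed (status 428)",
].join("\n");

const after = entriesAfterIncompleteTurn(sample);
console.log(after.map((e) => `${e.time} [${e.tag}] ${e.message}`));
```

For the excerpt above, the only entry after the incomplete turn is the WhatsApp connection close, which matches the observation that no outbound send was logged.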

Root Cause Analysis

Two separate but related issues:

1. Session Queue Deadlock

The embedded runner queues new inbound messages behind an active model call. When the model call runs long (120s+), the session enters stalled_agent_run state with recovery=none. There is no mechanism to interrupt the long model call gracefully, deliver a "still working" status message, or queue subsequent messages without dropping the current one.
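
A minimal sketch of the missing mechanism, assuming the model call can accept an `AbortSignal`: send a "still working" notice after a short delay, then abort and return a fallback message at a hard deadline. `runWithDeadline`, `DeadlineOptions`, and `notify` are hypothetical names for illustration and do not come from the OpenClaw codebase.

```typescript
// Hypothetical sketch; none of these names come from the OpenClaw codebase.
interface DeadlineOptions {
  noticeAfterMs: number; // delay before the "still working" notice
  timeoutMs: number; // hard deadline for the model call
  notify: (text: string) => void; // stand-in for a channel send
}

async function runWithDeadline(
  work: (signal: AbortSignal) => Promise<string>,
  opts: DeadlineOptions,
): Promise<string> {
  const controller = new AbortController();
  const noticeTimer = setTimeout(
    () => opts.notify("Still working on your request..."),
    opts.noticeAfterMs,
  );
  let timeoutTimer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<string>((resolve) => {
    timeoutTimer = setTimeout(() => {
      controller.abort(); // ask the model call to stop
      resolve("Sorry, that took too long. Please try again.");
    }, opts.timeoutMs);
  });
  try {
    // Whichever settles first wins: the model reply or the fallback text.
    return await Promise.race([work(controller.signal), timeout]);
  } finally {
    clearTimeout(noticeTimer);
    clearTimeout(timeoutTimer);
  }
}
```

With this shape the caller always gets a string to deliver, so a slow model call ends in a user-visible message rather than an empty turn. Whether the underlying model client honors the abort signal is a separate question that this sketch does not answer.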

2. Incomplete Turn Payloads Never Delivered

When an incomplete turn is detected with payloads=0, the embedded runner returns an error payload. This error payload is never surfaced to the WhatsApp channel, and no send log appears: the delivery layer silently drops error payloads from incomplete turns.
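
The intended behavior can be sketched as a delivery step that forwards error payloads rather than filtering them out. This is an illustration under assumptions, not the change made in the linked PR: `ReplyPayload`, `Send`, and `deliverTurn` are hypothetical names, and only the `isError` flag is taken from the report.

```typescript
// Hypothetical types and names; the real OpenClaw delivery layer may differ.
interface ReplyPayload {
  text: string;
  isError?: boolean; // the report mentions a reply payload with isError: true
}

type Send = (text: string) => void;

// Sends every payload that has text, including error payloads, and falls
// back to a generic notice when the turn produced nothing deliverable.
// Returns the number of messages sent.
function deliverTurn(payloads: ReplyPayload[], send: Send): number {
  const deliverable = payloads.filter((p) => p.text.trim().length > 0);
  if (deliverable.length === 0) {
    send("Something went wrong and no reply was produced. Please try again.");
    return 1;
  }
  for (const p of deliverable) {
    send(p.isError ? `Error: ${p.text}` : p.text);
  }
  return deliverable.length;
}
```

The key design choice is that every turn results in at least one send, so the channel never stays silent after an incomplete turn.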

Expected Behavior

1. Long model calls should time out with a user-visible fallback message
2. Incomplete turn error payloads should be delivered to the user via the originating channel, not silently dropped

Impact

Users on WhatsApp (and potentially other channels) lose responses entirely when model calls take too long.

Environment

Host: ARM64 (DGX Spark, NVIDIA GB10)
Model: vLLM serving Qwen3.6-27B-FP8 (--gpu-memory-utilization 0.65)
Concurrent workloads: Chatterbox TTS, vectordb graph build (254 pages)
Gateway memory pressure observed (heap 1.4GB, threshold 1.0GB)
