openclaw - 💡(How to fix) Fix LLM idle timeout error silently dropped when agentRunStarted is true [2 pull requests]

Fix Action

Fixed

Bug Description

When an LLM idle timeout occurs after the agent has started (e.g., after tool calls), the error is written to the session log but never broadcast to connected clients. Users see no error feedback — the response silently stops.

Reproduction

  1. Start an agent session via gateway (e.g., through ACP bridge / Ki-Agents)
  2. The agent begins processing — reads skills, makes tool calls (so agentRunStarted = true)
  3. On a subsequent LLM call, the model fails to produce any token within the idle timeout window (default 120s)
  4. The timeout error is logged to the session JSONL file but never reaches the client
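Steps 3 and 4 depend on two behaviors: an idle timer that expires when no token arrives, and a timeout that surfaces as a returned payload rather than a thrown exception (see Root Cause). A minimal TypeScript sketch of that pattern follows. `IdleWatchdog` and `ReplyPayload` are hypothetical names for illustration and are not taken from the OpenClaw source; the clock is passed in explicitly so the logic can be checked without real timers.

```typescript
// Hypothetical payload shape, modeled on the { text, isError } object described in this issue.
interface ReplyPayload {
  text: string;
  isError: boolean;
}

// Tracks the time of the last received token (illustrative, not OpenClaw code).
class IdleWatchdog {
  private readonly timeoutMs: number;
  private lastActivityMs: number;

  constructor(timeoutMs: number, startMs: number) {
    this.timeoutMs = timeoutMs;
    this.lastActivityMs = startMs;
  }

  // Call whenever the model emits a token.
  touch(nowMs: number): void {
    this.lastActivityMs = nowMs;
  }

  // True once no token has arrived for at least timeoutMs.
  isExpired(nowMs: number): boolean {
    return nowMs - this.lastActivityMs >= this.timeoutMs;
  }

  // Builds an error payload instead of throwing, mirroring the behavior
  // this issue attributes to run.ts.
  toErrorPayload(): ReplyPayload {
    const seconds = Math.round(this.timeoutMs / 1000);
    return {
      text: `LLM idle timeout (${seconds}s): no response from model`,
      isError: true,
    };
  }
}
```

Because the timeout is reported as a normal return value, any caller that only handles rejections will treat the run as successful. That is the condition the rest of this issue describes.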

Session Log Evidence

{"type":"custom","customType":"openclaw:prompt-error","data":{"error":"LLM idle timeout (120s): no response from model | LLM idle timeout (120s): no response from model",...}}

The session ends here — no final/error event is broadcast.
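To confirm the symptom on an affected session, the JSONL file can be scanned for entries of this kind. The sketch below assumes only the fields visible in the quoted log line (`type`, `customType`, `data.error`); `findPromptErrors` and `SessionLogEntry` are hypothetical helpers, not part of OpenClaw.

```typescript
// Assumed entry shape, inferred from the single log line quoted in this issue.
interface SessionLogEntry {
  type?: string;
  customType?: string;
  data?: { error?: string };
}

// Returns the error strings of all "openclaw:prompt-error" entries in a JSONL string.
function findPromptErrors(jsonl: string): string[] {
  const errors: string[] = [];
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;

    let parsed: unknown;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      continue; // skip lines that are not valid JSON
    }
    if (typeof parsed !== "object" || parsed === null) continue;

    const entry = parsed as SessionLogEntry;
    if (entry.type === "custom" && entry.customType === "openclaw:prompt-error") {
      errors.push(entry.data?.error ?? "");
    }
  }
  return errors;
}
```

A non-empty result for a session whose client never received an error event would match the behavior reported here.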

Root Cause

File: src/gateway/server-methods/chat.ts (.then() handler, ~line 2692 in main)

if (!agentRunStarted) {
  // Agent never started → processes deliveredReplies, broadcasts final/error ✅
  broadcastChatFinal(...);
} else if (!hasBeforeAgentRunGate) {
  // Agent started → only updates transcript, NO broadcast ❌
  await emitUserTranscriptUpdate();
}

The timeout error flows like this:

  1. run.ts handles the timeout by returning an error payload ({ text: "...", isError: true }), not throwing an exception
  2. The error payload is collected in deliveredReplies via the deliver callback
  3. The .then() handler checks agentRunStarted — since the agent had started (it made tool calls), it's true
  4. The code only calls emitUserTranscriptUpdate(); neither broadcastChatError() nor broadcastChatFinal() is called
  5. Meanwhile, .catch() (which does call broadcastChatError()) is never reached because run.ts returned normally instead of throwing

Result: The error payload sits in deliveredReplies but is never broadcast. Connected clients (ACP bridges, etc.) never receive any error event.
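The flow above can be modeled with a simplified, synchronous stand-in for the `.then()` handler. This is a sketch under stated assumptions, not the code in chat.ts: the `DeliveredReply` shape, the `HandlerOutcome` record, and the rule that the not-started branch emits "error" when any payload has `isError` set are all illustrative guesses based on the comments in the snippet.

```typescript
// Hypothetical shapes for illustration only.
interface DeliveredReply {
  payload: { text?: string; isError?: boolean };
}

interface HandlerOutcome {
  clientEvents: string[];    // chat events a connected client would receive
  transcriptUpdates: number; // number of transcript-only updates
}

// Simplified model of the current branching logic described in this issue.
function currentThenHandler(
  agentRunStarted: boolean,
  hasBeforeAgentRunGate: boolean,
  deliveredReplies: DeliveredReply[],
): HandlerOutcome {
  const outcome: HandlerOutcome = { clientEvents: [], transcriptUpdates: 0 };
  if (!agentRunStarted) {
    // Stands in for broadcastChatFinal(...): replies are processed and broadcast.
    const hasError = deliveredReplies.some((entry) => entry.payload.isError);
    outcome.clientEvents.push(hasError ? "error" : "final");
  } else if (!hasBeforeAgentRunGate) {
    // Stands in for emitUserTranscriptUpdate(): no chat event is broadcast.
    outcome.transcriptUpdates += 1;
  }
  return outcome;
}
```

Calling `currentThenHandler(true, false, [timeoutReply])` with a reply whose payload has `isError: true` yields an empty `clientEvents` list in this model, which is the silent drop reported above.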

Expected Behavior

Clients should receive a state: "error" chat event with the timeout error message, the same as other error scenarios.

Suggested Fix

In the .then() handler, when agentRunStarted = true, check deliveredReplies for payloads with isError: true. If found, call broadcastChatError() to notify connected clients:

} else {
  // Agent started — check for error payloads that weren't streamed
  const errorPayloads = deliveredReplies
    .filter((entry) => entry.payload.isError);
  if (errorPayloads.length > 0) {
    const errorMsg = errorPayloads
      .map((entry) => entry.payload.text)
      .filter(Boolean)
      .join(" | ");
    broadcastChatError({
      context,
      runId: clientRunId,
      sessionKey,
      errorMessage: errorMsg,
    });
  } else if (!hasBeforeAgentRunGate) {
    await emitUserTranscriptUpdate().catch(...);
  }
}
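The fragment above depends on surrounding state in chat.ts. For testing the idea in isolation, the same branching can be written as a self-contained sketch with injected callbacks. The `DeliveredReply` and `StartedRunDeps` shapes and the function names below are hypothetical, and the callbacks are synchronous stubs, whereas the real `emitUserTranscriptUpdate()` is awaited.

```typescript
// Hypothetical shapes for illustration; the real types live in the OpenClaw gateway.
interface DeliveredReply {
  payload: { text?: string; isError?: boolean };
}

interface StartedRunDeps {
  hasBeforeAgentRunGate: boolean;
  broadcastChatError: (errorMessage: string) => void;
  emitUserTranscriptUpdate: () => void;
}

// Joins the text of every error payload, or returns null when there is none.
function collectErrorMessage(deliveredReplies: DeliveredReply[]): string | null {
  const errorPayloads = deliveredReplies.filter((entry) => entry.payload.isError);
  if (errorPayloads.length === 0) return null;
  return errorPayloads
    .map((entry) => entry.payload.text)
    .filter((text): text is string => Boolean(text))
    .join(" | ");
}

// Branch taken when agentRunStarted is true, following the suggested fix.
function handleStartedRun(deliveredReplies: DeliveredReply[], deps: StartedRunDeps): void {
  const errorMessage = collectErrorMessage(deliveredReplies);
  if (errorMessage !== null) {
    // An error payload was delivered but never streamed: notify clients.
    deps.broadcastChatError(errorMessage);
  } else if (!deps.hasBeforeAgentRunGate) {
    deps.emitUserTranscriptUpdate();
  }
}
```

One edge case to keep in mind: if every error payload has an empty `text`, the joined message is an empty string, so a real implementation may want a fallback message before broadcasting.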

Environment

  • OpenClaw version: main branch (bde07ddb)
  • Model: glm-5-turbo (via anthropic-messages API)
  • Connection: ACP bridge (Ki-Agents gateway)
  • Idle timeout: 120s (default)
