openclaw - Feature: Per-Session Health Polling & Live Status Indicators [1 participant]

GitHub stats
openclaw/openclaw#57933 · Fetched 2026-04-08 01:55:59
Comments: 0 · Participants: 1 · Timeline: 0 · Reactions: 0

Problem

When a session is processing a request (especially waiting for a slow model or a long tool call), the Control UI shows no indication of whether the session is alive, thinking, stalled, or crashed. Users experience "dead air" — minutes of silence with no feedback — which is indistinguishable from a crash or stall.

This is especially painful with:

  • Slower models (long time-to-first-token)
  • Multi-step agentic loops with tool calls
  • Background/long-running operations
  • Sessions with a history of instability

Proposed Solution

Session Heartbeat Emitter

Each active session emits a lightweight heartbeat signal at a regular interval (e.g., every 5 seconds) containing:

interface SessionHeartbeat {
  sessionKey: string;
  state: 
    | "idle"           // no active turn
    | "awaiting_model" // sent request, waiting for first token
    | "streaming"      // receiving tokens from model
    | "tool_exec"      // executing a tool call
    | "tool_wait"      // waiting for tool result
    | "complete";      // turn finished, waiting for next input
  turnId?: string;     // current turn identifier
  model?: string;      // active model
  toolName?: string;   // currently executing tool
  lastTokenAt?: number;// timestamp of last received token
  startedAt: number;   // when current turn started
}
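
Because a heartbeat would reach the UI as JSON over HTTP or a WebSocket, a consumer may want to check its shape before trusting it. The following is a minimal sketch of such a check, not part of the proposal itself: `SESSION_STATES`, `isOptional`, and `isSessionHeartbeat` are hypothetical names introduced here for illustration.

```typescript
// Hypothetical helper: validates an untrusted JSON value against the
// SessionHeartbeat shape proposed above. All names are illustrative only.
const SESSION_STATES = [
  "idle", "awaiting_model", "streaming", "tool_exec", "tool_wait", "complete",
] as const;

type SessionState = (typeof SESSION_STATES)[number];

interface SessionHeartbeat {
  sessionKey: string;
  state: SessionState;
  turnId?: string;
  model?: string;
  toolName?: string;
  lastTokenAt?: number;
  startedAt: number;
}

// An optional field is valid when it is absent or has the expected type.
function isOptional(value: unknown, type: "string" | "number"): boolean {
  return value === undefined || typeof value === type;
}

function isSessionHeartbeat(value: unknown): value is SessionHeartbeat {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.sessionKey === "string" &&
    typeof v.state === "string" &&
    (SESSION_STATES as readonly string[]).indexOf(v.state) !== -1 &&
    typeof v.startedAt === "number" &&
    isOptional(v.turnId, "string") &&
    isOptional(v.model, "string") &&
    isOptional(v.toolName, "string") &&
    isOptional(v.lastTokenAt, "number")
  );
}
```

A receiver could call `isSessionHeartbeat(JSON.parse(raw))` and drop anything that fails, so that a malformed payload never reaches the status indicator.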

Key Behaviors

  1. Transition-based emission — heartbeat fires on every state transition (idle → awaiting_model → streaming → tool_exec, etc.) AND on a periodic timer (5s) to catch stalls
  2. No extra model calls — this is pure session lifecycle metadata, not LLM traffic
  3. Stall detection — if heartbeat shows awaiting_model or tool_exec for > configurable threshold (e.g., 30s), the session is flagged as "potentially stalled"
  4. Crash detection — if no heartbeat received for > 15s, the UI shows session as "unresponsive"
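
Behaviors 3 and 4 can be expressed as one pure function, which keeps the thresholds easy to test. This is a sketch under two assumptions that the proposal does not state: the receiver stamps each heartbeat with a local `receivedAt` time, and `startedAt` (the turn start) stands in for "time spent in the current state", since the proposed payload has no per-state timestamp. `classifyLiveness`, `ObservedHeartbeat`, and `Thresholds` are hypothetical names.

```typescript
type SessionState =
  | "idle" | "awaiting_model" | "streaming"
  | "tool_exec" | "tool_wait" | "complete";

type Liveness = "ok" | "potentially_stalled" | "unresponsive";

interface ObservedHeartbeat {
  state: SessionState;
  startedAt: number;  // from the heartbeat: when the current turn started
  receivedAt: number; // assumption: stamped by the receiver on arrival
}

interface Thresholds {
  stallMs: number; // e.g. 30000, as suggested in the proposal
  crashMs: number; // e.g. 15000, as suggested in the proposal
}

function classifyLiveness(
  hb: ObservedHeartbeat,
  now: number,
  t: Thresholds,
): Liveness {
  // No heartbeat has arrived for longer than crashMs: treat as unresponsive.
  if (now - hb.receivedAt > t.crashMs) return "unresponsive";
  // Heartbeats still arrive, but the turn has sat in a waiting state too long.
  const waiting = hb.state === "awaiting_model" || hb.state === "tool_exec";
  if (waiting && now - hb.startedAt > t.stallMs) return "potentially_stalled";
  return "ok";
}
```

The crash check runs first on purpose: once heartbeats stop arriving, the last reported state is stale and should not drive the stall decision.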

Control UI Integration

  • Per-session status indicator (colored dot or badge): 🟢 active, 🟡 waiting, 🔴 stalled/dead
  • Tooltip showing: state, current model, current tool, elapsed time
  • Optional "last activity" timestamp per session
  • Session list should sort/update in real-time based on heartbeat
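
The indicator and tooltip described above could be derived from a heartbeat roughly as follows. The proposal names three colors but does not say which states map to which, so the grouping here is an assumption: waiting-type states are yellow, any stalled or unresponsive session is red, and everything else (including `idle` and `complete`) is green. `indicatorFor` and `tooltipFor` are hypothetical names, and the tooltip layout is only one possible format.

```typescript
type SessionState =
  | "idle" | "awaiting_model" | "streaming"
  | "tool_exec" | "tool_wait" | "complete";

type Liveness = "ok" | "potentially_stalled" | "unresponsive";

interface TooltipInput {
  state: SessionState;
  model?: string;
  toolName?: string;
  startedAt: number;
}

function indicatorFor(state: SessionState, liveness: Liveness): "🟢" | "🟡" | "🔴" {
  // Stalled or dead sessions are red regardless of their last reported state.
  if (liveness !== "ok") return "🔴";
  switch (state) {
    case "awaiting_model":
    case "tool_exec":
    case "tool_wait":
      return "🟡"; // assumption: these count as "waiting"
    default:
      return "🟢"; // assumption: streaming, idle, and complete count as healthy
  }
}

function tooltipFor(hb: TooltipInput, now: number): string {
  const elapsedSec = Math.max(0, Math.floor((now - hb.startedAt) / 1000));
  const parts = [`state: ${hb.state}`];
  if (hb.model) parts.push(`model: ${hb.model}`);
  if (hb.toolName) parts.push(`tool: ${hb.toolName}`);
  parts.push(`elapsed: ${elapsedSec}s`);
  return parts.join(" | ");
}
```

Keeping both functions free of DOM access means the same logic can back a dot, a badge, or a text-only status column.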

API Surface

GET /api/sessions/:key/heartbeat   → current heartbeat state
GET /api/sessions/heartbeat        → all session heartbeats (batch)

Or expose via existing session list endpoint with an extended status field.
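
To make the two routes concrete, here is a framework-free sketch of how a handler might resolve them against an in-memory store. Everything beyond the two paths is an assumption: the `handleGet` name, the array response for the batch route, the 404 bodies, and the premise that session keys are URL-safe and contain no `/`.

```typescript
interface SessionHeartbeat {
  sessionKey: string;
  state: string;
  startedAt: number;
}

interface HttpResult {
  status: number;
  body: unknown;
}

// Hypothetical handler: maps a GET path to a response using the latest
// heartbeat per session. A real server would plug this into its router.
function handleGet(
  path: string,
  store: Record<string, SessionHeartbeat>,
): HttpResult {
  // Batch route: all session heartbeats.
  if (path === "/api/sessions/heartbeat") {
    const all = Object.keys(store).map((key) => store[key]);
    return { status: 200, body: all };
  }
  // Single-session route: /api/sessions/:key/heartbeat
  const match = /^\/api\/sessions\/([^/]+)\/heartbeat$/.exec(path);
  if (match) {
    const key = match[1];
    const known = Object.prototype.hasOwnProperty.call(store, key);
    return known
      ? { status: 200, body: store[key] }
      : { status: 404, body: { error: "unknown session" } };
  }
  return { status: 404, body: { error: "not found" } };
}
```

The two routes cannot collide, because the batch path has one fewer segment than the per-session path, so a session whose key is literally `heartbeat` would still resolve correctly.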

Open Questions

  • Should the heartbeat be WebSocket-based (push) or polling-based (pull)? WebSocket is better for real-time updates; polling is simpler to implement.
  • Should stall thresholds be configurable per-session or global?
  • Should there be an auto-recovery action (restart stalled session) exposed via the UI?
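
On the push-versus-pull question, one option is to keep the session runtime transport-agnostic so that both can be offered from the same source. The sketch below assumes a small in-process hub; `HeartbeatHub`, `publish`, `subscribe`, and `snapshot` are hypothetical names, not existing OpenClaw APIs.

```typescript
interface SessionHeartbeat {
  sessionKey: string;
  state: string;
  startedAt: number;
}

type Listener = (hb: SessionHeartbeat) => void;

class HeartbeatHub {
  private latest: Record<string, SessionHeartbeat> = {};
  private listeners: Listener[] = [];

  // Called by the session runtime on every transition or timer tick.
  publish(hb: SessionHeartbeat): void {
    this.latest[hb.sessionKey] = hb;
    // Iterate over a copy so listeners may subscribe or unsubscribe mid-loop.
    for (const listener of this.listeners.slice()) listener(hb);
  }

  // Push path: a WebSocket handler could forward each heartbeat as it arrives.
  // Returns a function that removes the listener again.
  subscribe(listener: Listener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  // Pull path: a polling endpoint could return this snapshot on each request.
  snapshot(): SessionHeartbeat[] {
    return Object.keys(this.latest).map((key) => this.latest[key]);
  }
}
```

With this split, polling could ship first and a WebSocket push could be added later without touching the code that emits heartbeats.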

Alternatives Considered

  • Model streaming alone — doesn't cover tool execution gaps or slow time-to-first-token
  • Existing session list polling — already exists but lacks granular state info
  • Log tailing — too heavy, requires parsing, not structured

Impact

This would significantly improve the operational experience of running OpenClaw, especially for users managing multiple sessions or using slower/cheaper models where latency is expected. It turns "is it dead?" from a guessing game into a visible status.

extent analysis

Fix Plan

To implement the session heartbeat emitter, follow these steps:

  • Step 1: Define the SessionHeartbeat interface
interface SessionHeartbeat {
  sessionKey: string;
  state: 
    | "idle"           // no active turn
    | "awaiting_model" // sent request, waiting for first token
    | "streaming"      // receiving tokens from model
    | "tool_exec"      // executing a tool call
    | "tool_wait"      // waiting for tool result
    | "complete";      // turn finished, waiting for next input
  turnId?: string;     // current turn identifier
  model?: string;      // active model
  toolName?: string;   // currently executing tool
  lastTokenAt?: number;// timestamp of last received token
  startedAt: number;   // when current turn started
}
  • Step 2: Implement the heartbeat emitter
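class SessionHeartbeatEmitter {
  private sessionKey: string;
  private state: SessionHeartbeat['state'] = 'idle';
  private startedAt = Date.now(); // reset whenever a new turn begins
  private timer: ReturnType<typeof setInterval>;

  // `send` is a placeholder for the real transport (API call or WebSocket)
  constructor(sessionKey: string, private send: (hb: SessionHeartbeat) => void) {
    this.sessionKey = sessionKey;
    this.timer = setInterval(this.emitHeartbeat, 5000); // periodic tick every 5 seconds
  }

  private emitHeartbeat = () => {
    this.send({
      sessionKey: this.sessionKey,
      state: this.state,
      startedAt: this.startedAt, // turn start time, not the emission time
    });
  };

  public updateState(state: SessionHeartbeat['state']) {
    const wasResting = this.state === 'idle' || this.state === 'complete';
    const isResting = state === 'idle' || state === 'complete';
    if (wasResting && !isResting) this.startedAt = Date.now(); // a new turn begins
    this.state = state;
    this.emitHeartbeat(); // Emit immediately on state change
  }

  public stop() {
    clearInterval(this.timer); // avoid leaking the timer when the session ends
  }
}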
  • Step 3: Integrate with the Control UI
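// Update the session list to display the heartbeat status
const sessionList = document.getElementById('session-list');
if (sessionList) {
  sessionList.addEventListener('update', (event) => {
    // Assumes a custom 'update' event whose detail carries the session and its heartbeat
    const heartbeat: SessionHeartbeat = (event as CustomEvent).detail.session.heartbeat;
    // Reuse one indicator per session rather than appending a new span on every update
    const id = `status-${heartbeat.sessionKey}`;
    let statusIndicator = document.getElementById(id);
    if (!statusIndicator) {
      statusIndicator = document.createElement('span');
      statusIndicator.id = id;
      sessionList.appendChild(statusIndicator);
    }
    statusIndicator.textContent = heartbeat.state;
    statusIndicator.className = `status-${heartbeat.state}`;
  });
}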
  • Step 4: Implement stall and crash detection
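const stallThreshold = 30000; // 30 seconds
const crashThreshold = 15000; // 15 seconds

// Assumption: the receiver stamps each heartbeat with its local arrival time,
// since the heartbeat itself carries no "sent at" field.
const sessionHeartbeats: { [sessionKey: string]: { heartbeat: SessionHeartbeat; receivedAt: number } } = {};

setInterval(() => {
  const now = Date.now();
  Object.keys(sessionHeartbeats).forEach((sessionKey) => {
    const { heartbeat, receivedAt } = sessionHeartbeats[sessionKey];
    const waiting = heartbeat.state === 'awaiting_model' || heartbeat.state === 'tool_exec';
    if (now - receivedAt > crashThreshold) {
      console.warn(`${sessionKey}: unresponsive`); // placeholder for a UI update
    } else if (waiting && now - heartbeat.startedAt > stallThreshold) {
      console.warn(`${sessionKey}: potentially stalled`); // placeholder for a UI update
    }
  });
}, 1000); // the check interval is an assumption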
