openclaw - 💡(How to fix) Fix Gateway freezes after nested subagent activity, stops all Telegram polling [1 participants]

Official PRs (…)
ON THIS PAGE

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#53745Fetched 2026-04-08 01:24:02
View on GitHub
Comments
0
Participants
1
Timeline
0
Reactions
0
Author
Participants

After a coordinator agent calls subagents via sessions_send, the gateway process stays alive (HTTP 200 on healthcheck, PID running) but completely stops processing Telegram updates. Log output stops entirely — no incoming messages, no health-monitor, no sendMessage, nothing. All Telegram bots go silent simultaneously.

Error Message

Last entries before freeze:

[agent:nested] session=agent:skyler:main ... ANNOUNCE_SKIP

After this — complete silence in both gateway.log and gateway.err.log.

Error log context (shortly before freeze):

[agent/embedded] ... isError=true error=The AI service is temporarily overloaded
[agent/embedded] embedded run failover decision: ... decision=rotate_profile reason=overloaded

Previous occurrence (March 22): On the older version (2026.3.2), the same freeze pattern occurred, additionally accompanied by:

[gateway] full process restart failed (spawnSync launchctl ETIMEDOUT); falling back to in-process restart

Root Cause

After a coordinator agent calls subagents via sessions_send, the gateway process stays alive (HTTP 200 on healthcheck, PID running) but completely stops processing Telegram updates. Log output stops entirely — no incoming messages, no health-monitor, no sendMessage, nothing. All Telegram bots go silent simultaneously.

Fix Action

Workaround

Kill and restart the gateway process:

launchctl kickstart -k gui/$(id -u)/ai.openclaw.gateway

Code Example

[agent:nested] session=agent:skyler:main ... ANNOUNCE_SKIP

---

[agent/embedded] ... isError=true error=The AI service is temporarily overloaded
[agent/embedded] embedded run failover decision: ... decision=rotate_profile reason=overloaded

---

[gateway] full process restart failed (spawnSync launchctl ETIMEDOUT); falling back to in-process restart

---

launchctl kickstart -k gui/$(id -u)/ai.openclaw.gateway
RAW_BUFFERClick to expand / collapse

Description

After a coordinator agent calls subagents via sessions_send, the gateway process stays alive (HTTP 200 on healthcheck, PID running) but completely stops processing Telegram updates. Log output stops entirely — no incoming messages, no health-monitor, no sendMessage, nothing. All Telegram bots go silent simultaneously.

Environment

  • OpenClaw version: 2026.3.13 (also reproduced on 2026.3.2)
  • OS: macOS 15 (Darwin 24.3.0), Apple Silicon
  • Node: v25.8.0
  • Setup: 10 Telegram bot accounts (1 default + 9 subagents), each with separate bot token

Steps to Reproduce

  1. Configure multiple agents (default + subagents with separate Telegram bots)
  2. Send message to coordinator agent (e.g. "Saul") via Telegram
  3. Coordinator calls subagent (e.g. "Skyler") via sessions_send
  4. Subagent responds successfully
  5. After subagent response, gateway log goes completely silent
  6. All Telegram bots stop responding (including default/main agent)

Expected Behavior

Gateway should continue processing Telegram updates after subagent interaction completes.

Actual Behavior

Gateway process remains alive (responds HTTP 200 on port), but:

  • Zero log output after the last agent:nested entry
  • No Telegram polling — no incoming messages processed for any bot
  • No health-monitor checks running
  • No sendMessage attempts
  • Process must be killed and restarted to recover

Logs

Last entries before freeze:

[agent:nested] session=agent:skyler:main ... ANNOUNCE_SKIP

After this — complete silence in both gateway.log and gateway.err.log.

Error log context (shortly before freeze):

[agent/embedded] ... isError=true error=The AI service is temporarily overloaded
[agent/embedded] embedded run failover decision: ... decision=rotate_profile reason=overloaded

Previous occurrence (March 22): On the older version (2026.3.2), the same freeze pattern occurred, additionally accompanied by:

[gateway] full process restart failed (spawnSync launchctl ETIMEDOUT); falling back to in-process restart

Frequency

Reproduced 3 times in 2 days (March 22-24, 2026). Each time required manual launchctl kickstart -k to recover.

Workaround

Kill and restart the gateway process:

launchctl kickstart -k gui/$(id -u)/ai.openclaw.gateway

extent analysis

Fix Plan

To address the issue of the gateway process freezing after a subagent interaction, we will implement the following steps:

  • Increase logging verbosity: Modify the logging configuration to include more detailed logs, which can help identify the root cause of the issue.
  • Implement retry mechanism: Add a retry mechanism for the sessions_send method to handle temporary overloads and failures.
  • Handle AI service overload: Modify the AI service to handle overload situations more robustly, such as by implementing a queue or load balancing.

Code Changes

Here are some example code snippets to illustrate the proposed changes:

Increase Logging Verbosity

// Increase logging verbosity
const logger = require('logger');
logger.setLevel('DEBUG');

// Log detailed information about sessions_send method
logger.debug('Sending session to subagent:', subagentName);

Implement Retry Mechanism

// Implement retry mechanism for sessions_send method
const maxRetries = 3;
const retryDelay = 500; // 500ms

async function sendSession(subagentName, data) {
  let retries = 0;
  while (retries < maxRetries) {
    try {
      // Call sessions_send method
      const response = await sessionsSend(subagentName, data);
      return response;
    } catch (error) {
      logger.error('Error sending session:', error);
      retries++;
      await new Promise(resolve => setTimeout(resolve, retryDelay));
    }
  }
  // Handle max retries exceeded
  logger.error('Max retries exceeded for sending session to subagent:', subagentName);
  throw new Error('Max retries exceeded');
}

Handle AI Service Overload

// Handle AI service overload
const aiServiceQueue = [];

async function aiServiceRequest(data) {
  // Check if AI service is overloaded
  if (aiServiceQueue.length > 10) {
    // Handle overload situation, e.g., by returning an error or delaying the request
    logger.error('AI service is overloaded. Delaying request.');
    await new Promise(resolve => setTimeout(resolve, 1000)); // Delay for 1 second
  }
  // Process AI service request
  aiServiceQueue.push(data);
  // ...
}

Verification

To verify that the fix worked, you can:

  • Test the sessions_send method with a subagent and verify that the gateway process continues to process Telegram updates after the interaction.
  • Monitor the logs for any errors or issues related to the AI service overload or retry mechanism.
  • Verify that the retry mechanism is working correctly by simulating a temporary overload or failure of the AI service.

Extra Tips

  • Regularly review and monitor the logs to identify any potential issues or errors.
  • Consider implementing additional logging and monitoring tools to provide more detailed insights into the system's behavior.
  • Test the system thoroughly after implementing

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

openclaw - 💡(How to fix) Fix Gateway freezes after nested subagent activity, stops all Telegram polling [1 participants]