hermes - 💡(How to fix) Fix QQ Bot WebSocket frequent disconnects: long-running tool execution blocks the asyncio event loop and causes heartbeat timeouts [1 pull request]


🐛 Problem Description

When Hermes Gateway runs on the QQ Bot platform, the WebSocket connection drops frequently (about once a minute), and the QQ Bot stops responding.

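📋 Environment

| Item | Value |
| --- | --- |
| Hermes version | 0.13.0 |
| Python version | 3.11 |
| Platform | Linux (Ubuntu) |
| QQ Bot connection | wss://api.sgroup.qq.com/websocket |
| Heartbeat interval | ~30 seconds (required by the server) |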

🔍 Symptoms

5月 28 15:21:09  [QQBot] WebSocket error: WebSocket closed
5月 28 15:22:09  [QQBot] WebSocket error: WebSocket closed
5月 28 15:23:09  [QQBot] WebSocket error: WebSocket closed
...

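Disconnect interval: about 60-70 seconds

🎯 Root Cause Analysis

A blocked event loop causes heartbeat timeouts

| Issue | Details |
| --- | --- |
| QQ Gateway heartbeat interval | 30 seconds |
| Interval at which the client should send | 24 seconds (80%) |
| Actual send interval | 60-70 seconds |
| Cause | The Agent runs long tasks that block the asyncio event loop |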

Log evidence

14:36:44  Tool terminal returned error (322.13s): [Weixin] rate limited...
14:38:08  Tool terminal returned error (60.27s): [Weixin] rate limited...

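When the Agent runs a long task (for example, a 322-second wait caused by WeChat rate limiting), the entire asyncio event loop is blocked. The heartbeat task cannot run on time, so the server closes the WebSocket.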
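The starvation described above can be reproduced in isolation. The following is a minimal sketch, not Hermes code: `heartbeat`, `blocking_tool`, and `measure_max_gap` are hypothetical names, and `time.sleep()` stands in for whatever blocking call the real tool makes. It records heartbeat timestamps and reports the largest gap between two beats.

```python
import asyncio
import time


async def heartbeat(interval: float, stamps: list[float]) -> None:
    """Record a timestamp every `interval` seconds, standing in for a heartbeat send."""
    while True:
        await asyncio.sleep(interval)
        stamps.append(time.monotonic())


async def blocking_tool(seconds: float) -> None:
    """A coroutine that blocks: time.sleep() never yields control to the loop."""
    time.sleep(seconds)


async def measure_max_gap(interval: float, block_for: float) -> float:
    """Return the largest gap between consecutive heartbeats around a blocking call."""
    stamps: list[float] = [time.monotonic()]
    hb = asyncio.create_task(heartbeat(interval, stamps))
    await asyncio.sleep(interval * 2.5)  # a few healthy beats first
    await blocking_tool(block_for)       # the loop is frozen during this call
    await asyncio.sleep(interval * 2.5)  # let the heartbeat resume
    hb.cancel()
    try:
        await hb
    except asyncio.CancelledError:
        pass
    return max(b - a for a, b in zip(stamps, stamps[1:]))


if __name__ == "__main__":
    gap = asyncio.run(measure_max_gap(interval=0.05, block_for=0.5))
    print(f"largest heartbeat gap: {gap:.2f}s (interval was 0.05s)")
```

The largest gap is at least as long as the blocking call, no matter how short the heartbeat interval is. The same effect at a larger scale would explain a 30-second heartbeat arriving 60-70 seconds apart.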

🔧 Current Fix

The _heartbeat_loop method in gateway/platforms/qqbot/adapter.py has been modified to replace asyncio.sleep() with a sleep on a separate thread:

async def _heartbeat_loop(self) -> None:
    import asyncio
    import threading

    try:
        while self._running:
            # Sleep on a worker thread instead of using asyncio.sleep().
            # The wait is awaited through asyncio.to_thread() so the event
            # loop stays free; a synchronous wait here would block the loop
            # for the whole heartbeat interval.
            sleep_event = threading.Event()
            await asyncio.to_thread(sleep_event.wait, self._heartbeat_interval)

            if not self._running:
                break
            if not self._ws or self._ws.closed:
                # No usable connection yet; try again after the next interval
                continue
            # Heartbeat payload: "d" carries the last received sequence number
            await self._ws.send_json({"op": 1, "d": self._last_seq})
    except asyncio.CancelledError:
        pass

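💡 Suggested Long-Term Solutions

Option 1: Follow the OpenClaw architecture and isolate processes

OpenClaw Gateway uses Node.js and runs tool execution in separate child processes, fully isolated from the WebSocket heartbeat. This is the more robust architectural choice.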
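In Python, a comparable isolation could be sketched with asyncio's subprocess support. This is an illustration only, not OpenClaw's or Hermes's implementation: `run_tool_in_subprocess` is a hypothetical helper, and the child command is a placeholder for a real tool.

```python
import asyncio
import sys


async def run_tool_in_subprocess(argv: list[str], timeout: float) -> tuple[int, str]:
    """Run a tool command in a child process and return (exit code, combined output)."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
    )
    try:
        # communicate() is awaited, so the event loop keeps serving other tasks
        out, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        proc.kill()        # do not leave a runaway tool behind
        await proc.wait()
        raise
    return proc.returncode, out.decode(errors="replace")


async def main() -> None:
    # Placeholder tool: a child Python process that sleeps, then prints
    code, output = await run_tool_in_subprocess(
        [sys.executable, "-c", "import time; time.sleep(0.2); print('done')"],
        timeout=10,
    )
    print(code, output.strip())


if __name__ == "__main__":
    asyncio.run(main())
```

Because the tool runs in its own process, even CPU-bound or misbehaving tools cannot hold the gateway's event loop, and the timeout gives the gateway a way to terminate them.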

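Option 2: Move long-running tool execution to a separate thread/process

Tool executions that may block for a long time (such as network requests or file operations) should:

  • use asyncio.to_thread() or run_in_executor()
  • or use a separate process pool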
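A minimal sketch of the asyncio.to_thread() approach, under the assumption that the tool is an ordinary blocking function. `slow_tool`, `count_beats`, and `run_with_offload` are hypothetical names, not part of Hermes.

```python
import asyncio
import time


def slow_tool(seconds: float) -> str:
    """Hypothetical blocking tool; time.sleep() stands in for the real work."""
    time.sleep(seconds)
    return "tool finished"


async def count_beats(interval: float, beats: list[float]) -> None:
    """Stand-in for the heartbeat loop: record a timestamp per beat."""
    while True:
        await asyncio.sleep(interval)
        beats.append(time.monotonic())


async def run_with_offload(interval: float, tool_seconds: float) -> tuple[str, int]:
    """Run the blocking tool on a worker thread while the heartbeat keeps ticking."""
    beats: list[float] = []
    hb = asyncio.create_task(count_beats(interval, beats))
    # to_thread() hands the call to the default executor and awaits its result
    result = await asyncio.to_thread(slow_tool, tool_seconds)
    hb.cancel()
    try:
        await hb
    except asyncio.CancelledError:
        pass
    return result, len(beats)


if __name__ == "__main__":
    result, beat_count = asyncio.run(run_with_offload(0.05, 0.5))
    print(result, f"with {beat_count} heartbeats sent in the meantime")
```

For CPU-bound tools, threads still compete for the GIL, so `loop.run_in_executor()` with a `concurrent.futures.ProcessPoolExecutor` may be the better fit.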

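Option 3: Run the heartbeat task on a dedicated thread

This is similar to the current fix, but it should be a framework-level improvement rather than a platform-specific patch.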
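One possible shape for such a framework-level helper is sketched below. `HeartbeatThread` and its `send` callback are hypothetical. The sketch assumes `send` can deliver the heartbeat without the main event loop, for example through a connection owned by the heartbeat thread. A callback that only schedules a coroutine onto the blocked loop (for instance with `asyncio.run_coroutine_threadsafe`) would still be delayed until the loop is free.

```python
import threading
import time
from typing import Callable


class HeartbeatThread(threading.Thread):
    """Call `send` every `interval` seconds until stop() is called."""

    def __init__(self, interval: float, send: Callable[[], None]) -> None:
        super().__init__(daemon=True, name="heartbeat")
        self._interval = interval
        self._send = send
        self._stop_event = threading.Event()

    def run(self) -> None:
        # Event.wait() returns False on timeout and True once stop() sets the event
        while not self._stop_event.wait(self._interval):
            try:
                self._send()
            except Exception as exc:  # keep the thread alive for the next beat
                print(f"heartbeat send failed: {exc!r}")

    def stop(self) -> None:
        self._stop_event.set()


if __name__ == "__main__":
    sent: list[float] = []
    hb = HeartbeatThread(0.05, lambda: sent.append(time.monotonic()))
    hb.start()
    time.sleep(0.5)  # the main thread is busy, as a blocked event loop would be
    hb.stop()
    hb.join()
    print(f"{len(sent)} heartbeats sent while the main thread was blocked")
```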

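📊 Architecture Comparison

| Feature | OpenClaw | Hermes (current) |
| --- | --- | --- |
| Tool execution isolation | ✅ Separate child process | ❌ Same event loop |
| Heartbeat independence | ✅ Independent wake-up system | ❌ Shares the loop with tools |
| Impact of long-running tasks | Does not affect the WebSocket | Blocks all async tasks |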

📎 Related Files

  • gateway/platforms/qqbot/adapter.py - QQ Bot platform adapter
  • gateway/run.py - gateway main loop
  • gateway/event_loop.py - event loop management

Suggested labels: bug, platform:qqbot, websocket, asyncio, heartbeat
