hermes - 💡(How to fix) Fix [Bug]: Local model hang + WeChat rate limit silent failure + indistinguishable status messages [2 pull requests]

Fix Action

Fixed

Bug Description

Three related bugs were observed when the local Ollama model is unavailable and the WeChat iLink API is rate-limited:

  1. When local Ollama is not running, the agent hangs for 548 seconds instead of fast-failing with an error.
  2. When the WeChat iLink API is rate-limited, the agent retries with a fixed 3s interval, exhausts retries, and silently drops the response — the user never receives the reply.
  3. Status/error messages (Retrying..., Still working..., API failed...) are sent as normal chat messages and are visually indistinguishable from actual agent replies. The user cannot tell whether the agent answered their question.

Steps to Reproduce

  1. Configure Hermes with local Ollama as its default model
  2. Stop Ollama (do not start it)
  3. Send a message via WeChat
  4. Observe: agent waits ~548 seconds instead of immediately returning an error
  5. If a response is eventually generated, trigger WeChat rate limiting (send messages rapidly beforehand)
  6. Observe: response is silently dropped after 4 retries, user receives no notification

Expected Behavior

  1. If local model endpoint is unreachable, fail within ~10 seconds and either fall back to a cloud model or reply: "Local model unavailable."
  2. After max retries are exhausted on send failure, notify the user: "Your message was processed but could not be delivered due to rate limiting."
  3. System/status messages should be clearly distinguished from agent replies (e.g. prefixed with [System]).
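Expectations 2 and 3 above can be sketched as a small message-tagging layer. This is only an illustration: `MessageKind`, `OutboundMessage`, `render`, and `delivery_failure_notice` are hypothetical names, not part of Hermes, and the `[System]` prefix is the example suggested in this report.

```python
from dataclasses import dataclass
from enum import Enum


class MessageKind(Enum):
    REPLY = "reply"    # an actual agent answer
    STATUS = "status"  # a progress or error notice from the gateway


@dataclass(frozen=True)
class OutboundMessage:
    kind: MessageKind
    text: str


def render(message: OutboundMessage) -> str:
    """Prefix status notices so they cannot be mistaken for agent replies."""
    if message.kind is MessageKind.STATUS:
        return f"[System] {message.text}"
    return message.text


def delivery_failure_notice(reason: str) -> OutboundMessage:
    """Build the notice to send once all send retries are exhausted."""
    return OutboundMessage(
        MessageKind.STATUS,
        f"Your message was processed but could not be delivered due to {reason}.",
    )
```

Tagging the message kind at creation time, rather than matching on text later, keeps the distinction available to every platform adapter, which could then choose its own presentation.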

Actual Behavior

  1. Agent hung for 548.9 seconds (api_calls=1, response=51 chars per gateway.log)
  2. After send failed at 12:36:45 (ret=-2, rate limited), response was silently dropped. On gateway restart 28 minutes later, delivery was attempted again immediately — also failed.
  3. User received 6+ status messages (Retrying, Still working, Max retries exhausted, API failed) with no way to distinguish them from real replies. The actual 51-char response was never delivered.

Affected Component

CLI (interactive chat)

Messaging Platform (if gateway-related)

N/A (CLI only), WhatsApp

Debug Report

Debug report uploaded:
  Report       https://paste.rs/TZ2CD
  agent.log    https://paste.rs/Vg9mR
  gateway.log  https://paste.rs/Au24v

Operating System

Ubuntu 22.04.5 LTS (WSL2 on Windows 11)

Python Version

3.11.15

Hermes Version

V0.12.0

Additional Logs / Traceback (optional)

2026-05-07 12:25:58 INFO gateway.run: inbound message: msg='你有没有可视化操作页面'
2026-05-07 12:35:07 INFO gateway.run: response ready: time=548.9s api_calls=1


2026-05-07 12:35:07 INFO  gateway.run: response ready: time=548.9s api_calls=1 response=51 chars
2026-05-07 12:35:07 INFO  gateway.platforms.base: [Weixin] Sending response (51 chars)
2026-05-07 12:36:32 WARNING [Weixin] rate limited for o9cq80-e; backing off 3.0s before retry
2026-05-07 12:36:35 WARNING [Weixin] rate limited; backing off 3.0s before retry
2026-05-07 12:36:38 WARNING [Weixin] rate limited; backing off 3.0s before retry
2026-05-07 12:36:41 WARNING [Weixin] rate limited; backing off 3.0s before retry
2026-05-07 12:36:45 ERROR  [Weixin] send failed: iLink sendmessage rate limited: ret=-2
2026-05-07 13:04:22 WARNING [Weixin] rate limited; backing off 3.0s before retry  ← gateway restart triggered re-delivery attempt

Root Cause Analysis (optional)

gateway.log shows the agent received the message at 12:25:58 and produced a response at 12:35:07 (548.9s). Only 1 api_call was made, suggesting the local Ollama endpoint accepted the connection but stalled rather than returning connection refused immediately.
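If the endpoint accepts the connection and then stalls, a connect timeout alone would not help; an overall deadline on the whole call would. The following is a minimal sketch under that assumption, using only `asyncio` from the standard library. `call_with_deadline`, `ModelUnavailable`, and the 10-second default are hypothetical and chosen to match the expected behavior in this report, not taken from Hermes.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class ModelUnavailable(Exception):
    """Raised when the local model does not answer within the deadline."""


async def call_with_deadline(
    call: Callable[[], Awaitable[str]],
    deadline_s: float = 10.0,
    fallback: Optional[Callable[[], Awaitable[str]]] = None,
) -> str:
    """Run `call` under an overall deadline; use `fallback` or fail fast."""
    try:
        # wait_for cancels the pending call once the deadline passes,
        # so a stalled connection cannot hold the request open.
        return await asyncio.wait_for(call(), timeout=deadline_s)
    except (asyncio.TimeoutError, OSError) as exc:
        # OSError also covers an immediate "connection refused".
        if fallback is not None:
            return await fallback()
        raise ModelUnavailable("Local model unavailable.") from exc
```

The caller would pass its own coroutine for the local model request and, optionally, one for a cloud model; both are left abstract here because the report does not show how Hermes issues the request.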

The rate-limit retry loop uses a fixed 3s backoff with no exponential growth (confirmed in gateway.log: 4 retries at 12:36:32, 12:36:35, 12:36:38, 12:36:41). After the final failure at 12:36:45, the undelivered message was not queued for a delayed retry. The next delivery attempt came only when the gateway restarted (logged at 13:04:22), ran immediately, and was also rate-limited.
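For comparison with the fixed 3s interval, here is a hedged sketch of a capped exponential backoff with jitter. It is not the Hermes retry loop: `RateLimited`, `backoff_delays`, and `send_with_backoff` are hypothetical names, and the factor of 2 and the 60s cap are illustrative values. Only the 3s base and the 4 retries come from the log.

```python
import random
import time
from typing import Callable, List


class RateLimited(Exception):
    """Hypothetical error raised by the sender on a rate-limit response."""


def backoff_delays(base_s: float = 3.0, factor: float = 2.0,
                   cap_s: float = 60.0, retries: int = 4) -> List[float]:
    """Un-jittered delay before each retry: base, base*factor, ..., capped."""
    return [min(cap_s, base_s * factor ** i) for i in range(retries)]


def send_with_backoff(send: Callable[[str], None], text: str, *,
                      retries: int = 4, base_s: float = 3.0,
                      sleep: Callable[[float], None] = time.sleep,
                      jitter: Callable[[], float] = random.random) -> bool:
    """Try to send; return False when retries are exhausted."""
    delays = backoff_delays(base_s=base_s, retries=retries)
    for attempt in range(retries + 1):
        try:
            send(text)
            return True
        except RateLimited:
            if attempt == retries:
                break
            # Wait between 50% and 100% of the exponential delay so that
            # concurrent senders do not all retry at the same instant.
            sleep(delays[attempt] * (0.5 + 0.5 * jitter()))
    # The caller should queue the message for later and notify the user.
    return False
```

With these defaults the waits grow to roughly 3s, 6s, 12s, and 24s instead of four 3s waits. Returning `False` rather than raising leaves the caller to decide whether to persist the reply and tell the user that delivery failed.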

Proposed Fix (optional)

No response

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR
