hermes - Fix [Weixin] ret=-3 cron push silently fails after tokenless retry (regression from v0.14) [2 pull requests]

Error Message

[Weixin] send chunk failed to=... attempt=1/5, retrying in 1.00s: iLink sendmessage error: ret=-3 errcode=None errmsg=unknown error
[Weixin] send chunk failed to=... attempt=2/5, retrying in 2.00s: iLink sendmessage error: ret=-3 errcode=None errmsg=unknown error
...
[Weixin] send failed to=...: iLink sendmessage error: ret=-3 errcode=None errmsg=unknown error

Root Cause

In v0.14, the send_without_context_token_on_session_expiry fix (commit e105b7ac9) worked because it treated errcode=-14 as the session-expiry signal and the send path tried a tokenless send as its primary fallback.

In v0.15, the new _is_stale_session_ret helper (PR #17432) extended detection to include ret=-2 and ret=-3 with errmsg="unknown error". However, there is a logic gap in gateway/platforms/weixin.py around line 1579:

if is_session_expired and not retried_without_token and context_token:
    retried_without_token = True
    context_token = None
    self._token_store._cache.pop(...)
    logger.warning("[%s] session expired for %s; retrying without context_token", ...)
    continue

The tokenless retry fires only if context_token is already set. During cron-initiated pushes to a long-inactive chat, iLink may never have allocated a valid context_token in the first place. In that case context_token is falsy, so the retry branch never executes and the code falls through to the generic error path, silently dropping the message.

The "and context_token" guard is too strict. Removing it lets the adapter attempt a tokenless send regardless of whether a cached token existed.
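
For readers unfamiliar with the detection step, the classification described above can be sketched roughly as follows. This is an illustrative reconstruction and not the real helper from PR #17432: the name is_stale_session, its signature, and the exact string comparison are assumptions, and the conditions are taken only from the description in this issue (errcode=-14, or ret=-2 / ret=-3 with errmsg="unknown error").

```python
from typing import Optional

# Assumption: the ret values this issue names as stale-session signals.
STALE_RETS = {-2, -3}


def is_stale_session(
    ret: Optional[int],
    errcode: Optional[int],
    errmsg: Optional[str],
) -> bool:
    """Hypothetical stand-in for _is_stale_session_ret (not the real code)."""
    # v0.14 behaviour per the issue: errcode=-14 signals an expired session.
    if errcode == -14:
        return True
    # v0.15 extension per the issue: ret=-2 / ret=-3 with "unknown error".
    message = (errmsg or "").strip().lower()
    return ret in STALE_RETS and message == "unknown error"
```

Note that a check like this only classifies the error. Whether a retry actually happens is decided by the separate guard in the send loop, which is where the gap described above sits.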

Fix Action

Fixed

Bug Description

Weixin cron-initiated pushes have silently failed with ret=-3 errcode=None errmsg=unknown error since upgrading from v0.14.

The same setup worked in v0.14. After upgrading to v0.15, all cron jobs configured with deliver: weixin fail to deliver, even though content generation completes successfully and interactive replies work fine.

Steps to Reproduce

  1. Configure a cron job delivering to Weixin (e.g., 0 7 * * *)
  2. Ensure no user message has been sent to the bot for several hours
  3. Wait for the cron job to fire at the scheduled time
  4. Observe: content is generated but delivery fails with ret=-3 errcode=None errmsg=unknown error

Actual Behavior

From gateway.log:

[Weixin] send chunk failed to=... attempt=1/5, retrying in 1.00s: iLink sendmessage error: ret=-3 errcode=None errmsg=unknown error
[Weixin] send chunk failed to=... attempt=2/5, retrying in 2.00s: iLink sendmessage error: ret=-3 errcode=None errmsg=unknown error
...
[Weixin] send failed to=...: iLink sendmessage error: ret=-3 errcode=None errmsg=unknown error

Note: attempt 1 correctly identifies the stale session and strips the token. Attempt 2 fails immediately without a "session expired" warning in the log, which suggests the retry path is bypassed.
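
To check a gateway.log excerpt for this pattern, a small script along these lines may help. It is only a sketch: the regular expressions are written against the three sample lines quoted above and the warning text from the logger.warning call in this issue, and the function name summarize and the returned keys are made up for illustration. Real log lines may carry extra prefixes or differ between versions.

```python
import re

# Patterns derived from the sample log lines in this issue (assumed stable).
ATTEMPT_RE = re.compile(
    r"\[Weixin\] send chunk failed to=\S+ attempt=(\d+)/(\d+), .*ret=(-?\d+)"
)
FINAL_RE = re.compile(r"\[Weixin\] send failed to=\S+: .*ret=(-?\d+)")
# Fragment of the warning emitted by the tokenless-retry branch.
WARNING_FRAGMENT = "retrying without context_token"

SAMPLE = [
    "[Weixin] send chunk failed to=... attempt=1/5, retrying in 1.00s: "
    "iLink sendmessage error: ret=-3 errcode=None errmsg=unknown error",
    "[Weixin] send chunk failed to=... attempt=2/5, retrying in 2.00s: "
    "iLink sendmessage error: ret=-3 errcode=None errmsg=unknown error",
    "[Weixin] send failed to=...: iLink sendmessage error: "
    "ret=-3 errcode=None errmsg=unknown error",
]


def summarize(lines):
    """Count failed attempts and report whether the tokenless retry was logged."""
    failed_attempts = 0
    final_ret = None
    tokenless_retry_logged = False
    for line in lines:
        if WARNING_FRAGMENT in line:
            tokenless_retry_logged = True
        if ATTEMPT_RE.search(line):
            failed_attempts += 1
            continue
        match = FINAL_RE.search(line)
        if match:
            final_ret = int(match.group(1))
    return {
        "failed_attempts": failed_attempts,
        "final_ret": final_ret,
        "tokenless_retry_logged": tokenless_retry_logged,
    }


result = summarize(SAMPLE)
```

A final ret of -3 combined with tokenless_retry_logged being False matches the symptom reported here: the send gave up without the tokenless branch ever announcing itself.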

Expected Behavior

The message should be delivered via tokenless fallback. iLink accepts tokenless sends as a degraded fallback even when the session has expired. At minimum, the root cause should be identifiable from logs.

Related Issues

  • #17228 — ret=-2 stale session (fixed in PR #17432)
  • #18100 — ret=-2 with empty errmsg (follow-up fix)
  • #21011 — iLink rate limiting with no retry
  • #26828 — Gateway OOM from retry storm (downstream)

Environment

  • OS: macOS 26.5
  • Hermes: v0.15.2 (2026.5.29) / commit 689ef5e
  • Python: 3.11.14
  • Platform: Weixin iLink API

Suggested Fix

Remove the "and context_token" guard so the tokenless retry fires regardless of whether a cached token exists:

# Before (too strict):
if is_session_expired and not retried_without_token and context_token:

# After (correct):
if is_session_expired and not retried_without_token:
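
The effect of the guard change can be seen in a toy model of the retry loop. This is a sketch under stated assumptions, not the code in gateway/platforms/weixin.py: StaleSession, FlakyTransport, send_with_retry, and the strict_guard flag are all hypothetical names, and the fake transport that rejects the first call and accepts the second exists only to make the branch difference observable. It makes no claim about how iLink actually responds.

```python
from typing import Optional


class StaleSession(Exception):
    """Hypothetical error standing in for a stale-session response (ret=-3)."""


class FlakyTransport:
    """Toy transport: rejects the first call, accepts later ones (assumption)."""

    def __init__(self) -> None:
        self.tokens_seen = []

    def send(self, context_token: Optional[str]) -> str:
        self.tokens_seen.append(context_token)
        if len(self.tokens_seen) == 1:
            raise StaleSession("ret=-3 errmsg=unknown error")
        return "ok"


def send_with_retry(transport, context_token: Optional[str], strict_guard: bool) -> str:
    """Retry once without a token; strict_guard=True mimics the v0.15 condition."""
    retried_without_token = False
    while True:
        try:
            return transport.send(context_token)
        except StaleSession:
            may_retry = not retried_without_token
            if strict_guard:
                # v0.15 behaviour: also require that a token was present.
                may_retry = may_retry and bool(context_token)
            if may_retry:
                retried_without_token = True
                context_token = None
                continue
            # Generic error path, simplified here to re-raising.
            raise
```

With strict_guard=True and context_token=None the first failure propagates immediately, which mirrors the cron case described in this issue. With strict_guard=False the same call retries once and, in this toy model, succeeds. The retried_without_token flag still limits the tokenless retry to a single attempt, so dropping the extra condition should not by itself create an unbounded loop.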
