hermes - ✅(Solved) Fix [Bug]: Weixin inbound voice messages skip audio download when text hint exists [1 pull requests, 1 participants]

hermes · 2026-04-17 16:35:25


GitHub stats

NousResearch/hermes-agent#11686 • Fetched 2026-04-18 05:59:22

Author

Sapientropic

Participants

Sapientropic

Timeline (top)

cross-referenced ×1

Fix Action

Fix / Workaround

I have a patch ready for this and can open a PR.

PR fix notes

PR #11689: fix(weixin): keep inbound voice audio when text hint exists

Repository: NousResearch/hermes-agent
Author: Sapientropic
State: open | merged: False
Link: https://github.com/NousResearch/hermes-agent/pull/11689

Description (problem / solution / changelog)

What does this PR do?

Fixes a Weixin inbound voice handling bug: if voice_item.text is already present, Hermes currently skips downloading the actual voice media.

That is too aggressive. Some Weixin / iLink variants provide both:

- a text hint
- a downloadable voice payload

Before this patch, Hermes kept the text hint but dropped the audio entirely. After this patch, Hermes still preserves the text hint, but it also keeps the real voice payload when the media reference is available.

Related Issue

Fixes #11686

Type of Change

🐛 Bug fix (non-breaking change that fixes an issue)
✨ New feature (non-breaking change that adds functionality)
🔒 Security fix
📝 Documentation update
✅ Tests (adding or improving test coverage)
♻️ Refactor (no behavior change)
🎯 New skill (bundled or hub)

Changes Made

gateway/platforms/weixin.py
- stop treating voice_item.text as a reason to skip voice download
- only skip when there is no usable media reference
- extract inbound voice hint metadata (voice_text_hint, voice_duration_ms)
- normalize short playtime values that appear to be in seconds instead of milliseconds
gateway/platforms/base.py
- add a small MessageEvent.metadata dict so adapters can carry normalized inbound metadata without inventing ad-hoc attributes
tests/gateway/test_weixin.py
- add regressions for:
  - keeping audio even when a text hint exists
  - playtime normalization
  - safe no-media fallback
  - attaching normalized voice metadata to the inbound event
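
The metadata and playtime items above can be sketched as follows. This is an illustration only, not the code from the PR: the `playtime` key, the `SECONDS_CUTOFF` threshold, the helper names, and the stand-in `MessageEvent` dataclass are all assumptions. Only the `voice_text_hint` and `voice_duration_ms` names and the `metadata` dict come from the change notes.

```python
import math
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

# Hypothetical cutoff: raw values below this are assumed to be seconds.
SECONDS_CUTOFF = 1000


def normalize_playtime_ms(raw: Any) -> Optional[int]:
    """Return a duration in milliseconds, or None if raw is unusable."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return None
    if not math.isfinite(value) or value <= 0:
        return None
    # Heuristic (an assumption): small values are seconds, not milliseconds.
    if value < SECONDS_CUTOFF:
        value *= 1000
    return int(value)


def extract_voice_hint(voice_item: Dict[str, Any]) -> Dict[str, Any]:
    """Collect normalized hint metadata from an inbound voice item."""
    metadata: Dict[str, Any] = {}
    text = str(voice_item.get("text") or "").strip()
    if text:
        metadata["voice_text_hint"] = text
    # "playtime" is a guessed key name for the raw duration field.
    duration = normalize_playtime_ms(voice_item.get("playtime"))
    if duration is not None:
        metadata["voice_duration_ms"] = duration
    return metadata


@dataclass
class MessageEvent:
    """Stand-in for the real event class; only the fields used here."""
    text: str = ""
    media_urls: List[str] = field(default_factory=list)
    metadata: Dict[str, Any] = field(default_factory=dict)


event = MessageEvent(text="hello")
event.metadata.update(extract_voice_hint({"text": "hello", "playtime": 7}))
print(event.metadata)  # {'voice_text_hint': 'hello', 'voice_duration_ms': 7000}
```

Keeping the normalized values in a single `metadata` dict means later stages read one well-known place instead of probing adapter-specific attributes.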

How to Test

Run: python -m pytest tests/gateway/test_weixin.py tests/gateway/test_platform_base.py tests/gateway/test_stt_config.py -q
Send a Weixin voice note where the inbound payload contains both voice_item.text and downloadable voice media.
Confirm Hermes keeps both:
- event.text still contains the hint
- event.media_urls includes the cached voice file

Checklist

Code

I've read the Contributing Guide
My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
I searched for existing PRs to make sure this isn't a duplicate
My PR contains only changes related to this fix/feature (no unrelated commits)
I've run pytest tests/ -q and all tests pass
I've added tests for my changes (required for bug fixes, strongly encouraged for features)
I've tested on my platform: macOS

Documentation & Housekeeping

I've updated relevant documentation (README, docs/, docstrings) — or N/A
I've updated cli-config.yaml.example if I added/changed config keys — or N/A
I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
I've updated tool descriptions/schemas if I changed tool behavior — or N/A

Screenshots / Logs

Targeted validation run:

python -m pytest tests/gateway/test_weixin.py tests/gateway/test_platform_base.py tests/gateway/test_stt_config.py -q
127 passed

Changed files

gateway/platforms/base.py (modified, +4/-0)
gateway/platforms/weixin.py (modified, +30/-1)
tests/gateway/test_weixin.py (modified, +90/-0)

Code Example

async def _download_voice(self, item: Dict[str, Any]) -> Optional[str]:
    voice_item = item.get("voice_item") or {}
    media = voice_item.get("media") or {}
    if voice_item.get("text"):
        return None


Bug Description

When a Weixin inbound voice message already includes voice_item.text, Hermes skips downloading the actual voice media.

That means the gateway keeps the text hint, but drops the original audio payload entirely. Anything downstream that wants the real voice note — STT fallback, format-specific handling, future voice analysis, or even just preserving the .silk file — never sees it.

Steps to Reproduce

Run Hermes with the Weixin adapter.
Send a Weixin voice note from a client / bridge variant that populates both:
- voice_item.text
- downloadable voice_item.media
Inspect the normalized inbound MessageEvent.
Observe that event.text is present, but event.media_urls is empty for the voice item.
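
The steps above can be approximated without a live Weixin bridge. The sketch below copies the early return quoted in this report into a standalone function and replaces the real download step, which the report does not show, with a stub. The payload shape beyond `voice_item.text` and `voice_item.media`, including the `ref` key, is hypothetical.

```python
import asyncio
from typing import Any, Dict, Optional


async def download_voice_current(item: Dict[str, Any]) -> Optional[str]:
    """Mirror of the early return quoted in the report."""
    voice_item = item.get("voice_item") or {}
    media = voice_item.get("media") or {}
    if voice_item.get("text"):
        return None  # audio is dropped even though media exists
    # Stub: stands in for the real download, which is not shown here.
    return "cached-voice-file" if media else None


payload = {
    "voice_item": {
        "text": "see you at five",
        "media": {"ref": "hypothetical-media-reference"},
    }
}

print(asyncio.run(download_voice_current(payload)))  # None
```

Removing the `text` key from the payload makes the same call return the stubbed cache path, which isolates the text hint as the trigger.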

Expected Behavior

If the inbound voice item has downloadable media, Hermes should cache the voice payload even when a text hint already exists.

The text hint should still be preserved, but it should act as a hint, not as a reason to discard the audio.

Actual Behavior

WeixinAdapter._download_voice() returns early when voice_item.text is present, so the voice media is never downloaded.

Affected Component

Gateway (Telegram/Discord/Slack/WhatsApp)
Agent Core (conversation loop, context compression, memory)
Other

Messaging Platform (if gateway-related)

Weixin

Additional Logs / Traceback (optional)

Current logic in gateway/platforms/weixin.py:

async def _download_voice(self, item: Dict[str, Any]) -> Optional[str]:
    voice_item = item.get("voice_item") or {}
    media = voice_item.get("media") or {}
    if voice_item.get("text"):
        return None

Root Cause Analysis (optional)

The adapter currently treats voice_item.text as if it were a full replacement for the voice media.

That assumption is too strong. In practice, some Weixin/iLink variants provide both:

- a text hint / transcript
- a downloadable voice payload

Those two signals should coexist.

Proposed Fix (optional)

Only skip the download when the voice item has no usable media reference.

If media exists, keep downloading and caching the voice payload, and preserve the text hint separately so later stages can use both.
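
A minimal sketch of that behavior, assuming a hypothetical notion of "usable media reference" (any non-empty string value in the media dict) and a caller-supplied downloader. The names `has_usable_media`, `handle_voice_item`, and `fake_download` are illustrative and do not come from the patch.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Optional, Tuple

Downloader = Callable[[Dict[str, Any]], Awaitable[Optional[str]]]


def has_usable_media(media: Any) -> bool:
    """Hypothetical check: any non-empty string value counts as a reference."""
    if not isinstance(media, dict):
        return False
    return any(isinstance(v, str) and v.strip() for v in media.values())


async def handle_voice_item(
    item: Dict[str, Any], download: Downloader
) -> Tuple[Optional[str], Optional[str]]:
    """Return (text_hint, cached_path); either may be None."""
    voice_item = item.get("voice_item") or {}
    text_hint = str(voice_item.get("text") or "").strip() or None
    media = voice_item.get("media")
    # Skip the download only when there is nothing to download.
    if not has_usable_media(media):
        return text_hint, None
    return text_hint, await download(media)


async def fake_download(media: Dict[str, Any]) -> Optional[str]:
    # Stand-in for the adapter's real download-and-cache step.
    return "/tmp/voice.silk"


item = {"voice_item": {"text": "see you at five", "media": {"ref": "abc"}}}
print(asyncio.run(handle_voice_item(item, fake_download)))
# ('see you at five', '/tmp/voice.silk')
```

Returning the hint and the cached path side by side keeps the two signals independent, so a later stage can prefer its own transcription and fall back to the hint.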

I have a patch ready for this and can open a PR.

Are you willing to submit a PR for this?

I'd like to fix this myself and submit a PR

extent analysis

TL;DR

The issue can be fixed by modifying the _download_voice method in weixin.py to download the voice media even when voice_item.text is present.

Guidance

Review the current logic in gateway/platforms/weixin.py and update the _download_voice method to check for the presence of voice_item.media before deciding whether to download the voice media.
Verify that the updated method correctly downloads and caches the voice payload when both voice_item.text and voice_item.media are present.
Test the changes with different Weixin client/bridge variants to ensure compatibility.
Consider adding logging or debugging statements to monitor the behavior of the updated method.

Example

async def _download_voice(self, item: Dict[str, Any]) -> Optional[str]:
    voice_item = item.get("voice_item") or {}
    media = voice_item.get("media") or {}
    if not media:
        return None
    # Proceed with downloading the voice media

Notes

The proposed fix assumes that the presence of voice_item.media indicates that the voice payload is downloadable. However, additional checks may be necessary to handle cases where voice_item.media is present but the download fails.
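
One way to cover the failed-download case is to wrap the fetch so that errors and timeouts degrade to "text hint only" instead of failing the whole inbound message. The sketch below is an assumption about how that could look: `download_or_none`, the `fetch` callable, the logger name, and the 10 second default are all hypothetical.

```python
import asyncio
import logging
from typing import Any, Awaitable, Callable, Dict, Optional

logger = logging.getLogger("weixin.voice")  # hypothetical logger name

Fetch = Callable[[Dict[str, Any]], Awaitable[Optional[str]]]


async def download_or_none(
    media: Dict[str, Any], fetch: Fetch, timeout_s: float = 10.0
) -> Optional[str]:
    """Return the cached path, or None if the download fails or times out."""
    try:
        return await asyncio.wait_for(fetch(media), timeout=timeout_s)
    except asyncio.TimeoutError:
        logger.warning("voice download timed out after %.1fs", timeout_s)
    except Exception:
        # Log the traceback, then let the caller keep the text hint.
        logger.exception("voice download failed; keeping text hint only")
    return None


async def failing_fetch(media: Dict[str, Any]) -> Optional[str]:
    raise ConnectionError("simulated network failure")


print(asyncio.run(download_or_none({"ref": "abc"}, failing_fetch)))  # None
```

The warning and exception log lines also address the guidance about adding logging, since they record which inbound voice items lost their audio and why.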

Recommendation

Apply the fix by updating the _download_voice method so that it downloads the voice media even when voice_item.text is present. The gateway then preserves both the text hint and the original audio payload.
