hermes - [Bug]: Telegram TTS voice reply caption sent as plain text, breaking markdown rendering

Bug Description

Summary

When the bot responds to a Telegram voice message and voice.auto_tts is enabled, the response text is sent as the caption of a TTS audio message. The caption is set to the raw text_content (standard markdown: **bold**, *italic*, etc.) and passed to send_voice(). The Telegram send_voice call does not set parse_mode, so the caption renders as plain text with the raw syntax characters visible (**bold** appears literally, asterisks included).

<img width="628" height="701" alt="Image" src="https://github.com/user-attachments/assets/afbcfb7c-be2c-45b2-ac1a-b165cc93f855" />

Reproduction steps

  1. voice.auto_tts: true in config.yaml (or /voice on in chat)
  2. Send a voice message to the bot
  3. Bot response is ≤ 1024 chars
  4. Observe: an audio bubble appears whose caption shows the raw asterisks (**bold**) instead of bold text

Root cause

In base.py, _process_message():

    # ~line 3515
    telegram_tts_caption = None
    if (
        self.platform == Platform.TELEGRAM
        and text_content
        and text_content[:1024] == text_content
    ):
        telegram_tts_caption = text_content  # ← raw markdown, never formatted

The caption is then passed to send_voice(), which calls the Telegram Bot API send_voice method without parse_mode. Telegram receives **bold** and renders it literally.
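The selection logic above can be reproduced in isolation to see why the markup survives. This is a minimal sketch, not the project's code: pick_tts_caption and TELEGRAM_CAPTION_LIMIT are hypothetical names, and the platform check is reduced to a boolean. Note that text_content[:1024] == text_content is equivalent to a length check against 1024.

```python
from typing import Optional

# Hypothetical constant: Telegram captions are limited to 1024 characters.
TELEGRAM_CAPTION_LIMIT = 1024


def pick_tts_caption(text_content: str, is_telegram: bool) -> Optional[str]:
    """Mirror the caption selection described in the issue (illustrative only).

    The text is returned unchanged, so any markdown markers it contains
    reach the send_voice() call as literal characters.
    """
    if is_telegram and text_content and len(text_content) <= TELEGRAM_CAPTION_LIMIT:
        return text_content  # raw markdown, never formatted
    return None


if __name__ == "__main__":
    caption = pick_tts_caption("**bold** reply", is_telegram=True)
    print(repr(caption))  # the asterisks are still present
```

Because nothing between this step and the Bot API call transforms the string or sets parse_mode, the asterisks are what the user sees.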

Workaround

None available via config. Disabling voice.auto_tts prevents the bug but loses voice reply functionality.

Environment

  • Platform: Telegram (supergroup with topics)
  • voice.auto_tts: true
  • streaming.enabled: false

Expected Behavior

The caption is displayed in Telegram with its formatting applied.

Actual Behavior

The raw formatting asterisks are visible in the caption.

Affected Component

Agent Core (conversation loop, context compression, memory)

Messaging Platform (if gateway-related)

Telegram

Debug Report

Those contain a lot of personal info, not sharing that.

Operating System

WSL2 with Ubuntu

Python Version

No response

Hermes Version

No response

Proposed Fix (optional)

Two changes needed:

  1. In base.py, format the caption before passing it, or pass a stripped version:

      # Option A: let the adapter format it (requires adapter access here)
      telegram_tts_caption = text_content  # unchanged; adapter must handle it

      # Option B: strip markdown for caption (plain text, no rendering artifacts)
      from tools.tts_tool import _strip_markdown_for_tts
      telegram_tts_caption = _strip_markdown_for_tts(text_content)

  2. In telegram.py, inside send_voice(), apply format_message() to the caption and set parse_mode=ParseMode.MARKDOWN_V2:

      # In send_voice(), before passing to the send_voice Bot API call:
      if caption:
          formatted_caption = self.format_message(caption[:1024])
          # pass formatted_caption + parse_mode=ParseMode.MARKDOWN_V2
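For the adapter-side route (Option A), the caption has to be valid MarkdownV2 before parse_mode is set, otherwise Telegram rejects the request. The sketch below is an assumption-laden illustration, not the adapter's format_message(): escape_markdown_v2 and to_markdown_v2_caption are hypothetical helpers, and only the **bold** construct is converted. The escape set follows the reserved characters listed in the Telegram Bot API formatting documentation.

```python
import re

# Characters that MarkdownV2 requires to be escaped in ordinary text,
# plus the backslash itself (escaping it is always permitted).
_MDV2_RESERVED = re.compile(r"([_*\[\]()~`>#+\-=|{}.!\\])")


def escape_markdown_v2(text: str) -> str:
    """Prefix every reserved MarkdownV2 character with a backslash."""
    return _MDV2_RESERVED.sub(r"\\\1", text)


def to_markdown_v2_caption(text: str) -> str:
    """Convert **bold** spans to MarkdownV2 *bold* and escape everything else.

    Hypothetical helper: a real formatter would also handle italics, code,
    links, and so on.
    """
    # re.split with one capture group alternates plain text and bold content.
    parts = re.split(r"\*\*(.+?)\*\*", text)
    out = []
    for index, part in enumerate(parts):
        escaped = escape_markdown_v2(part)
        out.append(f"*{escaped}*" if index % 2 else escaped)
    return "".join(out)


if __name__ == "__main__":
    caption = to_markdown_v2_caption("Hello **world**!")
    # Parameters for the Bot API sendVoice method; "MarkdownV2" is the
    # parse_mode value defined by the Bot API.
    params = {"caption": caption, "parse_mode": "MarkdownV2"}
    print(params)
```

Any unescaped reserved character in the caption would cause a parse error, which is the risk the plain-text option avoids.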

Option B (strip at source in base.py) is simpler and safer — captions rarely need rich formatting, and plain text captions never cause parse errors. The proper text response is still sent separately via _send_with_retry() through the normal formatted path.
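Stripping at the source (Option B) could look roughly like the following. The body of the project's _strip_markdown_for_tts is not shown in this issue, so strip_markdown_for_caption is a hypothetical stand-in that covers only a few common constructs (links, bold, italics, inline code, headings).

```python
import re


def strip_markdown_for_caption(text: str) -> str:
    """Remove common markdown markers so the caption reads as plain text.

    Hypothetical helper for illustration; not the project's implementation.
    """
    # [label](url) -> label
    text = re.sub(r"\[([^\]]+)\]\([^)]+\)", r"\1", text)
    # **bold** and __bold__ -> bold (handled before the single-marker forms)
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)
    text = re.sub(r"(?<!\w)__(.+?)__(?!\w)", r"\1", text)
    # *italic* and _italic_ -> italic; the lookarounds leave snake_case alone
    text = re.sub(r"\*(.+?)\*", r"\1", text)
    text = re.sub(r"(?<!\w)_(.+?)_(?!\w)", r"\1", text)
    # `code` -> code
    text = re.sub(r"`([^`]+)`", r"\1", text)
    # Leading heading markers such as "## Title" -> "Title"
    text = re.sub(r"^#{1,6}\s+", "", text, flags=re.MULTILINE)
    return text


if __name__ == "__main__":
    raw = "**Done** in *2s*: see [docs](https://example.com)"
    print(strip_markdown_for_caption(raw)[:1024])
```

Because the result contains no markup, it can be sent without parse_mode and cannot trigger a formatting parse error.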

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR
