openclaw - ✅(Solved) Fix [Bug]: Telegram: Binary file content injected into prompt via msg.caption causes token explosion [2 pull requests, 2 comments, 2 participants]

GitHub stats
openclaw/openclaw#66647 · Fetched 2026-04-15 06:25:10
Comments: 2 · Participants: 2 · Timeline: 9 · Reactions: 0
Timeline (top): commented ×2, cross-referenced ×2, labeled ×2, referenced ×2

When users send binary files (e.g., .mobi e-books) via Telegram, the file's binary content is incorrectly injected into the LLM prompt through msg.caption, causing massive token explosions and context overflow errors.

Error Message

• Context overflow error: Context overflow: 426921 tokens, model limit: 262144

Root Cause

• Binary content from msg.caption is passed directly to the prompt
• Token count explodes (observed: 8.9KB file → 460,506 tokens)
• Context overflow error: Context overflow: 426921 tokens, model limit: 262144
• Auto-compaction fails because the overflow comes from system-injected context, not conversation history

Fix Action

Fixed

PR fix notes

PR #66663: fix: filter binary content from Telegram captions to prevent token explosion

Description (problem / solution / changelog)

Closes #66647

Summary

  • When a user sends a binary document (e.g. .mobi, .epub) via Telegram, raw binary bytes can leak into msg.caption. getTelegramTextParts() passes this through to the LLM prompt, causing catastrophic token explosion (~460K tokens for a single message).
  • Added isBinaryContent() that detects non-printable control characters (0x00-0x08, 0x0E-0x1F) and used it to sanitize text in getTelegramTextParts() before it reaches the prompt pipeline.
  • When binary content is detected, both text and entities are replaced with empty values — the message is still processed (media placeholder works) but binary junk is dropped.
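
The behavior described in this summary can be sketched as follows. This is a minimal illustration, not the code from the PR: the type shapes are simplified stand-ins for the Telegram library types, and the regex is an assumption derived from the byte ranges quoted above.

```typescript
// Hypothetical minimal shapes; the real types come from the Telegram bot library.
type TelegramTextEntity = { type: string; offset: number; length: number };
type TelegramTextMessage = {
  text?: string;
  caption?: string;
  entities?: TelegramTextEntity[];
  caption_entities?: TelegramTextEntity[];
};

// Control characters 0x00-0x08 and 0x0E-0x1F.
// Tab (0x09), LF (0x0A), VT (0x0B), FF (0x0C) and CR (0x0D) are deliberately left out.
const BINARY_CONTROL_CHARS = /[\x00-\x08\x0E-\x1F]/;

function isBinaryContent(text: string): boolean {
  return BINARY_CONTROL_CHARS.test(text);
}

function getTelegramTextParts(msg: TelegramTextMessage): {
  text: string;
  entities: TelegramTextEntity[];
} {
  const rawText = msg.text ?? msg.caption ?? "";
  if (isBinaryContent(rawText)) {
    // Drop the entities too: their offsets would point into text that is no longer there.
    return { text: "", entities: [] };
  }
  return { text: rawText, entities: msg.entities ?? msg.caption_entities ?? [] };
}
```

Because the function still returns a valid (empty) result, the caller can fall back to its media placeholder instead of rejecting the message.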

Changes

  • extensions/telegram/src/bot/body-helpers.ts: Added isBinaryContent() helper and integrated it into getTelegramTextParts().
  • extensions/telegram/src/bot/helpers.ts: Re-exported isBinaryContent for test access.
  • extensions/telegram/src/bot/helpers.test.ts: Added 8 new tests covering binary detection, caption filtering for binary .mobi/.epub content, normal caption preservation, and binary text handling.

Test plan

  • All 63 existing tests in helpers.test.ts continue to pass
  • New isBinaryContent tests verify detection of null bytes, binary headers, normal text, and whitespace
  • New getTelegramTextParts tests verify binary captions are stripped while normal captions are preserved

Risks and Mitigations

  • Risk: False positives on legitimate text with unusual control characters.
    • Mitigation: The regex only flags bytes 0x00-0x08 and 0x0E-0x1F, which are never present in legitimate Unicode text messages (tabs 0x09, newlines 0x0A/0x0D are explicitly excluded). This is the same heuristic used by file(1) for binary detection.

Joel Nishanth · offlyn.AI

Made with Cursor

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • extensions/telegram/src/bot-handlers.runtime.ts (modified, +3/-3)
  • extensions/telegram/src/bot/body-helpers.ts (modified, +13/-2)
  • extensions/telegram/src/bot/helpers.test.ts (modified, +62/-0)
  • extensions/telegram/src/bot/helpers.ts (modified, +2/-0)

PR #66877: Telegram/documents: sanitize binary payloads to prevent prompt input inflation

Description (problem / solution / changelog)

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: Telegram .epub / .mobi uploads could still leak raw binary into prompt context after #66663 through two side paths: reply/quote context and media-understanding file extraction.
  • Why it matters: On a live deployed OpenClaw instance running the latest main, a real EPUB upload still produced a huge <file mime="text/plain"> block with raw ZIP bytes and drove the prompt to about 231k tokens from a binary file of only 100 KB.
  • What changed: Reused the existing Telegram binary-text filter for reply and quote text, treated +zip attachment MIME types like EPUB as binary, and blocked text/plain coercion for buffers with raw binary control bytes.
  • What did NOT change (scope boundary): This PR does not broaden the Telegram binary heuristic beyond the existing control-byte rule, and it does not change generic prompt assembly outside the Telegram/media attachment seams.
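
The two attachment-side rules named above (treat +zip MIME types as binary, and refuse text/plain coercion for buffers with raw control bytes) might look roughly like this. All function names and the 4096-byte sample size are assumptions made for illustration; the actual change lives in src/media-understanding/apply.ts and may be structured differently. UTF-16 text, which legitimately contains zero bytes, would need separate handling that this sketch omits.

```typescript
// Hypothetical helper names; only the two rules come from the PR summary.

// A "+zip" structured-syntax suffix (e.g. application/epub+zip) marks a ZIP container.
function isZipContainerMime(mime: string | undefined): boolean {
  if (!mime) return false;
  const semi = mime.indexOf(";");
  const essence = (semi === -1 ? mime : mime.slice(0, semi)).trim().toLowerCase();
  return essence.endsWith("+zip");
}

// True when the sampled bytes contain 0x00-0x08 or 0x0E-0x1F.
// The sample size is an assumption, not a value from the PR.
function hasBinaryControlBytes(buf: Uint8Array, sampleSize = 4096): boolean {
  return buf
    .subarray(0, sampleSize)
    .some((b) => b <= 0x08 || (b >= 0x0e && b <= 0x1f));
}

// Decide whether an attachment may be inlined as a text/plain file block.
function canInlineAsText(mime: string | undefined, buf: Uint8Array): boolean {
  if (isZipContainerMime(mime)) return false;
  return !hasBinaryControlBytes(buf);
}
```

Checking the MIME type first means an EPUB is rejected even if its first bytes happen to look printable; the byte check covers uploads whose MIME type is missing or wrong.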

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes # N/A
  • Related #66647, #66663
  • This PR fixes a bug or regression

Root Cause (if applicable)

For bug fixes or regressions, explain why this happened, not just what changed. Otherwise write N/A. If the cause is unclear, write Unknown.

  • Root cause: two prompt-facing Telegram seams still bypassed the intended binary filtering after #66663. describeReplyTarget() built reply and quote context from raw Telegram fields, and extractFileBlocks() could still coerce ZIP-like attachment bytes into text/plain.
  • Missing detection / guardrail: Tests covered getTelegramTextParts() but did not cover describeReplyTarget(), ReplyToBody, or media-understanding extraction for EPUB-like container uploads.
  • Contributing context (if known): the EPUB path was hidden behind media-understanding heuristics, so the original direct caption/text fix did not cover the real attachment extraction route.

Regression Test Plan (if applicable)

For bug fixes or regressions, name the smallest reliable test coverage that should catch this. Otherwise write N/A.

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: extensions/telegram/src/bot.test.ts, extensions/telegram/src/bot/helpers.test.ts, and src/media-understanding/apply.test.ts
  • Scenario the test should lock in: binary reply captions and quote text must not surface in ReplyToBody, and EPUB-like ZIP containers must not be reclassified into inline text/plain file blocks.
  • Why this is the smallest reliable guardrail: the bug crossed the Telegram boundary into media-understanding prompt assembly, so we need both the Telegram seam tests and one attachment-extraction regression to prove the prompt-facing behavior.
  • Existing test that already covers this (if any): None
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

Telegram .epub / .mobi document uploads no longer inline raw binary bytes into prompt context through reply/quote metadata or text/plain attachment extraction. Normal reply behavior and real text-file extraction stay the same.

Diagram (if applicable)

N/A

Before:
[Telegram EPUB upload] -> [file bytes coerced to text/plain] -> [raw ZIP bytes in prompt]
[reply to binary payload] -> [raw reply text used] -> [ReplyToBody contains junk]

After:
[Telegram EPUB upload] -> [EPUB treated as binary] -> [no inline file block]
[reply to binary payload] -> [Telegram reply text filtered] -> [ReplyToBody omitted]
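
The reply path in the diagram can be illustrated with a small selection function. This is a sketch under assumptions: pickReplyBody and its input shape are hypothetical, and the real describeReplyTarget() handles more fields. It only shows the idea of filtering each candidate and falling back to a safe value.

```typescript
const CONTROL_CHARS = /[\x00-\x08\x0E-\x1F]/;

// Return the text only when it is non-empty and free of binary control characters.
function sanitizeReplyText(raw: string | undefined): string | undefined {
  if (!raw || CONTROL_CHARS.test(raw)) return undefined;
  return raw;
}

// Hypothetical selection order: quote text, then reply text or caption,
// then a media placeholder such as "<media:document>".
function pickReplyBody(input: {
  quoteText?: string;
  replyText?: string;
  mediaPlaceholder?: string;
}): string | undefined {
  return (
    sanitizeReplyText(input.quoteText) ??
    sanitizeReplyText(input.replyText) ??
    input.mediaPlaceholder
  );
}
```

A result of undefined corresponds to the "ReplyToBody omitted" case; a binary quote with a clean reply text falls back to the reply text.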

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: Linux
  • Runtime/container: local repo checkout with Node 22 / pnpm, plus a live deployed OpenClaw instance used first on main and then on the fix branch
  • Model/provider: kilocode/minimax/minimax-m2.5:free
  • Integration/channel (if any): Telegram
  • Relevant config (redacted): default Telegram test-bot config

Steps

  1. Send a real .epub document to the live Telegram bot on the deployed OpenClaw instance with no manual caption. Example epub used: https://www.gutenberg.org/ebooks/26551.epub.images
  2. Observe the resulting session JSONL prompt body for the inbound message.
  3. Reply to that message with the text "reply caption probe".
  4. Repeat the same flow after the fix.

Expected

  • Binary Telegram uploads should stay as <media:document> metadata only, and binary reply or quote text should be filtered before it reaches ReplyToBody.

Actual

  • Before the fix, the live deployed OpenClaw instance running the latest main inlined a real Project Gutenberg EPUB as <file mime="text/plain"> with raw ZIP bytes in prompt context and produced a run with about 231,840 total tokens.
  • Before the fix, the local Telegram reply repro also produced ReplyToBody: "PK\u0000\u0003\u0004BINARY".
  • After the fix, the live EPUB message stayed as plain <media:document> with no <file> block and no raw bytes, and the reply message carried only Replied message.body: "<media:document>".

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Before fix live deployed-instance snippet:

{
  "message_id": "81",
  "body_excerpt": "<file name=\"pg26551-images...\" mime=\"text/plain\">\\n<<<EXTERNAL_UNTRUSTED_CONTENT ... >>>\\n䭐Ѓ\\u0014\\u0000\\u0000...",
  "usage": {
    "input": 109534,
    "cacheRead": 122240,
    "totalTokens": 231840
  }
}

After fix live deployed-instance snippet:

{
  "message_id": "85",
  "body_excerpt": "<media:document>",
  "reply_message_id": "86",
  "reply_context": {
    "body": "<media:document>"
  },
  "usage": {
    "input": 651,
    "replyInput": 515
  }
}

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: personally reproduced the live Telegram EPUB upload on a deployed OpenClaw instance running the latest main and confirmed that raw EPUB bytes were inlined into prompt context; redeployed the fix branch; resent the same EPUB and verified the inbound prompt now stayed at <media:document> with no raw <file> block. Also reran the local Telegram reply repro and confirmed ReplyToBody no longer populated with binary text.
  • Edge cases checked: binary reply captions, binary quote text with safe reply-text fallback, binary external-reply quote text with safe fallback, EPUB-like +zip MIME attachments, binary-control-byte buffers with unknown MIME, and UTF-16 text attachment extraction.
  • What you did not verify: a release-tag before/after deployment flow separate from latest origin/main, or non-EPUB container formats beyond the regression cases above.
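
The UTF-16 edge case is worth spelling out, because UTF-16 encodes ASCII text with interleaved 0x00 bytes and a byte-level control check would misread it as binary. One way to keep such files eligible is to look for a byte order mark first. The sketch below is an assumption about how that could be done, not the PR's implementation; classifyBuffer is a hypothetical name, and UTF-16 without a BOM would still be classified as binary here.

```typescript
type BufferKind = "utf16-text" | "binary" | "text";

// UTF-16 byte order marks: FF FE (little endian) and FE FF (big endian).
function hasUtf16Bom(buf: Uint8Array): boolean {
  if (buf.length < 2) return false;
  const a = buf[0];
  const b = buf[1];
  return (a === 0xff && b === 0xfe) || (a === 0xfe && b === 0xff);
}

function classifyBuffer(buf: Uint8Array): BufferKind {
  // Check the BOM before the control-byte rule so UTF-16 text is not rejected.
  if (hasUtf16Bom(buf)) return "utf16-text";
  const hasControl = buf.some((b) => b <= 0x08 || (b >= 0x0e && b <= 0x1f));
  return hasControl ? "binary" : "text";
}
```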

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

I already ran codex review --base upstream/main in a loop locally, fixing every justified finding until there were no more left - so I don't expect any more findings to show up.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps:

Risks and Mitigations

List only real risks for this PR. Add/remove entries as needed. If none, write None.

  • Risk:
    • False positives on legitimate reply or quote text containing disallowed control bytes, or on future attachment formats that intentionally use unusual control-byte-heavy text encodings.
  • Mitigation:
    • Reuses the existing Telegram control-byte heuristic already accepted for direct caption/text handling, preserves the current media/location fallback behavior when sanitized text is empty, and keeps UTF-16 text attachments eligible for extraction with dedicated regression coverage.

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • extensions/telegram/src/bot-message-context.session.ts (modified, +18/-3)
  • extensions/telegram/src/bot.test.ts (modified, +33/-0)
  • extensions/telegram/src/bot/body-helpers.ts (modified, +6/-2)
  • extensions/telegram/src/bot/helpers.test.ts (modified, +59/-1)
  • extensions/telegram/src/bot/helpers.ts (modified, +22/-12)
  • src/media-understanding/apply.test.ts (modified, +119/-0)
  • src/media-understanding/apply.ts (modified, +20/-0)


Bug type

Regression (worked before, now fails)

Beta release blocker

No

Summary

When users send binary files (e.g., .mobi e-books) via Telegram, the file's binary content is incorrectly injected into the LLM prompt through msg.caption, causing massive token explosions and context overflow errors.

Steps to reproduce

  1. Start a conversation with OpenClaw via Telegram
  2. Send a binary file (e.g., a .mobi e-book file, ~9KB)
  3. Observe the Gateway logs for context overflow

Expected behavior

• Binary file content should be filtered out from msg.caption
• The text field should return an empty string or a media placeholder (like media:document)
• Similar to how LINE channel handles this with extractMediaPlaceholder()

Actual behavior

• Binary content from msg.caption is passed directly to the prompt
• Token count explodes (observed: 8.9KB file → 460,506 tokens)
• Context overflow error: Context overflow: 426921 tokens, model limit: 262144
• Auto-compaction fails because the overflow comes from system-injected context, not conversation history

OpenClaw version

OpenClaw version: 2026.4.12 (also tested on 2026.4.14)

Operating system

Mac

Install method

No response

Model

anthropic

Provider / routing chain

Telegram

Additional provider/model setup details

In extensions/telegram/src/bot/body-helpers.ts, the getTelegramTextParts() function directly uses msg.caption without validating whether it contains binary content:

export function getTelegramTextParts(
  msg: Pick<Message, "text" | "caption" | "entities" | "caption_entities">,
): {
  text: string;
  entities: TelegramTextEntity[];
} {
  const text = msg.text ?? msg.caption ?? "";  // ← No binary content check
  const entities = msg.entities ?? msg.caption_entities ?? [];
  return { text, entities };
}

Suggested Fix

Add a binary content detection helper and filter before returning:

export function getTelegramTextParts(
  msg: Pick<Message, "text" | "caption" | "entities" | "caption_entities">,
): {
  text: string;
  entities: TelegramTextEntity[];
} {
  const rawText = msg.text ?? msg.caption ?? "";
  const entities = msg.entities ?? msg.caption_entities ?? [];
  
  // Filter out binary content that may be incorrectly placed in caption
  const text = isBinaryContent(rawText) ? "" : rawText;
  
  return { text, entities };
}

function isBinaryContent(str: string): boolean {
  if (!str || str.length < 200) return false;
  
  const sample = str.slice(0, 2000);
  
  // Check for CJK compatibility characters (common in binary garbled text)
  const cjkCompatPattern = /[\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F]/g;
  const matches = sample.match(cjkCompatPattern);
  
  if (matches && matches.length > sample.length * 0.3) {
    return true;
  }
  
  // Check for NUL characters
  const nullCount = (sample.match(/\x00/g) || []).length;
  return nullCount > 5;
}
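
One limitation of this suggested heuristic is its 200-character minimum: short binary strings pass through unchanged. The comparison below reproduces the heuristic from the issue and runs it against the short payload seen in the PR #66877 repro ("PK\u0000\u0003\u0004BINARY"), next to a control-character rule of the kind PR #66663 describes. The function names issueHeuristic and controlByteRule are labels for this sketch only.

```typescript
// Heuristic as proposed in the issue, reproduced so the comparison is self-contained.
function issueHeuristic(str: string): boolean {
  if (!str || str.length < 200) return false;
  const sample = str.slice(0, 2000);
  const cjkCompatPattern = /[\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F]/g;
  const matches = sample.match(cjkCompatPattern);
  if (matches && matches.length > sample.length * 0.3) return true;
  const nullCount = (sample.match(/\x00/g) || []).length;
  return nullCount > 5;
}

// Control-character rule using the byte ranges described in PR #66663.
function controlByteRule(str: string): boolean {
  return /[\x00-\x08\x0E-\x1F]/.test(str);
}

const shortPayload = "PK\u0000\u0003\u0004BINARY"; // 11 characters, below the 200 minimum

const comparison = {
  issueHeuristic: issueHeuristic(shortPayload), // false: skipped by the length check
  controlByteRule: controlByteRule(shortPayload), // true: contains 0x00, 0x03, 0x04
};
```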

Logs, screenshots, and evidence

Impact and severity

No response

Additional information

• LINE channel already handles this correctly with extractMediaPlaceholder() returning media:document for file types
• This issue is specific to Telegram's handling of msg.caption

extent analysis

TL;DR

Add a binary content detection helper to filter out binary data from msg.caption before passing it to the LLM prompt.

Guidance

  • Implement the suggested isBinaryContent function to detect binary content in the msg.caption field.
  • Modify the getTelegramTextParts function to use the isBinaryContent function and filter out binary content.
  • Verify that the isBinaryContent function correctly identifies binary content by testing it with known binary files.
  • Test the updated getTelegramTextParts function with binary files to ensure that the binary content is properly filtered out.

Example

The provided code snippet for isBinaryContent and the updated getTelegramTextParts function can be used as a starting point:

function isBinaryContent(str: string): boolean {
  // implementation as provided in the issue
}

export function getTelegramTextParts(
  msg: Pick<Message, "text" | "caption" | "entities" | "caption_entities">,
): {
  text: string;
  entities: TelegramTextEntity[];
} {
  const rawText = msg.text ?? msg.caption ?? "";
  const entities = msg.entities ?? msg.caption_entities ?? [];
  
  const text = isBinaryContent(rawText) ? "" : rawText;
  
  return { text, entities };
}

Notes

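The isBinaryContent function proposed in the issue relies on heuristics: a high share of CJK compatibility characters, or more than five NUL characters, within the first 2,000 characters. It also skips any string shorter than 200 characters. The fix in PR #66663 instead flags any control character in the ranges 0x00-0x08 and 0x0E-0x1F, so treat the issue's version as a starting point that may need refinement.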

Recommendation

Apply the suggested workaround by implementing the isBinaryContent function and updating the getTelegramTextParts function. This should help prevent the context overflow error caused by binary content in the msg.caption field.
