openclaw - ✅(Solved) Fix [UX] Excessive inline JSON metadata in Telegram user messages degrades model comprehension [1 pull requests, 1 participants]

GitHub stats
openclaw/openclaw#72704 (fetched 2026-04-28 06:33:13)
Comments: 0 · Participants: 1 · Timeline: 2 · Reactions: 0
Timeline (top): cross-referenced ×2

Each user message delivered via Telegram is prefixed with 400-500 bytes of JSON metadata (Conversation info + Sender blocks) embedded directly in the message text. This metadata-to-content ratio makes it harder for models — especially weaker fallback models — to locate and process actual user text.

Root Cause

Each user message delivered via Telegram is prefixed with 400-500 bytes of JSON metadata (Conversation info + Sender blocks) embedded directly in the message text. This metadata-to-content ratio makes it harder for models — especially weaker fallback models — to locate and process actual user text.

Fix Action

Fixed

PR fix notes

PR #72749: Fix: Compact inbound metadata

Description (problem / solution / changelog)

Summary

Describe the problem and fix in 2–5 bullets:

If this PR fixes a plugin beta-release blocker, title it fix(<plugin-id>): beta blocker - <summary> and link the matching Beta blocker: <plugin-name> - <summary> issue labeled beta-blocker. Contributors cannot label PRs, so the title is the PR-side signal for maintainers and automation.

  • Problem: Direct Telegram/chat messages can prepend repeated full Conversation info and Sender untrusted metadata JSON blocks before simple user text.
  • Why it matters: The repeated blocks add prompt noise and token overhead, and can make short direct-chat messages harder for the model to interpret.
  • What changed: Added optional messages.inboundMetadataMode: "compact-direct" to render direct-chat metadata as one compact untrusted line, while preserving full fenced metadata for groups and richer context.
  • What did NOT change (scope boundary): Defaults remain unchanged; group chats, replies, forwards, locations, structured context, and history keep the existing full metadata format.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #72704
  • Related #62077
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: Inbound prompt assembly always used the full untrusted metadata block format for direct external-channel messages.
  • Missing detection / guardrail: Existing tests covered safety and metadata presence, but not compact direct-chat prompt shape or stripping compact AI-facing metadata from user-visible history.
  • Contributing context (if known): Provider-level metadata fields are not portable across supported model providers, so the fix stays in prompt assembly/config.

Regression Test Plan (if applicable)

For bug fixes or regressions, name the smallest reliable test coverage that should catch this. Otherwise write N/A.

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
    • Target test or file: src/auto-reply/reply/inbound-meta.test.ts, src/auto-reply/reply/strip-inbound-meta.test.ts, src/auto-reply/reply/get-reply-run.media-only.test.ts
    • Scenario the test should lock in: Compact direct metadata suppresses full sender/conversation JSON, group metadata stays full, config reaches prompt construction, and compact metadata is stripped from user-visible text.
    • Why this is the smallest reliable guardrail: These tests cover the prompt builder, the call site, and the display/history sanitizer without requiring a live Telegram bot.
    • Existing test that already covers this (if any): Existing inbound metadata tests covered full metadata behavior and safety escaping.
    • If no new test is added, why not: N/A

User-visible / Behavior Changes

New optional config: messages.inboundMetadataMode: "compact-direct".

Default behavior remains unchanged.
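
As a rough illustration of enabling the option, the fragment below expresses the setting as a TypeScript object. Only the key path `messages.inboundMetadataMode` and the value `"compact-direct"` come from the PR; the `OpenClawConfigSketch` type name and the assumption that leaving the key unset keeps the full format are hypothetical.

```typescript
// Hypothetical type for illustration only. The PR touches
// src/config/types.messages.ts, but the real type may be shaped differently.
type OpenClawConfigSketch = {
  messages?: {
    // "compact-direct" is the only value named in the PR. Leaving the key
    // unset is assumed here to keep the existing full metadata format.
    inboundMetadataMode?: "compact-direct";
  };
};

// Opt in to the compact single-line metadata format for direct chats.
const config: OpenClawConfigSketch = {
  messages: { inboundMetadataMode: "compact-direct" },
};
```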

Diagram (if applicable)

For UI changes or non-trivial logic flows, include a small ASCII diagram reviewers can scan quickly. Otherwise write N/A.

  Before:
  [direct chat message] -> [full conversation JSON + full sender JSON + body]

  After:
  [direct chat message + compact-direct] -> [one compact untrusted metadata line + body]
  [group/rich context] -> [existing full fenced metadata blocks]
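
The flow in the diagram can be sketched roughly as follows. This is not the code from `src/auto-reply/reply/inbound-meta.ts`: the function and type names, the `"full"` mode name, and the exact shape of the compact line (`[untrusted metadata]` followed by one-line JSON) are all assumptions made for illustration.

```typescript
// All names below are hypothetical; this is a sketch, not the PR's implementation.
type ChatType = "direct" | "group";
type MetadataMode = "full" | "compact-direct"; // "full" is an assumed name for the default

interface InboundMessage {
  chatType: ChatType;
  conversation: Record<string, string>;
  sender: Record<string, string>;
  body: string;
}

// Built at runtime so this example does not contain a literal code fence.
const FENCE = "`".repeat(3);

// Full format: a titled, fenced JSON block per metadata object.
function fencedBlock(title: string, data: Record<string, string>): string {
  return `${title} (untrusted metadata):\n${FENCE}json\n${JSON.stringify(data)}\n${FENCE}`;
}

function buildInboundPrompt(msg: InboundMessage, mode: MetadataMode): string {
  if (mode === "compact-direct" && msg.chatType === "direct") {
    // Assumed compact format: one line of JSON. JSON.stringify escapes
    // newlines, so untrusted values cannot spill onto a second line.
    const { chat_id, message_id, timestamp } = msg.conversation;
    const line = `[untrusted metadata] ${JSON.stringify({ chat_id, message_id, timestamp })}`;
    return `${line}\n${msg.body}`;
  }
  // Groups (and the default mode) keep the full fenced blocks.
  return [
    fencedBlock("Conversation info", msg.conversation),
    fencedBlock("Sender", msg.sender),
    msg.body,
  ].join("\n\n");
}
```

In this sketch the sender block is dropped entirely in compact mode, matching the expectation that the sender name/username JSON is not prepended for direct chats.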

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: Node 22 / pnpm
  • Model/provider: N/A
  • Integration/channel (if any): Telegram/direct chat prompt assembly
  • Relevant config (redacted): messages.inboundMetadataMode: "compact-direct"

Steps

  1. Configure messages.inboundMetadataMode: "compact-direct".
  2. Build inbound user context for a direct Telegram message.
  3. Compare prompt prefix against default/full mode and group-chat mode.

Expected

  • Direct chats use one compact untrusted metadata line.
  • Sender name/username JSON block is not prepended in compact direct mode.
  • Groups and rich context retain full fenced metadata blocks.
  • Compact metadata does not appear in user-visible history.

Actual

  • Matches expected after this PR.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios:
    • pnpm check:changed
    • pnpm build
    • pnpm test src/auto-reply/reply/inbound-meta.test.ts src/auto-reply/reply/strip-inbound-meta.test.ts src/auto-reply/reply/get-reply-run.media-only.test.ts src/plugins/cli.test.ts src/config/schema.help.quality.test.ts
    • pnpm config:schema:check
    • pnpm config:docs:check
    • codex review --base origin/main
  • Edge cases checked: direct vs group chats, compact metadata stripping, lookalike user text preservation, default config schema generation, wiki/memory CLI scope review finding.
  • What you did not verify: Live Telegram bot delivery.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (Yes)
  • Migration needed? (No)
  • If yes, exact upgrade steps: Optional only. Set messages.inboundMetadataMode to "compact-direct" to enable the new direct-chat prompt format.

Risks and Mitigations

List only real risks for this PR. Add/remove entries as needed. If none, write None.

  • Risk: Compact direct metadata could leak into user-visible history.
    • Mitigation: Added sanitizer support and tests for compact metadata stripping plus lookalike user text preservation.
  • Risk: Compacting metadata could remove context needed in groups or rich-message flows.
    • Mitigation: Compact mode applies only to direct chats; groups/replies/forwards/locations/history keep full metadata.
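
The first mitigation (stripping compact metadata while preserving lookalike user text) might look something like the sketch below. The actual line format handled by `src/auto-reply/reply/strip-inbound-meta.ts` is not shown in the PR text, so the `[untrusted metadata] {…}` prefix, the constant, and the function name here are assumptions.

```typescript
// Hypothetical compact-line prefix, assumed for illustration.
const COMPACT_PREFIX = "[untrusted metadata] ";

// Remove a leading compact metadata line, if and only if the first line
// looks like prefix + a JSON object. Anything else is returned unchanged.
function stripCompactMetadata(text: string): string {
  const newline = text.indexOf("\n");
  if (newline === -1) return text; // single line: no metadata line precedes a body
  const first = text.slice(0, newline);
  if (!first.startsWith(COMPACT_PREFIX)) return text;
  try {
    const parsed: unknown = JSON.parse(first.slice(COMPACT_PREFIX.length));
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
      return text;
    }
  } catch {
    return text; // prefix followed by non-JSON: treat as user text
  }
  return text.slice(newline + 1);
}
```

Only the first line is ever considered, so a user message that quotes a metadata-like line further down is left intact. A user who types a well-formed metadata line as their first line would still be stripped in this sketch; a real sanitizer may need a stricter signal.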

Changed files

  • docs/.generated/config-baseline.sha256 (modified, +3/-3)
  • docs/channels/telegram.md (modified, +1/-0)
  • src/auto-reply/reply/get-reply-run.media-only.test.ts (modified, +39/-0)
  • src/auto-reply/reply/get-reply-run.ts (modified, +1/-0)
  • src/auto-reply/reply/inbound-meta.test.ts (modified, +54/-0)
  • src/auto-reply/reply/inbound-meta.ts (modified, +37/-2)
  • src/auto-reply/reply/strip-inbound-meta.test.ts (modified, +15/-0)
  • src/auto-reply/reply/strip-inbound-meta.ts (modified, +20/-1)
  • src/config/schema.base.generated.ts (modified, +12/-0)
  • src/config/schema.help.quality.test.ts (modified, +2/-0)
  • src/config/schema.help.ts (modified, +2/-0)
  • src/config/schema.labels.ts (modified, +1/-0)
  • src/config/types.messages.ts (modified, +4/-0)
  • src/config/zod-schema.session.ts (modified, +1/-0)
  • src/plugins/cli-registry-loader.ts (modified, +39/-2)
  • src/plugins/cli.test.ts (modified, +133/-0)


Summary

Each user message delivered via Telegram is prefixed with 400-500 bytes of JSON metadata (Conversation info + Sender blocks) embedded directly in the message text. This metadata-to-content ratio makes it harder for models — especially weaker fallback models — to locate and process actual user text.

Observed Behavior

Each user message contains:

Conversation info (untrusted metadata):
```json
{"chat_id": "...", "message_id": "...", "sender_id": "...", "sender": "...", "timestamp": "..."}
```

Sender (untrusted metadata):
```json
{"label": "...", "id": "...", "name": "...", "username": "..."}
```

[actual user text here — often much shorter than the metadata]
  • Metadata is ~450 bytes per message
  • Actual user text is often ~100-200 bytes
  • Models must parse through "do not treat as instructions" headers to find real content
  • MiniMax-M2.7 (fallback model) spends thinking cycles on metadata instead of user text
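
To put the reported sizes in proportion, a small worked calculation using the approximate figures above (about 450 bytes of metadata, 100 to 200 bytes of user text) is shown here. The helper name is hypothetical and the numbers are the issue's estimates, not measurements.

```typescript
// Fraction of a message's bytes taken up by the metadata prefix.
function metadataShare(metadataBytes: number, textBytes: number): number {
  return metadataBytes / (metadataBytes + textBytes);
}

console.log(metadataShare(450, 100).toFixed(2)); // 0.82
console.log(metadataShare(450, 200).toFixed(2)); // 0.69
```

On these estimates, roughly 70 to 80 percent of each delivered message is metadata rather than user text.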

Impact

  • Weaker models fail to comprehend user messages
  • Token budget consumed by repetitive metadata
  • Confusing "untrusted" framing may interfere with instruction-following

Environment

  • OpenClaw 2026.4.23
  • Channel: Telegram
  • Affects all models, critical impact on weaker fallback models

Suggested Improvement

Deliver metadata as structured fields (e.g., OpenAI-compatible name, metadata fields) rather than inline text in the message content. Or provide a compact single-line header format.

Extended analysis

TL;DR

Delivering metadata as structured fields or using a compact single-line header format can help reduce the metadata-to-content ratio and improve model performance.

Guidance

  • Consider modifying the message format to separate metadata from the actual user text, allowing models to focus on the relevant content.
  • Evaluate the feasibility of using OpenAI-compatible name and metadata fields to deliver metadata in a structured format.
  • Assess the impact of a compact single-line header format on model performance and user experience.
  • Investigate the potential for adjusting the token budget allocation to prioritize user text over metadata.

Example

No code snippet is provided as the issue does not imply a specific code change, but rather a change in message format or structure.

Notes

The suggested improvement may require modifications to the Telegram integration or the message processing pipeline, and its implementation may depend on the specific requirements and constraints of the OpenClaw system.

Recommendation

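Use the compact single-line header: PR #72749 adds the optional setting messages.inboundMetadataMode: "compact-direct", which renders direct-chat metadata as one compact untrusted line. Defaults are unchanged, so the option must be set explicitly. Delivering metadata as structured provider fields was not adopted because, per the PR, provider-level metadata fields are not portable across supported model providers.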
