openclaw - ✅(Solved) Fix [Bug]: Telegram/Codex response corruption after provider failure — truncated replies, leaked internal reply tag, and malformed final delivery [1 pull requests, 1 comments, 2 participants]

GitHub stats
openclaw/openclaw#55019 · Fetched 2026-04-08 01:33:39
Comments: 1 · Participants: 2 · Timeline events: 7 · Reactions: 0
Timeline (top): labeled ×2, commented ×1, cross-referenced ×1, mentioned ×1

When using OpenClaw on Telegram DM with openai-codex/gpt-5.4, a provider-side failure (server_error) can be followed by corrupted user-visible replies.

Observed symptoms in the same incident chain:

• assistant replies get truncated mid-sentence
• long replies become unreliable after the first failure
• a malformed internal-looking reply marker was delivered to Telegram as plain text:
  • [[replyReturn_current
• structured work involving tools + file writes appears to increase the chance of corruption
• local session transcript shows incomplete / partial assistant outputs around the failure window

This looks like a compound failure:

  1. upstream Codex/provider error
  2. OpenClaw reply/delivery pipeline not safely handling partial/corrupted assistant output afterward

Error Message

  1. Provider error in session logs

Example pattern found in local transcript:
{
"stopReason": "error",
"errorMessage": "Codex error: {\"type\":\"error\",\"error\":{\"type\":\"server_error\",\"code\":\"server_error\",\"message\":\"An error occurred while processing your request...\"}}"
}

  2. Malformed reply marker delivered to Telegram

User received this as a visible Telegram message:
[[replyReturn_current
This strongly suggests a broken internal reply tag or partially emitted routing marker.

  3. Repeated truncation of assistant replies

Assistant replies in the same incident window were visibly cut mid-sentence, e.g. patterns like (Portuguese originals, with an English gloss in parentheses):
... eu estava quebrando no momento de fazer o write da ("... I was breaking at the moment of doing the write of the")
and
... especialmente quando tento: ("... especially when I try to:")
1. pensar ("think")
2. executar ("execute")
3. escrever arquivo grande donc ("write large file", followed by the stray fragment "donc")

  4. Structured task context

The issue occurred during a turn involving:

• tool calls
• write/read of SKILL.md
• long structured assistant output

This may be relevant if the corruption is triggered or amplified by:

• long responses
• multi-phase tool usage
• structured markdown/YAML-like content
• reply tag insertion on Telegram delivery...
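
The provider error quoted in item 1 is a JSON document embedded inside a string field, so detecting it takes two steps: check stopReason, then parse what follows the "Codex error: " prefix. A minimal Python sketch of that check follows. The function name extract_provider_error is hypothetical, and the entry shape is assumed to match the transcript excerpt above rather than any documented OpenClaw schema.

```python
import json
from typing import Optional

# Prefix seen in the transcript excerpt; assumed to be stable for this sketch.
PREFIX = "Codex error: "

def extract_provider_error(entry: dict) -> Optional[dict]:
    """Return the nested provider error object, or None if there is none."""
    if entry.get("stopReason") != "error":
        return None
    message = entry.get("errorMessage", "")
    if not isinstance(message, str) or not message.startswith(PREFIX):
        return None
    try:
        payload = json.loads(message[len(PREFIX):])
    except json.JSONDecodeError:
        # The embedded JSON may itself be truncated; treat that as "unknown".
        return None
    if not isinstance(payload, dict):
        return None
    error = payload.get("error")
    return error if isinstance(error, dict) else None

# Sample entry copied from the transcript pattern above.
sample_entry = {
    "stopReason": "error",
    "errorMessage": 'Codex error: {"type":"error","error":{"type":"server_error",'
                    '"code":"server_error","message":"An error occurred while '
                    'processing your request..."}}',
}
provider_error = extract_provider_error(sample_entry)
```

A caller could use a non-None result with type "server_error" as the signal that the turn's text is unreliable and should not be delivered as-is.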

Root Cause

Root cause hypothesis

Fix Action

Fixed

PR fix notes

PR #55040: fix: sanitize leaked directive tags before outbound delivery

Description (problem / solution / changelog)

Summary

  • Adds sanitizeLeakedDirectiveTags() in src/utils/directive-tags.ts — a defensive safety net that catches malformed/partial directive tags (e.g. [[replyReturn_current with no closing ]]) that bypass the precise REPLY_TAG_RE regex
  • Wires the sanitization into normalizeReplyPayload() in src/auto-reply/reply/normalize-reply.ts, right after sanitizeUserFacingText() — the central outbound path for all channels
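
To illustrate the idea behind the safety net, here is a minimal Python sketch of a tolerant tag stripper. It is not the PR's TypeScript implementation: the function name sanitize_leaked_directive_tags, the list of tag names, and the regex are all assumptions derived from the examples in the PR's test plan.

```python
import re

# Hypothetical tag names, taken from the examples quoted in the PR's test plan.
_TAG_NAMES = r"(?:reply_to_current|reply_to|replyReturn_current|audio_as_voice)"

# "[[", a known tag name, an optional ": <id>" argument, then zero to two "]".
# Allowing zero closing brackets is what catches partially emitted tags.
_LEAKED_TAG_RE = re.compile(
    r"\[\[\s*" + _TAG_NAMES + r"\b(?:\s*:\s*[\w-]+)?\s*\]{0,2}"
)

def sanitize_leaked_directive_tags(text: str) -> str:
    """Strip complete or partial directive tags; leave other [[...]] text alone."""
    if "[[" not in text:
        return text  # fast path: nothing that could be a tag
    return _LEAKED_TAG_RE.sub("", text).strip()

# The malformed marker from the issue is removed entirely.
cleaned_marker = sanitize_leaked_directive_tags("[[replyReturn_current")
# Ordinary double-bracket content is not a known tag name, so it is kept.
kept_content = sanitize_leaked_directive_tags("See [[wikipedia]] for more")
```

Matching on a fixed set of tag names, rather than on any text after "[[", is what keeps content such as [[1,2,3]] intact. The trade-off is that a corrupted tag name outside the list would still slip through.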

Fixes #55019

Test plan

  • 11 new tests in src/utils/directive-tags.test.ts covering:
    • Malformed tags with no closing brackets ([[replyReturn_current, [[reply_to_current, [[ reply_to : 123)
    • Partial audio tag ([[audio_as_voice)
    • Single closing bracket ([[reply_to_current])
    • Well-formed tags still stripped as safety net
    • Normal [[...]] content NOT stripped (e.g. [[1,2,3]], [[wikipedia]])
    • Empty string, plain text, multiple malformed tags
  • All 20 directive-tags tests pass
  • pnpm check passes (lint, format, type-check, boundaries)

Changed files

  • src/auto-reply/reply/normalize-reply.ts (modified, +6/-0)
  • src/utils/directive-tags.test.ts (modified, +51/-0)
  • src/utils/directive-tags.ts (modified, +19/-0)

RAW_BUFFER

Bug type

Crash (process/app exits or hangs)

Steps to reproduce

I do not yet have a minimal deterministic repro, but the failure pattern appears more likely under this shape:

  1. Use Telegram DM
  2. Use openai-codex/gpt-5.4
  3. Trigger a long multi-step task involving:
     • tool calls
     • file writes
     • structured markdown output
  4. Hit a provider failure (server_error) during or near final answer generation
  5. Observe whether subsequent replies:
     • truncate
     • leak internal text
     • or emit malformed reply tags

Expected behavior

After a provider failure or interrupted completion:

• Telegram users should receive either:
  • a clean error message
  • or no malformed partial output at all
• internal reply routing markers should never be delivered to the user
• partial/corrupted assistant text should be suppressed or sanitized before outbound delivery
• later replies in the same session should not remain “poisoned” / degraded
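
The first and third expectations amount to a gate in front of outbound delivery: a turn that ended in an error never forwards its partial text. A minimal Python sketch of such a gate is shown here. AssistantTurn, select_outbound_text, and the fallback wording are hypothetical names for illustration, not OpenClaw APIs.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical wording for the clean error message.
FALLBACK_MESSAGE = "Sorry, the model provider returned an error. Please try again."

@dataclass
class AssistantTurn:
    text: str
    stop_reason: str  # e.g. "stop" or "error", mirroring "stopReason" in the transcript

def select_outbound_text(turn: AssistantTurn) -> Optional[str]:
    """Decide what, if anything, should be delivered for a finished turn."""
    if turn.stop_reason == "error":
        # Never forward partial output from a failed turn; send a clean notice.
        return FALLBACK_MESSAGE
    text = turn.text.strip()
    # None means "deliver nothing" rather than sending an empty message.
    return text or None

# A truncated reply from a failed turn is replaced by the fallback message.
failed = select_outbound_text(
    AssistantTurn("... eu estava quebrando no momento de fazer o write da", "error")
)
```

The gate only looks at the current turn. The last expectation, that later replies are not "poisoned", would additionally require that no state from the failed turn is carried into the next one, which this sketch does not address.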

Actual behavior

Observed user-visible behavior:

• replies cut off in the middle of sentences
• larger responses repeatedly fail after the first incident
• malformed raw text delivered to Telegram:
  • [[replyReturn_current
• behavior becomes worse on longer/structured replies

Observed transcript/log behavior:

• assistant messages with incomplete final text
• provider-side server_error
• evidence of corruption around tool-heavy / structured turns

OpenClaw version

2026.3.13

Operating system

Linux

Install method

global npm / standard OpenClaw runtime

Model

openai-codex/gpt-5.4

Provider / routing chain

Telegram DM

Additional provider/model setup details

We were in the middle of a multi-step task involving:

• analysis
• tool calls
• file writes
• generating/editing a SKILL.md

During that session:

  1. assistant began failing mid-reply
  2. subsequent longer replies were repeatedly truncated
  3. local logs later showed explicit provider failures:
     • Codex error: {"type":"error","error":{"type":"server_error","code":"server_error","message":"An error occurred while processing your request"...}}
  4. after the failure sequence, Telegram received a malformed raw message:
     • [[replyReturn_current

This token was not intentional user-facing text and appears to be an internal reply-routing marker or corrupted variant of one.

Impact and severity

Why this seems important

This is not only a provider outage symptom.

The provider error alone would explain a failed turn, but not why Telegram later receives:

• truncated partial final answers
• malformed internal-looking reply markers
• continued degradation on subsequent responses

That suggests an OpenClaw-side outbound sanitization / reply-state / delivery handling gap after partial model failure.

Additional information

Root cause hypothesis

Most likely a compound issue:

A. Provider/model failure

Codex returns server_error during generation.

B. OpenClaw post-failure handling gap

After the failed or partial generation:

• partial assistant output may remain in an invalid state
• reply-tag injection / stripping may mis-handle the corrupted text
• outbound Telegram delivery may send unsanitized internal fragments

C. Structured / long turns increase risk

Long responses involving tools and file writes seem to make the failure mode easier to trigger or more severe.

───

Related issues

This report appears related to the same general bug family as:

• #52084: Telegram/Codex leaks commentary/tool-call trace text
• #24376: Telegram stream/reasoning leaks intermediate toolUse text
• #3952: Telegram messages truncated due to tag/filter handling
• #27157: Telegram long-message split/truncation/duplicate delivery issues

This issue is slightly different because it specifically includes:

• provider-side server_error
• post-error reply corruption
• leaked malformed reply marker:
  • [[replyReturn_current

<img width="1015" height="328" alt="Image" src="https://github.com/user-attachments/assets/4c115eca-57e4-4d0c-ba65-d1c304c9e1c4" /> <img width="658" height="977" alt="Image" src="https://github.com/user-attachments/assets/57605897-d1d8-45c0-83e6-4c305e7b5404" />

extent analysis

Fix Plan

To address the compound issue of provider-side failure and OpenClaw's post-failure handling gap, we will implement the following steps:

  • Sanitize partial output: Ensure that any partial or corrupted assistant output is properly sanitized before being sent to Telegram.
  • Handle reply tags: Improve the handling of reply tags to prevent malformed or internal-looking markers from being delivered to the user.
  • Implement error handling: Enhance error handling to catch and handle provider-side errors, such as server errors, and prevent them from causing subsequent reply corruption.

Code Changes

import logging
import re

# Telegram caps a single text message at 4096 characters
TELEGRAM_MAX_LEN = 4096

# Sanitize partial output
def sanitize_output(output):
    # Remove the leaked reply marker, with or without closing brackets
    output = re.sub(r'\[\[replyReturn_current\]{0,2}', '', output)
    # Truncate output to Telegram's message length limit
    output = output[:TELEGRAM_MAX_LEN]
    return output

# Handle reply tags
def handle_reply_tags(output):
    # Remove any reply tags ('[reply]' is an illustrative tag name only;
    # the tags named in the PR use the [[...]] form)
    output = output.replace('[reply]', '')
    return output

# Implement error handling
def handle_provider_error(error):
    # Log the error
    logging.error('Provider error: %s', error)
    # Return a clean error message to the user
    return 'Error: Unable to generate response.'

Configuration Changes

  • Increase logging verbosity: Increase the logging verbosity to capture more detailed error messages and debug information.
  • Implement retry mechanism: Implement a retry mechanism to handle temporary provider-side errors and prevent subsequent reply corruption.
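
As one possible shape for the retry mechanism, the following Python sketch retries a provider call with exponential backoff. ProviderServerError and call_with_retry are hypothetical names, and the attempt count and delays are placeholder values. The issue does not say whether OpenClaw exposes a retry setting, so this is an illustration of the technique only.

```python
import time
from typing import Callable, List

class ProviderServerError(Exception):
    """Hypothetical exception standing in for a provider-side server_error."""

def call_with_retry(
    request: Callable[[], str],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call request(), retrying on ProviderServerError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except ProviderServerError:
            if attempt == max_attempts:
                raise  # out of attempts: the caller can send a clean error message
            # Delay doubles each time: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")

# Usage: a fake request that fails twice, then succeeds on the third attempt.
attempts: List[int] = []
delays: List[float] = []

def flaky_request() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise ProviderServerError("server_error")
    return "ok"

# Delays are recorded instead of slept so the example runs instantly.
result = call_with_retry(flaky_request, sleep=delays.append)
```

Each attempt starts from a fresh request, so no partial output from a failed attempt is reused. That matters here because the reported corruption appears after a partially completed generation.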

Temporary Workarounds

  • Disable structured turns: Temporarily disable structured turns involving tools and file writes to prevent the failure mode from being triggered.
  • Use a different provider: Consider using a different provider or model to mitigate the issue until a permanent fix is implemented.

Verification

To verify that the fix worked, test the following scenarios:

  • Trigger a provider-side error and verify that the subsequent replies are not corrupted or truncated.
  • Test structured turns involving tools and file writes to ensure that the failure mode is no longer triggered.
  • Verify that malformed reply markers are no longer delivered to the user.
