openclaw - 💡(How to fix) Fix bug: openai-codex-responses provider leaks internal reasoning into assistantTexts (gpt-5.5) [2 comments, 2 participants]

Internal reasoning from the openai-codex-responses provider (gpt-5.5) appears in trajectory assistantTexts and gets forwarded to user-facing channels (e.g. Feishu groups via feishu_im_user_message). End users see English reasoning fragments mid-Chinese conversation.

This is structurally similar to the deepseek thinking leak from #72044, but for a different provider (openai-codex-responses, not openai-completions).

Related issues (different root causes):

#72044 — deepseek-v4-pro reasoning_content fix incomplete (sister issue, different provider)
#39032 — subagent tool-failure reasoning leak (different scope: subagent, tool-error trigger)
#67092 — </think> parsing failure (different mechanism)
#45041 — output-level filter feature request (orthogonal: this issue is a provider-level sanitization bug)

Environment

OpenClaw v2026.4.24
macOS 25.4.0 (Apple Silicon)
Provider: openai-codex (Responses API)
Model: gpt-5.5
Channel: feishu_im_user_message → Feishu group

Symptom

Main agent (agentId: main) using gpt-5.5 over openai-codex-responses produces user-facing assistant text where internal reasoning fragments appear unsanitized. The fragments have a distinctive telegraphic style: terse English with omitted articles ("Need write", "Need maybe", "first run") — clearly internal model reasoning, not normal assistant output style.

Reproduction

Run main agent with provider: openai-codex + modelId: gpt-5.5
Drive a multi-turn task involving spawning subagents and chained tool calls
Inspect trajectory data.assistantTexts — internal reasoning fragments appear
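
The inspection step can be scripted. The following is a minimal sketch, not OpenClaw tooling: it assumes the trajectory file is JSONL with one object per line and that leaked text sits in a list at data.assistantTexts, as the issue describes. The function names and the REASONING_HINT pattern are hypothetical; the pattern is derived only from the seven fragments quoted in this issue, so it is a heuristic that can miss leaks or flag legitimate English replies.

```python
import json
import re
from pathlib import Path
from typing import Iterator

# Heuristic (assumption): every fragment quoted in this issue opens with
# "Need " followed by one of these words. Not a property of the provider.
REASONING_HINT = re.compile(r"^\s*Need (write|maybe|spawn|since|progress|update)\b")


def iter_assistant_texts(path: Path) -> Iterator[tuple[int, str]]:
    """Yield (line_number, text) for each string under data.assistantTexts.

    Records that lack the assumed structure are skipped.
    """
    with path.open(encoding="utf-8") as fh:
        for lineno, raw in enumerate(fh, start=1):
            raw = raw.strip()
            if not raw:
                continue
            record = json.loads(raw)
            data = record.get("data") if isinstance(record, dict) else None
            texts = data.get("assistantTexts") if isinstance(data, dict) else None
            if not isinstance(texts, list):
                continue
            for text in texts:
                if isinstance(text, str):
                    yield lineno, text


def find_suspect_fragments(path: Path) -> list[tuple[int, str]]:
    """Return the (line_number, text) pairs that match the heuristic."""
    return [(n, t) for n, t in iter_assistant_texts(path) if REASONING_HINT.search(t)]
```

Calling find_suspect_fragments on the session's .trajectory.jsonl path returns the trajectory line numbers and texts worth reviewing by hand.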

Evidence (session 9aa089ac-ebb8-4fba-a8ca-f8a5676d66f8)

Trajectory file:

~/.openclaw/agents/main/sessions/9aa089ac-ebb8-4fba-a8ca-f8a5676d66f8.trajectory.jsonl

7 leaked reasoning fragments captured in trajectory assistantTexts:

"Need write SESSION updated? already perhaps but new choice. Need write not use edit. Then handoff files. Need include outdated GPT-5.3-codex maybe user supplied; but requires live? ..."

"Need maybe also fetch codex page for basic; but enough. Need maybe domestic preparation details user asks: network, OpenAI account, payment, install. ..."

"Need spawn. Need use agents_list? for reasoning available? probably. But can call sessions_spawn agentId reasoning. Explicit model. Use task concise but include handoff path ..."

"Need since note says don't poll, wait completions. Need maybe update plan completed spawn. Then sessions_yield."

"Need progress update every 2 min; for now immediately maybe user-visible? final? Since in tool phase, can send normal? But final later. ..."

"Need update records. TopicPipeline decision_status maybe 已发布 or 已完成? Skill says Joulian 反馈已发布后 update TopicPipeline decision_status=已发布 ..."

"Need maybe update SESSION-STATE. Also maybe update memory daily? User confirms publication; valuable. AGENTS says WAL if user message has decision/preference/key status ..."

Expected

Internal reasoning content (whatever the codex Responses API emits as "reasoning") should be filtered out of assistantTexts and out of any messaging-tool content the agent produces, similar to how deepseek's reasoning_content is intended to be sanitized (per the partial fix discussed in #72044).
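
As an illustration of the expected behavior, here is a hedged sketch of provider-side filtering. It assumes the provider receives output items shaped like the public Responses API, where reasoning arrives as items of type "reasoning" and visible text arrives as "output_text" parts inside assistant "message" items. The issue does not establish where in OpenClaw the leak occurs, so the function name visible_texts and its placement are hypothetical.

```python
from typing import Any


def visible_texts(output_items: list[dict[str, Any]]) -> list[str]:
    """Collect user-visible text from Responses-style output items.

    Only assistant "message" items contribute, and within them only
    non-empty "output_text" parts. Items of any other type, including
    "reasoning" and "function_call", are ignored.
    """
    texts: list[str] = []
    for item in output_items:
        if item.get("type") != "message" or item.get("role") != "assistant":
            continue
        for part in item.get("content") or []:
            if (
                isinstance(part, dict)
                and part.get("type") == "output_text"
                and part.get("text")
            ):
                texts.append(part["text"])
    return texts
```

An allow-list of this kind fails closed: a new or unexpected item type is dropped rather than forwarded. If the reasoning text is instead arriving inside message items, this filter alone would not catch it.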

Impact

User in Feishu sees English reasoning fragments mid-Chinese conversation (jarring UX in a Chinese-only content workflow)
Erodes trust in agent output when "thinking out loud" leaks into visible messages
Affects content-factory production workflow (assistantTexts → feishu_im_user_message → end user)
Also pollutes trajectories used for audit/debugging

Workaround (none satisfactory)

Prompt-level instructions cannot suppress provider-level reasoning emission (per #45041 research notes — "more reasoning steps = more leakage", and prompts work only ~80% reliably). This needs runtime sanitization in the openai-codex-responses provider — symmetric to whatever path handles deepseek's reasoning_content separation.

Extent analysis

TL;DR

The issue can be fixed by implementing runtime sanitization in the openai-codex-responses provider to filter out internal reasoning content from assistantTexts.

Guidance

Review the openai-codex-responses provider code to identify where internal reasoning content is being emitted and how it can be filtered out.
Investigate the deepseek provider's reasoning_content sanitization mechanism (mentioned in #72044) as a potential reference for implementing similar sanitization in the openai-codex-responses provider.
Consider adding a post-processing step to assistantTexts to remove any internal reasoning fragments before they are sent to user-facing channels.
Verify that the sanitization mechanism is working correctly by inspecting the assistantTexts trajectory and checking for the presence of internal reasoning fragments.
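
The post-processing and verification steps above could look like the following sketch. It assumes the runtime can capture the reasoning text for a turn separately from assistantTexts, which the issue does not confirm. The helper name strip_reasoning and the MIN_OVERLAP_CHARS threshold are hypothetical, and the threshold value is arbitrary.

```python
# Arbitrary threshold (assumption): shorter texts are only dropped on an
# exact match, so a brief reply is not removed just because it happens to
# occur inside a long reasoning string.
MIN_OVERLAP_CHARS = 20


def _normalize(text: str) -> str:
    """Collapse runs of whitespace so formatting differences do not matter."""
    return " ".join(text.split())


def strip_reasoning(assistant_texts: list[str], reasoning_texts: list[str]) -> list[str]:
    """Drop assistant texts that duplicate captured reasoning.

    A text is dropped when, after whitespace normalization, it equals a
    reasoning string or is a sufficiently long substring of one.
    """
    known = {_normalize(r) for r in reasoning_texts if r.strip()}
    kept: list[str] = []
    for text in assistant_texts:
        norm = _normalize(text)
        duplicated = any(
            norm == r or (len(norm) >= MIN_OVERLAP_CHARS and norm in r)
            for r in known
        )
        if norm and duplicated:
            continue  # matches captured reasoning; do not forward
        kept.append(text)
    return kept
```

For verification, a regression check can run strip_reasoning over a recorded turn and assert that the result equals the original assistantTexts, which means nothing in the visible text matched the captured reasoning.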

Example

The issue does not contain enough information about the provider's implementation to give a verified code example.

Notes

The issue is reported only for the openai-codex-responses provider with the gpt-5.5 model, so the fix may need to be tailored to that configuration. It resembles #72044, but the root cause and solution may differ.

Recommendation

Implement runtime sanitization in the openai-codex-responses provider. This is the fix rather than a workaround: prompt-level instructions are not sufficient to suppress provider-level reasoning emission.
