openclaw: Gateway-level model identity not passed to delivery hooks (enables silent model-tag forgery) [1 comment, 2 participants]

GitHub stats
openclaw/openclaw#69110, fetched 2026-04-20 12:01:38
Comments: 1 · Participants: 2 · Timeline events: 4 · Reactions: 0
Timeline (top): labeled ×2, commented ×1, cross-referenced ×1

Agents can run on model X while emitting [Model Y · Agent] self-tagged output. No enforcement layer verifies the self-tag matches the actual runtime model. The message_sending hook receives content but not model info; the llm_output hook has model info but is fire-and-forget. The two layers don't bridge, enabling silent model-tag forgery and allowing fabrication.

Root Cause

The message_sending hook receives the message content but no model info, while the llm_output hook has model info but is fire-and-forget. Nothing bridges the two layers, so no enforcement layer can verify that an agent's [Model Y · Agent] self-tag matches the model it actually ran on.

Fix / Workaround

See attached file p1-github-issue.md for full failure case documentation including:
    - Redacted excerpts of modelOverride values
    - Proposed patch as unified diff (3 lines of change to applyMessageSendingHook metadata)
    - Two retraction-ledger entries documenting fabricated outputs


Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

Agents can run on model X while emitting [Model Y · Agent] self-tagged output. No enforcement layer verifies the self-tag matches the actual runtime model. The message_sending hook receives content but not model info; the llm_output hook has model info but is fire-and-forget. The two layers don't bridge, enabling silent model-tag forgery and allowing fabrication.

Steps to reproduce

  1. Configure a session with a deprecated modelOverride (e.g., claude-3-opus-20240229) in sessions.json
  2. Send any message through the Telegram channel to that session
  3. Observe: session silently fails over to fallback model (xai/grok-4-fast)
  4. Observe: agent continues emitting self-tagged [Opus · Agent] output
  5. Observe: no server-side verification catches the mismatch

Expected behavior

When runtime model differs from agent self-tag, the gateway should detect the mismatch and either overwrite the tag with truthful model info, block the send, or log/alert. The applyMessageSendingHook metadata should include the actual model/provider used for the response.
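
The three outcomes named above (overwrite, block, log/alert) can be sketched as a single check. This is a minimal sketch, not OpenClaw code: `checkModelTag`, the `policy` values, the tag regex, and the loose substring comparison are all assumptions made for illustration. Only the `[Model · Agent]` tag format is taken from this issue.

```javascript
// Hypothetical sketch: none of these names exist in OpenClaw.
// Matches a leading self-tag such as "[Opus 4.5 · Agent]".
const TAG_PATTERN = /^\[([^\]·]+?)\s*·\s*([^\]]+?)\]/;

// Assumed comparison rule: lowercase and drop spaces, dots, dashes and
// underscores, so "Opus 4.5" can match "anthropic/claude-opus-4-5".
function normalize(s) {
  return s.toLowerCase().replace(/[\s.\-_]/g, "");
}

// Compare the self-tag in `content` with the model that actually ran.
// policy: "overwrite" | "block" | "log" (default).
function checkModelTag(content, runtimeModel, policy = "log") {
  const match = TAG_PATTERN.exec(content);
  if (!match) return { action: "pass", content };

  const claimed = match[1].trim();
  const role = match[2].trim();
  if (normalize(runtimeModel).includes(normalize(claimed))) {
    return { action: "pass", content };
  }

  if (policy === "block") {
    return { action: "block", content: null, claimed };
  }
  if (policy === "overwrite") {
    // Replacer function avoids "$" being treated as a replacement pattern.
    const fixed = content.replace(TAG_PATTERN, () => `[${runtimeModel} · ${role}]`);
    return { action: "overwrite", content: fixed, claimed };
  }
  // Default: deliver unchanged, but report the mismatch to the caller.
  return { action: "log", content, claimed };
}
```

The substring comparison is deliberately loose; a real check would more likely use an explicit table mapping display names to model identifiers.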

Actual behavior

Agent-reported tags pass through unverified. applyMessageSendingHook in deliver-BNvlWd4P.js only receives {channel, accountId, mediaUrls} — no model or provider info. Confirmed via two real failure cases on 2026-04-19: (1) Winter Telegram session on xai/grok-4-fast fabricated outputs while self-tagging Opus 4.5 for multiple hours; (2) separate Claude Code session on grok-4-fast fabricated file writes and test results while self-tagging Opus 4.5.

OpenClaw version

OpenClaw version: 2026.4.15

Operating system

Operating system: macOS 15.4

Install method

Install method: npm global

Model

Model: anthropic/claude-opus-4-5 (Winter), xai/grok-4-fast (subagents/Cortex)

Provider / routing chain

Provider / routing chain: openclaw -> anthropic-direct (Winter), openclaw -> xai-direct (Cortex/subagents)

Additional provider/model setup details

Winter Telegram session pinned via modelOverride: claude-opus-4-5 in sessions.json
Cortex subagent runs on xai/grok-4-fast by configuration
Config at ~/.openclaw/openclaw.json and ~/.openclaw/agents/main/sessions/sessions.json

Logs, screenshots, and evidence

See attached file p1-github-issue.md for full failure case documentation including:
    - Redacted excerpts of modelOverride values
    - Proposed patch as unified diff (3 lines of change to applyMessageSendingHook metadata)
    - Two retraction-ledger entries documenting fabricated outputs

Impact and severity

Affected: Any deployment using multiple model providers with session-level modelOverride routing
Severity: High. Enables silent fabrication of outputs that appear authoritative (tagged with incorrect model identity)
Frequency: Occurs whenever a session falls back to a different model than its configured modelOverride (deprecated model strings, API errors, rate limits)
Consequence: Downstream trust-boundary breaks; agent-tagged outputs cannot be verified as coming from the claimed model; fabrication becomes indistinguishable from legitimate output

Additional information

No response

extent analysis

TL;DR

To fix the issue, modify the applyMessageSendingHook metadata to include the actual model/provider used for the response, enabling verification of the agent-reported tags.

Guidance

  • Review the deliver-BNvlWd4P.js file to understand how the applyMessageSendingHook function is currently implemented and identify where the model/provider information can be added.
  • Update the applyMessageSendingHook metadata to include the actual model/provider used for the response, as proposed in the unified diff patch provided in the attached file p1-github-issue.md.
  • Verify that the updated applyMessageSendingHook function correctly includes the model/provider information in the metadata by checking the logs or output of the deliver-BNvlWd4P.js file.
  • Consider adding additional logging or alerting to detect and prevent silent model-tag forgery in the future.
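
Because llm_output sees the model while message_sending does not, something has to carry the model from one hook to the other. The sketch below is one possible bridge, assuming both hooks can see a shared run identifier. `runId`, `RunModelRegistry`, and its methods are hypothetical names; the issue does not say whether such an identifier exists or what the hook signatures look like.

```javascript
// Hypothetical sketch: class name, method names and runId are assumptions.
class RunModelRegistry {
  constructor(maxEntries = 1000) {
    this.maxEntries = maxEntries;
    this.models = new Map(); // runId -> { provider, model }
  }

  // Intended to be called from an llm_output-style hook.
  record(runId, provider, model) {
    // Re-recording a run refreshes its position instead of evicting another.
    this.models.delete(runId);
    if (this.models.size >= this.maxEntries) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.models.keys().next().value;
      this.models.delete(oldest);
    }
    this.models.set(runId, { provider, model });
  }

  // Intended to be called from a message_sending-style hook.
  // Returns null when no model was recorded for this run.
  lookup(runId) {
    return this.models.get(runId) ?? null;
  }
}
```

Since llm_output is described as fire-and-forget, ordering between the two hooks is not guaranteed. A `null` lookup should therefore be treated as "unverified" rather than as a match.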

Example

No code example is provided as the specific implementation details are not available, but the proposed patch in the attached file p1-github-issue.md can be used as a starting point.
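
For orientation only, here is a rough sketch of what the extended metadata could look like. It is not the patch from p1-github-issue.md, which is not reproduced on this page. The `channel`, `accountId`, and `mediaUrls` fields come from the issue's description of the current payload; `buildSendingMetadata`, the `runtime` argument, and the `provider` and `model` field names are assumptions.

```javascript
// Hypothetical sketch, not the proposed patch.
// `delivery` carries the fields the hook already receives, per the issue;
// `runtime` is an assumed object describing the model that actually ran.
function buildSendingMetadata(delivery, runtime) {
  return {
    channel: delivery.channel,
    accountId: delivery.accountId,
    mediaUrls: delivery.mediaUrls ?? [],
    // Assumed new fields. null means the runtime model is unknown,
    // which a hook can treat as "cannot verify".
    provider: runtime?.provider ?? null,
    model: runtime?.model ?? null,
  };
}
```

With fields like these available, a message_sending hook could compare the runtime model against any self-tag in the content before delivery.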

Notes

The fix may require additional changes to the openclaw.json and sessions.json configuration files to ensure that the correct model/provider information is being used.

Recommendation

Apply the proposed patch to the applyMessageSendingHook metadata to include the actual model/provider used for the response, as this will enable verification of the agent-reported tags and prevent silent model-tag forgery.


