openclaw - ✅(Solved) Fix [Bug]: Discord multi-agent channels churn Codex native threads; suspect dynamic-tool fingerprint changes beyond owner-only tool flips [1 pull request, 1 comment, 2 participants]

GitHub stats
openclaw/openclaw#84086 (fetched 2026-05-20 03:44:12)
Comments: 1
Participants: 2
Timeline: 7
Reactions: 1
Timeline (top): labeled ×4, cross-referenced ×2, commented ×1

In OpenClaw v2026.5.18, a long-running Discord channel session using the Codex app-server runtime repeatedly starts new native Codex threads when human owner messages and another agent/bot's Discord messages are interleaved in the same channel.

This conflicts with the documented ambient-room-events contract in docs/channels/ambient-room-events.md:191, which says Discord keeps room-event history until a visible Discord send succeeds so quiet context is not lost before message-tool delivery. In practice, the runtime preserves OpenClaw mirrored history, but the native Codex thread binding churns and repeatedly abandons the accumulated Codex thread/cache state.

The observed churn overlaps with #76179's owner-only tool-count flip class, but the production trace also shows same-tool-count native thread changes. That suggests tool count alone does not explain the channel's churn, and a second fingerprint/input surface may be involved.

One plausible additional trigger is room_event / message_tool_only changing the message tool between deferred and direct registration:

  1. Discord room_event turns force sourceReplyDeliveryMode = "message_tool_only".
  2. message_tool_only forces the message tool into directToolNames.
  3. The Codex dynamic tool spec for message changes shape: direct specs omit namespace / deferLoading, while deferred specs include them.
  4. dynamicToolsFingerprint includes those fields, excluding only description.
  5. startOrResumeThread can see the fingerprint mismatch, clear the binding, and start a new native Codex thread.
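
The suspected mechanism in steps 3 and 4 can be sketched as follows. This is a minimal illustration, not the OpenClaw implementation: `buildToolSpec`, `fingerprintTools`, `toolsForTurn`, the `"openclaw"` namespace value, and the `"automatic"` mode name are all assumptions made for the example. It shows two turns with the same tool count whose fingerprints still differ because only the shape of the message spec changed.

```typescript
// Hypothetical sketch: none of these names are taken from the OpenClaw source.
type ReplyMode = "automatic" | "message_tool_only";
type ToolSpec = { name: string; description: string; namespace?: string; deferLoading?: boolean };

// Direct specs return the base shape; deferred specs add namespace and deferLoading.
function buildToolSpec(name: string, direct: boolean): ToolSpec {
  const base: ToolSpec = { name, description: `Tool ${name}` };
  return direct ? base : { ...base, namespace: "openclaw", deferLoading: true };
}

// Fingerprint over name, namespace, and deferLoading. The description is left out,
// and tools are sorted by name so ordering alone never changes the result.
function fingerprintTools(specs: ToolSpec[]): string {
  const rows = [...specs]
    .sort((a, b) => a.name.localeCompare(b.name))
    .map((spec) => [spec.name, spec.namespace ?? null, spec.deferLoading ?? null]);
  return JSON.stringify(rows);
}

// Assumed behavior before the fix: message is direct only on message_tool_only turns.
function toolsForTurn(mode: ReplyMode): ToolSpec[] {
  const messageIsDirect = mode === "message_tool_only";
  return [buildToolSpec("message", messageIsDirect), buildToolSpec("web_search", false)];
}

const automaticFingerprint = fingerprintTools(toolsForTurn("automatic"));
const roomEventFingerprint = fingerprintTools(toolsForTurn("message_tool_only"));

// Same tool count (2 and 2), different fingerprint.
console.log(automaticFingerprint === roomEventFingerprint); // false
```

Under these assumptions, tool count is not a sufficient signal, which is consistent with the same-tool-count thread changes in the production trace.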

This report should be treated as a Discord multi-agent churn investigation unless a maintainer can confirm that #83367/#83369 fully covers this channel pattern.


PR fix notes

PR #84211: fix(codex): keep message dynamic tool direct

Description (problem / solution / changelog)

Summary

  • Keep the OpenClaw message dynamic tool direct for Codex app-server threads in every source-reply mode.
  • Preserve deferred loading for other searchable OpenClaw tools, including web_search, while keeping sessions_yield direct.
  • Add a two-turn regression showing an automatic turn followed by message_tool_only resumes the existing Codex thread and still carries the visible-reply instruction.

Fixes #84086.

Real behavior proof

Behavior or issue addressed: Discord room-event turns can flip sourceReplyDeliveryMode to message_tool_only; when message alternated between deferred and direct dynamic-tool specs, Codex native thread bindings could churn even when the tool count stayed the same.

Real environment tested: Local OpenClaw checkout on macOS using the real Codex dynamic-tool bridge and fingerprint code with mocked tool implementations.

Exact steps or command run after this patch: PATH=/Users/andy/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:$PATH node --import tsx /private/tmp/proof-84086.mjs

Evidence after fix:

openclaw-codex-message-tool-contract-proof=ok
automatic_message_direct=true
message_tool_only_message_direct=true
fingerprints_equal=true
deferred_tools=web_search

Observed result after fix: The message dynamic tool is direct for both ordinary and message_tool_only Codex turns, the dynamic-tool fingerprint is stable across those modes, and unrelated tools remain deferred/searchable.

What was not tested: No live Discord production trajectory was replayed; the proof exercises the real Codex dynamic-tool construction and fingerprint path locally.
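
The contract described in the PR summary can be sketched roughly as below. This is an illustration under stated assumptions, not the code from `dynamic-tools.ts`: `planTurn`, `specFingerprint`, `ALWAYS_DIRECT`, the `mustReplyViaMessageTool` flag, and the `"openclaw"` namespace value are hypothetical names. The point is that the reply mode changes only a per-turn instruction, while the tool specs stay identical across modes.

```typescript
// Hypothetical sketch of the post-fix contract; names are illustrative only.
type ReplyMode = "automatic" | "message_tool_only";
type DynamicToolSpec = { name: string; namespace?: string; deferLoading?: boolean };

// Per the PR summary, message and sessions_yield stay direct in every mode.
const ALWAYS_DIRECT: ReadonlySet<string> = new Set(["message", "sessions_yield"]);

function planTurn(toolNames: string[], mode: ReplyMode) {
  // Spec shape depends only on the tool name, never on the reply mode.
  const specs = toolNames.map((name): DynamicToolSpec =>
    ALWAYS_DIRECT.has(name) ? { name } : { name, namespace: "openclaw", deferLoading: true },
  );
  // The mode still drives a per-turn instruction to reply through the message tool.
  return { specs, mustReplyViaMessageTool: mode === "message_tool_only" };
}

function specFingerprint(specs: DynamicToolSpec[]): string {
  const rows = [...specs]
    .sort((a, b) => a.name.localeCompare(b.name))
    .map((s) => [s.name, s.namespace ?? null, s.deferLoading ?? null]);
  return JSON.stringify(rows);
}

const toolNames = ["message", "sessions_yield", "web_search"];
const automaticTurn = planTurn(toolNames, "automatic");
const roomEventTurn = planTurn(toolNames, "message_tool_only");

const fingerprintsEqual =
  specFingerprint(automaticTurn.specs) === specFingerprint(roomEventTurn.specs);
const deferredTools = roomEventTurn.specs.filter((s) => s.deferLoading).map((s) => s.name);

console.log({ fingerprintsEqual, deferredTools });
```

With this shape, other searchable tools such as web_search remain deferred, and only the per-turn flag differs between the two turns.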

Validation

  • NODE_OPTIONS=--max-old-space-size=8192 OPENCLAW_VITEST_MAX_WORKERS=1 PATH=/Users/andy/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:$PATH node scripts/run-vitest.mjs extensions/codex/src/app-server/dynamic-tools.test.ts -t "keeps turn-yield direct" --pool forks --maxWorkers 1 --vmMemoryLimit 8192MB
  • NODE_OPTIONS=--max-old-space-size=8192 OPENCLAW_VITEST_MAX_WORKERS=1 PATH=/Users/andy/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:$PATH node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts -t "resumes Codex threads when source reply mode toggles" --pool forks --maxWorkers 1 --vmMemoryLimit 8192MB
  • NODE_OPTIONS=--max-old-space-size=8192 OPENCLAW_VITEST_MAX_WORKERS=1 PATH=/Users/andy/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:$PATH node scripts/run-vitest.mjs extensions/codex/src/app-server/dynamic-tools.test.ts extensions/codex/src/app-server/thread-lifecycle.test.ts extensions/codex/src/app-server/run-attempt.test.ts --pool forks --maxWorkers 1 --vmMemoryLimit 8192MB
  • NODE_OPTIONS=--max-old-space-size=8192 OPENCLAW_VITEST_MAX_WORKERS=1 PATH=/Users/andy/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:$PATH node scripts/run-vitest.mjs src/auto-reply/reply/source-reply-delivery-mode.test.ts src/auto-reply/reply/get-reply-run.media-only.test.ts src/agents/tools/message-tool.test.ts --pool forks --maxWorkers 1 --vmMemoryLimit 8192MB
  • PATH=/Users/andy/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:/Users/andy/openclaw-84086/node_modules/.bin:$PATH oxfmt --check extensions/codex/src/app-server/dynamic-tools.ts extensions/codex/src/app-server/dynamic-tools.test.ts extensions/codex/src/app-server/run-attempt.test.ts
  • PATH=/Users/andy/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:/Users/andy/openclaw-84086/node_modules/.bin:$PATH oxlint extensions/codex/src/app-server/dynamic-tools.ts extensions/codex/src/app-server/dynamic-tools.test.ts extensions/codex/src/app-server/run-attempt.test.ts
  • git diff --check

Maintainer note

If this PR is squashed or reworked, please preserve author attribution or include:

Co-authored-by: Andy Ye <[email protected]>

Changed files

  • extensions/codex/src/app-server/dynamic-tools.test.ts (modified, +2/-5)
  • extensions/codex/src/app-server/dynamic-tools.ts (modified, +4/-3)
  • extensions/codex/src/app-server/run-attempt.test.ts (modified, +73/-0)


Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Steps to reproduce

  1. Use OpenClaw v2026.5.18 with an agent configured for Codex app-server runtime:
    provider=openai-codex
    modelApi=openai-codex-responses
    modelId=gpt-5.5
    contextTokenBudget=272000
  2. Connect the agent to a Discord guild/channel session:
    agent:main:discord:channel:<redacted-channel-id>
  3. Have both a human owner and another OpenClaw/agent app post in the same channel.
  4. Let the main agent receive both:
    • direct human Discord channel messages
    • other-agent/bot messages mirrored as quiet room_event context
  5. Inspect the session trajectory:
    ~/.openclaw/agents/main/sessions/<session-id>.trajectory.jsonl
  6. Compare adjacent session.started events by data.threadId and data.toolCount, and read the sender for each turn from the adjacent model.completed event at data.messagesSnapshot[0].
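
The comparison in step 6 could be scripted along these lines. This is a sketch that assumes each trajectory line is a JSON object with a top-level `type` field and a `data` payload carrying `threadId` and `toolCount`; the `type` field name and the `summarizeChurn` helper are assumptions, not confirmed details of the trajectory format. The sample reuses the first thread IDs from the representative starts in this report.

```typescript
// Hypothetical analysis helper; the event envelope (`type`, `data`) is assumed.
type Start = { threadId: string; toolCount: number };
type ChurnSummary = {
  starts: number;
  uniqueThreads: number;
  sameToolCountChanges: number;
  differentToolCountChanges: number;
};

function summarizeChurn(jsonl: string): ChurnSummary {
  const starts: Start[] = [];
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    const event = JSON.parse(line);
    if (event?.type !== "session.started") continue;
    const { threadId, toolCount } = event.data ?? {};
    if (typeof threadId === "string" && typeof toolCount === "number") {
      starts.push({ threadId, toolCount });
    }
  }

  const summary: ChurnSummary = {
    starts: starts.length,
    uniqueThreads: new Set(starts.map((s) => s.threadId)).size,
    sameToolCountChanges: 0,
    differentToolCountChanges: 0,
  };

  // Classify each adjacent pair whose native thread ID changed.
  let previous: Start | undefined;
  for (const current of starts) {
    if (previous && current.threadId !== previous.threadId) {
      if (current.toolCount === previous.toolCount) summary.sameToolCountChanges += 1;
      else summary.differentToolCountChanges += 1;
    }
    previous = current;
  }
  return summary;
}

// Small in-memory sample; a real run would pass the contents of the trajectory file.
const sampleJsonl = [
  { type: "session.started", data: { threadId: "019e3d90-f1e", toolCount: 26 } },
  { type: "model.completed", data: {} },
  { type: "session.started", data: { threadId: "019e3d91-c27", toolCount: 24 } },
  { type: "session.started", data: { threadId: "019e3d92-69a", toolCount: 24 } },
].map((event) => JSON.stringify(event)).join("\n");

const sampleSummary = summarizeChurn(sampleJsonl);
console.log(sampleSummary);
```

The same-tool-count bucket is the interesting one here, since those changes cannot be explained by an owner-only tool-count flip.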

Expected behavior

The same OpenClaw Discord channel session should generally resume the same native Codex thread across interleaved human and other-agent messages.

The documented room-event history contract says:

Supported room-event channels keep recent ambient room messages as context. Discord keeps room-event history until a visible Discord send succeeds, so quiet context is not lost before message-tool delivery.

room_event and message_tool_only are delivery/visibility concerns. They should not, by themselves, make the native Codex thread incompatible.

The current turn must still enforce delivery and authorization correctly:

  • room-event replies should still use the required message-tool path when needed;
  • unauthorized or unavailable message sends should still fail closed;
  • but the thread-level dynamic tool schema should remain stable unless the actual tool contract changes.
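
One way to read the three bullets above is that the thread-level catalog stays fixed while each turn is checked separately at execution time. The sketch below illustrates that split under loud assumptions: `authorizeToolCall`, `TurnContext`, the catalog contents, and the choice of `cron` as an owner-only tool are all hypothetical and are not taken from the OpenClaw source.

```typescript
// Hypothetical per-turn policy check; every name here is illustrative.
type TurnContext = {
  sourceReplyDeliveryMode: "automatic" | "message_tool_only";
  senderIsOwner: boolean;
  messageSendAvailable: boolean;
};
type Decision = { allowed: true } | { allowed: false; reason: string };

// Thread-level catalog: identical on every turn, so it never changes the fingerprint.
const THREAD_TOOL_CATALOG: ReadonlySet<string> = new Set(["message", "web_search", "cron"]);
// Assumed owner-only tool for the example.
const OWNER_ONLY_TOOLS: ReadonlySet<string> = new Set(["cron"]);

const deny = (reason: string): Decision => ({ allowed: false, reason });

// Fail closed: anything not explicitly permitted for this turn is denied.
function authorizeToolCall(turn: TurnContext, tool: string): Decision {
  if (!THREAD_TOOL_CATALOG.has(tool)) return deny("unknown tool");
  if (OWNER_ONLY_TOOLS.has(tool) && !turn.senderIsOwner) return deny("owner-only tool");
  if (tool === "message" && !turn.messageSendAvailable) return deny("message send unavailable");
  return { allowed: true };
}

// Room-event turns must deliver visible replies through the message tool.
function requiresMessageTool(turn: TurnContext): boolean {
  return turn.sourceReplyDeliveryMode === "message_tool_only";
}

const roomEventTurnContext: TurnContext = {
  sourceReplyDeliveryMode: "message_tool_only",
  senderIsOwner: false,
  messageSendAvailable: true,
};
console.log(authorizeToolCall(roomEventTurnContext, "cron").allowed); // false
console.log(requiresMessageTool(roomEventTurnContext)); // true
```

In this reading, authorization and delivery vary per turn, but nothing about them needs to alter the tool schema that the native thread was started with.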

Actual behavior

The affected Discord channel repeatedly starts new native Codex threads.

Observed on v2026.5.18 (OpenClaw 2026.5.18 (50a2481)) in one production Discord channel session:

Session shape:
agent:main:discord:channel:<redacted-channel-id>

Trajectory window:
2026-05-19T00:09:29.902Z through 2026-05-19T09:12:50.851Z

session.started events: 249
unique native Codex thread IDs: 215
adjacent thread ID changes: 214

toolCount distribution:
24 tools: 121 starts
26 tools: 39 starts
27 tools: 89 starts

Thread changes by adjacent tool-count transition:

different-toolCount thread changes: 198
same-toolCount thread changes: 19

The different-toolCount changes are consistent with #76179's owner-only/dynamic-tool availability class. The 19 same-toolCount changes are why this channel likely needs a separate investigation instead of being closed as obviously identical to #76179.

Sender split from the same trajectory:

Other agent/app sender:
121 turns
toolCount=24 for all 121 turns
102 unique native Codex thread IDs

Human owner sender:
128 turns
toolCount=26 for 39 turns
toolCount=27 for 89 turns
113 unique native Codex thread IDs

Representative adjacent starts:

2026-05-19T00:10:23Z prevTools=26 newTools=24 prevThread=019e3d90-f1e newThread=019e3d91-c27
2026-05-19T00:11:06Z prevTools=24 newTools=24 prevThread=019e3d91-c27 newThread=019e3d92-69a
2026-05-19T00:11:32Z prevTools=24 newTools=27 prevThread=019e3d92-69a newThread=019e3d92-d08
2026-05-19T00:12:26Z prevTools=27 newTools=24 prevThread=019e3d92-d08 newThread=019e3d93-a5a
2026-05-19T00:18:01Z prevTools=24 newTools=26 prevThread=019e3d94-956 newThread=019e3d98-be9
2026-05-19T00:19:36Z prevTools=26 newTools=24 prevThread=019e3d98-be9 newThread=019e3d9a-329
2026-05-19T00:46:12Z prevTools=27 newTools=27 prevThread=019e3db2-615 newThread=019e3db2-8f7

For comparison, a same-agent Discord channel without the mixed-agent traffic pattern showed far less churn in the same period:

session.started events: 15
unique native Codex thread IDs: 5
normal toolCount=27 starts: 14

OpenClaw version

v2026.5.18 (50a2481)

Operating system

Darwin 25.4.0

Install method

npm global

Model

openai-codex/gpt-5.5 via Codex app-server
modelApi=openai-codex-responses
contextTokenBudget=272000

Provider / routing chain

Discord guild channel -> OpenClaw agent runtime -> Codex app-server runtime -> GPT-5.5

Additional provider/model setup details

messages.groupChat.unmentionedInbound = "room_event"
messages.groupChat.visibleReplies = "message_tool"

The affected session is a Discord guild/channel session using the Codex app-server runtime. Channel IDs, user IDs, auth profile details, and local file paths are redacted.

Logs, screenshots, and evidence

Observed trajectory file, redacted:

~/.openclaw/agents/main/sessions/<session-id>.trajectory.jsonl

The raw session contains private channel content, so the public issue should use the aggregate counts above unless maintainers request a sanitized excerpt.

Relevant v2026.5.18 source references:

src/auto-reply/reply/source-reply-delivery-mode.ts:45-47
room_event forces message_tool_only.

src/auto-reply/reply/dispatch-from-config.ts:772-786
room_event/message_tool_only makes the runtime also-allow message.

src/auto-reply/reply/dispatch-from-config.ts:831-856
sourceReplyDeliveryMode is resolved and attached as message_tool_only.

src/auto-reply/reply/get-reply-run.ts:1135
room_event passes sourceReplyDeliveryMode: "message_tool_only" into the run.

extensions/codex/src/app-server/run-attempt.ts:906-910
message_tool_only adds directToolNames: ["message"].

extensions/codex/src/app-server/run-attempt.ts:3204-3206
shouldForceMessageTool is true when sourceReplyDeliveryMode is message_tool_only.

extensions/codex/src/app-server/dynamic-tools.ts:204-221
direct tool specs return the base spec, while deferred specs add namespace and deferLoading.

extensions/codex/src/app-server/thread-lifecycle.ts:743-768
dynamicToolsFingerprint includes the stabilized tool spec and excludes only description.

extensions/codex/src/app-server/thread-lifecycle.ts:220-249
dynamic tool catalog mismatch clears the binding instead of resuming the existing thread.

extensions/codex/src/app-server/run-attempt.ts:1082-1092
when startupBinding.threadId is absent or the fingerprint is incompatible, the legacy mirrored-history projection rebuild path executes.

Related issues and why this appears distinct:

Closest existing issue:

#76179 / PR #83367 / PR #83369:
Codex app-server restarts native threads when owner-only tool availability flips.

This is related and likely explains the cron/gateway/nodes tool-count-change portion of this channel's churn. However, this production trace also has 19 same-toolCount thread changes, so a patch for #76179 should be tested against this Discord multi-agent channel before assuming the churn is fully fixed.

Older related issue:

#69876:
Codex harness dynamicToolsFingerprint churn across runtime surfaces.

That issue is closed, and v2026.5.18 already excludes description from the fingerprint. The current observed churn still reproduces on v2026.5.18.

Related but not this:

#82134:
Discord channel delivery suppressed by sourceReplyDeliveryMode: message_tool_only.

That issue concerns visible Discord delivery suppression, not native Codex thread churn or context continuity.

Separate downstream damage:

Legacy mirrored-history fallback may cap rebuilt history at 24,000 rendered chars.

That is tracked separately in #84084. It makes each new native thread much more damaging, but it does not explain why the new native threads are created in the first place.

Impact and severity

Affected: long-running Discord guild/channel sessions using Codex app-server runtime where the agent receives both human messages and other-agent/bot messages in the same channel.

Severity: High for live multi-agent channels. The user remains in the same OpenClaw session, but Codex native thread continuity is repeatedly lost.

Observed impact:

  • native Codex thread restarted 214 times across 249 starts in roughly 9 hours;
  • context status stayed much lower on the affected channel than on a stable channel;
  • the agent repeatedly lost recent corrections/decisions and answered from stale or partial reconstructed context.

Additional information

Suggested fix shape, not normative:

First verify which binding-compatibility check is clearing the thread binding for the same-toolCount transitions. If it is dynamic-tool fingerprint churn, stabilize the Codex dynamic tool schema across source reply delivery modes.

Possible approaches:

  1. Keep the message tool registered with the same direct/deferred shape for the lifetime of a Codex app-server thread, and enforce per-turn message-tool requirements in the bridge/runtime policy.
  2. If direct vs deferred loading is not a semantic schema contract change for thread resume, normalize namespace / deferLoading out of the compatibility fingerprint for otherwise-identical tools.
  3. If Codex app-server cannot safely resume across direct/deferred changes, prefer always registering message in one stable mode for Discord channel sessions and enforce source-delivery behavior separately.
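
Approach 2 could look roughly like the following. This is a sketch, not a proposed patch: `compatibilityFingerprint`, `CatalogTool`, and the `inputSchema` field are assumed names, and it only makes sense if direct versus deferred loading is confirmed to be safe for thread resume. The loading-mode fields are dropped from the comparison, while a change to the tool's input schema still invalidates the fingerprint.

```typescript
// Hypothetical normalization for approach 2; field and function names are assumptions.
type CatalogTool = {
  name: string;
  description?: string;
  inputSchema: unknown;
  namespace?: string;
  deferLoading?: boolean;
};

// Keep only the fields treated as the tool contract: name and input schema.
// description, namespace, and deferLoading are deliberately left out.
// Assumes schemas are constructed with a stable key order.
function compatibilityKey(tool: CatalogTool): string {
  return JSON.stringify({ name: tool.name, inputSchema: tool.inputSchema });
}

function compatibilityFingerprint(tools: CatalogTool[]): string {
  return tools.map(compatibilityKey).sort().join("\n");
}

const messageSchema = { type: "object", properties: { text: { type: "string" } } };
const deferredMessage: CatalogTool = {
  name: "message",
  inputSchema: messageSchema,
  namespace: "openclaw",
  deferLoading: true,
};
const directMessage: CatalogTool = { name: "message", inputSchema: messageSchema };
const changedMessage: CatalogTool = {
  name: "message",
  inputSchema: { type: "object", properties: { body: { type: "string" } } },
};

const loadingModeIgnored =
  compatibilityFingerprint([deferredMessage]) === compatibilityFingerprint([directMessage]);
const contractChangeDetected =
  compatibilityFingerprint([directMessage]) !== compatibilityFingerprint([changedMessage]);

console.log({ loadingModeIgnored, contractChangeDetected });
```

The trade-off is that the fingerprint no longer detects a loading-mode change at all, so approaches 1 and 3 are the safer choice if Codex cannot resume across that change.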

The invariant should be:

Delivery mode and per-turn authorization can change per turn; the native Codex thread should only be invalidated when the actual tool contract changes in a way Codex cannot resume.

Regression test suggestion:

Add a Codex app-server test that simulates one Discord channel session with alternating turns:

  1. normal human channel message (sourceReplyDeliveryMode automatic);
  2. other-agent room event (InboundEventKind = "room_event", sourceReplyDeliveryMode = "message_tool_only");
  3. normal human channel message again.

Assert:

  • one thread/start;
  • subsequent turns use thread/resume with the same native thread ID;
  • the dynamic tool fingerprint remains compatible across those turns;
  • message remains available/required for room-event delivery;
  • unauthorized message/owner-only tool execution still fails closed.

If this test already passes after #83367/#83369, add a second regression for same-toolCount adjacent turns from a mixed Discord channel so the remaining binding invalidation reason is captured.
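
A compact model of the suggested regression is sketched below. It does not use the real OpenClaw test utilities: `FakeCodexAppServer`, `runSession`, and the fingerprint strings are stand-ins invented for the example, and only the `thread/start` and `thread/resume` method names come from the assertions listed above. It runs the three alternating turns once with a stable fingerprint and once with a fingerprint that flips with the reply mode.

```typescript
// Hypothetical in-memory harness; it models only the start-or-resume decision.
type Mode = "automatic" | "message_tool_only";
type Binding = { threadId: string; fingerprint: string };
type Call = { method: "thread/start" | "thread/resume"; threadId: string };

class FakeCodexAppServer {
  readonly calls: Call[] = [];
  private nextId = 1;
  private binding: Binding | undefined;

  // Resume when the stored fingerprint matches; otherwise replace the binding
  // and start a new thread.
  startOrResume(fingerprint: string): string {
    if (this.binding && this.binding.fingerprint === fingerprint) {
      this.calls.push({ method: "thread/resume", threadId: this.binding.threadId });
      return this.binding.threadId;
    }
    const threadId = `thread-${this.nextId++}`;
    this.binding = { threadId, fingerprint };
    this.calls.push({ method: "thread/start", threadId });
    return threadId;
  }
}

// Human message, other-agent room event, human message again.
const turns: Mode[] = ["automatic", "message_tool_only", "automatic"];

// Stable: message is direct in every mode. Flipping: message is direct only for room events.
const stableFingerprint = (_mode: Mode): string => "message:direct|web_search:deferred";
const flippingFingerprint = (mode: Mode): string =>
  mode === "message_tool_only"
    ? "message:direct|web_search:deferred"
    : "message:deferred|web_search:deferred";

function runSession(fingerprintFor: (mode: Mode) => string): Call[] {
  const server = new FakeCodexAppServer();
  for (const mode of turns) server.startOrResume(fingerprintFor(mode));
  return server.calls;
}

const countStarts = (calls: Call[]): number =>
  calls.filter((call) => call.method === "thread/start").length;

const stableCalls = runSession(stableFingerprint);
const flippingCalls = runSession(flippingFingerprint);

console.log(countStarts(stableCalls)); // 1
console.log(countStarts(flippingCalls)); // 3
```

A real regression would also need to cover the delivery and fail-closed assertions, which this model leaves out.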
