openclaw: Failure-fallback path bypasses agents.defaults.silentReply policy in groups/channels (internal inconsistency) [3 pull requests]


Summary

agent-runner-execution.ts's failure-fallback path silently bypasses the agents.defaults.silentReply / surfaces.<id>.silentReply policy that the rest of the auto-reply pipeline already honors. As documented, those policies are the user-visible knob for whether NO_REPLY-style silent payloads are delivered or rewritten in direct / group / internal conversations.

But on the runReplyAgent failure path, the function resolveExternalRunFailureTextForConversation() makes its own hardcoded decision — "in non-direct contexts, force SILENT_REPLY_TOKEN" — without ever consulting the resolved silentReply policy. This means an operator cannot opt their group into seeing failure-fallback copy by setting agents.defaults.silentReply.group: "disallow", even though that knob is documented as the policy override.

The result for end users in groups/channels: an LLM run fails, the agent reads the message, and there is no visible reply or notice. Operators have no documented config to change that behavior.

Why this is an internal inconsistency, not a design change

Two paths in src/auto-reply/reply/ decide whether silent output is delivered:

  1. route-reply.ts:127 — calls resolveSilentReplyPolicy({...}) from src/config/silent-reply.ts and only preserves silent payloads when the resolved policy is not "allow". This is the documented contract.

  2. agent-runner-execution.ts:472-486, where resolveExternalRunFailureTextForConversation short-circuits to SILENT_REPLY_TOKEN based purely on isNonDirectConversationContext(ctx) + isGenericRunnerFailure. It never imports or calls resolveSilentReplyPolicy.

Path 1 is the one users discover from docs/concepts/messages.md. Path 2 silently overrides path 1 for the failure-fallback case. That's the bug — they should agree.

Repro

OpenClaw 2026.5.6 (also reproduced against main @ be166b9), Feishu group, config:

{
  "agents": {
    "defaults": {
      "silentReply": { "group": "disallow", "direct": "disallow" }
    }
  },
  "models": { "providers": { "litellm": { "timeoutSeconds": 120 } } }
}

Trigger: send a complex message to the agent in the group such that all fallback model attempts time out (or any other path that lands in the generic runner-failure branch with verbose=off).

Expected: per silentReply.group: "disallow", the group sees a sanitized fallback copy.

Actual: silent — replies=0 in Feishu logs, no user-visible message.

Root cause (source citations)

src/auto-reply/reply/agent-runner-execution.ts @ be166b9:

function resolveExternalRunFailureTextForConversation(params: {
  text: string;
  sessionCtx: TemplateContext;
  isGenericRunnerFailure: boolean;
}): string {
  if (!isNonDirectConversationContext(params.sessionCtx)) {
    return params.text;
  }
  if (!params.isGenericRunnerFailure && !params.text.includes(AGENT_FAILED_BEFORE_REPLY_TEXT)) {
    return params.text;
  }
  return SILENT_REPLY_TOKEN;
}

This function is called only from the failure-reply payload builders (buildKnownAgentRunFailureReplyPayload and the embedded-runner catch path near line 2316). Neither call site checks resolveSilentReplyPolicy.
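The branching can be exercised in isolation to show that no configuration reaches it. The following is a minimal sketch, not the OpenClaw source: TemplateContextSketch, its chatType field, the stubbed isNonDirectConversationContext, and both constant values are assumptions made only so the snippet runs on its own. The function body mirrors the one quoted above.

```typescript
// Hypothetical stand-ins; the real TemplateContext, helper, and constants
// live in the OpenClaw source and may look different.
type TemplateContextSketch = { chatType: "direct" | "group" | "channel" };
const SILENT_REPLY_TOKEN = "NO_REPLY"; // assumed value
const AGENT_FAILED_BEFORE_REPLY_TEXT = "Agent failed before reply"; // assumed value

function isNonDirectConversationContext(ctx: TemplateContextSketch): boolean {
  return ctx.chatType !== "direct";
}

// Same branching as the function quoted from agent-runner-execution.ts.
// Note that the parameters carry no silentReply policy at all.
function resolveExternalRunFailureTextForConversation(params: {
  text: string;
  sessionCtx: TemplateContextSketch;
  isGenericRunnerFailure: boolean;
}): string {
  if (!isNonDirectConversationContext(params.sessionCtx)) {
    return params.text;
  }
  if (!params.isGenericRunnerFailure && !params.text.includes(AGENT_FAILED_BEFORE_REPLY_TEXT)) {
    return params.text;
  }
  return SILENT_REPLY_TOKEN;
}

// A generic runner failure in a group is always silenced, whatever the
// operator configured under agents.defaults.silentReply.
const groupResult = resolveExternalRunFailureTextForConversation({
  text: "The model timed out.",
  sessionCtx: { chatType: "group" },
  isGenericRunnerFailure: true,
});

// The same failure in a direct conversation keeps its text.
const directResult = resolveExternalRunFailureTextForConversation({
  text: "The model timed out.",
  sessionCtx: { chatType: "direct" },
  isGenericRunnerFailure: true,
});
```

Because the signature has no policy input, setting silentReply.group to "disallow" cannot change groupResult in this sketch, which matches the repro.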

Proposal

Make resolveExternalRunFailureTextForConversation call resolveSilentReplyPolicy({...}) (already exported from src/config/silent-reply.ts) and only return SILENT_REPLY_TOKEN when policy resolves to "allow" for the conversation type. Defaults remain unchanged ({ direct: "disallow", group: "allow", internal: "allow" }), so the out-of-the-box behavior — including the documented "groups don't see gateway boilerplate" promise from docs/concepts/messages.md — is preserved verbatim.

The only behavioral change: when an operator explicitly sets agents.defaults.silentReply.group: "disallow" (or the per-surface override), the failure-fallback copy now respects that override, matching what route-reply.ts already does for non-failure silent payloads.
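One way the proposal could look is sketched below. This is an illustration under stated assumptions, not the actual patch: resolveSilentReplyPolicySketch is a simplified stand-in for resolveSilentReplyPolicy (whose real signature is not shown in this issue), the ConversationType and SilentReplyConfig types and both constant values are hypothetical, and per-surface overrides are left out for brevity. The defaults are the ones quoted in the proposal.

```typescript
// Hypothetical types and constants for illustration only.
type SilentReplyPolicy = "allow" | "disallow";
type ConversationType = "direct" | "group" | "internal";
type SilentReplyConfig = Partial<Record<ConversationType, SilentReplyPolicy>>;

const SILENT_REPLY_TOKEN = "NO_REPLY"; // assumed value
const AGENT_FAILED_BEFORE_REPLY_TEXT = "Agent failed before reply"; // assumed value

// Defaults as stated in the proposal.
const DEFAULT_SILENT_REPLY: Record<ConversationType, SilentReplyPolicy> = {
  direct: "disallow",
  group: "allow",
  internal: "allow",
};

// Simplified stand-in for resolveSilentReplyPolicy from src/config/silent-reply.ts:
// an explicit operator setting wins, otherwise the default applies.
function resolveSilentReplyPolicySketch(
  conversationType: ConversationType,
  configured?: SilentReplyConfig,
): SilentReplyPolicy {
  return configured?.[conversationType] ?? DEFAULT_SILENT_REPLY[conversationType];
}

function resolveFailureTextSketch(params: {
  text: string;
  conversationType: ConversationType;
  isGenericRunnerFailure: boolean;
  silentReply?: SilentReplyConfig;
}): string {
  // Direct conversations keep the failure text, as in the current code.
  if (params.conversationType === "direct") {
    return params.text;
  }
  // Non-generic failures without the "failed before reply" marker also pass through.
  if (!params.isGenericRunnerFailure && !params.text.includes(AGENT_FAILED_BEFORE_REPLY_TEXT)) {
    return params.text;
  }
  // New step: silence only when the resolved policy allows silent replies.
  const policy = resolveSilentReplyPolicySketch(params.conversationType, params.silentReply);
  return policy === "allow" ? SILENT_REPLY_TOKEN : params.text;
}

// Out of the box, a generic failure in a group is still silenced.
const defaultGroup = resolveFailureTextSketch({
  text: "The model timed out.",
  conversationType: "group",
  isGenericRunnerFailure: true,
});

// With the operator override from the repro, the group sees the fallback copy.
const overriddenGroup = resolveFailureTextSketch({
  text: "The model timed out.",
  conversationType: "group",
  isGenericRunnerFailure: true,
  silentReply: { group: "disallow", direct: "disallow" },
});
```

In this sketch the early returns are kept in their original order, so the policy lookup only affects the one branch that currently hardcodes SILENT_REPLY_TOKEN.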
