openclaw: Plugin register() invoked per agent run; silent Telegram reply loss in subagent path (v2026.5.22)

Summary

Two related symptoms observed on openclaw 2026.5.22 (a374c3a) in a Telegram-direct deployment (native gateway, single plugin extension, single primary user):

  1. Plugin register(api) is invoked ~15 times in 15 minutes on a quiet bot conversation (≈ once per agent run, including main turn, subagent spawn, and heartbeat). Manifest declares activation: { onStartup: true }, but the entry point is re-invoked per run.
  2. Silent assistant reply loss on turns where the main agent spawns a subagent with tool calls: original request_id later appears bound to channel_id=heartbeat instead of the original Telegram channel, and the user-facing outbound is never sent — no EmbeddedAttemptSessionTakeoverError, no lane wait exceeded, no error event of any kind.

Both symptoms persist after the 5.22 upgrade, which included PR #67785 / #67502 ("Assistant response lost during nested lane congestion + delivery context corruption") and several EmbeddedAttemptSessionTakeoverError fixes. PR #67785 description focuses on Discord/ACP, but the Telegram + subagent path appears to retain the same delivery-context-corruption class.

The symptoms are likely connected: the gateway appears to re-initialize the plugin context on every agent run, and during that re-init the delivery context binding (channel, account, chat) seems to be lost when the run is a subagent spawned from a main Telegram turn.

Environment

  • openclaw 2026.5.22 (a374c3a)
  • Native Telegram channel (channels.telegram gateway-native, not the legacy bot service)
  • Single custom plugin extension (declares: 1 WebSearchProvider, 7 tools via api.registerTool, 5 hooks via registerCompanionHooks, manifest activation.onStartup: true)
  • Primary model: Gemini 3.5 Flash (thinking off)
  • agents.defaults.heartbeat.isolatedSession: true (workaround from #85913)
  • Single user chat, low message volume (~1-2 inbound/minute)

Symptom 1 — register(api) invoked per agent run

Plugin entry (simplified)

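export default {
  id: "openclaw-companion",
  // Expected to run once at gateway startup (manifest: activation.onStartup).
  register(api) {
    api.registerWebSearchProvider(createWebSearchProvider());
    api.registerTool(createTool1());
    // … 6 more registerTool calls
    registerHooks(api);
    // Logged on every register() call; this is the line counted in the journal.
    api.logger.info("companion.registered", { surface: "api.on" /* … */ });
  },
};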

Manifest:

{
  "id": "openclaw-companion",
  "activation": { "onStartup": true },
  "enabledByDefault": false,
  "hooks": { "allowConversationAccess": true, "allowPromptInjection": true }
}

Observed frequency

Last 15 minutes of journalctl --user -u openclaw-gateway.service:

15 × "companion.registered"

Sample timestamps showing single + double-fires:

06:29:23.250 companion.registered
06:32:36.654 companion.registered
06:32:37.666 companion.registered   ← +1s after previous (double-fire)
06:32:40.749 companion.registered
06:33:40.761 companion.registered   ← 325ms after a telegram inbound
06:33:55.119 companion.registered
06:34:18.701 companion.registered
06:37:11.381 companion.registered
06:37:12.292 companion.registered   ← +1s after previous (double-fire), 1s before a heartbeat before_prompt_build
06:38:45.354 companion.registered

Correlation: at least one re-registration per agent run (main turn, subagent spawn, heartbeat run). Double-fires roughly 1 second apart (1012 ms and 911 ms in the sample above) correlate with heartbeat-followed-by-new-run sequences.
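The count and the double-fires can be checked mechanically against a journal excerpt. The following is a minimal sketch, assuming each line starts with an HH:MM:SS.mmm timestamp as in the samples above. The helper names (toMillis, summarizeRegistrations) and the 1500 ms double-fire window are illustrative choices, not part of openclaw, and the sketch ignores excerpts that cross midnight.

```javascript
// Parse "HH:MM:SS.mmm" into milliseconds since midnight.
function toMillis(stamp) {
  const [h, m, s] = stamp.split(":");
  return (Number(h) * 3600 + Number(m) * 60) * 1000 + Math.round(Number(s) * 1000);
}

// Count "companion.registered" lines and flag consecutive pairs
// that are at most windowMs apart (assumed threshold: 1500 ms).
function summarizeRegistrations(lines, windowMs = 1500) {
  const stamps = lines
    .filter((line) => line.includes("companion.registered"))
    .map((line) => line.trim().split(/\s+/)[0]);
  const doubleFires = [];
  for (let i = 1; i < stamps.length; i++) {
    const gapMs = toMillis(stamps[i]) - toMillis(stamps[i - 1]);
    if (gapMs <= windowMs) doubleFires.push({ at: stamps[i], gapMs });
  }
  return { total: stamps.length, doubleFires };
}

// Sample timestamps copied from the journal excerpt in this report.
const sampleLines = [
  "06:29:23.250 companion.registered",
  "06:32:36.654 companion.registered",
  "06:32:37.666 companion.registered",
  "06:32:40.749 companion.registered",
  "06:33:40.761 companion.registered",
  "06:33:55.119 companion.registered",
  "06:34:18.701 companion.registered",
  "06:37:11.381 companion.registered",
  "06:37:12.292 companion.registered",
  "06:38:45.354 companion.registered",
];

const registrationSummary = summarizeRegistrations(sampleLines);
console.log(registrationSummary);
```

On the ten sample lines this reports two double-fires, at 06:32:37.666 (1012 ms after the previous line) and at 06:37:12.292 (911 ms after the previous line).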

Plugin code is not the bottleneck

All 7 tool factories and the WebSearchProvider factory are thin object constructors — no DB connections opened, no indexes loaded, no network calls in the factory body. The factory cost is O(1) per call. Yet the journal shows a consistent ~1.5-second silence between the companion.registered log line and the next reply_dispatch hook fire on every inbound:

06:33:40.436 [telegram] Inbound message
06:33:40.761 companion.registered      ← +325ms after inbound
06:33:42.390 reply_dispatch hook       ← +1629ms after companion.registered

The ~1.5 s gap is consistent across observed turns. It falls between plugin re-registration and the routing decision, which suggests gateway-side work (other plugin re-registrations, autoRecall pre-fetch, context composition). If that work is part of per-run plugin re-initialization, it would not be re-paid on every run were register(api) called only at startup, as the manifest declares.
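The two offsets annotated in the excerpt can be reproduced from the three timestamps. This is a small sketch for repeating the measurement on other turns. The function name stampToMs and the event labels are illustrative, and the timestamps are the ones quoted in this report.

```javascript
// Parse "HH:MM:SS.mmm" into milliseconds since midnight.
function stampToMs(stamp) {
  const [h, m, s] = stamp.split(":");
  return (Number(h) * 3600 + Number(m) * 60) * 1000 + Math.round(Number(s) * 1000);
}

// One inbound turn, as quoted from the journal.
const turnEvents = [
  ["06:33:40.436", "inbound"],
  ["06:33:40.761", "companion.registered"],
  ["06:33:42.390", "reply_dispatch"],
];

// Gap between each event and the one before it.
const turnGaps = turnEvents.slice(1).map(([stamp, name], i) => ({
  from: turnEvents[i][1],
  to: name,
  ms: stampToMs(stamp) - stampToMs(turnEvents[i][0]),
}));

console.log(turnGaps);
```

For this turn the gaps come out as 325 ms (inbound to companion.registered) and 1629 ms (companion.registered to reply_dispatch), matching the annotations.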

Symptom 2 — silent reply loss when main agent spawns subagent

Full timeline

T+0.000s   06:35:41.829 Inbound (60 chars, session=agent:main:telegram:direct:<chatId>)
T+2.004s   06:35:43.833 reply_dispatch         req_AAAA delivery_owner=main_agent_plain
T+13.881s  06:35:55.710 before_agent_reply     req_AAAA session=<UUID-A> channel=<chatId>   payload=cleanedBody
T+19.579s  06:36:01.408 before_prompt_build    req_BBBB run_id=<UUID-X> session=<UUID-A>    channel=<chatId>   ← subagent spawned, new request_id
T+22.169s  06:36:03.998 before_tool_call       req_BBBB run_id=<UUID-X> channel=telegram:<chatId>
T+25.411s  06:36:07.243 before_tool_call       req_BBBB run_id=<UUID-X> channel=telegram:<chatId>
T+84.834s  06:37:06.666 before_agent_reply     req_AAAA session=<UUID-B> channel=heartbeat  ← ORIGINAL req_AAAA reused on heartbeat channel
T+89.552s  06:37:11.381 companion.registered
T+90.463s  06:37:12.292 companion.registered                                            ← double
T+178.865s 06:38:40.694 before_agent_reply     req_AAAA session=<UUID-C> channel=heartbeat  ← ORIGINAL req_AAAA reused again on heartbeat channel
...
NO outbound to Telegram, ever (7+ minutes observed)

Key observations:

  • before_agent_reply fires once with channel=<chatId> (Telegram), then the same request_id fires twice more with channel=heartbeat and different session UUIDs.
  • No message_sending hook ever fires for the Telegram-channel before_agent_reply.
  • No outbound send log line for the chat.
  • No EmbeddedAttemptSessionTakeoverError, no lane wait exceeded, no error / warn / fail / timeout event in the journal window.
  • openclaw gateway status --json returns {} for lane state; openclaw tasks maintenance --json reports runningCronJobs: 0, reconciled: 0, recovered: 0.

The subagent (req_BBBB) shows two before_tool_call hooks then disappears — no before_agent_reply for it, no completion event. The original main-turn reply never reaches the user.

This is silent data loss with no diagnostic surface, which is worse than the symptom in #67502, where transcript visibility was at least preserved.
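Until the gateway reports such drops itself, the two signatures above (one request_id seen on more than one channel, and a before_agent_reply with no message_sending) can be searched for in exported hook events. The sketch below assumes the events have already been parsed into objects. The field names (hook, requestId, channel) and both function names are hypothetical, and it further assumes that message_sending events carry the same request ID.

```javascript
// Request IDs whose before_agent_reply fired on more than one channel.
function findChannelDrift(events) {
  const channelsByRequest = new Map();
  for (const e of events) {
    if (e.hook !== "before_agent_reply") continue;
    if (!channelsByRequest.has(e.requestId)) channelsByRequest.set(e.requestId, new Set());
    channelsByRequest.get(e.requestId).add(e.channel);
  }
  return [...channelsByRequest]
    .filter(([, channels]) => channels.size > 1)
    .map(([requestId, channels]) => ({ requestId, channels: [...channels] }));
}

// Request IDs that reached before_agent_reply but never message_sending.
function findUndeliveredReplies(events) {
  const replied = new Set();
  const sent = new Set();
  for (const e of events) {
    if (e.hook === "before_agent_reply") replied.add(e.requestId);
    if (e.hook === "message_sending") sent.add(e.requestId);
  }
  return [...replied].filter((id) => !sent.has(id));
}

// Events transcribed from the timeline in this report.
const timelineEvents = [
  { hook: "before_agent_reply", requestId: "req_AAAA", channel: "<chatId>" },
  { hook: "before_prompt_build", requestId: "req_BBBB", channel: "<chatId>" },
  { hook: "before_tool_call", requestId: "req_BBBB", channel: "telegram:<chatId>" },
  { hook: "before_tool_call", requestId: "req_BBBB", channel: "telegram:<chatId>" },
  { hook: "before_agent_reply", requestId: "req_AAAA", channel: "heartbeat" },
  { hook: "before_agent_reply", requestId: "req_AAAA", channel: "heartbeat" },
];

const drift = findChannelDrift(timelineEvents);
const undelivered = findUndeliveredReplies(timelineEvents);
console.log(drift, undelivered);
```

On the transcribed timeline both checks flag req_AAAA: it appears on the Telegram chat channel and on heartbeat, and no message_sending event follows it.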

Likely connection

If the gateway re-initializes the full plugin context per agent run (Symptom 1), and that re-init re-binds the delivery context, then a subagent run spawned from a Telegram main turn (Symptom 2) would re-bind the delivery context to whichever lane/channel runs next. In our case that is the heartbeat channel, which pre-empts the main session. The original main turn's reply, generated and present at before_agent_reply time, would then be silently dropped because its delivery context no longer matches the Telegram channel.
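The hypothesized failure mode can be illustrated with a toy model. This is not openclaw code and makes no claim about the gateway's real internals: makeToyGateway, startRun, and deliver are invented names, and the only point is to show how a delivery binding shared across runs would lose a reply without raising an error.

```javascript
// Toy model only: none of these names exist in openclaw.
function makeToyGateway() {
  const sent = [];
  let currentBinding = null; // shared across runs: the hypothesized flaw

  return {
    sent,
    // Hypothesized per-run re-init: each run overwrites the shared binding.
    startRun(channel) {
      currentBinding = { channel };
    },
    // Delivery compares the reply's origin with the shared binding
    // and drops the reply silently when they differ.
    deliver(reply) {
      if (!currentBinding || currentBinding.channel !== reply.originChannel) return false;
      sent.push(reply);
      return true;
    },
  };
}

const toyGateway = makeToyGateway();
toyGateway.startRun("telegram:<chatId>"); // main turn
const pendingReply = { requestId: "req_AAAA", originChannel: "telegram:<chatId>" };
toyGateway.startRun("telegram:<chatId>"); // subagent spawn
toyGateway.startRun("heartbeat");         // heartbeat run pre-empts
const wasDelivered = toyGateway.deliver(pendingReply);
console.log(wasDelivered, toyGateway.sent.length);
```

In this model the reply is dropped (deliver returns false and nothing is sent) with no exception, which mirrors the absence of any error event in the journal.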

Proposed fix shape

  1. Plugin instance caching: honor activation.onStartup: true literally — call register(api) once per plugin per gateway lifetime. Tools, hooks, providers are immutable contracts; per-run reinstantiation is wasteful and (per Symptom 2) appears to be the root of delivery context corruption.
  2. Separate per-run scope for invocation context (request_id, session_id, channel_id, account/chat binding) from plugin lifecycle. The current code path appears to entangle them.
  3. Diagnostic surface for silent drops: at minimum, log a reply_dropped or delivery_context_mismatch event when a before_agent_reply fires but no message_sending follows.
  4. Config fallback as backstop: e.g. plugins.reuseAcrossRuns: true (default true) to allow overriding for plugins that genuinely need per-run reinit.
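The shape of points 1 to 4 can be sketched on the host side as follows. Everything here is hypothetical: PluginRegistry, ensureRegistered, createRunContext, and reportDroppedReplies are invented for illustration, reuseAcrossRuns is the option name proposed above rather than an existing setting, and the real gateway may organize this very differently.

```javascript
// Hypothetical host-side sketch; class and option names are illustrative.
class PluginRegistry {
  constructor() {
    this.registered = new Set(); // plugin ids that already ran register()
  }

  // Call register(api) at most once per plugin unless reuse is disabled.
  ensureRegistered(plugin, api, { reuseAcrossRuns = true } = {}) {
    if (reuseAcrossRuns && this.registered.has(plugin.id)) return false;
    plugin.register(api);
    this.registered.add(plugin.id);
    return true;
  }
}

// Per-run invocation context: created fresh for each run and frozen,
// so a later run cannot re-bind an earlier run's delivery target.
function createRunContext({ requestId, sessionId, channelId }) {
  return Object.freeze({ requestId, sessionId, channelId });
}

// Diagnostic: emit reply_dropped for replies that never reached message_sending.
function reportDroppedReplies(pendingContexts, sentRequestIds, log) {
  for (const ctx of pendingContexts) {
    if (!sentRequestIds.has(ctx.requestId)) {
      log("reply_dropped", { requestId: ctx.requestId, channelId: ctx.channelId });
    }
  }
}

// Usage: three runs, one registration, one diagnostic event.
let registerCalls = 0;
const demoPlugin = { id: "openclaw-companion", register() { registerCalls += 1; } };
const registry = new PluginRegistry();
for (const run of ["main", "subagent", "heartbeat"]) {
  registry.ensureRegistered(demoPlugin, {});
}

const mainContext = createRunContext({
  requestId: "req_AAAA",
  sessionId: "<UUID-A>",
  channelId: "telegram:<chatId>",
});
const loggedEvents = [];
reportDroppedReplies([mainContext], new Set(), (event, fields) => {
  loggedEvents.push({ event, ...fields });
});
console.log(registerCalls, loggedEvents);
```

With the default reuseAcrossRuns: true, register() runs once across the three runs. Passing { reuseAcrossRuns: false } restores per-run registration for plugins that need it. The frozen context keeps the Telegram binding attached to req_AAAA regardless of which run starts next.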

Related upstream issues

  • #67502 — closed by PR #67785, ACP/Discord focus; this report shows a Telegram + subagent subclass that persists in 5.22.
  • #85913 — heartbeat-vs-channel ETSE race; addressed via isolatedSession: true workaround but does not cover the silent-drop class.
  • #41235 — subagent announce completion timeout / idempotent delivery (closed); related class.

Happy to provide the full journal window, plugin manifest, or run-level CLI snapshots if useful. Splitting into two separate issues is fine if maintainers prefer (Symptom 1 = perf, Symptom 2 = correctness).
