openclaw - 💡(How to fix) Fix [Bug]: gateway accepts agent runs before plugin tool registration completes; pre-ready window produces spurious 'No callable tools remain' [1 comments, 2 participants]

GitHub stats
openclaw/openclaw#75130 (fetched 2026-05-01 05:37:52)
Comments: 1 · Participants: 2 · Timeline: 2 · Reactions: 2
Timeline (top): closed ×1, commented ×1

Summary

The gateway accepts agent runs (via WS) before plugin tool registration completes. Any inbound message that arrives in the ~17s window between [gateway] starting HTTP server and [gateway] ready is dispatched to an agent whose tool registry is still empty. The * wildcard in the agent's tool allowlist then resolves against the empty registry, and the run fails immediately with:

errorCode=UNAVAILABLE
errorMessage=Error: No callable tools remain after resolving explicit tool allowlist
              (agents.talos.tools.allow: *, message); no registered tools matched.
              Fix the allowlist or enable the plugin that registers the requested tool.

The error message is misleading: the allowlist is fine, but the tool registry has not been populated yet.
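
To make the failure mode concrete, here is a minimal sketch of how a wildcard allowlist can resolve to nothing when the registry is empty. The function name `resolveCallableTools` and the tool name `search` are hypothetical, and the matching logic is an assumption for illustration, not OpenClaw's actual implementation.

```javascript
// Hypothetical sketch: names and matching rules are assumptions, not OpenClaw's code.
function resolveCallableTools(allowlist, registeredTools) {
  // "*" matches every registered tool; other entries match by exact name.
  const hasWildcard = allowlist.includes("*");
  const callable = registeredTools.filter(
    (name) => hasWildcard || allowlist.includes(name)
  );
  if (callable.length === 0) {
    throw new Error("No callable tools remain after resolving explicit tool allowlist");
  }
  return callable;
}

const allowlist = ["*", "message"];

// Steady state: plugins have registered their tools, so "*" matches them.
const steadyState = resolveCallableTools(allowlist, ["message", "search"]);

// Pre-ready: the registry is still empty, so even "*" matches nothing.
let preReadyError = null;
try {
  resolveCallableTools(allowlist, []);
} catch (err) {
  preReadyError = err.message;
}

console.log(steadyState);
console.log(preReadyError);
```

The same allowlist succeeds or fails depending only on when the registry is read, which is why the "fix the allowlist" hint in the message points in the wrong direction.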

Repro

Any inbound Telegram message during the ~17s window between [gateway] starting HTTP server and [gateway] ready after a kickstart/SIGTERM/guardian restart will trigger this. The window is wider on cold starts when plugin-runtime-deps need staging (observed: ~17s on 2026.4.27).

Evidence (2026-04-30 incident)

Twelve No callable tools remain failures, all between 10:38:08.400 and 10:38:28.726 EDT:

2026-04-30T10:38:08.400-04:00  [agent/embedded] [tools] No callable tools remain ...
2026-04-30T10:38:08.431-04:00  embedded run failover decision: ... reason=none ... rawError=No callable tools remain ...
2026-04-30T10:38:08.433-04:00  [diagnostic] lane task error: lane=subagent ...
2026-04-30T10:38:15.189-04:00  [agent/embedded] [tools] No callable tools remain ...
2026-04-30T10:38:28.400-04:00  [agent/embedded] [tools] No callable tools remain ...
... (12 total in the window)

Gateway readiness sequence on the same boot:

2026-04-30T10:38:34.361-04:00  [gateway] signal SIGTERM received
2026-04-30T10:38:38.713-04:00  [gateway] starting...
2026-04-30T10:38:41.340-04:00  [gateway] starting...
2026-04-30T10:38:49.255-04:00  [gateway] starting HTTP server...
2026-04-30T10:38:50.602-04:00  [gateway] http server listening (10 plugins ...; 9.3s)
2026-04-30T10:38:55.411-04:00  [gateway] ready

After 10:38:55.411 there are zero No callable tools remain occurrences. The failures are strictly bounded to the pre-ready window.
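
The intervals can be checked directly from the timestamps quoted in the two excerpts. The sketch below only performs the subtraction; the key names are illustrative and it makes no claim about which boot each line belongs to.

```javascript
// Timestamps copied verbatim from the log excerpts and text in this issue.
const stamps = {
  firstFailure: "2026-04-30T10:38:08.400-04:00",
  lastFailure: "2026-04-30T10:38:28.726-04:00", // end of the range quoted in the text
  sigterm: "2026-04-30T10:38:34.361-04:00",
  firstStarting: "2026-04-30T10:38:38.713-04:00",
  startingHttp: "2026-04-30T10:38:49.255-04:00",
  ready: "2026-04-30T10:38:55.411-04:00",
};

// Milliseconds between two named timestamps.
const msBetween = (from, to) =>
  Date.parse(stamps[to]) - Date.parse(stamps[from]);

const intervalsMs = {
  failureSpan: msBetween("firstFailure", "lastFailure"),
  sigtermToReady: msBetween("sigterm", "ready"),
  firstStartingToReady: msBetween("firstStarting", "ready"),
  startingHttpToReady: msBetween("startingHttp", "ready"),
};

for (const [name, ms] of Object.entries(intervalsMs)) {
  console.log(`${name}: ${(ms / 1000).toFixed(3)}s`);
}
```

On these numbers, the ~17s figure matches the span from the first [gateway] starting... line to [gateway] ready (about 16.7s), while the span from [gateway] starting HTTP server... to ready is about 6.2s. The quoted failure timestamps also fall before the SIGTERM line in this excerpt, which is worth confirming against the full log.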

The agent's actual config is well-formed (tools: { alsoAllow: ["message"] } for an agent that inherits the wildcard). The runtime view agents.talos.tools.allow: *, message reflects the resolved allowlist; * is the inheritance wildcard, not a literal entry. Steady-state runs against the same config succeed.
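
As an illustration of how the runtime view could arise from the config, the sketch below merges an inherited wildcard with the agent's alsoAllow entries. The `resolveAllow` function and the shape of the inherited default are assumptions; only `alsoAllow: ["message"]` and the resulting `*, message` view come from the issue.

```javascript
// Hypothetical merge: inherited allow entries first, then the agent's alsoAllow.
function resolveAllow(inheritedAllow, agentTools) {
  const merged = [...inheritedAllow, ...(agentTools.alsoAllow ?? [])];
  return [...new Set(merged)]; // drop duplicates, keep first-seen order
}

const inheritedAllow = ["*"]; // assumed inherited default (the wildcard)
const talosTools = { alsoAllow: ["message"] }; // agent config from the issue

const resolved = resolveAllow(inheritedAllow, talosTools);
console.log(`agents.talos.tools.allow: ${resolved.join(", ")}`);
// prints "agents.talos.tools.allow: *, message"
```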

Downstream effects

The lost runs trigger 3-retry subagent announce cycles ([warn] Subagent announce give up (retry-limit) ... retries=3 endedAgo=...). The user sees this as "the bot didn't reply" for any Telegram message that landed during the window.

Cross-reference

This shares a broad theme with #74864 (orphan-recovery resurrection death-spiral during gateway startup). The mechanism is different, but it belongs to the same family: a lifecycle race during gateway startup creates spurious failures.

Fix shape

Hold WS agent dispatch until the first [gateway] ready event. Two reasonable approaches:

  • Queue: buffer WS frames received pre-ready and replay after [gateway] ready. Best UX (user sees normal latency), bounded buffer needed.
  • 503 Try Again: respond with 503 Service Unavailable and a Retry-After header for agent-run requests received pre-ready. Simpler, and it lets the channel layer (Telegram, BlueBubbles, etc.) re-deliver naturally on its own retry cadence.
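
The 503 option could look roughly like the sketch below. The handler name, the `state.ready` flag, the response body, and the 5-second Retry-After value are all assumptions for illustration; `writeHead` and `end` are the standard Node.js `http.ServerResponse` methods, stubbed here so the sketch runs without a server.

```javascript
// Hypothetical handler: the readiness flag and payload shape are assumptions.
const RETRY_AFTER_SECONDS = 5; // assumed value; tune to the observed startup time

function handleAgentRunRequest(state, req, res, dispatch) {
  if (!state.ready) {
    // Tool registry may still be empty: ask the caller to retry later.
    res.writeHead(503, {
      "Retry-After": String(RETRY_AFTER_SECONDS),
      "Content-Type": "application/json",
    });
    res.end(JSON.stringify({ error: "gateway_not_ready" }));
    return false;
  }
  dispatch(req);
  return true;
}

// Minimal stand-in for http.ServerResponse so the sketch runs on its own.
function fakeResponse() {
  return {
    statusCode: null,
    headers: null,
    body: null,
    writeHead(code, headers) {
      this.statusCode = code;
      this.headers = headers;
    },
    end(body) {
      this.body = body;
    },
  };
}

const state = { ready: false };
const dispatched = [];
const dispatch = (req) => dispatched.push(req.id);

const early = fakeResponse();
handleAgentRunRequest(state, { id: "run-1" }, early, dispatch);

state.ready = true; // would be set when the gateway reports ready
const late = fakeResponse();
handleAgentRunRequest(state, { id: "run-2" }, late, dispatch);

console.log(early.statusCode, early.headers["Retry-After"], dispatched);
```

One caveat on this design: an HTTP status can only be returned during the WebSocket upgrade handshake or on plain HTTP requests. Frames arriving on an already-established socket would need a different not-ready signal, or the queue approach.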

Today the symptom is hidden because guardian doesn't restart the gateway often, but any restart during traffic produces a thundering herd of UNAVAILABLE failures across whatever sessions happen to have inbound messages mid-flight.

extent analysis

TL;DR

Hold WebSocket agent dispatch until the gateway is fully ready to prevent UNAVAILABLE errors caused by premature message handling.

Guidance

  • Identify the time window between [gateway] starting HTTP server and [gateway] ready when the gateway is not fully ready to handle agent runs.
  • Implement a queuing mechanism to buffer WebSocket frames received during this window and replay them after the gateway is ready.
  • Alternatively, respond with a 503 status code and a Retry-After header for agent-run requests received before the gateway is ready, allowing the channel layer to retry the request.
  • Monitor the gateway's readiness sequence to ensure that the fix is working as expected.
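
For the monitoring step, one option is to scan the gateway log and count tool failures by the gateway state at that point in the log. The sketch below assumes the marker strings shown in the issue's excerpts; the function name is hypothetical and the sample lines are synthetic, not taken from the incident.

```javascript
// Hypothetical log check; marker strings are taken from the excerpts in this issue.
function countToolFailuresByState(lines) {
  let state = "unknown"; // no lifecycle marker seen yet
  const counts = { unknown: 0, notReady: 0, ready: 0 };
  for (const line of lines) {
    if (line.includes("[gateway] ready")) {
      state = "ready";
    } else if (
      line.includes("[gateway] signal SIGTERM received") ||
      line.includes("[gateway] starting")
    ) {
      state = "notReady";
    } else if (line.includes("[tools] No callable tools remain")) {
      // Count only the [tools] line so the failover line is not double counted.
      counts[state] += 1;
    }
  }
  return counts;
}

// Synthetic lines for illustration only; they are not from the incident log.
const sampleLines = [
  "[gateway] signal SIGTERM received",
  "[gateway] starting...",
  "[agent/embedded] [tools] No callable tools remain ...",
  "embedded run failover decision: ... rawError=No callable tools remain ...",
  "[gateway] starting HTTP server...",
  "[gateway] ready",
];

const counts = countToolFailuresByState(sampleLines);
console.log(counts);
```

With the fix in place, the notReady count should stay at zero across restarts. A non-zero ready count would point to a genuine allowlist or plugin problem rather than the startup race.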

Example

A possible implementation of the queuing approach could involve storing incoming WebSocket frames in a buffer and replaying them after the gateway is ready:

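// Sketch only: ws, gateway, and handleAgentRun are assumed to be provided
// by the surrounding code; the 'ready' event name is illustrative.

// Frames received before the gateway is ready, in arrival order.
const wsFrameBuffer = [];
let gatewayReady = false;

// ...

ws.on('message', (frame) => {
  if (gatewayReady) {
    handleAgentRun(frame);
  } else {
    // The tool registry may still be empty, so hold the frame for later.
    wsFrameBuffer.push(frame);
  }
});

// ...

gateway.on('ready', () => {
  gatewayReady = true;
  // Empty the buffer first, then replay the held frames in arrival order.
  for (const frame of wsFrameBuffer.splice(0)) {
    handleAgentRun(frame);
  }
});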
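
The issue notes that the queue needs a bounded buffer, and the example above grows without limit. A self-contained variant with a cap might look like the following; the `PreReadyQueue` class, its method names, and the reject-on-overflow policy are assumptions chosen for illustration.

```javascript
// Hypothetical bounded pre-ready buffer; class and method names are illustrative.
class PreReadyQueue {
  constructor(handler, maxFrames = 100) {
    this.handler = handler;
    this.maxFrames = maxFrames;
    this.ready = false;
    this.buffer = [];
  }

  // Returns "dispatched", "queued", or "rejected".
  accept(frame) {
    if (this.ready) {
      this.handler(frame);
      return "dispatched";
    }
    if (this.buffer.length >= this.maxFrames) {
      return "rejected"; // caller decides how to signal overflow
    }
    this.buffer.push(frame);
    return "queued";
  }

  // Call once when the gateway reports ready; replays in arrival order.
  markReady() {
    this.ready = true;
    const pending = this.buffer;
    this.buffer = [];
    for (const frame of pending) {
      this.handler(frame);
    }
  }
}

const handled = [];
const queue = new PreReadyQueue((frame) => handled.push(frame), 2);

const results = [queue.accept("a"), queue.accept("b"), queue.accept("c")];
queue.markReady();
results.push(queue.accept("d"));

console.log(results, handled);
```

Rejecting on overflow keeps memory bounded during a long cold start. Rejected frames could then fall back to the 503-style path so the channel layer retries them.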

Notes

The choice between the queuing and 503 approaches depends on the specific requirements of the system and the trade-offs between complexity, latency, and user experience.

Recommendation

Prefer the queuing approach, with a bounded buffer, to hold WebSocket agent dispatch until the gateway is fully ready. It gives the best user experience because users see normal latency instead of a failed run.
