openclaw - ✅(Solved) Fix [Bug]: channels status --probe always times out with 8+ accounts due to sequential probing [1 pull requests, 1 participants]

GitHub stats
openclaw/openclaw#67937 · Fetched 2026-04-17 08:28:54
Comments: 0 · Participants: 1 · Timeline: 2 · Reactions: 0
Timeline (top): assigned ×1, cross-referenced ×1

openclaw channels status --probe always times out at the default 10s limit when multiple channel accounts are configured, because the gateway probes accounts sequentially and the total probe time exceeds the CLI timeout.

Error Message

$ openclaw channels status --probe
Checking channel status (probe)…
Gateway not reachable: Error: gateway timeout after 10000ms
Gateway target: ws://127.0.0.1:18789

Root Cause

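Two problems compound: the gateway probes channel accounts one after another in a nested loop, and the CLI's default 10s timeout (shared by WS connect, auth, and the full probe cycle) is shorter than the minimum expected probe time for multi-account deployments.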

Fix / Workaround

Workaround: openclaw channels status --probe --timeout 30000 succeeds but takes ~61 seconds.

  • Affected: Any deployment with 5+ channel accounts
  • Severity: Medium-High — --probe is the primary tool for verifying channel connectivity after config changes, and it silently returns stale config-only data
  • Frequency: Always reproducible with enough accounts
  • Consequence: Operators cannot verify channel health without knowing the --timeout workaround; misleading "Gateway not reachable" error message suggests a gateway problem when the issue is probe latency

PR fix notes

PR #67959: fix(channels): parallelize status probes

Description (problem / solution / changelog)

Summary

  • Problem: openclaw channels status --probe timed out with multiple configured accounts because the gateway probed accounts sequentially and the CLI default timeout stayed at 10s.
  • Why it matters: operators saw a misleading gateway-timeout error and lost live probe data even when the gateway itself was healthy.
  • What changed: channels.status now overlaps per-channel and per-account probe work while preserving deterministic output order, and channels status --probe now defaults to a 30000ms timeout when --timeout is omitted.
  • What did NOT change (scope boundary): per-account probeAccount/auditAccount contracts, non-probe channels status, and explicit --timeout behavior remain unchanged.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #67938
  • Related #67937
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: src/gateway/server-methods/channels.ts awaited probe/audit work in nested loops, so total probe time scaled with the sum of all accounts instead of the slowest account. src/cli/channels-cli.ts and src/commands/channels/status.ts also kept the omitted-timeout default at 10000ms, which was too short for multi-account probe runs.
  • Missing detection / guardrail: there was no regression coverage asserting concurrent probe scheduling or the probe-mode default timeout.
  • Contributing context (if known): slow channels such as Telegram can already consume most of a 10s per-account budget, so sequential probing made the aggregate request exceed the CLI timeout quickly.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/gateway/server-methods/channels.status.test.ts, src/commands/channels.status.command-flow.test.ts, src/cli/channels-cli.test.ts
  • Scenario the test should lock in: probe mode overlaps account work without reordering account output, probe mode uses a 30000ms default when --timeout is omitted, and explicit --timeout still wins.
  • Why this is the smallest reliable guardrail: the regression sits in gateway request assembly and CLI option/default plumbing; these tests hit both seams directly without needing live channel credentials.
  • Existing test that already covers this (if any): N/A
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

  • openclaw channels status --probe now completes much faster on multi-account setups because probe work is no longer serialized across every account.
  • Omitting --timeout in probe mode now uses 30000ms instead of 10000ms.
  • Non-probe openclaw channels status still defaults to 10000ms.
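The three behavior rules above (explicit --timeout wins, probe mode defaults to 30000ms, non-probe mode stays at 10000ms) can be sketched as a single resolution function. This is an illustrative sketch only: the option shape, constant names, and resolveStatusTimeoutMs are hypothetical and are not taken from the OpenClaw source.

```typescript
// Hypothetical option shape; field names are illustrative, not the real CLI types.
interface StatusOptions {
  probe?: boolean;
  timeoutMs?: number; // value of an explicit --timeout, if the user passed one
}

const DEFAULT_STATUS_TIMEOUT_MS = 10_000; // non-probe default stated in the PR
const DEFAULT_PROBE_TIMEOUT_MS = 30_000; // probe-mode default stated in the PR

// An explicit timeout always wins; otherwise the default depends on probe mode.
function resolveStatusTimeoutMs(opts: StatusOptions): number {
  if (opts.timeoutMs !== undefined) {
    return opts.timeoutMs;
  }
  return opts.probe ? DEFAULT_PROBE_TIMEOUT_MS : DEFAULT_STATUS_TIMEOUT_MS;
}
```

Keeping the decision in one place makes it straightforward to cover all three cases with a unit test, which is the kind of guardrail the regression test plan describes.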

Diagram (if applicable)

Before:
[channels status --probe] -> [gateway probes account 1] -> [account 2] -> [account 3] -> [CLI hits 10s timeout]

After:
[channels status --probe] -> [gateway probes accounts concurrently] -> [results kept in account order] -> [CLI receives payload within the larger default budget]
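The before/after difference in the diagram comes down to when each probe is started. The sketch below uses fake probes (a timer standing in for network work) and counts how many are in flight at once. Probe, makeProbes, runSequentially, and runConcurrently are hypothetical names for illustration, not OpenClaw functions.

```typescript
// Hypothetical stand-in for a per-account probe; not the OpenClaw API.
type Probe = () => Promise<void>;

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Builds `count` fake probes and records the peak number running at once.
function makeProbes(count: number, delayMs: number) {
  const stats = { inFlight: 0, peak: 0 };
  const probes: Probe[] = Array.from({ length: count }, () => async () => {
    stats.inFlight += 1;
    stats.peak = Math.max(stats.peak, stats.inFlight);
    await sleep(delayMs);
    stats.inFlight -= 1;
  });
  return { probes, stats };
}

// "Before": each probe starts only after the previous one has finished,
// so total time is roughly the sum of all probe times.
async function runSequentially(probes: Probe[]): Promise<void> {
  for (const probe of probes) {
    await probe();
  }
}

// "After": every probe starts immediately and they are awaited together,
// so total time is roughly the time of the slowest probe.
async function runConcurrently(probes: Probe[]): Promise<void> {
  await Promise.all(probes.map((probe) => probe()));
}
```

With the sequential runner the peak in-flight count stays at 1; with the concurrent runner it equals the number of probes.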

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: local Node 22 / pnpm workspace
  • Model/provider: N/A
  • Integration/channel (if any): channel status gateway path
  • Relevant config (redacted): multi-account channel config, no secrets needed for tests

Steps

  1. Configure multiple channel accounts and run openclaw channels status --probe.
  2. Observe that the old path serialized per-account probes and could exceed the default timeout.
  3. Run the same command after this change and verify the request overlaps account work and uses the higher default timeout when omitted.

Expected

  • Probe mode should not time out solely because multiple configured accounts are probed sequentially.

Actual

  • Before this change, the CLI could fail with gateway timeout after 10000ms even when the gateway was reachable.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: targeted tests for gateway probe concurrency/order, probe-mode timeout defaulting, explicit timeout override behavior, and CLI option passthrough; full pnpm build; full pnpm check.
  • Edge cases checked: non-probe status keeps the 10s default; account payload order remains stable even when work overlaps.
  • What you did not verify: live multi-account probe timing against real external channel credentials.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps:

Risks and Mitigations

  • Risk: parallel probe scheduling could expose ordering assumptions in status payload assembly.
    • Mitigation: results are collected with Promise.all(...), and regression coverage asserts stable account ordering.
  • Risk: a longer default probe timeout could delay failure when a user expects a hard 10s limit.
    • Mitigation: this only applies when --probe is used without an explicit override, and --timeout still takes precedence.
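The ordering mitigation relies on a documented property of Promise.all: the result array follows the order of the input promises, not the order in which they complete. A minimal sketch with fake probes follows; fakeProbe and probeInOrder are hypothetical names and the delays are arbitrary.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Fake probe: waits, records when it finished, then returns its account id.
async function fakeProbe(
  accountId: string,
  delayMs: number,
  finished: string[],
): Promise<string> {
  await sleep(delayMs);
  finished.push(accountId);
  return accountId;
}

// Runs all probes concurrently and reports both the result order and the
// completion order, so the two can be compared.
async function probeInOrder(
  accounts: Array<{ id: string; delayMs: number }>,
): Promise<{ results: string[]; finished: string[] }> {
  const finished: string[] = [];
  const results = await Promise.all(
    accounts.map((account) => fakeProbe(account.id, account.delayMs, finished)),
  );
  return { results, finished };
}
```

If the slowest account is listed first, it still appears first in results even though it finishes last, which is why overlapping the work does not by itself reorder the status payload.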

Changed files

  • src/cli/channels-cli.test.ts (added, +70/-0)
  • src/cli/channels-cli.ts (modified, +1/-1)
  • src/commands/channels.status.command-flow.test.ts (modified, +50/-0)
  • src/commands/channels/status.ts (modified, +7/-1)
  • src/gateway/server-methods/channels.status.test.ts (modified, +106/-0)
  • src/gateway/server-methods/channels.ts (modified, +79/-65)

Code Example

$ openclaw channels status --probe
Checking channel status (probe)…
Gateway not reachable: Error: gateway timeout after 10000ms
Gateway target: ws://127.0.0.1:18789

---

$ openclaw channels status --probe --timeout 30000
# Succeeds after ~61s total, showing all 8 accounts probed successfully

---

for (const plugin of plugins) {        // sequential across channels
  for (const accountId of accountIds) { // sequential across accounts
    probeResult = await plugin.status.probeAccount({
      account,
      timeoutMs,  // each call up to 10s
    });
  }
}

---

$ time openclaw channels status --probe --timeout 30000
# ... all 8 accounts probed ok ...
openclaw channels status --probe --timeout 30000  65.20s user 6.55s system 116% cpu 1:01.56 total

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

openclaw channels status --probe always times out at the default 10s limit when multiple channel accounts are configured, because the gateway probes accounts sequentially and the total probe time exceeds the CLI timeout.

Steps to reproduce

  1. Configure 8 channel accounts (1 Discord + 6 Telegram + 1 Feishu) in openclaw.json
  2. Run openclaw channels status --probe
  3. Observe timeout after 10 seconds with "Gateway not reachable" error

Workaround: openclaw channels status --probe --timeout 30000 succeeds but takes ~61 seconds.

Expected behavior

channels status --probe should return results within the default timeout, or at minimum report partial results from accounts that finished probing before the deadline.
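One possible shape for the "partial results" behavior requested here is to race every probe against a shared deadline and report unfinished accounts as pending instead of failing the whole request. This is a sketch of that idea only; it is not what PR #67959 implements, and Outcome, ProbeTask, and probeUntilDeadline are hypothetical names.

```typescript
type Outcome<T> =
  | { accountId: string; status: "ok"; value: T }
  | { accountId: string; status: "error"; message: string }
  | { accountId: string; status: "pending" };

interface ProbeTask<T> {
  accountId: string;
  run: () => Promise<T>;
}

// Runs all probes concurrently. Each probe is raced against one shared
// deadline; probes that have not settled by then are reported as "pending".
async function probeUntilDeadline<T>(
  tasks: ProbeTask<T>[],
  deadlineMs: number,
): Promise<Outcome<T>[]> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<{ status: "pending" }>((resolve) => {
    timer = setTimeout(() => resolve({ status: "pending" }), deadlineMs);
  });

  const outcomes = await Promise.all(
    tasks.map(async ({ accountId, run }): Promise<Outcome<T>> => {
      // Convert both success and failure into plain records so that one
      // failing probe cannot reject the whole batch.
      const settled = run().then(
        (value) => ({ status: "ok" as const, value }),
        (error: unknown) => ({ status: "error" as const, message: String(error) }),
      );
      const winner = await Promise.race([settled, deadline]);
      return { accountId, ...winner };
    }),
  );

  clearTimeout(timer);
  return outcomes;
}
```

Note that a probe reported as pending keeps running in the background in this sketch; a real implementation would probably also want to cancel it.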

Actual behavior

$ openclaw channels status --probe
Checking channel status (probe)…
Gateway not reachable: Error: gateway timeout after 10000ms
Gateway target: ws://127.0.0.1:18789

The CLI times out and falls back to config-only status (no probe data). All probe work completed on the gateway side is discarded.

With --timeout 30000:

$ openclaw channels status --probe --timeout 30000
# Succeeds after ~61s total, showing all 8 accounts probed successfully

Root cause analysis

Two problems compound:

1. Sequential account probing in gateway

src/gateway/server-methods/channels.ts:136-164 — accounts are probed in a nested sequential loop:

for (const plugin of plugins) {        // sequential across channels
  for (const accountId of accountIds) { // sequential across accounts
    probeResult = await plugin.status.probeAccount({
      account,
      timeoutMs,  // each call up to 10s
    });
  }
}

With 8 accounts and Telegram probes using 3-retry exponential backoff plus proxy overhead, total time reaches 54-61s.

2. Timeout budget mismatch

  • CLI timeout (src/commands/channels/status.ts:283): default 10s, shared by WS connect + auth + full probe cycle
  • Per-account probe timeout (channels.ts:84-85): also 10s
  • Maximum sequential time: 8 accounts × 10s = 80s possible

The CLI timeout (10s) is therefore shorter than the minimum expected probe time for multi-account deployments.
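The budget arithmetic above can be written out directly. The sketch below uses the figures from this issue (8 accounts, a 10s per-account probe timeout, and 10s or 30s CLI timeouts); the function names are hypothetical. The comparison is strict because the CLI budget also has to cover WS connect and auth.

```typescript
// Worst case when probes run one after another: every account may use its
// full per-account budget, so the bounds add up.
function worstCaseSequentialMs(accountCount: number, perAccountTimeoutMs: number): number {
  return accountCount * perAccountTimeoutMs;
}

// Worst case when probes overlap: bounded by a single per-account budget.
function worstCaseConcurrentMs(accountCount: number, perAccountTimeoutMs: number): number {
  return accountCount > 0 ? perAccountTimeoutMs : 0;
}

// The probe work must finish strictly inside the CLI budget, because that
// budget is shared with connection setup and auth.
function fitsWithin(workMs: number, cliTimeoutMs: number): boolean {
  return workMs < cliTimeoutMs;
}

const sequentialMs = worstCaseSequentialMs(8, 10_000); // 80000
const concurrentMs = worstCaseConcurrentMs(8, 10_000); // 10000

console.log(fitsWithin(sequentialMs, 10_000)); // false: 80s of work, 10s budget
console.log(fitsWithin(concurrentMs, 30_000)); // true: 10s of work, 30s budget
```

These are upper bounds only; the run measured in this issue took about 61s in total.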

OpenClaw version

2026.4.15

Operating system

macOS 26.2 (Darwin 25.2.0)

Install method

npm global

Model

zai/glm-5.1

Provider / routing chain

openclaw -> zai

Logs, screenshots, and evidence

Successful run with extended timeout showing real probe duration:

$ time openclaw channels status --probe --timeout 30000
# ... all 8 accounts probed ok ...
openclaw channels status --probe --timeout 30000  65.20s user 6.55s system 116% cpu 1:01.56 total

openclaw health completes in ~600ms because it uses a different code path (single aggregated health check vs per-account probe).

Impact and severity

  • Affected: Any deployment with 5+ channel accounts
  • Severity: Medium-High — --probe is the primary tool for verifying channel connectivity after config changes, and it silently returns stale config-only data
  • Frequency: Always reproducible with enough accounts
  • Consequence: Operators cannot verify channel health without knowing the --timeout workaround; misleading "Gateway not reachable" error message suggests a gateway problem when the issue is probe latency

Suggested fix

Parallel probing + adjusted timeout budget:

  1. In channels.ts, replace the sequential for...await loop with Promise.allSettled() to probe all accounts concurrently
  2. Increase the CLI default timeout for --probe from 10s to 30s (accounts for one slow/failing probe without penalizing all others)

Expected improvement: 8 accounts probed in ~8s (time of slowest single probe) instead of ~60s (sum of all probes), well within a 30s default timeout.

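The parallel-probing change is confined to src/gateway/server-methods/channels.ts, and the per-account probeAccount contract stays the same. The CLI default timeout is set separately; PR #67959 changed it in src/cli/channels-cli.ts and src/commands/channels/status.ts.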


extent analysis

TL;DR

Increase the default timeout for --probe and implement parallel probing to reduce the total probe time.

Guidance

  • Identify the bottleneck in the sequential account probing process and consider using Promise.allSettled() to probe accounts concurrently.
  • Adjust the timeout budget to account for the maximum expected probe time, considering the number of accounts and potential slow or failing probes.
  • Review the code in src/gateway/server-methods/channels.ts to implement the suggested fix.
  • Test the changes with different numbers of accounts to ensure the new timeout and parallel probing work as expected.

Example

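const probeResults = await Promise.allSettled(
  accountIds.map(async (accountId) => {
    // `account` must be resolved from `accountId` on each iteration; the lookup
    // is elided here, as it is in the loop quoted in the issue.
    return plugin.status.probeAccount({
      account,
      timeoutMs,
    });
  })
);
// Each entry is { status: "fulfilled", value } or { status: "rejected", reason },
// in the same order as accountIds.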
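Promise.allSettled never rejects, so its results still have to be turned into per-account status entries. The sketch below overlaps work across channels and across accounts and maps a rejected probe to a failed entry instead of losing the whole batch. ChannelPlugin, AccountStatus, and probeAllChannels are simplified, hypothetical shapes and do not mirror the real OpenClaw plugin interface.

```typescript
// Simplified, hypothetical plugin shape for illustration only.
interface ChannelPlugin {
  id: string;
  accountIds: string[];
  probeAccount: (accountId: string, timeoutMs: number) => Promise<string>;
}

interface AccountStatus {
  channel: string;
  accountId: string;
  ok: boolean;
  detail: string;
}

// Probes every account of every channel concurrently. Output order follows
// plugin order, then account order, regardless of which probe finishes first.
async function probeAllChannels(
  plugins: ChannelPlugin[],
  timeoutMs: number,
): Promise<AccountStatus[]> {
  const perPlugin = await Promise.all(
    plugins.map(async (plugin) => {
      const settled = await Promise.allSettled(
        plugin.accountIds.map((accountId) => plugin.probeAccount(accountId, timeoutMs)),
      );
      // allSettled keeps input order, so index i belongs to accountIds[i].
      return settled.map((result, index): AccountStatus => ({
        channel: plugin.id,
        accountId: plugin.accountIds[index],
        ok: result.status === "fulfilled",
        detail: result.status === "fulfilled" ? result.value : String(result.reason),
      }));
    }),
  );
  return ([] as AccountStatus[]).concat(...perPlugin);
}
```

The outer Promise.all is safe here because the inner callbacks only await allSettled, which does not reject when an individual probe fails.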

Notes

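The parallel-probing part of the suggested fix is confined to src/gateway/server-methods/channels.ts and does not affect the per-account probeAccount contract. The timeout default is set on the CLI side; PR #67959 changed it in src/cli/channels-cli.ts and src/commands/channels/status.ts.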

Recommendation

Raise the default --probe timeout to 30s and probe accounts in parallel; together these should significantly improve the performance and reliability of the channels status --probe command. On builds without the fix, passing a larger --timeout explicitly remains the workaround.
