openclaw - ✅(Solved) Fix Bug: Infinite tool-call loop with no exit when the tool does not exist [2 pull requests, 1 comment, 2 participants]

openclaw/openclaw#62971 (fetched 2026-04-09 07:59:58)

Fix Action

Fixed

PR fix notes

PR #63023: fix(agents): break repeated unknown-tool loops

Description (problem / solution / changelog)

Fixes #62971

Summary

Fixes a runaway tool-loop case where the model keeps retrying the same missing tool name after receiving a tool not found / unknown tool error.

Current generic tool-loop protection is not enough for this case:

  • tools.loopDetection is disabled by default
  • even when enabled, generic-repeat warnings start at 10 and blocking only happens much later
  • unknown/missing tools are not recoverable by retrying, so the session can burn tokens until the user manually resets it

This PR adds a narrow, dedicated breaker for repeated unknown-tool failures.

Problem

When the model calls a tool name that does not exist, the current turn can get stuck in a retry loop:

  1. model calls a missing tool
  2. tool execution returns an error like Tool <name> not found
  3. model retries the same missing tool
  4. loop continues until manual user intervention (/reset, /new, etc.)

Root cause

The repo already has generic tool-loop detection, but it is not sufficient for this failure mode:

  • it is opt-in (tools.loopDetection.enabled)
  • generic repeated calls are warn-only for a long time
  • the existing detectors are tuned for repeated no-progress tool usage, not for immediately fatal missing-tool retries

So repeated unknown-tool failures were being treated like ordinary tool errors instead of a fast-stop condition.

Changes

  • add a dedicated unknown_tool_repeat detector
  • classify repeated unknown tool / tool not found outcomes in tool-call history
  • block the tool call once the same missing tool has already failed twice consecutively with the same arguments
  • keep the existing generic loop-detection defaults unchanged
  • emit the usual diagnostic tool.loop event/log for this dedicated breaker
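
The blocking rule in the list above (same missing tool, same arguments, two consecutive failures) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `ToolCallOutcome` shape and the helpers `looksLikeUnknownToolError`, `countTrailingUnknownToolFailures`, and `checkUnknownToolRepeat` are hypothetical names, and the error patterns are assumed from the "unknown tool" / "tool not found" wording in the description.

```typescript
// Hypothetical history record; the real type used by the PR is not shown in its description.
interface ToolCallOutcome {
  toolName: string;
  argsKey: string; // serialized arguments (assumes a consistent key order)
  errorText?: string; // undefined when the call succeeded
}

// Assumed patterns, based on the "unknown tool" / "tool not found" wording above.
function looksLikeUnknownToolError(errorText: string | undefined): boolean {
  if (!errorText) return false;
  return /\bunknown tool\b/i.test(errorText) || /\btool\b.*\bnot found\b/i.test(errorText);
}

// Counts the most recent outcomes that are unknown-tool failures for exactly
// this tool name and argument key, stopping at the first entry that differs.
function countTrailingUnknownToolFailures(
  history: readonly ToolCallOutcome[],
  toolName: string,
  argsKey: string,
): number {
  let streak = 0;
  for (const entry of [...history].reverse()) {
    const sameCall = entry.toolName === toolName && entry.argsKey === argsKey;
    if (!sameCall || !looksLikeUnknownToolError(entry.errorText)) break;
    streak++;
  }
  return streak;
}

// Returns a blocking message once the same missing tool has already failed
// `threshold` times in a row; returns null when the call may proceed.
function checkUnknownToolRepeat(
  history: readonly ToolCallOutcome[],
  toolName: string,
  args: unknown,
  threshold = 2,
): string | null {
  const streak = countTrailingUnknownToolFailures(history, toolName, JSON.stringify(args));
  if (streak < threshold) return null;
  return (
    `CRITICAL: ${toolName} is not an available tool and has already failed ` +
    `${streak} consecutive times. Stop retrying this tool call.`
  );
}
```

Because the check runs before the call executes, the third identical attempt is the first one to be blocked, which matches the before/after behaviour described in the Reproduction section.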

Scope boundary

This PR does not:

  • change the global default for tools.loopDetection
  • retune generic repeat / poll / ping-pong thresholds
  • change normal recoverable tool-error handling
  • affect successful or progressing tool calls

It only adds a fast-stop path for repeated missing-tool retries.

Reproduction

Minimal local repro before this fix:

const tool = wrapToolWithBeforeToolCallHook(
  {
    name: "some_non_existent_tool",
    execute: async () => {
      throw new Error("Tool some_non_existent_tool not found");
    },
  },
  {
    agentId: "main",
    sessionKey: "repro:unknown-tool-loop",
    loopDetection: { enabled: false },
  },
);

Before this fix:

attempt 1: throws Tool some_non_existent_tool not found
attempt 2: throws Tool some_non_existent_tool not found
attempt 3+: same error continues, no automatic stop

After this fix:

attempt 1: original missing-tool error
attempt 2: original missing-tool error
attempt 3: blocked with a critical breaker message telling the model to stop retrying and ask for correction / installation

Validation

Tests:

pnpm test src/agents/tool-loop-detection.test.ts src/agents/pi-tools.before-tool-call.e2e.test.ts

Build:

pnpm build

Manual repro after patch:

  • repeated the same missing-tool call with loopDetection.enabled=false
  • confirmed the third identical attempt is blocked by the dedicated breaker
  • confirmed diagnostic logging emits detector=unknown_tool_repeat

Observed repro output after patch:

attempt 1: threw -> Error: Tool some_non_existent_tool not found
attempt 2: threw -> Error: Tool some_non_existent_tool not found
attempt 3: threw -> Error: CRITICAL: some_non_existent_tool is not an available tool and has already failed 2 consecutive times. Stop retrying this tool call and ask the user to correct the tool name or install the missing tool.

Files

  • src/agents/tool-loop-detection.ts
  • src/agents/pi-tools.before-tool-call.ts
  • src/agents/pi-tools.before-tool-call.runtime.ts
  • src/logging/diagnostic-session-state.ts
  • src/logging/diagnostic.ts
  • src/infra/diagnostic-events.ts
  • src/agents/tool-loop-detection.test.ts
  • src/agents/pi-tools.before-tool-call.e2e.test.ts

AI Assistance

Made with Codex

Changed files

  • src/agents/pi-tools.before-tool-call.e2e.test.ts (modified, +39/-2)
  • src/agents/tool-loop-detection.test.ts (modified, +112/-0)
  • src/agents/tool-loop-detection.ts (modified, +103/-1)
  • src/infra/diagnostic-events.ts (modified, +6/-1)
  • src/logging/diagnostic-session-state.ts (modified, +2/-0)
  • src/logging/diagnostic.ts (modified, +6/-1)

PR #63041: fix: detect and abort infinite loop when agent repeatedly calls non-existent tools

Description (problem / solution / changelog)

Summary

Fixes #62971

When the agent calls a tool that doesn't exist in the registered tool set, the pi-agent-core SDK returns an immediate error ("Tool X not found") and bypasses all beforeToolCall/afterToolCall hooks. This means OpenClaw's existing tool-loop-detection framework never sees these failures, causing the agent to retry the same non-existent tool indefinitely — burning tokens and never terminating.

This PR adds a dedicated unknown_tool detector that catches this specific failure mode and aborts the run after 3 consecutive unknown-tool errors.

Root Cause

The SDK's prepareToolCall() short-circuits for unknown tools: it returns an error result directly without invoking beforeToolCall or afterToolCall. The only OpenClaw code path that observes these errors is the tool_execution_end event handler (handleToolExecutionEnd).

Changes

Detection (tool-loop-detection.ts)

  • Added "unknown_tool" to LoopDetectorKind
  • Added isUnknownToolErrorText() to match SDK's "Tool X not found" error pattern
  • Added getUnknownToolStreak() and detectUnknownToolLoop() with a critical threshold of 3 consecutive errors
  • This detector is always active regardless of the main loop-detection enabled flag, since unknown-tool loops are always unrecoverable
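
As a rough illustration of the detection step listed above, the sketch below reuses the function names from the PR description (`isUnknownToolErrorText`, `getUnknownToolStreak`, `detectUnknownToolLoop`), but the signatures, bodies, regular expression, and the `LoopCheck` result type are assumptions. It also assumes the streak counts any consecutive unknown-tool errors regardless of tool name, which the description does not confirm.

```typescript
// Simplified record; the description only says ToolCallRecord gains an `unknownTool` flag.
interface ToolCallRecord {
  toolName: string;
  unknownTool?: boolean;
}

// Hypothetical result type for this sketch.
type LoopCheck =
  | { stuck: false }
  | { stuck: true; detector: "unknown_tool"; count: number; message: string };

// Assumed pattern for the SDK's "Tool X not found" text, with or without quotes around X.
function isUnknownToolErrorText(text: string): boolean {
  return /^\s*Tool\s+'?[^\s']+'?\s+not found\b/i.test(text);
}

// Length of the trailing run of unknown-tool records; any normal call resets it.
function getUnknownToolStreak(history: readonly ToolCallRecord[]): number {
  let streak = 0;
  for (const record of [...history].reverse()) {
    if (!record.unknownTool) break;
    streak++;
  }
  return streak;
}

// Reports a critical loop once the streak reaches the threshold (3 in the PR description).
// Deliberately takes no "enabled" flag, mirroring the always-active behaviour described above.
function detectUnknownToolLoop(
  history: readonly ToolCallRecord[],
  criticalThreshold = 3,
): LoopCheck {
  const count = getUnknownToolStreak(history);
  if (count < criticalThreshold) return { stuck: false };
  return {
    stuck: true,
    detector: "unknown_tool",
    count,
    message: `Aborting: ${count} consecutive calls to tools that do not exist.`,
  };
}
```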

Integration (pi-embedded-subscribe.handlers.tools.ts)

  • In handleToolExecutionEnd, when isToolError is true: check if the error matches the unknown-tool pattern, record it in diagnostic session state, and run the detector
  • If critical threshold is reached, set sessionState.unknownToolLoopDetected flag

Termination (attempt.ts)

  • Added a streamFn wrapper that checks unknownToolLoopDetected before each LLM call
  • If the flag is set, immediately aborts the run via runAbortController.abort() with a descriptive error
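
The termination step can be pictured as a thin wrapper around the function that issues each LLM call. The sketch below is an assumption-heavy illustration: `guardStreamFn`, `StreamFn`, and the reduced `SessionState` are hypothetical, and only the `unknownToolLoopDetected` flag and the use of an abort controller come from the description.

```typescript
// Reduced session state; only the flag named in the PR description is modelled.
interface SessionState {
  unknownToolLoopDetected: boolean;
}

// Hypothetical shape of the function that performs one LLM call.
type StreamFn<Req, Res> = (request: Req) => Promise<Res>;

// Minimal structural type; a standard AbortController satisfies it.
interface Abortable {
  abort(reason?: unknown): void;
}

// Wraps `inner` so the flag is checked before every LLM call. When the flag is
// set, the run is aborted and the inner function is never invoked.
function guardStreamFn<Req, Res>(
  inner: StreamFn<Req, Res>,
  sessionState: SessionState,
  runAbortController: Abortable,
): StreamFn<Req, Res> {
  return (request: Req): Promise<Res> => {
    if (sessionState.unknownToolLoopDetected) {
      const error = new Error(
        "Run aborted: the agent repeatedly called a tool that does not exist.",
      );
      runAbortController.abort(error);
      return Promise.reject(error);
    }
    return inner(request);
  };
}
```

Checking the flag before the next LLM call, rather than inside the tool hook, fits the root cause described above: the SDK bypasses the before/after hooks for unknown tools, so the next model request is the earliest point at which the run can be stopped.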

Type Extensions

  • config/types.tools.ts: added unknownTool detector config and unknownToolCriticalThreshold
  • logging/diagnostic-session-state.ts: added unknownTool flag on ToolCallRecord and unknownToolLoopDetected on SessionState
  • logging/diagnostic.ts & infra/diagnostic-events.ts: added "unknown_tool" to detector kind union
  • pi-tools.before-tool-call.runtime.ts: exported new detection functions for lazy loading

Test Plan

  • Unit tests for isUnknownToolErrorText — matches SDK error format, rejects unrelated errors
  • Unit tests for detectUnknownToolLoop — threshold detection, streak reset by normal calls, mixed unknown tools, custom threshold, independent of main enabled flag
  • pnpm check (tsgo + oxlint) passes
  • Existing tool-loop-detection tests remain green

Changed files

  • src/agents/pi-embedded-runner/run/attempt.ts (modified, +40/-0)
  • src/agents/pi-embedded-subscribe.handlers.tools.ts (modified, +67/-0)
  • src/agents/pi-tools.before-tool-call.runtime.ts (modified, +4/-0)
  • src/agents/tool-loop-detection.test.ts (modified, +127/-0)
  • src/agents/tool-loop-detection.ts (modified, +78/-1)
  • src/config/schema.base.generated.ts (modified, +16/-0)
  • src/config/types.tools.ts (modified, +4/-0)
  • src/config/zod-schema.agent-runtime.ts (modified, +2/-0)
  • src/infra/diagnostic-events.ts (modified, +6/-1)
  • src/logging/diagnostic-session-state.ts (modified, +6/-0)
  • src/logging/diagnostic.ts (modified, +6/-1)

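Bug: Infinite tool-call loop with no exit when the tool does not exist

Description

When the LLM calls a tool that does not exist (misspelled tool name, or tool not installed), the system enters an infinite loop: the LLM keeps calling the same non-existent tool and cannot exit on its own. The user has to manually run /reset or /new to restart the session before it recovers.

Steps to reproduce

  1. Have the LLM call a non-existent tool name, for example some_non_existent_tool
  2. The system returns the error: Tool 'some_non_existent_tool' not found
  3. After receiving the error, the LLM still retries the same non-existent tool and gets stuck in an endless loop
  4. The loop continues until the user manually intervenes and restarts

Expected behavior

  • After N consecutive calls (2 suggested) to the same non-existent tool, the system should stop the loop automatically
  • Raise a clear error to the user, asking them to step in and correct the tool name or install the tool
  • The LLM should not be allowed to retry indefinitely, wasting tokens and time

Environment

  • Operating system: macOS 25.3.0 (arm64)
  • Node version: v25.8.2
  • OpenClaw version: current latest development build

Suggested fix direction

  1. Add a counter in the tool-call layer that tracks how many times the same non-existent tool is called consecutively within a session
  2. Once the threshold is exceeded (for example, 2 times), stop dispatching, return the error directly to the user, and end the current turn
  3. Alternatively, state explicitly in the error message that the tool does not exist and that the LLM should stop calling it and wait for the user to fix it, breaking the loop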

extent analysis

TL;DR

Track consecutive calls to a non-existent tool and stop the loop once a threshold (for example, 2 attempts) is reached, so the model cannot keep retrying indefinitely.

Guidance

  • Introduce a counter in the tool invocation layer to track the number of consecutive calls to the same non-existent tool.
  • Set a threshold (e.g., 2 attempts) and stop the loop when it is reached, returning an error to the user and terminating the current turn.
  • Consider modifying error messages to explicitly instruct the LLM to stop calling the non-existent tool and wait for user correction.
  • Review the current error handling mechanism to ensure it properly handles and propagates errors from tool invocations.

Example

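// Name of the last missing tool and how many times in a row it has failed.
let lastMissingTool: string | null = null;
let consecutiveFailures = 0;
const maxAttempts = 2;

// Hypothetical registry check; replace with the real tool lookup.
const registeredTools = new Set<string>(["read_file"]);
const toolExists = (toolName: string): boolean => registeredTools.has(toolName);

// Within the tool invocation function
function invokeTool(toolName: string): { error?: string; stop?: boolean } {
  if (toolExists(toolName)) {
    // A valid call breaks the streak; proceed with tool invocation
    lastMissingTool = null;
    consecutiveFailures = 0;
    return {};
  }
  // Count only back-to-back failures for the same missing tool
  consecutiveFailures = toolName === lastMissingTool ? consecutiveFailures + 1 : 1;
  lastMissingTool = toolName;
  if (consecutiveFailures >= maxAttempts) {
    // Return error to user and terminate the current turn
    return { error: `Tool '${toolName}' not found. Stopping attempts.`, stop: true };
  }
  // Single failure: report it so the model can correct the tool name
  return { error: `Tool '${toolName}' not found.` };
}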

Notes

This solution assumes that the tool invocation mechanism can be modified to include a counter and threshold check. The exact implementation details may vary depending on the existing codebase and architecture.

Recommendation

Apply workaround: Implement the counter and threshold mechanism to prevent infinite loops when calling non-existent tools, as it directly addresses the issue without requiring version upgrades or significant changes to the underlying system.
