openclaw - ✅(Solved) Fix [Feature]: Add run-scoped cross-tool consecutive error cascade detection [1 pull request, 1 comment, 2 participants]

GitHub stats
openclaw/openclaw#75923 (fetched 2026-05-03 04:44:14)
Comments: 1 · Participants: 2 · Timeline: 17 · Reactions: 2
Timeline (top): mentioned ×6, subscribed ×6, cross-referenced ×3, commented ×1

Add a run-scoped loop-detection guard for cross-tool consecutive error cascades.

Root Cause

Because each tool call is different, genericRepeat does not trigger. That lets a run keep burning tokens and time even though the outcome is uniformly failure.

Fix Action

Fixed

PR fix notes

PR #75924: fix(loop-detection): block run-scoped consecutive cross-tool error cascades

Description (problem / solution / changelog)

Summary

  • Problem: loop detection misses run-scoped cross-tool error cascades, so a run can keep pivoting across different failing tools without tripping the existing repeated-call guards.
  • Why it matters: this wastes tokens and runtime, and hides the fact that the run is already stuck on a shared underlying failure.
  • What changed: added a narrow consecutive_errors detector plus config/schema/docs/test coverage, all aligned to existing runId-scoped loop-detection semantics.
  • What did NOT change (scope boundary): no per-turn tool-call cap, no session-level counters, no wall-clock turn inference, no Control UI/browser-bundle work.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #75923
  • Related #72555
  • Related #75468
  • Related #75841
  • Related #53329
  • Related #42933
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: the loop detector tracked tool names, args, and repeated no-progress patterns, but it had no run-scoped detector for consecutive failing outcomes across different tools.
  • Missing detection / guardrail: no guard existed for the pattern “different tools, same run, every completed call fails.”
  • Contributing context (if known): a previous broader draft tried to solve this with a per-turn counter, but review showed that wall-clock turn inference was semantically wrong.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file:
    • src/agents/tool-loop-detection.test.ts
    • src/agents/pi-tools.before-tool-call.e2e.test.ts
  • Scenario the test should lock in:
    • same-run consecutive cross-tool failures eventually block
    • fresh run ids do not inherit the previous run's error streak
  • Why this is the smallest reliable guardrail: the pure detector test locks the counting semantics while the before-tool-call e2e test proves the real hook path and runId scoping.
  • Existing test that already covers this (if any): none before this PR.
  • If no new test is added, why not: N/A
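
The two scenarios in this test plan can be illustrated with a small standalone sketch. It is not the code from src/agents/tool-loop-detection.ts: the class RunErrorStreaks and its methods are hypothetical names, and the only behavior taken from the PR text is that the streak is scoped by runId and reset by any successful outcome.

```typescript
// Hypothetical in-memory tracker. The runId scoping comes from the PR
// description; the class and method names are illustrative assumptions.
class RunErrorStreaks {
  private readonly streaks = new Map<string, number>();
  private readonly threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  // Record one completed tool outcome for a run.
  // A failure extends that run's streak; a success resets it to zero.
  record(runId: string, failed: boolean): void {
    const current = this.streaks.get(runId) ?? 0;
    this.streaks.set(runId, failed ? current + 1 : 0);
  }

  // Consulted before the next tool call of the given run.
  shouldBlock(runId: string): boolean {
    return (this.streaks.get(runId) ?? 0) >= this.threshold;
  }
}

const tracker = new RunErrorStreaks(3);
tracker.record("run-1", true); // read fails
tracker.record("run-1", true); // list fails
tracker.record("run-1", true); // write fails

console.log(tracker.shouldBlock("run-1")); // true: same-run streak reached the threshold
console.log(tracker.shouldBlock("run-2")); // false: a fresh run id starts with no streak
```

Keying the counter by runId is what keeps one stuck run from blocking the next one.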

User-visible / Behavior Changes

  • OpenClaw can now block obvious run-scoped cross-tool error cascades via tools.loopDetection.consecutiveErrorThreshold.
  • No behavior change unless loop detection is enabled.
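
As a rough sketch of the gating described above, the function below returns a threshold only when loop detection is enabled. The interface and function names are hypothetical, and the integer validation is an assumption of this sketch. The PR text mentions a CONSECUTIVE_ERROR_THRESHOLD constant without stating its value, so the fallback is passed in by the caller rather than guessed here.

```typescript
// Hypothetical mirror of the config fragment; only the two keys that
// appear in the source are modeled.
interface LoopDetectionConfig {
  enabled?: boolean;
  consecutiveErrorThreshold?: number;
}

// Returns the active threshold, or undefined when the detector stays inactive.
// `fallback` stands in for the project's default, whose value is not given
// in the source.
function resolveConsecutiveErrorThreshold(
  cfg: LoopDetectionConfig | undefined,
  fallback: number,
): number | undefined {
  // No behavior change unless loop detection is enabled.
  if (!cfg?.enabled) return undefined;

  const threshold = cfg.consecutiveErrorThreshold ?? fallback;
  // Assumption: a threshold only makes sense as a positive integer.
  if (!Number.isInteger(threshold) || threshold <= 0) {
    throw new RangeError("consecutiveErrorThreshold must be a positive integer");
  }
  return threshold;
}

// With the config shown in this report, the detector is active at 10.
const active = resolveConsecutiveErrorThreshold(
  { enabled: true, consecutiveErrorThreshold: 10 },
  5,
);
console.log(active); // 10
```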

Diagram (if applicable)

Before:
read(error) -> list(error) -> write(error) -> exec(error) -> keep trying tools

After:
read(error) -> list(error) -> write(error) -> threshold reached -> next tool call blocked

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: local source checkout
  • Model/provider: N/A for targeted tests; local dev gateway for runtime smoke
  • Integration/channel (if any): agent tool loop detection + local gateway runtime
  • Relevant config (redacted): dev profile config under ~/.openclaw-dev/openclaw.json

Steps

  1. Run pnpm test src/agents/tool-loop-detection.test.ts src/agents/pi-tools.before-tool-call.e2e.test.ts
  2. Run pnpm build
  3. Start local dev gateway from this checkout: pnpm openclaw --dev setup && pnpm openclaw --dev gateway run --port 19001 --verbose
  4. Probe health: curl -sv --max-time 15 http://127.0.0.1:19001/healthz
  5. Probe root HTTP surface: curl -I --max-time 10 http://127.0.0.1:19001/

Expected

  • targeted tests pass
  • build passes
  • gateway reaches ready
  • /healthz returns live status
  • root HTTP surface returns 200

Actual

  • targeted tests passed
  • build passed
  • gateway reached ready
  • /healthz returned {"ok":true,"status":"live"}
  • root HTTP surface returned HTTP/1.1 200 OK

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Initial failing proof before implementation:

  • new unit tests failed because CONSECUTIVE_ERROR_THRESHOLD and the detector logic did not exist yet

Runtime smoke evidence after implementation:

  • local dev gateway log reached ready
  • /healthz response: {"ok":true,"status":"live"}
  • root HTTP response: HTTP/1.1 200 OK

Human Verification (required)

  • Verified scenarios:
    • targeted detector tests
    • hook-path run-scoped blocking/isolation tests
    • full local build
    • local dev gateway startup from the modified checkout
    • local health and HTTP probes against the running instance
  • Edge cases checked:
    • same-run block after cross-tool failures
    • cross-run isolation via runId
  • What you did not verify:
    • a real provider-driven long autonomous run naturally hitting the new detector
    • globally installed managed gateway service restart path

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: some legitimate workflows may intentionally try several failing tools in a row before succeeding.
    • Mitigation: the detector is optional through loop detection enablement, thresholded, and naturally reset by any successful tool outcome.
  • Risk: overlap confusion with broader retry-budget / turn-cap proposals.
    • Mitigation: this PR stays narrow and explicitly excludes per-turn or session-wide counters.

Changed files

  • CHANGELOG.md (modified, +2/-0)
  • docs/.generated/config-baseline.sha256 (modified, +2/-2)
  • docs/gateway/config-tools.md (modified, +4/-0)
  • docs/tools/loop-detection.md (modified, +3/-1)
  • src/agents/pi-tools.before-tool-call.e2e.test.ts (modified, +108/-0)
  • src/agents/pi-tools.before-tool-call.ts (modified, +0/-7)
  • src/agents/tool-loop-detection.test.ts (modified, +98/-0)
  • src/agents/tool-loop-detection.ts (modified, +39/-1)
  • src/config/config-misc.test.ts (modified, +25/-0)
  • src/config/schema.base.generated.ts (modified, +18/-0)
  • src/config/schema.help.ts (modified, +2/-0)
  • src/config/schema.labels.ts (modified, +1/-0)
  • src/config/types.tools.ts (modified, +2/-0)
  • src/config/zod-schema.agent-runtime.ts (modified, +13/-0)
  • src/infra/diagnostic-events.ts (modified, +2/-1)
  • src/logging/diagnostic.ts (modified, +2/-1)

Code Example

{
  "tools": {
    "loopDetection": {
      "enabled": true,
      "consecutiveErrorThreshold": 10
    }
  }
}

Summary

Add a run-scoped loop-detection guard for cross-tool consecutive error cascades.

Problem to solve

OpenClaw already detects repeated identical calls, known polling loops, ping-pong alternation, repeated unknown tools, and some no-progress repetition. It still misses a common failure mode where the agent keeps pivoting across different tools while every completed tool call fails.

Example pattern inside one run:

  1. read fails
  2. list fails
  3. write fails
  4. exec fails
  5. the run keeps trying more tools instead of surfacing the shared failure

Because each tool call is different, genericRepeat does not trigger. That lets a run keep burning tokens and time even though the outcome is uniformly failure.

Proposed solution

Add a new run-scoped loop detector named consecutive_errors.

Behavior:

  • inspect the tail of the current run-scoped tool-call history
  • count consecutive completed tool outcomes whose resultHash encodes an error
  • once the streak reaches a configurable threshold, block the next tool call with a critical loop-detection failure
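
The three steps above could be sketched roughly as follows. This is a minimal illustration, not the implementation in the PR: the record shape, the "error:" hash prefix, and the verdict object are all assumptions, as is the choice to skip calls that have no completed outcome yet.

```typescript
// Hypothetical record shape; the real history entry may differ.
interface ToolCallRecord {
  toolName: string;
  // Undefined while the call has no completed outcome yet.
  resultHash?: string;
}

// Assumption: an error outcome is recognizable from its hash by an "error:" prefix.
const isErrorHash = (hash: string): boolean => hash.startsWith("error:");

// Walk the tail of the run-scoped history and count completed error outcomes,
// stopping at the first completed success.
function countTrailingErrors(history: readonly ToolCallRecord[]): number {
  let streak = 0;
  for (let i = history.length - 1; i >= 0; i--) {
    const hash = history[i]?.resultHash;
    if (hash === undefined) continue; // not completed: neither counts nor resets
    if (!isErrorHash(hash)) break; // a success ends the streak
    streak++;
  }
  return streak;
}

interface LoopVerdict {
  stuck: boolean;
  count: number;
  detector?: "consecutive_errors";
  level?: "critical";
}

// Once the streak reaches the threshold, report a critical verdict so the
// caller can block the next tool call.
function detectConsecutiveErrors(
  history: readonly ToolCallRecord[],
  threshold: number,
): LoopVerdict {
  const count = countTrailingErrors(history);
  return count >= threshold
    ? { stuck: true, count, detector: "consecutive_errors", level: "critical" }
    : { stuck: false, count };
}
```

Because the count depends only on outcomes and never on tool names or arguments, a read, list, write sequence of failures is treated the same as three failures of one tool.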

Suggested config:

{
  "tools": {
    "loopDetection": {
      "enabled": true,
      "consecutiveErrorThreshold": 10
    }
  }
}

Important non-goal: this is not asking for a per-turn tool-call cap or retry budget. The request is intentionally narrower: stop obvious cross-tool error cascades within one run, using the same runId-scoped semantics the current loop detector already uses.

Alternatives considered

  • Per-turn or session-wide tool-call caps: broader, but easier to implement incorrectly, and they already overlap with other upstream threads.
  • Retry budgets keyed by error digest: useful, but a larger design surface than needed for this specific failure mode.
  • Do nothing and rely on existing repeat detectors: insufficient, because those detectors are mostly pattern-specific and do not catch cross-tool failure pivots.

Impact

  • Affected: agents using tools in iterative or autonomous workflows
  • Severity: Medium to High
  • Frequency: Intermittent but costly when it happens
  • Consequence: wasted tokens, longer runs, more manual intervention, and poorer failure reporting when the system should surface a shared underlying problem earlier

Evidence/examples

This narrower fix was locally validated with:

  • targeted unit tests for the detector
  • hook-path e2e tests proving same-run blocking and cross-run isolation
  • successful local build
  • local dev gateway startup from the modified checkout
  • successful /healthz probe on the running instance

Related upstream threads for context:

  • #72555
  • #75468
  • #75841
  • #53329
  • #42933

This request is narrower than those threads: it is specifically about a run-scoped, cross-tool, consecutive error cascade guard.

Additional information

A local implementation was validated on branch fix/loop-detection-consecutive-errors-only at commit b864528a59fcd85efec9dc09ffa4de7bda2c1f26 before drafting this request.

extent analysis

TL;DR

Implement a run-scoped loop-detection guard to prevent cross-tool consecutive error cascades by adding a consecutive_errors detector.

Guidance

  • Review the proposed consecutive_errors detector behavior to ensure it aligns with the desired failure mode detection.
  • Configure the consecutiveErrorThreshold value based on the specific use case and tolerance for consecutive errors.
  • Verify the detector's effectiveness by testing it with targeted unit tests and hook-path e2e tests, as done in the local implementation.
  • Consider the impact on agents using tools in iterative or autonomous workflows and monitor for any unintended consequences.

Example

{
  "tools": {
    "loopDetection": {
      "enabled": true,
      "consecutiveErrorThreshold": 10
    }
  }
}

This configuration enables the consecutive_errors detector with a threshold of 10 consecutive errors.

Notes

The proposed solution is intentionally narrower in scope, focusing on run-scoped, cross-tool, consecutive error cascades. It does not address per-turn or session-wide tool-call caps or retry budgets.

Recommendation

Adopt the consecutive_errors detector to stop cross-tool consecutive error cascades. It has been locally validated, targets this one failure mode, and is implemented in PR #75924.
