hermes - [Bug]: background-review fork advertises the full tool schema to LOCAL endpoints, making weak local models thrash the deny-wall



Bug Description

The background memory/skill review fork (agent/background_review.py, _run_review_in_thread) advertises the parent agent's full toolset in the request tools[], then restricts actual dispatch to memory+skills via a thread-local runtime whitelist. This split is deliberate and correct for cache-backed providers: keeping tools[] byte-identical preserves the Anthropic/OpenRouter prefix cache (see the lineage in #16569 → #25434 / #17276 → #29704 for why the schema was un-narrowed).
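To illustrate why the schema was left wide for cache-backed providers, here is a minimal sketch. It assumes a provider's prefix cache matches on the exact serialized bytes of the request prefix, tools[] included. The `prefix_key` and `tool` helpers and the tool names are hypothetical. None of this is Hermes code or any provider's actual cache implementation.

```python
import hashlib
import json


def prefix_key(tools, system_prompt):
    """Hypothetical stand-in for a provider-side prefix-cache key.

    Assumption: the cache matches on the exact bytes of the request
    prefix, so any change to tools[] yields a different key.
    """
    payload = json.dumps({"tools": tools, "system": system_prompt}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def tool(name):
    # Minimal OpenAI-style tool entry; real schemas also carry
    # descriptions and parameter definitions.
    return {"type": "function", "function": {"name": name}}


# Illustrative tool names only.
parent_tools = [tool(n) for n in ("memory", "skill_manage", "write_file", "read_file", "terminal")]
narrowed_tools = [tool(n) for n in ("memory", "skill_manage")]

parent_key = prefix_key(parent_tools, "system prompt")
fork_full_key = prefix_key(list(parent_tools), "system prompt")
fork_narrow_key = prefix_key(narrowed_tools, "system prompt")

# Full inheritance keeps the prefix identical to the parent's.
print(parent_key == fork_full_key)
# Narrowing changes the prefix, so a byte-exact cache would miss.
print(parent_key == fork_narrow_key)
```

Under that assumption, narrowing tools[] in the fork would forfeit the cached prefix, which is the cost the full-inheritance design avoids on remote providers.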

On a LOCAL endpoint (omlx / Ollama / llama.cpp via a custom provider) there is no prefix cache to preserve — the cache-parity rationale doesn't apply. But the fork still advertises the full schema. A weaker local model then imitates the snapshot conversation history (full of write_file / read_file / terminal / delegate_task calls), emits those tool calls, and burns review turns hitting the whitelist deny-wall:

Background review denied non-whitelisted tool: write_file. Only memory/skill tools are allowed.

Advertising the wide schema buys nothing here and actively misleads the model — pure wasted local compute. (Observed repeatedly with Qwen3-Coder via a local OpenAI-compatible endpoint: ~5 denials per review fork, gone after narrowing.) This compounds other local bg-review token waste reported in #25758 and #12340.
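The deny-wall behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in agent/background_review.py: `_review_state`, `ALLOWED_REVIEW_TOOLS`, `dispatch`, and `run_review` are hypothetical names, and the tool names are placeholders. Only the denial message is taken from the report.

```python
import threading

# Hypothetical names throughout; this is not the Hermes implementation.
_review_state = threading.local()

ALLOWED_REVIEW_TOOLS = {"memory", "skill_manage"}  # illustrative tool names


def dispatch(tool_name, handlers):
    """Run a tool call unless a thread-local whitelist forbids it."""
    whitelist = getattr(_review_state, "whitelist", None)
    if whitelist is not None and tool_name not in whitelist:
        return (
            f"Background review denied non-whitelisted tool: {tool_name}. "
            "Only memory/skill tools are allowed."
        )
    return handlers[tool_name]()


def run_review(tool_calls, handlers):
    """Simulate a review fork and count how many calls end in a denial."""
    _review_state.whitelist = ALLOWED_REVIEW_TOOLS
    try:
        results = [dispatch(name, handlers) for name in tool_calls]
    finally:
        # Clear the whitelist so the thread is not left restricted.
        _review_state.whitelist = None
    denials = sum(r.startswith("Background review denied") for r in results)
    return results, denials


handlers = {
    "memory": lambda: "memory updated",
    "write_file": lambda: "file written",
    "terminal": lambda: "command run",
}

# A model imitating the parent's history emits calls it may not make.
results, denials = run_review(["write_file", "terminal", "memory"], handlers)
print(denials)
```

Each denied call still costs a full model turn, which is why advertising tools the fork cannot use is expensive on local hardware.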

Root Cause

In _run_review_in_thread, the review AIAgent(...) is constructed with:

enabled_toolsets=getattr(agent, "enabled_toolsets", None),
disabled_toolsets=getattr(agent, "disabled_toolsets", None),

unconditionally. The full-inheritance choice (#29704, "salvage #29568") exists solely to keep tools[] byte-identical for prefix-cache parity — a property that only matters for remote cache-backed providers. The codebase already distinguishes local endpoints via is_local_endpoint(base_url) (agent/model_metadata.py, used in agent_init.py and the local stream-timeout paths), but that predicate is not consulted at the review-fork construction site.

Expected Behavior

On a local endpoint, the review fork should advertise only the toolsets it can actually use (["memory", "skills"]), matching its real permissions. On remote/cache-backed endpoints, behavior is unchanged (full inheritance, cache preserved). The runtime whitelist stays as a belt-and-suspenders dispatch guard in both cases.

Proposed Fix

Gate the review fork's enabled_toolsets/disabled_toolsets on the existing is_local_endpoint(base_url):

  • local → enabled_toolsets=["memory", "skills"], disabled_toolsets=None
  • otherwise → keep getattr(agent, "enabled_toolsets"/"disabled_toolsets", None) (current behavior, cache parity intact)
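A minimal sketch of the gating described in the two cases above. The `is_local_endpoint` body here is a hypothetical stand-in that treats loopback hosts as local; the real predicate in agent/model_metadata.py may use different rules. `review_toolsets` and `FakeAgent` are illustrative names, not part of the codebase.

```python
from urllib.parse import urlparse


def is_local_endpoint(base_url):
    """Hypothetical stand-in for the predicate in agent/model_metadata.py.

    Assumption: "local" means a loopback host. The real predicate may
    use different or broader rules.
    """
    host = urlparse(base_url or "").hostname or ""
    return host in {"localhost", "127.0.0.1", "::1"}


def review_toolsets(agent, base_url):
    """Return (enabled_toolsets, disabled_toolsets) for the review fork."""
    if is_local_endpoint(base_url):
        # No prefix cache to preserve: advertise only what dispatch allows.
        return ["memory", "skills"], None
    # Remote/cache-backed: inherit the parent's toolsets unchanged.
    return (
        getattr(agent, "enabled_toolsets", None),
        getattr(agent, "disabled_toolsets", None),
    )


class FakeAgent:
    # Illustrative parent agent with a wide toolset.
    enabled_toolsets = ["memory", "skills", "files", "terminal"]
    disabled_toolsets = ["browser"]


local = review_toolsets(FakeAgent(), "http://localhost:11434/v1")
remote = review_toolsets(FakeAgent(), "https://openrouter.ai/api/v1")
print(local)
print(remote)
```

The returned pair would then be passed as the enabled_toolsets/disabled_toolsets arguments when the review AIAgent(...) is constructed, with the runtime whitelist left in place as a second guard.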

No config knob needed (cf. #36967, which adds a manual override for tool-count-capped providers — orthogonal; this auto-fires on local). I have a tested patch ready and will open a PR.

Related

  • #29704 / #25434 / #17276 / #16569: the cache-parity lineage that this proposal must stay compatible with
  • #25758, #12340: other reports of local bg-review burning tokens
  • #36967: adjacent; config-overridable review toolsets for provider tool-count caps (manual, not local-gated)

Environment

  • Hermes: current main
  • OS: macOS (Apple Silicon)
  • Provider: custom / local (OpenAI-compatible, Qwen3-Coder)
