hermes - 💡(How to fix) Fix [Bug]: background-review fork advertises the full tool schema to LOCAL endpoints, making weak local models thrash the deny-wall

hermes2026-06-05 17:42:05

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

Root Cause

In _run_review_in_thread, the review AIAgent(...) is constructed with:

enabled_toolsets=getattr(agent, "enabled_toolsets", None),
disabled_toolsets=getattr(agent, "disabled_toolsets", None),

unconditionally. The full-inheritance choice (#29704, "salvage #29568") exists solely to keep tools[] byte-identical for prefix-cache parity — a property that only matters for remote cache-backed providers. The codebase already distinguishes local endpoints via is_local_endpoint(base_url) (agent/model_metadata.py, used in agent_init.py and the local stream-timeout paths), but that predicate is not consulted at the review-fork construction site.

Fix Action

Fix / Workaround

The background memory/skill review fork (agent/background_review.py, _run_review_in_thread) advertises the parent agent's full toolset in the request tools[], then restricts actual dispatch to memory+skills via a thread-local runtime whitelist. This split is deliberate and correct for cache-backed providers: keeping tools[] byte-identical preserves the Anthropic/OpenRouter prefix cache (see the lineage in #16569 → #25434 / #17276 → #29704 for why the schema was un-narrowed).

On a local endpoint, the review fork should advertise only the toolsets it can actually use (["memory", "skills"]), matching its real permissions. On remote/cache-backed endpoints, behavior is unchanged (full inheritance, cache preserved). The runtime whitelist stays as a belt-and-suspenders dispatch guard in both cases.

No config knob needed (cf. #36967, which adds a manual override for tool-count-capped providers — orthogonal; this auto-fires on local). I have a tested patch ready and will open a PR.

Code Example

Background review denied non-whitelisted tool: write_file. Only memory/skill tools are allowed.

---

enabled_toolsets=getattr(agent, "enabled_toolsets", None),
disabled_toolsets=getattr(agent, "disabled_toolsets", None),

RAW_BUFFERClick to expand / collapse

Bug Description

On a LOCAL endpoint (omlx / Ollama / llama.cpp via a custom provider) there is no prefix cache to preserve — the cache-parity rationale doesn't apply. But the fork still advertises the full schema. A weaker local model then imitates the snapshot conversation history (full of write_file / read_file / terminal / delegate_task calls), emits those tool calls, and burns review turns hitting the whitelist deny-wall:

Background review denied non-whitelisted tool: write_file. Only memory/skill tools are allowed.

Advertising the wide schema buys nothing here and actively misleads the model — pure wasted local compute. (Observed repeatedly with Qwen3-Coder via a local OpenAI-compatible endpoint: ~5 denials per review fork, gone after narrowing.) This compounds other local bg-review token waste reported in #25758 and #12340.

Root Cause

In _run_review_in_thread, the review AIAgent(...) is constructed with:

enabled_toolsets=getattr(agent, "enabled_toolsets", None),
disabled_toolsets=getattr(agent, "disabled_toolsets", None),

Expected Behavior

Proposed Fix

Gate the review fork's enabled_toolsets/disabled_toolsets on the existing is_local_endpoint(base_url):

local → enabled_toolsets=["memory","skills"], disabled_toolsets=None
otherwise → keep getattr(agent, "enabled_toolsets"/"disabled_toolsets", None) (current behavior, cache parity intact)

No config knob needed (cf. #36967, which adds a manual override for tool-count-capped providers — orthogonal; this auto-fires on local). I have a tested patch ready and will open a PR.

#29704 / #25434 / #17276 / #16569 — the cache-parity lineage this threads the needle through
#25758, #12340 — other facets of local bg-review burning tokens
#36967 — adjacent: config-overridable review toolsets for provider tool-count caps (manual, not local-gated)

Environment

Hermes: current main
OS: macOS (Apple Silicon)
Provider: custom / local (OpenAI-compatible, Qwen3-Coder)

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

hermes - 💡(How to fix) Fix [Bug]: background-review fork advertises the full tool schema to LOCAL endpoints, making weak local models thrash the deny-wall

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Fix Action

Fix / Workaround

Code Example

Bug Description

Root Cause

Expected Behavior

Proposed Fix

Related

Environment

Still need to ship something?

TRENDING