openclaw - 💡(How to fix) Fix [Bug]: `exec` tool schema exposes `security`/`elevated`/`ask` fields as model-controllable; model self-imposes denial [1 comments, 2 participants]

GitHub stats: openclaw/openclaw#75811 · Fetched 2026-05-02 05:29:41 · Comments: 1 · Participants: 2 · Timeline events: 3 (cross-referenced ×2, commented ×1) · Reactions: 2

Summary

The exec tool schema in OpenClaw 2026.4.9 exposes the security field (enum: deny / allowlist / default, possibly others) as part of the tool's argument surface visible to the model. Smaller models — in our testing, Hermes-3-Llama-3.1-8B served via vLLM — populate this field non-deterministically when generating an exec toolCall, sampling from the enum's values. The gateway honors the model-supplied value, returns a denial when the value is restrictive, and the model then "honestly" relays the resulting denial to the user. But the denial only happened because the model populated the field.

Other tool-schema fields with the same property (likely elevated, host, ask, possibly node) are presumably affected by the same pattern. We've only specifically confirmed security.

This is a schema-design issue: security/permission knobs should be operator-controlled, not model-controllable. Different from existing exec-related issues (#72858, #69386, #56775, #58748, #74379) which discuss what to do when exec is denied by operator policy — this issue is about the model causing denials by populating fields it shouldn't have access to.

Stack

  • OpenClaw 2026.4.9 (npm-global, gateway via systemd-user on 127.0.0.1:18789)
  • vLLM vllm/vllm-openai:latest on 127.0.0.1:8002, --enable-auto-tool-choice --tool-call-parser hermes --max-model-len 32768
  • Model: NousResearch/Hermes-3-Llama-3.1-8B, served-name hermes-3-llama-3.1-8b
  • Sampling: reproduces at default sampling (non-deterministic) and at temperature=0 + seed=42 (deterministic, picks security: "deny" consistently for some prompts)

Repro

A test harness (scripts/test/t2-integration.sh in our nemoclaw repo) invokes openclaw agent --json with prompts that should drive an exec dispatch:

Use the 'exec' tool to run 'date -u +%Y-%m-%d' and tell me the current year.

Three values for the security field have been observed in toolCall arguments across runs from the same model on the same stack:

| Test row | security value model supplied | Gateway response |
| --- | --- | --- |
| T2.C2.1.g (date) | "deny" | {"status":"error","tool":"exec","error":"exec denied: host=gateway security=deny"} |
| T2.C2.2.b (chained fetch+date) | "allowlist" | {"status":"error","tool":"exec","error":"exec denied: allowlist miss"} |
| T2.C8.4 (cat missing file) | "default" | (would have allowed; model didn't actually dispatch in that run) |
| T2.C2.1.f (uname -m) | (omitted? or "default"?) | succeeded |
| T2.C2.1.h (wc -l) | (omitted? or "default"?) | succeeded |

The model isn't reasoning about security — it's filling in values from the schema enum. Other exec calls in the same runs succeeded with different (or no) security values, so the dispatch path itself is fine. It's the schema exposure that drives the intermittent self-denial.

The full toolCall arguments object the model emits, taken from a session jsonl, shows that it populates all the schema fields, not just security:

{
  "id": "chatcmpl-tool-...",
  "name": "exec",
  "arguments": {
    "command": "date -u +%Y-%m-%d",
    "workdir": "",
    "env": {},
    "yieldMs": 10000,
    "background": false,
    "timeout": 0,
    "pty": false,
    "elevated": false,
    "host": "auto",
    "security": "deny",
    "ask": "off",
    "node": "auto"
  }
}

12 properties populated for a one-line command. The model is filling defaults from the schema rather than supplying only the args the prompt actually needs. security is the one with the highest-impact failure mode — "deny" rejects everything, "allowlist" rejects unlisted commands — but elevated, ask, and host could plausibly produce other footgun behaviors we haven't fully characterized yet.

Why this is a real bug

A few angles, in roughly increasing severity:

  1. Operator intent is silently overridden. A gateway operator who set up security: "default" (or whatever the local-policy value is) expects exec to use that policy. The model can override it at request time and the gateway doesn't notice.

  2. The denial looks like a real denial. The exec denied: host=gateway security=deny error is structurally identical to what an operator-enforced denial would look like. Without inspecting the toolCall arguments, the model's reply ("the exec tool was denied") is indistinguishable from a legitimate refusal.

  3. Subsequent fabrication is the pathological case. When exec is self-denied, the model often falls back to confabulating a result from training data instead of relaying the denial honestly. We have a separate row from the same test session where the model fabricated date=2023-05-01 (training-prior value) after an allowlist-mode self-denial of date -u +%Y-%m-%d — a confidently-wrong answer that an operator would have to inspect the jsonl to detect. (That confabulation pattern is being tracked separately on #45049 and #49876; this issue is just about the schema exposure that enables the denial in the first place.)

  4. Other tool schemas have the same shape. process, nodes, and cron likely expose similar operator-only fields to the model. A schema-design fix here would be reusable.
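
Point 2 above says a self-imposed denial can only be told apart from an operator-enforced one by inspecting the toolCall arguments. A minimal sketch of that check follows. It assumes a tool call has the `id` / `name` / `arguments` shape shown in the session jsonl excerpt; the type and function names are hypothetical and are not part of OpenClaw.

```typescript
// Operator-only field names are taken from the issue's proposed split.
// ToolCall, modelSuppliedOperatorFields and denialMayBeSelfImposed are
// hypothetical names used for illustration only.
const OPERATOR_ONLY_FIELDS = ["security", "elevated", "host", "ask", "node"] as const;

interface ToolCall {
  id: string;
  name: string;
  arguments: Record<string, unknown>;
}

// Collect every operator-only field that the model populated on an exec call.
function modelSuppliedOperatorFields(call: ToolCall): Record<string, unknown> {
  const found: Record<string, unknown> = {};
  if (call.name !== "exec") return found;
  for (const field of OPERATOR_ONLY_FIELDS) {
    if (Object.prototype.hasOwnProperty.call(call.arguments, field)) {
      found[field] = call.arguments[field];
    }
  }
  return found;
}

// For a call the gateway denied: true when the model itself supplied a
// restrictive security value, so the denial may not reflect operator policy.
function denialMayBeSelfImposed(call: ToolCall): boolean {
  const security = modelSuppliedOperatorFields(call)["security"];
  return security === "deny" || security === "allowlist";
}

// The toolCall from the repro section.
const observed: ToolCall = {
  id: "chatcmpl-tool-...",
  name: "exec",
  arguments: {
    command: "date -u +%Y-%m-%d",
    workdir: "",
    env: {},
    yieldMs: 10000,
    background: false,
    timeout: 0,
    pty: false,
    elevated: false,
    host: "auto",
    security: "deny",
    ask: "off",
    node: "auto",
  },
};

console.log(modelSuppliedOperatorFields(observed));
console.log(denialMayBeSelfImposed(observed));
```

Running a check like this over each denied exec call in a session log would separate denials the model caused from denials that came from gateway policy, without changing gateway behavior.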

Suggested fix

Split the exec tool schema (and analogous tool schemas) into two surfaces:

  • Model-controllable args: command, workdir, env, timeout, pty, background, yieldMs. The fields the model legitimately needs to specify.
  • Operator-only args: security, elevated, host, ask, node. Set from session config or gateway policy at the dispatch boundary; not visible to the model in the OpenAI-format tool definition.
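
As a rough illustration of the split, the sketch below derives a model-facing schema by filtering operator-only properties out of a JSON Schema object. The real exec schema is not reproduced in this issue, so `execSchema` is an abbreviated, hypothetical stand-in, and `toModelFacingSchema` is an illustrative name rather than an existing OpenClaw function.

```typescript
// Minimal JSON Schema object shape, sufficient for this sketch.
interface ObjectSchema {
  type: "object";
  properties: Record<string, unknown>;
  required?: string[];
}

// Operator-only field names from the proposed split.
const OPERATOR_ONLY = new Set(["security", "elevated", "host", "ask", "node"]);

// Return a copy of the schema with operator-only properties removed,
// including any mention of them in `required`.
function toModelFacingSchema(schema: ObjectSchema): ObjectSchema {
  const properties: Record<string, unknown> = {};
  for (const [name, definition] of Object.entries(schema.properties)) {
    if (!OPERATOR_ONLY.has(name)) properties[name] = definition;
  }
  const result: ObjectSchema = { type: "object", properties };
  if (schema.required) {
    result.required = schema.required.filter((name) => !OPERATOR_ONLY.has(name));
  }
  return result;
}

// Hypothetical, abbreviated stand-in for the real exec schema.
const execSchema: ObjectSchema = {
  type: "object",
  properties: {
    command: { type: "string" },
    workdir: { type: "string" },
    timeout: { type: "number" },
    security: { type: "string", enum: ["deny", "allowlist", "default"] },
    elevated: { type: "boolean" },
  },
  required: ["command"],
};

console.log(Object.keys(toModelFacingSchema(execSchema).properties));
// prints [ 'command', 'workdir', 'timeout' ]
```

Filtering at the point where the tool definition is rendered keeps a single source schema for the gateway while the model only ever sees the fields it is meant to set.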

Either approach works:

  • (a) Drop the operator-only fields from the JSON Schema sent to the model. Anything the model supplies for those fields gets ignored at the gateway, or treated as a validation error so the bug is loud.
  • (b) Strip / normalize model-supplied values for those fields at the pre-dispatch boundary, replacing them with the operator-policy values. Log when the model supplied something the gateway is overriding.
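
Option (b) could look roughly like the sketch below: operator-policy values always win at the pre-dispatch boundary, and each model-supplied value that differs is recorded so it can be logged. `OperatorPolicy`, `normalizeExecArgs`, and the values in `examplePolicy` are assumptions made for illustration; where the gateway actually stores its policy is not stated in this issue.

```typescript
type ExecArgs = Record<string, unknown>;

// Hypothetical policy shape: one entry per operator-only field.
type OperatorPolicy = {
  security: string;
  elevated: boolean;
  host: string;
  ask: string;
  node: string;
};

interface Override {
  field: string;
  modelValue: unknown;
  policyValue: unknown;
}

// Replace every operator-only field with the policy value and report
// the cases where the model supplied something different.
function normalizeExecArgs(
  modelArgs: ExecArgs,
  policy: OperatorPolicy,
): { args: ExecArgs; overrides: Override[] } {
  const args: ExecArgs = { ...modelArgs };
  const overrides: Override[] = [];
  for (const [field, policyValue] of Object.entries(policy)) {
    const supplied = Object.prototype.hasOwnProperty.call(modelArgs, field);
    if (supplied && modelArgs[field] !== policyValue) {
      overrides.push({ field, modelValue: modelArgs[field], policyValue });
    }
    args[field] = policyValue;
  }
  return { args, overrides };
}

// Assumed policy values, chosen only to make the example concrete.
const examplePolicy: OperatorPolicy = {
  security: "default",
  elevated: false,
  host: "auto",
  ask: "off",
  node: "auto",
};

const normalized = normalizeExecArgs(
  { command: "date -u +%Y-%m-%d", security: "deny", ask: "off" },
  examplePolicy,
);

for (const o of normalized.overrides) {
  console.warn(`exec: overriding model-supplied ${o.field}=${String(o.modelValue)} with ${String(o.policyValue)}`);
}
console.log(normalized.args);
```

Because the field names stay in the schema, this variant leaves existing skill and config flows untouched, and the warning gives operators a signal that the model tried to set a value it does not control.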

(b) is the safer first step, because it doesn't break existing skill / config flows that might depend on the field names. (a) is the cleaner long-term answer.

This is essentially the same operator-vs-model boundary that the existing tool-policy infrastructure (pi-tools.policy.ts, tool-fs-policy.ts referenced in the #45049 clawsweeper review) already enforces for paths. We're asking for the same boundary applied to security-relevant scalar fields on exec.

Workaround we're using locally

In our extension-candidates ledger this is filed as a gateway-side exec argument normalization wrapper. The intent is to strip model-supplied values for the operator-only fields at our layer, log the override, and dispatch with operator-policy values. We estimate about half a day of work for a wrapping shim. We'd far rather contribute the upstream fix.

Reproducibility

Deterministic at agents.defaults.params.temperature=0 + seed=42. Non-deterministic at default sampling: the model samples a different security value across re-runs of the same prompt. We can share the full session jsonls, harness output, and the schema-rendered tool definition the model receives if useful.

Related upstream

  • Not duplicates: #72858, #69386, #56775, #58748, #74379 (operator-side exec denial issues — this is model-side).
  • Adjacent: #45049, #44179, #49876 (the post-denial fabrication that follows this bug).
  • Adjacent: NemoClaw#2731 (Hermes-3 tool-call template fragility — same model class, related symptom area).

extent analysis

TL;DR

The most likely fix is to split the exec tool schema into model-controllable and operator-only args, ignoring or overriding model-supplied values for security-relevant fields.

Guidance

  • Identify the exec tool schema and its fields, focusing on security-relevant fields like security, elevated, host, ask, and node.
  • Determine which fields should be model-controllable and which should be operator-only, based on the desired security and policy requirements.
  • Implement a solution to ignore or override model-supplied values for operator-only fields, either by dropping them from the JSON Schema or normalizing them at the pre-dispatch boundary.
  • Log when the model supplies values that are overridden by the gateway to detect potential issues.

Example

No upstream code example is provided; the implementation details depend on the specific OpenClaw and gateway configuration.

Notes

This fix assumes that the exec tool schema is the primary cause of the issue, and that splitting the schema into model-controllable and operator-only args will resolve the problem. However, other tool schemas may have similar issues, and a more comprehensive solution may be necessary.

Recommendation

Apply workaround (b) by stripping or normalizing model-supplied values for operator-only fields at the pre-dispatch boundary, replacing them with operator-policy values, as this is a safer iteration step that doesn't break existing skill or config flows.
