openclaw - 💡(How to fix) Fix [Bug]: `exec` tool schema exposes `security`/`elevated`/`ask` fields as model-controllable; model self-imposes denial [1 comments, 2 participants]

openclaw · 2026-05-01 20:47:17

GitHub stats

openclaw/openclaw#75811 • Fetched 2026-05-02 05:29:41

Author: camerono

Participants: camerono, clawsweeper[bot]

Timeline (top): cross-referenced ×2, commented ×1

The exec tool schema in OpenClaw 2026.4.9 exposes the security field (enum: deny / allowlist / default, possibly others) as part of the tool's argument surface visible to the model. Smaller models — in our testing, Hermes-3-Llama-3.1-8B served via vLLM — populate this field non-deterministically when generating an exec toolCall, sampling from the enum's values. The gateway honors the model-supplied value, returns a denial when the value is restrictive, and the model then "honestly" relays the resulting denial to the user. But the denial only happened because the model populated the field.

Other tool-schema fields with the same property (likely elevated, host, ask, possibly node) are presumably affected by the same pattern. We've only specifically confirmed security.

This is a schema-design issue: security/permission knobs should be operator-controlled, not model-controllable. It differs from existing exec-related issues (#72858, #69386, #56775, #58748, #74379), which discuss what to do when exec is denied by operator policy; this issue is about the model causing denials by populating fields it shouldn't have access to.

Error Message

Two denial responses were observed, each triggered by the security value the model itself supplied:

`T2.C2.1.g` (date), `security: "deny"`: {"status":"error","tool":"exec","error":"exec denied: host=gateway security=deny"}
`T2.C2.2.b` (chained fetch+date), `security: "allowlist"`: {"status":"error","tool":"exec","error":"exec denied: allowlist miss"}

Root Cause

The model isn't reasoning about security — it's filling in values from the schema enum. Other exec calls in the same runs succeeded with different (or no) security values, so the dispatch path itself is fine. It's the schema exposure that drives the intermittent self-denial.




Stack

OpenClaw 2026.4.9 (npm-global, gateway via systemd-user on 127.0.0.1:18789)
vLLM vllm/vllm-openai:latest on 127.0.0.1:8002, --enable-auto-tool-choice --tool-call-parser hermes --max-model-len 32768
Model: NousResearch/Hermes-3-Llama-3.1-8B, served-name hermes-3-llama-3.1-8b
Sampling: reproduces at default sampling (non-deterministic) and at temperature=0 + seed=42 (deterministic, picks security: "deny" consistently for some prompts)

Repro

A test harness (scripts/test/t2-integration.sh in our nemoclaw repo) invokes openclaw agent --json with prompts that should drive an exec dispatch:

Use the 'exec' tool to run 'date -u +%Y-%m-%d' and tell me the current year.

Three values for the security field have been observed in toolCall arguments across runs from the same model on the same stack:

| Test row | `security` value model supplied | Gateway response |
|---|---|---|
| `T2.C2.1.g` (date) | `"deny"` | `{"status":"error","tool":"exec","error":"exec denied: host=gateway security=deny"}` |
| `T2.C2.2.b` (chained fetch+date) | `"allowlist"` | `{"status":"error","tool":"exec","error":"exec denied: allowlist miss"}` |
| `T2.C8.4` (cat missing file) | `"default"` | (would have allowed; model didn't actually dispatch in that run) |
| `T2.C2.1.f` (uname -m) | (omitted? or `"default"`?) | succeeded |
| `T2.C2.1.h` (wc -l) | (omitted? or `"default"`?) | succeeded |

The full toolCall arguments object the model emits, taken from a session jsonl, shows that it populates all the schema fields, not just security:

{
  "id": "chatcmpl-tool-...",
  "name": "exec",
  "arguments": {
    "command": "date -u +%Y-%m-%d",
    "workdir": "",
    "env": {},
    "yieldMs": 10000,
    "background": false,
    "timeout": 0,
    "pty": false,
    "elevated": false,
    "host": "auto",
    "security": "deny",
    "ask": "off",
    "node": "auto"
  }
}

12 properties populated for a one-line command. The model is filling defaults from the schema rather than supplying only the args the prompt actually needs. security is the one with the highest-impact failure mode — "deny" rejects everything, "allowlist" rejects unlisted commands — but elevated, ask, and host could plausibly produce other footgun behaviors we haven't fully characterized yet.
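One way to spot this pattern in logged toolCalls is to list which operator-only arguments the model populated. The following is a minimal sketch, assuming the toolCall object has the shape shown above; the helper name `model_supplied_operator_fields` and the `OPERATOR_ONLY_FIELDS` tuple are hypothetical, taken from the split proposed in this issue rather than from OpenClaw's code.

```python
import json

# Fields this issue proposes treating as operator-only
# (an assumption for illustration, not an OpenClaw constant).
OPERATOR_ONLY_FIELDS = ("security", "elevated", "host", "ask", "node")


def model_supplied_operator_fields(tool_call: dict) -> dict:
    """Return the operator-only arguments present in an exec toolCall."""
    if tool_call.get("name") != "exec":
        return {}
    args = tool_call.get("arguments", {})
    return {name: args[name] for name in OPERATOR_ONLY_FIELDS if name in args}


# The toolCall from the session jsonl quoted above.
tool_call = json.loads("""
{
  "id": "chatcmpl-tool-...",
  "name": "exec",
  "arguments": {
    "command": "date -u +%Y-%m-%d",
    "workdir": "", "env": {}, "yieldMs": 10000, "background": false,
    "timeout": 0, "pty": false, "elevated": false, "host": "auto",
    "security": "deny", "ask": "off", "node": "auto"
  }
}
""")

supplied = model_supplied_operator_fields(tool_call)
print(supplied)
# {'security': 'deny', 'elevated': False, 'host': 'auto', 'ask': 'off', 'node': 'auto'}
```

Five of the 12 populated properties fall into the operator-only group, and only `command` was actually requested by the prompt.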

Why this is a real bug

A few angles, in roughly increasing severity:

1. Operator intent is silently overridden. A gateway operator who set up security: "default" (or whatever the local-policy value is) expects exec to use that policy. The model can override it at request time and the gateway doesn't notice.
2. The denial looks like a real denial. The exec denied: host=gateway security=deny error is structurally identical to what an operator-enforced denial would look like. Without inspecting the toolCall arguments, the model's reply ("the exec tool was denied") is indistinguishable from a legitimate refusal.
3. Subsequent fabrication is the pathological case. When exec is self-denied, the model often falls back to confabulating a result from training data instead of relaying the denial honestly. We have a separate row from the same test session where the model fabricated date=2023-05-01 (training-prior value) after an allowlist-mode self-denial of date -u +%Y-%m-%d: a confidently wrong answer that an operator would have to inspect the jsonl to detect. (That confabulation pattern is being tracked separately in #45049 and #49876; this issue is just about the schema exposure that enables the denial in the first place.)
4. Other tool schemas have the same shape. process, nodes, and cron likely expose similar operator-only fields to the model. A schema-design fix here would be reusable.
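Because the error text alone cannot separate the two cases, telling them apart requires comparing the toolCall arguments with the operator's configured value. A rough heuristic sketch follows; the function name and the `operator_security` parameter are hypothetical, and it assumes the gateway response has the JSON shape shown in the Repro table.

```python
def is_self_imposed_denial(tool_args: dict, gateway_response: dict,
                           operator_security: str) -> bool:
    """Heuristic: True when exec was denied and the model supplied a
    security value that differs from the operator's configured one."""
    denied = (
        gateway_response.get("status") == "error"
        and "exec denied" in gateway_response.get("error", "")
    )
    supplied = tool_args.get("security")
    return denied and supplied is not None and supplied != operator_security


# Values taken from test row T2.C2.1.g; "default" as the operator policy
# is an assumption for this example.
args = {"command": "date -u +%Y-%m-%d", "security": "deny"}
response = {
    "status": "error",
    "tool": "exec",
    "error": "exec denied: host=gateway security=deny",
}

print(is_self_imposed_denial(args, response, operator_security="default"))  # True
```

A denial where the model omitted `security`, or supplied the same value the operator configured, is not flagged, since the operator policy would have produced the same result.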

Suggested fix

Split the exec tool schema (and analogous tool schemas) into two surfaces:

Model-controllable args: command, workdir, env, timeout, pty, background, yieldMs. The fields the model legitimately needs to specify.
Operator-only args: security, elevated, host, ask, node. Set from session config or gateway policy at the dispatch boundary; not visible to the model in the OpenAI-format tool definition.
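The split could be applied when the tool definition is rendered for the model. Below is a minimal sketch, assuming an OpenAI-format tool definition with a JSON Schema under `function.parameters`; the `exec_tool` dict is an abbreviated, illustrative stand-in for the real schema, and `model_facing_tool` is a hypothetical helper name.

```python
import copy

# Operator-only field names as proposed in this issue (assumption).
OPERATOR_ONLY_FIELDS = frozenset({"security", "elevated", "host", "ask", "node"})


def model_facing_tool(tool_def: dict) -> dict:
    """Return a copy of the tool definition without operator-only parameters."""
    stripped = copy.deepcopy(tool_def)  # leave the gateway's own copy untouched
    params = stripped["function"]["parameters"]
    params["properties"] = {
        name: spec
        for name, spec in params.get("properties", {}).items()
        if name not in OPERATOR_ONLY_FIELDS
    }
    if "required" in params:
        params["required"] = [
            name for name in params["required"] if name not in OPERATOR_ONLY_FIELDS
        ]
    return stripped


# Abbreviated, illustrative definition; the real exec schema has more fields.
exec_tool = {
    "type": "function",
    "function": {
        "name": "exec",
        "parameters": {
            "type": "object",
            "properties": {
                "command": {"type": "string"},
                "workdir": {"type": "string"},
                "security": {"type": "string", "enum": ["deny", "allowlist", "default"]},
                "elevated": {"type": "boolean"},
            },
            "required": ["command"],
        },
    },
}

visible = model_facing_tool(exec_tool)
print(sorted(visible["function"]["parameters"]["properties"]))  # ['command', 'workdir']
```

Working on a deep copy keeps the full schema available to the gateway for its own validation while the model only sees the reduced surface.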

Either approach works:

(a) Drop the operator-only fields from the JSON Schema sent to the model. Anything the model supplies for those fields gets ignored at the gateway, or treated as a validation error so the bug is loud.
(b) Strip / normalize model-supplied values for those fields at the pre-dispatch boundary, replacing them with the operator-policy values. Log when the model supplied something the gateway is overriding.

(b) is the safer iteration step — doesn't break existing skill / config flows that might depend on the field names. (a) is the cleaner long-term answer.
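The two gateway-side behaviors described above can be sketched as one function with a mode switch: `reject` raises so the bug is loud, and `override` replaces model-supplied values with the operator's and logs the difference. This is a sketch under assumptions: `apply_operator_policy`, the `policy` dict, and its values are hypothetical and do not reflect OpenClaw's actual dispatch code.

```python
import logging

logger = logging.getLogger("exec_args")

# Operator-only field names as proposed in this issue (assumption).
OPERATOR_ONLY_FIELDS = ("security", "elevated", "host", "ask", "node")


def apply_operator_policy(model_args: dict, policy: dict, mode: str = "override") -> dict:
    """Return exec arguments with operator-only fields taken from policy."""
    supplied = {k: model_args[k] for k in OPERATOR_ONLY_FIELDS if k in model_args}
    if supplied and mode == "reject":
        # Variant of (a): treat model-supplied values as a validation error.
        raise ValueError(f"model supplied operator-only fields: {sorted(supplied)}")

    # Approach (b): keep only model-controllable args, then apply policy values.
    final = {k: v for k, v in model_args.items() if k not in OPERATOR_ONLY_FIELDS}
    for field, value in supplied.items():
        if policy.get(field) != value:
            logger.warning(
                "overriding model-supplied %s=%r with operator value %r",
                field, value, policy.get(field),
            )
    final.update({k: policy[k] for k in OPERATOR_ONLY_FIELDS if k in policy})
    return final


# Hypothetical operator policy for illustration.
policy = {"security": "default", "elevated": False, "host": "auto", "ask": "off", "node": "auto"}
model_args = {"command": "date -u +%Y-%m-%d", "security": "deny", "ask": "off"}

final = apply_operator_policy(model_args, policy)
print(final["security"])  # default
```

In `override` mode the warning fires only for `security`, because the model's `ask` value already matches the policy.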

This is essentially the same operator-vs-model boundary that the existing tool-policy infrastructure (pi-tools.policy.ts, tool-fs-policy.ts referenced in the #45049 clawsweeper review) already enforces for paths. We're asking for the same boundary applied to security-relevant scalar fields on exec.

Workaround we're using locally

In our extension-candidates ledger this is filed as a gateway-side exec argument normalization wrapper. Intent is to strip model-supplied values for the operator-only fields at our layer, log the override, and dispatch with operator-policy values. About half a day of work as a wrapping shim. We'd far rather contribute the upstream fix.

Reproducibility

Deterministic at agents.defaults.params.temperature=0 + seed=42. Non-deterministic at default sampling — model samples a different security value across re-runs of the same prompt. Can share the full session jsonls, harness output, and the schema-rendered tool definition the model receives if useful.
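To check whether a given sampling configuration is stable, the security values across re-runs of one prompt can be tallied. A small sketch follows, assuming each run yields a toolCall dict shaped like the one in the Repro section; the `calls` list is illustrative input built from the values named in the Repro table, not logged data.

```python
from collections import Counter


def tally_security_values(tool_calls: list) -> Counter:
    """Count the security value (or its absence) across exec toolCalls."""
    return Counter(
        call.get("arguments", {}).get("security", "<omitted>")
        for call in tool_calls
        if call.get("name") == "exec"
    )


# Illustrative input only: one call per case, not real session output.
calls = [
    {"name": "exec", "arguments": {"command": "date -u +%Y-%m-%d", "security": "deny"}},
    {"name": "exec", "arguments": {"command": "date -u +%Y-%m-%d", "security": "allowlist"}},
    {"name": "exec", "arguments": {"command": "date -u +%Y-%m-%d"}},
]

tally = tally_security_values(calls)
is_stable = len(tally) == 1  # a single distinct value means the runs agree
print(dict(tally), is_stable)
```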

Related upstream

Not duplicates: #72858, #69386, #56775, #58748, #74379 (operator-side exec denial issues — this is model-side).
Adjacent: #45049, #44179, #49876 (the post-denial fabrication that follows this bug).
Adjacent: NemoClaw#2731 (Hermes-3 tool-call template fragility — same model class, related symptom area).

extent analysis

TL;DR

The most likely fix is to split the exec tool schema into model-controllable and operator-only args, ignoring or overriding model-supplied values for security-relevant fields.

Guidance

Identify the exec tool schema and its fields, focusing on security-relevant fields like security, elevated, host, ask, and node.
Determine which fields should be model-controllable and which should be operator-only, based on the desired security and policy requirements.
Implement a solution to ignore or override model-supplied values for operator-only fields, either by dropping them from the JSON Schema or normalizing them at the pre-dispatch boundary.
Log when the model supplies values that are overridden by the gateway to detect potential issues.

Example

No specific code example is provided, as the implementation details depend on the specific OpenClaw and gateway configurations.

Notes

This fix assumes that the exec tool schema is the primary cause of the issue, and that splitting the schema into model-controllable and operator-only args will resolve the problem. However, other tool schemas may have similar issues, and a more comprehensive solution may be necessary.

Recommendation

Apply workaround (b) by stripping or normalizing model-supplied values for operator-only fields at the pre-dispatch boundary, replacing them with operator-policy values, as this is a safer iteration step that doesn't break existing skill or config flows.
