openclaw - ✅(Solved) Fix [Feature]: Channel-mediated approval for MCP tool calls (consent envelope) [1 pull requests, 8 comments, 4 participants]

GitHub stats
openclaw/openclaw#78308 · Fetched 2026-05-07 03:38:27
Comments: 8 · Participants: 4 · Timeline: 12 · Reactions: 2
Timeline (top): commented ×8, cross-referenced ×2, mentioned ×1, subscribed ×1

Let MCP servers opt into the same /approve <id> channel-mediated approval pipeline that already gates shell-exec calls, by returning a small standard envelope from tools/call.

Error Message

  • Severity: medium → high. A model error on vault.reveal is a leaked secret. A model error on email.send_direct is a sent email that can't be unsent.

Fix Action

Fixed

PR fix notes

PR #78303: feat(mcp): channel-mediated approval for MCP tool calls (consent envelope)

Description (problem / solution / changelog)

Closes #78308

Summary

Today, when a bundle-MCP tool returns to the agent, OpenClaw passes the result straight through to the model — there's no approval gate analogous to the one exec-approvals provides for shell commands. For act-tier MCP tools (send email, create vault entry, set HA state, etc.), the only thing standing between the user and the action is the model deciding to call it.

This PR adds a small, additive contract — the MCP consent envelope — that lets MCP servers opt into the existing /approve <id> allow-once|allow-always|deny pipeline that already backs shell-exec approvals. Servers that don't opt in are unaffected.

The contract

When a server wants user approval for a call, it returns:

{
  "ok": false,
  "requires_confirmation": true,
  "action_id": "<server-side single-use token>",
  "summary": "<one-line description of the action>",
  "expires_in_seconds": 60
}

…in either structuredContent or as JSON inside content[0].text. OpenClaw recognises the envelope, suppresses the result, and:

  1. Issues a plugin.approval.request to the local gateway (reuses everything: channel-approval-auth, ID-prefix routing, the magic-word reply parser).
  2. Blocks until the user replies on the trusted channel.
  3. On allow-once / allow-always: re-calls the tool with confirmation_token = action_id set on the input. Returns that second result to the agent.
  4. On deny / expiry / error: returns a synthetic {ok:false, approved:false, reason} result. The model never sees action_id.
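
The four steps above can be sketched as a small wrapper around a tool call. This is a minimal illustration, not the PR's actual callMcpToolWithConsent: the type names, the requestApproval callback signature, and the reason strings are all assumptions made for the example, and expiry is modelled simply as the approval callback rejecting.

```typescript
// Hypothetical types: the real OpenClaw signatures are not shown in the PR description.
type Decision = "allow-once" | "allow-always" | "deny";
type ToolInput = Record<string, unknown>;
type ToolResult = Record<string, unknown>;
type CallTool = (input: ToolInput) => Promise<ToolResult>;
type RequestApproval = (req: { summary: string; timeoutSeconds: number }) => Promise<Decision>;

interface ConsentEnvelope {
  ok: false;
  requires_confirmation: true;
  action_id: string;
  summary: string;
  expires_in_seconds: number;
}

// True when the result carries every field of the consent envelope.
function isConsentEnvelope(r: ToolResult): r is ToolResult & ConsentEnvelope {
  return (
    r["ok"] === false &&
    r["requires_confirmation"] === true &&
    typeof r["action_id"] === "string" &&
    typeof r["summary"] === "string" &&
    typeof r["expires_in_seconds"] === "number"
  );
}

async function callWithConsent(
  callTool: CallTool,
  requestApproval: RequestApproval,
  input: ToolInput,
): Promise<ToolResult> {
  // First call: never forward a model-supplied confirmation_token.
  const cleanInput: ToolInput = { ...input };
  delete cleanInput["confirmation_token"];

  const first = await callTool(cleanInput);
  if (!isConsentEnvelope(first)) return first; // server did not opt in

  let decision: Decision;
  try {
    // Blocks until the user replies on the trusted channel (or the request fails).
    decision = await requestApproval({
      summary: first.summary,
      timeoutSeconds: first.expires_in_seconds,
    });
  } catch {
    // Assumed reason string; the envelope (and its action_id) is dropped here.
    return { ok: false, approved: false, reason: "approval-error" };
  }
  if (decision === "deny") {
    return { ok: false, approved: false, reason: "denied" };
  }
  // allow-once / allow-always: redeem the server-issued token on a second call.
  return callTool({ ...cleanInput, confirmation_token: first.action_id });
}
```

In this sketch the only place action_id is read is the second callTool invocation, so nothing returned to the agent on the deny or error paths contains it.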

Why the trust boundary actually moves

The model never sees action_id and confirmation_token is stripped from any model-supplied input on the first call. Even if a malicious or careless agent fabricates a token, the MCP server's redemption check rejects it (the server is responsible for one-shot/TTL enforcement of action_id).
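
The two guards described here, stripping a model-supplied confirmation_token and keeping action_id out of anything the model sees, might look roughly like the following. The helper names and the set of reason strings are hypothetical; the PR only says that a scrubber and a denied-result builder exist.

```typescript
type ToolInput = Record<string, unknown>;

// Hypothetical helper: return a copy of the input without confirmation_token,
// so a token fabricated by the model never reaches the MCP server.
function scrubConfirmationToken(input: ToolInput): ToolInput {
  const clean: ToolInput = { ...input };
  delete clean["confirmation_token"];
  return clean;
}

// Assumed reason values; the PR text only names "deny / expiry / error".
type DenyReason = "denied" | "expired" | "approval-error";

interface DeniedResult {
  ok: false;
  approved: false;
  reason: DenyReason;
}

// Hypothetical helper: the model-visible result is built from a fixed set of
// fields rather than copied from the envelope, so action_id cannot leak through.
function buildDeniedResult(reason: DenyReason): DeniedResult {
  return { ok: false, approved: false, reason };
}
```

Building the denied result from scratch, instead of filtering the envelope, means a new envelope field added later cannot reach the model by accident.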

This is the same pattern OpenClaw already enforces for shell exec via exec-approvals: agent asks, user authorises, runtime executes. The PR just generalises that pattern to MCP tools that opt into the contract.

What changed

File: Change
  • src/agents/pi-bundle-mcp-consent.ts: New. Envelope detection, the default approval requester (calls plugin.approval.request + waitDecision), confirmation_token scrubber, denied-result builder. ~240 lines.
  • src/agents/pi-bundle-mcp-materialize.ts: Wraps every tool's execute() in a new callMcpToolWithConsent orchestrator. Plumbs requestApproval, consentEnabled, agentId, and sessionKey through materializeBundleMcpToolsForRun and createBundleMcpToolRuntime.
  • src/agents/pi-embedded-runner/run/attempt.ts, compact.ts: Pass the agent harness's existing agentId + sessionKey into the materialize layer so the gateway forwarder can resolve the user's actual delivery channel (WhatsApp / Telegram / Slack / web UI).
  • src/auto-reply/reply/commands-approve.ts: Accepts bare /approve allow-once (no id) when there's exactly one pending approval, since typing a uuid on a phone is unrealistic UX. On ambiguity (>1 pending), refuses with a hint including the explicit form.
  • src/infra/plugin-approvals.ts: Cleaner reply prompt: shows all three decisions (allow-once | allow-always | deny) inline, copy-paste-ready, with the id on its own line for the explicit fallback.
  • src/config/types.mcp.ts: Adds mcp.approvals.enabled?: boolean (default true).
  • src/agents/pi-bundle-mcp-consent.test.ts: New. 21 unit tests including a live-caught decision: null regression and an agentId/sessionKey propagation regression.
  • docs/tools/mcp-consent-envelope.md: New. Server-author contract, configuration, trust-boundary explanation.
  • CHANGELOG.md: Entry under Unreleased.

No new approval kind is introduced — the implementation reuses the existing "plugin" ApprovalKind.

Test plan

  • pnpm install
  • pnpm lint — 8234 files, 213 rules, 0 warnings, 0 errors.
  • ✅ Full agents-core suite: 272 files passed (272), 3515 tests passed | 4 skipped.
  • pi-bundle-mcp-consent.test.ts — 21/21 green.
  • commands-approve.test.ts — 18/18 green (existing tests still pass after the implicit-id parser change).

Real behavior proof

Behavior addressed: Without channel-mediated approval, the model is the trust gate for MCP tool calls that want user approval. Captured live on vanilla OpenClaw 2026.4.24 with HomeBrain's MCP servers: when the agent fired homebrain-nextcloud__nc-files_share, the consent envelope (with action_id) passed straight to the model, which leaked the token verbatim into chat, offered to self-confirm, and hallucinated a second action_id off-by-one when it lost track of the first. Full screenshot + analysis: #78308 comment. This PR replaces that prompt-level honour-system with runtime-enforced channel approval.

Environment tested: [email protected]. x86_64 Ubuntu 24.04. AMD Radeon RX 9060 XT (16 GB VRAM). llama.cpp + Qwen3.6-35B-A3B (Q5_K_XL). HomeBrain reference MCP servers from oalterg/HomeBrain. OpenClaw built directly from this PR branch and installed live in place of the npm-shipped 2026.4.24 binary for the runtime-side tests.

Steps run after the patch:

# Local: build + pack the PR branch
pnpm build && npm pack
# Deploy the .tgz to the target box
scp openclaw-2026.5.6.tgz <user>@<host>:/tmp/
ssh <user>@<host> 'echo admin | sudo -S bash -c "
  cd /usr/lib/node_modules
  rm -rf openclaw && mkdir openclaw && cd openclaw
  tar xzf /tmp/openclaw-2026.5.6.tgz --strip-components=1
  npm install --omit=dev --no-audit --no-fund --silent"'
# Set the missing approvals.plugin config (the forwarder needs it to deliver)
ssh <user>@<host> 'sudo -u homebrain jq ".approvals = {plugin:{enabled:true,mode:\"session\"}}" \
  ~/.openclaw/openclaw.json > /tmp/c && sudo -u homebrain cp /tmp/c ~/.openclaw/openclaw.json'
ssh <user>@<host> 'sudo -u homebrain systemctl --user restart openclaw-gateway'

# User sends on WhatsApp:
#   "Create a public share link to /uploads in Nextcloud, expires in 7 days."
# OpenClaw routes through the agent harness, which fires nc.files_share.
# My code intercepts the consent envelope, calls plugin.approval.request.
# The forwarder delivers the prompt to the user's WhatsApp + web UI.
# User taps Allow once on the web UI (or replies /approve allow-once on WhatsApp).
# Runtime re-calls the tool with confirmation_token. Real share URL returns.

Evidence after fix:

The web UI's native plugin-approval modal renders my PR's plugin.approval.request exactly right — title MCP tool approval, description homebrain-nextcloud.nc.files_share — Nextcloud: create public share link for /uploads (expires in 7 days), severity warning, plugin mcp:homebrain-nextcloud, three buttons Allow once / Always allow / Deny, expiry countdown. The user pressed Allow once. Screenshot captured (not committed; happy to attach for maintainer review). Same approval was offered on the user's WhatsApp simultaneously via the routing-fix commit.

$ openclaw --version  # the deployed PR build
OpenClaw 2026.5.6 (a2cd999)

# Gateway log (verbatim) at the moment the user tapped Allow once:
2026-05-06T20:01:22 ⇄ res ✓ plugin.approval.waitDecision 59412ms
                    conn=6f5914ca…dfe2 id=0c6f887a…7e6d

# /var/log/homebrain/mcp-nextcloud-audit.log — the result of the consent loop:
2026-05-06T20:01:22 {"server":"nextcloud","action":"share",
                     "path":"/uploads","expire_days":7}

# Nextcloud confirms the actual share resource exists on disk:
$ curl -u admin:<app-pass> -H 'OCS-APIRequest: true' \
  http://127.0.0.1:8080/ocs/v2.php/apps/files_sharing/api/v1/shares
{"id":"2","share_type":3,"path":"/uploads","stime":1778097682,
 "url":"http://127.0.0.1:8080/s/iiaRr54CKLnZ7sC", ...}

Observed result after the fix:

  • Vanilla baseline (without PR): the model leaks action_id to chat, offers to self-confirm, hallucinates a second action_id off-by-one. Trust boundary is the model's behavior — soft gate. Screenshot + analysis in #78308 comment.
  • PR build: the agent fires the tool, the consent envelope is intercepted, plugin.approval.request is issued to the gateway with the correct agentId + sessionKey so the forwarder can route. Approval prompt arrives in the web UI's native modal (gorgeous render, all three decisions one tap) AND on the user's WhatsApp simultaneously. User taps Allow once (or replies /approve allow-once). Runtime re-calls the tool with confirmation_token = action_id. Real Nextcloud share is created — verified by the audit log entry, the gateway waitDecision 59412ms ✓ line, and the actual share row in NC's files_sharing API.
  • A live regression bug was caught and fixed during this very run: decision: null from the gateway in two-phase mode means "pending, call waitDecision", not "deny". Fixed in ef30bbf with a regression test. Without the live test cycle, that would have shipped as a latent bug masking real user approvals as denials.
  • Forwarder routing was also live-caught: the PR initially didn't propagate agentId/sessionKey from the agent harness to plugin.approval.request, so the forwarder had no session binding and prompts auto-cancelled silently. Fixed in 2056d25 by plumbing both fields through materializeBundleMcpToolsForRun + createBundleMcpToolRuntime from attempt.ts and compact.ts.
  • /approve UX was live-caught too: typing a full uuid by hand on a phone is unrealistic, and the prompt template said Reply with: /approve <id> with <id> as a literal placeholder. Fixed in a2cd999: bare /approve allow-once resolves to the single most-recent pending approval; the prompt now shows all three decisions inline copy-paste-ready.
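
The bare /approve behaviour from the last bullet can be illustrated with a small parser. This is a sketch under assumptions: the function name, the result shape, and the hint wording are invented for the example, and it follows the "exactly one pending approval" rule stated in the file list rather than the actual code in commands-approve.ts.

```typescript
type Decision = "allow-once" | "allow-always" | "deny";

// Hypothetical result shape for the example.
type ParsedApprove =
  | { kind: "resolved"; id: string; decision: Decision }
  | { kind: "ambiguous"; hint: string }
  | { kind: "invalid" };

const DECISIONS: readonly string[] = ["allow-once", "allow-always", "deny"];

function isDecision(s: string | undefined): s is Decision {
  return s !== undefined && DECISIONS.indexOf(s) !== -1;
}

function parseApprove(text: string, pendingIds: readonly string[]): ParsedApprove {
  const parts = text.trim().split(/\s+/);
  if (parts[0] !== "/approve") return { kind: "invalid" };
  const [, a, b] = parts;

  // Explicit form: /approve <id> <decision>
  if (parts.length === 3 && a !== undefined && isDecision(b)) {
    return { kind: "resolved", id: a, decision: b };
  }

  // Bare form: /approve <decision>, only valid when the target is unambiguous.
  if (parts.length === 2 && isDecision(a)) {
    const only = pendingIds[0];
    if (pendingIds.length === 1 && only !== undefined) {
      return { kind: "resolved", id: only, decision: a };
    }
    if (pendingIds.length > 1) {
      return {
        kind: "ambiguous",
        hint: `Multiple approvals pending; reply with /approve <id> ${a}`,
      };
    }
  }
  return { kind: "invalid" };
}
```

Refusing on ambiguity, rather than guessing, keeps a short reply from approving a different action than the one the user was looking at.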

What was not tested: the deny path through user-driven rejection. We exercised the deny path via timeout and via approval-system error in unit tests, but not via a user explicitly typing /approve deny end-to-end on the live deployment. The unit tests (on consent envelope + deny, returns synthetic denied result and does NOT re-call) cover this deterministically. A maintainer running the PR through any deployment can verify it in a single message exchange.

Reference / motivation

Part of a broader integration on a HomeBrain box. Reference server-side: scripts/mcp_common.py (the Consent helper) and scripts/mcp-*.py. Design write-up: INTEGRATIONS_PLAN.md.

Open questions for review

  1. Should I add an mcp.approvals.allowlist: [{server, tool}]? Left out for v1 — gating is already opt-in per-tool by the MCP server (only act-tier tools return the envelope), and allow-always already short-circuits future calls. Happy to add an allowlist as a follow-up if maintainers prefer.
  2. The decision callback returns the existing ExecApprovalDecision union (allow-once | allow-always | deny) — keeps the reply-parser + UI rendering identical.
  3. Approval kind: I considered adding "mcp-tool" as a third ChannelApprovalKind. Decided against it (~10 files for a presentation distinction) — the existing plugin renderer already shows pluginId: mcp:<server>, toolName: <tool>, description: <summary>. Open to redoing as a separate kind if maintainers prefer.
  4. Should approvals.plugin.enabled = true, mode = "session" be the default? Currently the gateway's plugin-approval forwarder is fully gated on this config block being explicitly set. Without it set, requests are accepted but never delivered — the boundary becomes a permanent deny gate (caught live during this PR's testing). Sensible default seems to be on-with-session-routing, especially now that the contract is broadly useful for MCP tool calls. Happy to add that as a separate PR if maintainers agree.

🤖 Generated with Claude Code

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • docs/tools/mcp-consent-envelope.md (added, +131/-0)
  • src/agents/pi-bundle-mcp-consent.test.ts (added, +442/-0)
  • src/agents/pi-bundle-mcp-consent.ts (added, +243/-0)
  • src/agents/pi-bundle-mcp-materialize.ts (modified, +156/-2)
  • src/agents/pi-embedded-runner/compact.ts (modified, +5/-0)
  • src/agents/pi-embedded-runner/run/attempt.ts (modified, +8/-0)
  • src/auto-reply/reply/commands-approve.ts (modified, +67/-1)
  • src/config/types.mcp.ts (modified, +19/-0)
  • src/infra/plugin-approvals.ts (modified, +5/-1)

Code Example

{
  "ok": false,
  "requires_confirmation": true,
  "action_id": "<server-side single-use token>",
  "summary": "<one-line user-readable description>",
  "expires_in_seconds": 60
}

Summary

Let MCP servers opt into the same /approve <id> channel-mediated approval pipeline that already gates shell-exec calls, by returning a small standard envelope from tools/call.

Problem to solve

When an MCP tool call mutates state outside OpenClaw (send email, write a vault entry, set Home Assistant state, post to a chat), the only thing standing between the user and the action today is the model deciding to call it. Tool descriptions can ask the model to confirm with the user first, but that's prompt-level discipline — there is no infrastructure-level enforcement.

By contrast, OpenClaw already has a hardened gate for shell exec: exec-approvals issues an approval, the user replies /approve <id> allow-once|allow-always|deny on a verified channel (channel-approval-auth), and only then does the runtime execute. The bash-tools.exec.ts code is explicit (lines 1159–1160): "exec cannot run /approve commands. Show the /approve command to the user as chat text, or route it through the approval command handler instead of shell execution." This same trust property is missing for MCP tools.

Vanilla OpenClaw 2026.4.24 + Qwen3.6 35B does (in practice) recognise a requires_confirmation: true tool result from its description and surface it to the user — verified live on a HomeBrain box. But that's the model honouring a contract, not the runtime enforcing one. A misaligned or compromised agent can paper over the prompt-level discipline by simply re-calling the tool with a fabricated confirmation_token.

Proposed solution

A small additive contract — the MCP consent envelope — that an MCP server returns when it wants user approval before the runtime considers the result terminal:

{
  "ok": false,
  "requires_confirmation": true,
  "action_id": "<server-side single-use token>",
  "summary": "<one-line user-readable description>",
  "expires_in_seconds": 60
}

OpenClaw recognises the envelope, suppresses the result, issues a plugin.approval.request (reusing the existing "plugin" ChannelApprovalKind — no new infra), blocks until the user replies /approve <id> ..., then re-calls the tool with confirmation_token = action_id set on the input. The model never sees action_id, and confirmation_token is stripped from any model-supplied input on the first call so the agent cannot self-approve. Servers that don't return the envelope are unchanged.
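
Recognising the envelope could be sketched as below. The PR description says the envelope may arrive either in structuredContent or as JSON inside content[0].text, so the example checks both. The function names are hypothetical, and treating all five fields as required is an assumption; the real detector may be more lenient.

```typescript
interface ConsentEnvelope {
  ok: false;
  requires_confirmation: true;
  action_id: string;
  summary: string;
  expires_in_seconds: number;
}

// Minimal view of an MCP tools/call result, limited to what the example needs.
interface McpCallResult {
  structuredContent?: unknown;
  content?: Array<{ type: string; text?: string }>;
}

// Hypothetical helper: validate an unknown value against the envelope shape.
function asEnvelope(value: unknown): ConsentEnvelope | undefined {
  if (typeof value !== "object" || value === null) return undefined;
  const v = value as Record<string, unknown>;
  const actionId = v["action_id"];
  const summary = v["summary"];
  const ttl = v["expires_in_seconds"];
  if (
    v["ok"] === false &&
    v["requires_confirmation"] === true &&
    typeof actionId === "string" &&
    typeof summary === "string" &&
    typeof ttl === "number"
  ) {
    return {
      ok: false,
      requires_confirmation: true,
      action_id: actionId,
      summary,
      expires_in_seconds: ttl,
    };
  }
  return undefined;
}

// Hypothetical helper: look in structuredContent first, then in content[0].text.
function detectConsentEnvelope(result: McpCallResult): ConsentEnvelope | undefined {
  const structured = asEnvelope(result.structuredContent);
  if (structured) return structured;
  const text = result.content?.[0]?.text;
  if (typeof text !== "string") return undefined;
  try {
    return asEnvelope(JSON.parse(text));
  } catch {
    return undefined; // plain-text result, not an envelope
  }
}
```

A result that fails either check is returned to the agent unchanged, which is what keeps servers that never send the envelope unaffected.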

Disable per-deployment with mcp.approvals.enabled: false.
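
As a rough illustration of the opt-out, the flag can be read with a default of true, matching the "mcp.approvals.enabled?: boolean (default true)" entry in the PR's file list. The interface and helper names here are assumptions; only the config key itself comes from the PR.

```typescript
// Assumed shape, inferred from the PR's description of src/config/types.mcp.ts.
interface McpApprovalsConfig {
  enabled?: boolean;
}

interface McpConfig {
  approvals?: McpApprovalsConfig;
}

// Hypothetical helper: consent handling stays on unless explicitly disabled.
function isMcpConsentEnabled(mcp: McpConfig | undefined): boolean {
  return mcp?.approvals?.enabled ?? true;
}

// A deployment that opts out would carry the equivalent of:
const optedOut: McpConfig = { approvals: { enabled: false } };
```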

Alternatives considered

  • Pure prompt-level discipline. What we have today. Brittle: depends on the tool description being read correctly by the model, and the model self-restraining. Not enforceable.
  • A new "mcp-tool" ChannelApprovalKind. Would touch ~10 files (forwarder, native-delivery, view-model, channel-runtime, presentation, types). For a presentation distinction the user mostly doesn't care about — the existing plugin renderer already shows pluginId: mcp:<server>, toolName: <tool>, description: <summary>. Reusing "plugin" keeps the gateway-side surface entirely untouched.
  • Plugins-only. Implementing this through the plugin SDK's requireApproval hook would work for OpenClaw-internal plugin tools but not for third-party mcp.servers[] entries (HomeBrain, GitHub, etc.) that load directly via pi-bundle-mcp. The fix has to live at the bundle-MCP materialize layer.
  • Out-of-band redemption URL. Make the action_id only redeemable via a dashboard click rather than a chat reply. Heavier UX (user has to leave the chat to approve) for the same trust property.

Impact

  • Affected: anyone wiring third-party MCP servers into OpenClaw via mcp.servers[] for tools that mutate external state. With the explosion of MCP servers in 2026 (HomeBrain, GitHub, Linear, Slack, Notion, browser-automation servers, etc.), the surface is growing fast.
  • Severity: medium → high. A model error on vault.reveal is a leaked secret. A model error on email.send_direct is a sent email that can't be unsent.
  • Frequency: every act-tier tool call.
  • Consequence: today, deployments either disable destructive MCP tools entirely or rely on prompt discipline. With this contract, server authors mark a tool act-tier by returning the envelope, and OpenClaw enforces the gate.

Evidence/examples

  • Vanilla OpenClaw 2026.4.24 on a HomeBrain box: model honours requires_confirmation from the tool description, completes the loop in one turn (re-calls with the token), real Nextcloud share URL is created. End-to-end audit log entry. (HomeBrain INTEGRATIONS_PLAN.md §10)
  • Reference server-side implementation of the envelope: scripts/mcp_common.py (the Consent helper does TTL + single-use + chat-id-scoped issuance/redemption).
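
The server-side half of the contract (TTL, single use, chat-scoped issuance and redemption) can be sketched as follows. The reference helper lives in scripts/mcp_common.py and is not reproduced here; this TypeScript version is an independent illustration with hypothetical names, and the injected id generator and clock exist only to keep the example self-contained and testable.

```typescript
interface PendingAction {
  chatId: string;
  expiresAt: number; // epoch milliseconds
}

// Hypothetical store; not a port of the reference Consent helper.
class ConsentStore {
  private readonly pending = new Map<string, PendingAction>();
  private readonly newId: () => string;
  private readonly now: () => number;

  // newId should be backed by a cryptographically secure generator in real use.
  constructor(newId: () => string, now: () => number = Date.now) {
    this.newId = newId;
    this.now = now;
  }

  // Issue a token bound to one chat, valid for ttlSeconds.
  issue(chatId: string, ttlSeconds: number): string {
    const id = this.newId();
    this.pending.set(id, { chatId, expiresAt: this.now() + ttlSeconds * 1000 });
    return id;
  }

  // Redeem succeeds at most once, only from the issuing chat, only before expiry.
  redeem(chatId: string, token: string): boolean {
    const entry = this.pending.get(token);
    if (!entry) return false; // unknown, fabricated, or already used
    if (entry.chatId !== chatId) return false; // wrong scope: token is left untouched
    this.pending.delete(token); // single use, even if it turns out to be expired
    return this.now() < entry.expiresAt;
  }
}
```

Leaving the token in place on a scope mismatch is a design choice made for this sketch, so that a guess from another chat cannot burn a legitimate pending approval; a stricter server could invalidate it instead.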

Additional information

  • I have a draft PR ready: #78303. It implements the contract reusing plugin.approval.request / waitDecision and the existing /approve <id> parser. ~920 lines incl. 20 unit tests + docs. No new approval kind, zero changes to gateway/forwarder/runtime/view-model.
  • Backward compatible — opt-in by the MCP server (return the envelope) and opt-out per-deployment (mcp.approvals.enabled: false).

🤖 Generated with Claude Code
