claude-code - 💡(How to fix) Fix [FEATURE] File-Reference Support for Tool Parameters to Reduce Output Token Waste [1 participants]

GitHub stats
anthropics/claude-code#53014 · Fetched 2026-04-25 06:14:42
Comments: 0 · Participants: 1 · Timeline: 3 (labeled ×3) · Reactions: 0

Root Cause

After Claude renders a widget preview, I often ask for small tweaks — "change the header color," "make the font Trebuchet MS," "move the signature up." Each tweak requires Claude to regenerate the entire HTML payload from scratch because there's no way to surgically edit the previous output and re-reference it. A 2-token CSS change costs 3,500+ output tokens.

Fix Action

Fix / Workaround

Allow tool parameters to accept a file-path reference that the tool-calling runtime resolves before dispatch.

Cost: Requires a new resolution step in the tool-calling layer between Claude's output and MCP dispatch.

Code Example

{
  "name": "visualize:show_widget",
  "parameters": {
    "title": "email_preview",
    "loading_messages": ["Rendering preview"],
    "widget_code": { "$file": "/home/claude/rendered_email.html" }
  }
}

---

{
  "$template": "email_preview_v1",
  "$vars": {
    "SUBJECT": "Q3 Update",
    "BODY_HTML": "<p>Content here</p>"
  }
}

Preflight Checklist

  • I have searched existing requests and this feature hasn't been requested yet
  • This is a single feature request (not multiple features)

Problem Statement

When Claude calls tools that accept large inline payloads (e.g., visualize:show_widget's widget_code parameter, or any MCP tool expecting HTML/JSON/code as a string parameter), the entire payload must be regenerated as output tokens on every call — even when the content is identical or nearly identical to something Claude just read from disk or produced in a previous step.

There is currently no mechanism for Claude to say: "use the contents of this file as the value for this parameter." Instead, Claude must read a file (input tokens), then re-emit its contents character by character into the tool call (output tokens). This effectively doubles the token cost for any template-driven workflow.

Concrete example

A user has a standardized HTML email template stored in a skill file (~3,000 tokens of boilerplate). On each use:

  1. Claude reads the template file → ~3,000 input tokens (cheap)
  2. Claude generates the tool call, re-emitting the full template with content slotted in → ~3,500 output tokens (expensive)

The ~3,000 tokens of static boilerplate are paid as output tokens every single invocation, even though they never change. Over a session with 5 email drafts, that's ~15,000 output tokens spent reproducing known-static content.
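The arithmetic above can be checked with a short sketch. It uses only the figures quoted in this issue (~3,500 output tokens per call, of which ~3,000 are static boilerplate, over 5 drafts). The function name is hypothetical, and the estimate assumes that with a file reference only the variable content is emitted per draft, ignoring the small overhead of the reference and any edit calls.

```python
def session_output_tokens(drafts: int, boilerplate: int, variable: int,
                          by_reference: bool) -> int:
    """Rough output-token estimate for a session of `drafts` tool calls."""
    if by_reference:
        # Assumption: the static template already lives on disk, so each
        # draft only emits its variable content.
        return drafts * variable
    # Today: every call re-emits boilerplate plus variable content.
    return drafts * (boilerplate + variable)


# Figures from the example: ~3,500 tokens per call, ~3,000 of them static.
inline = session_output_tokens(5, 3000, 500, by_reference=False)
referenced = session_output_tokens(5, 3000, 500, by_reference=True)
wasted = inline - referenced
print(inline, referenced, wasted)
```

Under these assumptions the difference equals the ~15,000 tokens of static content cited above.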

Scope of impact

This is not specific to one tool. It affects any workflow where:

  • A tool parameter expects a large string payload (HTML, JSON, code, SVG, markdown)
  • The payload follows a repeatable template or structure
  • The user iterates multiple times in a session (previews, drafts, revisions)
  • MCP servers accept inline content rather than file references

Examples across Claude surfaces: show_widget (visualize), create_view (Excalidraw), validate_and_render_mermaid_diagram (Mermaid), artifact creation, and any custom MCP tool that accepts template-like input.

Proposed Solution

Option A: File Reference in Tool Parameters (Recommended)

Allow tool parameters to accept a file-path reference that the tool-calling runtime resolves before dispatch.

{
  "name": "visualize:show_widget",
  "parameters": {
    "title": "email_preview",
    "loading_messages": ["Rendering preview"],
    "widget_code": { "$file": "/home/claude/rendered_email.html" }
  }
}

How it works: Claude writes the payload to a file (one generation), then references it in subsequent tool calls. The runtime reads the file and injects its contents into the parameter before sending to the MCP server.
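A minimal sketch of what that resolution step might look like, assuming the runtime sees the tool call as a plain JSON-like structure. The `resolve_file_refs` function is hypothetical and not part of any existing runtime; only the `$file` key and the example call come from this proposal.

```python
import tempfile
from pathlib import Path
from typing import Any

FILE_KEY = "$file"  # reference key taken from the proposal's example


def resolve_file_refs(value: Any) -> Any:
    """Recursively replace {"$file": "<path>"} objects with the file's text."""
    if isinstance(value, dict):
        # Assumption: a reference is a dict whose only key is "$file".
        if set(value) == {FILE_KEY} and isinstance(value[FILE_KEY], str):
            return Path(value[FILE_KEY]).read_text(encoding="utf-8")
        return {k: resolve_file_refs(v) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_file_refs(v) for v in value]
    return value  # strings, numbers, booleans, None pass through unchanged


# Demo with a temporary file standing in for the rendered payload.
with tempfile.TemporaryDirectory() as tmp:
    html_path = Path(tmp) / "rendered_email.html"
    html_path.write_text("<h1>Q3 Update</h1>", encoding="utf-8")
    call = {
        "name": "visualize:show_widget",
        "parameters": {
            "title": "email_preview",
            "loading_messages": ["Rendering preview"],
            "widget_code": {FILE_KEY: str(html_path)},
        },
    }
    resolved = resolve_file_refs(call)
```

After resolution, `widget_code` is an ordinary string, so the MCP server would receive the same payload it does today.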

Advantages:

  • Claude generates the content once, references it N times
  • Works with any tool; tool authors don't need to change anything
  • Enables str_replace workflows: edit the file surgically, then re-reference it
  • Composable with existing file-manipulation tools

Cost: Requires a new resolution step in the tool-calling layer between Claude's output and MCP dispatch.

It also unlocks a secondary benefit: iterative refinement without full regeneration. Today, if a user says "change the header color in that widget," Claude must regenerate the entire widget payload. With file references, Claude would str_replace the color value in the file and re-call the tool with the same $file reference — costing only the edit tokens, not the full payload.
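The edit-then-re-reference flow could look roughly like this. The helper below is a hypothetical stand-in for the str_replace tool mentioned above; requiring exactly one match is an assumption made for safety in this sketch, not a documented behavior.

```python
import tempfile
from pathlib import Path


def str_replace_in_file(path: Path, old: str, new: str) -> None:
    """Replace `old` with `new` in the file, requiring exactly one match."""
    text = path.read_text(encoding="utf-8")
    count = text.count(old)
    if count != 1:
        raise ValueError(f"expected exactly one match for {old!r}, found {count}")
    path.write_text(text.replace(old, new), encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    widget = Path(tmp) / "rendered_email.html"
    widget.write_text('<h1 style="color: #336699">Q3 Update</h1>', encoding="utf-8")

    # The reference emitted in the tool call stays the same across revisions.
    reference = {"$file": str(widget)}

    # "Change the header color": only the old and new values are generated.
    str_replace_in_file(widget, "#336699", "#cc0000")
    updated = widget.read_text(encoding="utf-8")
```

The tool call that follows would carry the unchanged `reference` object, so the cost of a revision scales with the size of the edit rather than the size of the payload.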

Alternative Solutions

Option B: Content-Addressed Caching

The runtime hashes large string parameters and caches them. If Claude produces an identical (or near-identical) payload, the cached version is used and output tokens are not charged for the repeated portion.
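A minimal sketch of the hashing half of this idea, assuming SHA-256 as the content hash. `PayloadCache` is a hypothetical name, and the sketch does not address how the model would skip emitting a payload it has already produced.

```python
import hashlib


class PayloadCache:
    """Content-addressed store: identical payloads map to one digest."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def put(self, payload: str) -> str:
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        self._store.setdefault(digest, payload)  # keep the first copy
        return digest

    def get(self, digest: str) -> str:
        return self._store[digest]

    def __len__(self) -> int:
        return len(self._store)


cache = PayloadCache()
first = cache.put("<h1 style='color: blue'>Q3 Update</h1>")
repeat = cache.put("<h1 style='color: blue'>Q3 Update</h1>")
tweaked = cache.put("<h1 style='color: red'>Q3 Update</h1>")
```

Here `first` and `repeat` share a digest, while `tweaked` gets a new one even though only one word changed, which illustrates why plain hashing covers exact repeats only.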

Advantages:

  • Fully transparent to Claude and tool authors
  • No new syntax or API surface

Disadvantages:

  • Only helps with exact or near-exact repetitions
  • Doesn't help with template-with-variable-content patterns
  • Complex to implement for "near-identical" matching

Option C: Template Registry

Allow Claude to register a named template with placeholder syntax, then invoke tools by template ID + variable substitutions.

{
  "$template": "email_preview_v1",
  "$vars": {
    "SUBJECT": "Q3 Update",
    "BODY_HTML": "<p>Content here</p>"
  }
}
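One way the registry and expansion might work, sketched with Python's standard `string.Template`. The issue does not specify a placeholder syntax, so the `$NAME` placeholders, the `register_template` and `expand` functions, and the template body are all assumptions for illustration.

```python
from string import Template

_registry: dict[str, Template] = {}


def register_template(template_id: str, source: str) -> None:
    """Store a template once under a name (the one-time registration call)."""
    _registry[template_id] = Template(source)


def expand(invocation: dict) -> str:
    """Expand a {"$template": ..., "$vars": {...}} invocation into a string."""
    template = _registry[invocation["$template"]]
    # substitute() raises KeyError if a placeholder has no matching variable.
    return template.substitute(invocation["$vars"])


register_template(
    "email_preview_v1",
    "<html><h1>$SUBJECT</h1><div>$BODY_HTML</div></html>",
)

rendered = expand({
    "$template": "email_preview_v1",
    "$vars": {
        "SUBJECT": "Q3 Update",
        "BODY_HTML": "<p>Content here</p>",
    },
})
```

With `string.Template`, any literal `$` in the HTML or CSS would have to be written as `$$`, which is one reason a real registry might choose a different placeholder syntax.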

Advantages:

  • Maximum token savings — only variable content is generated after registration
  • Explicit and debuggable

Disadvantages:

  • New abstraction layer to maintain
  • Template registration is itself a tool call
  • Scoped to session (lost on conversation end unless persisted)

Why Option A over B and C

Option A requires the least new infrastructure (file I/O already exists), composes with existing tools (create_file, str_replace, view), and works for any tool parameter without requiring tool authors to change anything. Options B and C are interesting but either too narrow (B only helps with exact or near-exact repetitions) or too heavy (C introduces a new abstraction layer).

Priority

High - Significant impact on productivity

Feature Category

MCP server integration

Use Case Example

Template-driven email formatting

I maintain a standardized HTML email template (~3,000 tokens of boilerplate) as a skill file. When I ask Claude to format an email, it reads the template (input tokens), then regenerates the entire template with my content slotted in as the widget_code parameter of show_widget (output tokens). The boilerplate never changes, but I pay for it as output tokens every time. In a session where I draft 5 emails, that's ~15,000 output tokens on static content.

Iterative revision workflows

After Claude renders a widget preview, I often ask for small tweaks — "change the header color," "make the font Trebuchet MS," "move the signature up." Each tweak requires Claude to regenerate the entire HTML payload from scratch because there's no way to surgically edit the previous output and re-reference it. A 2-token CSS change costs 3,500+ output tokens.

Recurring report generation

Dashboard widgets, slide decks, and recurring reports follow the same template with different data each time. The template structure is known and static, but the tool-calling architecture forces full regeneration on every invocation.

Surfaces affected

  • claude.ai (show_widget, artifacts)
  • Claude Code (any MCP tool with large params)
  • Claude Desktop (MCP tools)
  • Not specific to any one MCP server — this is a tool-calling layer concern

Additional Context

Related Issues

  • #12836 Support Tool Search and Programmatic Tool Use betas for reduced token consumption: Same philosophical lineage: "don't pay for what you already have." That issue targets input token waste from tool definitions loaded at session start. This issue targets output token waste from tool parameter payloads regenerated on every call. Together they represent the two sides of the token efficiency problem for tool-heavy workflows: definitions (input) and invocations (output).

  • #16546 Model attempts file edits without reading file first: Related symptom (wasted output tokens), different root cause. That issue is behavioral (Claude guesses file content instead of reading). This issue is architectural (Claude has the content but must re-emit it because there's no reference mechanism).

  • #42647 High token burn due to redundant context resubmission: Another architectural inefficiency where unchanged content gets re-sent. Same theme of "the system should recognize when content hasn't changed and avoid re-processing it."

Broader context

This pattern becomes increasingly costly as users adopt template-driven workflows (email formatting, slide generation, dashboard widgets, recurring reports). The current architecture forces a tradeoff: produce minimal tool payloads at the expense of output quality, or produce high-quality payloads at significant and avoidable token cost. A file-reference primitive resolves this tension.

The iterative case is especially painful. When a user asks to tweak one CSS property in a rendered widget, the entire HTML payload must be regenerated from scratch — there's no way to surgically edit the previous output and re-reference it. This makes revision-heavy workflows (draft → review → tweak → re-render) disproportionately expensive relative to the actual changes being made.

Extent analysis

TL;DR

Supporting a file reference in tool parameters, such as { "$file": "<path>" }, would let Claude write a payload to a file once and reference it in subsequent tool calls instead of re-emitting it, reducing output token costs.

Guidance

  • Introduce a new resolution step in the tool-calling layer to support file references, enabling Claude to generate content once and reference it multiple times.
  • Update the tool-calling syntax to accept a file-path reference, such as { "$file": "/home/claude/rendered_email.html" }, to allow for efficient reuse of generated content.
  • Consider the advantages of this approach, including composability with existing file-manipulation tools and the ability to perform iterative refinement without full regeneration.
  • Evaluate the potential impact on various surfaces, including claude.ai, Claude Code, and Claude Desktop, to ensure a unified solution.

Example

{
  "name": "visualize:show_widget",
  "parameters": {
    "title": "email_preview",
    "loading_messages": ["Rendering preview"],
    "widget_code": { "$file": "/home/claude/rendered_email.html" }
  }
}

Notes

This solution assumes that the file I/O infrastructure already exists and can be leveraged to support the new file-reference mechanism. The implementation should ensure that the file reference is properly resolved before dispatching the tool call to the MCP server.

Recommendation

Adopt Option A, the $file reference, to reduce output token costs and improve the efficiency of template-driven workflows. It is a proposed runtime feature rather than a workaround users can apply today: it needs a new resolution step in the tool-calling layer, but otherwise requires minimal new infrastructure and composes well with existing tools.
