ollama - ✅(Solved) Fix Feature: Capability-aware tool presentation based on model size [1 pull request, 3 comments, 2 participants]

GitHub stats
ollama/ollama#15067 · Fetched 2026-04-08 01:31:29
Comments: 3 · Participants: 2 · Timeline events: 13 · Reactions: 0
Timeline (top): commented ×3, mentioned ×3, subscribed ×3, cross-referenced ×2

When using /api/chat with tools, Ollama presents all tool definitions identically regardless of model size. A 0.8B model receives the same 80 tool descriptions as a 35B model. This wastes prompt tokens and degrades tool selection accuracy for smaller models.

Since Ollama already knows the loaded model's parameter count, it could adapt tool presentation automatically.

PR fix notes

PR #15116: Expose numeric parameter_count in model details

Description (problem / solution / changelog)

Fixes #15067

Summary

Expose a numeric parameter_count field in model metadata so clients can adapt tool definitions based on model size (Option B from the issue: expose model metadata to clients).

Changes

  • API:
    • Extend ModelDetails with a new parameter_count field (in addition to the existing human-readable parameter_size).
  • Server:
    • /api/show:
      • Populate details.parameter_count from:
        • Safetensors LLM metadata (general.parameter_count), when available.
        • Image generation manifest ParameterCount, when available.
        • Otherwise, derive an approximate numeric count from details.parameter_size (e.g., "7B", "430M", "15K").
    • /api/ps:
      • Include details.parameter_count by deriving it from the existing details.parameter_size.
    • /api/tags:
      • Include details.parameter_count for listed models, also derived from details.parameter_size.
  • Tests:
    • Update existing ShowHandler test expectations to account for the new parameter_count field while keeping prior behavior unchanged.
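
The fallback described above (deriving a number from strings such as "7B", "430M", or "15K") can be illustrated with a short sketch. This is not the PR's Go code, which is not shown on this page; the function name approx_parameter_count and the extra "T" suffix are assumptions made for illustration.

```python
import re
from typing import Optional

# Multipliers for the suffixes named in the PR description ("K", "M", "B").
# "T" is an assumption added for completeness.
_SUFFIXES = {"K": 1e3, "M": 1e6, "B": 1e9, "T": 1e12}


def approx_parameter_count(parameter_size: str) -> Optional[int]:
    """Convert a human-readable size such as "7B" or "430M" to an approximate count."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([KMBT])\s*", parameter_size.upper())
    if match is None:
        return None  # unrecognized format: the caller chooses a fallback
    number, suffix = match.groups()
    return round(float(number) * _SUFFIXES[suffix])
```

The result is approximate by construction: a label like "7B" is already rounded, so the derived count should only be used for coarse decisions such as tier selection.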

Rationale

Clients that use /api/chat with tools can now call /api/show (or reuse /api/ps / /api/tags) to read a numeric parameter_count and implement capability-aware tool presentation on their side (e.g., different tool tiers for small vs. large models), without changing the current chat/tool prompting behavior on the server.
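
A client-side sketch of that flow might look like the following. It assumes an Ollama build that includes this PR (so details.parameter_count is present in the /api/show response), the default local address, and a 10-billion-parameter cutoff borrowed from the example later on this page. The helper names are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address


def parameter_count_from_show(show_response: dict):
    """Return details.parameter_count if present, else None."""
    return (show_response.get("details") or {}).get("parameter_count")


def fetch_parameter_count(model: str, base_url: str = OLLAMA_URL):
    """POST /api/show and read the numeric count (needs a running server)."""
    request = urllib.request.Request(
        f"{base_url}/api/show",
        data=json.dumps({"model": model}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return parameter_count_from_show(json.load(response))


def choose_tools(parameter_count, full_tools, small_tools, threshold=10_000_000_000):
    """Pick the reduced tool list for small models; use the full list when unknown."""
    if parameter_count is not None and parameter_count < threshold:
        return small_tools
    return full_tools
```

Defaulting to the full tool list when the count is missing keeps behavior unchanged against servers that do not report parameter_count.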

Changed files

  • api/types.go (modified, +1/-0)
  • server/routes.go (modified, +55/-1)
  • server/routes_create_test.go (modified, +1/-0)

Original issue

Summary

When using /api/chat with tools, Ollama presents all tool definitions identically regardless of model size. A 0.8B model receives the same 80 tool descriptions as a 35B model. This wastes prompt tokens and degrades tool selection accuracy for smaller models.

Since Ollama already knows the loaded model's parameter count, it could adapt tool presentation automatically.

The Problem

Benchmarked with Ollama's native tool calling API (/api/chat with tools), 80 tools, 50 prompts:

Model          Accuracy   Prompt tokens
qwen2.5:1.5b   50%        3,408
qwen3.5:9b     80%        5,272
gpt-oss:20b    80%        2,143
qwen3.5:35b    88%        5,272

Small models waste 3,000-5,000 tokens on tool descriptions they can't effectively use.

Key Finding

Tool selection accuracy decomposes as:

P(correct tool) = P(correct family) × P(correct tool | correct family)

Even qwen2.5:1.5b achieves 89% within-family accuracy. The bottleneck is navigation, not selection.
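
To make the decomposition concrete, the family-level accuracy can be backed out from the two reported numbers. The sketch below assumes the 50% overall figure and the 89% within-family figure were measured on the same prompt set, which the issue does not state explicitly.

```python
def implied_family_accuracy(overall: float, within_family: float) -> float:
    """Solve P(tool) = P(family) * P(tool | family) for P(family)."""
    if not 0 < within_family <= 1:
        raise ValueError("within_family must be in (0, 1]")
    return overall / within_family


# qwen2.5:1.5b: 50% overall accuracy, 89% within-family accuracy.
family = implied_family_accuracy(0.50, 0.89)
print(f"implied P(correct family) = {family:.2f}")  # prints 0.56
```

Under that assumption the model reaches the right tool family only about 56% of the time, which is consistent with the claim that navigation is the bottleneck.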

Proposed Feature

When the loaded model has fewer parameters, Ollama could:

Option A: Server-side tool adaptation

Ollama automatically shortens tool descriptions and reduces parameter schemas for smaller models. The client sends full tools; Ollama adapts them before prompting.
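
The issue does not specify how Ollama would shorten definitions. As one hypothetical heuristic, an adapter could truncate each description to a few words and keep only required parameters. The function name shorten_tool and the word limit are assumptions for illustration.

```python
import copy


def shorten_tool(tool: dict, max_words: int = 6) -> dict:
    """Return a copy with a truncated description and only required parameters."""
    slim = copy.deepcopy(tool)  # never mutate the client's original definition
    function = slim["function"]
    words = function.get("description", "").split()
    function["description"] = " ".join(words[:max_words])
    params = function.get("parameters")
    if isinstance(params, dict) and "properties" in params:
        required = params.get("required", [])
        params["properties"] = {
            name: schema
            for name, schema in params["properties"].items()
            if name in required
        }
    return slim
```

Dropping optional parameters changes what the model can request, so a real implementation would need to confirm that the tool still behaves sensibly with required arguments only.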

Option B: Expose model metadata to clients

Add model parameter count to /api/show response (if not already), so clients can adapt tools before sending. This is the lighter-touch option.

Option C: Support tier hints in tool definitions

Allow an optional tiers field in each tool definition:

{
  "type": "function",
  "function": {
    "name": "file_read",
    "description": "Read file contents with line numbers, offset, and encoding",
    "parameters": { ... },
    "tiers": {
      "small": {
        "description": "Read file",
        "parameters": {
          "type": "object",
          "properties": { "path": {"type": "string"} },
          "required": ["path"]
        }
      }
    }
  }
}

Ollama picks the right tier based on loaded model size. Unknown tiers fall back to the top-level definition.

Benchmark Evidence

With tier-adapted presentation on Ollama:

Strategy                               1.5B          20B
Baseline (all 80 tools)                50%           80%
Hybrid (8 detailed + 72 name-only)     60% (+10pp)   76%
Semantic reorder + category hint       54%           88% (+8pp)

Token reduction: 83-97% depending on strategy.
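
The issue does not say how the 8 detailed tools in the hybrid strategy were chosen. The sketch below assumes a simple word-overlap score between the prompt and each tool, which is a stand-in for whatever ranking the benchmark used. All function names are hypothetical.

```python
def name_only(tool: dict) -> dict:
    """Strip a tool to its name, with an empty description and parameter schema."""
    return {
        "type": "function",
        "function": {
            "name": tool["function"]["name"],
            "description": "",
            "parameters": {"type": "object", "properties": {}},
        },
    }


def overlap_score(prompt: str, tool: dict) -> int:
    """Count prompt words that also appear in the tool's name or description."""
    function = tool["function"]
    text = f'{function["name"]} {function.get("description", "")}'.lower()
    tool_words = set(text.replace("_", " ").split())
    return len(set(prompt.lower().split()) & tool_words)


def hybrid_tools(prompt: str, tools: list, detailed: int = 8) -> list:
    """Keep the best-matching tools in full and reduce the rest to names."""
    ranked = sorted(tools, key=lambda t: overlap_score(prompt, t), reverse=True)
    return ranked[:detailed] + [name_only(t) for t in ranked[detailed:]]
```

Most of the token savings come from the name-only entries, since they carry no description or parameter schema.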

References

All benchmarks were run on Ollama's /api/chat with native tool calling. Happy to provide additional data.

extent analysis

Fix Plan

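PR #15116 addresses the issue with Option B: it exposes parameter_count so that clients can adapt tools themselves, and it leaves the server's chat/tool prompting behavior unchanged. The plan below instead sketches Option C (tier hints in tool definitions), which would add an optional tiers field to each tool definition. It is a proposal, not behavior that the PR adds.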

Step-by-Step Solution:

  1. Update Tool Definitions: Add an optional tiers field to each tool definition, with a description and parameter schema tailored for smaller models.
  2. Modify Ollama's API: Update the /api/chat endpoint to parse the tiers field and select the appropriate tier based on the loaded model's parameter count.
  3. Implement Tier Selection Logic: Map the parameter count to a tier name and use that tier's definition, falling back to the top-level definition when the tier is missing.

Example Code:

Updated tool definition with tiers (JSON):
{
  "type": "function",
  "function": {
    "name": "file_read",
    "description": "Read file contents with line numbers, offset, and encoding",
    "parameters": { ... },
    "tiers": {
      "small": {
        "description": "Read file",
        "parameters": {
          "type": "object",
          "properties": { "path": {"type": "string"} },
          "required": ["path"]
        }
      }
    }
  }
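}

# Example tier selection logic in Python (illustrative; not part of Ollama)
import copy

SMALL_MODEL_MAX_PARAMS = 10e9  # assumed cutoff between "small" and "default"

def select_tier(model_size):
    """Map a parameter count to a tier name."""
    return "small" if model_size < SMALL_MODEL_MAX_PARAMS else "default"

def get_tool_definition(tool, model_size):
    """Return the tool with the matching tier's fields merged in and "tiers" removed."""
    adapted = copy.deepcopy(tool)
    function = adapted["function"]
    # "tiers" lives under "function" and is optional; a missing tier falls back
    # to the top-level definition.
    override = function.pop("tiers", {}).get(select_tier(model_size), {})
    function.update(override)
    return adapted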

Verification

To verify the fix, test the updated /api/chat endpoint with different model sizes and tool definitions. Measure the prompt tokens used and tool selection accuracy to ensure they have improved.
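
A minimal scoring sketch for such a comparison is shown below. It assumes non-streaming /api/chat responses, where tool calls appear under message.tool_calls and the prompt token count under prompt_eval_count. Counting a run as correct when the first tool call matches the expected name is an assumption; the issue does not describe its scoring rule.

```python
def called_tool_names(chat_response: dict) -> list:
    """Names of the tools the model called in one /api/chat response."""
    calls = (chat_response.get("message") or {}).get("tool_calls") or []
    return [call["function"]["name"] for call in calls]


def summarize(runs: list) -> dict:
    """Summarize a list of (expected_tool_name, chat_response) pairs."""
    if not runs:
        raise ValueError("no runs to summarize")
    correct = sum(
        1
        for expected, response in runs
        if called_tool_names(response)[:1] == [expected]
    )
    # A missing prompt_eval_count is treated as 0 here, which lowers the mean.
    tokens = [response.get("prompt_eval_count", 0) for _, response in runs]
    return {
        "accuracy": correct / len(runs),
        "mean_prompt_tokens": sum(tokens) / len(tokens),
    }
```

Running the same prompts once with the full tool list and once with the adapted list gives two summaries that can be compared directly.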

Extra Tips

  • Ensure the tiers field is optional to maintain backward compatibility with existing tool definitions.
  • Consider adding a default tier or fallback mechanism for unknown model sizes or missing tier definitions.
  • Continuously monitor and evaluate the performance of the tier selection logic to ensure it adapts effectively to different model sizes and use cases.
