ollama - ✅(Solved) Fix Feature: Capability-aware tool presentation based on model size [1 pull requests, 3 comments, 2 participants]

ollama • 2026-03-26 05:02:17


GitHub stats

ollama/ollama#15067•Fetched 2026-04-08 01:31:29


Author

spranab

Participants

PiyushInt

spranab

Timeline (top)

commented ×3, mentioned ×3, subscribed ×3, cross-referenced ×2

When using /api/chat with tools, Ollama presents all tool definitions identically regardless of model size. A 0.8B model receives the same 80 tool descriptions as a 35B model. This wastes prompt tokens and degrades tool selection accuracy for smaller models.

Since Ollama already knows the loaded model's parameter count, it could adapt tool presentation automatically.

Root Cause

Ollama passes every tool definition to the model unchanged, whatever the model's size. It already knows the loaded model's parameter count, so it could adapt tool presentation automatically.

PR fix notes

PR #15116: Expose numeric parameter_count in model details

Repository: ollama/ollama
Author: PiyushInt
State: open | merged: False
Link: https://github.com/ollama/ollama/pull/15116

Description (problem / solution / changelog)

Fixes #15067

Summary

Expose a numeric parameter_count field in model metadata so clients can adapt tool definitions based on model size (Option B from the issue: expose model metadata to clients).

Changes

API:
- Extend ModelDetails with a new parameter_count field (in addition to the existing human-readable parameter_size).
Server:
- /api/show:
  - Populate details.parameter_count from:
    - Safetensors LLM metadata (general.parameter_count), when available.
    - Image generation manifest ParameterCount, when available.
    - Otherwise, derive an approximate numeric count from details.parameter_size (e.g., "7B", "430M", "15K").
- /api/ps:
  - Include details.parameter_count by deriving it from the existing details.parameter_size.
- /api/tags:
  - Include details.parameter_count for listed models, also derived from details.parameter_size.
Tests:
- Update existing ShowHandler test expectations to account for the new parameter_count field while keeping prior behavior unchanged.
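
The fallback described above, deriving a numeric count from a string such as "7B", "430M", or "15K", can be sketched as follows. This is an illustrative Python sketch, not the code in the PR (which modifies Go files); the function name `approx_parameter_count`, the accepted suffixes, and the rounding behaviour are assumptions.

```python
import re
from typing import Optional

# Suffix multipliers for human-readable sizes such as "7B", "430M", "15K".
# The "T" suffix is an assumption; the PR notes only show K, M and B.
_MULTIPLIERS = {"K": 1e3, "M": 1e6, "B": 1e9, "T": 1e12}
_SIZE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([KMBT])\s*$", re.IGNORECASE)


def approx_parameter_count(parameter_size: str) -> Optional[int]:
    """Convert a string like "7B" into an approximate integer count.

    Returns None when the string does not match the expected shape.
    """
    match = _SIZE_RE.match(parameter_size)
    if match is None:
        return None
    number, suffix = match.groups()
    return round(float(number) * _MULTIPLIERS[suffix.upper()])
```

The result is only as precise as the label: a model tagged "7B" may really have 7.6 billion parameters, which is why the PR notes call the derived value approximate.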

Rationale

Clients that use /api/chat with tools can now call /api/show (or reuse /api/ps / /api/tags) to read a numeric parameter_count and implement capability-aware tool presentation on their side (e.g., different tool tiers for small vs. large models), without changing the current chat/tool prompting behavior on the server.
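
A client-side flow along these lines might look like the sketch below. It assumes the `details.parameter_count` field proposed in PR #15116 (the PR is still open, so the field may be absent), a local server at `http://localhost:11434`, and a hypothetical 10-billion-parameter cut-off; `fetch_parameter_count` and `choose_tool_set` are illustrative names, not part of any Ollama client library.

```python
import json
import urllib.request
from typing import Optional

# Hypothetical cut-off; the issue does not define where "small" ends.
SMALL_MODEL_MAX_PARAMS = 10_000_000_000


def fetch_parameter_count(model: str, host: str = "http://localhost:11434") -> Optional[int]:
    """Read details.parameter_count from /api/show, or None if the field is absent."""
    request = urllib.request.Request(
        f"{host}/api/show",
        data=json.dumps({"model": model}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        details = json.load(response).get("details", {})
    return details.get("parameter_count")  # field proposed in PR #15116


def choose_tool_set(parameter_count: Optional[int], full_tools: list, compact_tools: list) -> list:
    """Return the compact tool list for small models, the full list otherwise."""
    if parameter_count is not None and parameter_count < SMALL_MODEL_MAX_PARAMS:
        return compact_tools
    return full_tools  # unknown size: keep the current behaviour
```

The chosen list is then sent as the `tools` array of the /api/chat request. Falling back to the full list when the count is unknown keeps behaviour unchanged on servers that do not expose the field.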

Changed files

api/types.go (modified, +1/-0)
server/routes.go (modified, +55/-1)
server/routes_create_test.go (modified, +1/-0)


Summary

Since Ollama already knows the loaded model's parameter count, it could adapt tool presentation automatically.

The Problem

Benchmarked with Ollama's native tool calling API (/api/chat with tools), 80 tools, 50 prompts:

| Model | Accuracy | Prompt tokens |
| --- | --- | --- |
| qwen2.5:1.5b | 50% | 3,408 |
| qwen3.5:9b | 80% | 5,272 |
| gpt-oss:20b | 80% | 2,143 |
| qwen3.5:35b | 88% | 5,272 |

Small models waste 3,000-5,000 tokens on tool descriptions they can't effectively use.

Key Finding

Tool selection accuracy decomposes as:

P(correct tool) = P(correct family) × P(correct tool | correct family)

Even qwen2.5:1.5b achieves 89% within-family accuracy. The bottleneck is navigation, not selection.
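
To see why navigation is the bottleneck, the decomposition can be solved for P(correct family). The sketch below plugs in the qwen2.5:1.5b figures quoted above, assuming the 50% overall accuracy and the 89% within-family accuracy come from the same benchmark setup (the issue does not say so explicitly); the function name is illustrative.

```python
def implied_family_accuracy(overall: float, within_family: float) -> float:
    """Solve P(tool) = P(family) * P(tool | family) for P(family)."""
    if not 0 < within_family <= 1:
        raise ValueError("within_family must be in (0, 1]")
    return overall / within_family


# qwen2.5:1.5b figures quoted above: 50% overall, 89% within-family.
family = implied_family_accuracy(0.50, 0.89)
print(f"implied P(correct family) = {family:.2f}")  # prints 0.56
```

Under that assumption the model lands in the right tool family only about 56% of the time, while it picks the right tool within a family 89% of the time.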

Proposed Feature

When the loaded model is small, Ollama could do one of the following:

Option A: Server-side tool adaptation

Ollama automatically shortens tool descriptions and reduces parameter schemas for smaller models. The client sends full tools; Ollama adapts them before prompting.
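
One possible shape for such an adaptation is sketched below in Python for illustration only; Ollama's server is written in Go and the issue does not specify the shortening rules. The sketch assumes a crude policy: keep the description up to its first period and drop every parameter that is not listed as required. `compact_tool` is a hypothetical name.

```python
import copy


def compact_tool(tool: dict) -> dict:
    """Return a copy of `tool` with a shortened description and only required parameters."""
    compact = copy.deepcopy(tool)  # never mutate the client's original definition
    function = compact["function"]
    # Keep the text up to the first period as a crude "short description".
    description = function.get("description", "")
    function["description"] = description.split(".")[0].strip()
    params = function.get("parameters")
    if isinstance(params, dict):
        required = params.get("required", [])
        properties = params.get("properties", {})
        # Drop optional parameters; small models see only what they must supply.
        params["properties"] = {k: v for k, v in properties.items() if k in required}
    return compact
```

A description with no period passes through unchanged, so a real implementation would likely need a smarter summarisation rule than this one.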

Option B: Expose model metadata to clients

Add model parameter count to /api/show response (if not already), so clients can adapt tools before sending. This is the lighter-touch option.

Option C: Support tier hints in tool definitions

Allow an optional tiers field in each tool definition:

{
  "type": "function",
  "function": {
    "name": "file_read",
    "description": "Read file contents with line numbers, offset, and encoding",
    "parameters": { ... },
    "tiers": {
      "small": {
        "description": "Read file",
        "parameters": {
          "type": "object",
          "properties": { "path": {"type": "string"} },
          "required": ["path"]
        }
      }
    }
  }
}

Ollama picks the right tier based on loaded model size. Unknown tiers fall back to the top-level definition.

Benchmark Evidence

With tier-adapted presentation on Ollama:

| Strategy | 1.5B | 20B |
| --- | --- | --- |
| Baseline (all 80 tools) | 50% | 80% |
| Hybrid (8 detailed + 72 name-only) | 60% (+10pp) | 76% |
| Semantic reorder + category hint | 54% | 88% (+8pp) |

Token reduction: 83-97% depending on strategy.
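
The hybrid row above (8 detailed tools plus 72 name-only entries) could be produced by something like the following sketch. How the benchmark chose the detailed tools and what a "name-only" entry looked like are not stated in the issue, so both are assumptions here: the caller supplies the set of names to keep in full, and every other tool is reduced to a stub with an empty description and no parameters.

```python
def hybrid_presentation(tools: list, detailed_names: set) -> list:
    """Keep full definitions for `detailed_names`; reduce the rest to name-only stubs."""
    presented = []
    for tool in tools:
        function = tool["function"]
        if function["name"] in detailed_names:
            presented.append(tool)
        else:
            # Name-only stub: empty description, empty parameter schema.
            stub = {
                "name": function["name"],
                "description": "",
                "parameters": {"type": "object", "properties": {}},
            }
            presented.append({"type": "function", "function": stub})
    return presented
```

Because the stubs carry no schema, a model that selects a name-only tool may need a follow-up turn with the full definition before it can fill in the arguments.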

References

Whitepaper: DOI: 10.5281/zenodo.19228710
SDK: yantrikos-sdk on PyPI
Benchmark: github.com/yantrikos/tier (harness_v3.py, all data in JSONL)

All benchmarks were run on Ollama's /api/chat with native tool calling. Happy to provide additional data.

extent analysis

Fix Plan

This plan sketches Option C (tier hints in tool definitions): each tool definition gains an optional tiers field, and Ollama selects a tier based on the loaded model's size. Note that the linked PR #15116 takes Option B instead, exposing parameter_count so that clients can adapt tools themselves.

Step-by-Step Solution:

Update Tool Definitions: Add a tiers field to each tool definition, with descriptions and parameters tailored to smaller models.
Modify Ollama's API: Update the /api/chat endpoint to parse the tiers field and select the appropriate tier based on the loaded model's size.
Implement Tier Selection Logic: Map the model's parameter count to a tier name and return the matching tool definition, falling back to the top-level definition when no tier matches.

Example Code:

// Updated tool definition with tiers
{
  "type": "function",
  "function": {
    "name": "file_read",
    "description": "Read file contents with line numbers, offset, and encoding",
    "parameters": { ... },
    "tiers": {
      "small": {
        "description": "Read file",
        "parameters": {
          "type": "object",
          "properties": { "path": {"type": "string"} },
          "required": ["path"]
        }
      }
    }
  }
}

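# Example tier selection logic in Python
SMALL_MODEL_MAX_PARAMS = 10e9  # illustrative threshold, not an Ollama constant


def select_tier(model_size):
    """Return the tier name for a model with `model_size` parameters."""
    return "small" if model_size < SMALL_MODEL_MAX_PARAMS else "default"


def get_tool_definition(tool, model_size):
    """Return a copy of `tool` with the matching tier's overrides applied."""
    # `tiers` is optional and lives under "function" in the JSON above.
    function = dict(tool["function"])
    tiers = function.pop("tiers", {})
    # A missing or unknown tier leaves the top-level definition untouched.
    function.update(tiers.get(select_tier(model_size), {}))
    return {**tool, "function": function}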

Verification

To verify the fix, test the updated /api/chat endpoint with different model sizes and tool definitions. Measure the prompt tokens used and tool selection accuracy to ensure they have improved.
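
A measurement helper for this check might look like the sketch below. It assumes each benchmark prompt is recorded as a dict holding the expected tool name, the tool the model actually called, and the `prompt_eval_count` value from the /api/chat response; the record layout and function names are hypothetical.

```python
def summarize_runs(runs: list) -> dict:
    """Compute accuracy and mean prompt tokens from per-prompt records.

    Each record is a dict with "expected" and "called" tool names plus
    "prompt_eval_count", the prompt token count reported by /api/chat.
    """
    if not runs:
        raise ValueError("runs must not be empty")
    correct = sum(1 for r in runs if r["called"] == r["expected"])
    tokens = sum(r["prompt_eval_count"] for r in runs)
    return {"accuracy": correct / len(runs), "mean_prompt_tokens": tokens / len(runs)}


def token_reduction(baseline_tokens: float, adapted_tokens: float) -> float:
    """Fraction of prompt tokens saved relative to the baseline."""
    return 1 - adapted_tokens / baseline_tokens
```

Running `summarize_runs` once with the full tool list and once with the tiered list, on the same prompts and model, gives the two numbers to compare.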

Extra Tips

Ensure the tiers field is optional to maintain backward compatibility with existing tool definitions.
Consider adding a default tier or fallback mechanism for unknown model sizes or missing tier definitions.
Continuously monitor and evaluate the performance of the tier selection logic to ensure it adapts effectively to different model sizes and use cases.
