ollama: Bug? Logprobs are not temperature-scaled



What is the issue?

When logprobs are requested from the Ollama API, the returned log-probabilities appear to be computed from the raw (unscaled) logits, before temperature scaling is applied. As a result, the reported logprobs do not reflect the distribution that tokens were actually sampled from, and they do not change with the temperature setting.

I confirmed this by running the same prompt several times at temperature=0 and at several nonzero temperatures: the logprob values in top_logprobs are identical across runs regardless of temperature.


Root cause (from source)

In sample/samplers.go, the sample() function applies temperature scaling and softmax inside the sampling path, but logprobs appear to be computed from the raw logits before this pipeline runs:

func (s *Sampler) sample(tokens []token) (token, error) {
    if s.temperature == 0 {
        return greedy(tokens), nil
    }

    tokens = topK(tokens, s.topK)
    temperature(tokens, s.temperature)  // scaling happens here
    softmax(tokens)                     // probs computed here
    tokens = topP(tokens, s.topP)
    tokens = minP(tokens, s.minP)
    // ...weighted sampling
}

And in sample/transforms.go, temperature() divides logits in-place:

func temperature(ts []token, temp float32) {
    temp = max(temp, 1e-7)
    for i := range ts {
        ts[i].value = ts[i].value / temp
    }
}

The logprobs returned to the API therefore appear to be log(softmax(raw_logits)), i.e. the unscaled model output, rather than log(softmax(logits / temperature)).


Expected behaviour

The logprobs returned should reflect the distribution actually used during sampling, i.e. they should be temperature-scaled. This is consistent with how OpenAI returns logprobs.

For use cases like sampling analysis, confidence estimation, or replicating the sampling distribution, unscaled logprobs are misleading.


Possible fix

The fix would be to compute logprobs after temperature scaling, so the returned values match the actual sampling distribution. Something like:

// after temperature(tokens, s.temperature) and softmax(tokens)
// convert probabilities back to log space for reporting
for i := range tokens {
    tokens[i].logprob = math.Log(float64(tokens[i].value))
}

For users relying on logprobs to understand the sampling distribution (e.g. for analysis of decoding behaviour), temperature-scaled logprobs would be more correct. That said, returning the raw model distribution may be intentional.

If the current behaviour is unintentional, I'm happy to submit a PR with the fix. Can someone confirm which interpretation is intended before I proceed?

Relevant log output

OS: Windows

GPU: Nvidia

CPU: Intel

Ollama version: 0.17.1
