ollama - 💡(How to fix) The new version encounters an error when running gpt-oss [1 comment, 2 participants]

ollama/ollama#14898 · Fetched 2026-04-08 00:48:06
Timeline (top): closed ×1, commented ×1, labeled ×1


What is the issue?

After upgrading Ollama to the latest version, my existing gpt-oss:120b model stopped working and failed with the following error:

    Error: 500 Internal Server Error: model failed to load, this may be due to resource limitations or an internal error, check ollama server logs for details

I then tried gpt-oss:20b, which also failed to run. Both models worked properly for me on version 0.13.0, so the failure appeared after this significant upgrade. I am attaching the relevant error logs.

My GPUs are 8 × NVIDIA H20. NVIDIA-SMI 575.51.03, Driver Version: 575.51.03, CUDA Version: 12.9.

ai_ollama_2026-03-17_14_38_33.log
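Since the error message points to the server logs, a small helper can pull out the lines most likely to explain a failed load. This is only a sketch: the function name `scan_ollama_log` is hypothetical, and the search patterns are generic guesses (out-of-memory messages, CUDA errors, error-level entries), not strings confirmed from the log attached to this issue.

```shell
#!/usr/bin/env bash
# Print, with line numbers, log lines that commonly accompany a failed model load.
# The patterns below are assumptions, not strings taken from this issue's log.
scan_ollama_log() {
  local log_file="$1"
  grep -n -i -E 'out of memory|cuda error|failed to load|level=ERROR' "$log_file"
}

# Usage (on systemd installs the server log lives in the journal):
#   journalctl -u ollama --no-pager > ollama.log
#   scan_ollama_log ollama.log
# or, for the file attached to the issue:
#   scan_ollama_log ai_ollama_2026-03-17_14_38_33.log
```

If nothing matches, reading the log around the time of the failed request is still necessary; the helper only narrows the search.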

Relevant log output

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.18.1

extent analysis

Fix Plan

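The attached log is not reproduced on this page, so the root cause is unconfirmed. The plan below assumes the load failure is related to GPU memory allocation, which the error message names as one possible cause.

  • Step 1: Update Ollama Configuration Ollama's documented configuration mechanism is environment variables on the server process, not a YAML file with a gpu memory_limit key. On a systemd-based Linux install, run systemctl edit ollama.service and add settings that reduce memory pressure. The values here are illustrative, not tuned for this machine:
    [Service]
    Environment="OLLAMA_CONTEXT_LENGTH=8192"
    Environment="OLLAMA_NUM_PARALLEL=1"
  • Step 2: Set Environment Variables If you start the server by hand with ollama serve, export the variables in that shell first. Note that CUDA_VISIBLE_DEVICES=0 limits Ollama to a single GPU; to expose all eight H20 cards, list every device. NVIDIA_DRIVER_CAPABILITIES only matters when Ollama runs inside a container:
    export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
    export NVIDIA_DRIVER_CAPABILITIES=compute,utility
  • Step 3: Restart Ollama and Run the Model Restart the server so the new settings take effect, then load the model. The ollama CLI does not take --config or --model flags; the model name is passed to ollama run:
    sudo systemctl restart ollama
    ollama run gpt-oss:120b
  • Step 4: Monitor GPU Memory Usage Watch nvidia-smi while the model loads to confirm that enough memory is free on the visible GPUs.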
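The monitoring in Step 4 can be made a little more readable with a small wrapper around nvidia-smi's CSV query output. This is a minimal sketch: `summarize_gpu_memory` is a hypothetical helper name, and it assumes rows of the form "index, used MiB, total MiB" as produced by the query shown in the usage comment.

```shell
#!/usr/bin/env bash
# Summarize free memory per GPU from nvidia-smi CSV rows
# read on stdin in the form "index, used MiB, total MiB".
summarize_gpu_memory() {
  awk -F', *' '{ printf "GPU %s: %d MiB free of %d MiB\n", $1, $3 - $2, $3 }'
}

# Usage on a machine with the NVIDIA driver installed:
#   nvidia-smi --query-gpu=index,memory.used,memory.total \
#              --format=csv,noheader,nounits | summarize_gpu_memory
```

Running it once before loading the model and once during the load shows whether memory is actually being claimed on the GPUs that Ollama can see.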

Verification

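Verify the fix by checking the Ollama server logs for a successful model load (journalctl -u ollama on systemd installs), confirming with ollama ps that the model is loaded, and watching GPU memory usage in nvidia-smi while a prompt runs.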

Extra Tips

  • Ensure that the NVIDIA driver version is compatible with the CUDA version.
  • Consider upgrading the GPU drivers to the latest version for improved performance and stability.
  • Refer to the ollama documentation for more information on configuring GPU memory allocation and environment variables.
