ollama - 💡(How to fix) Fix Not compatible with Glaude code Cli when using local model

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Code Example

There's an issue with the selected model (gemma4). It may not exist or you may not have access to it. Run /model to pick a different model.

---

`
OLLAMA_CONTEXT_LENGTH=32768 OLLAMA_FLASH_ATTENTION="1" OLLAMA_KV_CACHE_TYPE="q8_0" /opt/homebrew/opt/ollama/bin/ollama serve 
time=2026-05-10T21:00:48.128-07:00 level=INFO source=routes.go:1752 msg="server config" env="map[HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:32768 OLLAMA_DEBUG:INFO OLLAMA_DEBUG_LOG_REQUESTS:false OLLAMA_EDITOR: OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/Users/Roland/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NO_CLOUD:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false http_proxy: https_proxy: no_proxy:]"
time=2026-05-10T21:00:48.128-07:00 level=INFO source=routes.go:1754 msg="Ollama cloud disabled: false"
time=2026-05-10T21:00:48.129-07:00 level=INFO source=images.go:517 msg="total blobs: 4"
time=2026-05-10T21:00:48.129-07:00 level=INFO source=images.go:524 msg="total unused blobs removed: 0"
time=2026-05-10T21:00:48.129-07:00 level=INFO source=routes.go:1810 msg="Listening on 127.0.0.1:11434 (version 0.21.0)"
time=2026-05-10T21:00:48.129-07:00 level=INFO source=runner.go:67 msg="discovering available GPUs..."
time=2026-05-10T21:00:48.131-07:00 level=INFO source=server.go:444 msg="starting runner" cmd="/opt/homebrew/Cellar/ollama/0.21.0_1/libexec/ollama runner --ollama-engine --port 64320"
time=2026-05-10T21:00:48.199-07:00 level=INFO source=types.go:42 msg="inference compute" id=0 filter_id=0 library=Metal compute=0.0 name=Metal description="Apple M4 Pro" libdirs="" driver=0.0 pci_id="" type=discrete total="17.8 GiB" available="17.8 GiB"
time=2026-05-10T21:00:48.199-07:00 level=INFO source=routes.go:1860 msg="vram-based default context" total_vram="17.8 GiB" default_num_ctx=4096

[GIN] 2026/05/10 - 21:01:00 | 200 |      32.292µs |       127.0.0.1 | HEAD     "/"
[GIN] 2026/05/10 - 21:01:31 | 404 |     870.791µs |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:01:31 | 404 |         345µs |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:01:31 | 404 |    5.920958ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:01:31 | 404 |    4.227208ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:01:56 | 404 |    1.328417ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:01:56 | 404 |    8.260375ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:01:56 | 404 |     361.833µs |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:01:56 | 404 |    5.545291ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:03:53 | 404 |    1.258459ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:03:53 | 404 |    5.141625ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:03:53 | 404 |     396.167µs |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:03:53 | 404 |    4.383209ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:13:25 | 200 |      16.625µs |       127.0.0.1 | HEAD     "/"
[GIN] 2026/05/10 - 21:13:29 | 404 |      1.5325ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:13:29 | 404 |     356.875µs |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:13:29 | 404 |    6.074334ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:13:29 | 404 |    5.227291ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
`
RAW_BUFFERClick to expand / collapse

What is the issue?

When use Ollama to serve Gemma 4 model, and use Claude Code as cli on macOS (26.3), Claude Code report

There's an issue with the selected model (gemma4). It may not exist or you may not have access to it. Run /model to pick a different model.

for every prompts.

The same Ollama + Gemma4 setup can work with opencode quite smoothly. It seems Ollama may have issue with Claude's anthropic API.

Claude Code CLI version: v2.1.123

Relevant log output

`
OLLAMA_CONTEXT_LENGTH=32768 OLLAMA_FLASH_ATTENTION="1" OLLAMA_KV_CACHE_TYPE="q8_0" /opt/homebrew/opt/ollama/bin/ollama serve 
time=2026-05-10T21:00:48.128-07:00 level=INFO source=routes.go:1752 msg="server config" env="map[HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:32768 OLLAMA_DEBUG:INFO OLLAMA_DEBUG_LOG_REQUESTS:false OLLAMA_EDITOR: OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/Users/Roland/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NO_CLOUD:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false http_proxy: https_proxy: no_proxy:]"
time=2026-05-10T21:00:48.128-07:00 level=INFO source=routes.go:1754 msg="Ollama cloud disabled: false"
time=2026-05-10T21:00:48.129-07:00 level=INFO source=images.go:517 msg="total blobs: 4"
time=2026-05-10T21:00:48.129-07:00 level=INFO source=images.go:524 msg="total unused blobs removed: 0"
time=2026-05-10T21:00:48.129-07:00 level=INFO source=routes.go:1810 msg="Listening on 127.0.0.1:11434 (version 0.21.0)"
time=2026-05-10T21:00:48.129-07:00 level=INFO source=runner.go:67 msg="discovering available GPUs..."
time=2026-05-10T21:00:48.131-07:00 level=INFO source=server.go:444 msg="starting runner" cmd="/opt/homebrew/Cellar/ollama/0.21.0_1/libexec/ollama runner --ollama-engine --port 64320"
time=2026-05-10T21:00:48.199-07:00 level=INFO source=types.go:42 msg="inference compute" id=0 filter_id=0 library=Metal compute=0.0 name=Metal description="Apple M4 Pro" libdirs="" driver=0.0 pci_id="" type=discrete total="17.8 GiB" available="17.8 GiB"
time=2026-05-10T21:00:48.199-07:00 level=INFO source=routes.go:1860 msg="vram-based default context" total_vram="17.8 GiB" default_num_ctx=4096

[GIN] 2026/05/10 - 21:01:00 | 200 |      32.292µs |       127.0.0.1 | HEAD     "/"
[GIN] 2026/05/10 - 21:01:31 | 404 |     870.791µs |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:01:31 | 404 |         345µs |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:01:31 | 404 |    5.920958ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:01:31 | 404 |    4.227208ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:01:56 | 404 |    1.328417ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:01:56 | 404 |    8.260375ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:01:56 | 404 |     361.833µs |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:01:56 | 404 |    5.545291ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:03:53 | 404 |    1.258459ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:03:53 | 404 |    5.141625ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:03:53 | 404 |     396.167µs |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:03:53 | 404 |    4.383209ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:13:25 | 200 |      16.625µs |       127.0.0.1 | HEAD     "/"
[GIN] 2026/05/10 - 21:13:29 | 404 |      1.5325ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:13:29 | 404 |     356.875µs |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:13:29 | 404 |    6.074334ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
[GIN] 2026/05/10 - 21:13:29 | 404 |    5.227291ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
`

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.21.0

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING