ollama: Runner IPC uses plain HTTP on localhost with zero auth; any local process can hijack inference

CWE-862: Runner IPC on localhost with No Authentication — Local Process Can Hijack Inference

Severity: MEDIUM (CVSS 6.1)

Location

llm/server.go: initModel(), Completion(), and StartRunner()

Description

The Ollama server communicates with the llama.cpp runner subprocess over HTTP on 127.0.0.1 with a random TCP port:

// StartRunner picks a random port:
port = rand.Intn(65535-49152) + 49152

// Then communicates without any authentication:
r, _ := http.NewRequestWithContext(ctx, "POST", 
    fmt.Sprintf("http://127.0.0.1:%d/load", s.port), ...)

The runner exposes /load, /health, /completion, /tokenize, /detokenize, and /embedding endpoints on localhost with zero authentication. Any process on the same machine can:

  1. Scan the ephemeral port range (49152-65535) to find the runner
  2. Inject prompts via /completion and read model outputs
  3. Steal tokenized context via /tokenize
  4. Disrupt service via /load with malicious payloads

Impact

  • The runner process holds the loaded model weights, so a local attacker can run their own inference against the model
  • If the model has been fine-tuned on sensitive data, outputs could leak that data
  • Tokenized context from other users' sessions could be extracted

Remediation

  1. Use Unix domain sockets instead of TCP for runner IPC (prevents port scanning)
  2. Generate a random shared secret, pass it to the runner process via an environment variable, and require it in an Authorization header on every request
  3. Or use os.Pipe() / stdin/stdout for communication instead of HTTP
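
// Generate a random token for runner auth.
// rand here must be crypto/rand, not the math/rand used for port selection.
token := make([]byte, 32)
if _, err := rand.Read(token); err != nil {
    return err // assumes the enclosing function returns an error
}
cmd.Env = append(cmd.Env, "OLLAMA_RUNNER_TOKEN="+hex.EncodeToString(token))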
