openclaw - 💡(How to fix) Fix [Bug] Ollama cold-start timeout silently exfiltrates data via fallback chain [1 comments, 2 participants]

openclaw · 2026-03-23 12:05:05


openclaw/openclaw#52818•Fetched 2026-04-08 01:18:52


Author

ai-nurmamat

Participants

ai-nurmamat

Hollychou924


When using local Ollama models, the first request to a model that is not yet loaded in memory triggers a cold start that takes 13–60+ seconds depending on model size. OpenClaw's default LLM request timeout fires before Ollama can respond, so the request fails with a timeout (HTTP 408) and falls through to the next model in the fallback chain, typically a cloud provider. User data intended for local processing is silently routed to a cloud provider without any user notification or consent. This is a privacy-critical issue.



Bug: Ollama cold-start causes silent data exfiltration via timeout-triggered fallback

Issue type

Performance + Privacy bug

Summary

Root Cause Analysis

Where the problem lives:

The LLM request timeout is currently global (not per-provider) and is enforced in src/model/client.ts or an equivalent request-dispatch layer. When ollama-remote is the first candidate in the fallback chain:

Request flow:
  user message → openclaw 
    → LLM request to ollama-remote (qwen3.5:122b)
    → Ollama starts cold-loading model (0 bytes in memory)
    → openclaw waits... timeout fires at T seconds (likely 30s default)
    → HTTP 408 / timeout error
    → fallback chain activated: openai/gpt-4.1-mini
    → user data sent to OpenAI servers  ← SILENT EXFILTRATION

The timeout is enforced before Ollama even finishes loading the model. No distinction is made between:

- A genuinely slow/hung model (should fail fast)
- A model that just needs more time to load from disk (should wait, or pre-warm)

Code location (estimated):

- src/infra/model-loader.ts: likely where loadModel() is called; no pre-warm check
- src/model/providers/ollama.ts: the Ollama provider adapter; likely missing a requestTimeout override
- src/model/client.ts: the fallback chain dispatcher; likely applies the global timeout equally to all providers
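
The effect of a single global timeout on the request flow above can be modeled in a few lines. This is a minimal sketch, not OpenClaw's dispatcher: `Candidate`, `simulateDispatch`, and the 2 s cloud latency are hypothetical, while the 46 s cold start and the (likely) 30 s default come from the report.

```typescript
// Hypothetical model of a fallback chain; names are illustrative only.
interface Candidate {
  id: string;
  responseTimeMs: number; // time until a response, including any cold-start load
}

interface DispatchResult {
  servedBy: string | null; // candidate that answered, or null if all timed out
  timedOut: string[];      // candidates abandoned because of the timeout
}

// Walk the chain in order. A candidate that needs longer than the global
// timeout is abandoned and the next candidate is tried.
function simulateDispatch(chain: Candidate[], globalTimeoutMs: number): DispatchResult {
  const timedOut: string[] = [];
  for (const candidate of chain) {
    if (candidate.responseTimeMs <= globalTimeoutMs) {
      return { servedBy: candidate.id, timedOut };
    }
    timedOut.push(candidate.id);
  }
  return { servedBy: null, timedOut };
}

const chain: Candidate[] = [
  { id: "ollama-remote/qwen3.5:122b", responseTimeMs: 46000 }, // cold start from the report
  { id: "openai/gpt-4.1-mini", responseTimeMs: 2000 },         // assumed figure
];

// With a 30 s timeout the local model is abandoned mid-load.
const coldStart = simulateDispatch(chain, 30000);
// With a 120 s timeout the local model gets to answer.
const patient = simulateDispatch(chain, 120000);
console.log(coldStart.servedBy, patient.servedBy);
```

The model makes the failure mode visible: nothing in the loop distinguishes "still loading" from "hung", so the only lever is the timeout value itself.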

Environment

- Ollama version: 0.5+ (all versions)
- Model sizes tested: 27B (~13s cold-start), 122B (~46s cold-start)
- OpenClaw: any version with Ollama provider support

Steps to Reproduce

1. Configure the Ollama provider with a large model: ollama-remote + qwen3.5:122b
2. Ensure the model is NOT pre-loaded: ollama ps returns an empty list
3. Send any message through OpenClaw targeting ollama-remote
4. Observe: the timeout fires within ~30s, OpenClaw falls back to the cloud model, and data is sent externally

Expected vs Actual

| | Expected | Actual |
|---|---|---|
| Cold-start duration | Model loads (13-60s), then inference | Timeout at ~30s before model loads |
| Fallback trigger | Auth failure or explicit model error | Silent timeout-based fallback |
| Data routing | Stays on local provider | Silently routes to cloud |
| User notification | Warning + consent | None |
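
The "Expected" column can be read as a small policy: a fallback that would move data from a local provider to a cloud provider should not happen without the user's consent. The sketch below is one way to express that, under stated assumptions: `ProviderInfo`, `FailureKind`, `decideFallback`, and the consent flag are hypothetical names and do not describe existing OpenClaw code.

```typescript
// Hypothetical types; none of these exist in OpenClaw as far as the report shows.
type FailureKind = "timeout" | "auth" | "model-error";

interface ProviderInfo {
  id: string;
  local: boolean; // true when data stays on user-controlled hardware
}

type FallbackDecision =
  | { action: "fallback"; to: string }
  | { action: "ask-user"; to: string; reason: string }
  | { action: "fail"; reason: string };

// Decide what to do after `failed` could not serve the request.
function decideFallback(
  failed: ProviderInfo,
  next: ProviderInfo | undefined,
  failure: FailureKind,
  userConsentedToCloud: boolean,
): FallbackDecision {
  if (!next) {
    return { action: "fail", reason: `${failed.id} failed (${failure}); no fallback left` };
  }
  // Crossing the local -> cloud boundary requires explicit consent.
  if (failed.local && !next.local && !userConsentedToCloud) {
    return {
      action: "ask-user",
      to: next.id,
      reason: `${failed.id} failed (${failure}); falling back would send data to ${next.id}`,
    };
  }
  return { action: "fallback", to: next.id };
}

const localProvider: ProviderInfo = { id: "ollama-remote", local: true };
const cloudProvider: ProviderInfo = { id: "openai", local: false };
console.log(decideFallback(localProvider, cloudProvider, "timeout", false).action);
```

Keeping the decision in a pure function makes the privacy rule easy to unit-test separately from the network code.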

Proposed Fix (Two-part)

Part 1 — Per-provider requestTimeout override (minimal)

Add requestTimeoutMs to provider config, applied only to that provider's HTTP calls:

{
  "models": {
    "providers": {
      "ollama-remote": {
        "baseUrl": "http://192.168.178.122:11434",
        "api": "ollama",
        "requestTimeoutMs": 120000,
        "models": [{ "id": "qwen3.5:122b" }]
      }
    }
  }
}
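
A minimal sketch of how the dispatcher might resolve this setting, assuming the config shape shown above. The function name `resolveRequestTimeoutMs` and the 30 s global default are assumptions for illustration; the report only estimates the default.

```typescript
// Shape of the config fragment above; requestTimeoutMs is the proposed new field.
interface ProviderConfig {
  baseUrl: string;
  api: string;
  requestTimeoutMs?: number;
  models: { id: string }[];
}

interface ModelsConfig {
  models: { providers: Record<string, ProviderConfig> };
}

// Return the provider's own timeout when one is set, else the global default.
function resolveRequestTimeoutMs(
  config: ModelsConfig,
  providerId: string,
  globalDefaultMs: number,
): number {
  const override = config.models.providers[providerId]?.requestTimeoutMs;
  return typeof override === "number" && override > 0 ? override : globalDefaultMs;
}

const exampleConfig: ModelsConfig = {
  models: {
    providers: {
      "ollama-remote": {
        baseUrl: "http://192.168.178.122:11434",
        api: "ollama",
        requestTimeoutMs: 120000,
        models: [{ id: "qwen3.5:122b" }],
      },
    },
  },
};

const ollamaTimeout = resolveRequestTimeoutMs(exampleConfig, "ollama-remote", 30000);
const unknownTimeout = resolveRequestTimeoutMs(exampleConfig, "openai", 30000);
console.log(ollamaTimeout, unknownTimeout);
```

The resolved value could then be passed to `AbortSignal.timeout()` and supplied as the `signal` option of that provider's `fetch` call, which leaves the timeout of every other provider unchanged.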

Part 2 — Ollama pre-warm endpoint (robust solution)

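Add a GET /api/ps check before the first request to detect whether the model is already loaded in memory, and trigger an async pre-warm if it is not. (GET /api/tags lists the models available on disk, not the ones currently loaded, so it cannot answer this question on its own.)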

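// In the Ollama provider, before dispatching the first request:
async preWarm(modelId: string): Promise<void> {
  // GET /api/ps lists the models currently loaded in memory.
  const resp = await fetch(`${this.baseUrl}/api/ps`);
  if (!resp.ok) return; // load state unknown; skip the pre-warm
  const data: { models?: { name: string }[] } = await resp.json();
  if (!data.models?.some((m) => m.name === modelId)) {
    // An empty prompt asks Ollama to load the model without generating text.
    // Trigger the load without blocking; the model warms in the background.
    fetch(`${this.baseUrl}/api/generate`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model: modelId, prompt: "", stream: false }),
    }).catch(() => {}); // fire-and-forget
  }
}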
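
The loaded-model check in the pre-warm step compares names exactly, which can miss a match when the configured id omits the tag. The sketch below isolates that comparison as a pure function; `isModelLoaded` and `normalizeModelName` are hypothetical helpers, and the normalization assumes Ollama's convention that a name without a tag refers to `:latest`.

```typescript
// Subset of the GET /api/ps response that the check needs.
interface PsResponse {
  models?: { name: string }[];
}

// Append ":latest" when the final path segment carries no tag.
// Looking only at the last segment avoids mistaking a registry port
// (e.g. "localhost:5000/model") for a tag.
function normalizeModelName(name: string): string {
  const lastSegment = name.split("/").pop() ?? name;
  return lastSegment.includes(":") ? name : `${name}:latest`;
}

// True when the model is reported as loaded in memory.
function isModelLoaded(ps: PsResponse, modelId: string): boolean {
  const wanted = normalizeModelName(modelId);
  return (ps.models ?? []).some((m) => normalizeModelName(m.name) === wanted);
}

console.log(isModelLoaded({ models: [] }, "qwen3.5:122b"));
console.log(isModelLoaded({ models: [{ name: "qwen3.5:122b" }] }, "qwen3.5:122b"));
```

Separating the comparison from the HTTP call also lets the check be tested against recorded `/api/ps` responses without a running Ollama server.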

Privacy Impact

HIGH — If a user configured local-only processing for data sovereignty reasons, a timeout-triggered fallback silently violates that intent. Combined with no user-visible fallback notification, this can expose sensitive data to cloud providers without the user's knowledge or consent.

Labels

bug, performance, privacy, provider: ollama

extent analysis

Fix Plan

To address the silent data exfiltration issue due to timeout-triggered fallback during Ollama's cold-start, we will implement a two-part solution:

1. Per-provider requestTimeout override:
   - Update the provider configuration to include a requestTimeoutMs parameter specific to each provider.
   - Apply this timeout only to the respective provider's HTTP calls.
2. Ollama pre-warm endpoint:
   - Introduce a pre-warm mechanism that checks if the model is loaded before dispatching the first request.
   - If the model is not loaded, trigger an asynchronous pre-warm process.

Code Changes

Part 1: Per-provider `requestTimeout` override

Update src/model/providers/ollama.ts to include the requestTimeoutMs configuration:

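interface OllamaProviderConfig {
  baseUrl: string;
  api: string;
  requestTimeoutMs: number; // Add this line
  models: { id: string }[];
}

class OllamaProvider {
  private config: OllamaProviderConfig;

  constructor(config: OllamaProviderConfig) {
    this.config = config;
  }

  async makeRequest(request: any): Promise<any> {
    const timeoutMs = this.config.requestTimeoutMs;
    // Apply the per-provider timeout. Standard fetch has no `timeout`
    // option, so the request is aborted through a signal instead.
    // The /api/generate path matches the pre-warm call; use /api/chat
    // if the adapter sends chat messages.
    const response = await fetch(`${this.config.baseUrl}/api/generate`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(request),
      signal: AbortSignal.timeout(timeoutMs),
    });
    return response.json();
  }
}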

Part 2: Ollama pre-warm endpoint

Implement the preWarm method in src/model/providers/ollama.ts:

class OllamaProvider {
  // ...

  async preWarm(modelId: string): Promise<void> {
    // GET /api/ps lists the models currently loaded in memory.
    const resp = await fetch(`${this.config.baseUrl}/api/ps`);
    if (!resp.ok) return; // load state unknown; skip the pre-warm
    const data: { models?: { name: string }[] } = await resp.json();
    if (!data.models?.some((m) => m.name === modelId)) {
      // Trigger the load without blocking; the model warms in the background.
      fetch(`${this.config.baseUrl}/api/generate`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ model: modelId, prompt: "", stream: false }),
      }).catch(() => {}); // fire-and-forget
    }
  }

  async makeRequest(request: any): Promise<any> {
    await this.preWarm(request.modelId); // Call preWarm before making the request
    // ...
  }
}

Verification

To verify the fix, follow these steps:

1. Configure the Ollama provider with a large model and an extended requestTimeoutMs.
2. Ensure the model is not pre-loaded.
3. Send a message through OpenClaw targeting the Ollama provider.
4. Observe that the model loads without triggering a timeout and that no data is sent to cloud providers.
