openclaw: feat: add timeoutMs to ModelProviderConfig for per-provider request timeout

GitHub stats
openclaw/openclaw#55490 · Fetched 2026-04-08 01:38:57
Comments: 0 · Participants: 1 · Timeline: 0 · Reactions: 0


Problem

ModelProviderConfig (in src/config/types.models.ts) has no timeoutMs field, making it impossible to configure a per-provider request timeout for local LLM providers (llamacpp, vllm, ollama, etc.).

This is a real pain point for operators running large local models (80–120 GB) on unified-memory hardware (DGX Spark / Strix Halo). After an idle period, the OS evicts model pages from cache, and the first request afterwards can spend 10–30 seconds on time to first token (TTFT) before --mlock kicks in or the pages are re-faulted. The default HTTP timeout at the network layer often fires before the cold reload finishes, killing the request before a single token is generated.

There is currently no way to set a longer timeout for local providers without patching OpenClaw itself.

Expected behavior

{
  models: {
    providers: {
      llamacpp: {
        baseUrl: "http://localhost:8080",
        apiKey: "local",
        api: "openai-completions",
        timeoutMs: 120000,  // 2 min for cold-start recovery
        models: [...]
      }
    }
  }
}

Current ModelProviderConfig type

export type ModelProviderConfig = {
  baseUrl: string;
  apiKey?: SecretInput;
  auth?: ModelProviderAuthMode;
  api?: ModelApi;
  injectNumCtxForOpenAICompat?: boolean;
  headers?: Record<string, SecretInput>;
  authHeader?: boolean;
  models: ModelDefinitionConfig[];
  // ← no timeoutMs
};

Proposed change

Add optional timeoutMs?: number to ModelProviderConfig and thread it through to the fetch/stream layer when making LLM requests against that provider.
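One way the threading could look is sketched below. It is an illustration under stated assumptions, not OpenClaw's actual request code: ProviderTimeoutConfig, resolveTimeoutMs, fetchWithProviderTimeout, and the 30-second fallback are all hypothetical names and values. The timer is cleared once fetch() resolves, which happens when response headers arrive, so it bounds the wait for the server to start answering (an approximation of TTFT) rather than the length of a streamed generation.

```typescript
// Hypothetical, trimmed-down shape; the real ModelProviderConfig has more fields.
type ProviderTimeoutConfig = {
  baseUrl: string;
  timeoutMs?: number; // the proposed field
};

// Assumed fallback for illustration only; not OpenClaw's actual default.
const ASSUMED_DEFAULT_TIMEOUT_MS = 30_000;

// Pick the provider's timeout when it is a usable number, else the fallback.
function resolveTimeoutMs(
  provider: ProviderTimeoutConfig,
  fallbackMs: number = ASSUMED_DEFAULT_TIMEOUT_MS,
): number {
  const value = provider.timeoutMs;
  return typeof value === "number" && Number.isFinite(value) && value > 0
    ? value
    : fallbackMs;
}

// Abort the request if no response headers arrive within the resolved timeout.
// Any signal passed in `init` is replaced by the timeout signal in this sketch.
async function fetchWithProviderTimeout(
  provider: ProviderTimeoutConfig,
  path: string,
  init: RequestInit = {},
): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), resolveTimeoutMs(provider));
  try {
    return await fetch(new URL(path, provider.baseUrl), {
      ...init,
      signal: controller.signal,
    });
  } finally {
    // Stop the timer once headers are in, so a long stream is not cut off.
    clearTimeout(timer);
  }
}
```

If the whole exchange should be bounded instead, including the streamed body, the timer would have to stay armed until the body has been fully read.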

Use case

  • Local models on unified-memory hardware (DGX Spark GB10, Strix Halo Ryzen AI Max+) where cold-start page-fault reloads cause slow TTFT after idle
  • Self-hosted LiteLLM / vLLM / sglang deployments with slow startup or high load
  • Any local provider where the operator knows their p99 TTFT and wants to set an appropriate timeout rather than relying on a global default
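For the last case, an operator with measured TTFT samples could derive a timeoutMs value along these lines. This is a minimal sketch: percentile and suggestTimeoutMs are hypothetical helpers, the nearest-rank method is one of several percentile definitions, and the 2x headroom factor is an arbitrary assumption rather than a recommendation from the issue.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("percentile requires at least one sample");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  // Clamp the 1-based rank into [1, n] before converting to a 0-based index.
  const index = Math.min(sorted.length, Math.max(1, rank)) - 1;
  return sorted[index];
}

// Suggest a timeout as p99 TTFT times a headroom factor (assumed 2x here).
function suggestTimeoutMs(ttftSamplesMs: number[], headroom: number = 2): number {
  return Math.ceil(percentile(ttftSamplesMs, 99) * headroom);
}
```

With a handful of invented samples where one cold start took 28 seconds, suggestTimeoutMs([800, 1200, 950, 28000, 1100]) yields 56000, since the slowest sample dominates the p99 in a small sample set.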

Workaround

Currently none at the config level. The only mitigations are:

  • --no-mmap --mlock on the llama-server side to eliminate cold starts
  • Setting timeoutSeconds on individual cron jobs (works for scheduled runs, not interactive sessions)

A per-provider timeoutMs config option would be a clean, targeted fix.

extent analysis

Fix Plan

To add a per-provider request timeout for local LLM providers, we need to modify the ModelProviderConfig type to include an optional timeoutMs field. Here are the steps:

  • Update the ModelProviderConfig type in src/config/types.models.ts to include the timeoutMs field:
export type ModelProviderConfig = {
  baseUrl: string;
  apiKey?: SecretInput;
  auth?: ModelProviderAuthMode;
  api?: ModelApi;
  injectNumCtxForOpenAICompat?: boolean;
  headers?: Record<string, SecretInput>;
  authHeader?: boolean;
  models: ModelDefinitionConfig[];
  timeoutMs?: number; // Add this line
};
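  • Thread the timeoutMs value through to the fetch/stream layer when making LLM requests against that provider. This likely means updating the request code to read timeoutMs from the provider config. Note that the standard fetch API has no timeout option; a timeout is applied through an AbortSignal.

Example:

// Standard fetch has no `timeout` option; pass an AbortSignal instead.
// AbortSignal.timeout() needs Node.js 17.3+ or a current browser.
const signal =
  providerConfig.timeoutMs !== undefined
    ? AbortSignal.timeout(providerConfig.timeoutMs)
    : undefined; // no per-provider value: keep the existing default behavior

fetch(providerConfig.baseUrl, { signal })
  .then((response) => {
    // Handle response...
  })
  .catch((error) => {
    // A timeout rejects with a DOMException named "TimeoutError".
    // Handle error...
  });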
  • Update any relevant documentation to reflect the new timeoutMs option.
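The plan above does not mention validation, but a new numeric option usually benefits from a check at config load time. The following is a sketch only: validateTimeoutMs is a hypothetical helper and does not reflect how OpenClaw validates its config. The upper bound reflects the fact that setTimeout in Node.js and browsers cannot represent delays above 2^31 - 1 milliseconds.

```typescript
// Largest delay setTimeout can represent (about 24.8 days).
const MAX_TIMER_DELAY_MS = 2 ** 31 - 1;

// Return a validated timeout, or undefined when the option is not set.
// `providerId` is only used to build a readable error message.
function validateTimeoutMs(value: unknown, providerId: string): number | undefined {
  if (value === undefined) {
    return undefined; // option omitted: caller keeps its default behavior
  }
  const ok =
    typeof value === "number" &&
    Number.isInteger(value) &&
    value > 0 &&
    value <= MAX_TIMER_DELAY_MS;
  if (!ok) {
    throw new Error(
      `models.providers.${providerId}.timeoutMs must be a positive integer ` +
        `number of milliseconds no larger than ${MAX_TIMER_DELAY_MS}`,
    );
  }
  return value as number;
}
```

Rejecting bad values early gives the operator a clear error at startup instead of a request that times out immediately or never.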

Verification

To verify that the fix worked, you can:

  • Update your ModelProviderConfig to include a timeoutMs value, for example:
{
  models: {
    providers: {
      llamacpp: {
        baseUrl: "http://localhost:8080",
        apiKey: "local",
        api: "openai-completions",
        timeoutMs:
