openclaw - 💡(How to fix) Fix long model timeouts [1 participant]

Error Message

17:11:34 warn agent/embedded Profile ollama:default timed out. Trying next account...
17:11:34 warn agent/embedded embedded run failover decision
17:11:34 error diagnostic lane task error: lane=main durationMs=122445 error="FailoverError: LLM request timed out."
17:11:34 error diagnostic lane task error: lane=session:agent:main:main durationMs=122447 error="FailoverError: LLM request timed out."
17:11:34 warn model-fallback/decision model fallback decision
17:12:19 warn model-fallback/decision model fallback decision

Root Cause

LLM-generated summary: When performing complex tasks, the system's model selection is unreliable (e.g., asking for a specific model fails) and performance degrades severely as context grows, even if context clean-up is performed. In my own words: requests time out, and there is no way that I or the LLM can change the timeouts; the setting is just removed on gateway restart.

Code Example

17:11:34 warn agent/embedded Profile ollama:default timed out. Trying next account...
17:11:34 warn agent/embedded embedded run failover decision
17:11:34 error diagnostic lane task error: lane=main durationMs=122445 error="FailoverError: LLM request timed out."
17:11:34 error diagnostic lane task error: lane=session:agent:main:main durationMs=122447 error="FailoverError: LLM request timed out."
17:11:34 warn model-fallback/decision model fallback decision
17:12:19 warn model-fallback/decision model fallback decision

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

Summary

Steps to reproduce

Using a large local model, qwen3.5:27b, on a Strix Halo machine at over 20% context (or at about 50% context on qwen3.5:9b), requests time out. The system then falls back to qwen3.5:27b, times out again, and runs out of models to use.

Expected behavior

LLM generated: The system should reliably allow the user to select any installed model (e.g., ollama/qwen3.5:27b) via a clear, working CLI command. Performance should remain stable across the full range of models, regardless of context size, assuming the context size is manageable (e.g., under 50k tokens).

Actual behavior

LLM generated: Model switching commands fail or result in the system sticking to the default model despite explicit requests (e.g., asking for 27B results in a 9B response). Performance degrades noticeably when context accumulates, indicating that the model parameters are the primary limiting factor, not the context window ceiling. In my own words: is it possible to set a larger timeout?

OpenClaw version

4.23, the latest.

Operating system

Ubuntu Server 20.04 (not certain)

Install method

Via your install.

Model

Ollama (local), various qwen3.5 and gemma4 models

Provider / routing chain

openclaw / gateway on VM > ollama direct on host.

Additional provider/model setup details

NAME               ID              SIZE      MODIFIED
qwen3.5:9b         6488c96fa5fa    6.6 GB    4 days ago
qwen3.5:35b-a3b    3460ffeede54    23 GB     5 days ago
qwen3.5:27b        7653528ba5cb    17 GB     5 days ago
gemma4:26b         5571076f3d70    17 GB     5 days ago
gemma4:31b         6316f0629137    19 GB     10 days ago
gemma4:e4b         c6eb396dbd59    9.6 GB    10 days ago

Logs, screenshots, and evidence

17:11:34 warn agent/embedded Profile ollama:default timed out. Trying next account...
17:11:34 warn agent/embedded embedded run failover decision
17:11:34 error diagnostic lane task error: lane=main durationMs=122445 error="FailoverError: LLM request timed out."
17:11:34 error diagnostic lane task error: lane=session:agent:main:main durationMs=122447 error="FailoverError: LLM request timed out."
17:11:34 warn model-fallback/decision model fallback decision
17:12:19 warn model-fallback/decision model fallback decision

Impact and severity

I ask it to make a skill with qwen3.5:9b ("this is what I want you to do, make this skill with these rules"). It starts making the skill, but as the context window fills up the model gets slower and slower until it times out and swaps models. I am left with a dead system unless I run /new or /clear.

Additional information

Ollama overrides:

[Service]
Environment="HSA_OVERRIDE_GFX_VERSION=11.5.1"
Environment="HSA_ENABLE_SDMA=0"
Environment="OLLAMA_HOST=0.0.0.0"
Environment="OLLAMA_NUM_PARALLEL=4"
# API request timeout
Environment=OLLAMA_REQUEST_TIMEOUT=10m
# Model keep-alive duration
Environment=OLLAMA_KEEP_ALIVE=-1
# Maximum queued requests
Environment=OLLAMA_MAX_QUEUE=512

extent analysis

TL;DR

Increase the OLLAMA_REQUEST_TIMEOUT environment variable to a higher value to prevent timeouts when using large models with big context sizes.

Guidance

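The issue seems to be related to a model request timeout. The Ollama override sets OLLAMA_REQUEST_TIMEOUT=10m, yet the logged failures occur after about 122 seconds (durationMs=122445), well short of 10 minutes. This suggests the limit that actually fires may sit on the OpenClaw gateway side rather than in Ollama. Increasing the Ollama value may still help, but the gateway-side timeout is worth checking first.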
Verify that the model selection is working correctly by checking the logs for any error messages related to model fallback or timeouts.
OLLAMA_KEEP_ALIVE is already set to -1, which keeps a loaded model in memory indefinitely, so keep-alive expiry is unlikely to be the cause. Reloads can still happen when the fallback chain switches between models, which may add to the delay.
Check the system resources (e.g., CPU, memory, disk space) to ensure they are sufficient to handle the large models and context sizes.
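To make the comparison between the logged failure time and the configured limit concrete, here is a minimal Python sketch. It uses only the log lines and the OLLAMA_REQUEST_TIMEOUT value quoted in this report; the helper names `parse_duration` and `logged_durations` are hypothetical and are not part of OpenClaw or Ollama.

```python
import re

# Log lines copied from the report.
LOG_LINES = [
    'lane task error: lane=main durationMs=122445 '
    'error="FailoverError: LLM request timed out."',
    'lane task error: lane=session:agent:main:main durationMs=122447 '
    'error="FailoverError: LLM request timed out."',
]

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text: str) -> int:
    """Convert a simple duration such as '10m' or '90s' to seconds."""
    match = re.fullmatch(r"(\d+)([smh])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def logged_durations(lines):
    """Return every durationMs value found in the lines, in seconds."""
    return [
        int(m.group(1)) / 1000
        for line in lines
        if (m := re.search(r"durationMs=(\d+)", line))
    ]


# Value taken from the Ollama override in the report.
configured = parse_duration("10m")
for seconds in logged_durations(LOG_LINES):
    print(f"failed after {seconds:.1f}s, configured limit is {configured}s")
```

If the reported durations are consistently far below the configured limit, as they are here, raising that limit alone is unlikely to change the outcome.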

Example

No code snippet is provided as it is not necessary for this issue.

Notes

The optimal value for OLLAMA_REQUEST_TIMEOUT will depend on the specific use case and system configuration. It may be necessary to experiment with different values to find the best balance between preventing timeouts and avoiding excessive wait times.
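One way to pick a value by measurement rather than guesswork is to time a single request against Ollama directly, bypassing the gateway. The sketch below assumes Ollama's default address (`http://localhost:11434`) and its `/api/generate` endpoint with `stream` disabled; in the setup described here, with the gateway on a VM and Ollama on the host, the address would need adjusting. The function names and the 1800-second client timeout are illustrative choices, not settings taken from OpenClaw.

```python
import json
import time
import urllib.request

# Assumed default Ollama address; change it to match your host.
OLLAMA_URL = "http://localhost:11434/api/generate"


def summarize_timings(response: dict) -> dict:
    """Convert Ollama's nanosecond timing fields to seconds and tokens/s."""
    ns = 1e9
    eval_s = response.get("eval_duration", 0) / ns
    eval_count = response.get("eval_count", 0)
    return {
        "total_s": response.get("total_duration", 0) / ns,
        "prompt_eval_s": response.get("prompt_eval_duration", 0) / ns,
        "eval_s": eval_s,
        "tokens_per_s": eval_count / eval_s if eval_s else 0.0,
    }


def measure(model: str, prompt: str, timeout_s: float = 1800.0) -> dict:
    """Send one non-streaming request and report how long it took."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode()
    request = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    started = time.monotonic()
    # timeout_s is a socket-level timeout on the client side only.
    with urllib.request.urlopen(request, timeout=timeout_s) as reply:
        body = json.load(reply)
    summary = summarize_timings(body)
    summary["wall_s"] = time.monotonic() - started
    return summary
```

Calling `measure("qwen3.5:27b", long_prompt)` with a prompt of roughly the size that triggers the failure shows how long the model really needs. If `prompt_eval_s` alone exceeds about 120 seconds, the timeout seen in the log will keep firing no matter what Ollama is configured to allow.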

Recommendation

Apply a workaround by increasing the OLLAMA_REQUEST_TIMEOUT environment variable to a higher value, such as 30 minutes (OLLAMA_REQUEST_TIMEOUT=30m), to prevent timeouts when using large models with big context sizes.
