openclaw - [Bug]: voice-call embedded responder ignores agent tools.allow and times out with Ollama

Q: Expected behavior

When the routed voice agent has `tools.allow: []` and the prompt says no tool use, the voice-call embedded response run should be LLM-only or otherwise respect the agent's no-tool configuration. The request to Ollama should not include OpenClaw tool schemas, and the model should return a short spoken JSON response within the configured timeout.

openclaw, 2026-05-08 19:29:25

The voice-call plugin's streaming/conversation embedded response path ignores a routed voice agent's tools.allow: [] / no-tool configuration, sends tool schemas to an Ollama response model anyway, and times out before producing a spoken response.

Fix Action

Fixed by PR: fix(voice-call): pass agent tools.allow to embedded response runner (https://github.com/openclaw/openclaw/pull/79508)

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

Summary

Steps to reproduce

Run OpenClaw 2026.5.7 with @openclaw/voice-call 2026.5.7 configured for Twilio inbound calls, streaming transcription enabled, and conversation mode.

Configure a dedicated voice agent with a fast Ollama model and no tools, for example:

{
  "id": "voice",
  "model": { "primary": "ollama/qwen2.5:1.5b", "fallbacks": [] },
  "thinkingDefault": "off",
  "reasoningDefault": "off",
  "fastModeDefault": true,
  "params": { "temperature": 0.2, "maxTokens": 80, "cacheRetention": "none" },
  "tools": { "allow": [] },
  "systemPromptOverride": "You are a fast phone voice assistant. Reply only as valid JSON: {\"spoken\":\"...\"}. Keep spoken under 18 words. No tool use."
}

Route the voice-call plugin/number to that agent and model:

{
  "plugins": {
    "entries": {
      "voice-call": {
        "config": {
          "responseModel": "ollama/qwen2.5:1.5b",
          "responseTimeoutMs": 20000,
          "numbers": {
            "+REDACTED": {
              "agentId": "voice",
              "responseModel": "ollama/qwen2.5:1.5b",
              "responseTimeoutMs": 20000
            }
          }
        }
      }
    }
  }
}

Call the Twilio number from an allowed caller and speak a short utterance, e.g. "Hey, how you doing?"
Observe that STT succeeds and handleInboundResponse() starts, but no AI response is spoken; the embedded run times out and the call is ended after stream disconnect grace.
Compare with a direct Ollama /api/chat request using the same model and no tools; the same simple prompt returns in ~2 seconds.
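The direct control test in the last step can be sketched roughly as follows. This is a minimal TypeScript sketch, not the script used in the report: `buildChatRequest` and `timeChat` are hypothetical helper names, and the only thing taken from Ollama itself is the `/api/chat` endpoint with `model`, `messages`, `stream`, and an optional `tools` array. It assumes a runtime with a global `fetch` (for example Node 18 or later).

```typescript
// Hypothetical helpers for timing a direct Ollama /api/chat call.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  stream: false;
  tools?: unknown[];
}

// Build the request body; the `tools` key is omitted entirely when no tools are given.
function buildChatRequest(
  model: string,
  system: string,
  prompt: string,
  tools: unknown[] = [],
): ChatRequest {
  const body: ChatRequest = {
    model,
    messages: [
      { role: "system", content: system },
      { role: "user", content: prompt },
    ],
    stream: false,
  };
  if (tools.length > 0) body.tools = tools;
  return body;
}

// Send the request, abort after timeoutMs, and report elapsed time and payload size.
async function timeChat(
  baseUrl: string,
  body: ChatRequest,
  timeoutMs: number,
): Promise<{ elapsedMs: number; payloadChars: number; content: string }> {
  const payload = JSON.stringify(body);
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  const started = Date.now();
  try {
    const res = await fetch(`${baseUrl}/api/chat`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: payload,
      signal: controller.signal,
    });
    if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
    const data = (await res.json()) as { message?: { content?: string } };
    return {
      elapsedMs: Date.now() - started,
      payloadChars: payload.length,
      content: data.message?.content ?? "",
    };
  } finally {
    clearTimeout(timer);
  }
}
```

Calling `timeChat("http://127.0.0.1:11434", buildChatRequest("qwen2.5:1.5b", systemPrompt, "Hey, how you doing?"), 20000)` once without tools and once with the 13 tool schemas passed as the fourth argument would give the two timings to compare. Reusing the 20000 ms timeout keeps the comparison aligned with `responseTimeoutMs`.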

Expected behavior

When the routed voice agent has tools.allow: [] and the prompt says no tool use, the voice-call embedded response run should be LLM-only or otherwise respect the agent's no-tool configuration. The request to Ollama should not include OpenClaw tool schemas, and the model should return a short spoken JSON response within the configured timeout.
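The reply contract described here (valid JSON of the form {"spoken":"..."}, with the spoken text under 18 words) could be checked with something like the sketch below. `parseSpokenReply` is a hypothetical helper written for illustration; the report does not say how voice-call validates the model's reply.

```typescript
// Hypothetical validator for the {"spoken":"..."} contract in the system prompt.
// Returns the trimmed spoken text, or null if the reply breaks the contract.
function parseSpokenReply(raw: string, wordLimit = 18): string | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof parsed !== "object" || parsed === null) return null;

  const spoken = (parsed as { spoken?: unknown }).spoken;
  if (typeof spoken !== "string") return null;

  // "Under 18 words": the word count must be strictly below the limit.
  const words = spoken.trim().split(/\s+/).filter(Boolean);
  if (words.length === 0 || words.length >= wordLimit) return null;

  return spoken.trim();
}
```

A reply such as `{"spoken":"I'm great, thanks for asking!"}` passes, while plain text or an over-long answer is rejected instead of being handed to TTS.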

Actual behavior

The voice-call embedded run compiles 13 OpenClaw tools despite the routed voice agent having tools.allow: []. The compiled trajectory shows tools such as image, image_generate, memory_get, memory_search, session_status, sessions_history, sessions_list, sessions_send, sessions_spawn, sessions_yield, subagents, update_plan, and video_generate being included. The Ollama request times out at responseTimeoutMs=20000, producing no assistant text, and voice-call logs Response generation error: Response generation was aborted.

OpenClaw version

2026.5.7 (eeef486)

Operating system

Linux 6.8.0-111-generic (x64)

Install method

npm global / OpenClaw Gateway running as systemd user service

Model

ollama/qwen2.5:1.5b for the voice response model

Provider / routing chain

Twilio Programmable Voice -> Tailscale Funnel -> OpenClaw voice-call webhook -> OpenAI streaming transcription -> OpenClaw embedded voice agent response -> local Ollama qwen2.5:1.5b -> OpenAI TTS

Additional provider/model setup details

Voice-call plugin: @openclaw/voice-call 2026.5.7
Voice-call mode: Twilio inbound, streaming transcription enabled, realtime disabled, conversation mode
STT provider: OpenAI gpt-4o-transcribe
TTS provider: OpenAI gpt-4o-mini-tts
Dedicated routed voice agent: agentId: voice
Voice agent model: ollama/qwen2.5:1.5b
Voice agent config includes tools.allow: []
Voice-call number route also sets agentId: voice and responseModel: ollama/qwen2.5:1.5b
Direct Ollama /api/chat to qwen2.5:1.5b with the same short prompt works in ~2.2s when no tools are supplied.
Direct Ollama /api/chat with the same 13 OpenClaw tool schemas takes materially longer (~7.8s in a warmed direct test) and the observed OpenClaw embedded run timed out at 20s during the real voice path.

Logs, screenshots, and evidence

# Gateway log excerpt, redacted
[voice-call] Transcript for <twilio-call-sid>: Hey, how you doing? (chars=19)
[voice-call] Auto-responding to inbound call <call-id>: "Hey, how you doing?"
[voice-call] Media stream disconnected: <twilio-call-sid> (<stream-sid>)
[voice-call] Auto-ending call <call-id> after stream disconnect grace
2026-05-07T23:24:58.096-07:00 [agent/embedded] embedded run timeout: runId=voice:<call-id>:1778221476764 sessionId=4f36d284-b20b-4f27-a7eb-b4209c0282a1 timeoutMs=20000
2026-05-07T23:24:58.115-07:00 [agent/embedded] embedded run failover decision: runId=voice:<call-id>:1778221476764 stage=assistant decision=surface_error reason=timeout from=ollama/qwen2.5:1.5b profile=-
[voice-call] Response generation error: Response generation was aborted

# Voice trajectory evidence from ~/.openclaw/agents/voice/sessions/<session>.trajectory.jsonl
session.started:
  agentId: voice
  messageProvider: voice
  provider: ollama
  modelId: qwen2.5:1.5b
  toolCount: 13

context.compiled:
  systemPrompt: You are a fast phone voice assistant. Reply only as valid JSON: {"spoken":"..."}. Keep spoken under 18 words. No markdown. No tool use. Be direct, warm, and conversational.
  prompt: Hey, how you doing?
  messages: []
  tools included despite tools.allow=[]:
    image, image_generate, memory_get, memory_search, session_status,
    sessions_history, sessions_list, sessions_send, sessions_spawn,
    sessions_yield, subagents, update_plan, video_generate

model.completed:
  aborted: true
  timedOut: true
  promptError: request timed out | request timed out
  assistantTexts: []
  finalPromptText: Hey, how you doing?

# Direct Ollama control test, same model, no tools
POST http://127.0.0.1:11434/api/chat model=qwen2.5:1.5b
elapsed: ~2.2s
response content: "I'm great, thanks for asking! How can I help today?"

# Direct Ollama comparison with the same 13 tool schemas included
payload chars: ~16082, tools: 13
elapsed: ~7.8s
response returned successfully in direct test, but the real embedded voice run timed out at 20s.

Impact and severity

Affected: Users running voice-call streaming/conversation mode with a routed low-latency local/Ollama voice agent configured with no tools.

Severity: High for that setup. STT succeeds, but the voice assistant produces no spoken response after the caller speaks.

Frequency: Observed on the tested inbound call path with the routed voice agent and ollama/qwen2.5:1.5b response model.

Consequence: Calls appear connected and transcription works, but the assistant goes silent after user speech; calls can then be ended by disconnect handling, making the voice responder unusable for this configuration.

Additional information

The local source appears to show why this happens. In the installed @openclaw/voice-call 2026.5.7 response-generator bundle, generateVoiceResponse() calls agentRuntime.runEmbeddedPiAgent({ ... }) with agentId, provider, model, timeoutMs, etc., but does not appear to pass an explicit toolsAllow from the routed agent config or disableTools: true for no-tool voice runs.

This looks related to the existing voice-call embedded-run plumbing issues, but is not the same symptom:

#60118 — voice-call responseModel config ignored / model override plumbing
#56367 — voice-call ignores agent sandbox defaults
#71262 — realtime voice tool-bridging gap
#17613 — embedded local LLM timeout while direct local request works

Possible fix direction: when voiceConfig.agentId routes to an agent with tools.allow, carry that allowlist into runEmbeddedPiAgent() as toolsAllow. If the effective allowlist is an empty array, pass disableTools: true or otherwise ensure no tool schemas are compiled into the embedded voice response run.
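A minimal sketch of that direction follows. It is not the code from the linked PR: `AgentConfig`, `EmbeddedToolOptions`, and `resolveEmbeddedToolOptions` are hypothetical names, and it assumes `runEmbeddedPiAgent()` accepts `toolsAllow` and `disableTools` options, as the report suggests.

```typescript
// Hypothetical shapes; only tools.allow, toolsAllow and disableTools are named in the report.
interface AgentConfig {
  id: string;
  tools?: { allow?: string[] };
}

interface EmbeddedToolOptions {
  toolsAllow?: string[];
  disableTools?: boolean;
}

// Map the routed agent's allowlist onto options for the embedded run.
function resolveEmbeddedToolOptions(agent: AgentConfig | undefined): EmbeddedToolOptions {
  const allow = agent?.tools?.allow;

  // No allowlist configured: leave the runner's default tool behaviour untouched.
  if (allow === undefined) return {};

  // Explicit empty allowlist: the agent is LLM-only, so no tool schemas should be compiled.
  if (allow.length === 0) return { toolsAllow: [], disableTools: true };

  // Non-empty allowlist: pass a copy through unchanged.
  return { toolsAllow: [...allow] };
}
```

The result would then be spread into the options object that generateVoiceResponse() passes to runEmbeddedPiAgent(), next to agentId, provider, model, and timeoutMs. Treating a missing allowlist differently from an empty one keeps agents without any tools configuration on their current behaviour.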
