openclaw - Fix Azure OpenAI Responses stalls before first event when memory tools are exposed

Summary

Azure OpenAI Responses can return HTTP 200 headers for an OpenClaw embedded run and then never deliver a first parsed Responses event when memory tools are exposed. The same Azure deployment streams normally for direct Responses API calls, and the same OpenClaw run succeeds when the memory tools are excluded from tools.allow.

This is related to the broader Responses stream-stall reports in #76305 and Azure Responses adapter problems in #79570, but the trigger isolated here is the OpenClaw tool surface, specifically memory_get / memory tools.

Environment

  • OpenClaw checkout: 7116cc8a2c plus local debug/fix branch
  • Runtime: Node 24.15.0
  • Provider API: azure-openai-responses
  • Azure API version used by the process: 2025-04-01-preview
  • Azure route shape verified: /openai/responses?api-version=2025-04-01-preview
  • Model/deployment: gpt-5.4-pro

Before proof: memory tool stalls before first event

With memory_get enabled, OpenClaw reaches Azure and receives streaming response headers, but no [responses] first_event log arrives before timeout.

Command shape:

OPENCLAW_DEBUG_MODEL_TRANSPORT=1 \
OPENCLAW_DEBUG_SSE=events \
OPENCLAW_DEBUG_MODEL_PAYLOAD=summary \
AZURE_OPENAI_API_VERSION=2025-04-01-preview \
pnpm openclaw agent --local \
  --session-id devdiv-memoryget-062004 \
  --model azure-openai-responses-devdiv/gpt-5.4-pro \
  --thinking off \
  --timeout 35 \
  --message 'Reply with only: ok' \
  --json

Observed transport lines:

[responses] start provider=azure-openai-responses-devdiv api=azure-openai-responses model=gpt-5.4-pro ... tools=count=16 sample=edit,exec,memory_get,process,read,... reasoningEffort=undefined ...
[model-fetch] start ... method=POST url=https://devdiv-test-playground.openai.azure.com/openai/responses timeoutMs=180000 proxy=none policy=custom
[model-fetch] response ... status=200 elapsedMs=1154 contentType=text/event-stream; charset=utf-8
[responses] headers ... elapsedMs=1175

Neither [responses] first_event nor [responses] stream_done was logged before the run was killed or hit its timeout.

A prior full/default tool-surface run showed the same pattern and eventually timed out:

[responses] start ... tools=count=21 sample=cron,edit,exec,image,image_generate,memory_get,memory_search,...
[model-fetch] response ... status=200 ... contentType=text/event-stream; charset=utf-8
[responses] headers ...
[agent/embedded] embedded run timeout ... timeoutMs=90000
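
The stall signature in these logs (headers logged, no first_event) can be checked mechanically. The following is a minimal TypeScript sketch, assuming each transport line begins with the `[responses]` prefix shown above. `classifyResponsesRun` and its outcome labels are hypothetical names for illustration, not OpenClaw APIs.

```typescript
// Hypothetical helper (not part of OpenClaw): classifies one run's debug log
// by which [responses] markers were reached.
type StreamOutcome =
  | "no_headers"
  | "stalled_before_first_event"
  | "streaming_incomplete"
  | "completed";

function classifyResponsesRun(logLines: string[]): StreamOutcome {
  // True if any line starts with "[responses] <marker>".
  const has = (marker: string): boolean =>
    logLines.some((line) => line.trim().startsWith(`[responses] ${marker}`));

  if (!has("headers")) return "no_headers";
  // Headers arrived but no event was parsed: the pattern reported here.
  if (!has("first_event")) return "stalled_before_first_event";
  return has("stream_done") ? "completed" : "streaming_incomplete";
}
```

Applied to the two failing captures above, this would report `stalled_before_first_event`, since both stop at the `[responses] headers` line.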

Control proof: endpoint and streaming are valid

A direct non-OpenClaw curl call to the same Azure Responses endpoint with stream=true returned SSE immediately, including response.created, response.output_text.delta, and response.completed.

A direct curl call that added cache fields, a reasoning include, roughly 30k input characters, and 21 simple function tools also streamed promptly. That points away from the endpoint, auth, or API version and toward the OpenClaw tool payload or tool-conditioned prompt surface.
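
To compare a captured curl body against what OpenClaw parses, the event sequence can be pulled out of the raw SSE text. This is a minimal sketch, assuming each SSE frame carries a JSON `data:` payload with a string `type` field (as in the `response.created` and `response.completed` events named above). `sseEventTypes` is a hypothetical helper, not an OpenClaw or Azure API.

```typescript
// Hypothetical helper: returns the `type` of each SSE data frame, in order.
// Frames are separated by a blank line; only `data:` lines are inspected.
function sseEventTypes(body: string): string[] {
  const types: string[] = [];
  for (const frame of body.split(/\r?\n\r?\n/)) {
    const data = frame
      .split(/\r?\n/)
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice("data:".length).replace(/^ /, ""))
      .join("\n");
    if (data === "" || data === "[DONE]") continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(data);
    } catch (err) {
      continue; // Not JSON: skip rather than guess at the payload.
    }
    if (typeof parsed === "object" && parsed !== null) {
      const type = (parsed as { type?: unknown }).type;
      if (typeof type === "string") types.push(type);
    }
  }
  return types;
}
```

An empty result for a body that arrived with `status=200` and `contentType=text/event-stream` would match the stall described in this report.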

After proof / workaround: exclude memory tools

With memory tools excluded and core/session/web tools allowed, the same OpenClaw model call succeeds:

"tools.allow": [
  "read", "edit", "write", "exec", "process", "update_plan",
  "sessions_list", "sessions_history", "sessions_send", "sessions_spawn",
  "sessions_yield", "subagents", "session_status", "web_search", "web_fetch"
]

Command shape:

OPENCLAW_DEBUG_MODEL_TRANSPORT=1 \
OPENCLAW_DEBUG_SSE=events \
OPENCLAW_DEBUG_MODEL_PAYLOAD=summary \
AZURE_OPENAI_API_VERSION=2025-04-01-preview \
pnpm openclaw agent --local \
  --session-id devdiv-web-060949 \
  --model azure-openai-responses-devdiv/gpt-5.4-pro \
  --thinking off \
  --timeout 45 \
  --message 'Reply with only: ok' \
  --json

Observed success:

[responses] start ... tools=count=15 sample=edit,exec,process,read,session_status,sessions_history,sessions_list,sessions_send,sessions_spawn,sessions_yield,subagents,update_plan ...
[model-fetch] response ... status=200 elapsedMs=2955 contentType=text/event-stream; charset=utf-8
[responses] headers ... elapsedMs=2975
[responses] first_event ... elapsedMs=5 type=response.created
[responses] stream_done ... events=11 ... stopReason=stop contentBlocks=2

The final assistant payload was ok.
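
The workaround amounts to dropping memory tools from the allowlist. A minimal sketch, assuming memory tools can be recognized by a `memory_` name prefix (true of `memory_get` and `memory_search` in the logs above, but an assumption for any other tool). `withoutMemoryTools` is a hypothetical helper, not an OpenClaw API.

```typescript
// Hypothetical helper: removes tools whose names start with "memory_".
// The prefix rule is an assumption based on memory_get / memory_search.
function withoutMemoryTools(toolNames: readonly string[]): string[] {
  return toolNames.filter((name) => !name.startsWith("memory_"));
}

// Example: a mixed tool surface reduced to a memory-free allowlist.
const reducedAllowlist = withoutMemoryTools([
  "read", "edit", "exec", "memory_get", "memory_search", "process",
]);
```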

Local fix branch evidence

I found one concrete provider-compat problem while isolating this: memory tool corpus schemas used Type.Union([Type.Literal(...)]), which compiles to anyOf. The repo already has a flat stringEnum helper for this class of provider compatibility issue. I patched memory-core to use flat string enums and added a regression test that asserts no anyOf is emitted for those corpus fields.

That cleanup is useful but does not fully resolve this Azure stall: after the enum patch, enabling memory_get still reproduced the pre-first-event stall. The remaining bug is likely either another part of the memory tool schema/description or Azure's handling of this specific memory tool surface.
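
The enum cleanup and its regression check can be illustrated on plain JSON Schema objects. This is a sketch under stated assumptions, not the actual memory-core patch or the repo's stringEnum helper: it assumes a union of string literals is emitted as `anyOf` members that each carry a string `const`. `flattenStringLiteralUnions` and `containsAnyOf` are hypothetical names.

```typescript
type JsonSchema = { [key: string]: unknown };

function isPlainObject(value: unknown): value is JsonSchema {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Rewrites { anyOf: [{ const: "a" }, { const: "b" }] } into
// { type: "string", enum: ["a", "b"] }, recursing through the schema.
function flattenStringLiteralUnions(schema: unknown): unknown {
  if (Array.isArray(schema)) return schema.map(flattenStringLiteralUnions);
  if (!isPlainObject(schema)) return schema;

  const anyOf = schema["anyOf"];
  const isLiteralUnion =
    Array.isArray(anyOf) &&
    anyOf.length > 0 &&
    anyOf.every((m) => isPlainObject(m) && typeof m["const"] === "string");

  const out: JsonSchema = {};
  for (const [key, value] of Object.entries(schema)) {
    if (isLiteralUnion && key === "anyOf") continue; // replaced by enum below
    out[key] = flattenStringLiteralUnions(value);
  }
  if (isLiteralUnion && Array.isArray(anyOf)) {
    out["type"] = "string";
    out["enum"] = anyOf.map((m) => (m as JsonSchema)["const"]);
  }
  return out;
}

// Regression-style check: true if any nested schema node still uses anyOf.
function containsAnyOf(schema: unknown): boolean {
  if (Array.isArray(schema)) return schema.some(containsAnyOf);
  if (!isPlainObject(schema)) return false;
  return "anyOf" in schema || Object.values(schema).some(containsAnyOf);
}
```

A test in the spirit of the one described above would assert that `containsAnyOf` is false for every memory tool's parameter schema after flattening.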

Verification run locally

pnpm test extensions/memory-core/src/tools.test.ts -- --reporter=verbose
# 1 file passed, 7 tests passed

pnpm build
# passed

pnpm openclaw config validate
# Config valid: ~/.openclaw/openclaw.json

Expected behavior

The Azure OpenAI Responses path should either stream a first event promptly or fail with a bounded, actionable error when the provider cannot handle a tool payload. It should not leave the embedded run in model_call:started until timeout after headers have already arrived.
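
One way to get the bounded failure described here is a first-event watchdog around the parsed event stream. The following is a minimal sketch, not OpenClaw's transport code: it assumes the stream is exposed as an `AsyncIterable`, and `withFirstEventTimeout`, the `FirstEventTimeoutError` name, and the `onTimeout` hook (for example, to abort the underlying fetch) are all hypothetical.

```typescript
// Hypothetical wrapper: fails fast if no event arrives within timeoutMs
// of starting to read, then passes later events through unchanged.
async function* withFirstEventTimeout<T>(
  source: AsyncIterable<T>,
  timeoutMs: number,
  onTimeout?: () => void,
): AsyncGenerator<T> {
  const iterator = source[Symbol.asyncIterator]();
  let timer: ReturnType<typeof setTimeout> | undefined;

  const deadline = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => {
      if (onTimeout) onTimeout(); // e.g. abort the HTTP request
      const err = new Error(
        `no stream event within ${timeoutMs}ms after response headers`,
      );
      err.name = "FirstEventTimeoutError";
      reject(err);
    }, timeoutMs);
  });

  let first: IteratorResult<T>;
  try {
    // Whichever settles first wins: the first event or the deadline.
    first = await Promise.race([iterator.next(), deadline]);
  } finally {
    clearTimeout(timer);
  }
  if (first.done) return;
  yield first.value;

  // After the first event, no extra deadline is applied in this sketch.
  while (true) {
    const next = await iterator.next();
    if (next.done) return;
    yield next.value;
  }
}
```

With a wrapper like this, a run that receives headers but no events would surface a named error after `timeoutMs` instead of waiting for the embedded run timeout.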
