openclaw - [Bug]: QMD query mode unusable on macOS Apple Silicon - Metal GPU cleanup crash discards valid search results



Summary

QMD's query mode (hybrid BM25 + vector + reranking) is completely unusable on macOS Apple Silicon. The node-llama-cpp Metal GPU backend crashes during process exit cleanup, and OpenClaw's QMD manager discards valid search results because of the non-zero exit code.

This forces users to fall back to search (BM25-only) or vsearch (vector-only), both significantly worse than the full query pipeline with reranking.

Environment

  • OpenClaw: 2026.5.18
  • QMD: 2.5.1 (@tobilu/qmd)
  • node-llama-cpp: 3.18.1
  • macOS: Darwin 25.4.0 (arm64, Apple Silicon M5 Pro)
  • Node: v22.22.2
  • Memory config: memory.backend: "qmd", memory.qmd.searchMode: "query"

What happens

  1. OpenClaw spawns qmd query "search term" --json -c collection1 -c collection2
  2. QMD successfully computes results (BM25 + vector + reranking via Qwen3-Reranker)
  3. Valid JSON is output to stdout (e.g., results scoring 0.93)
  4. During process exit, the Metal GPU cleanup crashes:
    ggml-metal-device.m:612: GGML_ASSERT([rsets->data count] == 0) failed
  5. The process is terminated by SIGABRT, which the parent sees as an abnormal (non-zero) exit
  6. OpenClaw's QMD manager sees the non-zero exit code and discards the valid JSON results
  7. OpenClaw falls back to the built-in SQLite engine, which has nothing indexed when the QMD backend is in use
  8. The user gets zero search results

Proof that results are valid before crash

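Piping through cat captures stdout before the crash and returns clean results. Note that the exit code 0 reported here is the pipeline's status, which by default is that of the last command (cat), so it does not by itself show whether qmd exited cleanly: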

qmd query "test term" --json 2>/dev/null | cat
# Returns valid JSON with results scoring 0.93, exit code 0

The same command without the pipe crashes with SIGABRT after outputting the same valid JSON.
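
The behaviour described above can be illustrated without QMD or Metal. The sketch below is only an illustration under stated assumptions: it uses a stand-in child process (a short inline Node script, not QMD) that writes JSON to stdout and then calls process.abort(), and it shows that a parent using spawnSync still receives the complete output alongside the abnormal termination. The name `runStandIn` is hypothetical.

```typescript
import { spawnSync } from "node:child_process";

// Stand-in for the crashing qmd process (NOT QMD itself): write JSON to
// stdout synchronously, then abort, which raises SIGABRT on POSIX systems.
const standInScript = `
  require("node:fs").writeSync(1, JSON.stringify([{ score: 0.93 }]));
  process.abort();
`;

function runStandIn() {
  const child = spawnSync(process.execPath, ["-e", standInScript], {
    encoding: "utf8",
    stdio: ["ignore", "pipe", "ignore"], // discard the crash dump on stderr
  });
  return { status: child.status, signal: child.signal, stdout: child.stdout };
}

const run = runStandIn();
// On POSIX the child is killed by a signal, so `status` is null and `signal`
// names the signal; stdout still holds the JSON written before the abort.
console.log(run.status, run.signal, run.stdout);
```

A parent that only checks `status === 0` would reject this run even though `stdout` is complete, which matches steps 5 and 6 above.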

Impact

  • query mode with reranking scores 0.93 on test queries vs 0.81 (BM25) and 0.00 (vsearch for niche terms)
  • Active Memory plugin depends on QMD search quality
  • All macOS Apple Silicon users are affected (Metal is the only GPU framework on Apple Silicon)
  • No workaround exists for the in-process QMD path. We tried QMD_FORCE_CPU=1, a wrapper script replacing the qmd binary, and a memory.qmd.command override; none of them fixes the in-process path.

Proposed fix

When the QMD subprocess returns a non-zero exit code, check if stdout contains valid JSON search results before discarding them. If valid results were captured, use them. The crash is a known upstream issue in llama.cpp's Metal cleanup code that occurs after all computation is complete.

This is safe because:

  • The crash only happens during static-destructor cleanup at process exit
  • The JSON output is complete and well-formed before the crash
  • stderr contains the crash dump, stdout contains the results
  • This is documented upstream with no fix timeline
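
A minimal sketch of the proposed check, assuming the parent has already collected the exit status, signal and stdout of the subprocess. OpenClaw's actual QMD manager code is not shown in this issue, so `QmdExit` and `salvageResults` are hypothetical names, and the assumption that results arrive as a top-level JSON array is not confirmed by the source.

```typescript
// Outcome of a finished qmd subprocess as a parent process would observe it.
// Hypothetical shape; not OpenClaw's actual types.
interface QmdExit {
  status: number | null; // exit code, or null when killed by a signal
  signal: string | null; // e.g. "SIGABRT"
  stdout: string;
}

// Returns the parsed results when stdout holds complete JSON, even if the
// process exited abnormally; returns null when nothing usable was captured.
function salvageResults(exit: QmdExit): unknown[] | null {
  const text = exit.stdout.trim();
  if (text === "") return null;
  let parsed: unknown;
  try {
    parsed = JSON.parse(text); // throws on truncated or malformed output
  } catch {
    return null;
  }
  // Assumption: results are emitted as a top-level JSON array.
  return Array.isArray(parsed) ? parsed : null;
}

// Crash after complete output: the results are kept.
const crashed = salvageResults({ status: null, signal: "SIGABRT", stdout: '[{"score":0.93}]' });
// Crash in the middle of the output: parsing fails, so nothing is salvaged.
const truncated = salvageResults({ status: null, signal: "SIGABRT", stdout: '[{"score":0.9' });
console.log(crashed, truncated);
```

Because JSON.parse rejects truncated input, a crash that interrupts the output still ends up on the existing fallback path. A stricter variant could salvage only when `signal` is "SIGABRT", so that unrelated failures are never masked.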

Alternatively, consider an option that runs the QMD subprocess with QMD_FORCE_CPU=1 in its environment, which should prevent Metal from loading at all.
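
One way such an option could be wired up is sketched below, assuming qmd is launched with Node's child_process.spawn and honours QMD_FORCE_CPU when it is set in the child's environment. The `forceCpu` flag and the helper names are hypothetical; the command-line arguments mirror the invocation shown under "What happens". As noted under Impact, setting this variable did not help for the in-process path, so this would apply only where qmd really runs as a separate process.

```typescript
import { spawn } from "node:child_process";

// Builds the environment for the qmd subprocess. `forceCpu` is a hypothetical
// option; QMD_FORCE_CPU is the variable named in this issue.
function buildQmdEnv(
  base: Record<string, string | undefined>,
  forceCpu: boolean,
): Record<string, string | undefined> {
  return forceCpu ? { ...base, QMD_FORCE_CPU: "1" } : { ...base };
}

// Mirrors: qmd query "search term" --json -c collection1 -c collection2
function buildQmdArgs(term: string, collections: string[]): string[] {
  const args = ["query", term, "--json"];
  for (const name of collections) args.push("-c", name);
  return args;
}

// Spawns qmd with the adjusted environment (requires qmd on PATH).
function spawnQmdQuery(term: string, collections: string[], forceCpu: boolean) {
  return spawn("qmd", buildQmdArgs(term, collections), {
    env: buildQmdEnv(process.env, forceCpu),
  });
}
```

Keeping the environment and argument construction in small pure functions makes the option easy to unit-test without a qmd binary being installed.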
