openclaw - ✅(Solved) Fix memory_search vector query uses non-indexed full scan instead of sqlite-vec KNN — ~190× speedup available [2 pull requests, 2 comments, 2 participants]

openclaw/openclaw#69666 · Fetched 2026-04-22 07:49:38

memory_search (tool call path, in-gateway) is ~200× slower than it should be because searchVector uses a non-indexed SQL pattern against sqlite-vec. The query calls vec_distance_cosine(...) in SELECT with ORDER BY dist LIMIT 10, which causes sqlite-vec to do a full table scan over every vector in chunks_vec. Switching to sqlite-vec's native KNN operator (WHERE embedding MATCH ? AND k = 10) on the same DB returns identical results ~190× faster.

On a workspace with 10,883 chunks (Qwen3-Embedding-8B, 4096 dims) this is the difference between ~48 ms and ~8,490 ms per memory search.

Root Cause

searchVector calls vec_distance_cosine(...) in the SELECT list with ORDER BY dist LIMIT 10 and no MATCH constraint, so sqlite-vec performs a full table scan over every vector in chunks_vec instead of using its native KNN operator (WHERE embedding MATCH ? AND k = 10).

Fix Action

Fixed

PR fix notes

PR #24: fix: use KNN index for memory_search vector queries (#69666)

Description (problem / solution / changelog)

Summary

Fixes #69666 — memory_search uses non-indexed full table scan, ~190× slower than KNN.

Root Cause

The query used vec_distance_cosine(...) in the column list with ORDER BY dist ASC LIMIT ?, which forces sqlite-vec to perform a full table scan on every query (no index usage).

Fix

Changed the query to use sqlite-vec's native KNN syntax:

-- Before (full table scan):
SELECT ... vec_distance_cosine(v.embedding, ?) AS dist
  FROM chunks_vec v
  JOIN chunks c ON c.id = v.id
 WHERE c.model = ?...
 ORDER BY dist ASC
 LIMIT ?

-- After (index-accelerated KNN):
SELECT ... vec_distance_cosine(v.embedding, ?) AS dist
  FROM chunks_vec v
  JOIN chunks c ON c.id = v.id
 WHERE v.embedding MATCH ? AND k = ?
   AND c.model = ?...
 ORDER BY dist ASC

The MATCH ? AND k = ? clause switches sqlite-vec to its native KNN query path instead of a row-by-row linear scan. The k = ? parameter replaces the LIMIT ? clause in the parameter order.

Impact

  • 10,883 chunks: linear scan ~8,490 ms → KNN ~45 ms (~190× improvement)
  • No change to result ordering or score computation
  • Backward compatible: same result shape and semantics

Testing

  • Added 3 regression tests covering KNN ordering, source filtering, and limit behavior
  • All existing tests continue to pass
  • KNN tests gracefully skip when the sqlite-vec extension is unavailable (consistent with the FTS trigram test pattern)

Closes #69666

Changed files

  • .agents/maintainers.md (removed, +0/-1)
  • .agents/skills/openclaw-parallels-smoke/SKILL.md (modified, +15/-0)
  • .agents/skills/openclaw-qa-testing/SKILL.md (modified, +10/-10)
  • .agents/skills/openclaw-release-maintainer/SKILL.md (modified, +12/-0)
  • .agents/skills/openclaw-secret-scanning-maintainer/SKILL.md (modified, +31/-12)
  • .agents/skills/openclaw-secret-scanning-maintainer/scripts/secret-scanning.mjs (modified, +274/-15)
  • .agents/skills/openclaw-test-performance/SKILL.md (added, +134/-0)
  • .agents/skills/openclaw-test-performance/agents/openai.yaml (added, +6/-0)
  • .github/actionlint.yaml (modified, +2/-0)
  • .github/actions/setup-node-env/action.yml (modified, +6/-6)
  • .github/actions/setup-pnpm-store-cache/action.yml (modified, +3/-16)
  • .github/instructions/copilot.instructions.md (modified, +3/-3)
  • .github/workflows/ci.yml (modified, +1247/-353)
  • .github/workflows/codeql.yml (modified, +8/-7)
  • .github/workflows/control-ui-locale-refresh.yml (modified, +2/-2)
  • .github/workflows/docker-release.yml (modified, +23/-15)
  • .github/workflows/docs-sync-publish.yml (modified, +2/-2)
  • .github/workflows/install-smoke.yml (modified, +14/-11)
  • .github/workflows/macos-release.yml (modified, +0/-1)
  • .github/workflows/openclaw-cross-os-release-checks-reusable.yml (added, +472/-0)
  • .github/workflows/openclaw-live-and-e2e-checks-reusable.yml (added, +658/-0)
  • .github/workflows/openclaw-npm-release.yml (modified, +12/-3)
  • .github/workflows/openclaw-release-checks.yml (modified, +113/-37)
  • .github/workflows/openclaw-scheduled-live-checks.yml (added, +74/-0)
  • .github/workflows/parity-gate.yml (modified, +11/-2)
  • .github/workflows/plugin-clawhub-release.yml (modified, +0/-3)
  • .github/workflows/plugin-npm-release.yml (modified, +0/-3)
  • .github/workflows/sandbox-common-smoke.yml (modified, +4/-1)
  • .github/workflows/workflow-sanity.yml (modified, +3/-1)
  • .oxlintrc.json (modified, +29/-2)
  • .pre-commit-config.yaml (modified, +2/-2)
  • .vscode/settings.json (modified, +2/-1)
  • AGENTS.md (modified, +199/-318)
  • CHANGELOG.md (modified, +283/-1)
  • CONTRIBUTING.md (modified, +5/-2)
  • Dockerfile (modified, +7/-1)
  • Dockerfile.sandbox (modified, +1/-1)
  • Dockerfile.sandbox-browser (modified, +1/-1)
  • README.md (modified, +252/-384)
  • SECURITY.md (modified, +5/-0)
  • appcast.xml (modified, +116/-0)
  • apps/android/app/build.gradle.kts (modified, +2/-2)
  • apps/android/app/src/main/java/ai/openclaw/app/gateway/GatewayDiscovery.kt (modified, +36/-35)
  • apps/android/app/src/main/java/ai/openclaw/app/ui/CanvasScreen.kt (modified, +2/-8)
  • apps/ios/CHANGELOG.md (modified, +16/-0)
  • apps/ios/Config/Version.xcconfig (modified, +2/-2)
  • apps/ios/Sources/Gateway/GatewayConnectionController.swift (modified, +2/-1)
  • apps/ios/Sources/Gateway/GatewaySettingsStore.swift (modified, +3/-1)
  • apps/ios/Sources/HomeToolbar.swift (modified, +4/-1)
  • apps/ios/Sources/Model/NodeAppModel.swift (modified, +79/-48)
  • apps/ios/Sources/Onboarding/OnboardingWizardView.swift (modified, +3/-1)
  • apps/ios/Sources/Services/WatchConnectivityTransport.swift (modified, +7/-4)
  • apps/ios/Sources/Services/WatchMessagingService.swift (modified, +11/-3)
  • apps/ios/Sources/Voice/TalkModeManager.swift (modified, +6/-2)
  • apps/ios/fastlane/metadata/en-US/release_notes.txt (modified, +1/-1)
  • apps/ios/version.json (modified, +1/-1)
  • apps/macos/Sources/OpenClaw/AppState.swift (modified, +52/-66)
  • apps/macos/Sources/OpenClaw/ChannelsStore+Lifecycle.swift (modified, +2/-1)
  • apps/macos/Sources/OpenClaw/CommandResolver.swift (modified, +7/-3)
  • apps/macos/Sources/OpenClaw/ExecApprovalCommandDisplaySanitizer.swift (modified, +15/-1)
  • apps/macos/Sources/OpenClaw/ExecApprovalsSocket.swift (modified, +2/-4)
  • apps/macos/Sources/OpenClaw/GeneralSettings.swift (modified, +3/-1)
  • apps/macos/Sources/OpenClaw/NodeMode/MacNodeModeCoordinator.swift (modified, +1/-0)
  • apps/macos/Sources/OpenClaw/NodeMode/MacNodeRuntime.swift (modified, +32/-4)
  • apps/macos/Sources/OpenClaw/NodeMode/MacNodeRuntimeMainActorServices.swift (modified, +22/-0)
  • apps/macos/Sources/OpenClaw/NodeMode/MacNodeScreenCommands.swift (modified, +9/-0)
  • apps/macos/Sources/OpenClaw/NodePairingApprovalPrompter.swift (modified, +1/-2)
  • apps/macos/Sources/OpenClaw/OnboardingView+Pages.swift (modified, +3/-1)
  • apps/macos/Sources/OpenClaw/RemoteGatewayProbe.swift (modified, +21/-12)
  • apps/macos/Sources/OpenClaw/RemotePortTunnel.swift (modified, +1/-3)
  • apps/macos/Sources/OpenClaw/Resources/Info.plist (modified, +2/-2)
  • apps/macos/Sources/OpenClaw/ScreenSnapshotService.swift (added, +109/-0)
  • apps/macos/Sources/OpenClawProtocol/GatewayModels.swift (modified, +18/-0)
  • apps/macos/Tests/OpenClawIPCTests/AppStateRemoteConfigTests.swift (modified, +56/-49)
  • apps/macos/Tests/OpenClawIPCTests/CommandResolverTests.swift (modified, +3/-0)
  • apps/macos/Tests/OpenClawIPCTests/ExecApprovalCommandDisplaySanitizerTests.swift (modified, +33/-0)
  • apps/macos/Tests/OpenClawIPCTests/MacNodeRuntimeTests.swift (modified, +101/-0)
  • apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatComposer.swift (modified, +37/-25)
  • apps/shared/OpenClawKit/Sources/OpenClawKit/ScreenCommands.swift (modified, +25/-0)
  • apps/shared/OpenClawKit/Sources/OpenClawProtocol/GatewayModels.swift (modified, +18/-0)
  • apps/shared/OpenClawKit/Tests/OpenClawKitTests/ChatComposerTextViewTests.swift (added, +15/-0)
  • docs/.generated/config-baseline.sha256 (modified, +4/-4)
  • docs/.generated/plugin-sdk-api-baseline.sha256 (modified, +2/-2)
  • docs/.i18n/glossary.zh-CN.json (modified, +24/-0)
  • docs/automation/cron-jobs.md (modified, +7/-1)
  • docs/automation/hooks.md (modified, +3/-1)
  • docs/automation/tasks.md (modified, +1/-1)
  • docs/channels/bluebubbles.md (modified, +49/-0)
  • docs/channels/groups.md (modified, +2/-2)
  • docs/channels/index.md (modified, +1/-1)
  • docs/channels/matrix.md (modified, +5/-3)
  • docs/channels/pairing.md (modified, +6/-0)
  • docs/channels/telegram.md (modified, +6/-2)
  • docs/channels/troubleshooting.md (modified, +10/-8)
  • docs/channels/wechat.md (added, +168/-0)
  • docs/ci.md (modified, +42/-28)
  • docs/cli/config.md (modified, +28/-0)
  • docs/cli/devices.md (modified, +9/-2)
  • docs/cli/gateway.md (modified, +21/-7)
  • docs/cli/hooks.md (modified, +1/-0)

PR #69680: fix(memory): use sqlite-vec KNN for searchVector (~190× speedup)

Description (problem / solution / changelog)

Summary

Replace searchVector's full-table-scan SQL with sqlite-vec's native KNN operator. Keeps vec_distance_cosine() in the SELECT so the returned score stays in the expected cosine [0, 1] range.

Fixes #69666.

Benchmark

Measured on a real 10,827-chunk workspace (4096-dim Qwen3-Embedding-8B):

Pattern | Time per query
Before (vec_distance_cosine(...) AS dist + ORDER BY dist LIMIT) | ~8,490 ms
Naive KNN (v.distance AS dist + MATCH ? AND k) | ~48 ms (but returns 0 results; see below)
After (this PR: vec_distance_cosine + MATCH ? AND k) | ~50 ms

~190× speedup, same result set.

Why the naive fix doesn't work

sqlite-vec creates chunks_vec with L2 distance by default, not cosine:

CREATE VIRTUAL TABLE chunks_vec USING vec0(id TEXT PRIMARY KEY, embedding FLOAT[4096])

So v.distance is the squared L2 distance, which can exceed 1. score = 1 - dist then goes negative for any non-trivial query, and the downstream minScore filter drops every result.

The correct fix uses MATCH ? AND k = ? only for candidate selection (this is where the speedup lives — sqlite-vec's vec0 index walks the shards), and keeps vec_distance_cosine() in the SELECT for the score, matching the existing semantics.
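
A small numeric illustration of the scoring problem, taking the PR's description of v.distance as squared L2 at face value. For unit-length vectors, squared L2 distance equals 2 × (1 − cosine similarity), so it is twice the cosine distance. The vectors and the minScore value below are made up for illustration; the real threshold is not stated in the PR.

```javascript
// Illustration only: compare cosine distance with squared L2 distance
// for two unit-length vectors, and show the effect on score = 1 - dist.
function dot(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function cosineDistance(a, b) {
  return 1 - dot(a, b) / Math.sqrt(dot(a, a) * dot(b, b));
}

function squaredL2Distance(a, b) {
  return a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0);
}

// Two unit vectors with cosine similarity 0.4 (made-up example data).
const query = [1, 0];
const chunk = [0.4, Math.sqrt(1 - 0.4 * 0.4)];

const cosDist = cosineDistance(query, chunk);   // about 0.6
const l2Dist = squaredL2Distance(query, chunk); // about 1.2, twice the cosine distance

const scoreFromCosine = 1 - cosDist; // about 0.4
const scoreFromL2 = 1 - l2Dist;      // about -0.2

// Hypothetical threshold, chosen only to show the filtering effect.
const minScore = 0.3;
const keptWithCosine = scoreFromCosine >= minScore; // true
const keptWithL2 = scoreFromL2 >= minScore;         // false
```

The same chunk passes the filter when scored by cosine distance and is dropped when scored by squared L2, which is consistent with the "0 results" row in the benchmark.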

Implementation notes

  • The query vector is bound twice now: once for vec_distance_cosine(v.embedding, ?) and once for MATCH ?.
  • LIMIT ? is removed; AND k = ? caps the KNN candidate pool to the same count.
  • ORDER BY dist ASC still sorts by cosine distance — sqlite-vec's KNN ordering (L2) is only used for candidate pruning; final ordering is unchanged.
  • No change to the fallback path (listChunks(...).map(cosineSimilarity)) when sqlite-vec isn't available.
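
The bind-parameter order implied by these notes can be sketched as a pure function. This is an assumption-laden sketch, not the PR's code: buildKnnSearch, the shape of sourceFilter, and the example values are hypothetical, and the function only assembles SQL text and parameters without touching a database.

```javascript
// Hypothetical helper: assembles the SQL text and bind-parameter order
// described in the implementation notes. It does not run the query.
function buildKnnSearch({ vectorTable, queryBlob, limit, providerModel, sourceFilter }) {
  const sql = `
    SELECT c.id, c.path, c.start_line, c.end_line, c.text, c.source,
           vec_distance_cosine(v.embedding, ?) AS dist
      FROM ${vectorTable} v
      JOIN chunks c ON c.id = v.id
     WHERE v.embedding MATCH ? AND k = ?
       AND c.model = ?${sourceFilter.sql}
     ORDER BY dist ASC`;
  // The query vector is bound twice: once for the score, once for MATCH.
  // k takes the position that LIMIT ? used to occupy at the end.
  const params = [queryBlob, queryBlob, limit, providerModel, ...sourceFilter.params];
  return { sql, params };
}

// Example call with made-up values.
const example = buildKnnSearch({
  vectorTable: "chunks_vec",
  queryBlob: new Uint8Array(new Float32Array([0.1, 0.2, 0.3]).buffer),
  limit: 10,
  providerModel: "example-model",
  sourceFilter: { sql: " AND c.source IN (?)", params: ["memory"] },
});
```

Keeping the SQL and its parameter list in one place makes it harder for the two vector bindings and the k value to drift out of order.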

Testing

  • Local gateway running against a 10,827-chunk store returns identical top-K ids to the previous implementation for all test queries (spot-checked across semantic, keyword-heavy, and low-overlap queries).
  • Search latency dropped from 8-30s (observed with multiple concurrent tool calls queuing) to ~2s end-to-end; the remaining ~2s is merge/MMR/decay, not the vector SQL (separate optimization opportunity, out of scope for this PR).

Related

  • Filed #69667 (configurable contextSize for local embedding provider) in the same debug session. Independent change; will send a follow-up PR.

Alternative considered

Creating chunks_vec with distance_metric=cosine at schema time would let us use v.distance directly. That's a cleaner long-term shape but requires a migration for existing installs, so I opted for the source-compatible SELECT-side cosine which needs zero schema change and no reindex.
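
For reference, the schema-time alternative might look like the DDL below. This is a sketch under assumptions: cosineVecTableDdl is a hypothetical helper, the column layout is copied from the CREATE VIRTUAL TABLE statement quoted earlier, and the distance_metric=cosine column option should be checked against the bundled sqlite-vec version before relying on it.

```javascript
// Hypothetical helper: builds the CREATE VIRTUAL TABLE statement for a
// vec0 table whose distance column would report cosine distance.
function cosineVecTableDdl(tableName, dims) {
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(tableName)) {
    throw new Error("unexpected table name");
  }
  if (!Number.isInteger(dims) || dims <= 0) {
    throw new Error("dims must be a positive integer");
  }
  return (
    `CREATE VIRTUAL TABLE ${tableName} USING vec0(` +
    `id TEXT PRIMARY KEY, embedding FLOAT[${dims}] distance_metric=cosine)`
  );
}

const ddl = cosineVecTableDdl("chunks_vec", 4096);
```

With a table created this way, v.distance could feed score = 1 - dist directly, at the cost of the migration and reindex the PR chose to avoid.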

Changed files

  • extensions/memory-core/src/memory/manager-search.ts (modified, +13/-5)


Bug type

Performance regression (linear-scan vector search)

Environment

  • OpenClaw: 2026.4.15 (041266a)
  • Node.js: v25.9.0
  • OS: Ubuntu 24.04 / Linux 6.17 (aarch64 / NVIDIA GB10)
  • SQLite extension: sqlite-vec-linux-arm64/vec0.so (bundled with OpenClaw)
  • Embedding provider: local (Qwen3-Embedding-8B-Q8_0.gguf via node-llama-cpp 3.18.1, fully GPU-offloaded)
  • Memory store: ~/.openclaw/memory/main.sqlite, 10,883 chunks, 4096-dim vectors
  • Install: npm install -g openclaw

Root cause (code location)

File: dist/manager-cQ8cHF3H.js (the build is minified; matching symbol is searchVector).

Current implementation:

async function searchVector(params) {
    if (params.queryVec.length === 0 || params.limit <= 0) return [];
    if (await params.ensureVectorReady(params.queryVec.length)) return params.db.prepare(`SELECT c.id, c.path, c.start_line, c.end_line, c.text,
       c.source,
       vec_distance_cosine(v.embedding, ?) AS dist
  FROM ${params.vectorTable} v
  JOIN chunks c ON c.id = v.id
 WHERE c.model = ?${params.sourceFilterVec.sql}
 ORDER BY dist ASC
 LIMIT ?`).all(vectorToBlob(params.queryVec), params.providerModel, ...params.sourceFilterVec.params, params.limit).map(/* ... */);
    // fallback (pure JS cosine over listChunks) ...
}

vec_distance_cosine() is a plain SQL function; sqlite-vec only uses its KNN query path when a query uses the special MATCH ? AND k = N syntax. Without MATCH, SQLite has to evaluate the distance for every row in chunks_vec and then sort the results.

Reproduction

Minimal repro script that isolates OpenClaw's SQL (runs against the real main.sqlite):

import { DatabaseSync } from "node:sqlite";
const db = new DatabaseSync("/home/<you>/.openclaw/memory/main.sqlite",
  { readOnly: true, allowExtension: true });
db.loadExtension("<openclaw>/node_modules/sqlite-vec-linux-arm64/vec0.so");

// Any 4096-float vector works as the probe; this pulls one from chunks_vec itself.
const blob = db.prepare("SELECT embedding FROM chunks_vec LIMIT 1").get().embedding;
const MODEL_TAG = "hf:Qwen/Qwen3-Embedding-8B-GGUF/Qwen3-Embedding-8B-Q8_0.gguf";

// A) OpenClaw's current pattern (linear scan)
console.time("A_linear");
db.prepare(`
  SELECT c.id, vec_distance_cosine(v.embedding, ?) AS dist
    FROM chunks_vec v JOIN chunks c ON c.id = v.id
   WHERE c.model = ?
   ORDER BY dist ASC LIMIT 10`).all(blob, MODEL_TAG);
console.timeEnd("A_linear");

// B) sqlite-vec KNN
console.time("B_knn");
db.prepare(`
  SELECT v.id, v.distance
    FROM chunks_vec v
   WHERE v.embedding MATCH ? AND k = 10
   ORDER BY v.distance`).all(blob);
console.timeEnd("B_knn");

// C) KNN + join chunks (drop-in replacement for OpenClaw's current query)
console.time("C_knn_join");
db.prepare(`
  SELECT c.id, c.path, c.start_line, c.end_line, c.text, c.source, v.distance
    FROM chunks_vec v JOIN chunks c ON c.id = v.id
   WHERE v.embedding MATCH ? AND k = 10 AND c.model = ?
   ORDER BY v.distance`).all(blob, MODEL_TAG);
console.timeEnd("C_knn_join");

Observed on a 10,827-vector store (local repro, 3 runs each, DB warm):

A) OpenClaw pattern (linear scan via vec_distance_cosine):
  run 1: 8494ms, 10 results
  run 2: 8488ms, 10 results
  run 3: 8485ms, 10 results

B) sqlite-vec MATCH operator (uses KNN index):
  run 1: 52ms, 10 results
  run 2: 48ms, 10 results
  run 3: 45ms, 10 results

C) MATCH + join to chunks (filtered by model):
  run 1: 44ms, 10 results
  run 2: 45ms, 10 results
  run 3: 44ms, 10 results
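
The speedup figures can be recomputed from the run times above. This is plain arithmetic on the reported numbers; only the variable names are new.

```javascript
// Run times copied from the results above, in milliseconds.
const runsMs = {
  A_linear: [8494, 8488, 8485],
  B_knn: [52, 48, 45],
  C_knn_join: [44, 45, 44],
};

const mean = (xs) => xs.reduce((sum, x) => sum + x, 0) / xs.length;

const meanA = mean(runsMs.A_linear);              // 8489
const speedupB = meanA / mean(runsMs.B_knn);      // about 175.6
const speedupC = meanA / mean(runsMs.C_knn_join); // about 191.5
```

The headline ~190× figure lines up with pattern C (KNN plus the join); bare KNN in pattern B comes out nearer 175× on these three runs.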

Impact

Observed end-to-end memory_search tool latency under OpenClaw 2026.4.15 on a machine with no background load:

  • Fresh gateway, sequential queries: 19.6s → 28.1s → 36.7s → 36.7s (appears monotonically worse because parallel tool calls queue behind each other; each individual vec scan is ~8.5s and calls serialize on the embedding context's withLock).
  • With direct node-llama-cpp: embed = 43–80ms; sqlite-vec FTS = 5–14ms.
  • All remaining wall-clock time (~8.5s/query) is spent inside the non-indexed SQL.

The scan cost scales as O(chunks × dims), so on larger workspaces (or as sessions grow) it is the dominant cost of every search.
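
A back-of-the-envelope estimate of the per-query scan size for this workspace, assuming each vector component is stored as a 4-byte float32. It is a rough lower bound on the data touched, not a model of where the ~8.5 s goes.

```javascript
// Workspace figures from the Environment section.
const chunks = 10883;
const dims = 4096;
const bytesPerFloat32 = 4; // assumption: float32 storage

// One multiply-add per vector component per row.
const multiplyAddsPerQuery = chunks * dims;                           // 44,576,768
const bytesScannedPerQuery = multiplyAddsPerQuery * bytesPerFloat32;  // 178,307,072
const mibScannedPerQuery = bytesScannedPerQuery / (1024 * 1024);      // about 170.05
```

Doubling either the chunk count or the embedding width doubles both numbers, which is what the O(chunks × dims) scaling means in practice.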

Proposed fix

Drop-in replacement for searchVector (pattern C above):

async function searchVector(params) {
    if (params.queryVec.length === 0 || params.limit <= 0) return [];
    if (!(await params.ensureVectorReady(params.queryVec.length))) {
        // keep the existing pure-JS fallback
        return /* existing listChunks(...).map cosineSimilarity path */;
    }
    return params.db.prepare(`
        SELECT c.id, c.path, c.start_line, c.end_line, c.text, c.source,
               v.distance AS dist
          FROM ${params.vectorTable} v
          JOIN chunks c ON c.id = v.id
         WHERE v.embedding MATCH ?
           AND k = ?
           AND c.model = ?${params.sourceFilterVec.sql}
         ORDER BY v.distance`
    ).all(
        vectorToBlob(params.queryVec),
        params.limit,
        params.providerModel,
        ...params.sourceFilterVec.params,
    ).map((row) => ({
        id: row.id,
        path: row.path,
        startLine: row.start_line,
        endLine: row.end_line,
        score: 1 - row.dist,
        snippet: truncateUtf16Safe(row.text, params.snippetMaxChars),
        source: row.source,
    }));
}

Notes / caveats:

  • sqlite-vec requires k as a literal or bound param in the WHERE clause, not as LIMIT.
  • MATCH + AND c.model = ? works; sqlite-vec runs KNN first, then filters by the SQL predicates.
  • Results are identical (same ids, same order) to the current query on all queries tested.
  • If multiple models ever share a chunks_vec table (I didn't see that here, but defensively), you may want to filter the KNN candidate pool larger and re-trim after the model filter.
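
The last caveat (over-fetch, filter, then re-trim) might be sketched as follows. Both helpers and the factor of 4 are hypothetical; the rows are made-up data shaped like the query's output with an added model field.

```javascript
// Hypothetical: ask KNN for more candidates than the caller needs.
function overfetchK(limit, factor = 4) {
  return Math.max(limit, Math.ceil(limit * factor));
}

// Hypothetical: drop rows that fail the predicate, re-sort by distance,
// and trim back to the requested limit.
function trimAfterFilter(candidates, keep, limit) {
  return candidates
    .filter(keep)
    .sort((a, b) => a.dist - b.dist)
    .slice(0, limit);
}

// Made-up candidate rows from a table shared by two models.
const candidates = [
  { id: "a", model: "m1", dist: 0.10 },
  { id: "b", model: "m2", dist: 0.12 },
  { id: "c", model: "m1", dist: 0.20 },
  { id: "d", model: "m2", dist: 0.25 },
  { id: "e", model: "m1", dist: 0.30 },
];

const k = overfetchK(2); // 8
const top = trimAfterFilter(candidates, (row) => row.model === "m1", 2);
const topIds = top.map((row) => row.id); // ["a", "c"]
```

Without over-fetching, a k of 2 here would return rows a and b, and the model filter would leave only one result.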

Other small observations (not blockers)

  • embedBatch in manager-cQ8cHF3H.js still uses Promise.all(texts.map(async text => ctx.getEmbeddingFor(text))). This is the exact pattern from closed issue #7548, and while our current KV-cache/concurrency behavior under node-llama-cpp 3.18.1 didn't deadlock, every concurrent call serializes on node-llama-cpp's internal withLock anyway. Consider switching to a sequential loop to (a) match the documented behavior and (b) avoid re-introducing the old deadlock if upstream changes behavior.
  • No gpuLayers is passed to llama.loadModel() in engine-embeddings-Bk3B82BS.js. On systems with available VRAM the default is fine, but an explicit value (or gpuLayers: "max") would make the config more predictable and avoid surprises on mixed-GPU hosts.
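
The sequential loop suggested in the first observation could look roughly like this. ctx.getEmbeddingFor is the call named above; embedBatchSequential and the fake context are illustrative stand-ins, not OpenClaw or node-llama-cpp code.

```javascript
// Sketch: embed texts one at a time instead of via Promise.all.
async function embedBatchSequential(ctx, texts) {
  const embeddings = [];
  for (const text of texts) {
    // Each call finishes before the next one starts.
    embeddings.push(await ctx.getEmbeddingFor(text));
  }
  return embeddings;
}

// Fake context that records how many calls are in flight at once.
function makeFakeContext() {
  let inFlight = 0;
  let maxInFlight = 0;
  return {
    async getEmbeddingFor(text) {
      inFlight += 1;
      maxInFlight = Math.max(maxInFlight, inFlight);
      await new Promise((resolve) => setTimeout(resolve, 1));
      inFlight -= 1;
      return { vector: [text.length] };
    },
    get maxInFlight() {
      return maxInFlight;
    },
  };
}
```

Run against the fake context, the sequential version never has more than one call in flight, whereas the Promise.all form would start all of them at once.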

How I'd like to help

Happy to open a PR with the searchVector change and a small test that asserts sub-100ms on a seeded 10k-chunk DB if that helps speed things along.


Related prior art: issue #7548 (closed not-planned) flagged the Promise.all embedding pattern but didn't surface the SQL issue, which is the actual hot path.

extent analysis

TL;DR

The proposed fix is to replace the searchVector function with a new implementation that uses the MATCH operator provided by sqlite-vec, which enables the use of an index and significantly improves query performance.

Guidance

  • Identify the current implementation of searchVector in dist/manager-cQ8cHF3H.js and replace it with the proposed drop-in replacement.
  • Verify that the new implementation produces the same results as the current one by running the provided reproduction script and comparing the output.
  • Test the performance of the new implementation using the reproduction script and measure the query time to ensure it is significantly improved.
  • Consider adding a test to assert that the query time is sub-100ms on a seeded 10k-chunk DB to ensure the fix is effective.
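
A latency check along the lines of the last bullet could use a small timing helper like the one below. medianLatencyMs is a hypothetical name; the statement it would time in practice is whatever prepared KNN query the test sets up against a seeded database.

```javascript
// Hypothetical helper: run a synchronous query function several times
// and return the middle sample (upper middle for even run counts).
function medianLatencyMs(runQuery, runs = 5) {
  const samples = [];
  for (let i = 0; i < runs; i += 1) {
    const start = performance.now();
    runQuery();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)];
}

// Stand-in workload so the sketch runs on its own. In a real test this
// would call the prepared KNN statement, and the assertion would be
// something like: medianLatencyMs(runKnn) < 100.
let calls = 0;
const sampleMs = medianLatencyMs(() => { calls += 1; }, 5);
```

Using the median rather than a single run keeps one slow outlier (for example a cold page cache) from failing the check.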

Example

The proposed replacement for searchVector is provided in the issue body:

async function searchVector(params) {
    // ...
    return params.db.prepare(`
        SELECT c.id, c.path, c.start_line, c.end_line, c.text, c.source,
               v.distance AS dist
          FROM ${params.vectorTable} v
          JOIN chunks c ON c.id = v.id
         WHERE v.embedding MATCH ?
           AND k = ?
           AND c.model = ?${params.sourceFilterVec.sql}
         ORDER BY v.distance`
    ).all(
        vectorToBlob(params.queryVec),
        params.limit,
        params.providerModel,
        ...params.sourceFilterVec.params,
    ).map((row) => ({
        id: row.id,
        path: row.path,
        startLine: row.start_line,
        endLine: row.end_line,
        score: 1 - row.dist,
        snippet: truncateUtf16Safe(row.text, params.snippetMaxChars),
        source: row.source,
    }));
}

Notes

  • The proposed fix assumes that the MATCH operator is supported by the sqlite-vec extension and that the k parameter is a literal or bound parameter in the WHERE clause.
  • The fix may not be applicable if multiple models share a chunks_vec table, in which case additional filtering may be necessary.

Recommendation

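Apply the searchVector change in the form PR #69680 settled on: use MATCH ? AND k = ? for candidate selection and keep vec_distance_cosine() in the SELECT for the score. Per that PR, the issue's original drop-in (v.distance AS dist) returns L2 distances on the default chunks_vec schema, so score = 1 - dist goes negative and the minScore filter drops every result.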
