openclaw: [Suggestion/Dev Tool] Code-Memory DB for OpenClaw development use

GitHub stats: openclaw/openclaw#84295, fetched 2026-05-20 03:41:41
Comments: 1 · Participants: 2 · Timeline events: 3 (closed ×1, commented ×1, labeled ×1) · Reactions: 1

Summary

code-memory is a local MCP server that indexes a Git repo into a SQLite database using tree-sitter AST parsing + sentence-transformer embeddings, then exposes semantic code search, doc search, and Git history search as MCP tools. It's a genuinely useful development accelerator for a codebase the size of OpenClaw.

This post documents: a patched fork that fixes issues we found while indexing OpenClaw, the indexing process, and a real workflow example using the resulting DB to analyze a live bug report.


Patched fork

Until the upstream author merges our PR, anyone wanting to run this against OpenClaw should use our full patched fork of code-memory rather than the upstream install:

https://github.com/jimdawdy-hub/code-memory — full patched fork, ready to run

A companion repo documents the patches in detail (diffs, root cause analysis, PR text):

https://github.com/jimdawdy-hub/code-memory-patches — patch documentation and standalone indexer script

Upstream code-memory v1.0.32 has three bugs that only surface on large repos. We hit all three while indexing OpenClaw (17,212 source files, 945 doc files). The fixes ship as drop-in replacements for two files in the installed package and address the following:

  1. Cross-thread SQLite crash (~30% of files silently skipped) — the parallel file parser accessed a shared SQLite connection from multiple threads simultaneously, causing InterfaceError on roughly a third of files. Fix: pre-fetch file metadata in the main thread before the worker pool starts.

  2. Duplicate symbol crash kills DB write — tree-sitter's AST extraction can produce duplicate symbol entries for a single file, crashing the DB write phase after all parsing and GPU embedding had already completed. Fix: INSERT OR IGNORE with correct new-row detection via cursor.rowcount (not cursor.lastrowid, which is unreliable after a no-op ignore).

  3. Embedding insert crash on sqlite-vec virtual table — a follow-on from the above fix: the symbol_embeddings table is a sqlite-vec virtual table that rejects conflict-resolution clauses. Fix: only queue embeddings for freshly inserted symbols.
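For readers who want to see the shape of the three fixes, here is a minimal sketch using only the standard library. It is not the code from the PR: the table layout, the parse_file stand-in for tree-sitter, and the plain Python list used as an embedding queue are all assumptions made for illustration.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def parse_file(path, known_hash):
    """Worker: pure parsing, no database access (hypothetical stand-in for tree-sitter).

    known_hash is where a real parser could decide to skip unchanged files.
    """
    # The duplicate "main" is deliberate, to mimic duplicate symbols from the AST.
    return path, ["main", "helper", "main"]


def index_files(conn, paths):
    # Fix 1: read all per-file metadata on the main thread, before the pool
    # starts, so worker threads never touch the shared connection.
    known = dict(conn.execute("SELECT path, hash FROM files"))

    with ThreadPoolExecutor(max_workers=4) as pool:
        parsed = list(pool.map(lambda p: parse_file(p, known.get(p)), paths))

    to_embed = []
    cur = conn.cursor()
    for path, names in parsed:
        for name in names:
            # Fix 2: a duplicate becomes a no-op instead of raising IntegrityError.
            cur.execute(
                "INSERT OR IGNORE INTO symbols (path, name) VALUES (?, ?)",
                (path, name),
            )
            # rowcount is 1 for a real insert and 0 for an ignored one;
            # lastrowid would still hold the id of an earlier successful insert.
            if cur.rowcount == 1:
                # Fix 3: queue embeddings only for rows that were actually created.
                to_embed.append((cur.lastrowid, name))
    conn.commit()
    return to_embed


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE files (path TEXT PRIMARY KEY, hash TEXT);
    CREATE TABLE symbols (
        id INTEGER PRIMARY KEY, path TEXT, name TEXT, UNIQUE (path, name)
    );
    """
)
queued = index_files(conn, ["a.ts", "b.ts"])
```

Because only freshly inserted rows reach the queue, the later write to the embeddings table never needs a conflict-resolution clause, which is the property the sqlite-vec virtual table requires.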

We submitted a PR to the upstream author (#11) with the full details. Once that merges, the standard uvx code-memory install will work without any patching.

The patches repo also includes run-code-memory-index.py — a standalone script that bypasses the MCP connection entirely. This is important for OpenClaw: the full index takes ~60 minutes on GPU, which outlasts a typical MCP stdio session. Running standalone under nohup lets it complete regardless of session state.
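The same "survive the session" idea can be sketched in Python for anyone who prefers to launch the indexer from a script rather than a shell. The launch_detached helper below is hypothetical and not part of either repo, and it makes no assumption about which arguments run-code-memory-index.py accepts.

```python
import subprocess
import sys


def launch_detached(cmd, log_path):
    """Start cmd in its own session so it outlives the launching terminal or MCP session."""
    log = open(log_path, "ab")
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,
        stdout=log,
        stderr=subprocess.STDOUT,
        start_new_session=True,  # POSIX only: runs the child in a new session
    )
    log.close()  # the child keeps its own copy of the file descriptor
    return proc
```

Passing a command list such as [sys.executable, "run-code-memory-index.py"] plus whatever arguments the script actually expects, together with a log path, gives roughly the effect of running it under nohup with output redirected to a file.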


Indexing OpenClaw

Install from the patched fork:

git clone https://github.com/jimdawdy-hub/code-memory.git
cd code-memory
uv sync

Or use the standalone script from the patches repo with the patched uvx tool install. Hardware used: HP Z640, NVIDIA RTX 5060 Ti (16 GB VRAM), CUDA 13.2.

Result: 17,212 code files, 945 doc files, 111,000 symbols, 2.6M cross-references, 750 MB SQLite DB — completed in ~59 minutes. The GPU handled embedding generation in ~8 minutes; the remainder was tree-sitter parsing and DB writes.

Once indexed, add to your Claude Code MCP config:

{
  "mcpServers": {
    "code-memory": {
      "command": "uvx",
      "args": ["code-memory"]
    }
  }
}

After that, calling index_codebase("/path/to/openclaw") in later sessions reuses the existing DB and updates it incrementally.


Workflow example — issue #84252

To demonstrate the DB in use: issue #84252 (doctor/status can leave openai-codex OAuth sidecar auth partially repaired) was filed today. Using semantic search against the index, we traced the report to three specific source locations in about 2 minutes:

Bug A — models status reports broken profiles as configured: src/gateway/server-methods/models-auth-status.ts counts a profile as present if it exists in the store, without attempting runtime credential resolution. An oauthRef-only profile (no access/refresh) passes the existence check but fails at resolveApiKeyForProvider mid-run. Status should surface has_access=false && has_oauthRef=true as auth-broken.
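The distinction Bug A asks for can be written down in a few lines. This is a language-neutral sketch in Python, not OpenClaw's TypeScript: the Profile fields and the state names other than auth-broken are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Profile:
    # Field names are illustrative, not OpenClaw's actual schema.
    access: Optional[str] = None
    refresh: Optional[str] = None
    oauth_ref: Optional[str] = None


def auth_state(profile: Optional[Profile]) -> str:
    """Classify a stored profile by what is usable at runtime, not by mere presence."""
    if profile is None:
        return "missing"
    has_access = bool(profile.access)
    has_oauth_ref = bool(profile.oauth_ref)
    if not has_access and has_oauth_ref:
        # Present in the store, but nothing usable when a run needs a credential.
        return "auth-broken"
    return "configured" if has_access else "missing"
```

An existence check alone would report the oauthRef-only profile as configured; checking for a usable credential is what moves it into its own state.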

Bug B — gateway auth cache not cleared after doctor --fix: src/commands/doctor/repair-sequencing.ts:135 calls maybeRepairLegacyOAuthSidecarProfiles() and writes migrated credentials to disk, but never calls clearLoadedAuthStoreCache() (src/agents/auth-profiles/store-cache.ts:48) or invalidateModelAuthStatusCache() (src/gateway/server-methods/models-auth-status.ts:90) afterward. The running gateway keeps the stale cached store until restart. Both functions exist and are called elsewhere — they're just missing from the repair path.
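A single-process sketch of the stale-cache pattern behind Bug B, in Python rather than OpenClaw's TypeScript. The function names loosely mirror the ones cited above, but the bodies, the JSON store layout, and the invalidate flag are assumptions for illustration only.

```python
import json
from pathlib import Path

_store_cache: dict = {}


def load_auth_store(path: Path) -> dict:
    """Return the parsed store, reading from disk only on a cache miss."""
    key = str(path)
    if key not in _store_cache:
        _store_cache[key] = json.loads(path.read_text())
    return _store_cache[key]


def clear_loaded_auth_store_cache() -> None:
    _store_cache.clear()


def repair_profiles(path: Path, invalidate: bool = True) -> None:
    """Write migrated credentials, then drop cached copies so readers see them."""
    store = json.loads(path.read_text())
    store["migrated"] = True  # stand-in for the real credential migration
    path.write_text(json.dumps(store))
    if invalidate:
        clear_loaded_auth_store_cache()
```

With invalidate=False, load_auth_store keeps returning the pre-repair contents even though the file on disk is correct, which matches the symptom described in the report.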

Bug C — dist/index.js error poisons doctor exit code: A post-repair subprocess invokes the built CLI, which doesn't exist on source-checkout/LaunchAgent installs. Non-fatal to the actual repair, but causes a nonzero exit that makes the repair appear to have failed when it succeeded.
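One possible way to keep a best-effort follow-up step from changing the exit code is sketched below. The report does not prescribe a fix, so this is only an assumption about what one could look like; run_doctor, repair, and post_check_cmd are hypothetical names.

```python
import subprocess
import sys


def run_doctor(repair, post_check_cmd) -> int:
    """Exit code reflects the repair itself; the post-repair check is best-effort."""
    if not repair():
        return 1
    try:
        result = subprocess.run(post_check_cmd, capture_output=True, text=True)
        if result.returncode != 0:
            print(
                f"warning: post-repair check exited {result.returncode}",
                file=sys.stderr,
            )
    except FileNotFoundError:
        # The executable itself is missing, e.g. no built CLI on this install.
        print("warning: post-repair check not available, skipping", file=sys.stderr)
    return 0
```

The repair result alone decides success or failure, while a missing or failing follow-up command is downgraded to a warning on stderr.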

Full findings posted as a comment on #84252.

The DB made this workflow fast: semantic queries like "gateway auth cache invalidate reload" and "oauthRef sidecar migration inline credentials" returned the exact file:line locations without needing to know the codebase structure in advance.


Kudos

Hat tip to @kapillamba4 for building code-memory. The tool works exactly as advertised — the bugs we found are edge cases that only appear on unusually large repos, and the core architecture (tree-sitter + sentence-transformers + sqlite-vec + hybrid BM25/vector search) is solid. OpenClaw at 17k files is probably one of the larger TypeScript codebases it's been run against. The PR is upstream if he wants to merge the fixes.


Interested in feedback from the maintainer team on whether a pre-built DB snapshot (updated on releases) would be worth hosting, or if there's a preferred path for contributors to run their own index.
