openclaw - ✅ (Solved) Fix Bug: memory_search mislabels session transcript hits as corpus=memory [2 pull requests, 1 participant]

GitHub stats
openclaw/openclaw#72885 · Fetched 2026-04-28 06:30:53
Comments: 0 · Participants: 1 · Timeline: 2 · Reactions: 0
Timeline (top): cross-referenced ×2

When memory_search returns session-transcript hits from the built-in memory backend, the tool response labels those results with corpus: "memory" even when the underlying hit has source: "sessions".

This makes session-backed hits indistinguishable from durable memory-file hits for downstream consumers that rely on corpus.

Root Cause

In extensions/memory-core/src/tools.ts, the built-in search path maps all surfaced memory results to corpus: "memory" without checking result.source.

Fix Action

Fixed

PR fix notes

PR #72886: fix(memory-search): label session hits with sessions corpus

Description (problem / solution / changelog)

Summary

Fix memory_search so built-in session transcript hits are surfaced with corpus: "sessions" instead of always being flattened to corpus: "memory".

Closes #72885.

What changed

  • preserve corpus: "sessions" when a built-in memory_search hit has source: "sessions"
  • keep durable memory-file hits labeled as corpus: "memory"
  • add a regression test covering the session-hit path

Reproduction

  1. Enable session transcript indexing.
  2. Query memory_search for content that exists only in a session transcript.
  3. Inspect the returned hit.

Before

A transcript-backed hit could be returned as:

{
  "corpus": "memory",
  "source": "sessions"
}

After

The same hit is returned as:

{
  "corpus": "sessions",
  "source": "sessions"
}

Testing

  • node scripts/run-vitest.mjs run --config test/vitest/vitest.unit.config.ts extensions/memory-core/src/tools.citations.test.ts

Changed files

  • .agents/skills/openclaw-release-maintainer/SKILL.md (modified, +5/-0)
  • CHANGELOG.md (modified, +1337/-1599)
  • docs/.generated/plugin-sdk-api-baseline.sha256 (modified, +2/-2)
  • extensions/browser/browser-profiles.ts (modified, +1/-0)
  • extensions/browser/src/browser/config.ts (modified, +1/-0)
  • extensions/memory-core/src/tools.citations.test.ts (modified, +34/-0)
  • extensions/memory-core/src/tools.ts (modified, +7/-3)
  • extensions/qa-lab/src/web-runtime.test.ts (modified, +1/-0)
  • extensions/qa-lab/src/web-runtime.ts (modified, +39/-1)
  • package.json (modified, +4/-4)
  • qa/scenarios/ui/control-ui-qa-channel-image-roundtrip.md (modified, +33/-12)
  • scripts/e2e/parallels-npm-update-smoke.sh (modified, +13/-3)
  • scripts/release-check.ts (modified, +38/-10)
  • scripts/run-vitest.mjs (modified, +31/-1)
  • scripts/test-projects.test-support.mjs (modified, +5/-1)
  • src/agents/pi-embedded-runner/run/trigger-policy.test.ts (added, +22/-0)
  • src/agents/pi-embedded-runner/run/trigger-policy.ts (modified, +3/-3)
  • src/agents/runtime-auth-refresh.ts (modified, +2/-2)
  • src/channels/plugins/module-loader.test.ts (modified, +2/-2)
  • src/gateway/call.ts (modified, +2/-1)
  • src/gateway/client.ts (modified, +3/-2)
  • src/gateway/gateway-codex-harness.live-helpers.ts (modified, +2/-0)
  • src/gateway/gateway-models.profiles.live.test.ts (modified, +4/-1)
  • src/gateway/probe.ts (modified, +3/-2)
  • src/gateway/server-chat.ts (modified, +3/-3)
  • src/gateway/server-methods/agent-job.ts (modified, +4/-5)
  • src/gateway/server-methods/agent-wait-dedupe.ts (modified, +2/-2)
  • src/infra/heartbeat-runner.scheduler.test.ts (modified, +18/-0)
  • src/infra/heartbeat-runner.timeout-warning.test.ts (added, +70/-0)
  • src/infra/heartbeat-runner.ts (modified, +11/-1)
  • src/plugin-sdk/facade-loader.test.ts (modified, +2/-2)
  • src/plugins/bundled-runtime-deps.ts (modified, +40/-0)
  • src/plugins/bundled-runtime-root.ts (modified, +16/-0)
  • src/plugins/contracts/plugin-sdk-subpaths.test.ts (modified, +1/-0)
  • src/plugins/doctor-contract-registry.test.ts (modified, +2/-2)
  • src/plugins/loader.test.ts (modified, +206/-0)
  • src/plugins/loader.ts (modified, +249/-58)
  • src/plugins/public-surface-loader.test.ts (modified, +2/-2)
  • src/plugins/sdk-alias.test.ts (modified, +6/-6)
  • src/plugins/sdk-alias.ts (modified, +1/-1)
  • src/plugins/setup-registry.test.ts (modified, +2/-2)
  • src/utils/timer-delay.test.ts (added, +34/-0)
  • src/utils/timer-delay.ts (added, +19/-0)
  • test/release-check.test.ts (modified, +35/-1)
  • test/scripts/parallels-npm-update-smoke.test.ts (modified, +3/-3)
  • test/scripts/run-vitest.test.ts (modified, +34/-0)
  • ui/src/ui/app-render.ts (modified, +6/-6)
  • ui/src/ui/app.ts (modified, +1/-1)
  • ui/src/ui/controllers/chat.ts (modified, +93/-1)

PR #72926: fix(memory_search): surface session-transcript hits with corpus=sessions, not corpus=memory (#72885)

Description (problem / solution / changelog)

Fixes #72885.

Summary

`memory_search` was hardcoding `corpus: "memory" as const` for every surfaced hit, even when the underlying `MemorySearchResult.source` was `"sessions"`. That flattened provenance at the last step: downstream consumers couldn't distinguish durable memory-file hits from session-transcript hits in the tool response, even though the corpus selection contract (`memory` / `sessions` / `wiki` / `all`) and the hit's `source` field both already carry the distinction.

Fix

Drive the surfaced `corpus` from the hit's `source`:

```diff
-          corpus: "memory" as const,
+          corpus: result.source === "sessions" ? ("sessions" as const) : ("memory" as const),
```

Widens the `surfacedMemoryResults` element type to `corpus: "memory" | "sessions"` accordingly.
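The widened element type and the source-driven mapping can be sketched together as follows. This is a minimal illustration, not the actual code in `tools.ts`: the `path` and `snippet` fields, the `SurfacedMemoryResult` interface, and the `surfaceMemoryResult` helper are assumed names, and only `source` and `corpus` come from the PR description.

```typescript
// Assumed shapes for illustration; only `source` and `corpus` are named in the PR.
type MemorySource = "memory" | "sessions";

interface MemorySearchResult {
  path: string; // assumed field
  snippet: string; // assumed field
  source: MemorySource;
}

// Element type after widening: corpus may now be "memory" or "sessions".
interface SurfacedMemoryResult extends MemorySearchResult {
  corpus: "memory" | "sessions";
}

// Hypothetical helper: derive the surfaced corpus from the hit's source
// instead of hardcoding "memory" for every hit.
function surfaceMemoryResult(result: MemorySearchResult): SurfacedMemoryResult {
  return {
    ...result,
    corpus: result.source === "sessions" ? ("sessions" as const) : ("memory" as const),
  };
}

const hits: MemorySearchResult[] = [
  { path: "MEMORY.md", snippet: "prefers dark mode", source: "memory" },
  { path: "sessions/abc.jsonl", snippet: "asked about billing", source: "sessions" },
];

const surfaced = hits.map((hit) => surfaceMemoryResult(hit));
console.log(surfaced.map((hit) => `${hit.path} -> ${hit.corpus}`));
```

Keeping the mapping in one small function means the provenance decision is made in a single place, which is also where a regression test can target it.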

Tests

4 existing memory_search tests still pass. Added 3 new regression tests in `tools.test.ts` under `describe("memory_search corpus surfacing (#72885)")`:

  • Session-source hits surface as `corpus: "sessions"`.
  • Memory-source hits surface as `corpus: "memory"`.
  • Mixed-source result sets keep `corpus === source` for every hit.

Visibility filtering is still enforced upstream by `filterMemorySearchHitsBySessionVisibility`. The new tests mock that helper to a no-op so the corpus-surfacing assertion isolates the bug; visibility behavior itself is covered by `session-search-visibility.test.ts`.

```
Test Files  3 passed (tools.test.ts: 7, tools.citations.test.ts: 12, tools.recall-tracking.test.ts: 5)
Tests  24 passed (24)
```

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • extensions/memory-core/src/tools.test.ts (modified, +84/-1)
  • extensions/memory-core/src/tools.ts (modified, +10/-2)

Raw issue body

Summary

When memory_search returns session-transcript hits from the built-in memory backend, the tool response labels those results with corpus: "memory" even when the underlying hit has source: "sessions".

This makes session-backed hits indistinguishable from durable memory-file hits for downstream consumers that rely on corpus.

Why this is a bug

The tool already distinguishes corpus selection (memory, sessions, wiki, all), and session transcript hits carry source: "sessions" internally. Returning corpus: "memory" for those hits is inconsistent with the tool contract and with the visible provenance of the result.

Reproduction

  1. Enable session transcript indexing so memory_search can return session-backed hits.
  2. Ensure some content exists only in a session transcript, so that a query can match it exclusively.
  3. Call memory_search with a query that returns that transcript hit.
  4. Inspect the returned result object.

Expected

A transcript-backed hit is surfaced with:

{
  "corpus": "sessions",
  "source": "sessions"
}

Actual

The same hit is surfaced with:

{
  "corpus": "memory",
  "source": "sessions"
}

Root cause

In extensions/memory-core/src/tools.ts, the built-in search path maps all surfaced memory results to corpus: "memory" without checking result.source.

Impact

  • Clients cannot reliably distinguish durable memory-file hits from session-transcript hits.
  • The corpus field becomes misleading in tool output even when the underlying search result is correct.
  • Debugging session-memory behavior is harder because provenance is flattened at the last step.

Proposed fix

When surfacing built-in memory_search results, map:

  • source === "sessions" -> corpus: "sessions"
  • otherwise -> corpus: "memory"

Regression coverage

Add a unit test that injects a built-in search result with source: "sessions" and asserts the surfaced tool result keeps corpus: "sessions".
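A regression check along these lines might look like the sketch below. It deliberately avoids a test runner so it stays self-contained. `toSurfacedHit` is a hypothetical stand-in for the surfacing step in `tools.ts`, and the `snippet` field and `expectEqual` helper are assumptions made for illustration.

```typescript
type Corpus = "memory" | "sessions";

interface BuiltinHit {
  snippet: string; // assumed field, for illustration only
  source: Corpus;
}

// Hypothetical stand-in for the surfacing step under test.
function toSurfacedHit(hit: BuiltinHit): BuiltinHit & { corpus: Corpus } {
  return { ...hit, corpus: hit.source === "sessions" ? "sessions" : "memory" };
}

// Minimal assertion helper so the sketch needs no test framework.
function expectEqual<T>(actual: T, expected: T, label: string): void {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${String(expected)}, got ${String(actual)}`);
  }
}

// Inject a session-backed hit and assert the surfaced corpus is preserved.
const sessionHit = toSurfacedHit({ snippet: "transcript-only text", source: "sessions" });
expectEqual(sessionHit.corpus, "sessions", "session hit corpus");

// Durable memory-file hits keep their existing label.
const memoryHit = toSurfacedHit({ snippet: "durable note", source: "memory" });
expectEqual(memoryHit.corpus, "memory", "memory hit corpus");
```

In the real repository the same assertions would presumably live in a Vitest `it(...)` block next to the existing memory_search tests.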

extent analysis

TL;DR

Update the memory_search result mapping to set corpus based on the source field, specifically setting corpus to "sessions" when source is "sessions".

Guidance

  • Review the extensions/memory-core/src/tools.ts file to locate the search path mapping that currently sets all surfaced memory results to corpus: "memory".
  • Modify this mapping to check the result.source field and set corpus accordingly, using the proposed fix logic.
  • Add a unit test to cover the regression, injecting a built-in search result with source: "sessions" and asserting the correct corpus value in the surfaced tool result.
  • Verify that downstream consumers can correctly distinguish between session-backed hits and durable memory-file hits after applying the fix.
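
The last verification step can be sketched from the consumer's side. This assumes a consumer that receives surfaced hits as plain objects; `SurfacedHit`, `groupByCorpus`, and `findMislabeled` are hypothetical names, and the `snippet` field is an assumption.

```typescript
type Corpus = "memory" | "sessions";

// Assumed shape of a surfaced hit as a downstream consumer might see it.
interface SurfacedHit {
  corpus: Corpus;
  source: Corpus;
  snippet: string; // assumed field
}

// Partition hits by corpus so durable and transcript-backed results
// can be handled separately.
function groupByCorpus(hits: SurfacedHit[]): Record<Corpus, SurfacedHit[]> {
  const groups: Record<Corpus, SurfacedHit[]> = { memory: [], sessions: [] };
  for (const hit of hits) {
    groups[hit.corpus].push(hit);
  }
  return groups;
}

// Flag any hit whose corpus and source disagree, which is what the bug produced.
function findMislabeled(hits: SurfacedHit[]): SurfacedHit[] {
  return hits.filter((hit) => hit.corpus !== hit.source);
}

// Shape returned before the fix: a transcript hit labeled as durable memory.
const beforeFix: SurfacedHit[] = [
  { corpus: "memory", source: "sessions", snippet: "transcript text" },
];

// Shape returned after the fix: corpus matches source.
const afterFix: SurfacedHit[] = [
  { corpus: "sessions", source: "sessions", snippet: "transcript text" },
];

console.log(findMislabeled(beforeFix).length, findMislabeled(afterFix).length);
console.log(groupByCorpus(afterFix).sessions.length);
```

A check like `findMislabeled` is only meaningful while `corpus` and `source` share the same value set, which is the case for the two built-in sources discussed here.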

Example

// Proposed fix logic
if (result.source === "sessions") {
  result.corpus = "sessions";
} else {
  result.corpus = "memory";
}

Notes

The fix relies on the source field being accurately set to "sessions" for session transcript hits, which is assumed to be the case based on the issue description.

Recommendation

Apply the proposed fix by updating the memory_search result mapping to set corpus from the source field. This directly addresses the inconsistency and restores the expected behavior.
