openclaw: Track core session/transcript SQLite migration via accessor seam

Summary

Track the core session/transcript runtime-state migration to SQLite as a sequence of small, reviewable PRs using a branch-by-abstraction seam.

The goal is to avoid landing the session/transcript migration as one large, high-risk rewrite. The prior full migration changed hundreds of callers at the same time as the storage backend changed. This issue separates those concerns: first stabilize a storage-agnostic accessor shape over the current file store, then migrate existing direct callers onto that shape by subsystem, then add SQLite storage, then flip the implementation behind that seam, then modernize callers incrementally.

Why prior attempts failed

The previous full branches did not fail because SQLite storage was fundamentally wrong. They failed because several independently risky changes were fused together:

  1. storage changed from file-backed session/transcript state to SQLite rows
  2. caller-facing shape changed from file-entry/session-file oriented data to SQLite row/scope oriented data
  3. hundreds of callers were updated in one large diff
  4. doctor migration scaffolding pulled in unrelated migration slices
  5. plugin SDK session surfaces rippled through bundled and external plugin-facing paths

The migration is landable only if storage, shape, caller adoption, and public SDK modernization are reviewed as separate steps.

Problem to solve

OpenClaw policy is database-first runtime state: no runtime dual-read, no runtime fallback reader, and legacy file reads only inside openclaw doctor --fix migrations.

The hard part is that the session/transcript read path is consumed broadly across core surfaces:

  • gateway session methods, chat history, transcript files, and compaction checkpoints
  • auto-reply session reset/update, fast-path, directive, goal, export, and runner paths
  • agent command, embedded runner, subagent, ACP spawn, compaction, and transcript-rewrite paths
  • cron, heartbeat, status, export, health, and doctor state-integrity commands
  • plugin runtime, plugin SDK session surfaces, and bundled plugin consumers
  • transcript readers, append paths, mirrors, events, and file-semantic helpers

The storage flip itself should be much smaller if callers already depend on a stable domain shape. The missing step is that caller adoption must also be sliced; it cannot be one giant 3.1 PR.

Proposed approach

Use branch-by-abstraction and land the migration as ordered, incremental PRs.

3.1a. Introduce a stable accessor seam over current file storage

Add a canonical internal accessor surface implemented on top of the current file store. This PR should not add SQLite schema, SQLite store modules, caller rewrites, or runtime dual-read behavior.

Target shape:

loadSessionEntry(scope): SessionEntry
upsertSessionEntry(scope, patch): void
listSessionEntries(agentId): SessionEntrySummary[]
loadTranscriptEvents(scope): TranscriptEvent[]
appendTranscriptEvent(scope, event): void

The important part is that SessionEntry, SessionEntrySummary, and TranscriptEvent are domain shapes, not raw file store internals and not raw Kysely rows.
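As a minimal sketch of what "domain shapes" could mean here, the seam can be written as an interface whose types mention neither file paths nor database rows. Everything below is an assumption for illustration: the field names, the `SessionScope` type, the `undefined` return for a missing entry, and the in-memory stand-in (which takes the place of the file-backed implementation so the example runs on its own).

```typescript
// Hypothetical domain shapes. Field names are illustrative, not OpenClaw's real types.
interface SessionScope { agentId: string; sessionKey: string }
interface SessionEntry { scope: SessionScope; updatedAt: number; model?: string }
type SessionEntrySummary = Pick<SessionEntry, "scope" | "updatedAt">;
interface TranscriptEvent { id: string; role: "user" | "assistant" | "tool"; text: string }

// The seam: callers depend on this, never on file or SQLite details.
interface SessionAccessor {
  loadSessionEntry(scope: SessionScope): SessionEntry | undefined; // assumption: missing => undefined
  upsertSessionEntry(scope: SessionScope, patch: Partial<Omit<SessionEntry, "scope">>): void;
  listSessionEntries(agentId: string): SessionEntrySummary[];
  loadTranscriptEvents(scope: SessionScope): TranscriptEvent[];
  appendTranscriptEvent(scope: SessionScope, event: TranscriptEvent): void;
}

// In-memory stand-in so the sketch is runnable; the 3.1a PR would back this with the file store.
function createInMemoryAccessor(): SessionAccessor {
  const entries = new Map<string, SessionEntry>();
  const transcripts = new Map<string, TranscriptEvent[]>();
  const keyOf = (s: SessionScope): string => `${s.agentId}::${s.sessionKey}`;
  return {
    loadSessionEntry: (scope) => {
      const found = entries.get(keyOf(scope));
      return found ? { ...found } : undefined; // return a copy, not the stored object
    },
    upsertSessionEntry: (scope, patch) => {
      const current = entries.get(keyOf(scope)) ?? { scope, updatedAt: 0 };
      entries.set(keyOf(scope), { ...current, ...patch, scope });
    },
    listSessionEntries: (agentId) =>
      [...entries.values()]
        .filter((e) => e.scope.agentId === agentId)
        .map(({ scope, updatedAt }) => ({ scope, updatedAt })),
    loadTranscriptEvents: (scope) => [...(transcripts.get(keyOf(scope)) ?? [])],
    appendTranscriptEvent: (scope, event) => {
      const list = transcripts.get(keyOf(scope)) ?? [];
      list.push(event);
      transcripts.set(keyOf(scope), list);
    },
  };
}
```

Because callers only see `SessionAccessor`, swapping the implementation later changes one factory rather than every call site.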

3.1b+. Adopt the seam by subsystem before the storage flip

Move direct runtime callers of the file-backed store/transcript helpers onto the stable accessor seam in small subsystem PRs. This is required before the storage flip; it is not optional post-flip cleanup.

This step is what prevents the old 300-file cascade from reappearing inside 3.1. Each subsystem PR should be behavior-neutral and file-backed, and should leave unrelated direct callers alone until their own slice.

Candidate subsystem batches from the current direct-call inventory:

  • gateway session core: src/gateway/session-utils*, session keys, session patch/history helpers, and central session row loading
  • gateway transcript/history surfaces: src/gateway/server-methods/chat*, sessions-history-http, session-transcript-files.fs, session-compaction-checkpoints, and lifecycle/event helpers
  • auto-reply: src/auto-reply/reply/session*, dispatch-from-config*, get-reply-*, agent-runner*, command handlers, follow-up runner, directive persistence, and usage updates
  • agent command/runtime: src/agents/command/*, agent-command, acp-spawn, live model switching, subagent control/list/registry, and command session store helpers
  • embedded agent transcript paths: src/agents/embedded-agent-runner/**, compaction hooks, transcript rewrite/truncation, session-manager init, and session lock integration
  • cron/infra/commands: isolated-agent delivery/session paths, session reaper, heartbeat runner, approval-account binding, status/export/health commands, and doctor state-integrity reads
  • plugin runtime and SDK: src/plugins/runtime/*, src/plugin-sdk/*, packages/memory-host-sdk/*, and public session runtime facades
  • bundled channel/plugin consumers: active memory, Telegram native commands/message dispatch, Discord native commands, voice-call response generation, and Codex native execution policy
  • transcript event/mirror helpers: src/sessions/user-turn-transcript.ts, transcript events, mirror resolution, append helpers, stream readers, and related tests

This map is intentionally a starting inventory, not a finished per-file plan. Before each 3.1b slice, re-run the direct-call inventory for that subsystem and classify every hit as one of:

  • move to the seam in this PR
  • leave temporarily because another subsystem owns it
  • keep as file-only doctor/migration code
  • keep as public SDK compatibility surface with a named deprecation plan
  • treat as semantic coupling that needs a separate design before SQLite
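The five classifications above could be recorded in a small typed inventory so each slice PR can show its tally. This is only a sketch: the `HitDisposition` names, the `InventoryHit` shape, and the example paths and symbols are all hypothetical.

```typescript
// One label per classification in the list above (names are hypothetical).
type HitDisposition =
  | "move-to-seam"
  | "leave-for-owning-subsystem"
  | "doctor-migration-only"
  | "public-sdk-compat"
  | "needs-separate-design";

interface InventoryHit {
  file: string;
  symbol: string;
  disposition: HitDisposition;
}

// Count hits per disposition so a PR description can state what moved and what stayed.
function summarizeInventory(hits: InventoryHit[]): Record<HitDisposition, number> {
  const counts: Record<HitDisposition, number> = {
    "move-to-seam": 0,
    "leave-for-owning-subsystem": 0,
    "doctor-migration-only": 0,
    "public-sdk-compat": 0,
    "needs-separate-design": 0,
  };
  for (const hit of hits) counts[hit.disposition] += 1;
  return counts;
}

// Hits that need a design decision before the SQLite flip can proceed.
function flipBlockers(hits: InventoryHit[]): InventoryHit[] {
  return hits.filter((h) => h.disposition === "needs-separate-design");
}

// Hypothetical example data; these paths and symbols are placeholders.
const exampleHits: InventoryHit[] = [
  { file: "src/example/a.ts", symbol: "readSessionFile", disposition: "move-to-seam" },
  { file: "src/example/b.ts", symbol: "readSessionFile", disposition: "leave-for-owning-subsystem" },
  { file: "src/example/c.ts", symbol: "statTranscriptFile", disposition: "needs-separate-design" },
];
```

Keeping the disposition as a closed union means an unclassified hit fails type checking rather than slipping through review.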

3.2. Add SQLite schema and store modules, unused or test-only

Add the SQLite tables and store modules that implement the same seam, covered by unit tests against a temp DB, but do not wire them into runtime paths yet.

Expected tables include core session/transcript tables such as:

  • sessions
  • session_entries
  • transcript_events
  • transcript_event_identities
  • transcript_snapshots

If run/tool artifacts are truly FK-coupled to sessions, decide that explicitly in this slice instead of letting them drift into an unrelated migration.
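One way to check that a new store "implements the same seam" is a shared contract check that runs the same assertions against any implementation. The sketch below narrows the seam to the two transcript methods and uses an in-memory store as a stand-in; a SQLite-backed store opened on a temp DB would presumably be passed to the same function. The session-key string parameter and the helper names are assumptions.

```typescript
interface TranscriptEvent { id: string; text: string }

// Narrowed, hypothetical slice of the seam used only for this sketch.
interface TranscriptStore {
  appendTranscriptEvent(sessionKey: string, event: TranscriptEvent): void;
  loadTranscriptEvents(sessionKey: string): TranscriptEvent[];
}

// Runs the same behavioral checks against any store; returns a list of failures.
function checkTranscriptContract(makeStore: () => TranscriptStore): string[] {
  const failures: string[] = [];
  const store = makeStore();

  if (store.loadTranscriptEvents("missing").length !== 0) {
    failures.push("unknown session should load as empty");
  }

  store.appendTranscriptEvent("s1", { id: "a", text: "first" });
  store.appendTranscriptEvent("s1", { id: "b", text: "second" });
  store.appendTranscriptEvent("s2", { id: "c", text: "other" });

  const ids = store.loadTranscriptEvents("s1").map((e) => e.id).join(",");
  if (ids !== "a,b") failures.push(`append order not preserved: ${ids}`);

  if (store.loadTranscriptEvents("s2").length !== 1) {
    failures.push("sessions must be isolated from each other");
  }
  return failures;
}

// Stand-in implementation so the contract check can run here.
function makeMemoryStore(): TranscriptStore {
  const bySession = new Map<string, TranscriptEvent[]>();
  return {
    appendTranscriptEvent(sessionKey, event) {
      const list = bySession.get(sessionKey) ?? [];
      list.push(event);
      bySession.set(sessionKey, list);
    },
    loadTranscriptEvents(sessionKey) {
      return [...(bySession.get(sessionKey) ?? [])];
    },
  };
}
```

Running one contract against both the file-backed and the SQLite-backed implementation gives the later flip a concrete definition of "same behavior".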

3.3/3.4. Flip storage and migrate legacy files release-atomically

Once runtime callers use the stable seam, switch the seam implementation to SQLite and import legacy session/transcript files through openclaw doctor --fix migration code.

The runtime must not read both stores. After the flip, runtime reads SQLite only, and legacy file reads belong only in doctor migration code.

Important release constraint: no shipped build should read empty SQLite session/transcript tables before legacy data has been imported. If upgrade migration is not automatic and unskippable before runtime reads SQLite, the storage flip and importer must land together or be otherwise release-gated so users do not see empty session history after upgrade.
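A release gate of this kind might look like the sketch below: startup refuses to proceed, rather than falling back to files, when legacy data exists but has not been imported. The `StartupState` fields and the idea of a completion marker are assumptions, and whether a runtime existence probe for legacy files is acceptable under the database-first policy is a decision this sketch does not settle.

```typescript
// Hypothetical inputs gathered at startup.
interface StartupState {
  legacyFilesPresent: boolean; // assumption: an existence probe, not a read of file contents
  importCompleted: boolean;    // assumption: a marker written by the doctor migration
}

type GateDecision = { ok: true } | { ok: false; reason: string };

// Blocks SQLite reads until legacy data is imported. There is no fallback reader:
// the only outcomes are "proceed on SQLite" or "stop and ask for the migration".
function checkSessionStoreGate(state: StartupState): GateDecision {
  if (state.legacyFilesPresent && !state.importCompleted) {
    return {
      ok: false,
      reason: "legacy session files found but not imported; run `openclaw doctor --fix`",
    };
  }
  return { ok: true };
}
```

A fresh install (no legacy files) and a migrated install both pass; only the un-migrated upgrade is stopped, which is the case that would otherwise show empty session history.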

3.5+. Modernize callers by subsystem after the flip

After the SQLite flip, callers can be modernized in separate PRs to use richer SQLite-native queries and scopes.

This is separate from 3.1b adoption. 3.1b is required before the flip so callers stop depending directly on file storage. 3.5+ is follow-through after the flip. Either treat the seam as a real permanent internal API with a named contract, or commit to this modernization/removal work as required cleanup. Do not frame the seam as optional cleanup if it is only a temporary shim.

Why this satisfies the atomic migration policy

This approach keeps the runtime on exactly one storage backend at each point:

  • before the flip, the seam reads current file storage only
  • after the flip, the seam reads SQLite only
  • legacy file import exists only in doctor migration code

The stable seam is not a fallback stack. It is an internal boundary that allows storage migration and caller adoption to land separately. Its long-term status must be explicit: either a permanent domain API or a temporary boundary removed by required follow-through work.

Acceptance criteria

  • Session/transcript storage migration lands through small, ordered PRs rather than a monolithic caller rewrite.
  • 3.1a introduces an additive file-backed seam with no storage behavior change.
  • 3.1b+ migrates direct runtime callers onto the seam by subsystem before the SQLite flip.
  • Each 3.1b slice includes a direct-call inventory for its subsystem and preserves current file-backed behavior.
  • The SQLite store implementation satisfies the same seam before being wired into runtime.
  • The runtime storage flip is atomic and does not leave runtime dual-read or fallback behavior.
  • Legacy session/transcript file import is implemented only in openclaw doctor --fix migration code.
  • No released build can read SQLite session/transcript state before required legacy import has run.
  • Post-flip caller modernization is either required cleanup or the seam is justified as permanent architecture.

Risks and constraints

3.1 landability

The seam PR alone does not solve landability if all direct callers move in one PR. The caller-adoption step must be a stack of small subsystem PRs.

Shape drift

If the stable domain shape leaks file-specific or SQLite-specific details, the seam will not reduce the migration blast radius. The first PRs should keep the seam narrow and verify that callers can keep their current shape expectations.

Silent semantic drift

A type-stable seam can still change behavior when the implementation flips. SQLite ordering must be explicit anywhere the file store currently relies on append order. Inventory and tests should cover ordering, truncation/partial-write behavior, file permissions, transcript rotation, sessionFile path resolution, mtime/stat use, and file archive semantics.
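To illustrate the ordering point, the sketch below assigns an explicit per-session sequence number on append and sorts by it on load, instead of trusting storage order. The backing array is deliberately kept newest-first to stand in for a table that makes no ordering promise; in SQL the equivalent would be an `ORDER BY` on the sequence column. The `seq` field and class name are hypothetical.

```typescript
interface StoredEvent { sessionKey: string; seq: number; id: string; text: string }

class OrderedTranscript {
  // Newest-first on purpose: simulates storage whose natural order is not append order.
  private rows: StoredEvent[] = [];
  private nextSeq = new Map<string, number>();

  // Assigns the next sequence number for the session and returns it.
  append(sessionKey: string, id: string, text: string): number {
    const seq = this.nextSeq.get(sessionKey) ?? 1;
    this.nextSeq.set(sessionKey, seq + 1);
    this.rows.unshift({ sessionKey, seq, id, text });
    return seq;
  }

  // Ordering is explicit: filter to the session, then sort by seq ascending.
  load(sessionKey: string): StoredEvent[] {
    return this.rows
      .filter((r) => r.sessionKey === sessionKey)
      .sort((a, b) => a.seq - b.seq);
  }
}
```

With the sequence stored alongside each event, a test can assert append order without depending on how the backend happens to return rows.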

SQLite concurrency

File append and SQLite writes have different concurrency behavior. The SQLite store slice needs an explicit transaction/WAL/busy-timeout design for gateway, CLI, cron, and isolated-agent writers.
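SQLite itself offers `PRAGMA journal_mode=WAL` and `PRAGMA busy_timeout` for this; on top of those, a writer may still want a bounded retry when the database reports it is busy. The sketch below assumes a synchronous driver that throws an error whose `code` is `"SQLITE_BUSY"` (drivers differ in how they surface this), and it takes the wait function as a parameter so the policy is testable. All names are hypothetical.

```typescript
interface BusyPolicy { maxAttempts: number; baseDelayMs: number }

// Assumption: the driver exposes the SQLite result code name on `error.code`.
function isBusyError(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === "SQLITE_BUSY"
  );
}

// Retries `op` with exponential backoff while it fails with a busy error.
// Any other error, or exhausting maxAttempts, rethrows the last error.
function withBusyRetry<T>(
  op: () => T,
  policy: BusyPolicy,
  wait: (ms: number) => void, // injected blocking wait; hypothetical
): T {
  for (let attempt = 1; ; attempt++) {
    try {
      return op();
    } catch (err) {
      if (!isBusyError(err) || attempt >= policy.maxAttempts) throw err;
      wait(policy.baseDelayMs * 2 ** (attempt - 1));
    }
  }
}
```

Each writer class named above (gateway, CLI, cron, isolated-agent) could then be given its own `BusyPolicy`, which makes the concurrency design reviewable as data rather than scattered constants.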

Plugin SDK compatibility

Plugin SDK session surfaces are a public contract. Any SDK-facing modernization should be additive-first and should not break bundled or third-party plugins during the storage flip.

ACP ownership

ACP session/runtime work has been moving independently. SQLite session work should avoid taking over ACP-owned tables or behavior and should resolve conflicts toward the current ACP-owned schema and runtime paths.

Doctor migration scope

The doctor migration must stay focused on session/transcript import. It should not become a broad umbrella for unrelated legacy state migrations.

Related context

  • Related companion-facing public seam request: #79902
  • The first PR for this issue should be an additive file-backed accessor seam and should not close this issue.

PRs

  • #88840 - first additive file-backed accessor seam. This does not close the migration tracking issue.
