openclaw - Performance: Session list queries trigger full materialization + synchronous transcript reads

openclaw/openclaw#72165 · Fetched 2026-04-27 05:33:57
Comments: 0 · Participants: 1 · Timeline: 0 · Reactions: 0

Summary

Session list and lookup queries currently materialize all session rows (including synchronous transcript reads) before applying search, filter, and limit. This causes query latency to grow linearly with the total number of sessions.

Reproduction

Tested on [email protected] with 387 sessions in agents/main/sessions/sessions.json.

Issue 1: Full materialization before pagination

In session-utils (buildGatewaySessionRow), every session entry triggers:

  • resolveTranscriptUsageFallback() → reads the full transcript file to extract token usage
  • readSessionTitleFieldsFromTranscript() → reads the full transcript file to extract derived title and last message preview

These happen for all entries before search and limit are applied:

// session-utils L890-899
const sessions = filteredEntries
  .map(([key, entry]) => buildGatewaySessionRow({ cfg, key, entry, ...params }))
  .toSorted((a, b) => (b.updatedAt ?? 0) - (a.updatedAt ?? 0));
// search/limit applied AFTER full materialization

The transcript reads use synchronous I/O (readFileSync), blocking the event loop:

// session-utils.fs L491
const content = fs.readFileSync(fd, "utf-8");
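One way to avoid blocking the event loop is to read transcripts through the promise-based fs API and cap how many reads are in flight. The following is a minimal sketch, not openclaw's implementation: readTranscriptAsync and readTranscriptsForPage are hypothetical names, and treating a missing transcript as null is an assumption.

```javascript
const fs = require("node:fs/promises");

// Hypothetical helper: reads one transcript without blocking the event loop.
async function readTranscriptAsync(transcriptPath) {
  try {
    return await fs.readFile(transcriptPath, "utf-8");
  } catch (err) {
    // Assumption: a missing transcript means "no content" rather than an error.
    if (err.code === "ENOENT") return null;
    throw err;
  }
}

// Hypothetical helper: reads the transcripts for one page of results,
// keeping at most `concurrency` reads in flight at a time.
async function readTranscriptsForPage(paths, concurrency = 8) {
  const results = new Array(paths.length);
  let next = 0;
  async function worker() {
    while (next < paths.length) {
      const i = next++; // safe: JavaScript runs this without preemption
      results[i] = await readTranscriptAsync(paths[i]);
    }
  }
  const workers = Array.from(
    { length: Math.min(concurrency, paths.length) },
    () => worker()
  );
  await Promise.all(workers);
  return results;
}
```

Results come back in the same order as the input paths, so the caller can zip them with the page of session rows.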

Issue 2: Single session lookup uses full list scan

In sessions-resolve L53-55:

const entry = Object.entries(store).find(
  ([key, entry]) => entry?.sessionId === sessionId
);

And in gateway-cli, the pattern is:

const matches = listSessionsFromStore(...).sessions
  .filter((session) => session.sessionId === sessionId || session.key === sessionId);

This means looking up a single session by ID still triggers materialization of all sessions.
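A single-session lookup does not need any rows to be built. The sketch below assumes the store shape implied by the snippets above (an object keyed by session key whose entries carry a sessionId). buildSessionIdIndex and findSessionEntry are hypothetical names, not existing openclaw functions.

```javascript
// Hypothetical index: maps sessionId -> store key so lookups avoid a full scan.
// It would need to be rebuilt or updated whenever the store changes.
function buildSessionIdIndex(store) {
  const index = new Map();
  for (const [key, entry] of Object.entries(store)) {
    if (entry?.sessionId) index.set(entry.sessionId, key);
  }
  return index;
}

// Resolve by store key first, then by sessionId. No session rows are
// materialized and no transcripts are read.
function findSessionEntry(store, index, idOrKey) {
  const key = Object.hasOwn(store, idOrKey) ? idOrKey : index.get(idOrKey);
  return key === undefined ? null : { key, entry: store[key] };
}
```

Only the matched entry would then be passed to buildGatewaySessionRow, so the lookup cost no longer depends on the total number of sessions.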

Issue 3: MCP getConversation reuses listConversations with limit: 500

In mcp-cli L179-185:

async getConversation(sessionKey) {
    return (await this.listConversations({
        limit: 500,
        includeLastMessage: true
    })).find((conversation) => conversation.sessionKey === normalizedSessionKey) ?? null;
}

listConversations defaults to includeDerivedTitles: true and includeLastMessage: true, so every one of up to 500 sessions triggers transcript reads just to find one specific session. A session that falls outside the first 500 results would also not be found by this lookup.

Impact

  • Session list page latency grows linearly with total session count
  • Single-session lookups pay the cost of scanning all sessions
  • Synchronous readFileSync blocks the Node.js event loop during queries
  • With 387 sessions, this is already noticeable; if the cost is linear in session count, 1,000 sessions would take roughly 2.6 times as long

Suggested Fix

Introduce a lightweight SessionSummaryIndex that:

  1. Pre-computes summary fields at write time (title, lastMessage preview, token usage) rather than reading transcripts at query time
  2. Applies filter/search/limit on summaries first, then hydrates only the current page from transcripts if needed
  3. Provides direct lookup by sessionId without scanning all sessions
  4. Changes getConversation to use direct lookup instead of listConversations(limit: 500)
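Steps 1 and 2 of the list above can be sketched as follows. This is an illustration under assumptions, not a proposed patch: listSessionPage, the summary shape ({ key, title, updatedAt }), and the hydrate callback are all hypothetical.

```javascript
// Hypothetical pipeline: search, sort, and page over cheap summaries,
// then hydrate only the rows on the requested page.
function listSessionPage(summaries, options = {}, hydrate = (s) => s) {
  const { search = "", limit = 50, offset = 0 } = options;
  const needle = search.trim().toLowerCase();

  const matched = summaries
    // filter() returns a new array, so the in-place sort below
    // does not mutate the caller's list.
    .filter((s) => !needle || (s.title ?? "").toLowerCase().includes(needle))
    .sort((a, b) => (b.updatedAt ?? 0) - (a.updatedAt ?? 0));

  const page = matched.slice(offset, offset + limit);

  // Any transcript access would happen inside hydrate(), once per row on the page.
  return { total: matched.length, sessions: page.map(hydrate) };
}
```

With this ordering, a request with limit 10 calls hydrate at most 10 times regardless of how many sessions exist.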

Workaround

Pre-populating totalTokens, inputTokens, outputTokens, cacheRead, cacheWrite, and estimatedCostUsd into sessions.json entries allows buildGatewaySessionRow to skip resolveTranscriptUsageFallback, reducing transcript reads. However, this does not help with readSessionTitleFieldsFromTranscript or the architectural issues (full scan before limit, getConversation with limit:500).
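The workaround could be scripted roughly as below. The field names come from the description above; everything else is an assumption. backfillUsage is a hypothetical helper, usageBySessionId stands in for whatever source the usage numbers are computed from, and whether buildGatewaySessionRow needs all six fields present to skip the fallback is not confirmed here.

```javascript
const USAGE_FIELDS = [
  "totalTokens",
  "inputTokens",
  "outputTokens",
  "cacheRead",
  "cacheWrite",
  "estimatedCostUsd",
];

// Hypothetical helper: returns a new store in which entries that lack a usage
// field get it from usageBySessionId. Existing values are never overwritten.
function backfillUsage(store, usageBySessionId) {
  const next = {};
  for (const [key, entry] of Object.entries(store)) {
    if (!entry) {
      next[key] = entry; // leave empty entries untouched
      continue;
    }
    const usage = usageBySessionId[entry.sessionId];
    const patch = {};
    for (const field of USAGE_FIELDS) {
      if (entry[field] == null && usage?.[field] != null) {
        patch[field] = usage[field];
      }
    }
    next[key] = { ...entry, ...patch };
  }
  return next;
}
```

The function is pure, so the result can be inspected or diffed before anything is written back to sessions.json.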

extent analysis

TL;DR

Implement a SessionSummaryIndex to pre-compute summary fields and apply filters before hydrating sessions from transcripts.

Guidance

  • Introduce a SessionSummaryIndex to store pre-computed summary fields, such as title, last message preview, and token usage, to reduce transcript reads.
  • Modify the query pipeline to apply filters and limits on the SessionSummaryIndex before hydrating sessions from transcripts.
  • Implement direct lookup by sessionId to avoid scanning all sessions for single-session lookups.
  • Update getConversation to use direct lookup instead of listConversations(limit: 500).

Example

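// Sketch: pre-compute summary fields at write time.
// The extract*/calculate* helpers are hypothetical placeholders, not openclaw APIs;
// they would wrap the existing transcript-parsing logic.
// A Map is used so arbitrary sessionId strings cannot collide with object prototype keys.
const sessionSummaryIndex = new Map();

function updateSessionSummary(sessionId, sessionData) {
  const { transcript } = sessionData;
  sessionSummaryIndex.set(sessionId, {
    title: extractTitleFromTranscript(transcript),
    lastMessagePreview: extractLastMessagePreviewFromTranscript(transcript),
    tokenUsage: calculateTokenUsage(transcript),
  });
}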

Notes

The suggested fix requires significant changes to the existing architecture, but it addresses the root causes of the performance issues. The workaround can provide some relief by reducing transcript reads, but it does not address the underlying architectural issues.

Recommendation

Apply the suggested fix by introducing a SessionSummaryIndex to pre-compute summary fields and apply filters before hydrating sessions from transcripts. This approach addresses the root causes of the performance issues and provides a more scalable solution.
