openclaw - 💡(How to fix) Fix [PERF] Gateway scans ALL agent sessions for EVERY message — defeats agent isolation, prevents scaling


Root Cause

When a user messages a specific agent (e.g., main), the Gateway currently scans session files across ALL agents, not just the target agent. This creates a critical scalability bottleneck.

Fix / Workaround

  • Session explosion from cron jobs using "sessionTarget": "isolated" (created 358 sessions in 4 days)
  • Even after cleanup to 14 sessions, response times are 10-15s (should be 2-3s)
  • Daily auto-cleanup cron is a workaround, not a solution

Summary

When a user messages a specific agent (e.g., main), the Gateway currently scans session files across ALL agents, not just the target agent. This creates a critical scalability bottleneck.

Problem to solve

User messages (main agent)
  ↓
Gateway scans:
  ✅ /agents/main/sessions/*.jsonl (needed)
  ❌ /agents/agen1/sessions/*.jsonl (irrelevant!)
  ❌ /agents/agen2/sessions/*.jsonl (irrelevant!)
  ❌ /agents/agen3/sessions/*.jsonl (irrelevant!)
  ❌ /agents/agen4/sessions/*.jsonl (irrelevant!)
  ... (all 12+ agents)

Real-world impact (my setup):

  • 12 agents × 2 sessions each = 24 files scanned per message
  • After session explosion bug: 358 files scanned → 128-second response times
  • After cleanup: 14 files scanned → 10-15 second response times
  • Expected with fix: 2 files scanned → 2-3 second response times (theoretical)

Expected Behavior

User messages Friday (main agent)
  ↓
Gateway loads ONLY:
  ✅ /agents/main/sessions/*.jsonl

Skips entirely:
  ⏭️ /agents/agen11/sessions/*.jsonl
  ⏭️ /agents/agent2/sessions/*.jsonl
  ⏭️ (all other agents)

Benefits

  • 92% reduction in file I/O (24 files → 2 files in my setup)
  • Linear scaling: adding agents doesn't slow down existing agents
  • True agent isolation: each agent's context is truly separate
  • Faster responses: eliminated unnecessary I/O latency

Proposed solution

Agent-Specific Session Loading:

  • Use agentId from conversation binding context
  • Load sessions only for that specific agent directory
  • Skip all other agent directories entirely

Pseudocode:

// Current (broken):
async function loadSessions() {
  const allSessions = [];
  for (const agentDir of allAgentDirectories) { // ❌ Scans ALL agents
    allSessions.push(...loadAgentSessions(agentDir));
  }
  return allSessions;
}

// Fixed:
async function loadSessions(agentId: string) {
  const agentDir = getAgentDirectory(agentId);
  return loadAgentSessions(agentDir); // ✅ Only loads target agent
}
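The pseudocode above can be made concrete with a minimal sketch like the following. It only lists the session files for one agent and does not parse them. The directory layout `<root>/agents/<agentId>/sessions/*.jsonl` is inferred from the paths in this report, and the names `getAgentSessionsDir` and `listAgentSessionFiles` are hypothetical; none of this is taken from the OpenClaw source.

```typescript
import { readdir } from "node:fs/promises";
import * as path from "node:path";

// Assumed layout (inferred from this report, not confirmed in the source):
//   <root>/agents/<agentId>/sessions/*.jsonl
function getAgentSessionsDir(root: string, agentId: string): string {
  return path.join(root, "agents", agentId, "sessions");
}

// Lists session files for ONE agent. Other agents' directories are never opened,
// so the cost per message depends only on the target agent's session count.
async function listAgentSessionFiles(root: string, agentId: string): Promise<string[]> {
  const dir = getAgentSessionsDir(root, agentId);
  let entries: string[];
  try {
    entries = await readdir(dir);
  } catch (err) {
    // A missing directory is treated as "this agent has no sessions yet".
    if ((err as NodeJS.ErrnoException).code === "ENOENT") return [];
    throw err;
  }
  return entries
    .filter((name) => name.endsWith(".jsonl"))
    .sort()
    .map((name) => path.join(dir, name));
}
```

A caller would pass the agentId taken from the conversation binding, for example `await listAgentSessionFiles(root, "main")`, and then read only the returned files.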

Alternatives considered

A daily cleanup script, /home/node/.openclaw/scripts/session-cleanup.sh, keeps the last 2 sessions per agent and archives the rest.

But this is a band-aid; the real fix is agent-specific session loading.
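The contents of session-cleanup.sh are not shown in this report, so the following is only a sketch of the "keep the last 2, archive the rest" selection it describes. The `SessionFile` shape and the `planCleanup` name are hypothetical, and "last" is assumed to mean most recently modified.

```typescript
// Hypothetical description of one session file; only name and mtime are needed.
interface SessionFile {
  name: string;
  mtimeMs: number; // last-modified time, milliseconds since the epoch
}

// Splits one agent's session files into the newest `keep` files and the rest.
function planCleanup(
  files: SessionFile[],
  keep = 2,
): { kept: SessionFile[]; archive: SessionFile[] } {
  if (!Number.isInteger(keep) || keep < 0) {
    throw new RangeError("keep must be a non-negative integer");
  }
  // Copy before sorting so the caller's array is not mutated; newest first.
  const newestFirst = [...files].sort((a, b) => b.mtimeMs - a.mtimeMs);
  return { kept: newestFirst.slice(0, keep), archive: newestFirst.slice(keep) };
}
```

Separating the selection from the file moves keeps the policy easy to check before anything is archived.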

Impact

Why This Matters

  • Wasted I/O: Scanning irrelevant session files adds latency to EVERY message
  • No Scaling: 100 agents = 100x more file I/O for every single message
  • Defeats Agent Isolation: Whole point of separate agents is separate contexts!
  • Performance Ceiling: Even with perfect session hygiene, you're stuck with N× overhead

Performance Impact

Scenario | Files Scanned | Expected Latency
Current (12 agents) | 24 files | 10-15s
Current (100 agents) | 200 files | 60-80s (estimated)
Fixed (any # agents) | 2 files | 2-3s (target)
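The "Files Scanned" column follows from simple arithmetic, sketched below under the report's assumption of a uniform number of sessions per agent. The function names are hypothetical, and the sketch models file counts only; the latency figures in the table are the reporter's measurements and estimates and are not derived here.

```typescript
// Session files touched per incoming message.
// scoped = false models the behavior described in this report (all agents scanned);
// scoped = true models the proposed fix (only the target agent is read).
function filesScannedPerMessage(
  agentCount: number,
  sessionsPerAgent: number,
  scoped: boolean,
): number {
  return scoped ? sessionsPerAgent : agentCount * sessionsPerAgent;
}

// Fraction of file reads avoided by scoping to the target agent.
function ioReduction(agentCount: number, sessionsPerAgent: number): number {
  const before = filesScannedPerMessage(agentCount, sessionsPerAgent, false);
  const after = filesScannedPerMessage(agentCount, sessionsPerAgent, true);
  return before === 0 ? 0 : 1 - after / before;
}

console.log(filesScannedPerMessage(12, 2, false)); // 24
console.log(filesScannedPerMessage(100, 2, false)); // 200
console.log(filesScannedPerMessage(100, 2, true)); // 2
console.log(Math.round(ioReduction(12, 2) * 100)); // 92
```

With scoping, the per-message count no longer depends on `agentCount`, which is the scaling property the report asks for.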

Based on dist file names, likely in:

/app/dist/session-identity-DLyko4hy.js
/app/dist/conversation-bindings-DvWx9Azm.js
/app/dist/bindings-BJ3B158a.js

Evidence/examples

I asked my main agent to help me with diagnostics. She was the one who wrote this bug report.

I also asked the agent to include a footer with every response. Here is what I saw after asking it to write the bug report: the LLM response time is fine at 3.8 seconds, but the total elapsed time is 28 seconds.

The OpenClaw agent traced the problem and claimed that the smoking gun is that the Gateway reads session files from all agents. That is a security risk and a waste of computing resources that limits scalability.

Ready to submit, Benjamin! Just copy/paste this into GitHub issues. 🦞

🧠 Model: gemma4:31b-cloud
⏱️ LLM Response Time: 3.8 seconds
📬 Total Elapsed: 28 seconds
🔍 Web Searches: 0
📊 Tokens: 142/98
🧠 Context: 99k/262k (38%)
🕐 Completed: April 20, 2026 12:44:45 PM PDT

Additional information

My Setup:

  • OpenClaw 2026.4.19-beta.2
  • 12 agents
  • iMac 2020 i7 with 70 GB of RAM, with Ollama proxy (gemma4:31b-cloud)
  • Docker Desktop on macOS

Related Issues:

  • Session explosion from cron jobs using "sessionTarget": "isolated" (created 358 sessions in 4 days)
  • Even after cleanup to 14 sessions, response times are 10-15s (should be 2-3s)
  • Daily auto-cleanup cron is a workaround, not a solution

extent analysis

TL;DR

Implement agent-specific session loading to reduce file I/O and improve scalability by loading sessions only for the target agent directory.

Guidance

  • Identify the agentId from the conversation binding context and use it to determine the specific agent directory to load sessions from.
  • Modify the loadSessions function to accept an agentId parameter and load sessions only for that agent directory.
  • Verify the fix by measuring the reduction in file I/O and response times for messages sent to different agents.
  • Consider implementing a daily auto-cleanup cron job as a temporary workaround to mitigate the session explosion issue.
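For the verification step, a small timing wrapper is one way to compare session loading before and after the change. This is a sketch with hypothetical names (`timed`, `gatewayOverheadMs`), not part of OpenClaw; the overhead helper simply subtracts the LLM time from the total elapsed time, as in the footer quoted in the report (28 s total, 3.8 s LLM).

```typescript
import { performance } from "node:perf_hooks";

// Runs an async operation and reports how long it took, in milliseconds.
async function timed<T>(fn: () => Promise<T>): Promise<{ result: T; elapsedMs: number }> {
  const start = performance.now();
  const result = await fn();
  return { result, elapsedMs: performance.now() - start };
}

// Time spent outside the model call: total elapsed minus LLM response time.
function gatewayOverheadMs(totalElapsedMs: number, llmMs: number): number {
  return totalElapsedMs - llmMs;
}

console.log(gatewayOverheadMs(28_000, 3_800)); // 24200
```

Wrapping the session-loading call in `timed` on both the old and new code paths, with the same set of session files on disk, gives a like-for-like comparison.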

Example

async function loadSessions(agentId: string) {
  const agentDir = getAgentDirectory(agentId);
  return loadAgentSessions(agentDir);  // Only loads target agent
}

Notes

The proposed solution assumes that the getAgentDirectory function is implemented and returns the correct agent directory based on the agentId. Additionally, the loadAgentSessions function should be modified to load sessions only for the specified agent directory.
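Since the real getAgentDirectory is not shown in this report, the following is a hypothetical sketch of what it might need to guarantee. It assumes agent IDs are simple identifiers (letters, digits, underscore, hyphen) and that sessions live under `<agentsRoot>/<agentId>/sessions`; both are assumptions. Rejecting anything else keeps an ID from resolving to a directory outside the agents root, which matters for the isolation concern raised above.

```typescript
import * as path from "node:path";

// Hypothetical implementation; signature and directory layout are assumptions.
function getAgentDirectory(agentsRoot: string, agentId: string): string {
  // Accept only simple identifiers so an ID such as "../main" is rejected
  // instead of being joined into a path outside agentsRoot.
  if (!/^[A-Za-z0-9_-]+$/.test(agentId)) {
    throw new Error(`invalid agentId: ${JSON.stringify(agentId)}`);
  }
  return path.join(agentsRoot, agentId, "sessions");
}
```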

Recommendation

Implement agent-specific session loading: it addresses the root cause of the issue and should significantly reduce file I/O and response times. Treat the daily cleanup cron only as a temporary workaround until the fix is in place.
