openclaw: [Bug] gateway model-run sessions accumulate until session maxEntries cap

After #82861, `openclaw infer model run --gateway` correctly uses isolated explicit sessions such as `agent:main:explicit:model-run-<uuid>` instead of the default agent lane. However, those one-shot model probe sessions are persisted and retained like normal conversation sessions.

On a long-lived install this caused `sessions.json` to fill with hundreds of `agent:main:explicit:model-run-*` rows until the default `session.maintenance.maxEntries=500` cap was saturated. Native `openclaw sessions cleanup --enforce` only capped the store back to 500, leaving almost no headroom because the model-run rows were still younger than the default `pruneAfter=30d`.

This looks like a lifecycle gap for ephemeral model-run sessions, not a failure of the #82861 isolation fix itself.

Bug type

Behavior bug (incorrect persisted state / maintenance gap)

Beta release blocker

Environment

OpenClaw: 2026.5.26 (installed stable)
npm latest observed locally: 2026.5.28
OS: macOS 15.7.4 arm64
Node: 25.8.1
Gateway: LaunchAgent local gateway
Affected session keys: agent:main:explicit:model-run-*

Evidence from production install

Before cleanup:

openclaw status
Sessions: 549 active
Tasks: 0 active · 0 queued · 0 running

Session store composition, counted from ~/.openclaw/agents/main/sessions/sessions.json:

{
  "total": 549,
  "buckets": {
    "other": 13,
    "group": 8,
    "cron": 21,
    "model-run": 507
  },
  "older7d": 70,
  "older1d": 513,
  "active4h": 11
}
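
For anyone who wants to reproduce this kind of composition count, a minimal sketch follows. It assumes the store is a JSON object keyed by session key, with each entry carrying a numeric `updatedAt` in epoch milliseconds; that layout is an assumption, not a documented format. Only the `:explicit:model-run-` pattern comes from this report; the `cron` and `group` substring checks are illustrative guesses.

```python
from collections import Counter

HOUR_MS = 60 * 60 * 1000


def bucket_for(key: str) -> str:
    """Classify a session key into a coarse bucket (patterns partly assumed)."""
    if ":explicit:model-run-" in key:  # pattern taken from this report
        return "model-run"
    if ":cron:" in key:  # assumed key shape
        return "cron"
    if ":group:" in key:  # assumed key shape
        return "group"
    return "other"


def summarize(store: dict, now_ms: int) -> dict:
    """Count entries per bucket and per age band, mirroring the JSON above."""
    buckets = Counter(bucket_for(key) for key in store)
    # Assumed field: entry["updatedAt"] is epoch milliseconds.
    ages = [now_ms - entry.get("updatedAt", 0) for entry in store.values()]
    return {
        "total": len(store),
        "buckets": dict(buckets),
        "older7d": sum(age > 7 * 24 * HOUR_MS for age in ages),
        "older1d": sum(age > 24 * HOUR_MS for age in ages),
        "active4h": sum(age <= 4 * HOUR_MS for age in ages),
    }


if __name__ == "__main__":
    now = 100 * 24 * HOUR_MS
    sample = {
        "agent:main:explicit:model-run-aaaa": {"updatedAt": now - 48 * HOUR_MS},
        "agent:main:cron:nightly": {"updatedAt": now - HOUR_MS},
    }
    print(summarize(sample, now))
```

To run it against a real store, load the file with `json.load` and pass the current time in milliseconds as `now_ms`.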

Native cleanup result:

{
  "beforeCount": 549,
  "afterCount": 500,
  "missing": 0,
  "dmScopeRetired": 0,
  "pruned": 0,
  "capped": 49,
  "applied": true,
  "appliedCount": 500
}

After native cleanup, the store was still dominated by model-run rows:

{
  "total": 500,
  "buckets": {
    "other": 9,
    "group": 8,
    "cron": 20,
    "model-run": 463
  },
  "older7d": 21,
  "older1d": 464,
  "active4h": 10
}

Manual TTL cleanup of only stale model-run entries restored healthy headroom:

Removed: 439 rows matching agent:main:explicit:model-run-* with updatedAt > 24h old
Kept: 24 recent model-run rows
Preserved: group/direct/cron sessions

After manual model-run TTL cleanup:

openclaw status
Sessions: 61 active

{
  "total": 61,
  "buckets": {
    "other": 9,
    "group": 8,
    "cron": 20,
    "model-run": 24
  },
  "older7d": 1,
  "older1d": 25,
  "active4h": 10
}

Gateway health after cleanup:

{
  "ok": true,
  "eventLoop": {
    "degraded": false,
    "reasons": []
  },
  "sessionCount": 61
}

Expected behavior

One-shot gateway model-run sessions should not accumulate like durable human conversation sessions.

Possible acceptable designs:

1. Do not persist `modelRun: true` sessions in the main conversation session store, or persist them only as lightweight probe history elsewhere.
2. Give `agent:*:explicit:model-run-*` sessions a short default TTL, e.g. 24h or 48h.
3. Add a session maintenance policy for model-run/probe sessions, e.g. `session.maintenance.modelRunPruneAfter` or `perKindRetention`.
4. Extend `openclaw sessions cleanup` with a safe audited option to prune stale model-run sessions by prefix/kind.
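
To make the per-kind retention idea concrete, here is a minimal sketch of how such a policy could resolve a TTL from a session key. It is not OpenClaw code: the table name `PER_KIND_RETENTION`, the helper names, and the choice of glob matching are all hypothetical. The 24h and 30d values are the ones mentioned in this report.

```python
from fnmatch import fnmatchcase

DAY_S = 24 * 60 * 60

# Hypothetical per-kind retention table; the first matching pattern wins.
PER_KIND_RETENTION = [
    ("agent:*:explicit:model-run-*", 1 * DAY_S),  # short TTL for one-shot probes
    ("*", 30 * DAY_S),  # fallback equal to the global pruneAfter=30d
]


def retention_seconds(session_key: str) -> int:
    """Return the retention window that applies to a session key."""
    for pattern, ttl in PER_KIND_RETENTION:
        if fnmatchcase(session_key, pattern):
            return ttl
    raise ValueError(f"no retention rule matches {session_key!r}")


def is_expired(session_key: str, age_seconds: float) -> bool:
    """True when an entry of the given age is past its per-kind retention."""
    return age_seconds > retention_seconds(session_key)


if __name__ == "__main__":
    two_days = 2 * DAY_S
    print(is_expired("agent:main:explicit:model-run-1234", two_days))
    print(is_expired("agent:main:group:team", two_days))
```

With a table like this, durable sessions keep the existing global window while probe sessions age out after a day, which is the separation that designs 2 and 3 ask for.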

Actual behavior

`model-run-<uuid>` rows are retained under the same global `pruneAfter=30d` and `maxEntries=500` policy as durable conversation sessions.

When there are many model probes, the global cap becomes dominated by probe sessions. Native cleanup caps overflow but still leaves the store near 500, so the install can quickly hit the same pressure again.
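
The arithmetic behind that result can be reproduced with a small model. This is a sketch, not OpenClaw's implementation: it assumes cleanup first prunes entries older than `pruneAfter` and then evicts the oldest entries beyond `maxEntries`, which is consistent with the counts in the evidence above but not confirmed from the source. The individual ages are made up; only the group sizes mirror the report.

```python
def cleanup(ages_days, prune_after_days=30, max_entries=500):
    """Model of prune-then-cap. Returns (kept_ages, pruned_count, capped_count)."""
    # Step 1: age-based prune. Nothing younger than pruneAfter is removed.
    kept = [age for age in ages_days if age <= prune_after_days]
    pruned = len(ages_days) - len(kept)
    # Step 2: cap. Assumed to evict the oldest entries first.
    kept.sort()  # youngest first
    capped = max(0, len(kept) - max_entries)
    kept = kept[:max_entries]
    return kept, pruned, capped


if __name__ == "__main__":
    # 549 entries, all younger than 30 days: 70 older than 7d,
    # 443 between 1d and 7d, and 36 younger than 1d.
    ages = [10] * 70 + [3] * 443 + [0.1] * 36
    kept, pruned, capped = cleanup(ages)
    print(pruned, capped, len(kept))  # prints: 0 49 500
```

Because every row is inside the 30-day window, the prune step removes nothing and the cap step trims only the overflow, so the store ends exactly at the cap with no headroom.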

Why this matters

- The current behavior converts one-shot provider/model probes into long-lived session-store entries.
- The default global cap can be consumed almost entirely by probe sessions.
- Operators have to write ad hoc scripts that edit `sessions.json` directly to remove stale model-run rows, which is risky and not discoverable.
- This is adjacent to but distinct from orphan transcript cleanup (#77941), per-label retention (#76827), and cleanup cap/stale enforcement (#83124).

Related

- #82861 introduced the explicit `model-run-<uuid>` session isolation for gateway model runs. This issue is about lifecycle/retention for those isolated one-shot sessions after they are created.
- #77941 asks for audited orphan/unindexed transcript archive/prune.
- #76827 asks for per-label retention.
- #83124 covers a cleanup enforce regression for cap-overflow/prune-stale, but in this production case native cleanup did apply; it just could not create healthy headroom because model-run rows were still within the default age retention window.

Workaround

Back up the session store, then remove only `agent:main:explicit:model-run-*` entries whose `updatedAt` is older than the chosen TTL (24h in the cleanup above). This is effective but should not be the long-term operator workflow.
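
A minimal sketch of that workaround is shown below. It assumes `sessions.json` is a JSON object keyed by session key and that each entry has a numeric `updatedAt` in epoch milliseconds; neither is confirmed here, so inspect the file before using anything like this. The function names and the backup file naming are hypothetical. It is probably safest to run it while the gateway is stopped, in case the gateway rewrites the store concurrently.

```python
import json
import shutil
import time
from fnmatch import fnmatchcase
from pathlib import Path

PATTERN = "agent:main:explicit:model-run-*"
TTL_MS = 24 * 60 * 60 * 1000  # 24h, as in the manual cleanup described above


def split_stale(store: dict, now_ms: int, pattern: str = PATTERN, ttl_ms: int = TTL_MS):
    """Return (kept_entries, removed_keys); only stale model-run rows are removed."""
    kept, removed = {}, []
    for key, entry in store.items():
        # Assumed field: entry["updatedAt"] is epoch milliseconds.
        updated = entry.get("updatedAt") if isinstance(entry, dict) else None
        stale = (
            fnmatchcase(key, pattern)
            and isinstance(updated, (int, float))
            and now_ms - updated > ttl_ms
        )
        if stale:
            removed.append(key)
        else:
            kept[key] = entry  # other kinds and undated rows are left alone
    return kept, removed


def prune_file(path: Path, apply: bool = False) -> list:
    """Dry run by default; with apply=True, back up the file and then rewrite it."""
    store = json.loads(path.read_text())
    kept, removed = split_stale(store, int(time.time() * 1000))
    if apply and removed:
        backup = path.with_name(f"{path.name}.bak-{int(time.time())}")
        shutil.copy2(path, backup)  # keep a copy before touching the store
        tmp = path.with_name(path.name + ".tmp")
        tmp.write_text(json.dumps(kept, indent=2))
        tmp.replace(path)  # swap the new file into place
    return removed


if __name__ == "__main__":
    store_path = Path.home() / ".openclaw/agents/main/sessions/sessions.json"
    if store_path.exists():
        print(f"would remove {len(prune_file(store_path))} stale model-run rows")
```

Running it without `apply=True` only reports what would be removed, which makes it easier to compare the count against the store composition before changing anything.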

Acceptance criteria

- Repeated `openclaw infer model run --gateway` invocations do not cause `agent:main:explicit:model-run-*` rows to dominate `sessions.json` under default maintenance settings.
- `openclaw sessions cleanup --dry-run --json` or another supported command can show and safely prune stale model-run/probe sessions without direct store editing.
- Durable direct/group/cron sessions remain protected by the normal retention rules.
